ashaduzzaman committed 3a40618 (parent: f1010fe): Update README.md

model-index:
- name: gpt2-funetuned-eli5
  results: []
language:
- en
metrics:
- perplexity
library_name: transformers
pipeline_tag: text-generation
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# gpt2-finetuned-eli5

This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2), trained on the `eli5_category` dataset to generate human-like answers to questions from the Explain Like I'm 5 (ELI5) community. It aims to provide clear, concise explanations suitable for a general audience.

## Model Description

The `gpt2-finetuned-eli5` model is based on the DistilGPT-2 architecture, a smaller, faster, and more computationally efficient distillation of GPT-2 that retains most of its capabilities. It is particularly adept at generating text that resembles human-written responses, making it suitable for natural language understanding and generation tasks.

### Key Features:
- **Architecture**: DistilGPT-2, a distilled version of GPT-2.
- **Purpose**: Generating clear and concise explanations suitable for general audiences, particularly in response to questions typical of the ELI5 community.
- **Model Size**: Smaller and more efficient than the original GPT-2, with reduced computational requirements (see the quick check below).
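
A quick way to confirm the DistilGPT-2 backbone is to inspect the checkpoint's configuration. This is an optional sketch rather than part of the original card; the repository id is taken from the usage example later in this card, and the expected values (6 layers, 12 heads, hidden size 768, roughly 82M parameters vs. 124M for GPT-2) are the standard DistilGPT-2 figures, not numbers reported by the author.

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Inspect the architecture of the fine-tuned checkpoint
config = AutoConfig.from_pretrained("ashaduzzaman/gpt2-funetuned-eli5")
print(config.model_type, config.n_layer, config.n_head, config.n_embd)
# DistilGPT-2 defaults: gpt2, 6 layers, 12 heads, hidden size 768

# Count parameters to compare against the 124M-parameter GPT-2 base model
model = AutoModelForCausalLM.from_pretrained("ashaduzzaman/gpt2-funetuned-eli5")
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.0f}M parameters")
```
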
## Intended Uses & Limitations

### Intended Uses:
- **Question Answering**: Provide simplified and easy-to-understand answers to a wide range of questions.
- **Text Generation**: Generate coherent and contextually relevant text based on a given prompt.
- **Educational Tools**: Assist in educational content creation by generating simple explanations of complex topics.
- **Chatbots**: Improve the conversational abilities of chatbots by providing human-like responses.

### Limitations:
- **Simplification Risks**: While the model excels at providing simplified explanations, it may oversimplify or miss nuances, especially on complex topics.
- **Dataset Bias**: The model's behavior reflects the data it was trained on; it may reproduce biases present in that data, leading to inappropriate or biased responses.
- **Factually Inaccurate Responses**: The model has no real-time access to factual sources, and its knowledge is limited to its training data, so it may produce outdated or incorrect information.
- **Knowledge Cut-off**: The training data only covers information up to a certain date; the model does not know about events or developments after that point.

## Training and Evaluation Data

### Training Data:
- **Dataset**: The model was fine-tuned on the `eli5_category` dataset, which consists of questions and answers from the Explain Like I'm 5 (ELI5) community and covers a variety of topics where users seek simple, clear explanations. A minimal data-preparation sketch is shown below.
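
The exact preprocessing script is not part of this card; the following is a minimal sketch of how the data could be loaded and tokenized, assuming the Hub dataset id `eli5_category` and its standard `answers["text"]` field (verify against the dataset card before reuse).

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Hypothetical preprocessing sketch: load a slice of the ELI5 category dataset
dataset = load_dataset("eli5_category", split="train[:5000]")
tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")

def tokenize(batch):
    # Join each question's answer strings into a single training text
    texts = [" ".join(ans["text"]) for ans in batch["answers"]]
    return tokenizer(texts, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
```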

### Evaluation Data:
- The evaluation data consisted of a subset of the ELI5 dataset that was held out during training. The model's performance was assessed on its ability to generate coherent and contextually appropriate responses.

## Training Procedure

### Training Hyperparameters:
- **Learning Rate**: 2e-05
- **Train Batch Size**: 8
- **Eval Batch Size**: 8
- **Seed**: 42
- **Optimizer**: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- **Learning Rate Scheduler Type**: Linear
- **Number of Epochs**: 3.0
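
For orientation, here is a minimal sketch of how these settings map onto the `transformers` `TrainingArguments`; it is not the author's training script, and the `output_dir` is only a placeholder. Adam's betas and epsilon are the optimizer defaults, so they need no explicit arguments.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-funetuned-eli5",  # placeholder output directory
    learning_rate=2e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    eval_strategy="epoch",             # matches the per-epoch validation losses below
)
```
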
### Training Results:

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.8093        | 2.0   | 2578 | 3.8280          |
| 3.7661        | 3.0   | 3867 | 3.8269          |

- The model achieved a final validation loss of 3.8269, indicating a consistent improvement in training performance.
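
The card lists perplexity as its evaluation metric; for a causal language model, perplexity is simply the exponential of the validation cross-entropy loss. The figure below is derived from the reported loss, not separately reported by the author.

```python
import math

final_val_loss = 3.8269                  # final validation loss from the table above
perplexity = math.exp(final_val_loss)
print(f"Perplexity ≈ {perplexity:.1f}")  # ≈ 45.9
```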

### Framework Versions:
- **Transformers**: 4.42.4
- **PyTorch**: 2.3.1+cu121
- **Datasets**: 2.21.0
- **Tokenizers**: 0.19.1

## Ethical Considerations

- **Bias and Fairness**: The model's responses may reflect biases present in the training data. Users should be aware of potential biases and verify the information generated.
- **Privacy**: The model was trained on publicly available data. However, care should be taken not to use the model to generate content that may violate privacy norms.

## Example Usage

To generate text using the `gpt2-finetuned-eli5` model, you can use the following code:

```python
from transformers import pipeline

# Load the text-generation pipeline with the fine-tuned checkpoint
generator = pipeline("text-generation", model="ashaduzzaman/gpt2-funetuned-eli5")

# Provide a prompt
prompt = "Somatic hypermutation allows the immune system to"

# Generate text
generator(prompt)
```
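
The call above uses default generation settings. In practice you will usually want to bound the output length and enable sampling; the parameter values below are illustrative choices, not settings recommended by the model author, and the snippet continues from the pipeline created above.

```python
# Illustrative generation settings (not values tuned by the author)
outputs = generator(
    prompt,
    max_new_tokens=80,       # cap the length of the continuation
    do_sample=True,          # sample instead of greedy decoding
    temperature=0.7,         # soften the token distribution
    top_p=0.9,               # nucleus sampling
    num_return_sequences=2,  # produce two candidate answers
)
for out in outputs:
    print(out["generated_text"])
```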

This model card is intended to help users understand the capabilities, limitations, and intended use cases of the `gpt2-finetuned-eli5` model, supporting responsible and informed use.