ashaduzzaman committed on
Commit 3a40618
1 Parent(s): f1010fe

Update README.md

Files changed (1)
  1. README.md +72 -26
README.md CHANGED
@@ -8,43 +8,65 @@ datasets:
  model-index:
  - name: gpt2-funetuned-eli5
  results: []
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- # gpt2-funetuned-eli5

- This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) on the eli5_category dataset.
- It achieves the following results on the evaluation set:
- - Loss: 3.8269

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 8
- - eval_batch_size: 8
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 3.0

- ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
@@ -52,10 +74,34 @@ The following hyperparameters were used during training:
  | 3.8093 | 2.0 | 2578 | 3.8280 |
  | 3.7661 | 3.0 | 3867 | 3.8269 |

- ### Framework versions

- - Transformers 4.42.4
- - Pytorch 2.3.1+cu121
- - Datasets 2.21.0
- - Tokenizers 0.19.1

  model-index:
  - name: gpt2-funetuned-eli5
  results: []
+ language:
+ - en
+ metrics:
+ - perplexity
+ library_name: transformers
+ pipeline_tag: text-generation
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

+ # gpt2-finetuned-eli5

+ This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) on the `eli5_category` dataset. It is trained to generate human-like responses to questions asked in the Explain Like I'm 5 (ELI5) community, aiming to provide clear and concise answers suitable for a general audience.

+ ## Model Description

+ The `gpt2-finetuned-eli5` model is based on the DistilGPT-2 architecture, a smaller, faster, and more efficient version of GPT-2 that retains most of GPT-2's capabilities at a lower computational cost. It is particularly adept at generating text that resembles human-written responses, making it suitable for natural language understanding and generation tasks.

+ ### Key Features:
+ - **Architecture**: DistilGPT-2, a distilled version of GPT-2.
+ - **Purpose**: Generating clear and concise explanations suitable for general audiences, particularly in response to questions typical of the ELI5 community.
+ - **Model Size**: Smaller and more efficient than the original GPT-2, with reduced computational requirements.
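
+ As a rough, optional check of the size claim, the checkpoint can be loaded and its parameters counted. This is a minimal sketch; the repository id is taken from the usage example further down this card:

+ ```python
+ from transformers import AutoModelForCausalLM
+
+ # Repo id as used in the usage example below
+ model = AutoModelForCausalLM.from_pretrained("ashaduzzaman/gpt2-funetuned-eli5")
+
+ # DistilGPT-2 has roughly 82M parameters, versus ~124M for GPT-2 small
+ print(f"Parameters: {sum(p.numel() for p in model.parameters()):,}")
+ ```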

+ ## Intended Uses & Limitations

+ ### Intended Uses:
+ - **Question Answering**: Provide simplified and easy-to-understand answers to a wide range of questions.
+ - **Text Generation**: Generate coherent and contextually relevant text based on a given prompt.
+ - **Educational Tools**: Assist in educational content creation by generating simple explanations of complex topics.
+ - **Chatbots**: Improve the conversational abilities of chatbots by providing human-like responses.

+ ### Limitations:
+ - **Simplification Risks**: While the model excels at providing simplified explanations, it might oversimplify or miss nuances, especially with complex topics.
+ - **Dataset Bias**: The model's behavior reflects the data it was trained on. It might exhibit biases present in the training data, leading to inappropriate or biased responses.
+ - **Factually Inaccurate Responses**: The model does not have real-time access to factual databases, and its knowledge is based on the data it was trained on. It might produce outdated or incorrect information.
+ - **Limited Knowledge Cut-off**: The model's training data only includes information up to a certain date, and it does not know about events or developments beyond that time.

+ ## Training and Evaluation Data

+ ### Training Data:
+ - **Dataset**: The model was fine-tuned on the `eli5_category` dataset, which consists of questions and answers from the Explain Like I'm 5 (ELI5) community. This dataset contains a variety of topics where users seek simple and clear explanations.

+ ### Evaluation Data:
+ - The evaluation data consisted of a subset of the ELI5 dataset that was held out during training. The model's performance was assessed based on its ability to generate coherent and contextually appropriate responses.
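
+ As a minimal sketch of how such data could be prepared (the Hub dataset id `eli5_category` is taken from the card metadata; the exact subset and split sizes used for this model are not documented here):

+ ```python
+ from datasets import load_dataset
+
+ # Load a slice of the ELI5 category dataset and hold out a portion for evaluation
+ eli5 = load_dataset("eli5_category", split="train[:5000]")
+ eli5 = eli5.train_test_split(test_size=0.2, seed=42)
+ print(eli5)  # DatasetDict with 'train' and 'test' splits
+ ```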

+ ## Training Procedure

+ ### Training Hyperparameters:
+ - **Learning Rate**: 2e-05
+ - **Train Batch Size**: 8
+ - **Eval Batch Size**: 8
+ - **Seed**: 42
+ - **Optimizer**: Adam with betas=(0.9, 0.999) and epsilon=1e-08
+ - **Learning Rate Scheduler Type**: Linear
+ - **Number of Epochs**: 3.0
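
+ These values map directly onto `TrainingArguments`; the following is a minimal sketch of such a run, not the exact training script. The tokenized train/eval splits are assumed to exist, and the Adam settings listed above are the `Trainer` defaults:

+ ```python
+ from transformers import (AutoModelForCausalLM, AutoTokenizer,
+                           DataCollatorForLanguageModeling, Trainer, TrainingArguments)
+
+ tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")
+ tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
+ model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
+
+ args = TrainingArguments(
+     output_dir="gpt2-finetuned-eli5",
+     learning_rate=2e-5,
+     per_device_train_batch_size=8,
+     per_device_eval_batch_size=8,
+     seed=42,
+     lr_scheduler_type="linear",
+     num_train_epochs=3.0,
+     eval_strategy="epoch",
+ )
+
+ trainer = Trainer(
+     model=model,
+     args=args,
+     train_dataset=tokenized_train,  # assumed: tokenized eli5_category train split
+     eval_dataset=tokenized_eval,    # assumed: tokenized held-out split
+     data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
+ )
+ trainer.train()
+ ```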

+ ### Training Results:

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
  | 3.8093 | 2.0 | 2578 | 3.8280 |
  | 3.7661 | 3.0 | 3867 | 3.8269 |

+ - The model achieved a final validation loss of 3.8269, with the validation loss continuing to decrease across epochs (3.8280 after epoch 2, 3.8269 after epoch 3).
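
+ Since perplexity is the metric listed in the card metadata, it can be derived from the validation loss; assuming the reported loss is the mean token-level cross-entropy, a quick check:

+ ```python
+ import math
+
+ final_validation_loss = 3.8269
+ perplexity = math.exp(final_validation_loss)
+ print(f"Perplexity: {perplexity:.1f}")  # ~45.9
+ ```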

+ ### Framework Versions:
+ - **Transformers**: 4.42.4
+ - **PyTorch**: 2.3.1+cu121
+ - **Datasets**: 2.21.0
+ - **Tokenizers**: 0.19.1

+ ## Ethical Considerations

+ - **Bias and Fairness**: The model's responses might reflect biases present in the training data. Users should be aware of potential biases and verify the information generated.
+ - **Privacy**: The model was trained on publicly available data. However, care should be taken to avoid using the model for generating content that may violate privacy norms.

+ ## Example Usage

+ To generate text using the `gpt2-finetuned-eli5` model, you can use the following code:

+ ```python
+ from transformers import pipeline
+
+ # Load the text generation pipeline with the fine-tuned model
+ generator = pipeline("text-generation", model="ashaduzzaman/gpt2-funetuned-eli5")
+
+ # Provide a prompt
+ prompt = "Somatic hypermutation allows the immune system to"
+
+ # Generate text and print the completion
+ output = generator(prompt, max_new_tokens=50)
+ print(output[0]["generated_text"])
+ ```

+ This model card is intended to help users understand the capabilities, limitations, and intended use cases of the `gpt2-finetuned-eli5` model, and to support responsible and informed usage.