mNLP-project/distilgpt2-finetuned

thymiantheherb committed on May 21

Commit 49586e6 • 1 Parent(s): 32929de

End of training

Files changed (3)

README.md +9 -9
model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -17,11 +17,11 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.8114
-- Bleu: 0.0101
-- Bertscore Precision: 0.1499
-- Bertscore Recall: 0.1656
-- Bertscore F1: 0.1571
+- Loss: 4.0665
+- Bleu: 0.0085
+- Bertscore Precision: 0.1478
+- Bertscore Recall: 0.1636
+- Bertscore F1: 0.1550
 ## Model description
@@ -44,18 +44,18 @@ The following hyperparameters were used during training:
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 3.0
+- num_epochs: 1
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Bleu   | Bertscore Precision | Bertscore Recall | Bertscore F1 |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:-------------------:|:----------------:|:------------:|
-| 4.1924        | 1.0   | 644  | 4.0681          | 0.0091 | 0.1493              | 0.1649           | 0.1564       |
-| 4.0754        | 2.0   | 1288 | 3.8779          | 0.0099 | 0.1498              | 0.1654           | 0.1569       |
-| 3.8277        | 3.0   | 1932 | 3.8114          | 0.0101 | 0.1499              | 0.1656           | 0.1571       |
+| 5.0162        | 1.0   | 3223 | 4.0665          | 0.0085 | 0.1478              | 0.1636           | 0.1550       |
 ### Framework versions
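
The updated hyperparameters give a total train batch size of 16, which is the per-device batch size of 8 times 2 gradient accumulation steps. As a rough sanity check on the step counts in the two results tables, the sketch below multiplies the steps in one epoch by the effective batch size. It assumes a single device, that the Step column counts optimizer updates, and that the earlier run used no gradient accumulation (none is listed for it). The helper names are illustrative and not part of any library API.

```python
def effective_batch_size(per_device_batch, grad_accum_steps=1, num_devices=1):
    """Examples consumed per optimizer update (assumes equal-sized devices)."""
    return per_device_batch * grad_accum_steps * num_devices


def approx_examples_per_epoch(steps_per_epoch, effective_batch):
    """Approximate examples seen in one epoch; the last batch may be partial."""
    return steps_per_epoch * effective_batch


# Values taken from the README diff; the accumulation of 1 for the old run is an assumption.
old_batch = effective_batch_size(per_device_batch=8)
new_batch = effective_batch_size(per_device_batch=8, grad_accum_steps=2)

old_examples = approx_examples_per_epoch(steps_per_epoch=644, effective_batch=old_batch)
new_examples = approx_examples_per_epoch(steps_per_epoch=3223, effective_batch=new_batch)

print("old run:", old_batch, "per update,", old_examples, "examples per epoch (approx.)")
print("new run:", new_batch, "per update,", new_examples, "examples per epoch (approx.)")
```

Under these assumptions the new run processes roughly ten times as many examples per epoch as the old one, which would be consistent with a larger training set. The diff itself does not state that the dataset changed, so treat this only as a plausibility check.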

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f0c79c17a1750921715b59469a5381f7cc25934155f493fc482e492564bab7c1
+oid sha256:29300574d12fdd5ebc15042cd87ab92c44f018161802c6d5d5d17c1eda221c16
 size 327657928

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:65336c76c05ee6d71702e751ad04c2c9c8e62744efbcd520c27e8c322042bbd3
+oid sha256:17db93795a362c9298df2f82088391ff7ea583f9b96abb0f083048e4907e3d2f
 size 5048
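
Both binary files appear in the diff as Git LFS pointers: a version line, an oid line carrying a SHA-256 digest, and a size in bytes. A minimal sketch of parsing such a pointer and checking a locally downloaded file against it is shown below. The function names are hypothetical, and the parser assumes the pointer contains exactly the three fields shown in this diff.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file's size and SHA-256 digest match the pointer."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# New model.safetensors pointer, copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:29300574d12fdd5ebc15042cd87ab92c44f018161802c6d5d5d17c1eda221c16
size 327657928
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["digest"][:12], pointer["size"])
```

With a local copy of the file, `matches_pointer("model.safetensors", pointer)` would confirm it corresponds to this commit. Note that in both pointers the size is unchanged while the digest differs, which is what one would expect when the same architecture is retrained with new weights and arguments.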