End of training

Files changed (4)

README.md CHANGED

@@ -17,12 +17,12 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [facebook/bart-large-xsum](https://huggingface.co/facebook/bart-large-xsum) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.4704
-- Rouge1: 54.8232
-- Rouge2: 30.1114
-- Rougel: 45.2666
-- Rougelsum: 50.7533
-- Gen Len: 30.3399
 ## Model description
@@ -48,23 +48,24 @@ The following hyperparameters were used during training:
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
 - num_epochs: 4
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
 |:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
-| 1.3807        | 0.9997 | 1841 | 1.5203          | 52.4158 | 27.5034 | 42.8274 | 48.0361   | 31.4664 |
-| 1.077         | 2.0    | 3683 | 1.5038          | 53.5277 | 28.5946 | 44.2315 | 49.5696   | 30.768  |
-| 0.831         | 2.9997 | 5524 | 1.5362          | 52.9008 | 27.7041 | 43.5637 | 48.3921   | 29.9243 |
-| 0.6919        | 3.9989 | 7364 | 1.6272          | 52.8716 | 27.9183 | 43.8019 | 48.6547   | 30.2002 |
 ### Framework versions
 - Transformers 4.42.4
-- Pytorch 2.3.1+cu121
 - Datasets 2.21.0
 - Tokenizers 0.19.1

 This model is a fine-tuned version of [facebook/bart-large-xsum](https://huggingface.co/facebook/bart-large-xsum) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.6994
+- Rouge1: 54.5529
+- Rouge2: 30.0179
+- Rougel: 45.3837
+- Rougelsum: 50.4176
+- Gen Len: 28.967
 ## Model description
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
 - num_epochs: 4
 - mixed_precision_training: Native AMP
+- label_smoothing_factor: 0.1
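
The hyperparameter change from `lr_scheduler_type: linear` to `cosine` alters how the learning rate decays over the 7364 optimizer steps. The sketch below compares the two decay multipliers at the logged steps. It assumes zero warmup steps and a half-cosine decay to zero; the warmup setting and base learning rate are not visible in this hunk, and the function names are illustrative.

```python
import math

def linear_multiplier(step, total_steps, warmup_steps=0):
    """Learning-rate multiplier for a linear decay to zero after optional warmup."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

def cosine_multiplier(step, total_steps, warmup_steps=0):
    """Learning-rate multiplier for a half-cosine decay to zero after optional warmup."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))

TOTAL_STEPS = 7364  # final step in the training results table

# Steps at which the table logs an evaluation (warmup assumed to be 0).
for step in (1841, 3683, 5524, 7364):
    lin = linear_multiplier(step, TOTAL_STEPS)
    cos = cosine_multiplier(step, TOTAL_STEPS)
    print(f"step {step}: linear x{lin:.3f}, cosine x{cos:.3f}")
```

Under these assumptions the cosine schedule keeps the learning rate higher than the linear one during the first half of training and lower during the second half.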
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
 |:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
+| 2.7327        | 0.9997 | 1841 | 2.7677          | 52.2923 | 27.6237 | 43.1558 | 48.08     | 30.4005 |
+| 2.4597        | 2.0    | 3683 | 2.7286          | 53.4085 | 28.7235 | 44.5737 | 49.3042   | 29.3004 |
+| 2.2042        | 2.9997 | 5524 | 2.7436          | 53.6036 | 28.857  | 44.7337 | 49.2789   | 28.4188 |
+| 2.1096        | 3.9989 | 7364 | 2.7886          | 53.0547 | 28.3597 | 44.0648 | 48.804    | 29.5165 |
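
The losses in the rows above (roughly 2.73 to 2.79 on validation) are much higher than in the previous run (roughly 1.50 to 1.63), while the ROUGE scores are nearly unchanged. One plausible reason is the new `label_smoothing_factor: 0.1`, which changes what the loss measures, so the two runs' losses are probably not directly comparable. The sketch below uses one common formulation of label-smoothed cross-entropy on toy logits; it is an illustration, and whether smoothing was applied to the evaluation loss in this run is an assumption.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def cross_entropy(logits, target, epsilon=0.0):
    """Cross-entropy with label smoothing.

    Mixes the usual negative log-likelihood of the target with the mean
    negative log-probability over the whole vocabulary, weighted by epsilon.
    """
    log_probs = log_softmax(logits)
    nll = -log_probs[target]
    uniform = -sum(log_probs) / len(log_probs)
    return (1.0 - epsilon) * nll + epsilon * uniform

# Toy 4-token vocabulary; the values are made up for illustration.
logits = [6.0, 1.0, 0.5, -2.0]
plain = cross_entropy(logits, target=0)
smoothed = cross_entropy(logits, target=0, epsilon=0.1)
print(f"plain: {plain:.4f}  smoothed (epsilon=0.1): {smoothed:.4f}")
```

Even when the model is confidently correct, the smoothed loss stays well above the plain loss, because the uniform term penalizes every low-probability token in the vocabulary.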
 ### Framework versions
 - Transformers 4.42.4
+- Pytorch 2.4.0+cu121
 - Datasets 2.21.0
 - Tokenizers 0.19.1
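
As a worked check on the numbers in the README diff, the step and epoch columns together with `total_train_batch_size: 8` give a rough estimate of the training set size. The sketch below is an estimate only: the epoch values are rounded, the last batch of an epoch may be partial, and the per-device batch size and device count are not visible in this hunk.

```python
def examples_per_epoch(step, epoch, total_train_batch_size):
    """Estimate examples per epoch from a logged (step, epoch) pair."""
    steps_per_epoch = step / epoch
    return steps_per_epoch * total_train_batch_size

TOTAL_TRAIN_BATCH_SIZE = 8        # from the model card
GRADIENT_ACCUMULATION_STEPS = 2   # from the model card

# Row logged at step 3683, epoch 2.0 in the training results table.
estimate = examples_per_epoch(step=3683, epoch=2.0,
                              total_train_batch_size=TOTAL_TRAIN_BATCH_SIZE)

# Samples per forward/backward pass, summed over all devices.
micro_batch = TOTAL_TRAIN_BATCH_SIZE // GRADIENT_ACCUMULATION_STEPS

print(f"about {estimate:.0f} examples per epoch, {micro_batch} samples per pass")
```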

config.json CHANGED

@@ -18,7 +18,7 @@
   "decoder_layerdrop": 0.0,
   "decoder_layers": 12,
   "decoder_start_token_id": 2,
-  "dropout": 0.1,
   "early_stopping": true,
   "encoder_attention_heads": 16,
   "encoder_ffn_dim": 4096,

   "decoder_layerdrop": 0.0,
   "decoder_layers": 12,
   "decoder_start_token_id": 2,
+  "dropout": 0.3,
   "early_stopping": true,
   "encoder_attention_heads": 16,
   "encoder_ffn_dim": 4096,

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a996079c7ea898ab67590499a7df4e3063639d63680d6767449e01343d481964
 size 1625422896

 version https://git-lfs.github.com/spec/v1
+oid sha256:1373397ebd3433e2102a0623450b03c0dab0bca21a2d5932be3da57c3998a39f
 size 1625422896

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d31a2b64498f4d64053ce41274b1367fc759775c8d0085c7d91c9f6f60f391d6
 size 5240

 version https://git-lfs.github.com/spec/v1
+oid sha256:ac2e3051452b3e645f50ca82e57da79685dfaaae5f9e819072eab4874d3a0f17
 size 5240
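
The `model.safetensors` and `training_args.bin` entries above are Git LFS pointer files: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. In both files only the `oid` changed while the size stayed the same. A minimal sketch of parsing such a pointer and checking downloaded content against it follows; the helper names are hypothetical, and the demo hashes an in-memory stream rather than the real 1.6 GB weights file.

```python
import hashlib
import io

def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}

def matches_pointer(stream, pointer, chunk_size=1 << 20):
    """Hash a binary stream in chunks and compare digest and size to the pointer."""
    hasher = hashlib.sha256()
    size = 0
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
        size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]

# New pointer for model.safetensors, copied from the diff above.
pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:1373397ebd3433e2102a0623450b03c0dab0bca21a2d5932be3da57c3998a39f
size 1625422896
""")
print(pointer["algo"], pointer["size"])

# Demo on in-memory bytes; a real check would pass open(path, "rb") instead.
payload = b"example payload"
demo_pointer = {"version": pointer["version"], "algo": "sha256",
                "digest": hashlib.sha256(payload).hexdigest(),
                "size": len(payload)}
print(matches_pointer(io.BytesIO(payload), demo_pointer))
```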