sagarsidhwa/mt5-small-finetuned-amazon-en-es

text2text-generation

Generated from Trainer

Inference Endpoints


sagarsidhwa committed on Dec 5, 2024

Commit 6cd06fa · verified · 1 Parent(s): 2e981b1

V1 Training complete

Files changed (1)

README.md +16 -11


@@ -19,11 +19,11 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.2107
-- Rouge1: 16.5873
-- Rouge2: 8.3667
-- Rougel: 16.096
-- Rougelsum: 16.0654
+- Loss: 3.0303
+- Rouge1: 16.6557
+- Rouge2: 7.7494
+- Rougel: 16.0414
+- Rougelsum: 16.1216
 ## Model description
@@ -43,20 +43,25 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5.6e-05
-- train_batch_size: 10
-- eval_batch_size: 10
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 3
+- num_epochs: 8
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|
-| 7.4326        | 1.0   | 968  | 3.3609          | 14.4654 | 5.3488 | 14.1032 | 14.1348   |
-| 4.1082        | 2.0   | 1936 | 3.2265          | 15.9058 | 7.6084 | 15.3178 | 15.3304   |
-| 3.8946        | 3.0   | 2904 | 3.2107          | 16.5873 | 8.3667 | 16.096  | 16.0654   |
+| 6.9675        | 1.0   | 1209 | 3.2986          | 15.4389 | 6.948  | 14.7479 | 14.8713   |
+| 3.8997        | 2.0   | 2418 | 3.1665          | 16.3621 | 7.6947 | 15.7833 | 15.7696   |
+| 3.5826        | 3.0   | 3627 | 3.1106          | 17.1917 | 8.4901 | 16.3918 | 16.472    |
+| 3.421         | 4.0   | 4836 | 3.0963          | 17.3735 | 8.8287 | 16.7517 | 16.8372   |
+| 3.3089        | 5.0   | 6045 | 3.0490          | 16.7794 | 7.6926 | 16.1692 | 16.253    |
+| 3.2437        | 6.0   | 7254 | 3.0401          | 16.6808 | 8.0175 | 15.9504 | 16.0499   |
+| 3.2133        | 7.0   | 8463 | 3.0292          | 16.3645 | 7.743  | 15.8797 | 15.9826   |
+| 3.1851        | 8.0   | 9672 | 3.0303          | 16.6557 | 7.7494 | 16.0414 | 16.1216   |
 ### Framework versions
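
The step counts in the two training tables can be used as a rough sanity check on the batch-size change (10 to 8). The sketch below is not part of the commit. It assumes one optimizer step per batch (single device, no gradient accumulation), that the last partial batch is kept, and that both runs used the same training split; none of this is stated in the model card. The helper name `dataset_size_bounds` is hypothetical.

```python
import math


def dataset_size_bounds(steps_per_epoch: int, batch_size: int) -> range:
    """Training-set sizes N for which ceil(N / batch_size) == steps_per_epoch.

    Assumption (not confirmed by the model card): one optimizer step per
    batch and the final partial batch is not dropped.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return range(low, high + 1)


# Steps per epoch and batch sizes reported before and after this commit.
old_run = dataset_size_bounds(steps_per_epoch=968, batch_size=10)
new_run = dataset_size_bounds(steps_per_epoch=1209, batch_size=8)

# Double-check the bounds against the ceil formula they were derived from.
assert all(math.ceil(n / 10) == 968 for n in old_run)
assert all(math.ceil(n / 8) == 1209 for n in new_run)

# Sizes consistent with both runs, if both used the same training split.
consistent = sorted(set(old_run) & set(new_run))
print(consistent)  # [9671, 9672]

# The final step counts in the tables are epochs * steps per epoch.
assert 3 * 968 == 2904
assert 8 * 1209 == 9672
```

Under these assumptions the two runs are consistent with a training split of 9671 or 9672 examples, which suggests the step-count change comes from the smaller batch size rather than from different data.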
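
The headline metrics in the updated card are those of the final epoch (8), while the per-epoch table shows other epochs scoring better on individual metrics. The sketch below copies the rows from the new table and picks the best epoch by validation loss and by ROUGE-1. It is an illustration only: the `EpochResult` type is hypothetical, and the card does not say whether intermediate checkpoints were saved or which one was uploaded.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    val_loss: float
    rouge1: float
    rouge2: float
    rougel: float
    rougelsum: float


# Rows copied from the updated "Training results" table (8-epoch run).
results = [
    EpochResult(1, 3.2986, 15.4389, 6.948, 14.7479, 14.8713),
    EpochResult(2, 3.1665, 16.3621, 7.6947, 15.7833, 15.7696),
    EpochResult(3, 3.1106, 17.1917, 8.4901, 16.3918, 16.472),
    EpochResult(4, 3.0963, 17.3735, 8.8287, 16.7517, 16.8372),
    EpochResult(5, 3.0490, 16.7794, 7.6926, 16.1692, 16.253),
    EpochResult(6, 3.0401, 16.6808, 8.0175, 15.9504, 16.0499),
    EpochResult(7, 3.0292, 16.3645, 7.743, 15.8797, 15.9826),
    EpochResult(8, 3.0303, 16.6557, 7.7494, 16.0414, 16.1216),
]

# Lower is better for loss, higher is better for ROUGE.
best_by_loss = min(results, key=lambda r: r.val_loss)
best_by_rouge1 = max(results, key=lambda r: r.rouge1)

print(f"lowest validation loss: epoch {best_by_loss.epoch} ({best_by_loss.val_loss})")
print(f"highest ROUGE-1: epoch {best_by_rouge1.epoch} ({best_by_rouge1.rouge1})")
```

On these numbers, validation loss is lowest at epoch 7 and ROUGE-1 peaks at epoch 4, so the choice of selection metric changes which checkpoint looks best.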