sagarsidhwa committed
Commit 6cd06fa · verified · 1 Parent(s): 2e981b1

V1 Training complete

Files changed (1)
  1. README.md +16 -11
README.md CHANGED
@@ -19,11 +19,11 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 3.2107
- - Rouge1: 16.5873
- - Rouge2: 8.3667
- - Rougel: 16.096
- - Rougelsum: 16.0654

  ## Model description

@@ -43,20 +43,25 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 5.6e-05
- - train_batch_size: 10
- - eval_batch_size: 10
  - seed: 42
  - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
- - num_epochs: 3

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
  |:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|
- | 7.4326 | 1.0 | 968 | 3.3609 | 14.4654 | 5.3488 | 14.1032 | 14.1348 |
- | 4.1082 | 2.0 | 1936 | 3.2265 | 15.9058 | 7.6084 | 15.3178 | 15.3304 |
- | 3.8946 | 3.0 | 2904 | 3.2107 | 16.5873 | 8.3667 | 16.096 | 16.0654 |


  ### Framework versions
 

  This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 3.0303
+ - Rouge1: 16.6557
+ - Rouge2: 7.7494
+ - Rougel: 16.0414
+ - Rougelsum: 16.1216

  ## Model description

 

  The following hyperparameters were used during training:
  - learning_rate: 5.6e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
  - seed: 42
  - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
+ - num_epochs: 8

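Review note: the updated hyperparameters could be expressed roughly as keyword arguments for `transformers.Seq2SeqTrainingArguments`. This is only a sketch under assumptions the commit does not confirm: that the run used that class, that it ran on a single device without gradient accumulation (so `train_batch_size` equals `per_device_train_batch_size`), and that `output_dir` is a hypothetical placeholder.

```python
# Keyword arguments mirroring the hyperparameters listed in the new README.
# Assumption: single device, no gradient accumulation, so the reported batch
# sizes are taken as the per-device batch sizes.
training_kwargs = {
    "output_dir": "mt5-small-finetuned",  # hypothetical path, not from the commit
    "learning_rate": 5.6e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
}


def build_training_args():
    """Build the arguments object; requires the `transformers` package."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**training_kwargs)
```

The import is kept inside `build_training_args` so the dictionary can be inspected or diffed against the previous run without `transformers` installed.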
  ### Training results

  | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
  |:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|
+ | 6.9675 | 1.0 | 1209 | 3.2986 | 15.4389 | 6.948 | 14.7479 | 14.8713 |
+ | 3.8997 | 2.0 | 2418 | 3.1665 | 16.3621 | 7.6947 | 15.7833 | 15.7696 |
+ | 3.5826 | 3.0 | 3627 | 3.1106 | 17.1917 | 8.4901 | 16.3918 | 16.472 |
+ | 3.421 | 4.0 | 4836 | 3.0963 | 17.3735 | 8.8287 | 16.7517 | 16.8372 |
+ | 3.3089 | 5.0 | 6045 | 3.0490 | 16.7794 | 7.6926 | 16.1692 | 16.253 |
+ | 3.2437 | 6.0 | 7254 | 3.0401 | 16.6808 | 8.0175 | 15.9504 | 16.0499 |
+ | 3.2133 | 7.0 | 8463 | 3.0292 | 16.3645 | 7.743 | 15.8797 | 15.9826 |
+ | 3.1851 | 8.0 | 9672 | 3.0303 | 16.6557 | 7.7494 | 16.0414 | 16.1216 |
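
Review note: to see which epoch scores best on each column of the new table, the rows can be scanned with a few lines of Python. The values below are copied from the table. The helper name `best_epoch` and the column indices are illustrative choices, not part of the commit.

```python
# Per-epoch evaluation rows copied from the new training results table:
# (epoch, validation_loss, rouge1, rouge2, rougeL, rougeLsum)
ROWS = [
    (1, 3.2986, 15.4389, 6.9480, 14.7479, 14.8713),
    (2, 3.1665, 16.3621, 7.6947, 15.7833, 15.7696),
    (3, 3.1106, 17.1917, 8.4901, 16.3918, 16.4720),
    (4, 3.0963, 17.3735, 8.8287, 16.7517, 16.8372),
    (5, 3.0490, 16.7794, 7.6926, 16.1692, 16.2530),
    (6, 3.0401, 16.6808, 8.0175, 15.9504, 16.0499),
    (7, 3.0292, 16.3645, 7.7430, 15.8797, 15.9826),
    (8, 3.0303, 16.6557, 7.7494, 16.0414, 16.1216),
]


def best_epoch(rows, column, higher_is_better=True):
    """Return the epoch whose value in `column` is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


lowest_loss_epoch = best_epoch(ROWS, 1, higher_is_better=False)
best_rouge_epochs = {
    name: best_epoch(ROWS, col)
    for name, col in [("rouge1", 2), ("rouge2", 3), ("rougeL", 4), ("rougeLsum", 5)]
}
```

On these rows, all four ROUGE columns peak at epoch 4 and validation loss is lowest at epoch 7, while the README headline figures are those of epoch 8.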


  ### Framework versions
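
Review note: the step counts in the two tables are consistent with the batch-size change from 10 to 8. The sketch below bounds the training-set size implied by each run. It assumes both runs used the same training set on a single device, with no gradient accumulation and with the last partial batch kept; none of this is stated in the commit.

```python
def dataset_size_range(steps_per_epoch, batch_size):
    """Range of sizes N for which ceil(N / batch_size) == steps_per_epoch."""
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


# Steps per epoch are the Step values at epoch 1.0 in each table.
old_run = dataset_size_range(968, 10)   # previous README: batch size 10
new_run = dataset_size_range(1209, 8)   # updated README: batch size 8

# Sizes compatible with both runs under the stated assumptions.
overlap = (max(old_run[0], new_run[0]), min(old_run[1], new_run[1]))

# Total optimizer steps should equal steps per epoch times number of epochs.
old_total_steps = 968 * 3
new_total_steps = 1209 * 8
```

Under those assumptions the two ranges overlap only at 9671 to 9672 examples, and the totals match the final Step values in the tables (2904 and 9672).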