metadata

license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: mt5-small-finetuned-b8-e10-1024-128
    results: []

mt5-small-finetuned-b8-e10-1024-128

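This model is a fine-tuned version of google/mt5-small. The fine-tuning dataset was not recorded. On the evaluation set it achieves the following results, which match the final epoch (epoch 10, step 13570) in the training results table: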

Loss: 3.3822
Rouge1: 13.327
Rouge2: 4.8244
Rougel: 13.1978
Rougelsum: 13.2133
Gen Len: 17.5592
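
For readers unfamiliar with the metric, the sketch below shows roughly what a Rouge1 score measures: unigram overlap between a generated text and a reference, reported as an F1 value. It is a simplified illustration, not the evaluation code used for this model. The function name `rouge1_f1`, the whitespace tokenization, and the 0-100 scaling are assumptions; the scores above appear to be on a 0-100 scale, and standard ROUGE tooling applies its own tokenization and optional stemming, so it will not reproduce these exact numbers.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1: unigram-overlap F1 on a 0-100 scale."""
    # Assumption: lowercase whitespace tokenization. Real ROUGE tooling
    # uses its own tokenizer and may apply stemming.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: a token counts at most as often as it appears
    # in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A short prediction that covers only part of the reference.
    print(rouge1_f1("the cat", "the cat sat on the mat"))
```

A short output that is fully contained in the reference gets perfect precision but low recall, which is one way a model with a small average generation length can end up with modest ROUGE scores.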

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
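
Two of the values above are derived from the others, and the sketch below spells out the arithmetic. It assumes a single device (consistent with 8 x 2 = 16) and no warmup steps, since no warmup setting is listed; the function names are illustrative, and the total of 13570 optimizer steps is taken from the last row of the training results table. The decay has the same general shape as a linear schedule, but it is not the trainer's own implementation.

```python
def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    # Assumption: num_devices=1, which matches 8 * 2 = 16 in the card.
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear decay to zero, with optional linear warmup."""
    # Assumption: warmup_steps=0, because the card lists no warmup setting.
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total_steps = 13570  # final step in the training results table
    print("effective batch size:", effective_batch_size(8, 2))
    # Learning rate at the end of each of the 10 epochs (1357 steps each).
    for epoch in range(1, 11):
        step = epoch * 1357
        print(epoch, step, linear_lr(step, total_steps))
```

Under these assumptions the learning rate is halved by the end of epoch 5 and reaches zero at the final step.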

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
4.7372	1.0	1357	3.8287	9.3951	3.6576	9.342	9.3047	12.6653
4.3162	2.0	2714	3.6750	10.9224	4.1119	10.8209	10.8235	15.0997
4.1726	3.0	4071	3.5668	11.7438	4.2353	11.6204	11.6087	16.5169
4.0439	4.0	5428	3.5002	12.402	4.4267	12.2785	12.2924	17.0402
3.9978	5.0	6785	3.4494	12.7762	4.5509	12.6699	12.6829	17.2466
3.9687	6.0	8142	3.4229	12.9652	4.6727	12.8555	12.8761	17.4303
3.8639	7.0	9499	3.4058	13.4216	4.784	13.3097	13.2988	17.4252
3.8474	8.0	10856	3.3924	13.2422	4.7672	13.1416	13.12	17.5046
3.843	9.0	12213	3.3845	13.2519	4.8713	13.1421	13.1304	17.5371
3.8545	10.0	13570	3.3822	13.327	4.8244	13.1978	13.2133	17.5592
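
The table can be checked for which epoch scored best on each metric. The sketch below copies the evaluation columns from the rows above into a list and looks up the best epoch per column; the names `ROWS`, `COLUMNS`, and `best_epoch` are illustrative and not part of any training script.

```python
# Evaluation columns copied from the training results table:
# (epoch, validation_loss, rouge1, rouge2, rougel, rougelsum)
COLUMNS = ("epoch", "validation_loss", "rouge1", "rouge2", "rougel", "rougelsum")
ROWS = [
    (1, 3.8287, 9.3951, 3.6576, 9.342, 9.3047),
    (2, 3.6750, 10.9224, 4.1119, 10.8209, 10.8235),
    (3, 3.5668, 11.7438, 4.2353, 11.6204, 11.6087),
    (4, 3.5002, 12.402, 4.4267, 12.2785, 12.2924),
    (5, 3.4494, 12.7762, 4.5509, 12.6699, 12.6829),
    (6, 3.4229, 12.9652, 4.6727, 12.8555, 12.8761),
    (7, 3.4058, 13.4216, 4.784, 13.3097, 13.2988),
    (8, 3.3924, 13.2422, 4.7672, 13.1416, 13.12),
    (9, 3.3845, 13.2519, 4.8713, 13.1421, 13.1304),
    (10, 3.3822, 13.327, 4.8244, 13.1978, 13.2133),
]


def best_epoch(metric: str, higher_is_better: bool = True) -> int:
    """Return the epoch whose value for `metric` is best."""
    idx = COLUMNS.index(metric)
    pick = max if higher_is_better else min
    return pick(ROWS, key=lambda row: row[idx])[0]


if __name__ == "__main__":
    # Loss is better when lower; the ROUGE columns are better when higher.
    print("validation_loss:", best_epoch("validation_loss", higher_is_better=False))
    for metric in ("rouge1", "rouge2", "rougel", "rougelsum"):
        print(f"{metric}:", best_epoch(metric))
```

On these numbers, validation loss is lowest at epoch 10, while Rouge1, Rougel, and Rougelsum peak at epoch 7 and Rouge2 at epoch 9. The gaps are small (for example, 13.4216 versus 13.327 on Rouge1), and the headline results reported at the top of this card correspond to epoch 10.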

Framework versions

Transformers 4.33.0
PyTorch 2.0.0
Datasets 2.14.6
Tokenizers 0.13.3