metadata

license: apache-2.0
base_model: google/mt5-base
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: mt5-base_V25775
    results: []

mt5-base_V25775

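This model is a fine-tuned version of google/mt5-base. The training dataset was not recorded by the trainer. It achieves the following results on the evaluation set, which match the final logged evaluation (step 12000) in the training results table: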

Loss: 2.1251
Rouge1: 28.2174
Rouge2: 10.5032
Rougel: 19.8511
Rougelsum: 23.3756
Gen Len: 72.3391

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 3
eval_batch_size: 3
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
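
To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over training. It is not the trainer's own code. The dictionary keys simply mirror the names in this card, the card lists no warmup so `warmup_steps=0` is an assumption, and the figure of roughly 620 optimizer steps per epoch is an estimate read off the training results table (step 12000 is logged at epoch 19.35).

```python
# Values copied from the hyperparameter list in this card.
HPARAMS = {
    "learning_rate": 2e-5,
    "train_batch_size": 3,
    "eval_batch_size": 3,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_epochs": 20,
}

# Assumption: about 620 optimizer steps per epoch, estimated from the
# results table (12000 steps / 19.35 epochs). Not stated in the card.
STEPS_PER_EPOCH = 620
TOTAL_STEPS = STEPS_PER_EPOCH * HPARAMS["num_epochs"]


def linear_lr(step, total_steps=TOTAL_STEPS,
              base_lr=HPARAMS["learning_rate"], warmup_steps=0):
    """Learning rate at `step`: optional linear warmup, then linear decay to zero."""
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, the midpoint, the last logged step, and the end.
    for step in (0, TOTAL_STEPS // 2, 12000, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, is half that at the midpoint, and is already close to zero by the last logged evaluation at step 12000.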

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
6.8396	0.81	500	2.5158	15.7446	6.4867	12.6043	13.8967	31.1588
3.1783	1.61	1000	2.3673	21.0031	8.2119	15.9097	17.9958	46.0086
3.0094	2.42	1500	2.3091	20.5903	8.1394	15.7354	17.696	44.7339
2.8754	3.23	2000	2.2652	22.3129	8.6681	16.4687	18.8755	48.9485
2.7643	4.03	2500	2.2320	22.6675	8.8846	16.7258	19.0948	48.6781
2.7	4.84	3000	2.2190	24.1409	9.4362	17.7197	20.2512	52.8498
2.6373	5.65	3500	2.2100	24.594	9.4296	18.0182	20.6398	55.0687
2.6182	6.45	4000	2.2016	25.0763	9.432	18.1113	20.6752	57.4549
2.5552	7.26	4500	2.1767	26.6143	10.1357	19.004	22.0372	62.6738
2.5319	8.06	5000	2.1665	27.0349	10.3809	19.3472	22.5876	64.7167
2.5145	8.87	5500	2.1705	26.6323	9.956	18.9994	22.119	62.3176
2.4923	9.68	6000	2.1499	27.0052	10.0351	19.2887	22.4559	64.2747
2.4367	10.48	6500	2.1418	27.0134	10.1253	19.2614	22.4648	65.2661
2.4312	11.29	7000	2.1503	27.1655	9.9501	19.1768	22.3967	66.6953
2.4186	12.1	7500	2.1370	26.6422	9.7971	19.0065	22.0444	65.9571
2.3977	12.9	8000	2.1395	27.5204	10.3095	19.4189	22.7497	69.0901
2.3596	13.71	8500	2.1302	27.685	10.1479	19.4521	22.7892	70.0644
2.3951	14.52	9000	2.1298	27.8389	10.2493	19.6671	22.933	70.7897
2.3433	15.32	9500	2.1238	27.9095	10.33	19.6428	22.9721	70.4206
2.3789	16.13	10000	2.1271	28.0755	10.5819	19.9535	23.2605	69.97
2.3331	16.94	10500	2.1240	28.1362	10.4656	19.8198	23.1857	70.9485
2.3395	17.74	11000	2.1245	28.1459	10.4803	19.801	23.2469	71.1288
2.3238	18.55	11500	2.1273	28.2156	10.4437	19.858	23.3457	73.485
2.3181	19.35	12000	2.1251	28.2174	10.5032	19.8511	23.3756	72.3391
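
The table can be scanned programmatically to compare checkpoints. The sketch below is illustrative only: it embeds the last six rows of the table as tab-separated text and picks the row with the lowest validation loss and the row with the highest Rouge1. The variable and function names are hypothetical.

```python
import csv
import io

# Last six evaluation rows, copied from the table in this card (tab-separated).
TABLE = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tRouge1\tRouge2\tRougel\tRougelsum\tGen Len\n"
    "2.3433\t15.32\t9500\t2.1238\t27.9095\t10.33\t19.6428\t22.9721\t70.4206\n"
    "2.3789\t16.13\t10000\t2.1271\t28.0755\t10.5819\t19.9535\t23.2605\t69.97\n"
    "2.3331\t16.94\t10500\t2.1240\t28.1362\t10.4656\t19.8198\t23.1857\t70.9485\n"
    "2.3395\t17.74\t11000\t2.1245\t28.1459\t10.4803\t19.801\t23.2469\t71.1288\n"
    "2.3238\t18.55\t11500\t2.1273\t28.2156\t10.4437\t19.858\t23.3457\t73.485\n"
    "2.3181\t19.35\t12000\t2.1251\t28.2174\t10.5032\t19.8511\t23.3756\t72.3391\n"
)


def load_rows(text):
    """Parse the tab-separated table into a list of dicts with float values."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [{key: float(value) for key, value in row.items()} for row in reader]


rows = load_rows(TABLE)
best_loss = min(rows, key=lambda r: r["Validation Loss"])
best_rouge1 = max(rows, key=lambda r: r["Rouge1"])

if __name__ == "__main__":
    print("lowest validation loss at step", int(best_loss["Step"]))
    print("highest Rouge1 at step", int(best_rouge1["Step"]))
```

Among these rows the lowest validation loss is at step 9500 while the highest Rouge1 is at step 12000, so the choice of "best" checkpoint depends on which metric is used for selection.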

Framework versions

Transformers 4.32.1
PyTorch 2.1.0
Datasets 2.12.0
Tokenizers 0.13.3
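
To check whether a local environment matches these versions, something like the following sketch could be used. It relies only on the standard library's `importlib.metadata`. The distribution names are the usual PyPI names (PyTorch is published as `torch`), and the helper `check_versions` is a hypothetical name, not part of any of these libraries.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.32.1",
    "torch": "2.1.0",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def check_versions(expected=EXPECTED):
    """Return {name: (wanted, installed_or_None, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Local build tags such as "+cu121" are ignored when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, matches) in check_versions().items():
        status = "ok" if matches else "differs"
        print(f"{name}: wanted {wanted}, installed {installed} ({status})")
```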