metadata

license: apache-2.0
base_model: google/mt5-base
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: cs_mT5_0.01_50_v0.1
    results: []

cs_mT5_0.01_50_v0.1

This model is a fine-tuned version of google/mt5-base; the fine-tuning dataset is not recorded in this card. After the final epoch (epoch 50, step 300), it achieves the following results on the evaluation set:

Loss: 7.3188
Bleu: 1.2029
Gen Len: 19.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.01
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
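
As a rough illustration of what these settings imply, the sketch below computes the learning rate at a given optimizer step under a linear schedule. It assumes no warmup steps (the card lists none) and decay to zero over the 300 steps reported in the training results; `linear_lr` is a hypothetical helper, not a library function. It also bounds the training-set size implied by 6 steps per epoch at batch size 16, assuming a single device, no gradient accumulation, and that the last partial batch is kept.

```python
BASE_LR = 0.01        # learning_rate from the card
BATCH_SIZE = 16       # train_batch_size from the card
NUM_EPOCHS = 50       # num_epochs from the card
STEPS_PER_EPOCH = 6   # read off the training results (step 6 at epoch 1.0)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 300, matching the last table row


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Smallest and largest training-set sizes N with ceil(N / BATCH_SIZE) == STEPS_PER_EPOCH
# (assumes one device, no gradient accumulation, last partial batch kept).
min_examples = (STEPS_PER_EPOCH - 1) * BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * BATCH_SIZE

if __name__ == "__main__":
    for step in (0, 150, 300):
        print(f"step {step}: lr = {linear_lr(step):.4f}")
    print(f"implied training-set size: {min_examples} to {max_examples} examples")
```

Under these assumptions the learning rate falls from 0.01 at step 0 to zero at step 300, and the training set would hold between 81 and 96 examples.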

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
3.6406	1.0	6	6.1758	0.1903	19.0
4.2513	2.0	12	6.4360	0.4971	19.0
3.1515	3.0	18	6.2761	0.1689	19.0
3.4713	4.0	24	6.4576	0.4973	19.0
3.2069	5.0	30	6.6858	0.176	10.0
3.5913	6.0	36	6.2785	0.7212	19.0
3.7814	7.0	42	6.1120	0.7212	19.0
3.2429	8.0	48	6.3660	0.3725	19.0
3.2716	9.0	54	6.6523	0.4214	19.0
3.3443	10.0	60	6.4341	0.3793	19.0
2.4705	11.0	66	6.8433	0.7412	19.0
3.0869	12.0	72	6.9583	0.0	19.0
2.5187	13.0	78	6.3333	1.1569	19.0
3.1211	14.0	84	6.4031	0.2813	19.0
2.7326	15.0	90	6.4055	0.7962	19.0
2.5142	16.0	96	6.5799	0.1843	19.0
3.0964	17.0	102	6.8379	0.9395	19.0
2.5998	18.0	108	6.4570	0.0	19.0
3.2495	19.0	114	6.6350	0.2045	19.0
3.2509	20.0	120	6.3533	0.7212	19.0
3.2998	21.0	126	6.3142	0.6756	19.0
2.7829	22.0	132	6.5953	0.6646	19.0
3.0842	23.0	138	6.6276	0.7056	19.0
1.8502	24.0	144	6.6472	0.2386	19.0
1.945	25.0	150	6.6534	0.6966	19.0
2.7704	26.0	156	7.1955	0.7611	13.0
3.1289	27.0	162	6.6522	0.7286	17.0
3.0663	28.0	168	6.3873	0.8029	19.0
3.4269	29.0	174	6.4310	0.204	19.0
2.7845	30.0	180	6.7221	0.3228	19.0
2.0443	31.0	186	6.8353	0.3228	19.0
3.1621	32.0	192	7.1400	0.1346	19.0
2.4147	33.0	198	6.8844	1.2029	19.0
2.5869	34.0	204	6.7074	0.7475	19.0
2.1119	35.0	210	6.5778	0.7212	19.0
1.7629	36.0	216	6.5553	0.7867	19.0
2.3745	37.0	222	6.7126	0.7663	19.0
2.368	38.0	228	6.8008	0.4815	19.0
2.17	39.0	234	6.6388	0.7892	19.0
2.4311	40.0	240	6.6423	0.3228	19.0
2.8392	41.0	246	6.7127	0.3226	19.0
2.386	42.0	252	6.8011	0.31	19.0
2.7473	43.0	258	6.8704	0.31	19.0
1.9796	44.0	264	6.9846	1.2029	19.0
1.4857	45.0	270	7.1239	1.2029	19.0
1.8413	46.0	276	7.2177	1.194	19.0
2.171	47.0	282	7.2605	1.2029	19.0
1.9659	48.0	288	7.3048	1.2029	19.0
1.3681	49.0	294	7.3093	1.2029	19.0
2.086	50.0	300	7.3188	1.2029	19.0
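
To make the trend in the table easier to see, the following sketch finds the epoch with the lowest validation loss and the first epoch that reaches the highest BLEU. The values are copied from the table above; `best_epoch` and the list names are illustrative only.

```python
# Validation loss and BLEU per epoch, copied from the table above (index 0 = epoch 1).
VAL_LOSS = [
    6.1758, 6.4360, 6.2761, 6.4576, 6.6858, 6.2785, 6.1120, 6.3660, 6.6523, 6.4341,
    6.8433, 6.9583, 6.3333, 6.4031, 6.4055, 6.5799, 6.8379, 6.4570, 6.6350, 6.3533,
    6.3142, 6.5953, 6.6276, 6.6472, 6.6534, 7.1955, 6.6522, 6.3873, 6.4310, 6.7221,
    6.8353, 7.1400, 6.8844, 6.7074, 6.5778, 6.5553, 6.7126, 6.8008, 6.6388, 6.6423,
    6.7127, 6.8011, 6.8704, 6.9846, 7.1239, 7.2177, 7.2605, 7.3048, 7.3093, 7.3188,
]
BLEU = [
    0.1903, 0.4971, 0.1689, 0.4973, 0.176, 0.7212, 0.7212, 0.3725, 0.4214, 0.3793,
    0.7412, 0.0, 1.1569, 0.2813, 0.7962, 0.1843, 0.9395, 0.0, 0.2045, 0.7212,
    0.6756, 0.6646, 0.7056, 0.2386, 0.6966, 0.7611, 0.7286, 0.8029, 0.204, 0.3228,
    0.3228, 0.1346, 1.2029, 0.7475, 0.7212, 0.7867, 0.7663, 0.4815, 0.7892, 0.3228,
    0.3226, 0.31, 0.31, 1.2029, 1.2029, 1.194, 1.2029, 1.2029, 1.2029, 1.2029,
]


def best_epoch(values, higher_is_better):
    """Return (epoch, value) for the first best entry; epochs are 1-based."""
    pick = max if higher_is_better else min
    best = pick(values)
    return values.index(best) + 1, best


loss_epoch, loss_value = best_epoch(VAL_LOSS, higher_is_better=False)
bleu_epoch, bleu_value = best_epoch(BLEU, higher_is_better=True)

if __name__ == "__main__":
    print(f"lowest validation loss: {loss_value} at epoch {loss_epoch}")
    print(f"highest BLEU: {bleu_value}, first reached at epoch {bleu_epoch}")
    print(f"final epoch: loss {VAL_LOSS[-1]}, BLEU {BLEU[-1]}")
```

On these numbers the lowest validation loss (6.1120) occurs at epoch 7, while the highest BLEU (1.2029) is first reached at epoch 33 and is also the value at the final epoch, where the validation loss has risen to 7.3188.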

Framework versions

Transformers 4.35.2
Pytorch 1.13.1+cu117
Datasets 2.16.1
Tokenizers 0.15.0