nllb-200-1.3B-ft-eng-to-cym

This model is a fine-tuned version of facebook/nllb-200-1.3B for English-to-Welsh (eng→cym) translation, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6632
  • BLEU: 35.7397
  • Gen Len: 49.2773

Model description

More information needed

Intended uses & limitations

More information needed
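
While the card itself gives no details, the model name implies English→Welsh translation. A minimal inference sketch, assuming the standard transformers NLLB workflow and the FLORES-200 language codes eng_Latn and cym_Latn (neither is confirmed on this card):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "DewiBrynJones/nllb-200-1.3B-ft-eng-to-cym"

# NLLB models use FLORES-200 language codes; eng_Latn (English) as the
# source and cym_Latn (Welsh) as the target are inferred from the model name.
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("The weather is lovely today.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    # Force the decoder to start generating in the target language.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("cym_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```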

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 3000
  • training_steps: 15000
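
Assuming training used the transformers Trainer API (consistent with the auto-generated card format, though not stated), these settings map onto Seq2SeqTrainingArguments roughly as follows; the output_dir and eval cadence are illustrative:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-1.3B-ft-eng-to-cym",  # illustrative
    learning_rate=1e-5,
    per_device_train_batch_size=32,  # assumes the listed size is per device
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",             # betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=3000,
    max_steps=15000,
    eval_strategy="steps",
    eval_steps=1000,                 # matches the 1000-step cadence in the results below
    predict_with_generate=True,      # required for BLEU / Gen Len evaluation
)
```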

Training results

Training Loss   Epoch    Step    Validation Loss   BLEU      Gen Len
1.1624          0.0131    1000   0.9703            32.9798   39.6374
0.9983          0.0261    2000   0.8838            30.2075   39.0996
0.9131          0.0392    3000   0.8247            29.6834   42.2591
0.8483          0.0523    4000   0.7857            30.5889   46.8724
0.8059          0.0653    5000   0.7557            34.7355   43.5247
0.7811          0.0784    6000   0.7313            34.4472   50.4089
0.7494          0.0914    7000   0.7139            35.8339   57.3867
0.7276          0.1045    8000   0.6988            37.0368   58.0957
0.7246          0.1176    9000   0.6893            33.4794   49.0573
0.7227          0.1306   10000   0.6805            32.7780   51.9382
0.7053          0.1437   11000   0.6745            35.1997   50.2161
0.6945          0.1568   12000   0.6697            35.0354   54.5599
0.6798          0.1698   13000   0.6656            34.9429   52.2396
0.6822          0.1829   14000   0.6642            35.3399   50.2676
0.6855          0.1960   15000   0.6632            35.7397   49.2773
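
The card does not state how the BLEU and Gen Len columns were produced; a plausible sketch of the compute_metrics hook, assuming the evaluate library's sacrebleu metric (all names here are illustrative):

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-1.3B")
sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels are padded with -100 by the data collator; restore pad ids
    # before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Average generated length in (non-pad) tokens.
    gen_len = np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```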

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0

Model size

  • 1.37B parameters (Safetensors, F32 tensors)