nllb-200-1.3B-ft-eng-to-cym

This model is a fine-tuned version of facebook/nllb-200-1.3B for English-to-Welsh (eng→cym) translation, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6632
  • BLEU: 35.7397
  • Gen Len: 49.2773

Model description

More information needed

Intended uses & limitations

More information needed
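
While the card itself gives no details, the model name implies English→Welsh translation. A minimal inference sketch, assuming the standard transformers NLLB workflow and the FLORES-200 language codes eng_Latn and cym_Latn (neither is confirmed on this card):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "DewiBrynJones/nllb-200-1.3B-ft-eng-to-cym"

# NLLB models use FLORES-200 language codes; eng_Latn (English) as the
# source and cym_Latn (Welsh) as the target are inferred from the model name.
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("The weather is lovely today.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    # Force the decoder to start generating in the target language.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("cym_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```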

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 3000
  • training_steps: 15000
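
Assuming training used the transformers Trainer API (consistent with the auto-generated card format, though not stated), these settings map onto Seq2SeqTrainingArguments roughly as follows; the output_dir and eval cadence are illustrative:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-1.3B-ft-eng-to-cym",  # illustrative
    learning_rate=1e-5,
    per_device_train_batch_size=32,  # assumes the listed size is per device
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",             # betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=3000,
    max_steps=15000,
    eval_strategy="steps",
    eval_steps=1000,                 # matches the 1000-step cadence in the results below
    predict_with_generate=True,      # required for BLEU / Gen Len evaluation
)
```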

Training results

Training Loss   Epoch    Step    Validation Loss   BLEU      Gen Len
1.1624          0.0131    1000   0.9703            32.9798   39.6374
0.9983          0.0261    2000   0.8838            30.2075   39.0996
0.9131          0.0392    3000   0.8247            29.6834   42.2591
0.8483          0.0523    4000   0.7857            30.5889   46.8724
0.8059          0.0653    5000   0.7557            34.7355   43.5247
0.7811          0.0784    6000   0.7313            34.4472   50.4089
0.7494          0.0914    7000   0.7139            35.8339   57.3867
0.7276          0.1045    8000   0.6988            37.0368   58.0957
0.7246          0.1176    9000   0.6893            33.4794   49.0573
0.7227          0.1306   10000   0.6805            32.7780   51.9382
0.7053          0.1437   11000   0.6745            35.1997   50.2161
0.6945          0.1568   12000   0.6697            35.0354   54.5599
0.6798          0.1698   13000   0.6656            34.9429   52.2396
0.6822          0.1829   14000   0.6642            35.3399   50.2676
0.6855          0.1960   15000   0.6632            35.7397   49.2773
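
The card does not state how the BLEU and Gen Len columns were produced; a plausible sketch of the compute_metrics hook, assuming the evaluate library's sacrebleu metric (all names here are illustrative):

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-1.3B")
sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels are padded with -100 by the data collator; restore pad ids
    # before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Average generated length in (non-pad) tokens.
    gen_len = np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```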

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0

Model size

  • 1.37B parameters (Safetensors, F32 tensors)