---
library_name: transformers
license: cc-by-nc-4.0
base_model: facebook/nllb-200-1.3B
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: nllb-200-1.3B-ft-eng-to-cym
    results: []
---

# nllb-200-1.3B-ft-eng-to-cym

This model is a fine-tuned version of [facebook/nllb-200-1.3B](https://huggingface.co/facebook/nllb-200-1.3B) on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

- Loss: 0.6294
- Bleu: 31.9664
- Gen Len: 56.0969
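
The card includes no usage example, so here is a minimal inference sketch rather than a confirmed recipe: the repo id `DewiBrynJones/nllb-200-1.3B-ft-eng-to-cym` is inferred from the model name, and the FLORES-200 codes `eng_Latn` / `cym_Latn` follow standard NLLB-200 conventions.

```python
# Minimal inference sketch. Assumptions: the checkpoint is published as
# "DewiBrynJones/nllb-200-1.3B-ft-eng-to-cym" (inferred from the model name)
# and uses the standard NLLB-200 language codes eng_Latn / cym_Latn.
from transformers import pipeline

translator = pipeline(
    "translation",
    model="DewiBrynJones/nllb-200-1.3B-ft-eng-to-cym",
    src_lang="eng_Latn",  # source: English
    tgt_lang="cym_Latn",  # target: Welsh
    max_length=128,
)

print(translator("The weather in Wales is lovely today.")[0]["translation_text"])
```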

## Model description

As the model name indicates, this checkpoint fine-tunes facebook/nllb-200-1.3B for English-to-Welsh (`eng` → `cym`) translation. Beyond that, more information is needed.

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- training_steps: 15000
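
As a reproduction aid, here is one way the list above could map onto `Seq2SeqTrainingArguments` in transformers. Only the listed values come from the card; `output_dir`, the 1000-step evaluation cadence (read off the results table below), and `predict_with_generate` are assumptions.

```python
# A sketch mapping the listed hyperparameters onto Seq2SeqTrainingArguments.
# Only the values from the list above come from the card; output_dir, the
# evaluation cadence, and generation settings are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-1.3B-ft-eng-to-cym",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    max_steps=15_000,
    eval_strategy="steps",  # the results table reports eval every 1000 steps
    eval_steps=1_000,
    predict_with_generate=True,  # needed to compute BLEU / Gen Len during eval
)
```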

### Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|
| 0.9749        | 0.0731 | 1000  | 0.8471          | 26.4405 | 64.7467 |
| 0.8876        | 0.1463 | 2000  | 0.7820          | 25.7663 | 72.4875 |
| 0.8296        | 0.2194 | 3000  | 0.7455          | 29.7194 | 64.5048 |
| 0.7953        | 0.2926 | 4000  | 0.7169          | 26.1134 | 62.5134 |
| 0.7682        | 0.3657 | 5000  | 0.6996          | 32.6703 | 55.0    |
| 0.7499        | 0.4389 | 6000  | 0.6835          | 30.9855 | 57.704  |
| 0.7238        | 0.5120 | 7000  | 0.6696          | 30.3129 | 54.3538 |
| 0.7184        | 0.5851 | 8000  | 0.6597          | 33.7707 | 53.5875 |
| 0.7171        | 0.6583 | 9000  | 0.6511          | 32.3995 | 53.0923 |
| 0.7062        | 0.7314 | 10000 | 0.6440          | 31.099  | 56.6207 |
| 0.691         | 0.8046 | 11000 | 0.6386          | 32.5796 | 55.578  |
| 0.6851        | 0.8777 | 12000 | 0.6343          | 32.4382 | 55.1046 |
| 0.6892        | 0.9508 | 13000 | 0.6317          | 31.7749 | 55.8827 |
| 0.6586        | 1.0240 | 14000 | 0.6304          | 31.799  | 56.5098 |
| 0.6659        | 1.0971 | 15000 | 0.6294          | 31.9664 | 56.0969 |
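
The card does not state which BLEU implementation produced the scores above. As an illustration only, this is how BLEU is commonly computed for such runs, via the `evaluate` library's sacrebleu wrapper; the Welsh strings are hypothetical placeholders.

```python
# Illustration only: how BLEU is commonly computed in this kind of setup,
# using the evaluate library's sacrebleu wrapper. The card does not confirm
# this is the implementation used. The Welsh strings below are hypothetical.
import evaluate

bleu = evaluate.load("sacrebleu")

predictions = ["Mae'r tywydd yng Nghymru yn hyfryd heddiw."]   # hypothetical model output
references = [["Mae'r tywydd yng Nghymru yn hyfryd heddiw."]]  # hypothetical reference

result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # sacrebleu reports BLEU on a 0-100 scale
```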

### Framework versions

- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0