nllb-200-distilled-600M-finetuned-py2cpp

This model is a fine-tuned version of facebook/nllb-200-distilled-600M for Python-to-C++ code translation (the fine-tuning dataset is not documented). It achieves the following results on the evaluation set:

  • Loss: 1.1142
  • Bleu: 58.3679
  • Gen Len: 74.4727

Model description

Based on the checkpoint name and reported metrics, this is a sequence-to-sequence model that adapts the multilingual translation model facebook/nllb-200-distilled-600M to translate Python source code into C++.

Intended uses & limitations

The model is intended for translating Python source code into C++ (py2cpp). Its limitations, including supported input lengths and correctness guarantees for the generated C++, are not documented. A minimal usage sketch follows.
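
The sketch below assumes the checkpoint keeps the standard transformers seq2seq interface. Whether it expects NLLB language codes (src_lang / forced_bos_token_id) after fine-tuning is not documented, so none are set here, and the input snippet and generation settings are only illustrations.

```python
# Minimal inference sketch. Assumptions: standard AutoModelForSeq2SeqLM
# interface; no NLLB language codes are required by the fine-tune.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "hugo-albert/nllb-200-distilled-600M-finetuned-py2cpp"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

python_src = "def add(a, b):\n    return a + b"
inputs = tokenizer(python_src, return_tensors="pt")
# Eval Gen Len is ~74 tokens, so max_length=128 leaves headroom;
# num_beams=4 is an arbitrary choice, not taken from the card.
outputs = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```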

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged reconstruction as Seq2SeqTrainingArguments follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
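
As a sketch only, the list above maps onto transformers' Seq2SeqTrainingArguments roughly as follows. Only the listed values come from the card; output_dir and predict_with_generate are assumptions, and everything else is left at defaults.

```python
# Hedged reconstruction of the training configuration; not the author's
# actual script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-distilled-600M-finetuned-py2cpp",  # assumed name
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # 8 x 4 = 32 effective train batch size
    seed=42,
    adam_beta1=0.9,                 # Adam settings as listed (Trainer defaults)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    predict_with_generate=True,     # assumed, since BLEU / Gen Len are reported
)
```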

Training results

Training Loss  Epoch  Step  Validation Loss  Bleu     Gen Len
No log         0.99     33  3.5500           28.0752   98.4545
No log         2.0      67  2.6889           28.4762   97.7273
No log         2.99    100  2.1016           13.9425  131.9636
No log         4.0     134  1.6955           20.9551  114.3091
No log         4.99    167  1.4578           44.5358   83.4
No log         6.0     201  1.2986           53.9615   75.0545
No log         6.99    234  1.2113           56.6086   77.4182
No log         8.0     268  1.1550           57.2346   73.8364
No log         8.99    301  1.1222           58.1529   74.2
No log         9.85    330  1.1142           58.3679   74.4727
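
For reference, a compute_metrics along the following lines would produce the Bleu and Gen Len columns above: BLEU from the sacrebleu metric and Gen Len as the mean generated length in tokens. The actual function used for this run is not documented; this is a sketch of the common pattern.

```python
# Sketch of a Trainer-style compute_metrics for BLEU and generation length.
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "hugo-albert/nllb-200-distilled-600M-finetuned-py2cpp")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels use -100 for padding; restore pad tokens before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    score = bleu.compute(predictions=decoded_preds,
                         references=[[ref] for ref in decoded_labels])
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id)
                       for p in preds])
    return {"bleu": score["score"], "gen_len": gen_len}
```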

Framework versions

  • Transformers 4.33.1
  • Pytorch 2.4.0
  • Datasets 3.0.1
  • Tokenizers 0.13.3