ko_en

This model (ryusangwon/ko_en_nllb-200-distilled-600M-test) is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3202
  • Bleu: 0.3969
  • Gen Len: 26.6275
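
The checkpoint can be loaded with the transformers library like any other NLLB-200 fine-tune. The snippet below is a minimal usage sketch rather than code from the original training setup; the kor_Hang → eng_Latn language codes are assumed from the ko_en naming.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "ryusangwon/ko_en_nllb-200-distilled-600M-test"

# src_lang uses NLLB's language codes; kor_Hang (Korean) is an assumption based on the model name.
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="kor_Hang")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("안녕하세요, 만나서 반갑습니다.", return_tensors="pt")  # "Hello, nice to meet you."
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),  # force English as the target language
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```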

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
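
For reference, the list above roughly corresponds to the Seq2SeqTrainingArguments sketch below. This is a hedged reconstruction, not the original training script; output_dir, the evaluation cadence, and predict_with_generate are assumptions (the results table suggests evaluation every 500 steps).

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ko_en_nllb-200-distilled-600M-test",  # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,       # effective train batch size: 16 * 4 = 64
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10,
    eval_strategy="steps",               # assumption: eval every 500 steps, per the results table
    eval_steps=500,
    predict_with_generate=True,          # assumption: required to compute BLEU / Gen Len during eval
)
```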

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:------:|:-------:|
| 0.6692        | 0.2807 | 500   | 0.5532          | 0.2901 | 26.5817 |
| 0.4137        | 0.5614 | 1000  | 0.3736          | 0.3364 | 26.4784 |
| 0.3748        | 0.8421 | 1500  | 0.3566          | 0.3507 | 26.5661 |
| 0.353         | 1.1224 | 2000  | 0.3484          | 0.3599 | 26.4103 |
| 0.3389        | 1.4031 | 2500  | 0.3415          | 0.3644 | 26.6078 |
| 0.3464        | 1.6838 | 3000  | 0.3362          | 0.3683 | 26.4936 |
| 0.3501        | 1.9645 | 3500  | 0.3310          | 0.375  | 26.6515 |
| 0.3173        | 2.2448 | 4000  | 0.3311          | 0.3729 | 26.4372 |
| 0.3073        | 2.5255 | 4500  | 0.3275          | 0.378  | 26.556  |
| 0.3056        | 2.8062 | 5000  | 0.3243          | 0.3811 | 26.5058 |
| 0.2789        | 3.0865 | 5500  | 0.3244          | 0.3843 | 26.5323 |
| 0.2808        | 3.3672 | 6000  | 0.3229          | 0.3824 | 26.6117 |
| 0.277         | 3.6479 | 6500  | 0.3215          | 0.3857 | 26.4873 |
| 0.2936        | 3.9286 | 7000  | 0.3189          | 0.388  | 26.6207 |
| 0.2641        | 4.2088 | 7500  | 0.3205          | 0.3889 | 26.6148 |
| 0.2675        | 4.4895 | 8000  | 0.3199          | 0.3901 | 26.543  |
| 0.2565        | 4.7702 | 8500  | 0.3170          | 0.392  | 26.5881 |
| 0.2502        | 5.0505 | 9000  | 0.3197          | 0.3919 | 26.6686 |
| 0.2472        | 5.3312 | 9500  | 0.3199          | 0.3921 | 26.6675 |
| 0.2613        | 5.6119 | 10000 | 0.3170          | 0.3918 | 26.5227 |
| 0.2593        | 5.8926 | 10500 | 0.3168          | 0.3952 | 26.6377 |
| 0.2432        | 6.1729 | 11000 | 0.3188          | 0.3938 | 26.5724 |
| 0.2317        | 6.4536 | 11500 | 0.3184          | 0.3934 | 26.6351 |
| 0.2254        | 6.7343 | 12000 | 0.3185          | 0.3943 | 26.6772 |
| 0.2253        | 7.0146 | 12500 | 0.3192          | 0.3966 | 26.6785 |
| 0.2368        | 7.2953 | 13000 | 0.3189          | 0.3959 | 26.6508 |
| 0.2396        | 7.576  | 13500 | 0.3184          | 0.3949 | 26.6651 |
| 0.2233        | 7.8567 | 14000 | 0.3185          | 0.3966 | 26.6405 |
| 0.2289        | 8.1370 | 14500 | 0.3200          | 0.3959 | 26.6969 |
| 0.2322        | 8.4177 | 15000 | 0.3199          | 0.3956 | 26.58   |
| 0.2233        | 8.6984 | 15500 | 0.3195          | 0.3957 | 26.5942 |
| 0.231         | 8.9791 | 16000 | 0.3188          | 0.3977 | 26.6186 |
| 0.2186        | 9.2594 | 16500 | 0.3203          | 0.3964 | 26.6423 |
| 0.2222        | 9.5401 | 17000 | 0.3205          | 0.3967 | 26.632  |
| 0.2196        | 9.8208 | 17500 | 0.3202          | 0.3969 | 26.6275 |
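
The Bleu and Gen Len columns were presumably produced by a compute_metrics callback passed to the trainer. The sketch below shows one common way to wire this up with the evaluate library; the choice of the "bleu" metric (0-1 scale) and the generation-length calculation are assumptions, since the card does not document the evaluation code.

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

# Hypothetical evaluation wiring; the base model's tokenizer and the kor_Hang source code are assumed.
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M", src_lang="kor_Hang")
bleu = evaluate.load("bleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Label padding uses -100; map it back to the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(predictions=decoded_preds,
                          references=[[label] for label in decoded_labels])
    # Gen Len: average number of non-pad tokens in the generated sequences.
    gen_len = np.mean([np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds])
    return {"bleu": result["bleu"], "gen_len": gen_len}
```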

Framework versions

  • Transformers 4.47.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0