ko_en

This model (ryusangwon/ko_en_nllb-200-distilled-600M-test) is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3202
  • Bleu: 0.3969
  • Gen Len: 26.6275
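
The checkpoint can be loaded with the transformers library like any other NLLB-200 fine-tune. The snippet below is a minimal usage sketch rather than code from the original training setup; the kor_Hang → eng_Latn language codes are assumed from the ko_en naming.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "ryusangwon/ko_en_nllb-200-distilled-600M-test"

# src_lang uses NLLB's language codes; kor_Hang (Korean) is an assumption based on the model name.
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="kor_Hang")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("안녕하세요, 만나서 반갑습니다.", return_tensors="pt")  # "Hello, nice to meet you."
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),  # force English as the target language
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```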

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
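
For reference, the list above roughly corresponds to the Seq2SeqTrainingArguments sketch below. This is a hedged reconstruction, not the original training script; output_dir, the evaluation cadence, and predict_with_generate are assumptions (the results table suggests evaluation every 500 steps).

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ko_en_nllb-200-distilled-600M-test",  # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,       # effective train batch size: 16 * 4 = 64
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10,
    eval_strategy="steps",               # assumption: eval every 500 steps, per the results table
    eval_steps=500,
    predict_with_generate=True,          # assumption: required to compute BLEU / Gen Len during eval
)
```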

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:------:|:-----:|:---------------:|:------:|:-------:|
| 0.6692        | 0.2807 | 500   | 0.5532          | 0.2901 | 26.5817 |
| 0.4137        | 0.5614 | 1000  | 0.3736          | 0.3364 | 26.4784 |
| 0.3748        | 0.8421 | 1500  | 0.3566          | 0.3507 | 26.5661 |
| 0.353         | 1.1224 | 2000  | 0.3484          | 0.3599 | 26.4103 |
| 0.3389        | 1.4031 | 2500  | 0.3415          | 0.3644 | 26.6078 |
| 0.3464        | 1.6838 | 3000  | 0.3362          | 0.3683 | 26.4936 |
| 0.3501        | 1.9645 | 3500  | 0.3310          | 0.375  | 26.6515 |
| 0.3173        | 2.2448 | 4000  | 0.3311          | 0.3729 | 26.4372 |
| 0.3073        | 2.5255 | 4500  | 0.3275          | 0.378  | 26.556  |
| 0.3056        | 2.8062 | 5000  | 0.3243          | 0.3811 | 26.5058 |
| 0.2789        | 3.0865 | 5500  | 0.3244          | 0.3843 | 26.5323 |
| 0.2808        | 3.3672 | 6000  | 0.3229          | 0.3824 | 26.6117 |
| 0.277         | 3.6479 | 6500  | 0.3215          | 0.3857 | 26.4873 |
| 0.2936        | 3.9286 | 7000  | 0.3189          | 0.388  | 26.6207 |
| 0.2641        | 4.2088 | 7500  | 0.3205          | 0.3889 | 26.6148 |
| 0.2675        | 4.4895 | 8000  | 0.3199          | 0.3901 | 26.543  |
| 0.2565        | 4.7702 | 8500  | 0.3170          | 0.392  | 26.5881 |
| 0.2502        | 5.0505 | 9000  | 0.3197          | 0.3919 | 26.6686 |
| 0.2472        | 5.3312 | 9500  | 0.3199          | 0.3921 | 26.6675 |
| 0.2613        | 5.6119 | 10000 | 0.3170          | 0.3918 | 26.5227 |
| 0.2593        | 5.8926 | 10500 | 0.3168          | 0.3952 | 26.6377 |
| 0.2432        | 6.1729 | 11000 | 0.3188          | 0.3938 | 26.5724 |
| 0.2317        | 6.4536 | 11500 | 0.3184          | 0.3934 | 26.6351 |
| 0.2254        | 6.7343 | 12000 | 0.3185          | 0.3943 | 26.6772 |
| 0.2253        | 7.0146 | 12500 | 0.3192          | 0.3966 | 26.6785 |
| 0.2368        | 7.2953 | 13000 | 0.3189          | 0.3959 | 26.6508 |
| 0.2396        | 7.576  | 13500 | 0.3184          | 0.3949 | 26.6651 |
| 0.2233        | 7.8567 | 14000 | 0.3185          | 0.3966 | 26.6405 |
| 0.2289        | 8.1370 | 14500 | 0.3200          | 0.3959 | 26.6969 |
| 0.2322        | 8.4177 | 15000 | 0.3199          | 0.3956 | 26.58   |
| 0.2233        | 8.6984 | 15500 | 0.3195          | 0.3957 | 26.5942 |
| 0.231         | 8.9791 | 16000 | 0.3188          | 0.3977 | 26.6186 |
| 0.2186        | 9.2594 | 16500 | 0.3203          | 0.3964 | 26.6423 |
| 0.2222        | 9.5401 | 17000 | 0.3205          | 0.3967 | 26.632  |
| 0.2196        | 9.8208 | 17500 | 0.3202          | 0.3969 | 26.6275 |
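
The Bleu and Gen Len columns were presumably produced by a compute_metrics callback passed to the trainer. The sketch below shows one common way to wire this up with the evaluate library; the choice of the "bleu" metric (0-1 scale) and the generation-length calculation are assumptions, since the card does not document the evaluation code.

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

# Hypothetical evaluation wiring; the base model's tokenizer and the kor_Hang source code are assumed.
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M", src_lang="kor_Hang")
bleu = evaluate.load("bleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Label padding uses -100; map it back to the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(predictions=decoded_preds,
                          references=[[label] for label in decoded_labels])
    # Gen Len: average number of non-pad tokens in the generated sequences.
    gen_len = np.mean([np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds])
    return {"bleu": result["bleu"], "gen_len": gen_len}
```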

Framework versions

  • Transformers 4.47.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0