# nllb-200-distilled-600M_ru_en_finetuned_crystallography
This model is a fine-tuned version of facebook/nllb-200-distilled-600M trained on the ascolda/ru_en_Crystallography_and_Spectroscopy dataset. It achieves the following results on the evaluation set:
- Loss: 0.5602
- BLEU: 56.5855
## Model description
The fine-tuned model yields improved performance on machine translation of domain-specific scientific articles in the crystallography and spectroscopy domain.
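For illustration, here is a minimal translation sketch using the transformers library. The Hub repo id and the example sentence are assumptions, not taken from this card; substitute the actual model path.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical repo id; replace with the actual Hub path of this model.
model_id = "ascolda/nllb-200-distilled-600M_ru_en_finetuned_crystallography"

tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="rus_Cyrl")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative domain sentence: "The crystal structure was refined by the Rietveld method."
text = "Кристаллическая структура была уточнена методом Ритвельда."
inputs = tokenizer(text, return_tensors="pt")

# NLLB checkpoints require forcing the target-language token at the start of generation.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```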
## Metrics used to describe the fine-tuning effect
Below is a comparison of translation quality metrics for the original NLLB model and my fine-tuned version. Evaluation focuses on: (1) general translation quality, (2) quality of translation of domain-specific terminology, and (3) uniformity of translation of domain-specific terms across different contexts.
(1) The general translation quality was evaluated using the BLEU metric.
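The card does not state which BLEU implementation was used; the sketch below uses sacreBLEU, a common choice, with placeholder data standing in for the evaluation set.

```python
import sacrebleu

# Placeholder hypothesis/reference pairs; the real evaluation set comes from
# the ascolda/ru_en_Crystallography_and_Spectroscopy test split.
hypotheses = ["The crystal structure was refined by the Rietveld method."]
references = [["The crystal structure was refined using the Rietveld method."]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")
```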
(2) Term Success Rate. We compared machine-translated terms with their dictionary equivalents by checking, via regular-expression matching, whether the reference terminology translation appears in the model output.
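A possible implementation of this check is sketched below, assuming a glossary that maps Russian terms to their reference English translations. The function, glossary entry, and example sentences are illustrative, not the actual evaluation code.

```python
import re

def term_success_rate(glossary: dict[str, str], sources: list[str], outputs: list[str]) -> float:
    """Fraction of glossary-term occurrences whose reference translation
    appears in the corresponding model output (case-insensitive regex match)."""
    hits = total = 0
    for src, out in zip(sources, outputs):
        for term_ru, term_en in glossary.items():
            if term_ru.lower() in src.lower():  # source sentence contains the term
                total += 1
                if re.search(rf"\b{re.escape(term_en)}\b", out, re.IGNORECASE):
                    hits += 1
    return hits / total if total else 0.0

# Illustrative glossary entry; the real term list is not part of this card.
glossary = {"дифракция": "diffraction"}
print(term_success_rate(glossary,
                        ["Наблюдалась дифракция рентгеновских лучей."],
                        ["X-ray diffraction was observed."]))  # 1.0
```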
(3) Term Consistency. This metric measures whether technical terms are translated uniformly across the entire corpus in different contexts. We aim for high consistency, i.e., few cases where the same term receives multiple different translations within the evaluation dataset.
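The exact consistency formula is not given here; one plausible realisation, sketched below, scores each term by the share of its occurrences that receive its most frequent translation and averages over terms.

```python
from collections import Counter

def term_consistency(observed: dict[str, list[str]]) -> float:
    """observed maps each source term to the translations it received across
    the corpus; a term scores 1.0 when it is always translated the same way."""
    scores = []
    for translations in observed.values():
        if translations:
            top_count = Counter(translations).most_common(1)[0][1]
            scores.append(top_count / len(translations))
    return sum(scores) / len(scores) if scores else 0.0

# Illustrative input: "дифракция" rendered inconsistently across three sentences.
print(term_consistency({"дифракция": ["diffraction", "diffraction", "scattering"]}))  # ≈ 0.667
```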
| Model | BLEU | Term Success Rate | Term Consistency |
|---|---|---|---|
| nllb-200-distilled-600M | 38.19 | 0.246 | 0.199 |
| nllb-200-distilled-600M_ru_en_finetuned_crystallography | 56.59 | 0.573 | 0.740 |