Francesco-A committed on
Commit 23cd165 · 1 Parent(s): 2c6ddbd

Update README.md

Files changed (1)
  1. README.md +10 -11
README.md CHANGED
@@ -26,30 +26,29 @@ model-index:
  value: 52.88529894542656
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # finetuned-kde4-en-to-fr
+ # Model description (finetuned-kde4-en-to-fr)

  This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-fr](https://huggingface.co/Helsinki-NLP/opus-mt-en-fr) on the kde4 dataset.
  It achieves the following results on the evaluation set:
  - Loss: 0.8556
  - Bleu: 52.8853

- ## Model description
-
- More information needed
+ ## Intended uses
+ - Translation of English text to French
+ - Generating coherent and accurate translations in the domain of technical computer science

- ## Intended uses & limitations
-
- More information needed
+ ## Limitations
+ - The model's performance may degrade when translating sentences with complex or domain-specific terminology that was not present in the training data.
+ - It may struggle with idiomatic expressions and cultural nuances that are not captured in the training data.

  ## Training and evaluation data

- More information needed
+ The model was fine-tuned on the KDE4 dataset, which consists of pairs of sentences in English and their French translations. The dataset contains 189,155 pairs for training and 21,018 pairs for validation.

  ## Training procedure

+ The model was trained using the Seq2SeqTrainer API from the 🤗 Transformers library. The training procedure involved tokenizing the input English sentences and target French sentences, preparing the data collation for dynamic batching and fine-tuning the model. The evaluation metric used is *SacreBLEU*.
+
  ### Training hyperparameters

  The following hyperparameters were used during training:
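
Note on the README text added in this commit: the new "Training procedure" paragraph mentions data collation for dynamic batching. The sketch below illustrates that idea in plain Python. It is not the Transformers collator the author used; the function name `pad_batch`, the input pad id `0`, and the toy token ids are assumptions made up for illustration. The only value taken from the library's documented defaults is `-100`, the label padding id that the loss ignores.

```python
from typing import Dict, List

# -100 is the default ignore index of PyTorch's cross-entropy loss and the
# default label padding id of the Transformers seq2seq collator.
LABEL_PAD_ID = -100


def pad_batch(
    examples: List[Dict[str, List[int]]], pad_id: int = 0
) -> Dict[str, List[List[int]]]:
    """Pad one batch to its own longest input and longest label sequence.

    `pad_id=0` is a placeholder; a real tokenizer supplies its own pad id.
    """
    max_in = max(len(ex["input_ids"]) for ex in examples)
    max_lab = max(len(ex["labels"]) for ex in examples)

    batch: Dict[str, List[List[int]]] = {
        "input_ids": [],
        "attention_mask": [],
        "labels": [],
    }
    for ex in examples:
        n_in = len(ex["input_ids"])
        n_lab = len(ex["labels"])
        # Inputs are padded with the pad id and masked out in attention.
        batch["input_ids"].append(ex["input_ids"] + [pad_id] * (max_in - n_in))
        batch["attention_mask"].append([1] * n_in + [0] * (max_in - n_in))
        # Labels are padded with -100 so padded positions do not affect the loss.
        batch["labels"].append(ex["labels"] + [LABEL_PAD_ID] * (max_lab - n_lab))
    return batch


# Toy token ids (hypothetical), standing in for tokenized English/French pairs.
toy = [
    {"input_ids": [5, 6, 7], "labels": [8, 9]},
    {"input_ids": [5], "labels": [8, 9, 10, 11]},
]
batch = pad_batch(toy)
```

Padding per batch rather than to a fixed global length keeps short batches short, which is the usual motivation for dynamic batching.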