---
inference: false
language:
- ja
- en
- de
- is
- zh
- cs
---
# webbigdata/ALMA-7B-Ja

Original ALMA model: [ALMA-7B](https://huggingface.co/haoranxu/ALMA-7B) (26.95GB).  

ALMA-7B-Ja (13.3GB) is a machine translation model trained with ALMA's learning method to translate between Japanese and English.  
The original ALMA-7B supports translation between English and Russian (ru). This model supports Japanese (ja) and English instead of Russian.

Like the original model, this model has been verified to translate between the following language pairs as well. If you need translation for these pairs, however, the original [ALMA-13B model](https://huggingface.co/haoranxu/ALMA-13B) is the better choice.  

German(de) and English(en)  
Chinese(zh) and English(en)  
Icelandic(is) and English(en)  
Czech(cs) and English(en)  


Translating from English (en→xx) BLEU/COMET  

| Models            | de          | cs          | is          | zh          | ru/ja       | Avg.        |
|-------------------|-------------|-------------|-------------|-------------|-------------|-------------|
| NLLB-54B          | 34.50/86.45 | 37.60/90.15 | 24.15/81.76 | 27.38/78.91 | 30.96/87.92 | 30.92/85.04 |
| GPT-3.5-D         | 31.80/85.61 | 31.30/88.57 | 15.90/76.28 | 38.30/85.76 | 27.50/86.74 | 28.96/84.59 |
| ALMA-7B(Original) | 30.31/85.59 | 29.88/89.10 | 25.71/85.52 | 36.87/85.11 | 27.13/86.98 | 29.89/86.49 |
| ALMA-7B-Ja(Ours)  | 23.70/82.04 | 18.58/81.36 | 12.20/71.59 | 29.06/82.45 | 14.82/85.40 | 19.67/80.57 |

Translating to English (xx→en) BLEU/COMET  

| Models            | de          | cs          | is          | zh          | ru/ja       | Avg.        |
|-------------------|-------------|-------------|-------------|-------------|-------------|-------------|
| NLLB-54B          | 26.89/78.94 | 39.11/80.13 | 23.09/71.66 | 16.56/70.70 | 39.11/81.88 | 28.95/76.66 |
| GPT-3.5-D         | 30.90/84.79 | 44.50/86.16 | 31.90/82.13 | 25.00/81.62 | 38.50/84.80 | 34.16/83.90 |
| ALMA-7B(Original) | 30.26/84.00 | 43.91/85.86 | 35.97/86.03 | 23.75/79.85 | 39.37/84.58 | 34.55/84.02 |
| ALMA-7B-Ja(Ours)  | 26.41/83.13 | 34.39/83.50 | 24.77/81.12 | 20.60/78.54 | 15.57/78.61 | 24.35/81.76 |



[Sample Code For Free Colab](https://github.com/webbigdata-jp/python_sample/blob/main/ALMA_7B_Ja_Free_Colab_sample.ipynb)  
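
For a quick local test outside Colab, the following is a minimal sketch using Hugging Face `transformers`. It assumes the model accepts the same prompt format as the original ALMA project (`Translate this from X to Y:` followed by the source and target language labels). The helper names `build_prompt` and `translate` and the generation settings (`num_beams=5`, `max_new_tokens=128`) are illustrative choices, not values taken from this model card. `device_map="auto"` additionally requires the `accelerate` package.

```python
def build_prompt(text, src_lang="Japanese", tgt_lang="English"):
    """Build an ALMA-style prompt (format assumed from the original ALMA project)."""
    return f"Translate this from {src_lang} to {tgt_lang}:\n{src_lang}: {text}\n{tgt_lang}:"


def translate(text, src_lang="Japanese", tgt_lang="English",
              model_id="webbigdata/ALMA-7B-Ja", max_new_tokens=128):
    """Translate one sentence; downloads the model on first use."""
    # Heavy imports stay local so build_prompt works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    prompt = build_prompt(text, src_lang, tgt_lang)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, num_beams=5, do_sample=False
        )

    # Keep only the tokens generated after the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


# Example call (requires a GPU with enough memory for the 16-bit weights):
# print(translate("今日は天気がいいですね。"))
```

Swapping `src_lang` and `tgt_lang` gives the English-to-Japanese direction under the same assumed prompt format.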

There is also a GPTQ-quantized version that reduces model size (3.9GB) and memory usage, although its translation quality is probably lower.  
Its translation ability for languages other than Japanese and English has also deteriorated significantly.  
[webbigdata/ALMA-7B-Ja-GPTQ-Ja-En](https://huggingface.co/webbigdata/ALMA-7B-Ja-GPTQ-Ja-En)  
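
The three sizes quoted on this page (26.95GB, 13.3GB, 3.9GB) are roughly what one would expect from storing the same weights at different precisions. The sketch below is a back-of-the-envelope estimate only. It assumes the base network is LLaMA-2-7B with about 6.74 billion parameters, a figure that is not stated in this model card, and it ignores tokenizer files and other metadata.

```python
# Assumed parameter count of the LLaMA-2-7B base model (not from this model card).
LLAMA2_7B_PARAMS = 6_738_415_616


def weight_size_gb(num_params, bits_per_param):
    """Approximate size of the raw weights in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


for label, bits in [("32-bit", 32), ("16-bit", 16), ("4-bit", 4)]:
    print(f"{label}: {weight_size_gb(LLAMA2_7B_PARAMS, bits):.2f} GB")
```

Under these assumptions the 32-bit and 16-bit estimates land close to the sizes listed for the original model and for ALMA-7B-Ja. A quantized checkpoint typically also stores scaling factors and keeps some layers unquantized, which would explain why the published GPTQ size is above the raw 4-bit estimate.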


**ALMA** (**A**dvanced **L**anguage **M**odel-based tr**A**nslator) is an LLM-based translation model, which adopts a new translation model paradigm: it begins with fine-tuning on monolingual data and is further optimized using high-quality parallel data. This two-step fine-tuning process ensures strong translation performance. 
Please find more details in their [paper](https://arxiv.org/abs/2309.11674).
```
@misc{xu2023paradigm,
      title={A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models}, 
      author={Haoran Xu and Young Jin Kim and Amr Sharaf and Hany Hassan Awadalla},
      year={2023},
      eprint={2309.11674},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```


## About this work
- **This work was done by:** [webbigdata](https://webbigdata.jp/).