---
inference: false
language:
- ja
- en
---
# webbigdata/ALMA-7B-Ja-GPTQ-Ja-En

The original ALMA model, [ALMA-7B](https://huggingface.co/haoranxu/ALMA-7B) (26.95GB), is a new-paradigm translation model.

[ALMA-7B-Ja](https://huggingface.co/webbigdata/ALMA-7B-Ja) (13.3GB) is a machine translation model that uses ALMA's learning method to translate Japanese to English.

This model is a GPTQ-quantized version of ALMA-7B-Ja that reduces model size (3.9GB) and memory usage, although performance is probably somewhat lower. Its translation ability for languages other than Japanese and English has also deteriorated significantly.

[Free Colab Sample](https://github.com/webbigdata-jp/python_sample/blob/main/ALMA_7B_Ja_GPTQ_Ja_En_Free_Colab_sample.ipynb)
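
For quick orientation, here is a minimal usage sketch (the Colab notebook above shows the full flow). It assumes a GPU runtime with `auto-gptq`, `optimum`, and a recent `transformers` installed, and that the repo loads through the standard transformers GPTQ path; the prompt template is an assumption based on the ALMA paper's translation format, not taken from this card.

```python
# Minimal sketch: load the GPTQ model via transformers and translate one sentence.
# Assumes `pip install auto-gptq optimum transformers accelerate` on a GPU runtime.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "webbigdata/ALMA-7B-Ja-GPTQ-Ja-En"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# ALMA-style translation prompt (Japanese -> English); the exact template is
# an assumption based on the ALMA paper, not confirmed by this model card.
prompt = "Translate this from Japanese to English:\nJapanese: 私は機械翻訳が大好きです。\nEnglish:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, i.e. the translation.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
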

**ALMA** (**A**dvanced **L**anguage **M**odel-based tr**A**nslator) is an LLM-based translation model, which adopts a new translation model paradigm: it begins with fine-tuning on monolingual data and is further optimized using high-quality parallel data. This two-step fine-tuning process ensures strong translation performance.
Please find more details in their [paper](https://arxiv.org/abs/2309.11674).
```
@misc{xu2023paradigm,
      title={A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models},
      author={Haoran Xu and Young Jin Kim and Amr Sharaf and Hany Hassan Awadalla},
      year={2023},
      eprint={2309.11674},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

## About this work
- **This work was done by:** [webbigdata](https://webbigdata.jp/).