dahara1 committed
Commit 0fe87f1
Parent: a16b389

Update README.md

Files changed (1): README.md (+9, -4)
README.md CHANGED
@@ -4,11 +4,15 @@ language:
 - ja
 ---
 
-Because Japanese and Chinese were used heavily during quantization, perplexity measured on Japanese data is known to be better than [hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4).
-
-![kaizoku](kaizoku.png)
+This is an AWQ-quantized version of llama3.1-8b.
+If you have more than 4GB of GPU memory, you can run it at high speed.
+
+Because Japanese and Chinese were used heavily during quantization, perplexity measured on Japanese data is known to be better than [hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4).
 
 ```
@@ -48,5 +52,6 @@ inputs = tokenizer.apply_chat_template(
 outputs = model.generate(**inputs, do_sample=True, max_new_tokens=256)
 print(tokenizer.batch_decode(outputs[:, inputs['input_ids'].shape[1]:], skip_special_tokens=True)[0])
 
-```
+```
+
+![kaizoku](kaizoku.png)
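
For context, the diff shows only fragments of the README's usage snippet. Below is a minimal end-to-end sketch consistent with those fragments; the placeholder repo id, the chat-template arguments, and the example prompt are assumptions, and only the last two lines appear verbatim in the diff.

```python
# Hedged usage sketch; only the generate/decode lines are quoted from the diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<this-repo-id>"  # hypothetical placeholder; substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Transformers can load AWQ checkpoints directly when the autoawq package is installed.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # the README says a little over 4GB of GPU memory suffices
)

messages = [{"role": "user", "content": "こんにちは。自己紹介をしてください。"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

# These two lines appear verbatim in the diff context above.
outputs = model.generate(**inputs, do_sample=True, max_new_tokens=256)
print(tokenizer.batch_decode(outputs[:, inputs['input_ids'].shape[1]:], skip_special_tokens=True)[0])
```

The perplexity comparison against hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 could be checked along these lines; the corpus, context length, and single-window measurement are assumptions, since the commit does not describe the methodology.

```python
# Sketch of single-window perplexity; text choice and max_len are assumptions.
import torch

def perplexity(model, tokenizer, text, max_len=2048):
    ids = tokenizer(text, return_tensors="pt").input_ids[:, :max_len].to(model.device)
    with torch.no_grad():
        # Passing labels=ids makes the causal LM return mean token cross-entropy.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

# Lower is better; score the same Japanese text with both models to compare.
```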