bhavyaaiplanet
committed on
Update README.md
README.md
CHANGED
````diff
@@ -30,11 +30,11 @@ effi 7b AWQ is a quantized version of effi 7b which is a 7 billion parameter mo
 ### Quantization Configuration

-- zero_point
-- q_group_size
-- w_bit
-- version
-- modules_to_not_convert
+- **zero_point:** true
+- **q_group_size:** 128
+- **w_bit:** 4
+- **version:** "GEMM"
+- **modules_to_not_convert:** null

@@ -77,8 +77,8 @@ print(f"{tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tok
 ```
 ### Framework versions

-- **Transformers** 4.37.2
-- **Autoawq** 0.1.8
+- **Transformers** 4.37.2
+- **Autoawq** 0.1.8

 ### Citation
````
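The configuration values added to the README map directly onto AutoAWQ's `quant_config` dictionary. A minimal sketch of how a checkpoint could be quantized with these settings under AutoAWQ 0.1.8 (the model and output paths below are hypothetical placeholders, not taken from the model card):

```python
# Quantization config matching the values listed in the README.
quant_config = {
    "zero_point": True,             # asymmetric quantization with zero points
    "q_group_size": 128,            # weights quantized in groups of 128
    "w_bit": 4,                     # 4-bit weights
    "version": "GEMM",              # GEMM kernel variant
    "modules_to_not_convert": None, # quantize every eligible module
}

def quantize_to_awq(model_path: str, quant_path: str) -> None:
    """Quantize a full-precision checkpoint and save the AWQ version.

    `model_path` / `quant_path` are hypothetical; substitute real paths.
    """
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    model.quantize(tokenizer, quant_config=quant_config)
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)
```

The saved directory can then be loaded back with `AutoModelForCausalLM.from_pretrained` in Transformers 4.37.2, as shown in the README's usage snippet.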