abhinavkulkarni committed
Commit 137bf22
1 Parent(s): 911f42b

Update README.md

Files changed (1): README.md (+3, -6)
README.md CHANGED
@@ -24,8 +24,6 @@ Please refer to the AWQ quantization license ([link](https://github.com/llm-awq/
 
 This model was successfully tested on CUDA driver v530.30.02 and runtime v11.7 with Python v3.10.11. Please note that AWQ requires NVIDIA GPUs with compute capability of 80 or higher.
 
-For Docker users, the `nvcr.io/nvidia/pytorch:23.06-py3` image is runtime v12.1 but otherwise the same as the configuration above and has also been verified to work.
-
 ## How to Use
 
 ```bash
@@ -34,7 +32,6 @@ git clone https://github.com/mit-han-lab/llm-awq \
 && git checkout 71d8e68df78de6c0c817b029a568c064bf22132d \
 && pip install -e . \
 && cd awq/kernels \
-&& export TORCH_CUDA_ARCH_LIST='8.0 8.6 8.7 8.9 9.0' \
 && python setup.py install
 ```
 
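Before building the kernels as above, the compute-capability requirement from the first hunk can be checked up front. A minimal sketch (not part of the README) using PyTorch's `torch.cuda.get_device_capability()`:

```python
# Hypothetical pre-flight check, not in the model card: AWQ kernels require
# compute capability 80 (sm_80, i.e. Ampere) or newer.
import torch

major, minor = torch.cuda.get_device_capability()
assert (major, minor) >= (8, 0), (
    f"AWQ needs compute capability >= 8.0, found {major}.{minor}"
)
```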
@@ -51,7 +48,7 @@ model_name = "tiiuae/falcon-7b-instruct"
 config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
 
 # Tokenizer
-tokenizer = AutoTokenizer.from_pretrained(model_name)
+tokenizer = AutoTokenizer.from_pretrained(config.tokenizer_name)
 
 # Model
 w_bit = 4
@@ -60,7 +57,7 @@ q_config = {
     "q_group_size": 64,
 }
 
-load_quant = hf_hub_download('abhinavkulkarni/tiiaue-falcon-7b-instruct-w4-g64-awq', 'pytorch_model.bin')
+load_quant = hf_hub_download('abhinavkulkarni/tiiuae-falcon-7b-instruct-w4-g64-awq', 'pytorch_model.bin')
 
 with init_empty_weights():
     model = AutoModelForCausalLM.from_config(config=config,
@@ -99,7 +96,7 @@ This evaluation was done using [LM-Eval](https://github.com/EleutherAI/lm-evalua
 |        |       |byte_perplexity| 1.6490|   |      |
 |        |       |bits_per_byte  | 0.7216|   |      |
 
-[Falcon-7B-Instruct (4-bit 64-group AWQ)](https://huggingface.co/abhinavkulkarni/falcon-7b-instruct-w4-g64-awq)
+[Falcon-7B-Instruct (4-bit 64-group AWQ)](https://huggingface.co/abhinavkulkarni/tiiuae-falcon-7b-instruct-w4-g64-awq)
 
 | Task   |Version| Metric        | Value |   |Stderr|
 |--------|------:|---------------|------:|---|------|
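As a sanity check on the table above, the two perplexity metrics are related by `bits_per_byte = log2(byte_perplexity)`; a one-line verification in Python:

```python
import math

# log2(1.6490) is about 0.7216, matching the bits_per_byte row above
print(f"{math.log2(1.6490):.4f}")
```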
 