Thanks for this model!

#1
Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -24,6 +24,8 @@ Please refer to the AWQ quantization license ([link](https://github.com/llm-awq/
 
  This model was successfully tested on CUDA driver v530.30.02 and runtime v11.7 with Python v3.10.11. Please note that AWQ requires NVIDIA GPUs with compute capability of 80 or higher.
 
+ For Docker users, the `nvcr.io/nvidia/pytorch:23.06-py3` image ships CUDA runtime v12.1 but otherwise matches the configuration above and has also been verified to work.
+
  ## How to Use
 
  ```bash
@@ -32,6 +34,7 @@ git clone https://github.com/mit-han-lab/llm-awq \
  && git checkout 71d8e68df78de6c0c817b029a568c064bf22132d \
  && pip install -e . \
  && cd awq/kernels \
+ && export TORCH_CUDA_ARCH_LIST='8.0 8.6 8.7 8.9 9.0' \
  && python setup.py install
  ```
 
@@ -48,7 +51,7 @@ model_name = "tiiuae/falcon-7b-instruct"
  config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
 
  # Tokenizer
- tokenizer = AutoTokenizer.from_pretrained(config.tokenizer_name)
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
 
  # Model
  w_bit = 4
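
The README's "compute capability of 80 or higher" refers to compute capability 8.0 (sm_80, i.e. Ampere or newer). A minimal sketch for verifying this on your own machine, assuming `torch` with CUDA support is already installed; this check is not part of the diff above:

```python
# Sanity check: AWQ kernels require compute capability >= 8.0 (sm_80, Ampere+).
import torch

major, minor = torch.cuda.get_device_capability()  # e.g. (8, 6) on an RTX 3090
print(f"Compute capability: {major}.{minor}")
assert (major, minor) >= (8, 0), "AWQ requires compute capability 8.0 or higher"
```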
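The added `TORCH_CUDA_ARCH_LIST` export pins the CUDA architectures the AWQ kernels are compiled for (8.0 through 9.0, matching the requirement above). If you only need kernels for the local GPU, here is a sketch for deriving the value instead of hard-coding the full list; it assumes a single-GPU machine with `torch` installed:

```python
# Print a TORCH_CUDA_ARCH_LIST value for the GPU in device slot 0.
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"export TORCH_CUDA_ARCH_LIST='{major}.{minor}'")  # e.g. 8.6
```

Building for a single architecture shortens compile time; the full list in the diff keeps the resulting extension portable across Ampere, Ada, and Hopper GPUs.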
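The tokenizer change fixes the loading snippet: the Falcon config does not appear to define a `tokenizer_name` attribute, so `config.tokenizer_name` fails, while the model id itself resolves the tokenizer. A minimal sketch of the corrected loading path; the AWQ weight loading from the llm-awq repo is omitted here:

```python
from transformers import AutoConfig, AutoTokenizer

model_name = "tiiuae/falcon-7b-instruct"
config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)

# Load the tokenizer from the model id; the old `config.tokenizer_name`
# relied on an attribute the Falcon config does not define.
tokenizer = AutoTokenizer.from_pretrained(model_name)
```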