abhinavkulkarni committed
Commit c59e771
1 parent: d2be62b

Update README.md

Files changed (1):
  1. README.md (+4, -2)
README.md CHANGED
````diff
@@ -18,7 +18,7 @@ July 5, 2023
 
 ## Model License
 
-Please refer to the original model license ([link](https://huggingface.co/VMware/open-llama-7b-open-instruct)).
+Please refer to the original MPT model license ([link](https://huggingface.co/VMware/open-llama-7b-open-instruct)).
 
 Please refer to the AWQ quantization license ([link](https://github.com/mit-han-lab/llm-awq/blob/main/LICENSE)).
 
@@ -26,6 +26,8 @@ Please refer to the AWQ quantization license ([link](https://github.com/mit-han-lab/llm-awq/blob/main/LICENSE)).
 
 This model was successfully tested on CUDA driver v530.30.02 and runtime v11.7 with Python v3.10.11. Please note that AWQ requires NVIDIA GPUs with compute capability of 8.0 or higher.
 
+For Docker users, the `nvcr.io/nvidia/pytorch:23.06-py3` image ships CUDA runtime v12.1 but otherwise matches the configuration above and has also been verified to work.
+
 ## How to Use
 
 ```bash
@@ -62,7 +64,7 @@ q_config = {
 load_quant = hf_hub_download('abhinavkulkarni/open-llama-7b-open-instruct-w4-g128-awq', 'pytorch_model.bin')
 
 with init_empty_weights():
-    model = AutoModelForCausalLM.from_pretrained(model_name, config=config,
+    model = AutoModelForCausalLM.from_config(config=config,
         torch_dtype=torch.float16, trust_remote_code=True)
 
 real_quantize_model_weight(model, w_bit=w_bit, q_config=q_config, init_only=True)
````
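The compute-capability note in the second hunk can be verified before any install work. A minimal probe, assuming only a CUDA-enabled PyTorch; the 8.0 threshold mirrors the requirement stated above:

```python
import torch

# AWQ's 4-bit kernels need an Ampere-class GPU or newer (compute capability >= 8.0).
assert torch.cuda.is_available(), 'No CUDA device visible'
major, minor = torch.cuda.get_device_capability(0)
if (major, minor) < (8, 0):
    raise RuntimeError(
        f'GPU reports compute capability {major}.{minor}; AWQ needs 8.0 or higher')
print(f'OK: {torch.cuda.get_device_name(0)} (compute capability {major}.{minor})')
```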
 
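The `from_config` change is one line of a longer loading recipe that the hunk shows only in fragments. Below is a sketch of the full flow under stated assumptions: the `awq.quantize.quantizer` import path is taken from the mit-han-lab/llm-awq repo, and the `q_config` values are inferred from the `w4-g128` suffix in the model name; neither appears in this diff.

```python
import torch
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from huggingface_hub import hf_hub_download
from transformers import AutoConfig, AutoModelForCausalLM

# Assumed import path, taken from the mit-han-lab/llm-awq repo.
from awq.quantize.quantizer import real_quantize_model_weight

model_name = 'abhinavkulkarni/open-llama-7b-open-instruct-w4-g128-awq'
config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)

# Inferred from the `w4-g128` suffix: 4-bit weights, quantization group size 128.
w_bit = 4
q_config = {'zero_point': True, 'q_group_size': 128}

# Fetch only the quantized checkpoint; the fp16 original is never downloaded.
load_quant = hf_hub_download(model_name, 'pytorch_model.bin')

# Build a weightless skeleton on the meta device.
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config=config,
        torch_dtype=torch.float16, trust_remote_code=True)

# Swap Linear modules for their 4-bit AWQ equivalents without allocating weights.
real_quantize_model_weight(model, w_bit=w_bit, q_config=q_config, init_only=True)

# Materialize the quantized weights onto the available GPU(s).
model = load_checkpoint_and_dispatch(model, load_quant, device_map='balanced')
model.eval()
```

The swap from `from_pretrained` to `from_config` is the point of the hunk: inside `init_empty_weights()` the model should be a weightless skeleton on the meta device, and `from_pretrained` would pointlessly download the full fp16 checkpoint first.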
 
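Once the checkpoint is dispatched, inference uses the standard `generate` API. A short smoke test reusing `model` from the sketch above; the bare prompt is illustrative, since the upstream instruct model may expect a prompt template that this diff does not show:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    'abhinavkulkarni/open-llama-7b-open-instruct-w4-g128-awq', trust_remote_code=True)

prompt = 'What is instruction tuning?'
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)

# Greedy decoding is enough to smoke-test the quantized weights.
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```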