Abhinav Kulkarni committed
Commit 69f4c6e • Parent(s): 137bf22
Updated README

README.md CHANGED
@@ -27,9 +27,9 @@ This model was successfully tested on CUDA driver v530.30.02 and runtime v11.7 w
 ## How to Use
 
 ```bash
-git clone https://github.com/
+git clone https://github.com/abhinavkulkarni/llm-awq \
 && cd llm-awq \
-&& git checkout
+&& git checkout e977c5a570c5048b67a45b1eb823b81de02d0d60 \
 && pip install -e . \
 && cd awq/kernels \
 && python setup.py install
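The clone now points at the author's fork, pinned to a specific commit so the kernel build is reproducible. As a quick post-install sanity check (a minimal sketch; the module name `awq_inference_engine` is an assumption based on the upstream llm-awq kernel setup, not something this commit confirms), you can try importing the compiled CUDA extension:

```python
# Hypothetical sanity check: confirm the AWQ CUDA kernels built and
# installed correctly. The module name awq_inference_engine is assumed
# from the upstream llm-awq awq/kernels/setup.py, not from this commit.
try:
    import awq_inference_engine
    print("AWQ CUDA kernels are importable")
except ImportError as err:
    print(f"Kernel extension missing or broken: {err}")
```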
@@ -40,9 +40,9 @@ import torch
 from awq.quantize.quantizer import real_quantize_model_weight
 from transformers import AutoModelForCausalLM, AutoConfig, AutoTokenizer
 from accelerate import init_empty_weights, load_checkpoint_and_dispatch
-from huggingface_hub import
+from huggingface_hub import snapshot_download
 
-model_name = "tiiuae
+model_name = "abhinavkulkarni/tiiuae-falcon-7b-instruct-w4-g64-awq"
 
 # Config
 config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
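The import switches to `snapshot_download`, and the model id now points at the prequantized checkpoint on the Hub rather than a `tiiuae` repo. The snippet imports `AutoTokenizer` but the hunk never shows it used; a minimal sketch of the tokenizer setup, assuming the README relies on it later:

```python
# Falcon ships custom modeling/tokenizer code with the repo, so
# trust_remote_code=True is required here as well.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
```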
@@ -57,7 +57,7 @@ q_config = {
     "q_group_size": 64,
 }
 
-load_quant =
+load_quant = snapshot_download(model_name)
 
 with init_empty_weights():
     model = AutoModelForCausalLM.from_config(config=config,
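`load_quant` is now the local folder path returned by `snapshot_download`, which is the form `load_checkpoint_and_dispatch` expects. The last hunk ends mid-call, so here is a sketch of how the pieces presumably fit together after the diff, following the usual llm-awq loading pattern; `w_bit=4` is inferred from the "w4" in the model id, and `device_map="balanced"` is an assumption:

```python
import torch

# Build an empty (meta-device) model skeleton from the config.
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config=config,
                                             torch_dtype=torch.float16,
                                             trust_remote_code=True)

# Swap Linear layers for their 4-bit AWQ counterparts without computing
# any quantization (init_only=True); the real weights come from the Hub.
real_quantize_model_weight(model, w_bit=4, q_config=q_config, init_only=True)

# Load the downloaded quantized weights and spread them across devices.
model = load_checkpoint_and_dispatch(model, load_quant, device_map="balanced")
```

From there, generation works as with any causal LM, e.g. `model.generate(**tokenizer(prompt, return_tensors='pt').to(0))`.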