Update README.md
Browse files
README.md
CHANGED
@@ -5,15 +5,30 @@ tags:
|
|
5 |
- llama3
|
6 |
---
|
7 |
|
8 |
-
This is a model that has been AWQ quantized and converted to run on the NPU installed in the Ryzen AI PC (for example, Ryzen 9 7940HS Processor) (for Windows environment)
|
9 |
|
10 |
-
For information on setting up Ryzen AI for LLMs, see [Running LLM on AMD NPU Hardware](https://www.hackster.io/gharada2013/running-llm-on-amd-npu-hardware-19322f).
|
11 |
|
12 |
-
The following
|
13 |
|
14 |
### setup
|
|
|
15 |
```
|
16 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
17 |
```
|
18 |
|
19 |
### Sample Script
|
@@ -58,8 +73,8 @@ if __name__ == "__main__":
|
|
58 |
p.cpu_affinity([0, 1, 2, 3])
|
59 |
torch.set_num_threads(4)
|
60 |
|
61 |
-
tokenizer = AutoTokenizer.from_pretrained("
|
62 |
-
ckpt = "pytorch_llama3_8b_w_bit_4_awq_lm_amd.pt"
|
63 |
terminators = [
|
64 |
tokenizer.eos_token_id,
|
65 |
tokenizer.convert_tokens_to_ids("<|eot_id|>")
|
|
|
5 |
- llama3
|
6 |
---
|
7 |
|
8 |
+
This is a model that has been AWQ quantized and converted to run on the NPU installed in the Ryzen AI PC (for example, Ryzen 9 7940HS Processor) (for Windows environment)
|
9 |
|
10 |
+
For information on setting up Ryzen AI for LLMs in Windows 11, see [Running LLM on AMD NPU Hardware](https://www.hackster.io/gharada2013/running-llm-on-amd-npu-hardware-19322f).
|
11 |
|
12 |
+
The following sample assumes that the setup on the above page has been completed.
|
13 |
|
14 |
### setup
|
15 |
+
In a cmd window:
|
16 |
```
|
17 |
+
conda activate ryzenai-transformers
|
18 |
+
<your_install_path>\RyzenAI-SW\example\transformers\setup.bat
|
19 |
+
git lfs install
|
20 |
+
git clone https://huggingface.co/dahara1/llama3-8b-amd-npu
|
21 |
+
cd llama3-8b-amd-npu
|
22 |
+
git lfs pull
|
23 |
+
cd ..
|
24 |
+
copy <your_install_path>\RyzenAI-SW\example\transformers\models\llama2\modeling_llama_amd.py .
|
25 |
+
|
26 |
+
# set up Runtime. see [Runtime Setup](https://ryzenai.docs.amd.com/en/latest/runtime_setup.html)
|
27 |
+
set XLNX_VART_FIRMWARE=<your_install_path>\voe-4.0-win_amd64\1x4.xclbin
|
28 |
+
set NUM_OF_DPU_RUNNERS=1
|
29 |
+
|
30 |
+
# save the sample script below as llama3-test.py (UTF-8 encoding)
|
31 |
+
python llama3-test.py
|
32 |
```
|
33 |
|
34 |
### Sample Script
|
|
|
73 |
p.cpu_affinity([0, 1, 2, 3])
|
74 |
torch.set_num_threads(4)
|
75 |
|
76 |
+
tokenizer = AutoTokenizer.from_pretrained("llama3-8b-amd-npu")
|
77 |
+
ckpt = "llama3-8b-amd-npu/pytorch_llama3_8b_w_bit_4_awq_lm_amd.pt"
|
78 |
terminators = [
|
79 |
tokenizer.eos_token_id,
|
80 |
tokenizer.convert_tokens_to_ids("<|eot_id|>")
|