---
license: llama3
language:
- fa
- en
library_name: transformers
tags:
- LLM
- llama-3
- PartAI
- conversational
---

# Model Details

The Dorna models are a family of decoder-only models, specifically trained/fine-tuned on Persian data, developed by [Part AI](https://partdp.ai/). As an initial release, Dorna-Llama3-8B-Instruct, an 8B instruct model from this family, is built on the [Meta Llama 3 Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.

This repo provides the `bf16` model as well as quantized models in the GGUF format, including `Q2_K`, `Q3_K_S`, `Q3_K_M`, `Q3_K_L`, `Q4_0`, `Q4_1`, `Q4_K_S`, `Q4_K_M`, `Q5_0`, `Q5_1`, `Q5_K_S`, `Q5_K_M`, `Q6_K`, and `Q8_0`. [This gist](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9) offers an in-depth report, including several performance charts, comparing the different quantization methods. The table below lists each file with its quantization method, bit width, and approximate memory footprint; a short sketch for choosing among them follows the table.
| Name | Quant Method | Bits | Memory |
|------|--------------|------|--------|
| dorna-llama3-8b-instruct.Q2_K.gguf | Q2_K | 2 | 3.2 GB |
| dorna-llama3-8b-instruct.Q3_K_L.gguf | Q3_K_L | 3 | 4.3 GB |
| dorna-llama3-8b-instruct.Q3_K_M.gguf | Q3_K_M | 3 | 4.1 GB |
| dorna-llama3-8b-instruct.Q3_K_S.gguf | Q3_K_S | 3 | 3.7 GB |
| dorna-llama3-8b-instruct.Q4_0.gguf | Q4_0 | 4 | 4.7 GB |
| dorna-llama3-8b-instruct.Q4_1.gguf | Q4_1 | 4 | 5.2 GB |
| dorna-llama3-8b-instruct.Q4_K_M.gguf | Q4_K_M | 4 | 4.9 GB |
| dorna-llama3-8b-instruct.Q4_K_S.gguf | Q4_K_S | 4 | 4.7 GB |
| dorna-llama3-8b-instruct.Q5_0.gguf | Q5_0 | 5 | 5.6 GB |
| dorna-llama3-8b-instruct.Q5_1.gguf | Q5_1 | 5 | 6.1 GB |
| dorna-llama3-8b-instruct.Q5_K_M.gguf | Q5_K_M | 5 | 5.7 GB |
| dorna-llama3-8b-instruct.Q5_K_S.gguf | Q5_K_S | 5 | 5.6 GB |
| dorna-llama3-8b-instruct.Q6_K.gguf | Q6_K | 6 | 6.6 GB |
| dorna-llama3-8b-instruct.Q8_0.gguf (recommended) | Q8_0 | 8 | 8.5 GB |
| dorna-llama3-8b-instruct.bf16.gguf | None | 16 | 16.2 GB |
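If you are unsure which file to download, a common rule of thumb is to pick the largest quant that fits your available RAM or VRAM with some headroom for the context buffer. The helper below is an illustrative sketch, not part of this repo: the `pick_quant` function and the ~1 GB headroom figure are assumptions for demonstration, and the sizes are copied from the table above.

```python
# Illustrative sketch: pick the largest quant that fits a memory budget.
# File sizes (GB) are taken from the table above; the 1 GB headroom for the
# KV cache and runtime buffers is a rough assumption, not a measured value.
QUANT_SIZES_GB = {
    "Q2_K": 3.2, "Q3_K_S": 3.7, "Q3_K_M": 4.1, "Q3_K_L": 4.3,
    "Q4_0": 4.7, "Q4_K_S": 4.7, "Q4_K_M": 4.9, "Q4_1": 5.2,
    "Q5_0": 5.6, "Q5_K_S": 5.6, "Q5_K_M": 5.7, "Q5_1": 6.1,
    "Q6_K": 6.6, "Q8_0": 8.5, "bf16": 16.2,
}

def pick_quant(budget_gb: float, headroom_gb: float = 1.0) -> str:
    """Return the largest quant whose file plus headroom fits the budget."""
    fitting = {q: s for q, s in QUANT_SIZES_GB.items()
               if s + headroom_gb <= budget_gb}
    if not fitting:
        raise ValueError("No quant fits the given memory budget.")
    # Larger file is used as a proxy for quality; ties break arbitrarily.
    return max(fitting, key=fitting.get)

print(pick_quant(8.0))  # -> "Q6_K" on a machine with ~8 GB free
```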
## Requirements

We recommend using the Python bindings for [`llama.cpp`](https://github.com/ggerganov/llama.cpp), which can be installed with the following command (the leading `!` is for notebook environments; this prebuilt wheel targets Python 3.10 on Linux x86_64):

```bash
!pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.2.78/llama_cpp_python-0.2.78-cp310-cp310-linux_x86_64.whl
```

## How to use

Instead of cloning the repository, which may be inefficient, you can manually download the required GGUF file or use `huggingface-cli` (`pip install huggingface_hub`) as demonstrated below:

```bash
!huggingface-cli login --token $HUGGING_FACE_HUB_TOKEN
!huggingface-cli download PartAI/Dorna-Llama3-8B-Instruct-GGUF dorna-llama3-8b-instruct.Q8_0.gguf --local-dir . --local-dir-use-symlinks False
```

```python
from llama_cpp import Llama

# Load the quantized model. n_gpu_layers=-1 offloads all layers to the GPU
# when a GPU build is available; n_ctx is the context window size in tokens.
llm = Llama(
    model_path="dorna-llama3-8b-instruct.Q8_0.gguf",
    chat_format="llama-3",
    n_gpu_layers=-1,
    n_ctx=2048,
)

messages = [
    {"role": "system", "content": "You are a helpful Persian assistant. Please answer questions in the asked language."},
    # "Is A4 paper bigger, or A5?"
    {"role": "user", "content": "کاغذ A4 بزرگ تر است یا A5؟"},
]

result = llm.create_chat_completion(
    messages=messages,
    top_p=0.85,
    temperature=0.1,
)

print(result)
```

For token-by-token output, a streaming variant of this example is sketched at the end of this card.

## Contact us

If you have any questions regarding this model, you can reach us via the [community](https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/discussions) tab on Hugging Face.
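As a follow-up to the usage example above, here is a minimal streaming sketch. It reuses the `llm` and `messages` objects from the "How to use" section; `stream=True` and the chunk layout follow the llama-cpp-python chat-completion API, while the `max_tokens` cap is an arbitrary value chosen for illustration.

```python
# Minimal streaming sketch: reuses `llm` and `messages` from the example above.
# With stream=True, create_chat_completion yields OpenAI-style chunks whose
# "delta" dicts carry incremental pieces of the assistant's reply.
stream = llm.create_chat_completion(
    messages=messages,
    top_p=0.85,
    temperature=0.1,
    max_tokens=256,  # arbitrary cap for this illustration
    stream=True,
)

for chunk in stream:
    delta = chunk["choices"][0]["delta"]
    # The first chunk carries only the role; later chunks carry content.
    if "content" in delta:
        print(delta["content"], end="", flush=True)
print()
```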