---
license: llama3
language:
- fa
- en
library_name: transformers
tags:
- LLM
- llama-3
- PartAI
- conversational
---

# Model Details

The Dorna models are a family of decoder-only models, specifically trained/fine-tuned on Persian data, developed by [Part AI](https://partdp.ai/). As an initial release, Dorna-Llama3-8B-Instruct, an 8B instruct model from this family, is built on the [Meta Llama 3 Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.

This repo provides the `bf16` model as well as quantized models in the GGUF format, including `Q2_K`, `Q3_K_S`, `Q3_K_M`, `Q3_K_L`, `Q4_0`, `Q4_1`, `Q4_K_S`, `Q4_K_M`, `Q5_0`, `Q5_1`, `Q5_K_S`, `Q5_K_M`, `Q6_K`, and `Q8_0`. [This gist](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9) offers an in-depth report, including several performance charts, comparing the different quantization methods. The table below lists each file with its quantization method, bit width, and approximate memory footprint; a short sketch for choosing among them follows the table.
| Name | Quant Method | Bits | Memory |
|------|--------------|------|--------|
| dorna-llama3-8b-instruct.Q2_K.gguf | Q2_K | 2 | 3.2 GB |
| dorna-llama3-8b-instruct.Q3_K_L.gguf | Q3_K_L | 3 | 4.3 GB |
| dorna-llama3-8b-instruct.Q3_K_M.gguf | Q3_K_M | 3 | 4.1 GB |
| dorna-llama3-8b-instruct.Q3_K_S.gguf | Q3_K_S | 3 | 3.7 GB |
| dorna-llama3-8b-instruct.Q4_0.gguf | Q4_0 | 4 | 4.7 GB |
| dorna-llama3-8b-instruct.Q4_1.gguf | Q4_1 | 4 | 5.2 GB |
| dorna-llama3-8b-instruct.Q4_K_M.gguf | Q4_K_M | 4 | 4.9 GB |
| dorna-llama3-8b-instruct.Q4_K_S.gguf | Q4_K_S | 4 | 4.7 GB |
| dorna-llama3-8b-instruct.Q5_0.gguf | Q5_0 | 5 | 5.6 GB |
| dorna-llama3-8b-instruct.Q5_1.gguf | Q5_1 | 5 | 6.1 GB |
| dorna-llama3-8b-instruct.Q5_K_M.gguf | Q5_K_M | 5 | 5.7 GB |
| dorna-llama3-8b-instruct.Q5_K_S.gguf | Q5_K_S | 5 | 5.6 GB |
| dorna-llama3-8b-instruct.Q6_K.gguf | Q6_K | 6 | 6.6 GB |
| dorna-llama3-8b-instruct.Q8_0.gguf (recommended) | Q8_0 | 8 | 8.5 GB |
| dorna-llama3-8b-instruct.bf16.gguf | None | 16 | 16.2 GB |
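If you are unsure which file to download, a common rule of thumb is to pick the largest quant that fits your available RAM or VRAM with some headroom for the context buffer. The helper below is an illustrative sketch, not part of this repo: the `pick_quant` function and the ~1 GB headroom figure are assumptions for demonstration, and the sizes are copied from the table above.

```python
# Illustrative sketch: pick the largest quant that fits a memory budget.
# File sizes (GB) are taken from the table above; the 1 GB headroom for the
# KV cache and runtime buffers is a rough assumption, not a measured value.
QUANT_SIZES_GB = {
    "Q2_K": 3.2, "Q3_K_S": 3.7, "Q3_K_M": 4.1, "Q3_K_L": 4.3,
    "Q4_0": 4.7, "Q4_K_S": 4.7, "Q4_K_M": 4.9, "Q4_1": 5.2,
    "Q5_0": 5.6, "Q5_K_S": 5.6, "Q5_K_M": 5.7, "Q5_1": 6.1,
    "Q6_K": 6.6, "Q8_0": 8.5, "bf16": 16.2,
}

def pick_quant(budget_gb: float, headroom_gb: float = 1.0) -> str:
    """Return the largest quant whose file plus headroom fits the budget."""
    fitting = {q: s for q, s in QUANT_SIZES_GB.items()
               if s + headroom_gb <= budget_gb}
    if not fitting:
        raise ValueError("No quant fits the given memory budget.")
    # Larger file is used as a proxy for quality; ties break arbitrarily.
    return max(fitting, key=fitting.get)

print(pick_quant(8.0))  # -> "Q6_K" on a machine with ~8 GB free
```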
## Requirements

We recommend using the Python bindings for [`llama.cpp`](https://github.com/ggerganov/llama.cpp), which can be installed with the following command (the leading `!` is for notebook environments; this prebuilt wheel targets Python 3.10 on Linux x86_64):

```bash
!pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.2.78/llama_cpp_python-0.2.78-cp310-cp310-linux_x86_64.whl
```

## How to use

Instead of cloning the repository, which may be inefficient, you can manually download the required GGUF file or use `huggingface-cli` (`pip install huggingface_hub`) as demonstrated below:

```bash
!huggingface-cli login --token $HUGGING_FACE_HUB_TOKEN
!huggingface-cli download PartAI/Dorna-Llama3-8B-Instruct-GGUF dorna-llama3-8b-instruct.Q8_0.gguf --local-dir . --local-dir-use-symlinks False
```

```python
from llama_cpp import Llama

# Load the quantized model. n_gpu_layers=-1 offloads all layers to the GPU
# when a GPU build is available; n_ctx is the context window size in tokens.
llm = Llama(
    model_path="dorna-llama3-8b-instruct.Q8_0.gguf",
    chat_format="llama-3",
    n_gpu_layers=-1,
    n_ctx=2048,
)

messages = [
    {"role": "system", "content": "You are a helpful Persian assistant. Please answer questions in the asked language."},
    # "Is A4 paper bigger, or A5?"
    {"role": "user", "content": "کاغذ A4 بزرگ تر است یا A5؟"},
]

result = llm.create_chat_completion(
    messages=messages,
    top_p=0.85,
    temperature=0.1,
)

print(result)
```

For token-by-token output, a streaming variant of this example is sketched at the end of this card.

## Contact us

If you have any questions regarding this model, you can reach us via the [community](https://huggingface.co/PartAI/Dorna-Llama3-8B-Instruct-GGUF/discussions) tab on Hugging Face.
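As a follow-up to the usage example above, here is a minimal streaming sketch. It reuses the `llm` and `messages` objects from the "How to use" section; `stream=True` and the chunk layout follow the llama-cpp-python chat-completion API, while the `max_tokens` cap is an arbitrary value chosen for illustration.

```python
# Minimal streaming sketch: reuses `llm` and `messages` from the example above.
# With stream=True, create_chat_completion yields OpenAI-style chunks whose
# "delta" dicts carry incremental pieces of the assistant's reply.
stream = llm.create_chat_completion(
    messages=messages,
    top_p=0.85,
    temperature=0.1,
    max_tokens=256,  # arbitrary cap for this illustration
    stream=True,
)

for chunk in stream:
    delta = chunk["choices"][0]["delta"]
    # The first chunk carries only the role; later chunks carry content.
    if "content" in delta:
        print(delta["content"], end="", flush=True)
print()
```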