metadata

license: llama3
language:
  - fa
  - en
library_name: transformers
tags:
  - LLM
  - llama-3
  - PartAI
  - conversational

Model Details

The Dorna models are a family of decoder-only models, specifically trained/fine-tuned on Persian data, developed by Part AI. As an initial release, we provide an 8B instruct model from this family: Dorna-Llama3-8B-Instruct, which is built on the Meta Llama 3 Instruct model.

In this repo, we provide the bf16 model and quantized models in GGUF format, including Q2_K, Q3_K, Q3_K_L, Q3_K_M, Q3_K_S, Q4_0, Q4_1, Q4_K_M, Q4_K_S, Q5_0, Q5_1, Q5_K_M, Q5_K_S, Q6_K and Q8_0.

An in-depth report with several performance charts is also available. Check it out.

Name	Quant Method	Bits	Memory
dorna-llama3-8b-instruct.Q2_K.gguf	Q2_K	2	3.2 GB
dorna-llama3-8b-instruct.Q3_K_L.gguf	Q3_K_L	3	4.3 GB
dorna-llama3-8b-instruct.Q3_K_M.gguf	Q3_K_M	3	4.1 GB
dorna-llama3-8b-instruct.Q3_K_S.gguf	Q3_K_S	3	3.7 GB
dorna-llama3-8b-instruct.Q4_0.gguf	Q4_0	4	4.7 GB
dorna-llama3-8b-instruct.Q4_1.gguf	Q4_1	4	5.2 GB
dorna-llama3-8b-instruct.Q4_K_M.gguf	Q4_K_M	4	4.9 GB
dorna-llama3-8b-instruct.Q4_K_S.gguf	Q4_K_S	4	4.7 GB
dorna-llama3-8b-instruct.Q5_0.gguf	Q5_0	5	5.6 GB
dorna-llama3-8b-instruct.Q5_1.gguf	Q5_1	5	6.1 GB
dorna-llama3-8b-instruct.Q5_K_M.gguf	Q5_K_M	5	5.73 GB
dorna-llama3-8b-instruct.Q5_K_S.gguf	Q5_K_S	5	5.6 GB
dorna-llama3-8b-instruct.Q6_K.gguf	Q6_K	6	6.6 GB
dorna-llama3-8b-instruct.Q8_0.gguf (recommended)	Q8_0	8	8.5 GB
dorna-llama3-8b-instruct.bf16.gguf	None	16	16.2 GB
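If you are unsure which file to pick, one simple heuristic is to take the largest variant whose listed memory figure fits your budget. The sketch below hard-codes the figures from the table above; `pick_quant` and `MEMORY_GB` are hypothetical names used only for illustration, and the heuristic ignores the extra memory needed for the context (KV cache), so leave some headroom.

```python
from typing import Optional

# Memory figures (GB) copied from the table above.
MEMORY_GB = {
    "Q2_K": 3.2,
    "Q3_K_L": 4.3,
    "Q3_K_M": 4.1,
    "Q3_K_S": 3.7,
    "Q4_0": 4.7,
    "Q4_1": 5.2,
    "Q4_K_M": 4.9,
    "Q4_K_S": 4.7,
    "Q5_0": 5.6,
    "Q5_1": 6.1,
    "Q5_K_M": 5.73,
    "Q5_K_S": 5.6,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
    "bf16": 16.2,
}


def pick_quant(budget_gb: float) -> Optional[str]:
    """Return the variant with the largest listed memory that fits the budget.

    Returns None when even the smallest variant does not fit.
    """
    fitting = {name: gb for name, gb in MEMORY_GB.items() if gb <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for budget in (3.0, 6.0, 9.0):
        print(budget, "GB ->", pick_quant(budget))
```

Using listed memory as a proxy for quality is an assumption: a larger file generally keeps more precision, but the report mentioned above is the better guide for the actual quality trade-offs.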

Requirements

We recommend the Python bindings for llama.cpp (llama-cpp-python). The prebuilt wheel below targets CPython 3.10 on Linux x86_64; install it with the following command:

!pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.2.78/llama_cpp_python-0.2.78-cp310-cp310-linux_x86_64.whl
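The wheel filename carries the tags `cp310` and `linux_x86_64`, so it only installs on a matching interpreter. As a rough sketch, the check below compares the running interpreter against those tags before you install; `wheel_matches` is a hypothetical helper, not part of any library.

```python
import platform
import sys


def wheel_matches(py_version, system, machine):
    """Return True if the interpreter matches the cp310 / linux_x86_64 wheel tags."""
    return (
        tuple(py_version[:2]) == (3, 10)
        and system == "Linux"
        and machine == "x86_64"
    )


if __name__ == "__main__":
    ok = wheel_matches(sys.version_info, platform.system(), platform.machine())
    if ok:
        print("The prebuilt wheel matches this interpreter.")
    else:
        # Assumption: installing from PyPI is an acceptable fallback;
        # it builds llama.cpp from source and needs a C/C++ toolchain.
        print("No match; consider `pip install llama-cpp-python` instead.")
```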

How to use

Cloning the repository downloads every GGUF file, which is inefficient. Instead, download only the GGUF file you need, either manually or with huggingface-cli (pip install huggingface_hub), as shown below:

!huggingface-cli login --token $HUGGING_FACE_HUB_TOKEN
!huggingface-cli download PartAI/Dorna-Llama3-8B-Instruct-GGUF dorna-llama3-8b-instruct.Q8_0.gguf --local-dir . --local-dir-use-symlinks False
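The same download can also be done from Python with `hf_hub_download` from the `huggingface_hub` package. This is a minimal sketch: `gguf_filename` and `download_gguf` are hypothetical helpers, the filename pattern is inferred from the table above, and the call assumes you are already logged in or that the repository is publicly accessible.

```python
REPO_ID = "PartAI/Dorna-Llama3-8B-Instruct-GGUF"


def gguf_filename(quant: str) -> str:
    """Build the file name for a variant, following the naming in the table."""
    return f"dorna-llama3-8b-instruct.{quant}.gguf"


def download_gguf(quant: str = "Q8_0", local_dir: str = ".") -> str:
    """Download one GGUF file and return its local path."""
    # Imported lazily so the helpers above work without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=gguf_filename(quant),
        local_dir=local_dir,
    )


if __name__ == "__main__":
    print(download_gguf("Q8_0"))
```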

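Then load the model and run a chat completion:

from llama_cpp import Llama

# Load the GGUF file downloaded in the previous step.
llm = Llama(
    model_path="dorna-llama3-8b-instruct.Q8_0.gguf",
    chat_format="llama-3",  # apply the Llama 3 chat template
    n_gpu_layers=-1,        # offload all layers to the GPU
    n_ctx=2048,             # context window, in tokens
)

messages = [
    {"role": "system", "content": "You are a helpful Persian assistant. Please answer questions in the asked language."},
    # Persian prompt; in English: "Is A4 paper bigger, or A5?"
    {"role": "user", "content": "کاغذ A4 بزرگ تر است یا A5؟"},
]

# A low temperature keeps the answer close to deterministic.
result = llm.create_chat_completion(
    messages=messages,
    top_p=0.85,
    temperature=0.1,
)

print(result)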
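`create_chat_completion` returns an OpenAI-style dictionary, so printing `result` shows the full response including metadata. To get only the assistant's text, index into the first choice. The sketch below uses a hand-written stand-in dictionary with a placeholder reply (not real model output) so it runs without loading the model; `extract_reply` is a hypothetical helper.

```python
def extract_reply(result: dict) -> str:
    """Return the assistant's text from a chat completion response."""
    return result["choices"][0]["message"]["content"]


# Hand-written stand-in with the same shape as a chat completion response.
example = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "placeholder reply"},
            "finish_reason": "stop",
        }
    ]
}

print(extract_reply(example))
```

With the real model, `extract_reply(result)` replaces `print(result)` when only the answer text is needed.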

Contact us

If you have any questions regarding this model, you can reach us via the community on Hugging Face.