---
language: en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- ruslanmv
- llama
- trl
base_model: meta-llama/Meta-Llama-3-8B
datasets:
- ruslanmv/ai-medical-chatbot
---

# Medical-Llama3-8B-GGUF

[![](future.jpg)](https://ruslanmv.com/)

This is a fine-tuned version of the Llama3 8B model, specifically designed to answer medical questions. The model was trained on the AI Medical Chatbot dataset, which can be found at [ruslanmv/ai-medical-chatbot](https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot). It is distributed in the GGUF format for efficient quantized inference with llama.cpp.

**Model:** [ruslanmv/Medical-Llama3-8B-GGUF](https://huggingface.co/ruslanmv/Medical-Llama3-8B-GGUF)

- **Developed by:** ruslanmv
- **License:** apache-2.0
- **Finetuned from model:** meta-llama/Meta-Llama-3-8B

## Installation

**Prerequisites:**

- A system with CUDA support is highly recommended for optimal performance.
- Python 3.10 or later

1. **Install the required Python libraries** (in a Jupyter/Colab notebook):

```bash
# Build llama-cpp-python with CUDA (cuBLAS) support
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose
```

```bash
%%capture
!pip install huggingface-hub hf-transfer
```

2. **Download the quantized model:**

```python
import os

# Enable the faster hf-transfer download backend
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

!huggingface-cli download \
  ruslanmv/Medical-Llama3-8B-GGUF \
  medical-llama3-8b.Q5_K_M.gguf \
  --local-dir . \
  --local-dir-use-symlinks False

MODEL_PATH = "/content/medical-llama3-8b.Q5_K_M.gguf"
```

## Example of use

Here's an example of how to use the quantized Medical-Llama3-8B-GGUF model to generate an answer to a medical question:

```python
from llama_cpp import Llama

# Llama instruction-prompt markers
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

DEFAULT_SYSTEM_PROMPT = """\
You are an AI Medical Chatbot Assistant, equipped with a wealth of medical knowledge derived from extensive datasets. You aim to provide comprehensive and informative responses while keeping answers concise. However, please note that your responses should not replace professional medical advice. If a question does not make any sense, or is not factually coherent, explain why instead of answering something incorrect. If you don't know the answer to a question, please don't share false information."""

SYSTEM_PROMPT = B_SYS + DEFAULT_SYSTEM_PROMPT + E_SYS

def create_prompt(user_query):
    instruction = f"User asks: {user_query}\n"
    prompt = B_INST + SYSTEM_PROMPT + instruction + E_INST
    return prompt.strip()

user_query = "I'm a 35-year-old male experiencing symptoms like fatigue, increased sensitivity to cold, and dry, itchy skin. Could these be indicative of hypothyroidism?"
prompt = create_prompt(user_query)
print(prompt)

# Load the model; n_gpu_layers=-1 offloads all layers to the GPU
llm = Llama(model_path=MODEL_PATH, n_gpu_layers=-1)

result = llm(
    prompt=prompt,
    max_tokens=100,
    echo=False
)
print(result['choices'][0]['text'])
```

Example output:

```
Hi, thank you for your query. Hypothyroidism is characterized by fatigue, sensitivity to cold, weight gain, depression, hair loss and mental dullness. I would suggest that you get a complete blood count with thyroid profile including TSH (thyroid stimulating hormone), free thyroxine level, and anti-thyroglobulin antibodies. These tests will help in establishing the diagnosis of hypothyroidism.
If there is no family history of autoimmune disorders, then it might be due
```

## License

This model is licensed under the Apache License 2.0. You can find the full license in the LICENSE file.
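## Helper function

For repeated queries, the prompt construction and generation call shown in the usage example can be wrapped in a small helper. This is a minimal sketch: the `askme` name, the shortened system prompt, and the default `max_tokens` value are illustrative choices, not part of the model release.

```python
# Llama instruction-prompt markers, as used in the usage example above
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

# Shortened system prompt for illustration; substitute the full prompt from the example
SYSTEM_PROMPT = B_SYS + "You are an AI Medical Chatbot Assistant." + E_SYS

def create_prompt(user_query: str) -> str:
    """Build a complete [INST]...[/INST] prompt around the user's question."""
    instruction = f"User asks: {user_query}\n"
    return (B_INST + SYSTEM_PROMPT + instruction + E_INST).strip()

def askme(llm, user_query: str, max_tokens: int = 256) -> str:
    """Generate an answer with an already-loaded llama_cpp.Llama instance."""
    result = llm(prompt=create_prompt(user_query), max_tokens=max_tokens, echo=False)
    return result["choices"][0]["text"].strip()
```

With a loaded model, a call would then look like `answer = askme(llm, "Is fatigue a symptom of anemia?")`.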