ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge
Yunxiang Li1, Zihan Li2, Kai Zhang3, Ruilong Dan4, You Zhang1
1 University of Texas Southwestern Medical Center, Dallas, USA
2 University of Illinois at Urbana-Champaign, Urbana, USA
3 Ohio State University, Columbus, USA
4 Hangzhou Dianzi University, Hangzhou, China
News
We uploaded a larger training dataset, InstructorDoctor-200k.
Overview
ChatDoctor is a next-generation AI doctor model based on the LLaMA model. The goal of this project is to give patients an intelligent and reliable healthcare companion that can answer their medical queries and offer personalized medical advice.
ChatDoctor is a language model designed specifically for medical applications. It has been fine-tuned on medical-domain data so that it can work with medical terminology, procedures, and diagnoses. This lets it analyze patients' symptoms and medical history, suggest likely diagnoses, and propose appropriate treatment options.
The ChatDoctor model is designed to simulate a conversation between a doctor and a patient, using natural language processing (NLP) and machine learning techniques. Patients can interact with the ChatDoctor model through a chat interface, asking questions about their health, symptoms, or medical conditions. The model will then analyze the input and provide a response that is tailored to the patient's unique situation.
One of the key features of the ChatDoctor model is its ability to learn and adapt over time. As more patients interact with the model, it continues to refine its responses and improve its accuracy, so patients can expect increasingly personalized and accurate medical advice.
Setup:
In a conda environment with PyTorch installed, run:
pip install -r requirements.txt
Examples:
Below is an example of the medical conversational task.
### Patient: Doctor, I've been experiencing some symptoms like diminished hearing, facial pain, dizziness, diminished vision, and focal weakness. What could be the possible cause?
### ChatDoctor: It sounds like you may have suffered from trauma to your head, which can result in these various symptoms. We will need to run some tests to determine the extent of the injury and how best to treat it.
### Patient: What medical tests do I need to have?
### ChatDoctor: We will need to perform a radiographic imaging procedure, hematologic tests, complete blood count, intravenous fluid replacement, X-ray computed tomography scan (CT scan), glucose measurement (glucose level), and a head CT scan to assess the extent of the injury and determine the appropriate treatment plan.
### Patient: Do you have any recommendations for medications?
### ChatDoctor: Yes, we will prescribe propofol to help you sleep through the procedures, midazolam (Versed) to help relax you, and meperidine (Demerol) to manage your pain. We will also administer sodium bicarbonate to maintain your body pH levels and aminophylline to counteract any respiratory issues caused by the sedatives.
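The conversation above uses the turn markers "### Patient:" and "### ChatDoctor:". The following is a minimal sketch of how a multi-turn prompt in that layout could be assembled. The function name `build_prompt` and the choice to end the prompt with an open "### ChatDoctor:" marker are assumptions made for illustration; the exact template used by the project's scripts may differ.

```python
from typing import List, Tuple

# Turn markers copied from the example conversation above.
PATIENT_TAG = "### Patient:"
DOCTOR_TAG = "### ChatDoctor:"


def build_prompt(history: List[Tuple[str, str]], question: str) -> str:
    """Render past (patient, doctor) turns plus a new question as one prompt.

    Hypothetical helper: the trailing doctor tag is left open so that a
    model could continue the text with its answer.
    """
    lines = []
    for patient_msg, doctor_msg in history:
        lines.append(f"{PATIENT_TAG} {patient_msg}")
        lines.append(f"{DOCTOR_TAG} {doctor_msg}")
    lines.append(f"{PATIENT_TAG} {question}")
    lines.append(DOCTOR_TAG)
    return "\n".join(lines)


if __name__ == "__main__":
    past_turns = [("I have had a headache for two days.", "Have you had any fever?")]
    print(build_prompt(past_turns, "No fever, but I feel dizzy."))
```

Keeping earlier turns in the prompt is what lets follow-up questions such as "What medical tests do I need to have?" be answered in context.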
Data and model:
1. ChatDoctor Training Dataset:
You can download the following training datasets:
InstructorDoctor-5K: link
InstructorDoctor-200k: link
InstructorDoctor-200k is based on "MedDialog: a large-scale medical dialogue dataset".
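Before fine-tuning, it can help to inspect a downloaded dataset file. The sketch below assumes the file (for example ./chatdoctor5k.json, the path used in the fine-tuning command) is a top-level JSON list of objects. That layout is an assumption, not something stated here, and the helper names `load_records` and `summarize` are hypothetical. No field names are assumed; the code only reports what it finds.

```python
import json
from collections import Counter
from pathlib import Path


def load_records(path):
    """Load a dataset file, assuming it holds a top-level JSON list."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON list of records")
    return data


def summarize(records):
    """Count records and how often each field name appears."""
    field_counts = Counter()
    for record in records:
        if isinstance(record, dict):
            field_counts.update(record.keys())
    return {"num_records": len(records), "field_counts": dict(field_counts)}


if __name__ == "__main__":
    # Path taken from the fine-tuning command; adjust to your download location.
    records = load_records("./chatdoctor5k.json")
    print(summarize(records))
```

If every field count equals the number of records, the file is uniformly structured; a mismatch points to records with missing fields.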
2. Model Weights:
You can download the model weights (.bin file) here: ChatDoctor Version 1.0: link
Please note that this is only the first version of our model.
3. Model Preparation:
Download the ChatDoctor Version 1.0 model weights and place the weights file in the ./result folder.
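A quick way to confirm that the weights landed in the right place is to list the .bin files in ./result. This is a minimal sketch: the helper name `find_weight_files` is hypothetical, and since the weights file name is not given here, the code matches on the .bin extension only.

```python
from pathlib import Path


def find_weight_files(result_dir="./result"):
    """Return the .bin files found directly inside result_dir, sorted by name."""
    folder = Path(result_dir)
    if not folder.is_dir():
        return []
    return sorted(p for p in folder.iterdir() if p.suffix == ".bin")


if __name__ == "__main__":
    files = find_weight_files()
    if files:
        for path in files:
            size_gb = path.stat().st_size / 1e9
            print(f"{path} ({size_gb:.2f} GB)")
    else:
        print("No .bin files found in ./result")
```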
How to fine-tune
torchrun --nproc_per_node=4 --master_port=<your_random_port> train.py \
--model_name_or_path <your_path_to_hf_converted_llama_ckpt_and_tokenizer> \
--data_path ./chatdoctor5k.json \
--bf16 True \
--output_dir pretrained \
--num_train_epochs 3 \
--per_device_train_batch_size 4 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 8 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 2000 \
--save_total_limit 1 \
--learning_rate 2e-5 \
--weight_decay 0. \
--warmup_ratio 0.03 \
--lr_scheduler_type "cosine" \
--logging_steps 1 \
--fsdp "full_shard auto_wrap" \
--fsdp_transformer_layer_cls_to_wrap 'LLaMADecoderLayer' \
--tf32 True
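The flags above together determine the effective (global) batch size: 4 processes (--nproc_per_node) times 4 examples per device times 8 gradient accumulation steps. The sketch below works through that arithmetic and gives a rough estimate of the number of optimizer steps and warmup steps. The dataset size of 5000 is a hypothetical figure used only for illustration, and the step counts are an approximation rather than the exact values the trainer will report.

```python
import math


def effective_batch_size(num_gpus, per_device_batch, grad_accum_steps):
    """Examples consumed per optimizer step across all devices."""
    return num_gpus * per_device_batch * grad_accum_steps


def optimizer_steps(num_examples, global_batch, num_epochs):
    """Rough estimate of total optimizer steps (last partial batch rounded up)."""
    return math.ceil(num_examples / global_batch) * num_epochs


def warmup_steps(total_steps, warmup_ratio):
    """Approximate number of warmup steps implied by a warmup ratio."""
    return math.ceil(total_steps * warmup_ratio)


# Values from the command above: 4 GPUs, batch size 4, accumulation 8.
global_batch = effective_batch_size(num_gpus=4, per_device_batch=4, grad_accum_steps=8)  # 4 * 4 * 8 = 128

if __name__ == "__main__":
    hypothetical_num_examples = 5000  # assumption, not the actual dataset size
    total = optimizer_steps(hypothetical_num_examples, global_batch, num_epochs=3)
    print("global batch size:", global_batch)
    print("estimated optimizer steps:", total)
    print("estimated warmup steps:", warmup_steps(total, warmup_ratio=0.03))
```

If you train on fewer GPUs, raising --gradient_accumulation_steps by the same factor keeps the global batch size unchanged.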
How to run inference
You can run a ChatDoctor model on your own machine and chat with it:
python chat.py
Reference
ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge
@misc{yunxiang2023chatdoctor,
title={ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge},
author={Li Yunxiang and Li Zihan and Zhang Kai and Dan Ruilong and Zhang You},
year={2023},
eprint={2303.14070},
archivePrefix={arXiv},
primaryClass={cs.CL}
}