# Mistral-7B Fine-Tuned on Synthetic Medical QA with LoRA (8-bit Quantization)

This repository contains a fine-tuned version of the Mistral-7B model for medical question answering, trained on the SyntheticMedicalQA-4336 dataset. Fine-tuning uses Low-Rank Adaptation (LoRA) with 8-bit quantization for memory efficiency, implemented with the PEFT (Parameter-Efficient Fine-Tuning) library.

## Model Overview

- Base Model: alecocc/mistral-7B-SFT-medqa-graph-cot
- Dataset: SyntheticMedicalQA-4336
- Fine-Tuning Method: LoRA (Low-Rank Adaptation)
- Quantization: 8-bit (via BitsAndBytesConfig)
- Task: Medical Question Answering
- Libraries Used: Transformers, PEFT, and BitsAndBytes

## Features

1. Memory Efficiency: Uses 8-bit quantization to reduce memory consumption while maintaining performance.
2. Fine-Tuned for Medical QA: Specifically optimized to answer medical questions using a synthetic dataset.
3. LoRA Implementation: Applies LoRA so that fine-tuning updates only small adapter matrices rather than the full set of base-model weights.
4. Optimized Training Techniques:
   - Gradient checkpointing
   - Gradient accumulation
   - AdamW optimizer for stable updates

## Dataset Details

The model is trained on the SyntheticMedicalQA-4336 dataset, which contains synthetic medical questions and answers. The dataset is processed as follows (a preprocessing sketch is shown after the list):

- Questions are mapped to input_text.
- Responses are mapped to output_text.
- The dataset is split into 80% training and 20% validation.
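
A minimal preprocessing sketch using the `datasets` library; the Hub path and the raw column names `question`/`response` are assumptions and should be adjusted to the actual dataset schema:

```python
from datasets import load_dataset

# NOTE: the dataset path and raw column names below are assumptions for illustration
dataset = load_dataset("your_username/SyntheticMedicalQA-4336", split="train")

# Map the raw fields to the columns used during fine-tuning
dataset = dataset.map(
    lambda example: {"input_text": example["question"], "output_text": example["response"]}
)

# 80% training / 20% validation split
splits = dataset.train_test_split(test_size=0.2, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
```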

## Training Configuration

### Tokenizer

The tokenizer is initialized using the base model:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("alecocc/mistral-7B-SFT-medqa-graph-cot", trust_remote_code=True)
# Mistral has no dedicated pad token, so padding reuses the EOS token
tokenizer.pad_token = tokenizer.eos_token
```
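
The exact preprocessing function is not shown in this card; below is a minimal tokenization sketch that reuses the question/answer prompt template from the generation example and the `input_text`/`output_text` columns defined above (the helper name and `max_length` are assumptions):

```python
def tokenize_example(example, max_length=512):
    # Concatenate question and answer into a single causal-LM training string
    text = (
        f"### Question:\n{example['input_text']}\n\n"
        f"### Answer:\n{example['output_text']}{tokenizer.eos_token}"
    )
    tokens = tokenizer(text, truncation=True, max_length=max_length, padding="max_length")
    # For simplicity the labels mirror the inputs; padded positions could be masked to -100
    tokens["labels"] = tokens["input_ids"].copy()
    return tokens

tokenized_train = train_ds.map(tokenize_example, remove_columns=train_ds.column_names)
tokenized_eval = eval_ds.map(tokenize_example, remove_columns=eval_ds.column_names)
```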

### Quantization
The model uses 8-bit quantization with the following configuration:
```python
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,                       # load weights in 8-bit via bitsandbytes
    llm_int8_enable_fp32_cpu_offload=True,   # allow offloading modules to CPU in fp32 when GPU memory runs short
    llm_int8_skip_modules=["lm_head"]        # keep the output head un-quantized
)
```
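
The card does not show the model-loading call itself; a sketch of how this config is typically passed to `from_pretrained` (the `device_map` choice is an assumption):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "alecocc/mistral-7B-SFT-medqa-graph-cot",
    quantization_config=bnb_config,
    device_map="auto",  # assumption: let accelerate place layers, enabling the CPU offload above
)
```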

### LoRA Configuration

LoRA adapters are applied to the attention and MLP projection layers:

```python
from peft import LoraConfig

peft_config = LoraConfig(
    task_type="CAUSAL_LM",
    inference_mode=False,
    r=16,              # rank of the low-rank update matrices
    lora_alpha=32,     # scaling factor applied to the LoRA updates
    lora_dropout=0.1,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj"       # MLP projections
    ]
)
```
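
A short sketch of attaching the adapters with standard PEFT calls (not shown verbatim in the original training script):

```python
from peft import get_peft_model, prepare_model_for_kbit_training

model = prepare_model_for_kbit_training(model)  # make the 8-bit model trainable (e.g. enables input gradients)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable
```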

### Training Arguments

The training process is configured with the following parameters:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./mistral_medical_finetuned_lora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,   # effective batch size of 16
    learning_rate=1e-4,
    num_train_epochs=3,
    logging_steps=10,
    save_strategy="epoch",
    evaluation_strategy="steps",
    eval_steps=50,
    warmup_steps=100,
    gradient_checkpointing=True,      # trade extra compute for lower activation memory
    fp16=False,
    bf16=False,
    optim="adamw_torch",
    max_grad_norm=0.3,
    weight_decay=0.01,
    remove_unused_columns=False
)
```
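
With the pieces above, training follows the standard `Trainer` pattern; this is a sketch assuming the tokenized splits and the PEFT-wrapped model from the earlier sections:

```python
from transformers import Trainer

trainer = Trainer(
    model=model,                  # 8-bit base model wrapped with the LoRA adapters
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
)
trainer.train()
trainer.save_model("./mistral_medical_finetuned_lora")  # saves the LoRA adapter weights
```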

## Results & Expected Outcomes

1. The fine-tuned model generates improved responses for medical QA tasks.
2. LoRA enables efficient fine-tuning without updating the full set of base-model weights.
3. Gradient checkpointing and gradient accumulation keep memory usage manageable during training.

## How to Use This Model

### Load the Model and Tokenizer

You can load this fine-tuned model as follows:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("your_username/mistral-7B-medqa-lora-8bit-v1")
model = AutoModelForCausalLM.from_pretrained("your_username/mistral-7B-medqa-lora-8bit-v1")
```
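
If the repository hosts only the LoRA adapter weights rather than a merged checkpoint, the adapter can instead be attached to the 8-bit base model with PEFT (a sketch; the repository id above is a placeholder):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

base_model = AutoModelForCausalLM.from_pretrained(
    "alecocc/mistral-7B-SFT-medqa-graph-cot",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "your_username/mistral-7B-medqa-lora-8bit-v1")
```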

### Generate Answers

To generate answers for medical questions:

```python
# The prompt mirrors the question/answer template used during fine-tuning
input_text = "### Question:\nWhat are the symptoms of diabetes?\n\n### Answer:\n"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=200)  # max_length counts prompt plus generated tokens
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Metadata

| Field        | Value                                  |
|--------------|----------------------------------------|
| License      | Apache 2.0                             |
| Language     | English                                |
| Base Model   | alecocc/mistral-7B-SFT-medqa-graph-cot |
| Pipeline Tag | Auto-detected                          |
| Tags         | medical-QA, LoRA, 8bit, mistral        |
| Dataset      | SyntheticMedicalQA-4336                |
| Metrics      | Not yet reported (e.g., BLEU, ROUGE)   |

## Citation

If you use this model in your work, please cite it as follows:

```bibtex
@misc{mistral_medical_lora_2025,
  author = {Adrien Cohen},
  title = {Mistral-7B Fine-Tuned on SyntheticMedicalQA with LoRA},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/your_username/mistral-7B-medqa-lora-8bit-v1}
}
```