---
license: mit
datasets:
- starmpcc/Asclepius-Synthetic-Clinical-Notes
language:
- en
---
## Overview

This model, elucidator8918/clinical-ehr-prototype-0.3_GGUF, is a Q5_K_M-quantized GGUF build tailored for clinical documentation. It is based on the Mistral-7B-Instruct-v0.3-sharded architecture, fine-tuned on the Asclepius-Synthetic-Clinical-Notes dataset.
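
For local inference, the GGUF file can be loaded with llama-cpp-python. The sketch below is illustrative: the filename passed as `model_path` is an assumption, so check the repository's file listing for the actual name.

```python
from llama_cpp import Llama

# Load the Q5_K_M GGUF locally; the filename is assumed,
# verify it against the repository's file listing.
llm = Llama(
    model_path="clinical-ehr-prototype-0.3.Q5_K_M.gguf",
    n_ctx=4096,       # context window; adjust to your hardware
    n_gpu_layers=-1,  # offload all layers to GPU if available
)

response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Summarize this clinical note: ..."}
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```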

## Key Information

- **Base Model**: Mistral-7B-Instruct-v0.3-sharded
- **Fine-tuned Model Name**: elucidator8918/clinical-ehr-prototype-0.3_GGUF
- **Dataset**: starmpcc/Asclepius-Synthetic-Clinical-Notes
- **Language**: English (en)
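
One way to fetch the quantized file is via `huggingface_hub`, as sketched below; the `filename` argument is an assumption and should be verified against the repository.

```python
from huggingface_hub import hf_hub_download

# Download the GGUF file from the Hub; the filename is assumed,
# verify it against the repository's file listing.
model_path = hf_hub_download(
    repo_id="elucidator8918/clinical-ehr-prototype-0.3_GGUF",
    filename="clinical-ehr-prototype-0.3.Q5_K_M.gguf",
)
print(model_path)  # local cache path to pass to an inference runtime
```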

## Model Details

- **LoRA Parameters (QLoRA):**
  - LoRA attention dimension: 64
  - Alpha parameter for LoRA scaling: 16
  - Dropout probability for LoRA layers: 0.1
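
These values map onto a PEFT `LoraConfig` roughly as sketched below; `target_modules` are not stated on this card, so they are omitted here.

```python
from peft import LoraConfig

# Sketch of the QLoRA adapter setup described above.
peft_config = LoraConfig(
    r=64,              # LoRA attention dimension
    lora_alpha=16,     # alpha parameter for LoRA scaling
    lora_dropout=0.1,  # dropout probability for LoRA layers
    bias="none",
    task_type="CAUSAL_LM",
)
```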

- **bitsandbytes Parameters:**
  - Load base model in 4-bit precision: Yes
  - Compute dtype for 4-bit base models: float16
  - Quantization type: nf4
  - Nested (double) quantization for 4-bit base models: No
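
A matching `BitsAndBytesConfig` sketch, assuming the 4-bit settings listed above:

```python
import torch
from transformers import BitsAndBytesConfig

# Sketch of the 4-bit quantization settings described above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit base model loading
    bnb_4bit_compute_dtype=torch.float16,  # compute dtype
    bnb_4bit_quant_type="nf4",             # quantization type
    bnb_4bit_use_double_quant=False,       # nested quantization off
)
```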

- **TrainingArguments Parameters:**
  - Number of training epochs: 1
  - Batch size per GPU for training: 4
  - Batch size per GPU for evaluation: 4
  - Gradient accumulation steps: 1
  - Enable gradient checkpointing: Yes
  - Maximum gradient norm: 0.3
  - Initial learning rate: 2e-4
  - Weight decay: 0.001
  - Optimizer: paged_adamw_32bit
  - Learning rate scheduler type: cosine
  - Warm-up ratio: 0.03
  - Group sequences of similar length into batches: Yes
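
A `TrainingArguments` sketch mirroring the hyperparameters listed above; `output_dir` is an assumed placeholder, not stated on this card.

```python
from transformers import TrainingArguments

# Sketch of the training setup described above.
training_args = TrainingArguments(
    output_dir="./results",  # assumed placeholder
    num_train_epochs=1,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    group_by_length=True,
)
```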

## License

This model is released under the MIT License.