Hence, it may struggle to adapt to different prompt styles and code formats.
## Training / Evaluation Details

The model was finetuned on 2 A6000 GPUs on CMU LTI's Babel cluster using LoRA (`r=8, alpha=16, dropout=0.05, task_type="CAUSAL_LM"`), with adapter modules attached to `["q_proj", "v_proj"]`. We use DDP for distributed training and BF16 to speed up training. For more details, check [our paper](https://arxiv.org/abs/2403.10534)!
### Results