---
license: apache-2.0
datasets:
- CaterinaLac/sharegpt-deduplicated
- exams
- Open-Orca/OpenOrca
language:
- en
- zh
- ko
- ja
- fr
---

This model is Llama2-7B finetuned on the union of the deduplicated ShareGPT dataset, the `exams` dataset, and a subset of the OpenOrca dataset.
Finetuning was performed with the [DeepSpeed Chat](https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-chat) toolkit (step 1, supervised finetuning).
The model was trained for three epochs before reaching a plateau on the validation set. We used a cosine learning-rate scheduler with an initial LR of 2e-5.
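
For reference, the schedule described above (cosine decay from an initial LR of 2e-5) can be reproduced with the scheduler utilities in the `transformers` library. This is an illustrative sketch, not the actual DeepSpeed Chat training code; the warmup and total step counts below are assumptions, since the card does not state them.

```python
# Illustrative sketch of the described LR schedule: cosine decay from 2e-5.
# Not the actual DeepSpeed Chat training loop; num_warmup_steps and
# num_training_steps are assumed values, not stated on this card.
import torch
from transformers import get_cosine_schedule_with_warmup

model = torch.nn.Linear(8, 8)  # stand-in for the Llama2-7B model
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,       # assumption: warmup is not stated on the card
    num_training_steps=10_000,  # assumption: depends on dataset size and epochs
)

for step in range(10_000):
    # ... forward/backward pass on the SFT data would go here ...
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```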
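
A minimal inference sketch with the `transformers` library is shown below. The repository id is a placeholder, since the card does not state the model's Hub path; substitute the actual id when loading.

```python
# Minimal inference sketch using the transformers library.
# NOTE: "your-org/llama2-7b-sft" is a placeholder repository id,
# not the real path of this model on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/llama2-7b-sft"  # placeholder, replace with the real id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision: a 7B model needs ~14 GB this way
    device_map="auto",
)

prompt = "Explain the difference between supervised finetuning and RLHF."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```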