qingy2024 commited on
Commit
cd93b13
·
verified ·
1 Parent(s): 9321d4c

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +13 -2
README.md CHANGED
@@ -1,6 +1,12 @@
1
  ---
2
  base_model:
3
  - qingy2024/Qwarkstar-4B
 
 
 
 
 
 
4
  ---
5
 
6
  ## Qwarkstar 4B Instruct (Preview)
@@ -8,6 +14,11 @@ base_model:
8
  > [!NOTE]
9
  > Training complete!
10
 
11
- Fine tuned with SFT on 100k samples from HuggingfaceTB/smoltalk.
 
12
 
13
- It uses the ChatML template.
 
 
 
 
 
1
  ---
2
  base_model:
3
  - qingy2024/Qwarkstar-4B
4
+ license: apache-2.0
5
+ datasets:
6
+ - HuggingFaceTB/smoltalk
7
+ language:
8
+ - en
9
+ pipeline_tag: text-generation
10
  ---
11
 
12
  ## Qwarkstar 4B Instruct (Preview)
 
14
  > [!NOTE]
15
  > Training complete!
16
 
17
+ This model is fine-tuned using Supervised Fine-Tuning (SFT) on 100k samples from the `HuggingFaceTB/smoltalk` dataset.
18
+ It follows the ChatML input-output formatting template.
19
 
20
+ ### Training Details:
21
+ - **Base Model**: `qingy2024/Qwarkstar-4B`
22
+ - **Batch Size**: 32 (2 H100s x 8 per GPU)
23
+ - **Max Gradient Norm**: 1.0
24
+ - **Final Loss**: ~0.59