chuyi777 committed on
Commit
9bf875b
1 Parent(s): 775458a

Update README.md

  1. README.md +2 -1
README.md CHANGED
@@ -6,12 +6,13 @@ Datasets and Hyperparameters
  Reward Model:https://huggingface.co/OpenLLMAI/Llama-3-8b-rm-700k
  SFT Model: https://huggingface.co/OpenLLMAI/Llama-3-8b-sft-mixture
  Prompt Dataset: https://huggingface.co/datasets/OpenLLMAI/prompt-collection-v0.1
+
  Max Prompt Length: 2048
  Max Response Length: 2048
  best_of_n: 2 (2 samples for each prompt)
  Learning Rate: 5e-7
  Beta: 0.1
- Scheduler: Cosine with Warmup and MinLR
+ Scheduler: Cosine with Warmup (0.03) and MinLR (0.1)
  Rollout Batch Size: 20000
  Training Batch Size: 256
  Number of Iterations: 9
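
The updated Scheduler line can be read as a cosine decay with a linear warmup and a learning-rate floor. The sketch below shows one plausible interpretation, and it rests on assumptions the README does not confirm: that `0.03` is the fraction of total steps spent warming up, and that `0.1` is the minimum learning rate expressed as a fraction of the peak `5e-7`. The function name `cosine_lr` and the `total_steps` value are hypothetical and chosen only for illustration; this is not the training code behind the README.

```python
import math


def cosine_lr(step, total_steps, peak_lr=5e-7, warmup_ratio=0.03, min_lr_ratio=0.1):
    """Learning rate at `step` for cosine decay with linear warmup and a floor.

    Assumptions (not confirmed by the README): warmup_ratio is a fraction of
    total_steps, and min_lr_ratio is a fraction of peak_lr.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup: reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    # Cosine decay from peak_lr down to min_lr_ratio * peak_lr.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return peak_lr * (min_lr_ratio + (1.0 - min_lr_ratio) * cosine)


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 29, 30, 500, 1000):
        print(s, cosine_lr(s, total))
```

Under these assumptions the rate climbs to `5e-7` over the first 3% of steps and then decays toward a floor of `5e-8` rather than to zero.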