Yofuria committed (verified)
Commit f6631a2 · Parent: e9d0fd9

Update README.md

Files changed (1):
  1. README.md +3 -3
README.md CHANGED
@@ -19,7 +19,7 @@ should probably proofread and complete it, then remove this comment. -->
 [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/nlp-xiaobo/huggingface/runs/fgjarr1f)
 # Mistral-7B-base-simpo-qlora
 
-This model is a fine-tuned version of [/home/yofuria/PLM/SimPO/zephyr-7b-sft-qlora](https://huggingface.co//home/yofuria/PLM/SimPO/zephyr-7b-sft-qlora) on the HuggingFaceH4/ultrafeedback_binarized dataset.
+This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-qlora](https://huggingface.co/alignment-handbook/zephyr-7b-sft-qlora) on the HuggingFaceH4/ultrafeedback_binarized dataset.
 It achieves the following results on the evaluation set:
 - Loss: 1.5543
 - Rewards/chosen: -2.0201
@@ -49,11 +49,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 3e-07
-- train_batch_size: 8
+- train_batch_size: 2
 - eval_batch_size: 4
 - seed: 42
 - gradient_accumulation_steps: 8
-- total_train_batch_size: 64
+- total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
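
The updated values are internally consistent for a single-device run: total_train_batch_size = train_batch_size × gradient_accumulation_steps × num_devices = 2 × 8 × 1 = 16 (previously 8 × 8 × 1 = 64). As a rough illustration only, not the author's actual training script, the listed hyperparameters map onto a `transformers.TrainingArguments` configuration roughly like the sketch below; the output directory and the single-GPU assumption are hypothetical.

```python
# Hypothetical sketch: the README's hyperparameters expressed as
# transformers.TrainingArguments. The actual run used the SimPO recipe
# scripts; this snippet only shows how the listed values relate.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mistral-7b-base-simpo-qlora",  # assumed, not from the README
    learning_rate=3e-7,
    per_device_train_batch_size=2,   # "train_batch_size" above
    per_device_eval_batch_size=4,    # "eval_batch_size" above
    gradient_accumulation_steps=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)

# Effective batch size: per-device batch * accumulation steps * device count.
num_devices = 1  # assumption; matches the reported total_train_batch_size of 16
print(args.per_device_train_batch_size * args.gradient_accumulation_steps * num_devices)  # -> 16
```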