Commit a94b423 by ptrdvn (verified) · 1 parent: db6f79d

Update README.md

Files changed (1): README.md (+6 -7)
@@ -202,7 +202,7 @@ The evaluation code for this can be found [here](https://drive.google.com/file/d
 
 ```yaml
 ### model
-model_name_or_path: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
+model_name_or_path: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
 
 ### method
 stage: sft
@@ -219,7 +219,7 @@ preprocessing_num_workers: 16
 packing: true
 
 ### output
-output_dir: /root/train_outputs/DeepSeek-R1-Distill-Qwen-7B/reasoning-multilingual-R1-Llama-70B-train
+output_dir: /root/train_outputs/DeepSeek-R1-Distill-Qwen-1.5B/reasoning-multilingual-R1-Llama-70B-train
 logging_steps: 1
 save_steps: 0.99999
 plot_loss: true
@@ -250,11 +250,10 @@ echo '{
 }
 }' > /root/LLaMA-Factory/data/dataset_info.json
 
-# 7B Qwen
-cd /root/LLaMA-Factory && llamafactory-cli train /root/reasoning_multilingual_train_7B.yaml
-rm -r /root/train_outputs/DeepSeek-R1-Distill-Qwen-7B/reasoning-multilingual-R1-Llama-70B-train/checkpoint*
-huggingface-cli upload lightblue/DeepSeek-R1-Distill-Qwen-7B-Multilingual /root/train_outputs/DeepSeek-R1-Distill-Qwen-7B/reasoning-multilingual-R1-Llama-70B-train
-
+# # 1.5B Llama
+cd /root/LLaMA-Factory && llamafactory-cli train /root/reasoning_multilingual_train_1.5B.yaml
+rm -r /root/train_outputs/DeepSeek-R1-Distill-Qwen-1.5B/reasoning-multilingual-R1-Llama-70B-train/checkpoint*
+huggingface-cli upload lightblue/DeepSeek-R1-Distill-Qwen-1.5B-Multilingual /root/train_outputs/DeepSeek-R1-Distill-Qwen-1.5B/reasoning-multilingual-R1-Llama-70B-train
 ```
 
 # License
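
Every changed line in this commit swaps the size tag `7B` for `1.5B` in a model ID, an output directory, a config path, or an upload repo. The following is a minimal Python sketch, assuming that naming pattern holds for other sizes, which derives all four strings and the three shell commands from a single value. The `TrainTarget` class, its property names, and the module-level constants are hypothetical and not part of the repository; only the resulting strings come from the diff.

```python
from dataclasses import dataclass
from typing import List

# Values copied from the diff; grouping them as constants is an assumption.
BASE_ORG = "deepseek-ai"
UPLOAD_ORG = "lightblue"
RUN_NAME = "reasoning-multilingual-R1-Llama-70B-train"


@dataclass(frozen=True)
class TrainTarget:
    """Hypothetical helper: one distilled Qwen size tag, e.g. "1.5B" or "7B"."""

    size: str

    @property
    def model_name(self) -> str:
        return f"DeepSeek-R1-Distill-Qwen-{self.size}"

    @property
    def model_name_or_path(self) -> str:
        # Matches the `model_name_or_path` key in the YAML config.
        return f"{BASE_ORG}/{self.model_name}"

    @property
    def output_dir(self) -> str:
        # Matches the `output_dir` key in the YAML config.
        return f"/root/train_outputs/{self.model_name}/{RUN_NAME}"

    @property
    def config_path(self) -> str:
        return f"/root/reasoning_multilingual_train_{self.size}.yaml"

    @property
    def upload_repo(self) -> str:
        return f"{UPLOAD_ORG}/{self.model_name}-Multilingual"

    def commands(self) -> List[str]:
        # The same three steps as in the README: train, drop checkpoints, upload.
        return [
            f"cd /root/LLaMA-Factory && llamafactory-cli train {self.config_path}",
            f"rm -r {self.output_dir}/checkpoint*",
            f"huggingface-cli upload {self.upload_repo} {self.output_dir}",
        ]


if __name__ == "__main__":
    for command in TrainTarget("1.5B").commands():
        print(command)
```

Deriving the strings from one field keeps the YAML values and the shell commands in step when the model size changes, which the manual edit in this commit has to do line by line.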