Commit 7ec1050 (unverified) · Nanobit committed · 2 parents: 68f0c71 a9e502e

Merge pull request #48 from NanoCode012/feat/update-readme

Files changed (1):

1. README.md (+59 −18)
README.md CHANGED
@@ -97,6 +97,18 @@ Have dataset(s) in one of the following format (JSONL recommended):
   ```json
   {"instruction": "...", "input": "...", "output": "...", "reflection": "...", "corrected": "..."}
   ```
+- `explainchoice`: question, choices, (solution OR explanation)
+  ```json
+  {"question": "...", "choices": ["..."], "solution": "...", "explanation": "..."}
+  ```
+- `concisechoice`: question, choices, (solution OR explanation)
+  ```json
+  {"question": "...", "choices": ["..."], "solution": "...", "explanation": "..."}
+  ```
+- `summarizetldr`: article and summary
+  ```json
+  {"article": "...", "summary": "..."}
+  ```
 
 > Have some new format to propose? Check if it's already defined in [data.py](src/axolotl/utils/data.py) in `dev` branch!
 
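With the placeholders filled in, a single `summarizetldr` JSONL record would look something like this (the article and summary text are invented for illustration):

```json
{"article": "The axolotl is a paedomorphic salamander: it reaches adulthood without undergoing metamorphosis, keeping its external gills and aquatic lifestyle.", "summary": "Axolotls keep their gills and stay aquatic for life."}
```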
@@ -124,17 +136,17 @@ See sample configs in [configs](configs) folder or [examples](examples) for quick start.
 
 - loading
 ```yaml
-load_4bit: true
+load_in_4bit: true
 load_in_8bit: true
-bf16: true
+bf16: true # require >=ampere
 fp16: true
-tf32: true
+tf32: true # require >=ampere
 ```
 Note: Repo does not do 4-bit quantization.
 
 - lora
 ```yaml
-adapter: lora # blank for full finetune
+adapter: lora # qlora or leave blank for full finetune
 lora_r: 8
 lora_alpha: 16
 lora_dropout: 0.05
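Read together, the two snippets above combine into a precision-plus-adapter block along these lines (a sketch only; the values are the ones shown above, not recommendations):

```yaml
load_in_8bit: true # quantize the base model to 8-bit
bf16: true # require >=ampere
adapter: lora # qlora or leave blank for full finetune
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
```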
@@ -163,28 +175,32 @@ tokenizer_type: AutoTokenizer
 # Trust remote code for untrusted source
 trust_remote_code:
 
-# whether you are training a 4-bit quantized model
+# whether you are training a 4-bit GPTQ quantized model
 load_4bit: true
 gptq_groupsize: 128 # group size
 gptq_model_v1: false # v1 or v2
 
 # this will attempt to quantize the model down to 8 bits and use adam 8 bit optimizer
 load_in_8bit: true
+# use bitsandbytes 4 bit
+load_in_4bit:
 
 # Use CUDA bf16
-bf16: true
+bf16: true # bool or 'full' for `bf16_full_eval`. require >=ampere
 # Use CUDA fp16
 fp16: true
 # Use CUDA tf32
-tf32: true
+tf32: true # require >=ampere
 
 # a list of one or more datasets to finetune the model with
 datasets:
   # this can be either a hf dataset, or relative path
   - path: vicgalle/alpaca-gpt4
     # The type of prompt to use for training. [alpaca, sharegpt, gpteacher, oasst, reflection]
-    type: alpaca
+    type: alpaca # format OR format:prompt_style (chat/instruct)
     data_files: # path to source data files
+    shards: # number of shards to split the dataset into; set it to train on a subset of the data
 
 # axolotl attempts to save the dataset as an arrow after packing the data together so
 # subsequent training attempts load faster, relative path
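As a concrete instance of the `datasets` block above, a config that picks a prompt style and trains on a subset via sharding might look like this (the `alpaca:chat` style and the shard count are illustrative, assuming `shards` behaves as its comment describes):

```yaml
datasets:
  - path: vicgalle/alpaca-gpt4
    type: alpaca:chat # format:prompt_style
    shards: 10 # split the dataset into 10 shards and train on a subset
```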
@@ -201,7 +217,7 @@ sequence_len: 2048
 # inspired by StackLLaMA. see https://huggingface.co/blog/stackllama#supervised-fine-tuning
 max_packed_sequence_len: 1024
 
-# if you want to use lora, leave blank to train all parameters in original model
+# if you want to use 'lora' or 'qlora', or leave blank to train all parameters in original model
 adapter: lora
 # if you already have a lora model trained that you want to load, put that here
 # lora hyperparameters
@@ -224,6 +240,7 @@ lora_out_dir:
 lora_fan_in_fan_out: false
 
 # wandb configuration if you're using it
+wandb_mode:
 wandb_project:
 wandb_watch:
 wandb_run_id:
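The new `wandb_mode` key sits alongside the existing wandb settings; assuming it accepts the standard wandb modes (`online`, `offline`, `disabled`), an offline run could be configured as follows (the project name is a placeholder):

```yaml
wandb_mode: offline # assumption: standard wandb modes apply
wandb_project: my-axolotl-runs # placeholder
```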
@@ -252,8 +269,18 @@ gradient_checkpointing: false
 # stop training after this many evaluation losses have increased in a row
 # https://huggingface.co/transformers/v4.2.2/_modules/transformers/trainer_callback.html#EarlyStoppingCallback
 early_stopping_patience: 3
-# specify a scheduler to use with the optimizer. only one_cycle is supported currently
-lr_scheduler:
+
+# specify a scheduler and kwargs to use with the optimizer
+lr_scheduler: # 'one_cycle' | 'log_sweep' | empty for cosine
+lr_scheduler_kwargs:
+
+# for one_cycle optim
+lr_div_factor: # learning rate div factor
+
+# for log_sweep optim
+log_sweep_min_lr:
+log_sweep_max_lr:
+
 # specify optimizer
 optimizer:
 # specify weight decay
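Filled in, a `log_sweep` configuration built from the keys above might read as follows (the bounds are illustrative, not tuned values):

```yaml
lr_scheduler: log_sweep
log_sweep_min_lr: 0.000001
log_sweep_max_lr: 0.001
```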
@@ -262,7 +289,7 @@ weight_decay:
 # whether to use xformers attention patch https://github.com/facebookresearch/xformers:
 xformers_attention:
 # whether to use flash attention patch https://github.com/HazyResearch/flash-attention:
-flash_attention:
+flash_attention: # require a100 for llama
 
 # resume from a specific checkpoint dir
 resume_from_checkpoint:
@@ -288,11 +315,17 @@ fsdp_config:
 # Deepspeed
 deepspeed:
 
-# TODO
+# Path to torch distx for optim 'adamw_anyprecision'
 torchdistx_path:
 
+# Set padding for data collator to 'longest'
+collator_pad_to_longest:
+
 # Debug mode
 debug:
+
+# Seed
+seed:
 ```
 
 </details>
@@ -317,12 +350,16 @@ accelerate launch scripts/finetune.py configs/your_config.yml
 
 ### Inference
 
-Add `--inference` flag to train command above
-
-If you are inferencing a pretrained LORA, pass
-```bash
---lora_model_dir ./completed-model
-```
+Pass the appropriate flag to the train command:
+
+- Pretrained LORA:
+  ```bash
+  --inference --lora_model_dir ./completed-model
+  ```
+- Full weights finetune:
+  ```bash
+  --inference --base_model ./completed-model
+  ```
 
 ### Merge LORA to base
 
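Referring back to the Inference change above: combining the new flags with the launch command shown in the hunk header gives a full LORA-inference invocation like

```bash
accelerate launch scripts/finetune.py configs/your_config.yml \
    --inference --lora_model_dir ./completed-model
```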
@@ -341,6 +378,10 @@ Please reduce any below
 - `eval_batch_size`
 - `sequence_len`
 
+> RuntimeError: expected scalar type Float but found Half
+
+Try setting `fp16: true`
+
 ## Contributing 🤝
 
 Bugs? Please check for open issue else create a new [Issue](https://github.com/OpenAccess-AI-Collective/axolotl/issues/new).
 