Commit History
3355706  Add support for GPTQ using native transformers/peft (#468)
1991946  fix: bad dtype for full finetune (#504)
125cccb  Refactor train cfg cli (#499)
267b7b2  simplify linear layer locator
98bf76e  fsdp requires params be the same type too (#493)
4c37bd0  Fix(tokenizer): Make sure to add pad for CodeLlamaTokenizer (#489)
3a011ea  fix condition and add logging
f319b0b  rename var and reformat
7fd662d  Update src/axolotl/utils/models.py
9e69968  Update src/axolotl/utils/models.py
d03887f  ignore: address pr review (Maxime)
a184549  ignore: linter (Maxime)
f311df9  fix: finetune model inference needs the dtype fix to work with flash-attn (Maxime)
0b7ba57  fix types w lora (#478)
71bd062  Fix(tokenizer): Fix condition to add pad token (#477)
cb9797e  improve llama pad token handling (#475)
96deb6b  recast loralayer, norm, lmhead + embed token weights per original qlora (#393)
ee26281  fix evals (#447)
1687be6  don't use mask expansion for inference (#392)
919246f  don't pass rope_scaling kwarg if it's None (#383)
094fc2c  try to detect accelerate and only use device_map=None in that case (#373)
0c96727  remove unnecessary local variable
efb3b2c  simplify `load_tokenizer`
7b55fe6  improve GPU logging to break out pytorch cache and system mem
e029ab3  quiet noise from llama tokenizer by setting pad token earlier
2bb0b78  Attention mask and position id fixes for packing (#285)
b521206  Feat: Add rope scaling (#343)
11ddccb  Merge pull request #356 from tmm1/load_model-args
7181022  simplify load_model signature
e303d64  log GPU memory usage
176b888  ensure enable_input_require_grads is called on model before getting the peft model (#345)
2eda9e0  fix typo
78b9efb  scope flash-attn+qlora fix correctly, scope to llama, add comment
312a9fa  move flash-attn monkey patch alongside the others
248bf90  ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype
77085ea  qlora w flash attention fixes (#333)
db2a358  add peft install back since it doesn't get installed by setup.py (#331)
1066751  don't resize embeddings to multiples of 32x by default
553a86b  Adding logging enhancement
69a2350  support for loading a model by git revision
d69da99  skip explicit model type too if using trust_remote_code
66afb76  don't use llama if trust_remote_code is set since that needs to use AutoModel path
47d601f  optionally define whether to use_fast tokenizer
88e17ff  add float16 docs and tweak typehints
136522f  style correction (maciej.karasek)
556fe40  issue #205 bugfix (maciej.karasek)