Commit History
36e53c7 improve how we setup eval/save strategies and steps (#547)
e5bb22a add optimization for group-by-len (#563)
5b67ea9 Add training callback to send predictions to WandB table (#521)
e30f1e3 Early stopping metric (#537)
a546ca2 misc fixes/improvements (#513)
3355706 Add support for GPTQ using native transformers/peft (#468)
7710e81 log supervised token count (#448)
396a7a7 Added advanced DDP args (#515) [Jan Philipp Harries]
c56b450 drop empty tokenized rows too (#509)
7657632 add eval benchmark callback (#441)
fd55bc8 use math.ceil instead of round /cc #498
8e197f6 pad_to_worst_case_seq_len boolean, for testing memory limits (#498)
868530c let transformers handle adamw_bnb_8bit
bde3c5a ReLoRA implementation (with quantization) (#322)
50682a3 always drop samples that are too long (#452)
5a1985b set env var for FSDP layer to wrap (#453)
58cf7e7 add missing positional arg (#450)
ee26281 fix evals (#447)
f733d0f disable eval using multipack for now (#437)
008505c fix comma, not a tuple (#436)
b3f5e00 use save_strategy from config if available (#434)
5247c50 set env for FSDP offload params (#433)
c01015f Fix(config): Update handling of deepspeed config (#404)
da10af0 fix eval steps and strategy (#403)
3c2ad00 Feat(config): add max steps (#387)
5d48a10 Added "epoch" evaluation_strategy (#388)
73a0b6e Feat(config): Add hub_strategy (#386)
7b55fe6 improve GPU logging to break out pytorch cache and system mem
2bb0b78 Attention mask and position id fixes for packing (#285)
e303d64 log GPU memory usage
ebaec3c fix axolotl training args dataclass annotation
83237b8 Merge branch 'OpenAccess-AI-Collective:main' into logging_enhancement [The Objective Dad]