Commit History
ignore the fsdp_config section too (#1606) [skip ci]
fff06af
unverified
Pass deepspeed and fsdp as None explicitly when merging adapters to allow custom device_map (#1575)
9e1480e
unverified
ORPO Trainer replacement (#1551)
7d1d22f
unverified
Print versions (#1496)
4313b1a
unverified
don't use deepspeed or fsdp when merging loras (#1479)
87ca3f9
unverified
ORPO (#1419)
2ea70eb
unverified
more fixes 20240228 (#1342) [skip ci]
0f985e1
unverified
hotfix for capabilities loading (#1331)
7de912e
unverified
Pydantic 2.x cfg (#1239)
cc3cebf
unverified
add support for https remote yamls (#1277)
9bca7db
unverified
Peft deepspeed resume (#1227)
c67fb71
unverified
make sure to register the base chatml template even if no system message is provided (#1207)
badda37
unverified
Feat/chatml add system message (#1117)
98b4762
unverified
more dpo fixes for dataset loading and docs (#1185) [skip ci]
5bce45f
unverified
Fix generation_config validation raises Exception for do_merge_lora (#1184)
02f2c72
unverified
Add support for offline mode with HF_HUB_OFFLINE envvar (#1182)
71141de
unverified
don't fail if can't cast weights due to offload when merging (#1172) [skip ci]
fb7f9b9
unverified
Add desc to map/filter (#1162)
6840381
unverified
jupyter lab fixes (#1139) [skip ci]
eaaeefc
unverified
Preprocess dataset size fix (#1131)
7570446
unverified
Reverse caching PR (#1115)
2202a20
unverified
Disable caching on `--disable_caching` in CLI (#1110)
d66b101
unverified
misc fixes from #943 (#1086) [skip ci]
23495a8
unverified
update sharegpt conversations when chatml chat template is set (#1075) [skip ci]
0ce1a65
unverified
Add: mlflow for experiment tracking (#1059) [skip ci]
090c24d
unverified
feature: better device mapping for large models (#918)
bdfefaf
unverified
set default for merge (#1044)
63fb3eb
unverified
RL/DPO (#935)
f243c21
Fix: bf16 support for inference (#981)
3678a6c
unverified
feat: remove need to add load_in* during merge (#1017)
f6ecf14
unverified
remove landmark attn and xpos rope implementations (#1010)
70b46ca
unverified
Fix Deepspeed loading (#950)
5ea3aa3
unverified
ensure merged model matches the training dtype (#902)
1d21aa6
unverified
include the suffix modified string in ascii art (#852)
614cff4
unverified
Feat: Added Gradio support (#812)
738a057
unverified
Create preprocess CLI (#785)
e50ab07
unverified
improve handling of the prepared ds path and other cfg defaults (#701)
1c412c7
unverified
Save Axolotl config as WandB artifact (#716)
490923f
unverified
Committed by Jan Philipp Harries
prepared dataset caching, other misc fixes (#665)
e50a64e
unverified
Warn users to login to HuggingFace (#645)
85b0be2
unverified
Committed by Napuh