Commits · Dovakiins/qwerrwe

fix DefaultDict.__or__

a13e45d

tmm1 committed on Aug 10, 2023

revert previous change and build ax images w docker on gpu (#371)

918f1b0
unverified

winglian committed on Aug 13, 2023

attempt to run non-base docker builds on regular cpu hosts (#369)

c3fde36
unverified

winglian committed on Aug 12, 2023

Attention mask and position id fixes for packing (#285)

2bb0b78
unverified

winglian committed on Aug 12, 2023

Fix(save): Save as safetensors (#363)

a276c9c
unverified

Nanobit committed on Aug 12, 2023

Add wandb_entity to wandb options, update example configs, update README (#361)

7019509
unverified

Morgan McGuire

winglian committed on Aug 12, 2023

Fix(model loading): Warn when model revision is passed to gptq (#364)

96bd6ae
unverified

Nanobit committed on Aug 12, 2023

Fix(message): Improve error message for bad format (#365)

e37d935
unverified

Nanobit committed on Aug 12, 2023

Feat: Add rope scaling (#343)

b521206
unverified

Nanobit committed on Aug 12, 2023

feat(merge): save tokenizer on merge (#362)

289d5c4
unverified

Nanobit committed on Aug 12, 2023

Merge pull request #355 from tmm1/bitsandbytes-fixes

35c8b90
unverified

tmm1 committed on Aug 11, 2023

Update README.md on pretraining_dataset (#360)

fae6ed8
unverified

Nanobit committed on Aug 11, 2023

Clarify pre-tokenize before multigpu (#359)

94d03c8
unverified

Nanobit committed on Aug 11, 2023

Merge pull request #356 from tmm1/load_model-args

11ddccb
unverified

tmm1 committed on Aug 10, 2023

Merge pull request #354 from tmm1/gpu-util

9643121
unverified

tmm1 committed on Aug 9, 2023

simplify load_model signature

7181022

tmm1 committed on Aug 9, 2023

Merge pull request #350 from tmm1/group-len-false-examples

f5c11f8
unverified

tmm1 committed on Aug 9, 2023

bump to latest bitsandbytes release with major bug fixes

fce40aa

tmm1 committed on Aug 9, 2023

use newer pynvml package

9c31410

tmm1 committed on Aug 9, 2023

log GPU memory usage

e303d64

tmm1 committed on Aug 9, 2023

note pattern when using groups

b4d1d22

tmm1 committed on Aug 7, 2023

update comment for group_by_length

9f99104

tmm1 committed on Aug 7, 2023

set group_by_length to false in examples

36fefcf

tmm1 committed on Aug 7, 2023

ensure enable_input_require_grads is called on model before getting the peft model (#345)

176b888
unverified

winglian committed on Aug 6, 2023

experimental llama 2 chat support (#296)

3392270
unverified

Jan Philipp Harries committed on Aug 6, 2023

add a basic ds zero3 config (#347)

bb53a16
unverified

winglian committed on Aug 6, 2023

Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) (#339)

10405b9
unverified

ssmi153 committed on Aug 6, 2023

Added Orca Mini prompt strategy (#263)

c93655c
unverified

Jan Philipp Harries committed on Aug 5, 2023

optimize the iteration when tokenizeing large datasets (#332)

fe28543
unverified

winglian committed on Aug 4, 2023

Merge pull request #336 from tmm1/flash-attn

0d2e34f
unverified

tmm1 committed on Aug 3, 2023

Merge pull request #337 from tmm1/readme-fix

b56a6c0
unverified

tmm1 committed on Aug 3, 2023

fix typo

2eda9e0

tmm1 committed on Aug 3, 2023

scope flash-attn+qlora fix correctly, scope to llama, add comment

78b9efb

tmm1 committed on Aug 3, 2023

move flash-attn monkey patch alongside the others

312a9fa

tmm1 committed on Aug 3, 2023

python 3.10 and 3.11 both work fine, as does pytorch 2.1.0.dev

58d6659

tmm1 committed on Aug 3, 2023

there is no configs folder

cc7e800

tmm1 committed on Aug 3, 2023

feat/llama-2 examples (#319)

dc71d88
unverified

mhenrichsen Mads Henrichsen committed on Aug 3, 2023

ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype

248bf90

tmm1 committed on Aug 2, 2023

qlora w flash attention fixes (#333)

77085ea
unverified

winglian committed on Aug 2, 2023

add peft install back since it doesn't get installed by setup.py (#331)

db2a358
unverified

winglian committed on Jul 31, 2023

pin accelerate so it works with llama2 (#330)

6c9a87c
unverified

winglian committed on Jul 31, 2023

fix FSDP save of final model (#329)

894cba0
unverified

winglian committed on Jul 31, 2023

update README for updated docker images (#328)

41a4d15
unverified

winglian committed on Jul 28, 2023

Prune cuda117 (#327)

2c37bf6
unverified

winglian committed on Jul 26, 2023

latest HEAD of accelerate causes 0 loss immediately w FSDP (#321)

9f69c4d
unverified

winglian committed on Jul 24, 2023

update prompts for open orca to match the paper (#317)

3d4984b
unverified

winglian committed on Jul 22, 2023

disable gh cache for first step of docker builds too

ff7f18d

winglian committed on Jul 22, 2023

add runpod envs to .bashrc, fix bnb env (#316)

cf62cfd
unverified

winglian committed on Jul 22, 2023

don't use the gha cache w docker

c5df969

winglian committed on Jul 22, 2023

Merge pull request #307 from OpenAccess-AI-Collective/xgen-user-sharegpt-tokens

40a53ff
unverified

winglian committed on Jul 22, 2023
