Commit History
c4cf567  Merge branch 'main' into quadratic-warmup
c49729d  better configuration for quadratic warmup
19cf0bd  params are adam_*, not adamw_*
d69da99  skip explicit model type too if using trust_remote_code
66afb76  don't use llama if trust_remote_code is set since that needs to use AutoModel path
b9b7d4c  Merge pull request #221 from utensil/local_dataset
e79c8e6  Fix future deprecation push_to_hub_model_id
f150c02  Merge pull request #224 from OpenAccess-AI-Collective/system-prompt-data
612aabd  push intermediate model checkpoints to hub
3a38271  add tests and support for loader for sys prompt data
47d601f  optionally define whether to use_fast tokenizer
9bdd30c  Support loading data files from a local directory
cb9d3af  add validation and tests for adamw hyperparam
6d0ee4b  support adamw and grad norm hyperparams
88e17ff  add float16 docs and tweak typehints
136522f  style correction (maciej.karasek)
556fe40  issue #205 bugfix (maciej.karasek)
7dc580b  add axolotl trainer and quadratic warmup
fd2c981  Merge branch 'main' into flash-optimum
93dacba  Merge pull request #187 from OpenAccess-AI-Collective/strip-peft-device-map
8002ffb  Merge pull request #177 from NanoCode012/fix/landmark-patch
74ef5cc  Merge pull request #192 from OpenAccess-AI-Collective/sharegpt-custom-prompt
5e616d9  Merge branch 'main' into strip-peft-device-map
8e568bb  Merge pull request #159 from AngainorDev/patch-1
aac4b76  add new sharegpt, refactor prompt so it can be customized later, add exception if no data is processed
c9a149f  add check for attr
14668fa  new validation for mpt w grad checkpoints
b565ecf  Fix strict and Lint
fe0b768  match up gradient checkpointing when using lora w config
974dc00  Fix set mem_id for inference and refactor
563b6d8  Fix undefined LlamaForCausalLM and del try except
cd0a6f6  peft no longer needs device_map
919727b  Refactor landmark attention patch
958da70  fix formatting
a808bf9  Fix missing cfg. (Angainor Development)
0124825  Merge pull request #182 from OpenAccess-AI-Collective/fix-llama-ref
0c6f928  address PR feedback
eea2731  add streaming dataset support for pretraining datasets
ab5cd28  more gpt-neox long ctx fixes
1a82082  fix bettertransformers save, force it to skip after saving correctly in callback
1210dc8  more tweaks to do pre-training with bettertransformers
488a67d  experimental expansion of ctx len
71a43f8  add validation/warning for bettertransformers and torch version
1edc30c  add support for optimum bettertransformers
14163c1  fix for local variable 'LlamaForCausalLM' referenced before assignment
79e2a6f  Merge branch 'main' into patch-1 (Angainor Development)