Commits · Dovakiins/qwerrwe

quiet noise from llama tokenizer by setting pad token earlier

e029ab3

tmm1 committed on Aug 13, 2023

extract module for working with cfg

8cec513

tmm1 committed on Aug 13, 2023

fix DefaultDict.__or__

a13e45d

tmm1 committed on Aug 10, 2023

Attention mask and position id fixes for packing (#285)

2bb0b78
unverified

winglian committed on Aug 12, 2023

Add wandb_entity to wandb options, update example configs, update README (#361)

7019509
unverified

Morgan McGuire

winglian committed on Aug 12, 2023

Fix(model loading): Warn when model revision is passed to gptq (#364)

96bd6ae
unverified

Nanobit committed on Aug 12, 2023

Feat: Add rope scaling (#343)

b521206
unverified

Nanobit committed on Aug 12, 2023

Merge pull request #356 from tmm1/load_model-args

11ddccb
unverified

tmm1 committed on Aug 10, 2023

simplify load_model signature

7181022

tmm1 committed on Aug 9, 2023

log GPU memory usage

e303d64

tmm1 committed on Aug 9, 2023

ensure enable_input_require_grads is called on model before getting the peft model (#345)

176b888
unverified

winglian committed on Aug 6, 2023

experimental llama 2 chat support (#296)

3392270
unverified

Jan Philipp Harries committed on Aug 6, 2023

optimize the iteration when tokenizeing large datasets (#332)

fe28543
unverified

winglian committed on Aug 4, 2023

fix typo

2eda9e0

tmm1 committed on Aug 3, 2023

scope flash-attn+qlora fix correctly, scope to llama, add comment

78b9efb

tmm1 committed on Aug 3, 2023

move flash-attn monkey patch alongside the others

312a9fa

tmm1 committed on Aug 3, 2023

ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype

248bf90

tmm1 committed on Aug 2, 2023

qlora w flash attention fixes (#333)

77085ea
unverified

winglian committed on Aug 2, 2023

add peft install back since it doesn't get installed by setup.py (#331)

db2a358
unverified

winglian committed on Jul 31, 2023

don't resize embeddings to multiples of 32x by default

1066751

winglian committed on Jul 22, 2023

fix axolotl training args dataclass annotation

ebaec3c

winglian committed on Jul 17, 2023

Merge pull request #276 from theobjectivedad/logging_enhancement

6f16c45
unverified

winglian committed on Jul 16, 2023

Fixed pre-commit problems, fixed small bug in logging_config to handle LOG_LEVEL env var

b1f4f7a

theobjectivedad committed on Jul 15, 2023

Merge branch 'OpenAccess-AI-Collective:main' into logging_enhancement

83237b8
unverified

The Objective Dad committed on Jul 15, 2023

Add ability to pass 'name' argument to load_dataset

88089e8

chargoddard committed on Jul 14, 2023

Merge pull request #274 from OpenAccess-AI-Collective/NanoCode012-patch-2

168a7a0
unverified

Nanobit committed on Jul 14, 2023

Adding logging enhancement

553a86b

theobjectivedad committed on Jul 14, 2023

Feat: Add save_safetensors

5491278

Nanobit committed on Jul 14, 2023

Set push to hub as private by default

1514739
unverified

Nanobit committed on Jul 14, 2023

support for loading a model by git revision

69a2350

winglian committed on Jul 14, 2023

Merge branch 'main' into quadratic-warmup

c4cf567
unverified

winglian committed on Jul 10, 2023

better configuration for quadratic warmup

c49729d

winglian committed on Jul 10, 2023

params are adam_*, not adamw_*

19cf0bd

winglian committed on Jul 8, 2023

skip explicit model type too if using trust_remote_code

d69da99

winglian committed on Jul 8, 2023

don't use llama if trust_remote_code is set since that needs to use AutoModel path

66afb76

winglian committed on Jul 8, 2023

Merge pull request #221 from utensil/local_dataset

b9b7d4c
unverified

winglian committed on Jul 3, 2023

Fix future deprecation push_to_hub_model_id

e79c8e6

Nanobit committed on Jul 3, 2023

Merge pull request #224 from OpenAccess-AI-Collective/system-prompt-data

f150c02
unverified

winglian committed on Jun 27, 2023

push intermediate model checkpoints to hub

612aabd

winglian committed on Jun 27, 2023

add tests and supoort for loader for sys prompt data

3a38271

winglian committed on Jun 18, 2023

optionally define whether to use_fast tokenizer

47d601f

winglian committed on Jun 25, 2023

Support loading data files from a local directory

9bdd30c

utensil committed on Jun 21, 2023

add validation and tests for adamw hyperparam

cb9d3af

winglian committed on Jun 15, 2023

support adamw and grad norm hyperparams

6d0ee4b

winglian committed on Jun 15, 2023

add float16 docs and tweak typehints

88e17ff

winglian committed on Jun 15, 2023

style correction

136522f

maciej.karasek committed on Jun 14, 2023

issue #205 bugfix

556fe40

maciej.karasek committed on Jun 14, 2023

add axolotl trainer and quadratic warmup

7dc580b

winglian committed on Jun 12, 2023

Merge branch 'main' into flash-optimum

fd2c981
unverified

winglian committed on Jun 12, 2023

Merge pull request #187 from OpenAccess-AI-Collective/strip-peft-device-map

93dacba
unverified

winglian committed on Jun 12, 2023

