Commit History
faecff9 support to disable exllama for gptq (#604)
aa656e0 Delete duplicate lines (#606)
6b9b229 btlm and falcon monkey patches for flash attn (#566)
62eaee7 make phi training work with Loras (#588)
3607882 don't resize embeddings if it's already large enough (#577)
12a2dbb Support Sample packing for phi arch (#586)
5b67ea9 Add training callback to send predictions to WandB table (#521)
a94f9cb fix for quant config from model (#540)
3355706 Add support for GPTQ using native transformers/peft (#468)
1991946 fix: bad dtype for full finetune (#504)
125cccb Refactor train cfg cli (#499)
267b7b2 simplify linear layer locator
98bf76e fsdp requires params be the same type too (#493)
4c37bd0 Fix(tokenizer): Make sure to add pad for CodeLlamaTokenizer (#489)
3a011ea fix condition and add logging
f319b0b rename var and reformat
7fd662d Update src/axolotl/utils/models.py
9e69968 Update src/axolotl/utils/models.py
d03887f ignore: address pr review (Maxime)
a184549 ignore: linter (Maxime)
f311df9 fix: finetune model inference needs the dtype fix to work with flash-attn (Maxime)