Commit History
simplify linear layer locator (267b7b2, tmm1)
fsdp requires params be the same type too (#493) (98bf76e, winglian)
Fix(tokenizer): Make sure to add pad for CodeLlamaTokenizer (#489) (4c37bd0, Nanobit)
fix condition and add logging (3a011ea, tmm1)
Merge branch 'main' into patch-4 (1f613e5, tmm1)
rename var and reformat (f319b0b, tmm1)
Update src/axolotl/utils/models.py (7fd662d)
Update src/axolotl/utils/models.py (9e69968)
let transformers handle adamw_bnb_8bit (868530c, tmm1)
ignore: address pr review (d03887f, Maxime)
ignore: linter (a184549, Maxime)
fix: finetune model inference needs the dtype fix to work with flash-attn (f311df9, Maxime)
fix types w lora (#478) (0b7ba57, winglian)
Fix(tokenizer): Fix condition to add pad token (#477) (71bd062, Nanobit)
improve llama pad token handling (#475) (cb9797e, winglian)
ReLoRA implementation (with quantization) (#322) (bde3c5a)
workaround so training doesn't hang when packed dataloader batches aren't even (#461) (c69faee, winglian)
recast loralayer, norm, lmhead + embed token weights per original qlora (#393) (96deb6b, winglian)
always drop samples that are too long (#452) (50682a3, winglian)
set env var for FSDP layer to wrap (#453) (5a1985b, winglian)
add missing positional arg (#450) (58cf7e7, winglian)
fix evals (#447) (ee26281, winglian)
support user defined prompters, pretokenized datasets in config, local parquet, local arrow files (#348) (d2e7f27, winglian)
disable eval using multipack for now (#437) (f733d0f, winglian)
fix comma, not a tuple (#436) (008505c, winglian)
use save_strategy from config if available (#434) (b3f5e00, winglian)
set env for FSDP offload params (#433) (5247c50, winglian)
Fix(config): Update handling of deepspeed config (#404) (c01015f, Nanobit)
fix eval steps and strategy (#403) (da10af0, winglian)
add utils.data.prepare_dataset (2e22404, tmm1)
use context manager to run things on rank0 before others (#397) (fc2d6be, winglian)
don't use mask expansion for inference (#392) (1687be6, winglian)
Feat(config): add max steps (#387) (3c2ad00, ittailup)
Added "epoch" evaluation_strategy (#388) (5d48a10, flotos)
Feat(config): Add hub_strategy (#386) (73a0b6e, Nanobit)
don't pass rope_scaling kwarg if it's None (#383) (919246f, winglian)
Fix crash when running without CUDA (15f6e57, chargoddard)
try to detect accelerate and only use device_map=None in that case (#373) (094fc2c, tmm1)
remove unnecessary local variable (0c96727, tmm1)
simplify `load_tokenizer` (efb3b2c, tmm1)
improve GPU logging to break out pytorch cache and system mem (7b55fe6, tmm1)
quiet noise from llama tokenizer by setting pad token earlier (e029ab3, tmm1)
extract module for working with cfg (8cec513, tmm1)
fix DefaultDict.__or__ (a13e45d, tmm1)
Attention mask and position id fixes for packing (#285) (2bb0b78, winglian)
Add wandb_entity to wandb options, update example configs, update README (#361) (7019509)
Fix(model loading): Warn when model revision is passed to gptq (#364) (96bd6ae, Nanobit)
Feat: Add rope scaling (#343) (b521206, Nanobit)