Commit History
553c80f  streaming multipack for pretraining dataset (#959)
f243c21  RL/DPO (#935)  [winglian]
5ea3aa3  Fix Deepspeed loading (#950)  [winglian]
40a6362  support for mamba (#915)  [winglian]
797f3dd  don't train if eval split is too small (#873)  [winglian]
1470650  various bugfixes (#856)  [winglian]
641e6f7  multipack w batch sampler (#795)  [winglian]
b2430ce  use accelerate logging for zero/main logging only  [winglian]
4c834bf  cleanup verbosity a bit  [winglian]
05bd6f1  Threaded MultipackDistributedDataloader with prefetched samples (#759)  [casperhansen]
6c81c61  refactor setup trainer so we can add more hooks (#773)  [winglian]
3553172  fixes for alpaca w chatml, and don't include attention_mask w mistral for flash attention (#728)  [winglian]
490923f  Save Axolotl config as WandB artifact (#716)  [Jan Philipp Harries]
2642cae  refactor to set eval_batch_size earlier if unset, so we can warn if mismatched (#662)  [winglian]
9ec2077  Make dataset_processes configurable (#651)  [corbt]
383f88d  Fix(cfg): Add validation for save_strategy and eval_strategy (#633)  [Nanobit]
e8cbf50  attention_mask not needed for training (#642)  [winglian]
d5f8589  chore(callback): Remove old peft saving code (#510)  [Nanobit]
03e5907  misc fixes to add gptq tests (#621)  [winglian]
2844eb2  run eval on the first step to get a baseline (#617)  [winglian]
31b9e0c  minor tweaks to simplify (#597)  [winglian]
b15b19e  gather/broadcast the max value of the packing efficiency automatically (#463)  [winglian]
ab534d7  don't add position_ids for evals (#591)  [winglian]
21ec195  optionally configure sample packing for evals (#589)  [winglian]
3fbde76  fix save_steps so it doesn't get duplicated (#567)  [winglian]
36e53c7  improve how we setup eval/save strategies and steps (#547)  [winglian]
e5bb22a  add optimization for group-by-len (#563)  [winglian]
5b67ea9  Add training callback to send predictions to WandB table (#521)  [Glavin001]
e30f1e3  Early stopping metric (#537)  [winglian]
a546ca2  misc fixes/improvements (#513)  [winglian]
3355706  Add support for GPTQ using native transformers/peft (#468)  [winglian]
7710e81  log supervised token count (#448)  [winglian]
396a7a7  Added advanced DDP args (#515)  [Jan Philipp Harries]
c56b450  drop empty tokenized rows too (#509)  [winglian]
7657632  add eval benchmark callback (#441)  [winglian]
fd55bc8  use math.ceil instead of round /cc #498  [tmm1]
8e197f6  pad_to_worst_case_seq_len boolean, for testing memory limits (#498)
868530c  let transformers handle adamw_bnb_8bit  [tmm1]
bde3c5a  ReLoRA implementation (with quantization) (#322)
50682a3  always drop samples that are too long (#452)  [winglian]
5a1985b  set env var for FSDP layer to wrap (#453)  [winglian]
58cf7e7  add missing positional arg (#450)  [winglian]
ee26281  fix evals (#447)  [winglian]
f733d0f  disable eval using multipack for now (#437)  [winglian]
008505c  fix comma, not a tuple (#436)  [winglian]
b3f5e00  use save_strategy from config if available (#434)  [winglian]
5247c50  set env for FSDP offload params (#433)  [winglian]