Commit History
74532dd chore(config): clean up old log for Qwen (#1034) (Nanobit)
1ffa386 Feat: Warns to add to modules_to_save when adding tokens or switching special_tokens (#787) (Nanobit)
ef24342 fix: switch to using the HuggingFace Transformers NEFT implementation (#941) (kallewoof)
5f79b82 new evals_per_epoch and saves_per_epoch to make things cleaner (#944) (winglian)
992e742 Support device_map=sequential & max_memory config parameters (#903)
a1da39c Feat(wandb): Refactor to be more flexible (#767) (Nanobit)
1115c50 Feat: Add Qwen (#894) (Nanobit)
7ee3c4c fix: warning should not show if eval_batch_size not provided (#896) (Nanobit)
fb12895 Feat: Add warmup_ratio (#893) (Nanobit)
1bc1186 allow overriding of model_config parameters from the YML (#853) (winglian)
2d8def6 simplify by removing duplicate base_model_config (#772) (winglian)
44c9d01 Fix: Warn when fullfinetune without adapter (#770) (Nanobit)
ca84cca convert exponential notation lr to floats (#771) (winglian)
9923b72 Fix: eval table conflict with eval_sample_packing (#769) (Nanobit)
15d3a65 Implement fused modules (#747)
2642cae refactor to set eval_batch_size earlier if unset, so we can warn if mismatched (#662) (winglian)
9ec2077 Make dataset_processes configurable (#651) (corbt)
590d603 Fix bug when using pretokenized datasets (#652) (ich)
eb41f76 Feat: Add example for Mistral (#644) (Nanobit)
383f88d Fix(cfg): Add validation for save_strategy and eval_strategy (#633) (Nanobit)
e7d3e2d use fastchat conversations template (#578) (winglian)
19a600a Feat: Add support for upstream FA2 (#626) (Nanobit)
cfbce02 Fix: Fail bf16 check when running on cpu during merge (#631) (Nanobit)
131afdb add bf16 check (#587) (winglian)
62eaee7 make phi training work with Loras (#588) (winglian)
2414673 E2e device cuda (#575) (winglian)
f6060a6 Model parallel (#538) (winglian)
5b67ea9 Add training callback to send predictions to WandB table (#521) (Glavin001)
2f586d1 Fix pretraining with iterable/streaming Dataset (#556) (Jan Philipp Harries)
e30f1e3 Early stopping metric (#537) (winglian)
3437149 recommend padding when using sample packing (#531) (winglian)
3355706 Add support for GPTQ using native transformers/peft (#468) (winglian)
44454ae move is_llama_derived_model into normalize_config (#524) (tmm1)
bde3c5a ReLoRA implementation (with quantization) (#322)
96deb6b recast loralayer, norm, lmhead + embed token weights per original qlora (#393) (winglian)
c01015f Fix(config): Update handling of deepspeed config (#404) (Nanobit)
094fc2c try to detect accelerate and only use device_map=None in that case (#373) (tmm1)
7b55fe6 improve GPU logging to break out pytorch cache and system mem (tmm1)
8cec513 extract module for working with cfg (tmm1)