Enable or disable bf16 support based on availability (#1116) 0865613 Simon Hällqvist committed on Jan 14, 2024
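Availability-based bf16 resolution typically amounts to a capability probe at startup. A minimal sketch, assuming the config accepts an `auto` value (the helper name `resolve_bf16` is illustrative, not the project's actual code):

```python
import torch

def resolve_bf16(requested):
    """Illustrative helper: resolve a bf16 setting of True/False/"auto" at startup."""
    if requested == "auto":
        # bf16 requires hardware support (e.g. Ampere-or-newer NVIDIA GPUs)
        return torch.cuda.is_available() and torch.cuda.is_bf16_supported()
    return bool(requested)
```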
update sharegpt conversations when the chatml chat template is set (#1075) [skip ci] 0ce1a65 winglian committed on Jan 10, 2024
be more robust about checking embedding modules for LoRA finetunes (#1074) [skip ci] 0f10080 winglian committed on Jan 10, 2024
feature: better device mapping for large models (#918) bdfefaf kallewoof, winglian committed on Jan 5, 2024
Feat: warn to update modules_to_save when adding tokens or switching special_tokens (#787) 1ffa386 Nanobit committed on Dec 22, 2023
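The warning exists because newly added tokens live in the embedding and output head, which LoRA leaves frozen by default. In peft terms this is the `modules_to_save` field (the module names below assume a Llama-style architecture):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],          # low-rank adapters on attention projections
    modules_to_save=["embed_tokens", "lm_head"],  # fully train these so new tokens get real weights
)
```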
fix: switch to using the HuggingFace Transformers NEFT implementation (#941) ef24342 kallewoof committed on Dec 13, 2023
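Recent transformers releases expose NEFTune as a single `TrainingArguments` field, which is what makes a custom implementation removable. A minimal example (the alpha value is only a commonly cited starting point, not a recommendation from this commit):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    neftune_noise_alpha=5.0,  # Trainer injects uniform noise into embeddings during training
)
```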
new evals_per_epoch and saves_per_epoch options to make eval/save scheduling cleaner (#944) 5f79b82 winglian committed on Dec 12, 2023
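These options replace raw step counts with per-epoch frequencies. A hypothetical sketch of the translation into a step interval (`frequency_to_steps` is an illustrative name, not the project's code):

```python
def frequency_to_steps(steps_per_epoch: int, times_per_epoch: int) -> int:
    # Hypothetical helper: turn "evaluate/save N times per epoch" into a step interval.
    return max(steps_per_epoch // times_per_epoch, 1)

# e.g. 1000 optimizer steps per epoch with evals_per_epoch: 4 -> eval every 250 steps
assert frequency_to_steps(1000, 4) == 250
```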
Support device_map=sequential & max_memory config parameters (#903) 992e742 Bryan Thornbury, winglian committed on Dec 4, 2023
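Both parameters map onto the standard transformers/accelerate loading path: `sequential` fills devices in order instead of balancing layers across them, and `max_memory` caps each device. A sketch (the model id and memory limits are illustrative):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",               # illustrative model id
    device_map="sequential",                  # fill GPU 0 first, then spill to the next device
    max_memory={0: "20GiB", "cpu": "64GiB"},  # per-device caps honored by the placement logic
)
```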
fix: warning should not show if eval_batch_size is not provided (#896) 7ee3c4c Nanobit committed on Nov 25, 2023
allow overriding of model_config parameters from the YML (#853) 1bc1186 winglian committed on Nov 16, 2023
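A hypothetical sketch of what applying YML overrides onto a loaded transformers config can look like (the helper name, model id, and rope_scaling override are illustrative):

```python
from transformers import AutoConfig

def apply_overrides(model_name: str, overrides: dict):
    # Hypothetical helper: copy user-supplied keys onto the config before the model loads.
    config = AutoConfig.from_pretrained(model_name)
    for key, value in (overrides or {}).items():
        setattr(config, key, value)
    return config

config = apply_overrides("meta-llama/Llama-2-7b-hf", {"rope_scaling": {"type": "linear", "factor": 2.0}})
```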
simplify by removing duplicate base_model_config (#772) 2d8def6 winglian committed on Oct 23, 2023
Fix: warn when doing a full finetune without an adapter (#770) 44c9d01 Nanobit committed on Oct 22, 2023
convert exponential-notation lr values to floats (#771) ca84cca winglian committed on Oct 22, 2023
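The root cause is YAML 1.1 parsing: a bare `1e-5` has no decimal point, so PyYAML loads it as a string rather than a float. A minimal coercion sketch:

```python
import yaml

def coerce_lr(value):
    # YAML 1.1 treats "1e-5" (no dot) as a string; cast it back to a float.
    return float(value) if isinstance(value, str) else value

cfg = yaml.safe_load("learning_rate: 1e-5")
assert isinstance(cfg["learning_rate"], str)     # PyYAML parsed it as a string
assert coerce_lr(cfg["learning_rate"]) == 1e-05  # coerced back to a float
```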
Fix: eval table conflict with eval_sample_packing (#769) 9923b72 Nanobit committed on Oct 22, 2023
refactor to set eval_batch_size earlier if unset, so we can warn if mismatched (#662) 2642cae winglian committed on Oct 3, 2023
Fix(cfg): add validation for save_strategy and eval_strategy (#633) 383f88d Nanobit committed on Sep 28, 2023
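A hypothetical sketch of the kind of cross-field check such validation performs (field names follow the config, the logic is illustrative):

```python
def check_strategies(cfg: dict) -> None:
    # Hypothetical validation: step-based intervals contradict a non-"steps" strategy.
    if cfg.get("save_steps") and cfg.get("save_strategy") not in (None, "steps"):
        raise ValueError("save_steps is set but save_strategy is not 'steps'")
    if cfg.get("eval_steps") and cfg.get("evaluation_strategy") not in (None, "steps"):
        raise ValueError("eval_steps is set but evaluation_strategy is not 'steps'")
```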
Fix: fail bf16 check when running on CPU during merge (#631) cfbce02 Nanobit committed on Sep 25, 2023
Add training callback to send predictions to WandB table (#521) 5b67ea9 Glavin001 committed on Sep 13, 2023
Fix pretraining with iterable/streaming Dataset (#556) 2f586d1 Jan Philipp Harries committed on Sep 13, 2023
recommend padding when using sample packing (#531) 3437149 winglian committed on Sep 6, 2023
Add support for GPTQ using native transformers/peft (#468) 3355706 winglian committed on Sep 5, 2023
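With native support, a prequantized GPTQ checkpoint loads through the ordinary transformers entry point, with the quantization settings read from the checkpoint's own config (the model id is illustrative, and a GPTQ backend such as AutoGPTQ must be installed):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-GPTQ",  # illustrative prequantized GPTQ checkpoint
    device_map="auto",           # quantization config is picked up from the repo itself
)
```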
move is_llama_derived_model into normalize_config (#524) 44454ae tmm1 committed on Sep 4, 2023
ReLoRA implementation (with quantization) (#322) bde3c5a chargoddard, winglian committed on Aug 24, 2023
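ReLoRA trains a LoRA adapter for a fixed interval, merges it into the base weights, then restarts with a fresh adapter so effective rank can accumulate across cycles. A heavily simplified scheduling sketch (names and structure are hypothetical, not the PR's implementation):

```python
def relora_reset_due(step: int, relora_interval: int) -> bool:
    """Hypothetical helper: decide when a ReLoRA merge-and-restart cycle fires."""
    return relora_interval > 0 and step > 0 and step % relora_interval == 0

# On each reset step the training loop would, in outline: fold the current
# low-rank delta into the base weights (peft's LoRA models expose a
# merge_adapter() helper), reinitialize the adapter matrices, and prune most
# optimizer state so the next cycle learns a new low-rank direction.
assert [s for s in range(1, 601) if relora_reset_due(s, 200)] == [200, 400, 600]
```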
recast LoRA layer, norm, lm_head and embed token weights per the original QLoRA (#393) 96deb6b winglian committed on Aug 21, 2023
Fix(config): update handling of deepspeed config (#404) c01015f Nanobit committed on Aug 15, 2023
try to detect accelerate and only use device_map=None in that case (#373) 094fc2c tmm1 committed on Aug 13, 2023