Commit History
bcc78d8 bump transformers and update attention class map name (#1023)
1ffa386 Feat: Warns to add to modules_to_save when adding tokens or switching special_tokens (#787)
7bbaac9 fix mistral prompt assembly (#982)
5ada140 Fix prompt assembly for llama (#952)
f1de29d Respect sequence_len in config for `type: llama2_chat` (#926)
40a6362 support for mamba (#915)
a1da39c Feat(wandb): Refactor to be more flexible (#767)
fb12895 Feat: Add warmup_ratio (#893)
9bf854e Phi update 202311 (#876)
b3a61e8 add e2e tests for checking functionality of resume from checkpoint (#865)
6dc68a6 use temp_dir kwarg instead
7de6a56 missing dunder-init
c74f045 chore: lint
0402d19 make sure to cleanup tmp output_dir for e2e tests
2d8def6 simplify by removing duplicate base_model_config (#772)
44c9d01 Fix: Warn when fullfinetune without adapter (#770)
ca84cca convert exponential notation lr to floats (#771)
9923b72 Fix: eval table conflict with eval_sample_packing (#769)
21cf09b remove lora fused packing test (#758)
15d3a65 Implement fused modules (#747)
f30afe4 misc sharegpt fixes (#723)
697c50d Feat: Allow usage of native Mistral FA when no sample_packing (#669)
5b0bc48 add mistral e2e tests (#649)
383f88d Fix(cfg): Add validation for save_strategy and eval_strategy (#633)
e7d3e2d use fastchat conversations template (#578)
cfbce02 Fix: Fail bf16 check when running on cpu during merge (#631)
a363604 better handling and logging of empty sharegpt turns (#603)
03e5907 misc fixes to add gptq tests (#621)
12a2dbb Support Sample packing for phi arch (#586)
2414673 E2e device cuda (#575)
9218ebe e2e testing (#574)
2f586d1 Fix pretraining with iterable/streaming Dataset (#556) (Jan Philipp Harries)
0b4cf5b workaround for md5 variations (#533)
3437149 recommend padding when using sample packing (#531)
d5dcf9c fix test fixture b/c hf trainer tokenization changed (#464)
8cace80 fix fixture for new tokenizer handling in transformers (#428)
efb3b2c simplify `load_tokenizer`
8cec513 extract module for working with cfg
a13e45d fix DefaultDict.__or__
2bb0b78 Attention mask and position id fixes for packing (#285)
3392270 experimental llama 2 chat support (#296) (Jan Philipp Harries)