Commit History
recommend padding when using sample packing (#531)  [3437149]
log rank too (#527)  [245c5c4]
misc fixes/improvements (#513)  [a546ca2]
Add support for GPTQ using native transformers/peft (#468)  [3355706]
Merge pull request #520 from bdashore3/sharegpt-fixes  [daa4fac]
reorg a bit  [fc8766e]
use flash_attn rmsnorm when available (#526)  [72a6fe1]
use flash_attn xentropy when available (#525)  [5fe30b1]
move is_llama_derived_model into normalize_config (#524)  [44454ae]
No gather single gpu (#523)  [09f1543]
Prompters: ShareGPT: Allow for custom system prompts  [995557b]
fix: bad dtype for full finetune (#504)  [1991946]
Fix(doc): Inform Windows users to use WSL/docker (#518)  [f51c9c5]
log supervised token count (#448)  [7710e81]
Debug tokenization output: Add ability to output text only (no tokens), and/or specify num samples to see (#511)  [48434be]  (Tom Jobbins)
Added advanced DDP args (#515)  [396a7a7]  (Jan Philipp Harries)
split train from other cli options (#503)  [b21e4a2]
Changed Bench Eval to report metrics correctly by split. Added total accuracy and renamed previously used bench_accuracy to bench_average_accuracy. (#512)  [42f9642]  (Alpay Ariyak)
drop empty tokenized rows too (#509)  [c56b450]
set zero3 optimizer betas to auto so they inherit from HF trainer config (#507)  [1e07c16]
add eval benchmark callback (#441)  [7657632]
customizable ascii art (#506)  [548787d]
support for datasets with multiple names (#480)  [5ac3392]
remove --force-reinstall from Dockerfile to ensure correct pytorch version (#492)  [e356b29]
Fix(doc): Clarify no amp to full yaml docs (#496)  [48c5647]
tweak: use default config file when only one file is present (#501)  [36b2e1c]  (Maxime)
Refactor train cfg cli (#499)  [125cccb]
use math.ceil instead of round /cc #498  [fd55bc8]
pad_to_worst_case_seq_len boolean, for testing memory limits (#498)  [8e197f6]
simplify linear layer locator  [267b7b2]
fsdp requires params be the same type too (#493)  [98bf76e]
Fix(tokenizer): Make sure to add pad for CodeLlamaTokenizer (#489)  [4c37bd0]
Merge pull request #485 from maximegmd/patch-4  [f144e98]
fix condition and add logging  [3a011ea]
Merge branch 'main' into patch-4  [1f613e5]
rename var and reformat  [f319b0b]
Update src/axolotl/utils/models.py  [7fd662d]
Update src/axolotl/utils/models.py  [9e69968]
Feat(cfg): Add code-llama configs for all sizes (#479)  [3513071]
Feat(deepspeed): Add zero2 config (#476)  [3fc9006]
Feat(doc): Update eval_steps doc (#487)  [ad8be43]
Add example Llama 2 ReLoRA config (#471)  [fe4d6ba]
Merge pull request #486 from OpenAccess-AI-Collective/adam-bnb-simpler  [f313010]
let transformers handle adamw_bnb_8bit  [868530c]
ignore: address pr review  [d03887f]  (Maxime)
fix: inference did not move the model to the correct device (#483)  [17605b8]  (Maxime)
ignore: linter  [a184549]  (Maxime)
fix: finetune model inference needs the dtype fix to work with flash-attn  [f311df9]  (Maxime)
Fix missing 'packaging' wheel (#482)  [c500d02]  (Maxime)