Commit History
b2edaae  fix for flash attn w/ mistral w/o sample packing (#648)
b6ab8aa  Mistral flash attn packing (#646)
895f0a0  skip some flash attn patches unless explicitly enabled (#643)
e7d3e2d  use fastchat conversations template (#578)
60c7c48  update for recent transformers updates (#636)
19a600a  Feat: Add support for upstream FA2 (#626)
6b9b229  btlm and falcon monkey patches for flash attn (#566)
5b67ea9  Add training callback to send predictions to WandB table (#521)
fc8766e  reorg a bit
72a6fe1  use flash_attn rmsnorm when available (#526)
5fe30b1  use flash_attn xentropy when available (#525)
31f3e71  fix checkpoints on multi-GPU (#481)
bde3c5a  ReLoRA implementation (with quantization) (#322)
a213d99  fix eval regression caused in 13f7efaf74fcd3c4514277ccb71914c589873f6a
fbf49a4  is_causal fix for evals?
ee26281  fix evals (#447)
343ac84  fix check for flash attn branching (#377)
2bb0b78  Attention mask and position id fixes for packing (#285)
10405b9  Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) (#339)