Commit History
0d2e34f Merge pull request #336 from tmm1/flash-attn
b56a6c0 Merge pull request #337 from tmm1/readme-fix
2eda9e0 fix typo
78b9efb scope flash-attn+qlora fix correctly, scope to llama, add comment
312a9fa move flash-attn monkey patch alongside the others
58d6659 python 3.10 and 3.11 both work fine, as does pytorch 2.1.0.dev
cc7e800 there is no configs folder
dc71d88 feat/llama-2 examples (#319)
248bf90 ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype
77085ea qlora w flash attention fixes (#333)
db2a358 add peft install back since it doesn't get installed by setup.py (#331)
6c9a87c pin accelerate so it works with llama2 (#330)
894cba0 fix FSDP save of final model (#329)
41a4d15 update README for updated docker images (#328)
2c37bf6 Prune cuda117 (#327)
9f69c4d latest HEAD of accelerate causes 0 loss immediately w FSDP (#321)
3d4984b update prompts for open orca to match the paper (#317)
ff7f18d disable gh cache for first step of docker builds too
cf62cfd add runpod envs to .bashrc, fix bnb env (#316)
c5df969 don't use the gha cache w docker
40a53ff Merge pull request #307 from OpenAccess-AI-Collective/xgen-user-sharegpt-tokens
dcdec44 Merge pull request #306 from ethanhs/xgen
3ffb018 Merge pull request #313 from OpenAccess-AI-Collective/tokenizer-llama2-embeddings
a94f2ee Merge pull request #299 from OpenAccess-AI-Collective/flash-attention-2
1066751 don't resize embeddings to multiples of 32x by default
1b63bf1 Merge pull request #308 from OpenAccess-AI-Collective/apache2-license
5cce2a4 add apache 2.0 license
2a428e8 better handling since xgen tokenizer breaks with convert_tokens_to_ids
cdf85fd pin flash attention 2 to the fix for backwards pass
9b790d3 flash attention 2
3881143 Add XGen info to README and example config (Ethan Smith)