huggyllama/llama-65b

#1
by KnutJaegersberg - opened

The config file has this as the file path, which looks a little weird:

huggyllama/llama-65b

LLM360/K2 would look better :)

KnutJaegersberg changed discussion status to closed

Hah yeah I agree, it looks funny.

Additionally, the lack of GQA (grouped-query attention) is an architectural choice that puzzles me.

> the config file has this as file path which looks a little weird
>
> huggyllama/llama-65b

Yeah, thanks for spotting this. It happened because, when we did the checkpoint conversion, we loaded an existing model and then modified it, loading our own weights, etc., so the original model's path stayed in the config.

Fixing them now.
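
For anyone who runs into the same leftover after a conversion, here is a minimal sketch of patching the stale path by hand. It assumes the path is stored under the `_name_or_path` key of `config.json` (the key Transformers normally writes); the helper name `fix_name_or_path` is hypothetical and not part of any library.

```python
import json
from pathlib import Path


def fix_name_or_path(config_file, new_name):
    """Rewrite the recorded model path in a config.json and return the old value.

    Assumption: the path lives under the "_name_or_path" key. Adjust the key
    if your config stores it elsewhere.
    """
    path = Path(config_file)
    config = json.loads(path.read_text(encoding="utf-8"))
    old_name = config.get("_name_or_path")  # None if the key is absent
    config["_name_or_path"] = new_name      # all other keys are left untouched
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return old_name
```

Calling `fix_name_or_path("config.json", "LLM360/K2")` on a local copy would replace the old value and return it, so you can check what was there before.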

> Hah yeah I agree, it looks funny.
>
> Additionally, lack of gqa is an architectural choice that is puzzling to me.

Maybe not the best choice, now that I am looking at it. During our initial design, we tended to make simple choices, since our goal is to make research on these models easier.
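
For context on what GQA would change, here is a rough sketch of the per-token KV-cache size with and without it. The dimensions are assumptions taken from the LLaMA-65B layout (80 layers, 64 attention heads, head dimension 128, fp16 values), and the 8-KV-head variant is purely hypothetical, not a description of K2.

```python
def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Bytes of KV cache stored per token: keys plus values across all layers."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


# Assumed LLaMA-65B-style dimensions (not confirmed for K2 in this thread).
NUM_LAYERS, NUM_HEADS, HEAD_DIM = 80, 64, 128

# Standard multi-head attention: one KV head per query head.
mha = kv_cache_bytes_per_token(NUM_LAYERS, NUM_HEADS, HEAD_DIM)
# Hypothetical GQA setup: 8 KV heads shared across the 64 query heads.
gqa = kv_cache_bytes_per_token(NUM_LAYERS, 8, HEAD_DIM)

print(f"MHA: {mha / 2**20:.2f} MiB per token")
print(f"GQA: {gqa / 2**20:.2f} MiB per token ({mha // gqa}x smaller)")
```

The cache shrinks in proportion to the number of KV heads, which is the main inference-time cost of skipping GQA; the trade-off is a slightly less standard attention layout.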
