Quantization made by Richard Erkhov.

K2-Chat - GGUF

| Name | Quant method | Size |
| --- | --- | --- |
| K2-Chat.Q2_K.gguf | Q2_K | 22.46GB |
| K2-Chat.IQ3_XS.gguf | IQ3_XS | 24.81GB |
| K2-Chat.IQ3_S.gguf | IQ3_S | 26.23GB |
| K2-Chat.Q3_K_S.gguf | Q3_K_S | 26.23GB |
| K2-Chat.IQ3_M.gguf | IQ3_M | 27.78GB |
| K2-Chat.Q3_K.gguf | Q3_K | 29.46GB |
| K2-Chat.Q3_K_M.gguf | Q3_K_M | 29.46GB |
| K2-Chat.Q3_K_L.gguf | Q3_K_L | 32.27GB |
| K2-Chat.IQ4_XS.gguf | IQ4_XS | 32.64GB |
| K2-Chat.Q4_0.gguf | Q4_0 | 34.27GB |
| K2-Chat.IQ4_NL.gguf | IQ4_NL | 34.48GB |
| K2-Chat.Q4_K_S.gguf | Q4_K_S | 34.51GB |
| K2-Chat.Q4_K.gguf | Q4_K | 36.65GB |
| K2-Chat.Q4_K_M.gguf | Q4_K_M | 36.65GB |
| K2-Chat.Q4_1.gguf | Q4_1 | 38.05GB |
| K2-Chat.Q5_0.gguf | Q5_0 | 41.84GB |
| K2-Chat.Q5_K_S.gguf | Q5_K_S | 41.84GB |
| K2-Chat.Q5_K.gguf | Q5_K | 43.06GB |
| K2-Chat.Q5_K_M.gguf | Q5_K_M | 43.06GB |
| K2-Chat.Q5_1.gguf | Q5_1 | 45.62GB |
| K2-Chat.Q6_K.gguf | Q6_K | 49.88GB |
| K2-Chat.Q8_0.gguf | Q8_0 | 64.61GB |
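
When choosing among the files listed above, a common first filter is whether the file fits in the memory you have. The sketch below picks the largest listed quantization that fits a given budget. The `pick_quant` helper and its `headroom_gb` default are hypothetical and chosen only for illustration. File size is a lower bound on actual memory use, since the runtime also needs room for the KV cache and buffers.

```python
# File sizes in GB, copied from the table of quantized files above.
QUANT_SIZES_GB = {
    "Q2_K": 22.46, "IQ3_XS": 24.81, "IQ3_S": 26.23, "Q3_K_S": 26.23,
    "IQ3_M": 27.78, "Q3_K": 29.46, "Q3_K_M": 29.46, "Q3_K_L": 32.27,
    "IQ4_XS": 32.64, "Q4_0": 34.27, "IQ4_NL": 34.48, "Q4_K_S": 34.51,
    "Q4_K": 36.65, "Q4_K_M": 36.65, "Q4_1": 38.05, "Q5_0": 41.84,
    "Q5_K_S": 41.84, "Q5_K": 43.06, "Q5_K_M": 43.06, "Q5_1": 45.62,
    "Q6_K": 49.88, "Q8_0": 64.61,
}


def pick_quant(budget_gb, headroom_gb=4.0):
    """Return the largest quant whose file fits in the budget, or None.

    headroom_gb is an assumed allowance for runtime overhead (KV cache,
    buffers); it is an illustrative default, not a measured value.
    """
    limit = budget_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= limit}
    if not fitting:
        return None
    # When two quants have the same size, the one listed first is returned.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for budget in (24, 40, 80):
        print(f"{budget} GB budget -> {pick_quant(budget)}")
```

Smaller files trade output quality for memory, so this filter only narrows the candidates; it does not rank them by quality.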

Original model description:

license: apache-2.0

K2-Chat: a fully-reproducible large language model outperforming Llama 2 70B Chat using 35% less compute

K2 Chat is finetuned from K2-65B. K2 Chat outperforms Llama 2-70B-Chat on all evaluations conducted. The model also outperforms Llama 3-70B-Instruct on coding tasks.

[Image: K2 evaluation table]

LLM360 Model Performance and Evaluation Collection

The LLM360 Performance and Evaluation Collection is a robust evaluation set consisting of general and domain-specific evaluations that assess model knowledge and function.

The evaluations cover standard best-practice benchmarks as well as medical, math, and coding knowledge. More about the evaluations can be found here.

[Image: K2 full evaluation table]

Open LLM Leaderboard

| Evaluation | Score | Raw Score |
| --- | --- | --- |
| IFEval | 51.52 | 52 |
| BBH | 33.79 | 54 |
| Math Lvl 5 | 1.59 | 2 |
| GPQA | 7.49 | 31 |
| MUSR | 16.82 | 46 |
| MMLU-PRO | 26.34 | 34 |
| Average | 22.93 | 36.5 |
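
The Average row appears to be the plain arithmetic mean of the six benchmark rows. The short check below recomputes both columns from the values in the table. Treating the average as an unweighted mean is an assumption, and `column_mean` is a helper name introduced here.

```python
# (score, raw score) per benchmark, copied from the leaderboard table above.
SCORES = {
    "IFEval": (51.52, 52),
    "BBH": (33.79, 54),
    "Math Lvl 5": (1.59, 2),
    "GPQA": (7.49, 31),
    "MUSR": (16.82, 46),
    "MMLU-PRO": (26.34, 34),
}


def column_mean(index):
    """Unweighted mean of one column (0 = score, 1 = raw score)."""
    values = [row[index] for row in SCORES.values()]
    return sum(values) / len(values)


score_avg = column_mean(0)
raw_avg = column_mean(1)

if __name__ == "__main__":
    print(f"Score average:     {score_avg:.3f}")
    print(f"Raw score average: {raw_avg:.1f}")
```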

Datasets and Mix

| Subset | #Tokens | Avg. #Q | Avg. Query Len | Avg. #R | Avg. Reply Len |
| --- | --- | --- | --- | --- | --- |
| MathInstruct | 66,639,699 | 1.00 | 81.53 | 1.00 | 172.78 |
| OpenHermes-2 | 404,820,694 | 1.01 | 152.38 | 1.01 | 249.12 |
| FLAN_3M | 2,346,961,387 | 1.00 | 727.49 | 1.00 | 54.83 |
| Stanford Encyclopedia of Philosophy | 786,928 | 1.00 | 219.09 | 1.00 | 166.28 |
| TinyStories | 1,448,898 | 1.00 | 260.82 | 1.00 | 207.47 |
| Safety & Alignment Data | 99,976,621 | 1.00 | 126.71 | 1.00 | 373.79 |
| Total | 2,920,634,227 | | | | |
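
To make the mix easier to read, the token counts above can be turned into shares of the total. The sketch below does that and also checks that the subset counts add up to the reported total. The `mix_shares` helper is a name introduced here, and the shares are by token count only; they say nothing about sampling weights used during finetuning.

```python
# Token counts per subset, copied from the table above.
TOKENS = {
    "MathInstruct": 66_639_699,
    "OpenHermes-2": 404_820_694,
    "FLAN_3M": 2_346_961_387,
    "Stanford Encyclopedia of Philosophy": 786_928,
    "TinyStories": 1_448_898,
    "Safety & Alignment Data": 99_976_621,
}
REPORTED_TOTAL = 2_920_634_227  # the "Total" row of the table


def mix_shares(tokens):
    """Return each subset's fraction of the total token count."""
    total = sum(tokens.values())
    return {name: count / total for name, count in tokens.items()}


shares = mix_shares(TOKENS)

if __name__ == "__main__":
    print("Counts match reported total:", sum(TOKENS.values()) == REPORTED_TOTAL)
    # Largest subsets first.
    for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:<38} {share:7.2%}")
```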

Loading K2-Chat

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("LLM360/K2-Chat")
model = AutoModelForCausalLM.from_pretrained("LLM360/K2-Chat")

# The user turn is wrapped in the model's special tags.
prompt = '<|beginofuser|>what is the highest mountain on earth?<|beginofsystem|>'

# Tokenize the prompt, then sample up to 128 new tokens.
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
gen_tokens = model.generate(input_ids, do_sample=True, max_new_tokens=128)

print("-" * 20 + "Output for model" + "-" * 20)
print(tokenizer.batch_decode(gen_tokens)[0])
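
If you build prompts by hand, a small helper keeps the tags consistent. The sketch below covers only the single-turn format shown in the example above. `build_prompt` and the two constant names are introduced here for illustration, and the multi-turn layout is not specified in this card, so it is deliberately left out.

```python
# Tags taken from the single-turn prompt in the loading example.
USER_TAG = "<|beginofuser|>"
REPLY_TAG = "<|beginofsystem|>"


def build_prompt(user_message: str) -> str:
    """Wrap one user message in the tags used by the example prompt."""
    return f"{USER_TAG}{user_message.strip()}{REPLY_TAG}"


if __name__ == "__main__":
    print(build_prompt("what is the highest mountain on earth?"))
```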

Alternatively, you can construct the prompt by applying the tokenizer's chat template to the input conversation:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("LLM360/K2-Chat")
model = AutoModelForCausalLM.from_pretrained("LLM360/K2-Chat")

# A conversation is a list of role/content dictionaries.
messages = [{"role": "user", "content": "what is the highest mountain on earth?"}]

# The chat template adds the special tags and the generation prompt.
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
gen_tokens = model.generate(input_ids, do_sample=True, max_new_tokens=128)

print("-" * 20 + "Output for model" + "-" * 20)
print(tokenizer.batch_decode(gen_tokens)[0])
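
Both examples above print the prompt together with the reply, because for decoder-only models `generate` returns the prompt tokens followed by the new tokens. A minimal sketch of dropping the prompt portion follows, using plain lists so it runs without downloading the model. `strip_prompt` and the token ids are hypothetical.

```python
def strip_prompt(generated_ids, prompt_length):
    """Return only the tokens produced after the prompt."""
    if prompt_length > len(generated_ids):
        raise ValueError("prompt_length exceeds the generated sequence length")
    return generated_ids[prompt_length:]


if __name__ == "__main__":
    # Hypothetical token ids standing in for real tokenizer output.
    prompt_ids = [11, 22, 33]
    generated = prompt_ids + [44, 55]
    print(strip_prompt(generated, len(prompt_ids)))
```

With the tensors from the examples above, the same slice would be `gen_tokens[0][input_ids.shape[-1]:]`, which can then be passed to `tokenizer.decode`.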

LLM360 Developer Suite

We provide step-by-step finetuning tutorials for tech enthusiasts, AI practitioners and academic or industry researchers here.

About LLM360

LLM360 is an open research lab enabling community-owned AGI through open-source large model research and development.

LLM360 enables community-owned AGI by creating standards and tools to advance the bleeding edge of LLM capability and empower knowledge transfer, research, and development.

We believe in a future where artificial general intelligence (AGI) is created by the community, for the community. Through an open ecosystem of equitable computational resources, high quality data, and flowing technical knowledge, we can ensure ethical AGI development and universal access for all innovators.

Visit us

Citation

BibTeX:

@article{
      title={LLM360 K2-65B: Scaling Up Fully Transparent Open-Source LLMs}, 
      author={The LLM360 Team},
      year={2024},
}

GGUF model size: 65.3B params
Architecture: llama
