metadata
library_name: transformers
license: apache-2.0
language:
- en
EdgeRunner-Tactical-7B
Introduction
EdgeRunner-Tactical-7B is a powerful and efficient language model for the edge. Our mission is to build Generative AI for the edge that is safe, secure, and transparent. To that end, the EdgeRunner team is proud to release EdgeRunner-Tactical-7B, the most powerful language model for its size to date.
EdgeRunner-Tactical-7B is a 7-billion-parameter language model that delivers powerful performance while demonstrating the potential of running state-of-the-art (SOTA) models at the edge. It is the highest-scoring model in the 7B-XXB range, outperforming Gemini Pro, Mixtral-8x7B, and Meta-Llama-3-8B-Instruct. On the Arena-Hard benchmark, EdgeRunner-Tactical-7B also outperforms larger models, including GPT-4o mini and Mistral Large.
Highlights
- 7 billion parameters
- SOTA performance for its size
- Initialized from Qwen2-Instruct
- Applied Self-Play Preference Optimization (SPPO) for continued training on top of Qwen2-Instruct
- Outperforms Mistral Large on Arena-Hard (38.2 vs. 37.7)
- Outperforms Mixtral-8x7B on Arena-Hard (38.2 vs. 23.4)
- Approaches Meta Llama-3-70B on Arena-Hard (38.2 vs. 41.1)
- Supports a context length of 128K tokens, making it ideal for tasks requiring many conversation turns or working with large amounts of text
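For long multi-turn conversations, the history still has to fit inside the context window. The sketch below shows one possible way to drop the oldest turns until a conversation fits a token budget. It rests on several assumptions that are not part of this model card: `trim_history` and `rough_token_count` are hypothetical helpers, the word-count token estimate is only a stand-in for the real tokenizer, and "128K" is read as 128 × 1024 tokens.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

# Assumption: "128K" is read as 128 * 1024 tokens; check the model config for the real limit.
CONTEXT_LIMIT = 128 * 1024


def rough_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_history(
    messages: List[Message],
    max_tokens: int = CONTEXT_LIMIT,
    reserve_for_reply: int = 512,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[Message]:
    """Drop the oldest non-system turns until the conversation fits the budget."""
    budget = max_tokens - reserve_for_reply
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    def total(msgs: List[Message]) -> int:
        return sum(count_tokens(m["content"]) for m in msgs)

    # Always keep the most recent turn, even if it alone exceeds the budget.
    while len(turns) > 1 and total(system + turns) > budget:
        turns.pop(0)
    return system + turns


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "first question " * 10},
    {"role": "assistant", "content": "first answer " * 10},
    {"role": "user", "content": "latest question"},
]
# A tiny budget is used here only to make the trimming visible.
trimmed = trim_history(history, max_tokens=20, reserve_for_reply=10)
print([m["role"] for m in trimmed])  # ['system', 'user']
```

In practice, `count_tokens` could be replaced with a function that measures length using the model's own tokenizer, which gives a more faithful count than splitting on whitespace.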
Quickstart
The snippet below shows how to load the tokenizer and model and how to generate text.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "edgerunner-ai/EdgeRunner-Tactical-7B"

# Load the weights in their native dtype and place them on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]

# Render the chat messages into the model's prompt format.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
# Move the inputs to wherever the model was placed (GPU or CPU).
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Pass input_ids and attention_mask together.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
)
# Strip the prompt tokens so that only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
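The list comprehension near the end of the snippet is there because, for decoder-only models, `generate` returns each prompt followed by the newly generated tokens. The toy example below illustrates that slicing step with plain Python lists; the token IDs are made up for illustration and are not real vocabulary entries.

```python
# Made-up token IDs standing in for tensors.
prompt_ids = [[101, 7, 8, 9]]            # one prompt of four tokens
generated = [[101, 7, 8, 9, 42, 43, 2]]  # the output echoes the prompt first

# Keep only what comes after the prompt in each sequence.
new_tokens = [
    output[len(prompt):] for prompt, output in zip(prompt_ids, generated)
]
print(new_tokens)  # [[42, 43, 2]]
```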
Example Outputs
[Figure: "Create a Quantum Future" example output]
[Figure: example output for a structured JSON request]
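When asking a chat model for structured JSON, the reply may still wrap the object in extra prose, so it can help to parse defensively. The sketch below is one possible approach and is not taken from the EdgeRunner tooling: `extract_json` is a hypothetical helper, and the system prompt and sample reply are invented for illustration.

```python
import json
from typing import Any, Optional

# Hypothetical messages that request JSON; the wording is only an example.
messages = [
    {"role": "system", "content": "Reply with a single JSON object and nothing else."},
    {"role": "user", "content": "Describe Ada Lovelace with the keys 'name' and 'skills'."},
]


def extract_json(reply: str) -> Optional[Any]:
    """Return the first JSON object found in a model reply, or None."""
    decoder = json.JSONDecoder()
    start = reply.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows.
            value, _ = decoder.raw_decode(reply, start)
            return value
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = reply.find("{", start + 1)
    return None


# Invented reply, standing in for real model output.
reply = 'Here is the result: {"name": "Ada", "skills": ["math"]} Let me know if you need more.'
parsed = extract_json(reply)
print(parsed)  # {'name': 'Ada', 'skills': ['math']}
```

The `messages` list can be passed to `tokenizer.apply_chat_template` in the same way as in the Quickstart, and `extract_json` applied to the decoded response.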
Evaluation
This section reports results for EdgeRunner-Tactical-7B on standard automatic benchmarks.
Arena-Hard Benchmark
| Model | Score | 95% CI | Avg #Tokens |
| --- | --- | --- | --- |
| gpt-4-turbo-2024-04-09 | 82.6 | (-1.6, 2.1) | 662 |
| gpt-4-0125-preview | 78.0 | (-1.8, 2.1) | 619 |
| claude-3-opus-20240229 | 60.4 | (-2.8, 2.6) | 541 |
| gpt-4-0314 | 50.0 | (0.0, 0.0) | 423 |
| claude-3-haiku-20240307 | 41.5 | (-2.5, 2.9) | 505 |
| llama-3-70b-chat-hf | 41.1 | (-2.7, 1.7) | 583 |
| EdgeRunner-Tactical-7B | 38.2 | (-2.3, 2.7) | 719 |
| gpt-4-0613 | 37.9 | (-2.2, 2.6) | 354 |
| mistral-large-2402 | 37.7 | (-1.9, 2.0) | 400 |
| mixtral-8x22b-instruct-v0.1 | 36.4 | (-2.0, 2.0) | 430 |
| Qwen1.5-72B-Chat | 36.1 | (-2.3, 2.4) | 474 |
| command-r-plus | 33.1 | (-2.6, 2.0) | 541 |
| mistral-medium | 31.9 | (-2.1, 2.1) | 485 |
| gpt-3.5-turbo-0613 | 24.8 | (-2.2, 1.7) | 401 |
| dbrx-instruct | 24.6 | (-2.0, 2.4) | 415 |
| Qwen2-7B-Instruct | 23.5 | (-1.9, 2.0) | 605 |
| Mixtral-8x7B-Instruct-v0.1 | 23.4 | (-1.9, 1.9) | 457 |
| gpt-3.5-turbo-0125 | 23.3 | (-1.9, 2.0) | 329 |
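Several Arena-Hard scores near EdgeRunner-Tactical-7B are close together, so it can be useful to check whether the reported intervals overlap. The sketch below does that for a few rows copied from the table. It assumes the 95% CI column lists lower and upper offsets from the score, and the helpers `interval` and `overlaps` are hypothetical. Overlapping intervals are only a rough indicator and not a formal significance test.

```python
from typing import Dict, Tuple

# (score, lower offset, upper offset), copied from the Arena-Hard table.
ROWS: Dict[str, Tuple[float, float, float]] = {
    "llama-3-70b-chat-hf": (41.1, -2.7, 1.7),
    "EdgeRunner-Tactical-7B": (38.2, -2.3, 2.7),
    "gpt-4-0613": (37.9, -2.2, 2.6),
    "mistral-large-2402": (37.7, -1.9, 2.0),
    "Qwen2-7B-Instruct": (23.5, -1.9, 2.0),
}


def interval(model: str) -> Tuple[float, float]:
    """Absolute interval, assuming the CI column holds offsets from the score."""
    score, lower, upper = ROWS[model]
    return (round(score + lower, 1), round(score + upper, 1))


def overlaps(a: str, b: str) -> bool:
    """True when the two models' intervals share at least one point."""
    a_lo, a_hi = interval(a)
    b_lo, b_hi = interval(b)
    return a_lo <= b_hi and b_lo <= a_hi


target = "EdgeRunner-Tactical-7B"
for name in ROWS:
    if name != target:
        print(f"{name}: overlaps with {target} = {overlaps(name, target)}")
```

Under that reading of the CI column, the interval for EdgeRunner-Tactical-7B is (35.9, 40.9). It overlaps with the intervals of llama-3-70b-chat-hf, gpt-4-0613, and mistral-large-2402, and it is clearly separated from that of Qwen2-7B-Instruct.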
InfiniteBench
| Task Name | GPT-4 | YaRN-Mistral-7B | Kimi-Chat | Claude 2 | Yi-6B-200K | Yi-34B-200K | Chatglm3-6B-128K | EdgeRunner-Tactical-7B | Qwen2-7B-Instruct |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Retrieve.PassKey | 100% | 92.71% | 98.14% | 97.80% | 100.00% | 100.00% | 92.20% | 100% | 100% |
| Retrieve.Number | 100% | 56.61% | 95.42% | 98.14% | 94.92% | 100.00% | 80.68% | 100% | 99.83% |
| Retrieve.KV | 89.00% | < 5% | 53.60% | 65.40% | < 5% | < 5% | < 5% | 2.2% | 1.8% |
| En.Sum | 14.73% | 9.09% | 17.96% | 14.50% | < 5% | < 5% | < 5% | 33.07% | 29.13% |
| En.QA | 22.44% | 9.55% | 16.52% | 11.97% | 9.20% | 12.17% | < 5% | 3.4% | 9.09% |
| En.MC | 67.25% | 27.95% | 72.49% | 62.88% | 36.68% | 38.43% | 10.48% | 66.81% | 66.37% |
| En.Dia | 8.50% | 7.50% | 11.50% | 46.50% | < 5% | < 5% | < 5% | 29% | 17% |
| Zh.QA | 25.96% | 16.98% | 17.93% | 9.64% | 15.07% | 13.61% | < 5% | 4.6% | 11.14% |
| Code.Debug | 37.06% | < 5% | 17.77% | < 5% | 9.14% | 13.96% | 7.36% | 22.08% | 24.61% |
| Code.Run | 23.25% | < 5% | < 5% | < 5% | < 5% | < 5% | < 5% | 0% | 0.5% |
| Math.Calc | < 5% | < 5% | < 5% | < 5% | < 5% | < 5% | < 5% | 0% | 0% |
| Math.Find | 60.00% | 17.14% | 12.57% | 32.29% | < 5% | 25.71% | 7.71% | 29.14% | 31.42% |
GSM@ZeroEval
| Model | Acc | No Answer | Reason Lens |
| --- | --- | --- | --- |
| Llama-3.1-405B-Instruct-Turbo | 95.91 | 0.08 | 365.07 |
| claude-3-5-sonnet-20240620 | 95.6 | 0 | 465.19 |
| claude-3-opus-20240229 | 95.6 | 0 | 410.62 |
| gpt-4o-2024-05-13 | 95.38 | 0 | 479.98 |
| gpt-4o-mini-2024-07-18 | 94.24 | 0 | 463.71 |
| deepseek-chat | 93.93 | 0 | 495.52 |
| deepseek-coder | 93.78 | 0 | 566.89 |
| gemini-1.5-pro | 93.4 | 0 | 389.17 |
| Meta-Llama-3-70B-Instruct | 93.03 | 0 | 352.05 |
| Qwen2-72B-Instruct | 92.65 | 0 | 375.96 |
| claude-3-sonnet-20240229 | 91.51 | 0 | 762.69 |
| gemini-1.5-flash | 91.36 | 0 | 344.61 |
| gemma-2-27b-it@together | 90.22 | 0 | 364.68 |
| claude-3-haiku-20240307 | 88.78 | 0 | 587.65 |
| gemma-2-9b-it | 87.41 | 0 | 394.83 |
| reka-core-20240501 | 87.41 | 0.08 | 414.7 |
| Athene-70B | 86.66 | 0.3 | 253.53 |
| Yi-1.5-34B-Chat | 84.08 | 0.08 | 553.47 |
| Llama-3.1-8B-Instruct | 82.87 | 0.45 | 414.19 |
| Mistral-Nemo-Instruct-2407 | 82.79 | 0 | 349.81 |
| yi-large-preview | 82.64 | 0 | 514.25 |
| EdgeRunner-Tactical-7B | 81.12 | 0.08 | 615.89 |
| gpt-3.5-turbo-0125 | 80.36 | 0 | 350.97 |
| command-r-plus | 80.14 | 0.08 | 294.08 |
| Qwen2-7B-Instruct | 80.06 | 0 | 452.6 |
| yi-large | 80.06 | 0 | 479.87 |
| Meta-Llama-3-8B-Instruct | 78.47 | 0 | 429.39 |
| Yi-1.5-9B-Chat | 76.42 | 0.08 | 485.39 |
| Phi-3-mini-4k-instruct | 75.51 | 0 | 462.53 |
| reka-flash-20240226 | 74.68 | 0.45 | 460.06 |
| Meta-Llama-3.1-8B-Instruct | 72.33 | 0.38 | 483.41 |
| Mixtral-8x7B-Instruct-v0.1 | 70.13 | 2.27 | 361.12 |
| Llama-3-Instruct-8B-SimPO-v0.2 | 57.54 | 2.05 | 505.25 |
| command-r | 52.99 | 0 | 294.43 |
| Qwen2-1.5B-Instruct | 43.37 | 4.78 | 301.67 |
MMLU-REDUX@ZeroEval
| Model | Acc | No Answer | Reason Lens |
| --- | --- | --- | --- |
| gpt-4o-2024-05-13 | 88.01 | 0.14 | 629.79 |
| claude-3-5-sonnet-20240620 | 86 | 0.18 | 907.1 |
| Llama-3.1-405B-Instruct-Turbo | 85.64 | 0.76 | 449.71 |
| gpt-4-turbo-2024-04-09 | 85.31 | 0.04 | 631.38 |
| gemini-1.5-pro | 82.76 | 1.94 | 666.7 |
| claude-3-opus-20240229 | 82.54 | 0.58 | 500.35 |
| yi-large-preview | 82.15 | 0.14 | 982.6 |
| gpt-4-0314 | 81.64 | 0.04 | 397.22 |
| Qwen2-72B-Instruct | 81.61 | 0.29 | 486.41 |
| gpt-4o-mini-2024-07-18 | 81.5 | 0.07 | 526 |
| yi-large | 81.17 | 0 | 774.85 |
| deepseek-chat | 80.81 | 0.11 | 691.91 |
| deepseek-coder | 79.63 | 0.14 | 704.72 |
| Meta-Llama-3-70B-Instruct | 78.01 | 0.11 | 520.77 |
| gemini-1.5-flash | 77.36 | 1.26 | 583.45 |
| Athene-70B | 76.64 | 0.04 | 552.61 |
| reka-core-20240501 | 76.42 | 0.76 | 701.67 |
| gemma-2-27b-it@together | 75.67 | 0.61 | 446.51 |
| claude-3-sonnet-20240229 | 74.87 | 0.07 | 671.75 |
| gemma-2-9b-it@nvidia | 72.82 | 0.76 | 499 |
| Yi-1.5-34B-Chat | 72.79 | 1.01 | 620.1 |
| claude-3-haiku-20240307 | 72.32 | 0.04 | 644.59 |
| Phi-3-mini-4k-instruct | 70.34 | 0.43 | 677.09 |
| command-r-plus | 68.61 | 0 | 401.51 |
| gpt-3.5-turbo-0125 | 68.36 | 0.04 | 357.92 |
| EdgeRunner-Tactical-7B | 67.71 | 0.65 | 917.6 |
| Llama-3.1-8B-Instruct | 67.13 | 3.38 | 399.54 |
| Qwen2-7B-Instruct | 66.92 | 0.72 | 533.15 |
| Mistral-Nemo-Instruct-2407 | 66.88 | 0.47 | 464.19 |
| Yi-1.5-9B-Chat | 65.05 | 4.61 | 542.87 |
| Meta-Llama-3.1-8B-Instruct | 64.79 | 1.94 | 463.76 |
| reka-flash-20240226 | 64.72 | 0.32 | 659.25 |
| Mixtral-8x7B-Instruct-v0.1 | 63.17 | 5.51 | 324.31 |
| Meta-Llama-3-8B-Instruct | 61.66 | 0.97 | 600.81 |
| command-r | 61.12 | 0.04 | 382.23 |
| Llama-3-Instruct-8B-SimPO-v0.2 | 55.22 | 1.19 | 450.6 |
| Qwen2-1.5B-Instruct | 41.11 | 7.74 | 280.56 |
WildBench
| Model | WB_Elo | RewardScore_Avg | task_macro_reward.K=-1 | Length |
| --- | --- | --- | --- | --- |
| gpt-4o-2024-05-13 | 1248.12 | 50.05 | 40.80 | 3723.52 |
| claude-3-5-sonnet-20240620 | 1229.76 | 46.16 | 37.63 | 2911.85 |
| gpt-4-turbo-2024-04-09 | 1225.29 | 46.19 | 37.17 | 3093.17 |
| gpt-4-0125-preview | 1211.44 | 41.24 | 30.20 | 3335.64 |
| gemini-1.5-pro | 1209.23 | 45.27 | 37.59 | 3247.97 |
| yi-large-preview | 1209.00 | 46.92 | 38.54 | 3512.68 |
| claude-3-opus-20240229 | 1206.56 | 37.03 | 22.35 | 2685.98 |
| Meta-Llama-3-70B-Instruct | 1197.72 | 35.15 | 22.54 | 3046.64 |
| Athene-70B | 1197.41 | 29.77 | 0.00 | 3175.14 |
| deepseek-coder-v2 | 1194.11 | 29.39 | 11.38 | 2795.31 |
| gpt-4o-mini-2024-07-18 | 1192.43 | 28.57 | 0.00 | 3648.13 |
| yi-large | 1191.88 | 33.35 | 17.77 | 3095.34 |
| gemini-1.5-flash | 1190.30 | 37.45 | 26.04 | 3654.40 |
| deepseek-v2-chat-0628 | 1188.07 | 27.00 | 0.00 | 3252.38 |
| gemma-2-9b-it-SimPO | 1184.67 | 26.64 | 0.00 | 4277.67 |
| gemma-2-9b-it-DPO | 1182.43 | 26.61 | 0.00 | 3982.63 |
| nemotron-4-340b-instruct | 1181.77 | 33.76 | 19.85 | 2754.01 |
| claude-3-sonnet-20240229 | 1179.81 | 28.09 | 10.70 | 2670.24 |
| deepseekv2-chat | 1178.76 | 30.41 | 12.60 | 2896.97 |
| gemma-2-27b-it@together | 1178.34 | 24.27 | 0.00 | 2924.55 |
| Qwen2-72B-Instruct | 1176.75 | 24.77 | 5.03 | 2856.45 |
| reka-core-20240501 | 1173.85 | 31.48 | 17.06 | 2592.59 |
| Mistral-Nemo-Instruct-2407 | 1165.29 | 22.19 | 0.00 | 3318.21 |
| Yi-1.5-34B-Chat | 1163.69 | 30.83 | 16.06 | 3523.56 |
| EdgeRunner-Tactical-7B | 1162.88 | 22.26 | 0.00 | 3754.66 |
| claude-3-haiku-20240307 | 1160.56 | 16.30 | -6.30 | 2601.03 |
| mistral-large-2402 | 1159.72 | 13.27 | -12.36 | 2514.98 |
| deepseek-v2-coder-0628 | 1155.97 | 22.83 | 0.00 | 2580.18 |
| gemma-2-9b-it | 1154.30 | 21.35 | 0.00 | 2802.89 |
| Llama-3-8B-Magpie-Align-v0.1 | 1154.13 | 28.72 | 18.14 | 3107.77 |
| command-r-plus | 1153.15 | 16.58 | -3.60 | 3293.81 |
| glm-4-9b-chat | 1152.68 | 20.71 | 2.33 | 3692.04 |
| Qwen1.5-72B-Chat-greedy | 1151.97 | 20.83 | 1.72 | 2392.36 |
| Yi-1.5-9B-Chat | 1151.43 | 21.80 | 4.93 | 3468.23 |
| Llama-3-Instruct-8B-SimPO | 1151.38 | 23.31 | 9.57 | 2541.93 |
| Llama-3-Instruct-8B-SimPO-v0.2 | 1150.81 | 18.58 | 0.00 | 2533.76 |
| SELM-Llama-3-8B-Instruct-iter-3 | 1148.03 | 17.89 | 0.53 | 2913.15 |
| Llama-3-Instruct-8B-SimPO-ExPO | 1147.24 | 21.39 | 7.77 | 2480.65 |
| Meta-Llama-3-8B-Instruct | 1140.76 | 6.72 | -15.76 | 2975.19 |
| Qwen2-7B-Instruct | 1137.66 | 16.20 | 0.00 | 3216.43 |
| Starling-LM-7B-beta-ExPO | 1137.58 | 11.28 | -9.01 | 2835.83 |
| Hermes-2-Theta-Llama-3-8B | 1135.99 | 3.18 | -23.28 | 2742.17 |
| Llama-3.1-8B-Instruct | 1135.42 | 16.38 | 0.00 | 3750.60 |
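Because EdgeRunner-Tactical-7B is initialized from Qwen2-Instruct, the Qwen2-7B-Instruct rows give a natural baseline. The sketch below computes the difference on the headline metric of four benchmarks, using values copied from the tables above. Treating Qwen2-7B-Instruct as the exact starting checkpoint is an assumption, and the metrics use different scales, so the deltas are not comparable across benchmarks.

```python
from typing import Dict, Tuple

# (EdgeRunner-Tactical-7B, Qwen2-7B-Instruct), copied from the tables above.
SCORES: Dict[str, Tuple[float, float]] = {
    "Arena-Hard score": (38.2, 23.5),
    "GSM@ZeroEval acc": (81.12, 80.06),
    "MMLU-REDUX@ZeroEval acc": (67.71, 66.92),
    "WildBench WB_Elo": (1162.88, 1137.66),
}

# Difference per benchmark, rounded to hide floating-point noise.
deltas = {
    name: round(tuned - base, 2) for name, (tuned, base) in SCORES.items()
}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

On these four metrics the differences are +14.70, +1.06, +0.79, and +25.22, so the largest gain over the baseline appears on Arena-Hard and the smallest on MMLU-REDUX.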