edgerunner-ai
/

EdgeRunner-Tactical-7B

@@ -1,9 +1,251 @@
-### MT-Bench
-- Score: 8.55
-### Arena hard
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/668ed3dcd857a9ca47edb75c/kFiab1FT9LW7CfzFHPNO4.png)
-### Alpaca Bench 2
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/668ed3dcd857a9ca47edb75c/8YXOS5evbCv96FccidQ99.png)

+---
+library_name: transformers
+license: apache-2.0
+language:
+- en
+---
+# EdgeRunner-Tactical-7B
+## Introduction
+EdgeRunner-Tactical-7B is a powerful and efficient language model for the edge. Our mission is to build Generative AI for the edge that is safe, secure, and transparent. To that end, the EdgeRunner team is proud to release EdgeRunner-Tactical-7B, the most powerful language model for its size to date.
+EdgeRunner-Tactical-7B is a 7-billion-parameter language model that delivers strong performance while demonstrating the potential of running state-of-the-art (SOTA) models at the edge. It is the highest-scoring model in the 7B-XXB range, outperforming Gemini Pro, Mixtral-8x7B, and Meta-Llama-3-8B-Instruct. On the Arena-Hard benchmark, EdgeRunner-Tactical-7B also outperforms larger models, including GPT-4o mini and Mistral Large.
+## Highlights
+- 7 billion parameters
+- SOTA performance for its size
+- Initialized from Qwen2-Instruct
+- Further trained from Qwen2-Instruct with Self-Play Preference Optimization ([SPPO](https://arxiv.org/abs/2405.00675))
+- Outperforms Mistral Large on Arena-Hard (38.2 vs. 37.7)
+- Outperforms Mixtral-8x7B on Arena-Hard (38.2 vs. 23.4)
+- Approaches Meta Llama-3-70B on Arena-Hard (38.2 vs. 41.1)
+- Supports a context length of 128K tokens, making it ideal for tasks requiring many conversation turns or working with large amounts of text
+## Quickstart
+The snippet below shows how to load the tokenizer and model and how to generate a response.
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model_id = "edgerunner-ai/EdgeRunner-Tactical-7B"
+# device_map="auto" places the weights on the available device(s)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype="auto",
+    device_map="auto"
+)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+prompt = "Give me a short introduction to large language models."
+messages = [
+    {"role": "system", "content": "You are a helpful assistant."},
+    {"role": "user", "content": prompt}
+]
+# Render the chat messages into the prompt format the model expects
+text = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True
+)
+# Move the inputs to the device the model was loaded onto
+model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+# Unpacking passes the attention mask along with the input IDs
+generated_ids = model.generate(
+    **model_inputs,
+    max_new_tokens=512
+)
+# Drop the prompt tokens so only the newly generated tokens are decoded
+generated_ids = [
+    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+]
+response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+print(response)
+```
+## Example Outputs
+### Create a Quantum Future:
+<img src="https://cdn-uploads.huggingface.co/production/uploads/633fe629f81b9d10135fefda/3b00jTWhIV_5OWxtW6zFI.png" width="95%">
+### Ask for a structured JSON output:
+<img src="https://cdn-uploads.huggingface.co/production/uploads/633fe629f81b9d10135fefda/CzW5qUh9tAkZV8k8Xs4nm.png" width="95%">
+## Evaluation
+This section reports results for EdgeRunner-Tactical-7B on standard automatic benchmarks.
+### Arena-Hard Benchmark
+| Model                          | Score | 95% CI    | Avg #Tokens |
+| :----------------------------- | :----: | :------:  | :---------: |
+| gpt-4-turbo-2024-04-09         | 82.6   | (-1.6, 2.1) | 662         |
+| gpt-4-0125-preview             | 78.0   | (-1.8, 2.1) | 619         |
+| claude-3-opus-20240229         | 60.4   | (-2.8, 2.6) | 541         |
+| gpt-4-0314                     | 50.0   | (0.0, 0.0)  | 423         |
+| claude-3-haiku-20240307        | 41.5   | (-2.5, 2.9) | 505         |
+| llama-3-70b-chat-hf            | 41.1   | (-2.7, 1.7) | 583         |
+| EdgeRunner-Tactical-7B         | 38.2   | (-2.3, 2.7) | 719         |
+| gpt-4-0613                     | 37.9   | (-2.2, 2.6) | 354         |
+| mistral-large-2402             | 37.7   | (-1.9, 2.0) | 400         |
+| mixtral-8x22b-instruct-v0.1    | 36.4   | (-2.0, 2.0) | 430         |
+| Qwen1.5-72B-Chat               | 36.1   | (-2.3, 2.4) | 474         |
+| command-r-plus                 | 33.1   | (-2.6, 2.0) | 541         |
+| mistral-medium                 | 31.9   | (-2.1, 2.1) | 485         |
+| gpt-3.5-turbo-0613             | 24.8   | (-2.2, 1.7) | 401         |
+| dbrx-instruct                  | 24.6   | (-2.0, 2.4) | 415         |
+| Qwen2-7B-Instruct              | 23.5   | (-1.9, 2.0) | 605         |
+| Mixtral-8x7B-Instruct-v0.1     | 23.4   | (-1.9, 1.9) | 457         |
+| gpt-3.5-turbo-0125             | 23.3   | (-1.9, 2.0) | 329         |
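
When reading the Arena-Hard table, it can help to check whether two models' 95% confidence intervals overlap before treating a small score gap as meaningful. The sketch below is a rough illustration rather than a formal significance test. It assumes the two numbers in the `95% CI` column are offsets added to the score, and the helper names `interval` and `overlaps` are hypothetical. The rows are copied from the table above.

```python
# Arena-Hard rows copied from the table above: (score, lower offset, upper offset).
# Assumption: the CI column holds offsets relative to the score.
ROWS = {
    "llama-3-70b-chat-hf": (41.1, -2.7, 1.7),
    "EdgeRunner-Tactical-7B": (38.2, -2.3, 2.7),
    "gpt-4-0613": (37.9, -2.2, 2.6),
    "mistral-large-2402": (37.7, -1.9, 2.0),
    "Qwen2-7B-Instruct": (23.5, -1.9, 2.0),
}


def interval(name):
    """Return the (low, high) bounds of a model's 95% CI, rounded to one decimal."""
    score, low_offset, high_offset = ROWS[name]
    return (round(score + low_offset, 1), round(score + high_offset, 1))


def overlaps(a, b):
    """Return True if the 95% CIs of models a and b share at least one point."""
    a_low, a_high = interval(a)
    b_low, b_high = interval(b)
    return a_low <= b_high and b_low <= a_high


if __name__ == "__main__":
    target = "EdgeRunner-Tactical-7B"
    for other in ROWS:
        if other != target:
            print(f"{target} vs {other}: overlap={overlaps(target, other)}")
```

Under these assumptions, the interval for EdgeRunner-Tactical-7B overlaps those of llama-3-70b-chat-hf, gpt-4-0613, and mistral-large-2402, but not that of Qwen2-7B-Instruct.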
+### InfiniteBench
+| Task Name       | GPT-4 | YaRN-Mistral-7B | Kimi-Chat | Claude 2 | Yi-6B-200K | Yi-34B-200K | Chatglm3-6B-128K | EdgeRunner-Tactical-7B | Qwen2-7B-Instruct |
+|-----------------|-------|------------------|-----------|----------|------------|-------------|------------------|------------------------|-------------------|
+| Retrieve.PassKey| 100%  | 92.71%           | 98.14%    | 97.80%   | 100.00%    | 100.00%     | 92.20%           | 100%                   | 100%              |
+| Retrieve.Number | 100%  | 56.61%           | 95.42%    | 98.14%   | 94.92%     | 100.00%     | 80.68%           | 100%                   | 99.83%            |
+| Retrieve.KV     | 89.00%| < 5%             | 53.60%    | 65.40%   | < 5%       | < 5%        | < 5%             | 2.2%                   | 1.8%              |
+| En.Sum          | 14.73%| 9.09%            | 17.96%    | 14.50%   | < 5%       | < 5%        | < 5%             | 33.07%                 | 29.13%            |
+| En.QA           | 22.44%| 9.55%            | 16.52%    | 11.97%   | 9.20%      | 12.17%      | < 5%             | 3.4%                   | 9.09%             |
+| En.MC           | 67.25%| 27.95%           | 72.49%    | 62.88%   | 36.68%     | 38.43%      | 10.48%           | 66.81%                 | 66.37%            |
+| En.Dia          | 8.50% | 7.50%            | 11.50%    | 46.50%   | < 5%       | < 5%        | < 5%             | 29%                    | 17%               |
+| Zh.QA           | 25.96%| 16.98%           | 17.93%    | 9.64%    | 15.07%     | 13.61%      | < 5%             | 4.6%                   | 11.14%            |
+| Code.Debug      | 37.06%| < 5%             | 17.77%    | < 5%     | 9.14%      | 13.96%      | 7.36%            | 22.08%                 | 24.61%            |
+| Code.Run        | 23.25%| < 5%             | < 5%      | < 5%     | < 5%       | < 5%        | < 5%             | 0%                     | 0.5%              |
+| Math.Calc       | < 5%  | < 5%             | < 5%      | < 5%     | < 5%       | < 5%        | < 5%             | 0%                     | 0%                |
+| Math.Find       | 60.00%| 17.14%           | 12.57%    | 32.29%   | < 5%       | 25.71%      | 7.71%            | 29.14%                 | 31.42%            |
+### GSM@ZeroEval
+| Model                               | Acc    | No Answer | Reason Lens |
+|-------------------------------------|--------|-----------|-------------|
+| Llama-3.1-405B-Instruct-Turbo       | 95.91  | 0.08      | 365.07      |
+| claude-3-5-sonnet-20240620          | 95.6   | 0         | 465.19      |
+| claude-3-opus-20240229              | 95.6   | 0         | 410.62      |
+| gpt-4o-2024-05-13                   | 95.38  | 0         | 479.98      |
+| gpt-4o-mini-2024-07-18              | 94.24  | 0         | 463.71      |
+| deepseek-chat                       | 93.93  | 0         | 495.52      |
+| deepseek-coder                      | 93.78  | 0         | 566.89      |
+| gemini-1.5-pro                      | 93.4   | 0         | 389.17      |
+| Meta-Llama-3-70B-Instruct           | 93.03  | 0         | 352.05      |
+| Qwen2-72B-Instruct                  | 92.65  | 0         | 375.96      |
+| claude-3-sonnet-20240229            | 91.51  | 0         | 762.69      |
+| gemini-1.5-flash                    | 91.36  | 0         | 344.61      |
+| gemma-2-27b-it@together             | 90.22  | 0         | 364.68      |
+| claude-3-haiku-20240307             | 88.78  | 0         | 587.65      |
+| gemma-2-9b-it                       | 87.41  | 0         | 394.83      |
+| reka-core-20240501                  | 87.41  | 0.08      | 414.7       |
+| Athene-70B                          | 86.66  | 0.3       | 253.53      |
+| Yi-1.5-34B-Chat                     | 84.08  | 0.08      | 553.47      |
+| Llama-3.1-8B-Instruct               | 82.87  | 0.45      | 414.19      |
+| Mistral-Nemo-Instruct-2407          | 82.79  | 0         | 349.81      |
+| yi-large-preview                    | 82.64  | 0         | 514.25      |
+| EdgeRunner-Tactical-7B              | 81.12  | 0.08      | 615.89      |
+| gpt-3.5-turbo-0125                  | 80.36  | 0         | 350.97      |
+| command-r-plus                      | 80.14  | 0.08      | 294.08      |
+| Qwen2-7B-Instruct                   | 80.06  | 0         | 452.6       |
+| yi-large                            | 80.06  | 0         | 479.87      |
+| Meta-Llama-3-8B-Instruct            | 78.47  | 0         | 429.39      |
+| Yi-1.5-9B-Chat                      | 76.42  | 0.08      | 485.39      |
+| Phi-3-mini-4k-instruct              | 75.51  | 0         | 462.53      |
+| reka-flash-20240226                 | 74.68  | 0.45      | 460.06      |
+| Meta-Llama-3.1-8B-Instruct          | 72.33  | 0.38      | 483.41      |
+| Mixtral-8x7B-Instruct-v0.1          | 70.13  | 2.27      | 361.12      |
+| Llama-3-Instruct-8B-SimPO-v0.2      | 57.54  | 2.05      | 505.25      |
+| command-r                           | 52.99  | 0         | 294.43      |
+| Qwen2-1.5B-Instruct                 | 43.37  | 4.78      | 301.67      |
+### MMLU-REDUX@ZeroEval
+| Model                              | Acc   | No Answer | Reason Lens |
+|------------------------------------|-------|-----------|-------------|
+| gpt-4o-2024-05-13                  | 88.01 | 0.14      | 629.79      |
+| claude-3-5-sonnet-20240620         | 86    | 0.18      | 907.1       |
+| Llama-3.1-405B-Instruct-Turbo      | 85.64 | 0.76      | 449.71      |
+| gpt-4-turbo-2024-04-09             | 85.31 | 0.04      | 631.38      |
+| gemini-1.5-pro                     | 82.76 | 1.94      | 666.7       |
+| claude-3-opus-20240229             | 82.54 | 0.58      | 500.35      |
+| yi-large-preview                   | 82.15 | 0.14      | 982.6       |
+| gpt-4-0314                         | 81.64 | 0.04      | 397.22      |
+| Qwen2-72B-Instruct                 | 81.61 | 0.29      | 486.41      |
+| gpt-4o-mini-2024-07-18             | 81.5  | 0.07      | 526         |
+| yi-large                           | 81.17 | 0         | 774.85      |
+| deepseek-chat                      | 80.81 | 0.11      | 691.91      |
+| deepseek-coder                     | 79.63 | 0.14      | 704.72      |
+| Meta-Llama-3-70B-Instruct          | 78.01 | 0.11      | 520.77      |
+| gemini-1.5-flash                   | 77.36 | 1.26      | 583.45      |
+| Athene-70B                         | 76.64 | 0.04      | 552.61      |
+| reka-core-20240501                 | 76.42 | 0.76      | 701.67      |
+| gemma-2-27b-it@together            | 75.67 | 0.61      | 446.51      |
+| claude-3-sonnet-20240229           | 74.87 | 0.07      | 671.75      |
+| gemma-2-9b-it@nvidia               | 72.82 | 0.76      | 499         |
+| Yi-1.5-34B-Chat                    | 72.79 | 1.01      | 620.1       |
+| claude-3-haiku-20240307            | 72.32 | 0.04      | 644.59      |
+| Phi-3-mini-4k-instruct             | 70.34 | 0.43      | 677.09      |
+| command-r-plus                     | 68.61 | 0         | 401.51      |
+| gpt-3.5-turbo-0125                 | 68.36 | 0.04      | 357.92      |
+| EdgeRunner-Tactical-7B             | 67.71 | 0.65      | 917.6       |
+| Llama-3.1-8B-Instruct              | 67.13 | 3.38      | 399.54      |
+| Qwen2-7B-Instruct                  | 66.92 | 0.72      | 533.15      |
+| Mistral-Nemo-Instruct-2407         | 66.88 | 0.47      | 464.19      |
+| Yi-1.5-9B-Chat                     | 65.05 | 4.61      | 542.87      |
+| Meta-Llama-3.1-8B-Instruct         | 64.79 | 1.94      | 463.76      |
+| reka-flash-20240226                | 64.72 | 0.32      | 659.25      |
+| Mixtral-8x7B-Instruct-v0.1         | 63.17 | 5.51      | 324.31      |
+| Meta-Llama-3-8B-Instruct           | 61.66 | 0.97      | 600.81      |
+| command-r                          | 61.12 | 0.04      | 382.23      |
+| Llama-3-Instruct-8B-SimPO-v0.2     | 55.22 | 1.19      | 450.6       |
+| Qwen2-1.5B-Instruct                | 41.11 | 7.74      | 280.56      |
+### WildBench
+| Model                               | WB_Elo  | RewardScore_Avg | task_macro_reward.K=-1 | Length   |
+|-------------------------------------|---------|-----------------|------------------------|----------|
+| gpt-4o-2024-05-13                   | 1248.12 | 50.05           | 40.80                  | 3723.52  |
+| claude-3-5-sonnet-20240620          | 1229.76 | 46.16           | 37.63                  | 2911.85  |
+| gpt-4-turbo-2024-04-09              | 1225.29 | 46.19           | 37.17                  | 3093.17  |
+| gpt-4-0125-preview                  | 1211.44 | 41.24           | 30.20                  | 3335.64  |
+| gemini-1.5-pro                      | 1209.23 | 45.27           | 37.59                  | 3247.97  |
+| yi-large-preview                    | 1209.00 | 46.92           | 38.54                  | 3512.68  |
+| claude-3-opus-20240229              | 1206.56 | 37.03           | 22.35                  | 2685.98  |
+| Meta-Llama-3-70B-Instruct           | 1197.72 | 35.15           | 22.54                  | 3046.64  |
+| Athene-70B                          | 1197.41 | 29.77           | 0.00                   | 3175.14  |
+| deepseek-coder-v2                   | 1194.11 | 29.39           | 11.38                  | 2795.31  |
+| gpt-4o-mini-2024-07-18              | 1192.43 | 28.57           | 0.00                   | 3648.13  |
+| yi-large                            | 1191.88 | 33.35           | 17.77                  | 3095.34  |
+| gemini-1.5-flash                    | 1190.30 | 37.45           | 26.04                  | 3654.40  |
+| deepseek-v2-chat-0628               | 1188.07 | 27.00           | 0.00                   | 3252.38  |
+| gemma-2-9b-it-SimPO                 | 1184.67 | 26.64           | 0.00                   | 4277.67  |
+| gemma-2-9b-it-DPO                   | 1182.43 | 26.61           | 0.00                   | 3982.63  |
+| nemotron-4-340b-instruct            | 1181.77 | 33.76           | 19.85                  | 2754.01  |
+| claude-3-sonnet-20240229            | 1179.81 | 28.09           | 10.70                  | 2670.24  |
+| deepseekv2-chat                     | 1178.76 | 30.41           | 12.60                  | 2896.97  |
+| gemma-2-27b-it@together             | 1178.34 | 24.27           | 0.00                   | 2924.55  |
+| Qwen2-72B-Instruct                  | 1176.75 | 24.77           | 5.03                   | 2856.45  |
+| reka-core-20240501                  | 1173.85 | 31.48           | 17.06                  | 2592.59  |
+| Mistral-Nemo-Instruct-2407          | 1165.29 | 22.19           | 0.00                   | 3318.21  |
+| Yi-1.5-34B-Chat                     | 1163.69 | 30.83           | 16.06                  | 3523.56  |
+| EdgeRunner-Tactical-7B              | 1162.88 | 22.26           | 0.00                   | 3754.66  |
+| claude-3-haiku-20240307             | 1160.56 | 16.30           | -6.30                  | 2601.03  |
+| mistral-large-2402                  | 1159.72 | 13.27           | -12.36                 | 2514.98  |
+| deepseek-v2-coder-0628              | 1155.97 | 22.83           | 0.00                   | 2580.18  |
+| gemma-2-9b-it                       | 1154.30 | 21.35           | 0.00                   | 2802.89  |
+| Llama-3-8B-Magpie-Align-v0.1        | 1154.13 | 28.72           | 18.14                  | 3107.77  |
+| command-r-plus                      | 1153.15 | 16.58           | -3.60                  | 3293.81  |
+| glm-4-9b-chat                       | 1152.68 | 20.71           | 2.33                   | 3692.04  |
+| Qwen1.5-72B-Chat-greedy             | 1151.97 | 20.83           | 1.72                   | 2392.36  |
+| Yi-1.5-9B-Chat                      | 1151.43 | 21.80           | 4.93                   | 3468.23  |
+| Llama-3-Instruct-8B-SimPO           | 1151.38 | 23.31           | 9.57                   | 2541.93  |
+| Llama-3-Instruct-8B-SimPO-v0.2      | 1150.81 | 18.58           | 0.00                   | 2533.76  |
+| SELM-Llama-3-8B-Instruct-iter-3     | 1148.03 | 17.89           | 0.53                   | 2913.15  |
+| Llama-3-Instruct-8B-SimPO-ExPO      | 1147.24 | 21.39           | 7.77                   | 2480.65  |
+| Meta-Llama-3-8B-Instruct            | 1140.76 | 6.72            | -15.76                 | 2975.19  |
+| Qwen2-7B-Instruct                   | 1137.66 | 16.20           | 0.00                   | 3216.43  |
+| Starling-LM-7B-beta-ExPO            | 1137.58 | 11.28           | -9.01                  | 2835.83  |
+| Hermes-2-Theta-Llama-3-8B           | 1135.99 | 3.18            | -23.28                 | 2742.17  |
+| Llama-3.1-8B-Instruct               | 1135.42 | 16.38           | 0.00                   | 3750.60  |
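
To summarize the tables, the sketch below computes the gap between EdgeRunner-Tactical-7B and Qwen2-7B-Instruct on the headline metric of four benchmarks. The model card says only that the model is initialized from Qwen2-Instruct, so using Qwen2-7B-Instruct as the baseline is an assumption. The `SCORES` dictionary and the `deltas` helper are illustrative names, and all numbers are copied from the tables above.

```python
# (EdgeRunner-Tactical-7B, Qwen2-7B-Instruct) pairs copied from the tables above.
# Assumption: Qwen2-7B-Instruct is the relevant baseline for comparison.
SCORES = {
    "Arena-Hard score": (38.2, 23.5),
    "GSM@ZeroEval Acc": (81.12, 80.06),
    "MMLU-REDUX@ZeroEval Acc": (67.71, 66.92),
    "WildBench WB_Elo": (1162.88, 1137.66),
}


def deltas(scores):
    """Return the tuned-minus-baseline difference per benchmark, rounded to 2 decimals."""
    return {name: round(tuned - base, 2) for name, (tuned, base) in scores.items()}


if __name__ == "__main__":
    for name, diff in deltas(SCORES).items():
        print(f"{name}: {diff:+.2f}")
```

The differences are in each benchmark's own units (score points, accuracy points, Elo), so they are not directly comparable with one another.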