Commit 1a897bf (verified), committed by amanrangapur · 1 Parent(s): 2d0c006

Update README.md

Files changed (1)
  1. README.md +20 -18
README.md CHANGED
@@ -80,24 +80,26 @@ See the Falcon 180B model card for an example of this.
 
  ## Performance
 
- | Model | AVG | AE2 | BBH | DROP | GSM8K | IFE | MATH | MMLU | Safety | PQA | TQA |
- |-------------------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|
- | OLMo 2 7B SFT | 51.4 | 10.2 | 49.6 | 59.6 | 74.6 | 66.9 | 25.3 | 61.1 | 94.6 | 23.6 | 48.6 |
- | OLMo 2 7B DPO | 55.9 | 27.9 | 51.1 | 60.2 | 82.6 | 73.0 | 30.3 | 60.8 | 93.7 | 23.5 | 56.0 |
- | OLMo 2 7B Instruct| 56.5 | 29.1 | 51.4 | 60.5 | 85.1 | 72.3 | 32.5 | 61.3 | 93.3 | 23.2 | 56.5 |
- | OLMo 2 13B SFT | 56.6 | 11.5 | 59.9 | 71.3 | 76.3 | 68.6 | 29.5 | 68.0 | 94.3 | 29.4 | 57.1 |
- | OLMo 2 13B DPO | 62.0 | 38.3 | 61.4 | 71.5 | 82.3 | 80.2 | 35.2 | 67.9 | 90.3 | 29.0 | 63.9 |
- | OLMo 2 13B Instruct| 63.4 | 39.5 | 63.0 | 71.5 | 87.4 | 82.6 | 39.2 | 68.5 | 89.7 | 28.8 | 64.3 |
- | **OLMo 2 32B SFT**| 58.09 | 14.49 | 67.10 | 75.68 | 79.76 | 74.49 | 36.02 | 77.80 | - | 34.25 | 63.26 |
- | **OLMo 2 32B DPO**| 64.95 | 46.18 | 68.05 | 76.45 | 85.60 | 80.59 | 39.08 | 78.26 | - | 36.57 | 73.78 |
- | **OLMo 2 32B Instruct**| 66.17 | 45.70 | 69.10 | 76.49 | 89.08 | 83.55 | 42.74 | 78.53 | - | 36.70 | 73.64 |
- | Gemma-2-27b | 61.32 | 49.01 | 72.69 | 67.52 | 80.67 | 63.22 | 35.06 | 70.66 | 75.9 | 33.85 | 64.58 |
- | GPT-3.5 Turbo 0125| 59.56 | 38.7 | 66.6 | 70.2 | 74.3 | 66.9 | 41.2 | 70.2 | 69.1 | 45.0 | 62.9* |
- | GPT 4o Mini2024-07-18| 65.72 | 49.7 | 65.9* | 36.3 | 83.0 | 83.5 | 67.9 | 82.2 | 84.9 | 39.0 | 64.8* |
- | Qwen2.5-32B | 66.54 | 39.07 | 82.34 | 48.26 | 87.49 | 82.44 | 77.89 | 84.66 | 82.4 | 26.10 | 70.57 |
- | Mistral-Small-24B | 67.6 | 43.20 | 80.11 | 78.51 | 87.19 | 77.26 | 65.86 | 83.72 | 66.5 | 24.38 | 68.14 |
- | Llama-3.1-70B | 69.99 | 32.91 | 82.97 | 76.96 | 94.47 | 87.99 | 56.17 | 85.15 | 76.4 | 46.50 | 66.83 |
- | Llama-3.3-70B | 72.96 | 36.48 | 85.79 | 77.99 | 93.56 | 90.76 | 71.84 | 85.85 | 70.4 | 48.24 | 66.11 |
+ | Model | Average | 2 LC | BBH | DROP | GSM8k | IFEval | MATH | MMLU | Safety | PopQA | TruthQA |
+ |-------|---------|------|-----|------|-------|--------|------|------|--------|-------|---------|
+ | **Closed API models** | | | | | | | | | | | |
+ | GPT-3.5 Turbo 0125 | 59.6 | 38.7 | 66.6 | 70.2 | 74.3 | 66.9 | 41.2 | 70.2 | 69.1 | 45.0 | 62.9 |
+ | GPT 4o Mini 2024-07-18 | 65.7 | 49.7 | 65.9 | 36.3 | 83.0 | 83.5 | 67.9 | 82.2 | 84.9 | 39.0 | 64.8 |
+ | **Open weights models** | | | | | | | | | | | |
+ | Mistral-Nemo-Instruct-2407 | 50.9 | 45.8 | 54.6 | 23.6 | 81.4 | 64.5 | 31.9 | 70.0 | 52.7 | 26.9 | 57.7 |
+ | Ministral-8B-Instruct | 52.1 | 31.4 | 56.2 | 56.2 | 80.0 | 56.4 | 40.0 | 68.5 | 56.2 | 20.2 | 55.5 |
+ | Gemma-2-27b-it | 61.3 | 49.0 | 72.7 | 67.5 | 80.7 | 63.2 | 35.1 | 70.7 | 75.9 | 33.9 | 64.6 |
+ | Qwen2.5-32B | 66.5 | 39.1 | 82.3 | 48.3 | 87.5 | 82.4 | 77.9 | 84.7 | 82.4 | 26.1 | 70.6 |
+ | Mistral-Small-24B | 67.6 | 43.2 | 80.1 | 78.5 | 87.2 | 77.3 | 65.9 | 83.7 | 66.5 | 24.4 | 68.1 |
+ | Llama-3.1-70B | 70.0 | 32.9 | 83.0 | 77.0 | 94.5 | 88.0 | 56.2 | 85.2 | 76.4 | 46.5 | 66.8 |
+ | Llama-3.3-70B | 73.0 | 36.5 | 85.8 | 78.0 | 93.6 | 90.8 | 71.8 | 85.9 | 70.4 | 48.2 | 66.1 |
+ | Gemma-3-27b-it | - | 63.4 | 83.7 | 69.2 | 91.1 | - | - | 81.8 | - | 30.9 | - |
+ | **Fully open models** | | | | | | | | | | | |
+ | OLMo-2-7B-1124-Instruct | 55.7 | 31.0 | 48.5 | 58.9 | 85.2 | 75.6 | 31.3 | 63.9 | 81.2 | 24.6 | 56.3 |
+ | OLMo-2-13B-1124-Instruct | 61.4 | 37.5 | 58.4 | 72.1 | 87.4 | 80.4 | 39.7 | 68.6 | 77.5 | 28.8 | 63.9 |
+ | **OLMo-2-32B-0325-SFT** | 61.7 | 16.9 | 69.7 | 77.2 | 78.4 | 72.4 | 35.9 | 76.1 | 93.8 | 35.4 | 61.3 |
+ | **OLMo-2-32B-0325-DPO** | 68.8 | 44.1 | 70.2 | 77.5 | 85.7 | 83.8 | 46.8 | 78.0 | 91.9 | 36.4 | 73.5 |
+ | **OLMo-2-32B-0325-Instruct** | 68.8 | 42.8 | 70.6 | 78.0 | 87.6 | 85.6 | 49.7 | 77.3 | 85.9 | 37.5 | 73.2 |
 
  ## License and use
 