Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

Starling_Monarch_Westlake_Garten-7B-v0.1 - GGUF
- Model creator: https://huggingface.co/giraffe176/
- Original model: https://huggingface.co/giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1/

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q2_K.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q2_K.gguf) | Q2_K | 2.53GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.IQ3_XS.gguf) | IQ3_XS | 2.81GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.IQ3_S.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.IQ3_S.gguf) | IQ3_S | 2.96GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q3_K_S.gguf) | Q3_K_S | 2.95GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.IQ3_M.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.IQ3_M.gguf) | IQ3_M | 3.06GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q3_K.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q3_K.gguf) | Q3_K | 3.28GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q3_K_M.gguf) | Q3_K_M | 3.28GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q3_K_L.gguf) | Q3_K_L | 3.56GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.IQ4_XS.gguf) | IQ4_XS | 3.67GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q4_0.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q4_0.gguf) | Q4_0 | 3.83GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.IQ4_NL.gguf) | IQ4_NL | 3.87GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q4_K_S.gguf) | Q4_K_S | 3.86GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q4_K.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q4_K.gguf) | Q4_K | 4.07GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q4_K_M.gguf) | Q4_K_M | 4.07GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q4_1.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q4_1.gguf) | Q4_1 | 4.24GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q5_0.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q5_0.gguf) | Q5_0 | 4.65GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q5_K_S.gguf) | Q5_K_S | 4.65GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q5_K.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q5_K.gguf) | Q5_K | 4.78GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q5_K_M.gguf) | Q5_K_M | 4.78GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q5_1.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q5_1.gguf) | Q5_1 | 5.07GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q6_K.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q6_K.gguf) | Q6_K | 5.53GB |
| [Starling_Monarch_Westlake_Garten-7B-v0.1.Q8_0.gguf](https://huggingface.co/RichardErkhov/giraffe176_-_Starling_Monarch_Westlake_Garten-7B-v0.1-gguf/blob/main/Starling_Monarch_Westlake_Garten-7B-v0.1.Q8_0.gguf) | Q8_0 | 7.17GB |
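When picking a file from the table, a useful sanity check is the effective bits per weight implied by the file size. The sketch below assumes roughly 7.24B parameters (the usual count for a Mistral-7B-class model; an assumption here, not stated in this card) and ignores GGUF metadata overhead:

```python
def bits_per_weight(file_size_gb: float, n_params: float = 7.24e9) -> float:
    """Approximate effective bits per weight for a GGUF file.

    Ignores GGUF metadata overhead, so this slightly overestimates.
    The parameter count 7.24e9 is an assumption for a Mistral-7B-class model.
    """
    return file_size_gb * 1e9 * 8 / n_params

# The Q4_K_M file above (4.07GB) works out to roughly 4.5 bits/weight,
# and Q8_0 (7.17GB) to roughly 7.9 bits/weight.
print(bits_per_weight(4.07), bits_per_weight(7.17))
```

As a rough rule of thumb, plan for at least the file size in RAM/VRAM, plus extra for the context's KV cache.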

Original model description:
---
base_model:
- mistralai/Mistral-7B-v0.1
- berkeley-nest/Starling-LM-7B-alpha
- mlabonne/AlphaMonarch-7B
- cognitivecomputations/WestLake-7B-v2-laser
- senseable/garten2-7b
library_name: transformers
tags:
- mergekit
- merge
license: cc-by-nc-4.0
model-index:
- name: Starling_Monarch_Westlake_Garten-7B-v0.1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: EQ-Bench
      type: eq-bench
      config: EQ-Bench
      split: v2.1
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 80.01
      name: self-reported
    source:
      url: https://github.com/EQ-bench/EQ-Bench
      name: EQ-Bench v2.1
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 71.76
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 88.15
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.07
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 67.92
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 82.16
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 71.95
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1
      name: Open LLM Leaderboard
---
# Starling_Monarch_Westlake_Garten-7B-v0.1

<img src="https://cdn-uploads.huggingface.co/production/uploads/655a9883cbbaec115c3fd6b3/Chyn1eXYC0LSY6yVdeRBV.png" alt="drawing" width="800"/>

After experimenting with density for a previous merge (containing similar models), I decided to experiment with weight gradients. My thought was that, with care and attention, a merge of really good models could become something greater than the sum of its parts.

I came across the EQ-Bench benchmark [(Paper)](https://arxiv.org/abs/2312.06281) as part of my earlier testing. It is a very light and quick benchmark that yields powerful insights into how well a model handles emotional-intelligence-related prompts.
As part of this process, I tried to determine whether there was an optimal set of gradient weights that would lead to the most successful merge as measured against EQ-Bench. At first, my goal was simply to exceed WestLake-7B, but then I kept pushing to see what I could come up with.
Too late in the process, I learned that [dare_ties](https://arxiv.org/abs/2311.03099) has a random element to it. Valuable information for next time, I guess. After concluding that project, I began collecting more data, this time setting a specified seed in mergekit for reproducibility. While collecting that data, I hit the goal I had set for myself.
This model is *not* a direct result of the work described above, but that work is the genesis of how this model came to be.

I present **Starling_Monarch_Westlake_Garten-7B-v0.1**, the **only 7B model to score > 80** on the EQ-Bench v2.1 benchmark found [here](https://github.com/EQ-bench/EQ-Bench), outscoring larger models like [abacusai/Smaug-72B-v0.1](https://huggingface.co/abacusai/Smaug-72B-v0.1) and [cognitivecomputations/dolphin-2.2-70b](https://huggingface.co/cognitivecomputations/dolphin-2.2-70b).

It also surpasses its component models on the GSM8K benchmark, with a score of 71.95. I'll be looking to bring out more logic and emotion in the next evolution of this model.

It also earned 8.109 on MT-Bench [(Paper)](https://arxiv.org/abs/2306.05685), outscoring GPT-3.5 and Claude v1.

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) as the base model. The seed for this merge was 176.
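The DARE half of the method can be sketched as randomly dropping a fraction of each model's delta parameters (its difference from the base model) and rescaling the survivors so the expected contribution is unchanged. This is an illustrative toy, not mergekit's actual implementation; the seed value 176 is reused here only to echo the merge above:

```python
import random

def dare_drop(delta, density, seed=176):
    """Keep each delta with probability `density`, rescaling survivors
    by 1/density so the expected sum is preserved (illustrative sketch)."""
    rng = random.Random(seed)
    return [d / density if rng.random() < density else 0.0 for d in delta]

deltas = [0.12, -0.05, 0.33, 0.08, -0.21]
sparse = dare_drop(deltas, density=0.58)
```

This random dropping is why two runs without a fixed seed produce different merges.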
### Models Merged

The following models were included in the merge:
* [berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha)
* [mlabonne/AlphaMonarch-7B](https://huggingface.co/mlabonne/AlphaMonarch-7B)
* [cognitivecomputations/WestLake-7B-v2-laser](https://huggingface.co/cognitivecomputations/WestLake-7B-v2-laser)
* [senseable/garten2-7b](https://huggingface.co/senseable/garten2-7b)

### Configuration

The following YAML configuration was used to produce this model:
```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    # No parameters necessary for base model

  - model: cognitivecomputations/WestLake-7B-v2-laser
    parameters:
      density: 0.58
      weight: [0.3877, 0.1636, 0.186, 0.0502]

  - model: senseable/garten2-7b
    parameters:
      density: 0.58
      weight: [0.234, 0.2423, 0.2148, 0.2775]

  - model: berkeley-nest/Starling-LM-7B-alpha
    parameters:
      density: 0.58
      weight: [0.1593, 0.1573, 0.1693, 0.3413]

  - model: mlabonne/AlphaMonarch-7B
    parameters:
      density: 0.58
      weight: [0.219, 0.4368, 0.4299, 0.331]

merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```
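The four-element `weight` lists in the configuration are gradients: mergekit spreads the anchor values across the model's layers, blending between them so early layers get the first value and the last layers get the final one. To the best of my understanding the interpolation is piecewise linear between evenly spaced anchors; the sketch below illustrates that reading and is not mergekit's exact code:

```python
def gradient_weight(anchors, layer, n_layers=32):
    """Interpolate a mergekit-style weight gradient for one layer.

    Assumes anchors are evenly spaced over [0, n_layers - 1] and
    blended piecewise-linearly (an illustrative assumption).
    """
    if len(anchors) == 1:
        return anchors[0]
    pos = layer / (n_layers - 1) * (len(anchors) - 1)  # position in anchor space
    i = min(int(pos), len(anchors) - 2)                # left anchor index
    frac = pos - i                                     # blend fraction
    return anchors[i] * (1 - frac) + anchors[i + 1] * frac

# WestLake's gradient from the config: strong in the earliest layers,
# fading to almost nothing by the final layers.
westlake = [0.3877, 0.1636, 0.186, 0.0502]
first = gradient_weight(westlake, 0)    # 0.3877 at layer 0
last = gradient_weight(westlake, 31)    # 0.0502 at layer 31
```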

## Table of Benchmarks

### Open LLM Leaderboard

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
|---------------------------------------------------------|---------|-------|-----------|-------|------------|------------|-------|
| giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1 | 74.9 | 71.76 | 88.15 | 65.07 | 67.92 | 84.53 | 71.95 |
| mlabonne/AlphaMonarch-7B | 75.99 | 73.04 | 89.18 | 64.4 | 77.91 | 84.69 | 66.72 |
| senseable/WestLake-7B-v2 | 74.68 | 73.04 | 88.65 | 64.71 | 67.06 | 86.98 | 67.63 |
| berkeley-nest/Starling-LM-7B-alpha | 67.13 | 63.82 | 84.9 | 63.64 | 46.39 | 80.58 | 62.4 |
| senseable/garten2-7b | 72.65 | 69.37 | 87.54 | 65.44 | 59.5 | 84.69 | 69.37 |

### Yet Another LLM Leaderboard benchmarks

| Model |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
|---------------------------------------------------------------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
|[giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1](https://huggingface.co/giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1)| 44.99| 76.93| 68.04| 47.71| 59.42|
|[mlabonne/AlphaMonarch-7B](https://huggingface.co/mlabonne/AlphaMonarch-7B) | 45.37| 77 | 78.39| 50.2 | 62.74|
|[berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha) | 42.06| 72.72| 47.33| 42.53| 51.16|

### Misc. Benchmarks

| Model | MT-Bench | EQ-Bench v2.1 |
|---------------------------------------------------------|----------|------------------------------|
| giraffe176/Starling_Monarch_Westlake_Garten-7B-v0.1 | 8.109375 | 80.01 (3 Shot, ChatML, ooba) |
| mlabonne/AlphaMonarch-7B | 8.23750 | 76.08 |
| senseable/WestLake-7B-v2 | n/a | 78.7 |
| berkeley-nest/Starling-LM-7B-alpha | 8.09 | 68.69 (1 Shot, ChatML, ooba) |
| senseable/garten2-7b | n/a | 75.03 |
| claude-v1 | 7.900000 | 76.83 |
| gpt-3.5-turbo | 7.943750 | 71.74 |

MT-Bench scores: [(Paper)](https://arxiv.org/abs/2306.05685). EQ-Bench scores: [(Paper)](https://arxiv.org/abs/2312.06281), [Leaderboard](https://eqbench.com/).