alexmarques committed verified commit e191b8d (parent: 8f89d5f)

Update README.md

Files changed (1):
  1. README.md +72 -5
README.md CHANGED
@@ -25,7 +25,7 @@ license: llama3.1
  - **Out-of-scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws).
  - **Release Date:** 7/11/2024
  - **Version:** 1.0
- - **License(s):** [Llama3.1]
+ - **License(s):** Llama3.1
  - **Model Developers:** Neural Magic
 
  Quantized version of [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct).
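
The checkpoint named above can be loaded directly with vLLM, the same backend used by the evaluation commands added later in this commit. A minimal offline-inference sketch; the prompt and sampling settings are illustrative assumptions, and `max_model_len` mirrors the value used in those commands:

```python
# Minimal vLLM offline-inference sketch for the quantized checkpoint.
# Assumes a vLLM build that supports INT8 W8A8 (compressed-tensors) checkpoints.
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

MODEL = "neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8"

llm = LLM(model=MODEL, max_model_len=4096)  # max_model_len mirrors the eval commands
params = SamplingParams(temperature=0.6, top_p=0.9, max_tokens=256)  # assumed settings

# Llama 3.1 Instruct expects its chat template; apply it via the tokenizer.
tokenizer = AutoTokenizer.from_pretrained(MODEL)
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Who are you?"}],
    tokenize=False,
    add_generation_prompt=True,
)

outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```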
@@ -156,7 +156,7 @@ This version of the lm-evaluation-harness includes versions of ARC-Challenge and
  </td>
  </tr>
  <tr>
- <td>ARC Challenge (25-shot)
+ <td>ARC Challenge (0-shot)
  </td>
  <td>83.19
  </td>
@@ -166,7 +166,7 @@ This version of the lm-evaluation-harness includes versions of ARC-Challenge and
  </td>
  </tr>
  <tr>
- <td>GSM-8K (5-shot, strict-match)
+ <td>GSM-8K (CoT, 8-shot, strict-match)
  </td>
  <td>82.79
  </td>
@@ -196,7 +196,7 @@ This version of the lm-evaluation-harness includes versions of ARC-Challenge and
  </td>
  </tr>
  <tr>
- <td>TruthfulQA (0-shot)
+ <td>TruthfulQA (0-shot, mc2)
  </td>
  <td>54.04
  </td>
@@ -215,4 +215,71 @@ This version of the lm-evaluation-harness includes versions of ARC-Challenge and
  <td><strong>99.3%</strong>
  </td>
  </tr>
- </table>
+ </table>
+
+ ### Reproduction
+
+ The results were obtained using the following commands:
+
+ #### MMLU
+ ```
+ lm_eval \
+   --model vllm \
+   --model_args pretrained="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8",dtype=auto,add_bos_token=True,max_model_len=4096,tensor_parallel_size=1 \
+   --tasks mmlu \
+   --num_fewshot 5 \
+   --batch_size auto
+ ```
+
+ #### ARC-Challenge
+ ```
+ lm_eval \
+   --model vllm \
+   --model_args pretrained="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8",dtype=auto,add_bos_token=True,max_model_len=4096,tensor_parallel_size=1 \
+   --tasks arc_challenge_llama_3.1_instruct \
+   --apply_chat_template \
+   --num_fewshot 0 \
+   --batch_size auto
+ ```
+
+ #### GSM-8K
+ ```
+ lm_eval \
+   --model vllm \
+   --model_args pretrained="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8",dtype=auto,add_bos_token=True,max_model_len=4096,tensor_parallel_size=1 \
+   --tasks gsm8k_cot_llama_3.1_instruct \
+   --fewshot_as_multiturn \
+   --apply_chat_template \
+   --num_fewshot 8 \
+   --batch_size auto
+ ```
+
+ #### Hellaswag
+ ```
+ lm_eval \
+   --model vllm \
+   --model_args pretrained="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8",dtype=auto,add_bos_token=True,max_model_len=4096,tensor_parallel_size=1 \
+   --tasks hellaswag \
+   --num_fewshot 10 \
+   --batch_size auto
+ ```
+
+ #### Winogrande
+ ```
+ lm_eval \
+   --model vllm \
+   --model_args pretrained="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8",dtype=auto,add_bos_token=True,max_model_len=4096,tensor_parallel_size=1 \
+   --tasks winogrande \
+   --num_fewshot 5 \
+   --batch_size auto
+ ```
+
+ #### TruthfulQA
+ ```
+ lm_eval \
+   --model vllm \
+   --model_args pretrained="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8",dtype=auto,add_bos_token=True,max_model_len=4096,tensor_parallel_size=1 \
+   --tasks truthfulqa \
+   --num_fewshot 0 \
+   --batch_size auto
+ ```
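
Beyond the CLI invocations added in this commit, the same evaluations can be driven from Python. A minimal sketch using `lm_eval.simple_evaluate`, shown for the GSM-8K setup; it assumes a recent lm-evaluation-harness release where `apply_chat_template` and `fewshot_as_multiturn` are exposed as keyword arguments:

```python
# Programmatic equivalent of the GSM-8K command above, using the lm-eval Python API.
# simple_evaluate mirrors the CLI flags; the chat-template kwargs are assumptions
# about a recent harness version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",
    model_args=(
        "pretrained=neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8,"
        "dtype=auto,add_bos_token=True,max_model_len=4096,tensor_parallel_size=1"
    ),
    tasks=["gsm8k_cot_llama_3.1_instruct"],
    num_fewshot=8,
    batch_size="auto",
    apply_chat_template=True,    # mirrors --apply_chat_template (assumed kwarg)
    fewshot_as_multiturn=True,   # mirrors --fewshot_as_multiturn (assumed kwarg)
)
print(results["results"]["gsm8k_cot_llama_3.1_instruct"])
```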