Details of Ability Loss
Original model:
vllm (pretrained=/root/autodl-tmp/Cydonia-24B-v2,add_bos_token=true,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| gsm8k | 3 | flexible-extract | 5 | exact_match | ↑ | 0.912 | ± | 0.018 |
| | | strict-match | 5 | exact_match | ↑ | 0.912 | ± | 0.018 |
vllm (pretrained=/root/autodl-tmp/Cydonia-24B-v2,add_bos_token=true,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 500.0, num_fewshot: 5, batch_size: auto
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| gsm8k | 3 | flexible-extract | 5 | exact_match | ↑ | 0.904 | ± | 0.0132 |
| | | strict-match | 5 | exact_match | ↑ | 0.894 | ± | 0.0138 |
vllm (pretrained=/root/autodl-tmp/Cydonia-24B-v2,add_bos_token=true,max_model_len=700,dtype=bfloat16), gen_kwargs: (None), limit: 15.0, num_fewshot: None, batch_size: 1
| Groups | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7942 | ± | 0.0131 |
| - humanities | 2 | none | | acc | ↑ | 0.8205 | ± | 0.0257 |
| - other | 2 | none | | acc | ↑ | 0.8103 | ± | 0.0271 |
| - social sciences | 2 | none | | acc | ↑ | 0.8500 | ± | 0.0257 |
| - stem | 2 | none | | acc | ↑ | 0.7298 | ± | 0.0249 |
Final W8A8 quantized model:
vllm (pretrained=/root/autodl-tmp/87-1536,add_bos_token=true,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| gsm8k | 3 | flexible-extract | 5 | exact_match | ↑ | 0.880 | ± | 0.0206 |
| | | strict-match | 5 | exact_match | ↑ | 0.868 | ± | 0.0215 |
vllm (pretrained=/root/autodl-tmp/87-1536,add_bos_token=true,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 500.0, num_fewshot: 5, batch_size: auto
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| gsm8k | 3 | flexible-extract | 5 | exact_match | ↑ | 0.880 | ± | 0.0145 |
| | | strict-match | 5 | exact_match | ↑ | 0.854 | ± | 0.0158 |
vllm (pretrained=/root/autodl-tmp/87-1536,add_bos_token=true,max_model_len=700,dtype=bfloat16), gen_kwargs: (None), limit: 15.0, num_fewshot: None, batch_size: 1
| Groups | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7836 | ± | 0.0131 |
| - humanities | 2 | none | | acc | ↑ | 0.8359 | ± | 0.0249 |
| - other | 2 | none | | acc | ↑ | 0.7795 | ± | 0.0279 |
| - social sciences | 2 | none | | acc | ↑ | 0.8444 | ± | 0.0259 |
| - stem | 2 | none | | acc | ↑ | 0.7123 | ± | 0.0251 |
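Loss summary (original -> W8A8, absolute drop, relative drop):
gsm8k strict-match, limit 250: 0.912 -> 0.868: -0.044 (4.82%)
gsm8k strict-match, limit 500: 0.894 -> 0.854: -0.04 (4.47%)
mmlu acc: 0.7942 -> 0.7836: -0.0106 (1.33%)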
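
The percentages in the summary are relative drops, that is, the absolute score difference divided by the original model's score. A minimal sketch of that arithmetic follows; the function name `relative_drop` and the labels in `pairs` are illustrative and are not part of the evaluation harness output.

```python
def relative_drop(before: float, after: float) -> tuple[float, float]:
    """Return (absolute drop, relative drop in percent) between two scores."""
    absolute = before - after
    return absolute, absolute / before * 100


# Scores copied from the tables above: (original model, W8A8 model).
# The labels are descriptive only (hypothetical names).
pairs = {
    "gsm8k strict-match, limit 250": (0.912, 0.868),
    "gsm8k strict-match, limit 500": (0.894, 0.854),
    "mmlu acc": (0.7942, 0.7836),
}

for name, (before, after) in pairs.items():
    absolute, percent = relative_drop(before, after)
    print(f"{name}: {before} -> {after}: -{absolute:.4f} ({percent:.2f}%)")
```

When reading these numbers, note that each absolute drop is only about 0.8 to 1.8 times the quadrature sum of the two reported standard errors, so the measured losses come with noticeable sampling uncertainty at these sample limits.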