---
datasets:
- JeanKaddour/minipile
language:
- en
base_model:
- EleutherAI/pythia-1.4b-deduped
---
| Benchmark | Measure | Direction | 1.4B Pile Deduplicated | 1.4B MiniPile | Percentage Difference in Means (%) |
| ---------------- | ---------- | --- | ---------------------- | -------------------------- | ------------------------------ |
| ARC-Challenge | acc | ↑ | **0.2600 ± 0.0130** | 0.1903 ± 0.0115 | -26.8077 |
| MMLU | acc | ↑ | **0.2388 ± 0.0036** | 0.2295 ± 0.0035 | -3.8945 |
| HellaSwag | acc | ↑ | **0.4177 ± 0.0049** | 0.2579 ± 0.0044 | -38.2571 |
| WinoGrande | acc | ↑ | **0.5730 ± 0.0140** | 0.5185 ± 0.0140 | -9.5133 |
| Lambada (OpenAI) | acc | ↑ | **0.6202 ± 0.0068** | 0.0000 ± 0.0000 | -100.0000 |
| Lambada (OpenAI) | perplexity | ↓ | **6.1041 ± 0.1531** | 1564928.5258 ± 118691.4565 | 25637234.3458 |
| Lambada (Std) | acc | ↑ | **0.4898 ± 0.0070** | 0.0000 ± 0.0000 | -100.0000 |
| Lambada (Std) | perplexity | ↓ | **11.2448 ± 0.3305** | 8848600.9409 ± 745031.8900 | 78690503.1312 |
| BLiMP | acc | ↑ | **0.8154 ± 0.0013** | 0.5483 ± 0.0017 | -32.7569 |
| ARC-Easy | acc | ↑ | **0.6174 ± 0.0100** | 0.2715 ± 0.0091 | -56.0253 |
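
The last column appears to be the relative change of the MiniPile mean with respect to the Pile Deduplicated mean, in percent. The sketch below reproduces that arithmetic for two rows of the table. The function name `pct_diff` is hypothetical, and the formula is inferred from the reported numbers, not taken from the evaluation code.

```python
def pct_diff(baseline_mean: float, new_mean: float) -> float:
    """Relative change of new_mean versus baseline_mean, in percent.

    Assumed formula (inferred from the table, not from the authors' code):
        (new_mean - baseline_mean) / baseline_mean * 100
    """
    if baseline_mean == 0:
        raise ValueError("baseline_mean must be non-zero")
    return (new_mean - baseline_mean) / baseline_mean * 100.0


# Means copied from the table: (Pile Deduplicated, MiniPile).
rows = {
    "ARC-Challenge": (0.2600, 0.1903),
    "MMLU": (0.2388, 0.2295),
}

for name, (baseline, minipile) in rows.items():
    print(f"{name}: {pct_diff(baseline, minipile):.4f}")
```

Because the change is relative to the baseline, the perplexity rows yield very large positive values: a small baseline perplexity divided into a very large MiniPile perplexity. For rows marked ↓, a positive difference means the MiniPile model is worse.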