aloobun/d-SmolLM2-360M


aloobun committed on Nov 21, 2024

Commit 7802a05 (verified) · 1 Parent(s): fd30bdf

Update README.md

Files changed (1)

README.md +13 -6

@@ -29,12 +29,6 @@ It slightly improves upon the performance of the basemodel on the following task
 # Eval Results aloobun/d-SmolLM2-360M (WIP)
-Todo:
-ifeval (0-shot, generative)
-Math-lvl-5 (4-shots, generative, minerva version)
 ## GPQA
@@ -100,3 +94,16 @@ Math-lvl-5 (4-shots, generative, minerva version)
 |                  |       |none  |     0|inst_level_strict_acc  |↑  |0.2770|±  |   N/A|
 |                  |       |none  |     0|prompt_level_loose_acc |↑  |0.1497|±  |0.0154|
 |                  |       |none  |     0|prompt_level_strict_acc|↑  |0.1423|±  |0.0150|

+## MATH HARD
+|                    Tasks                    |Version|Filter|n-shot|  Metric   |   |Value |   |Stderr|
+|---------------------------------------------|-------|------|-----:|-----------|---|-----:|---|-----:|
+|leaderboard_math_hard                        |    N/A|      |      |           |   |      |   |      |
+| - leaderboard_math_algebra_hard             |      2|none  |     4|exact_match|↑  |0.0033|±  |0.0033|
+| - leaderboard_math_counting_and_prob_hard   |      2|none  |     4|exact_match|↑  |0.0081|±  |0.0081|
+| - leaderboard_math_geometry_hard            |      2|none  |     4|exact_match|↑  |0.0000|±  |0.0000|
+| - leaderboard_math_intermediate_algebra_hard|      2|none  |     4|exact_match|↑  |0.0000|±  |0.0000|
+| - leaderboard_math_num_theory_hard          |      2|none  |     4|exact_match|↑  |0.0065|±  |0.0065|
+| - leaderboard_math_prealgebra_hard          |      2|none  |     4|exact_match|↑  |0.0104|±  |0.0073|
+| - leaderboard_math_precalculus_hard         |      2|none  |     4|exact_match|↑  |0.0000|±  |0.0000|
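
For readers who want a single summary number from the MATH HARD rows added above, the following is a minimal sketch that parses the table text and takes an unweighted mean of the per-subtask `exact_match` values. The names `MATH_HARD_TABLE` and `parse_rows` are hypothetical helpers introduced here for illustration, and the unweighted mean is an assumption: it is not necessarily how the evaluation harness or any leaderboard aggregates these subtasks, since it ignores how many problems each subtask contains.

```python
# Rows copied verbatim from the MATH HARD table in this commit.
MATH_HARD_TABLE = """\
| - leaderboard_math_algebra_hard             |      2|none  |     4|exact_match|↑  |0.0033|±  |0.0033|
| - leaderboard_math_counting_and_prob_hard   |      2|none  |     4|exact_match|↑  |0.0081|±  |0.0081|
| - leaderboard_math_geometry_hard            |      2|none  |     4|exact_match|↑  |0.0000|±  |0.0000|
| - leaderboard_math_intermediate_algebra_hard|      2|none  |     4|exact_match|↑  |0.0000|±  |0.0000|
| - leaderboard_math_num_theory_hard          |      2|none  |     4|exact_match|↑  |0.0065|±  |0.0065|
| - leaderboard_math_prealgebra_hard          |      2|none  |     4|exact_match|↑  |0.0104|±  |0.0073|
| - leaderboard_math_precalculus_hard         |      2|none  |     4|exact_match|↑  |0.0000|±  |0.0000|
"""


def parse_rows(table: str) -> dict[str, float]:
    """Map each subtask name to its exact_match value (hypothetical helper)."""
    scores: dict[str, float] = {}
    for line in table.splitlines():
        # Drop the outer pipes, then split into the nine table columns.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 9 or cells[4] != "exact_match":
            continue  # skip headers, separators, and the group row
        name = cells[0].lstrip("- ")  # remove the "- " subtask marker
        scores[name] = float(cells[6])
    return scores


scores = parse_rows(MATH_HARD_TABLE)
# Unweighted (macro) mean over subtasks; an assumption, not the official aggregate.
macro_avg = sum(scores.values()) / len(scores)
print(f"{len(scores)} subtasks, unweighted mean exact_match = {macro_avg:.4f}")
```

Under these assumptions the mean comes out at roughly 0.004, which matches the picture the table already gives: four of the seven subtasks score above zero, and each of those only marginally.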