llm_contamination_detector/data/code_eval_board.csv
T,Models,ARC,HellaSwag,MMLU,TruthfulQA,Winogrande,GSM8K,Reference Model
🟢,roneneldan/TinyStories-3M,0.06,0.1,0.13,0.2,0.01,0,huggyllama/llama-7b
🟢,roneneldan/TinyStories-1M,0.05,0.11,0.09,0.17,0.01,0,huggyllama/llama-7b
🔶,Fredithefish/ReasonixPajama-3B-HF,0.15,0.24,0.21,0.94,0.01,0.44,huggyllama/llama-7b
🟢,mistralai/Mistral-7B-v0.1,0.54,0.51,0.46,0.75,0,0.91,huggyllama/llama-7b
🔶,rishiraj/meow,0.11,0.49,0.28,0.36,0.02,0.95,huggyllama/llama-7b
🔶,Q-bert/MetaMath-Cybertron-Starling,0.52,0.64,0.51,0.75,0.01,0.99,huggyllama/llama-7b
🔶,upstage/SOLAR-10.7B-Instruct-v1.0,0.11,0.49,0.28,0.36,0.01,0.96,huggyllama/llama-7b