lingoly-too / leaderboard.csv
Jude Khouja
Model,Provider,Type,Baseline score,Obfuscated score
Aya 23 35B,Cohere,Open source,0.10654349746757057,0.05708180119638717
Claude 3.5 Sonnet,Anthropic,Closed source,0.48255271180599657,0.2810140963355337
Claude 3.7 Sonnet,Anthropic,Closed source,0.5994013309112796,0.4357505520191723
GPT 4.5,OpenAI,Closed source,0.4208265195574057,0.2545024812218498
GPT 4o,OpenAI,Closed source,0.31371291749661456,0.1563339989919302
Gemini 1.5 Pro,Google,Closed source,0.3690345167304693,0.20461522579355207
Llama 3.3 70B-Instruct,Meta,Open source,0.11452795751175084,0.08213118755937426
Phi4,Microsoft,Open source,0.1809802769595679,0.10996628714372364
DeepSeek R1,DeepSeek,Open source,0.3965527162895584,0.2649618642615188
o1-preview,OpenAI,Closed source,0.47730527712315257,0.3222020975619888
o3-mini,OpenAI,Closed source,0.42172257807447155,0.3059086523804619