BigCodeBench Leaderboard
Explore and analyze code evaluation data
Uncensored General Intelligence Leaderboard
Display chatbot leaderboard and stats
Embedding Leaderboard
Track, rank and evaluate open LLMs and chatbots
Submit code models for evaluation on benchmarks
Display a web page
Request and view assessments for speech recognition models
Generate images from text descriptions
View LLM Performance Leaderboard
Display and explore zebra puzzle leaderboard
imgsys.org -- arena for text-guided image generation
Embed and use ZeroEval for evaluation tasks
Browse and submit language model benchmarks
Blind vote on HF TTS models!
Display data interactively
DABstep Reasoning Benchmark Leaderboard
Ranking of LLMs for agentic tasks