🚀 The World’s First Public Diffusion Model Leaderboard (CLIP Score + FID Benchmarks!)
The diffusion space is moving fast, but reproducible evaluation has lagged behind. Until now.
We’re excited to announce the launch of the DreamLayer AI Diffusion Model Leaderboard, a public benchmark where top text-to-image models are scored under identical conditions using standardized metrics: CLIP Score, FID, Precision/Recall, and F1.
🏆 Note: We’re hosting an Image Generation Kaggle Challenge!
Join the competition here: Kaggle.com/competitions/text-to-image-challenge
Compete for a chance to win one of 5 cash prizes.
Why This Matters
Reproducibility is one of the biggest challenges in generative AI research.
Comparisons between OpenAI’s DALL·E, Stability AI’s SD Turbo, Google Gemini’s Nano Banana, Black Forest Labs’ Flux, Runway Gen-4, Ideogram, and Luma Labs’ Photon have often been inconsistent because each evaluation used different prompts, seeds, and scoring setups.
Our pipeline automates:
- Prompt pack setup across 200 prompts
- Seed control for identical conditions
- Metric scoring with CLIP, FID, and composition correctness
✅ The result: 200 prompts benchmarked in just 45 minutes per model, with transparent, reproducible metrics anyone can trust.
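To make the seed-control and scoring steps concrete, here is a minimal sketch of what one benchmarking pass can look like, using Hugging Face diffusers for generation and torchmetrics for CLIP scoring. The model ID, two-prompt list, and seed below are illustrative stand-ins, not the leaderboard’s actual configuration:

```python
# Minimal sketch of one benchmarking pass: fixed-seed generation + CLIP scoring.
# Assumes the `diffusers`, `torchmetrics`, and `transformers` packages are installed.
import numpy as np
import torch
from diffusers import AutoPipelineForText2Image
from torchmetrics.multimodal.clip_score import CLIPScore

MODEL_ID = "stabilityai/sd-turbo"  # illustrative model; any text-to-image pipeline works
SEED = 42                          # one fixed seed so every model sees identical noise
PROMPTS = [                        # stand-ins for the 200-prompt pack
    "a red cube stacked on top of a blue sphere",
    "an astronaut riding a horse on the moon",
]

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = AutoPipelineForText2Image.from_pretrained(MODEL_ID).to(device)
clip_score = CLIPScore(model_name_or_path="openai/clip-vit-base-patch16").to(device)

for prompt in PROMPTS:
    # Re-seed before every prompt so results are reproducible run-to-run.
    generator = torch.Generator(device).manual_seed(SEED)
    # SD Turbo is a one-step distilled model, hence steps=1 and no guidance.
    image = pipe(prompt, generator=generator,
                 num_inference_steps=1, guidance_scale=0.0).images[0]
    # CLIPScore expects uint8 image tensors in (C, H, W) layout.
    img = torch.from_numpy(np.array(image)).permute(2, 0, 1).to(device)
    clip_score.update([img], [prompt])

# Note: torchmetrics reports 100 * cosine similarity; divide by 100 to
# compare against 0-1 CLIP Score values like those on the leaderboard.
print(f"mean CLIP Score: {clip_score.compute().item():.3f}")
```

FID follows the same pattern using torchmetrics’ FrechetInceptionDistance, except it also needs a reference set of real images to compare the generated distribution against.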
Early Results
- Best CLIP Score: Luma Labs Photon (0.265)
- Best FID Score: Ideogram V3 (305.60; lower is better)
- Best Recall: Stability SD Turbo (0.533)
- Best Overall F1: Luma Labs Photon (0.463)
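For context on the last two metrics: in the generative-modeling sense, precision measures how realistic the generated samples are, while recall measures how much of the real data distribution the model covers. Assuming the leaderboard uses the standard definition, F1 is simply their harmonic mean:

F1 = 2 · (Precision · Recall) / (Precision + Recall)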
📊 See the full leaderboard live on DreamLayer: dreamlayer.io
What’s Next
We’re inviting the community to submit their own diffusion models for evaluation.
Whether you’re a researcher, startup, or lab, this leaderboard is designed to accelerate fair, reproducible benchmarking and push the field forward.
👉 Want your model added? Contact us!
