Running on CPU Upgrade • Featured • 3.05k • The Smol Training Playbook • The secrets to building world-class LLMs
google/embeddinggemma-300m Sentence Similarity • 0.3B • Updated Sep 25, 2025 • 1.88M • 1.54k
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26, 2025 • 77
OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning Paper • 2506.00338 • Published May 31, 2025 • 10
Hugging Face Changelog • Xet is now the default storage option for new users and organizations • May 23, 2025 • 76
Running on Zero • Featured • 1.75k • Dia 1.6B • Generate realistic dialogue from a script, using Dia!
Article • Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques • Mar 24, 2025 • 20
Running • 3.74k • The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters
Article • KV Caching Explained: Optimizing Transformer Inference Efficiency • Jan 30, 2025 • 245
Article • How biased is Whisper? Evaluating Whisper Models for Robustness to Diverse English Accents • Jan 29, 2025 • 17