BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus Paper • 2207.03546 • Published Jul 7, 2022 • 1
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 135
Open Whisper-style Speech Models (OWSM) Collection Fully open Whisper-style speech foundation models developed by CMU WAVLab: https://www.wavlab.org/activities/2024/owsm/ • 15 items • Updated Feb 6 • 5
CommonCrawl Collection Large web-mined general corpus based on CommonCrawl. • 7 items • Updated Dec 8, 2024 • 2
AfriCOMET Collection COMET evaluation models for African languages • 6 items • Updated Oct 1, 2024 • 2
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 111
Optimized ONNX models for NVIDIA RTX GPUs Collection Collection of optimized ONNX model checkpoints for NVIDIA RTX GPUs • 7 items • Updated about 5 hours ago • 10
Spaces for Model / Space / useful Utilities in Hugging Face Collection 276 items • Updated 7 days ago • 9
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models Paper • 2409.17892 • Published Sep 26, 2024 • 2
Faith and Fate: Limits of Transformers on Compositionality Paper • 2305.18654 • Published May 29, 2023 • 6
💻 Local SmolLMs Collection SmolLM models in MLC, ONNX and GGUF format for local applications + in-browser demos • 14 items • Updated 26 days ago • 51
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 26 days ago • 219