Llama 3.3 (All Versions) Collection Meta's new Llama 3.3 (70B) model in all formats. Includes GGUF, 4-bit bitsandbytes (bnb), and original versions. • 3 items • Updated 3 days ago • 35
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints from Kyutai's Moshi release. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 227
OpenAssistant Conversations -- Democratizing Large Language Model Alignment Paper • 2304.07327 • Published Apr 14, 2023 • 6
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned variants in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Nov 28, 2024 • 509
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 3 days ago • 110
Scaling Laws for Floating Point Quantization Training Paper • 2501.02423 • Published Jan 5 • 26
ACECODER: Acing Coder RL via Automated Test-Case Synthesis Paper • 2502.01718 • Published 4 days ago • 22
RandLoRA: Full-rank parameter-efficient fine-tuning of large models Paper • 2502.00987 • Published 4 days ago • 9
Preference Leakage: A Contamination Problem in LLM-as-a-judge Paper • 2502.01534 • Published 4 days ago • 34
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs Paper • 2501.18585 • Published 8 days ago • 51
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published 4 days ago • 108
The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training Paper • 2501.18965 • Published 7 days ago • 5
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 10 days ago • 100