🌙 March 2025 - Open releases from the Chinese community Collection 30 items • Updated about 18 hours ago • 12
How far can we go with ImageNet for Text-to-Image generation? Paper • 2502.21318 • Published Feb 28 • 25
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks Paper • 2502.08235 • Published Feb 12 • 55
FoNE: Precise Single-Token Number Embeddings via Fourier Features Paper • 2502.09741 • Published Feb 13 • 11
Aira Collection Aira is a series of chatbots developed as an experimentation playground for value alignment. • 27 items • Updated Jun 20, 2024 • 1
Loxa Collection The Loxa family models are the best models for running on CPU and GPU with high quality (>=92% accuracy) • 5 items • Updated Feb 3 • 2
Quadrifoglio 🍀 Collection Small text2text models finetuned on Italian machine translation tasks. • 6 items • Updated Jan 12 • 1
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 139
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 55
FluidML: Fast and Memory Efficient Inference Optimization Paper • 2411.09242 • Published Nov 14, 2024 • 1
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 62
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated Feb 20 • 251