Small but mighty: 82M parameters, runs locally, speaks multiple languages. The best part? It's Apache 2.0 licensed! This could unlock so many possibilities ✨
🚀 The open source community is unstoppable: 4M total downloads for DeepSeek models on Hugging Face, with 3.2M coming from the +600 models created by the community.
Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:
- Original release: 8 models, 540K downloads. Just the beginning...
- The community turned those open-weight models into +550 NEW models on Hugging Face. Total downloads? 2.5M—nearly 5X the originals.
The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.
When you empower builders, innovation explodes. For everyone. 🚀
The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF version — 1M downloads alone.
✨ Launched All-Scenario Reasoning Model (language, visual, and search reasoning capabilities) , with medical expertise as one of its key highlights. https://ying.baichuan-ai.com/chat
✨ Released Baichuan-M1-14B Medical LLM on the hub Available in both Base and Instruct versions, support English & Chinese.
What happened yesterday in the Chinese AI community? 🚀
T2A-01-HD 👉 https://hailuo.ai/audio MiniMax's Text-to-Audio model, now in Hailuo AI, offers 300+ voices in 17+ languages and instant emotional voice cloning.
Tare 👉 https://www.trae.ai/ A new coding tool by Bytedance for professional developers, supporting English & Chinese with free access to Claude 3.5 and GPT-4 for a limited time.
Kimi K 1.5 👉 https://github.com/MoonshotAI/Kimi-k1.5 | https://kimi.ai/ An O1-level multi-modal model by MoonShot AI, utilizing reinforcement learning with long and short-chain-of-thought and supporting up to 128k tokens.
And today…
Hunyuan 3D-2.0 👉 tencent/Hunyuan3D-2 A SoTA 3D synthesis system for high-res textured assets by Tencent Hunyuan , with open weights and code!
Reminder: Don’t. Use. ChatGPT. As. A. Calculator. Seriously. 🤖
Loved listening to @sasha on Hard Fork—it really made me think.
A few takeaways that hit home: - Individual culpability only gets you so far. The real priority: demanding accountability and transparency from companies. - Evaluate if generative AI is the right tool for certain tasks (like search) before using it.
✨ MIT License : enabling distillation for custom models ✨ 32B & 70B models match OpenAI o1-mini in multiple capabilities ✨ API live now! Access Chain of Thought reasoning with model='deepseek-reasoner'