AI & ML interests

None defined yet.

Recent Activity

AdinaY 
posted an update 4 days ago
MiniMax M2.5 is now available on the hub 🚀

MiniMaxAI/MiniMax-M2.5

✨ 229B - Modified MIT license
✨ 37% faster than M2.1
✨ ~$1/hour at 100 TPS
sergiopaniego 
posted an update 4 days ago
AdinaY 
posted an update 5 days ago
AdinaY 
posted an update 6 days ago
Game on 🎮🚀

While Seedance 2.0’s videos are all over the timeline, DeepSeek quietly pushed a new model update in its app.

GLM-5 from Z.ai adds more momentum.

Ming-flash-omni from Ant Group, MiniCPM-SALA from OpenBMB, and the upcoming MiniMax M2.5 keep the heat on 🔥

Spring Festival is around the corner,
no one’s sleeping!

✨ More releases coming, stay tuned
https://huggingface.co/collections/zh-ai-community/2026-february-china-open-source-highlights
AdinaY 
posted an update 6 days ago
Ming-flash-omni 2.0 🚀 New open omni-MLLM released by Ant Group

inclusionAI/Ming-flash-omni-2.0

✨ MIT license
✨ MoE - 100B/6B active
✨ Zero-shot voice cloning + controllable audio
✨ Fine-grained visual knowledge grounding
AdinaY 
posted an update 8 days ago
LLaDA 2.1 is out 🔥 A new series of MoE diffusion language models released by Ant Group

inclusionAI/LLaDA2.1-mini
inclusionAI/LLaDA2.1-flash

✨ LLaDA2.1-mini: 16B - Apache 2.0
✨ LLaDA2.1-flash: 100B - Apache 2.0
✨ Both deliver editable generation, RL-trained diffusion reasoning, and fast inference
sergiopaniego 
posted an update 8 days ago
If you're looking for a good first issue to start your open-source journey, you could contribute to this TRL issue by documenting one impactful paper in the docs.

We have a broad list to cover! 🧐

https://github.com/huggingface/trl/issues/4407
AdinaY 
posted an update 13 days ago
AI for science is moving fast🚀

Intern-S1-Pro 🔬 a MoE multimodal scientific reasoning model from Shanghai AI Lab

internlm/Intern-S1-Pro

✨ 1T total / 22B active
✨ Apache 2.0
✨ SoTA scientific reasoning performance
✨ FoPE enables scalable modeling of long physical time series (10⁰–10⁶)
AdinaY 
posted an update 14 days ago
✨ China’s open source AI ecosystem has entered a new phase

https://huggingface.co/blog/huggingface/one-year-since-the-deepseek-moment-blog-3

One year after the “DeepSeek Moment,” open source has become the default. Models, research, infrastructure, and deployment are increasingly shared to support large-scale, system-level integration.

This final blog post examines how leading Chinese AI organizations are evolving, and what this implies for the future of open source.
AdinaY 
posted an update 14 days ago
GLM just entered the OCR field🔥

zai-org/GLM-OCR

✨ 0.9B
✨ MIT licensed
✨ Multimodal GLM-V architecture
✨ #1 on OmniDocBench v1.5 (94.62)
AdinaY 
posted an update 15 days ago
AdinaY 
posted an update 19 days ago
What a week 🤯

Following DeepSeek, Kimi, Qwen, Baidu, and Ant Group, Unitree Robotics has now released a VLA model on the hub too!

unitreerobotics/UnifoLM-VLA-Base
sergiopaniego 
posted an update 19 days ago
Meet the Post-Training Toolkit (PTT) by Aditya Challapally (@microsoft), which integrates with TRL via a single callback:

🔍 Detects training issues early
🛠 Lets you intervene safely
📊 Keeps long training runs stable, auditable & efficient

Microsoft blog: https://devblogs.microsoft.com/engineering-at-microsoft/diagnosing-instability-in-production-scale-agent-rl/

Integration guide: https://huggingface.co/docs/trl/main/en/ptt_integration

Code: https://github.com/microsoft/post-training-toolkit
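
A minimal sketch of that single-callback integration pattern, assuming the standard transformers/TRL callback API. The StabilityMonitorCallback class below is a hypothetical stand-in for the actual PTT callback (see the microsoft/post-training-toolkit repo for the real one); the model and dataset names are illustrative only.

```python
# Hypothetical sketch: the real PTT callback lives in microsoft/post-training-toolkit.
# This stand-in only shows how a single TrainerCallback plugs into a TRL trainer.
from datasets import load_dataset
from transformers import TrainerCallback
from trl import SFTConfig, SFTTrainer


class StabilityMonitorCallback(TrainerCallback):
    """Hypothetical stand-in: flag sudden loss spikes during training."""

    def __init__(self, spike_factor=2.0):
        self.spike_factor = spike_factor
        self.prev_loss = None

    def on_log(self, args, state, control, logs=None, **kwargs):
        loss = (logs or {}).get("loss")
        if loss is not None and self.prev_loss is not None and loss > self.spike_factor * self.prev_loss:
            print(f"[monitor] loss spike at step {state.global_step}: {loss:.3f}")
        if loss is not None:
            self.prev_loss = loss


# Illustrative model/dataset choices, not part of the PTT announcement.
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",
    args=SFTConfig(output_dir="ptt-demo"),
    train_dataset=load_dataset("trl-lib/Capybara", split="train"),
    callbacks=[StabilityMonitorCallback()],  # the single integration point
)
trainer.train()
```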
victor 
posted an update 19 days ago
Interesting article: use Claude Code to help open models write CUDA kernels (for example) by turning Claude Code traces into Skills. They made a library out of it 👀

https://huggingface.co/blog/upskill
AdinaY 
posted an update 20 days ago
LongCat-Flash-Lite 🔥 A non-thinking MoE model released by the Meituan LongCat team.

meituan-longcat/LongCat-Flash-Lite

✨ Total 68.5B / 3B active - MIT license
✨ 256k context
✨ Faster inference with N-gram embeddings