AI & ML interests

None defined yet.

Recent Activity

merve 
posted an update about 12 hours ago
Large AI labs have dropped so many open models last week 🔥 don't miss out on them!

→ Apple released on-device vision LMs apple/fastvlm-68ac97b9cd5cacefdd04872e & apple/mobileclip2-68ac947dcb035c54bcd20c47
→ OpenGVLab released InternVL3.5, 32 new vision LMs with one based on gpt-oss! (OS) OpenGVLab/internvl35-68ac87bd52ebe953485927fb
→ MSFT released a killer small TTS model (OS) microsoft/VibeVoice-1.5B

find more here: https://huggingface.co/collections/merve/august-29-releases-68b5a3754cfb8abf59e2b486
giadap 
posted an update about 13 hours ago
I've noticed something. While we're careful about what we post on social media, we're sharing our deepest and most intimate thoughts with AI chatbots -- health concerns, financial worries, relationship issues, business ideas...

With OpenAI hinting at ChatGPT advertising, this matters more than ever. Unlike banner ads, AI advertising happens within the conversation itself. Sponsors could subtly influence that relationship advice or financial guidance.

The good news? We have options.
🤝 Open source AI models let us keep conversations private, avoid surveillance-based business models, and build systems that actually serve users first.

Read more about it in our latest blog post, co-written with @frimelle:
https://huggingface.co/blog/giadap/privacy-conversational-ai
AdinaY 
posted an update about 13 hours ago
🔥 August highlights from Chinese AI community

zh-ai-community/august-2025-china-open-source-highlights-68a2de5630f406edaf320e88

✨ Efficiency leads the month
- At scale: optimizing compute use in massive MoE models, e.g. DeepSeek v3.1
- In small models: lightweight & deployable, e.g. MiniCPM V 4.5, Step Audio 2-mini, Intern S1-mini, Ovis2.5-9B, etc.

✨ Reasoning + Agentic wave 🌊 Not just demos, but real product use cases.
- Meituan, DeepSeek: large-scale models tuned for reasoning & tools
- Qwen, GLM, InternLM: multimodal reasoning + agentic interaction
- CodeAgent, Prover, Baichuan-M2-32B: domain-focused (coding, logic, specialized reasoning)

✨ Open source is exploding across all types of companies!!
- Big tech: Tencent, ByteDance, Xiaomi, Kuaishou, Alibaba/Qwen, Skywork, Ant Group
- Startups: DeepSeek (yes, still a startup!), Zhipu, Baichuan, StepFun, OpenBMB
- New entrants: Meituan, RedNote
- Research labs: Shanghai AI Lab (InternLM, OpenGVLab)

✨ Open source was explicitly mentioned in the State Council’s new guidance on deepening the "AI+" strategy.
- Open-source: support communities, encourage contributions (incl. university credits & recognition), foster new application approaches, and build globally impactful ecosystems 👀

💡 The Chinese community didn’t slow down at all in August 🤯 September, the last month before the Golden Week holiday, may bring even more surprises.

Stay Tuned!
AdinaY 
posted an update about 17 hours ago
Hunyuan-MT-7B 🔥 open translation model released by Tencent Hunyuan

tencent/hunyuan-mt-68b42f76d473f82798882597

✨ Supports 33 languages, including 5 ethnic minority languages in China 👀
✨ Including a translation ensemble model: Chimera-7B
✨ Full pipeline: pretrain > CPT > SFT > enhancement > ensemble refinement > SOTA performance at similar scale
AdinaY 
posted an update about 18 hours ago
From food delivery to frontier AI 🚀 Meituan, the leading lifestyle platform, just dropped its first open SOTA LLM: LongCat-Flash 🔥

meituan-longcat/LongCat-Flash-Chat

✨ 560B total / ~27B active MoE — MIT license
✨ 128k context length + advanced reasoning
✨ ScMoE design: 100+ TPS inference
✨ Stable large-scale training + strong agentic performance
AdinaY 
posted an update 4 days ago
USO 🎨 Unified customization model released by ByteDance Research

Demo
bytedance-research/USO
Model
bytedance-research/USO
Paper
USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning (2508.18966)

✨ Large-scale triplet dataset (content, style, stylized)
✨ Disentangled learning: style alignment + content preservation
✨ Style Reward Learning (SRL) for higher fidelity
✨ USO-Bench: 1st benchmark for style & subject jointly
✨ SOTA results on subject consistency & style similarity
AdinaY 
posted an update 4 days ago
Step-Audio 2 🔥 New end-to-end multimodal LLM for audio & speech, released by StepFun

stepfun-ai/step-audio-2-68b003c3a47b273fffaf67a8

✨ Direct raw audio: text & speech, no ASR+LLM+TTS pipeline
✨ High-IQ reasoning: RL + CoT for paralinguistic cues
✨ Multimodal RAG + tool calling
✨ Emotion, timbre, dialect & style control
✨ SOTA on ASR, paralinguistic, speech dialog
giadap 
posted an update 5 days ago
📊 We benchmark models for coding, reasoning, or safety… but what about companionship?

At Hugging Face, we’ve been digging into this question because many of you know how deeply I care about how people build emotional bonds with AI.

That’s why, building on our ongoing research, my amazing co-author and colleague @frimelle created the AI Companionship Leaderboard 🦾
frimelle/companionship-leaderboard

Grounded in our INTIMA benchmark, the leaderboard evaluates models across four dimensions of companionship:
🤖 Assistant Traits: the “voice” and role the model projects
🌷 Relationship & Intimacy: whether it signals closeness or bonding
💘 Emotional Investment: the depth of its emotional engagement
🤲 User Vulnerabilities: how it responds to sensitive disclosures

This work builds on our paper with @frimelle and @yjernite.

📢 Now we’d love your perspective: which open models should we test next for the leaderboard? Drop your suggestions in the comments or reach out! Together we can expand the leaderboard and build a clearer picture of what companionship in AI really looks like.

Paper: INTIMA: A Benchmark for Human-AI Companionship Behavior (2508.09998)
INTIMA Benchmark: AI-companionship/INTIMA
sergiopaniego 
posted an update 6 days ago
It's now possible to do end-to-end ML without leaving the @huggingface Hub, by combining TRL + HF Jobs + Trackio!!

🐡We just released a full guide explaining the process.

Go check it out!

📖 Guide: https://huggingface.co/docs/trl/main/en/jobs_training

💡 Reminder: HF Jobs is only available for Pro, Team, or Enterprise plans. Yet another reason to upgrade
AdinaY 
posted an update 7 days ago
🇨🇳 China’s State Council just released its “AI+” Action Plan (2025)

“The State Council’s Guidance on Deepened Implementation of the ‘AI+’ Strategy”
zh-ai-community/china-ai-policy-research

✨Goal: By 2035, AI will deeply empower all sectors, reshape productivity & society

✨Focus on 6 pillars:
>Science & Tech
>Industry
>Consumption
>Public welfare
>Governance
>Global cooperation

✨Highlights:
>Models: advance theory, efficient training/inference, evaluation system
>Data: high-quality datasets, IP/copyright reform, new incentives
>Compute: boost chips & clusters, improve national network, promote cloud standardization, and ensure inclusive, efficient, green, secure supply.
>Applications: AI-as-a-service, test bases, new standards
>Open-source: support communities, encourage contributions (incl. university credits & recognition), foster new application approaches, and build globally impactful ecosystems 👀
>Talent, policy & safety frameworks to secure sustainable growth
AdinaY 
posted an update 7 days ago
MiniCPM-V 4.5 🚀 New MLLM for image, multi-image & video understanding, running even on your phone, released by OpenBMB

openbmb/MiniCPM-V-4_5

✨ SOTA vision language capability
✨ 96× video token compression > high-FPS & long video reasoning
✨ Switchable fast vs deep thinking modes
✨ Strong OCR, document parsing, supports 30+ languages
AdinaY 
posted an update 7 days ago
InternVL3.5 🔥 New family of multimodal models by Shanghai AI Lab

OpenGVLab/internvl35-68ac87bd52ebe953485927fb

✨ 1B · 2B · 4B · 8B · 14B · 38B | MoE → 20B-A4B · 30B-A3B · 241B-A28B 📄Apache 2.0
✨ +16% reasoning performance, 4.05× speedup vs InternVL3
✨ Cascade RL (offline + online) : stronger reasoning
✨ ViR: efficient visual token routing
✨ DvD: scalable vision–language deployment
✨ Supports GUI & embodied agency 🤖
Xenova 
posted an update 11 days ago
Okay this is insane... WebGPU-accelerated semantic video tracking, powered by DINOv3 and Transformers.js! 🤯
Demo (+ source code): webml-community/DINOv3-video-tracking

This will revolutionize AI-powered video editors... which can now run 100% locally in your browser, no server inference required (costs $0)! 😍

How does it work? 🤔
1️⃣ Generate and cache image features for each frame
2️⃣ Create a list of embeddings for selected patch(es)
3️⃣ Compute cosine similarity between each patch and the selected patch(es)
4️⃣ Highlight those whose score is above some threshold

... et voilà! 🥳
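The matching in steps 3–4 boils down to a cosine-similarity threshold over patch embeddings. A minimal sketch in plain JavaScript, with illustrative function names and toy arrays standing in for real DINOv3 patch features (no Transformers.js calls shown):

```javascript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Step 3–4: score every patch embedding against the selected patch's
// embedding and return the indices of patches above the threshold
// (the patches to highlight in the frame).
function matchPatches(patchEmbeddings, selectedEmbedding, threshold = 0.8) {
  return patchEmbeddings
    .map((emb, i) => ({ i, score: cosineSimilarity(emb, selectedEmbedding) }))
    .filter(({ score }) => score > threshold)
    .map(({ i }) => i);
}

// Toy example: patches 0 and 1 point the same way as the selection, patch 2 does not.
const selected = [1, 0];
const patches = [[1, 0], [0.9, 0.1], [0, 1]];
console.log(matchPatches(patches, selected, 0.8)); // → [0, 1]
```

In the real demo the embeddings come from the cached per-frame DINOv3 features (step 1), and selections across frames (step 2) simply add more reference embeddings to compare against.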

You can also make selections across frames to improve temporal consistency! This is super useful if the object changes its appearance slightly throughout the video.

Excited to see what the community builds with it!
AdinaY 
posted an update 12 days ago
Seed-OSS 🔥 The latest open LLM from the ByteDance Seed team

ByteDance-Seed/seed-oss-68a609f4201e788db05b5dcd

✨ 36B - Base & Instruct
✨ Apache 2.0
✨ Native 512K long context
✨ Strong reasoning & agentic intelligence
✨ 2 Base versions: with & without synthetic data
AdinaY 
posted an update 13 days ago