AdinaY posted an update 4 days ago
Step-Audio 2 🔥 A new end-to-end multimodal LLM for audio & speech, released by StepFun

stepfun-ai/step-audio-2-68b003c3a47b273fffaf67a8

✨ Direct raw audio: text & speech, no ASR+LLM+TTS pipeline
✨ High-IQ reasoning: RL + CoT for paralinguistic cues
✨ Multimodal RAG + tool calling
✨ Emotion, timbre, dialect & style control
✨ SOTA on ASR, paralinguistic, speech dialog