Yuxuan Wang

ColorfulAI

https://patrick-tssn.github.io/

patrick-tssn

AI & ML interests

Multimodal Learning

Recent Activity

upvoted a collection 2 months ago

Qwen3-TTS

7 items • Updated Jan 22 • 337

upvoted a paper 3 months ago

BEDA: Belief Estimation as Probabilistic Constraints for Performing Strategic Dialogue Acts

Paper • 2512.24885 • Published Dec 31, 2025 • 5

authored 5 papers 4 months ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published Nov 26, 2025 • 161

Qwen3-Omni Technical Report

Paper • 2509.17765 • Published Sep 22, 2025 • 151

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs

Paper • 2510.10689 • Published Oct 12, 2025 • 47

v-HUB: A Benchmark for Video Humor Understanding from Vision and Sound

Paper • 2509.25773 • Published Sep 30, 2025

Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception

Paper • 2510.12720 • Published Oct 14, 2025 • 2

New activity in bigai-nlco/VideoHallucer 5 months ago

remove duplicate data in temporal.json

#3 opened 5 months ago

upvoted a paper 10 months ago

Discrete Markov Bridge

Paper • 2505.19752 • Published May 26, 2025 • 16

upvoted a paper 11 months ago

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Paper • 2505.13308 • Published May 19, 2025 • 27

authored a paper 11 months ago

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Paper • 2505.13308 • Published May 19, 2025 • 27

updated a dataset 11 months ago

ColorfulAI/MoviePuzzle

Viewer • Updated May 14, 2025 • 1 • 7

published a dataset 11 months ago

ColorfulAI/MoviePuzzle

Viewer • Updated May 14, 2025 • 1 • 7

New activity in ColorfulAI/M4-IT 12 months ago

Update dataset card with OmniMMI information

#1 opened 12 months ago

New activity in bigai-nlco/OmniMMI 12 months ago

Add task category, link to code

#2 opened 12 months ago

New activity in ColorfulAI/M4-Audio-LongVA-7B-Qwen2 12 months ago

Add pipeline tag, library name, paper link and Github link

#1 opened 12 months ago

New activity in ColorfulAI/M4-LongVA-7B-Qwen2 12 months ago

Add pipeline tag, library name, link to paper and project page

#1 opened 12 months ago

authored a paper 12 months ago

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Paper • 2503.22952 • Published Mar 29, 2025 • 17

updated a model 12 months ago

ColorfulAI/OpenOmni-8B-Llama3-Omni

9B • Updated Apr 2, 2025 • 2 • 1

upvoted a paper 12 months ago

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Paper • 2503.22952 • Published Mar 29, 2025 • 17