Rui Zhao

ruizhaocv

https://ruizhaocv.github.io/

AI & ML interests

Multimodal and GenAI

ruizhaocv's activity

upvoted 2 papers 1 day ago

FlowTok: Flowing Seamlessly Across Text and Image Tokens

Paper • 2503.10772 • Published 4 days ago • 15

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Paper • 2503.11647 • Published 3 days ago • 90

upvoted a paper 4 days ago

CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance

Paper • 2503.10391 • Published 5 days ago • 10

upvoted 2 papers 5 days ago

TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published 5 days ago • 41

Tuning-Free Multi-Event Long Video Generation via Synchronized Coupled Sampling

Paper • 2503.08605 • Published 7 days ago • 23

liked a model 7 days ago

tencent/HunyuanVideo-I2V

Image-to-Video • Updated 5 days ago • 2.67k • 264

upvoted a paper 7 days ago

Automated Movie Generation via Multi-Agent CoT Planning

Paper • 2503.07314 • Published 8 days ago • 40

upvoted a paper 12 days ago

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers

Paper • 2503.00865 • Published 16 days ago • 58

authored a paper 12 days ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published 13 days ago • 16

upvoted a paper 12 days ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published 13 days ago • 16

commented on a paper 12 days ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published 13 days ago • 16

upvoted a paper 14 days ago

Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Paper • 2503.01774 • Published 14 days ago • 41

upvoted a paper 22 days ago

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data

Paper • 2502.14397 • Published 26 days ago • 38

liked a model 23 days ago

Comfy-Org/HunyuanVideo_repackaged

Updated 9 days ago • 157

upvoted a paper 25 days ago

Dynamic Concepts Personalization from Single Videos

Paper • 2502.14844 • Published 25 days ago • 16

upvoted 2 papers 26 days ago

AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence

Paper • 2502.13943 • Published 26 days ago • 7

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 26 days ago • 164

upvoted 2 papers 27 days ago

Phantom: Subject-consistent video generation via cross-modal alignment

Paper • 2502.11079 • Published 30 days ago • 52

I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models

Paper • 2502.10458 • Published Feb 12 • 30

liked a model 27 days ago

Skywork/SkyReels-A1

Image-to-Video • Updated 14 days ago • 976 • 48