Mu Cai

mucai

https://pages.cs.wisc.edu/~mucai/

AI & ML interests

Computer Vision, Deep Learning, 3D Vision, Vision and Language

mucai's activity

upvoted a paper 20 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 28 days ago • 107

authored a paper 4 months ago

TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models

Paper • 2410.10818 • Published Oct 14, 2024 • 17

commented on a paper 4 months ago

TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models

Paper • 2410.10818 • Published Oct 14, 2024 • 17

liked a dataset 4 months ago

microsoft/TemporalBench

Viewer • Updated Nov 7, 2024 • 27.1k • 366 • 11

authored a paper 5 months ago

Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos

Paper • 2410.02763 • Published Oct 3, 2024 • 7

upvoted a paper 5 months ago

Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos

Paper • 2410.02763 • Published Oct 3, 2024 • 7

commented on a paper 5 months ago

Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos

Paper • 2410.02763 • Published Oct 3, 2024 • 7

New activity in mucai/matryoshka-multimodal-models 7 months ago

Apply for community grant: Academic project (gpu and storage)

#1 opened 7 months ago by mucai

updated a Space 7 months ago

Matryoshka Multimodal Models

🐨

updated 2 collections 7 months ago

Matryoshka Multimodal Models

Collection

3 items • Updated Aug 4, 2024 • 3

ViP-LLaVA

Collection

2 items • Updated Aug 4, 2024 • 2

upvoted a paper 7 months ago

Matryoshka Multimodal Models

Paper • 2405.17430 • Published May 27, 2024 • 31

upvoted 2 papers 8 months ago

FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

Paper • 2407.04051 • Published Jul 4, 2024 • 36

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy

Paper • 2406.20095 • Published Jun 28, 2024 • 18

authored a paper 8 months ago

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy

Paper • 2406.20095 • Published Jun 28, 2024 • 18

updated a Space 9 months ago

ViP-Bench Evaluator

🐨

New activity in mucai/llava-next-vicuna-7b-m3 9 months ago

Add paper link to connect the model to its paper on Daily Paper page

#1 opened 9 months ago by AdinaY

New activity in mucai/llava-v1.5-7b-m3 9 months ago

Add metadata and paper link

#1 opened 9 months ago by AdinaY

upvoted a collection 9 months ago

Matryoshka Multimodal Models

Collection

3 items • Updated Aug 4, 2024 • 3