Manan Shah

cs-mshah

https://cs-mshah.github.io/

AI & ML interests

Computer Vision

Organizations

cs-mshah's activity

upvoted a paper about 3 hours ago

Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

Paper • 2503.06749 • Published 5 days ago • 21

liked a model about 3 hours ago

ali-vilab/ACE_Plus

Updated 2 days ago • 329 • 165

liked a model 9 days ago

chandar-lab/NeoBERT

Feature Extraction • Updated 11 days ago • 4.21k • 94

liked a dataset 16 days ago

allenai/CoSyn-400K

Viewer • Updated 14 days ago • 408k • 3.79k • 5

upvoted an article 21 days ago

SmolVLM2: Bringing Video Understanding to Every Device

Article • 22 days ago • 205

liked 2 models 4 months ago

amphion/MaskGCT

Text-to-Speech • Updated Dec 22, 2024 • 172 • 276

NimVideo/cogvideox-2b-img2vid

Image-to-Video • Updated Oct 28, 2024 • 397 • 76

updated a dataset 4 months ago

cs-mshah/SynMirror

Preview • Updated Nov 5, 2024 • 220 • 1

liked a Space 6 months ago

114

Image Matching Webui

🤗

Find similar images by uploading a photo

authored a paper 6 months ago

Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections

Paper • 2409.14677 • Published Sep 23, 2024 • 16

upvoted a paper 6 months ago

Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections

Paper • 2409.14677 • Published Sep 23, 2024 • 16

commented 2 papers 6 months ago

Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections

Paper • 2409.14677 • Published Sep 23, 2024 • 16

Reflecting Reality: Enabling Diffusion Models to Produce Faithful Mirror Reflections

Paper • 2409.14677 • Published Sep 23, 2024 • 16

liked a Space 6 months ago

Research Tracker

🚀

liked a model 6 months ago

XLabs-AI/flux-lora-collection

Text-to-Image • Updated Aug 14, 2024 • 525

updated a model 6 months ago

cs-mshah/florence_ft

Text Generation • Updated Sep 14, 2024 • 15

upvoted 2 papers 7 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 126

BRAT: Bonus oRthogonAl Token for Architecture Agnostic Textual Inversion

Paper • 2408.04785 • Published Aug 8, 2024 • 9

New activity in tiange/Cap3D 7 months ago

Captions for full ABO dataset

#17 opened 8 months ago by cs-mshah

upvoted a collection 8 months ago

Perturbed Attention Guidance pipelines

Collection

Pipelines for Perturbed Attention Guidance with 🧨 library • 8 items • Updated Jun 26, 2024 • 6