Jaemin Cho

j-min

https://j-min.io

AI & ML interests

None yet

Recent Activity

New activity in j-min/IterInpaint-CLEVR 7 days ago

Adding `safetensors` variant of this model

#2 opened 4 months ago by

New activity in j-min/reco_sd14_coco 7 days ago

Adding `safetensors` variant of this model

#1 opened 4 months ago by

Commented on 2 papers 5 months ago

RotBench: Evaluating Multimodal Large Language Models on Identifying Image Rotation

Paper • 2508.13968 • Published Aug 19, 2025 • 1

Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

Paper • 2508.05954 • Published Aug 8, 2025 • 6

New activity in j-min/PaintSkills 6 months ago

Upload train images (in zip files)

#8 opened 6 months ago by

Upload count/train_images

#7 opened 7 months ago by

Commented on a paper 6 months ago

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17, 2025 • 57

New activity in j-min/PaintSkills 7 months ago

Upload spatial/val_images

#5 opened 7 months ago by

Upload object/val_images

#4 opened 7 months ago by

Upload count/val_images

#3 opened 7 months ago by

Commented on 2 papers 7 months ago

Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning

Paper • 2506.03525 • Published Jun 4, 2025 • 6

EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance

Paper • 2505.21876 • Published May 28, 2025 • 9

New activity in j-min/vicuna-13b-v0-merged 8 months ago

Adding `safetensors` variant of this model

#2 opened 8 months ago by

Commented on a paper 8 months ago

CAPTURe: Evaluating Spatial Reasoning in Vision Language Models via Occluded Object Counting

Paper • 2504.15485 • Published Apr 21, 2025 • 4

Commented on a paper 11 months ago

M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding

Paper • 2411.04952 • Published Nov 7, 2024 • 29

New activity in j-min/layoutbench 12 months ago

[bot] Conversion to Parquet

#1 opened about 1 year ago by

parquet-converter

Commented on 3 papers about 1 year ago

VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

Paper • 2411.15115 • Published Nov 22, 2024 • 9

VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

Paper • 2411.15115 • Published Nov 22, 2024 • 9

M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding

Paper • 2411.04952 • Published Nov 7, 2024 • 29

New activity in j-min/reco_sd14_laion about 1 year ago

customized dataset

#1 opened over 1 year ago by