Jaeyoon Jung PRO

lastdefiance20

AI & ML interests

multimodal

lastdefiance20's activity

liked a model 10 minutes ago

HuggingFaceTB/SmolVLM2-500M-Video-Instruct

Video-Text-to-Text • Updated 3 days ago • 1.24k • 25

upvoted a paper 4 days ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 5 days ago • 115

liked a model 6 days ago

perplexity-ai/r1-1776

Updated 6 days ago • 8.06k • 1.68k

upvoted a paper 6 days ago

Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published 7 days ago • 42

liked a dataset 10 days ago

open-thoughts/OpenThoughts-114k

Viewer • Updated 5 days ago • 228k • 106k • 601

upvoted 2 papers 13 days ago

Scaling Pre-training to One Hundred Billion Data for Vision Language Models

Paper • 2502.07617 • Published 14 days ago • 28

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Paper • 2501.08326 • Published Jan 14 • 32

liked a model 15 days ago

ibm-granite/granite-vision-3.1-2b-preview

Image-Text-to-Text • Updated about 9 hours ago • 10.9k • 80

liked a dataset about 1 month ago

DAMO-NLP-SG/multimodal_textbook

Updated Jan 11 • 6.23k • 132

upvoted a paper about 1 month ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 78

liked a dataset about 1 month ago

HumanLLMs/Human-Like-DPO-Dataset

Viewer • Updated Jan 12 • 10.9k • 2.73k • 200

upvoted a paper about 1 month ago

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published Jan 7 • 50

upvoted a paper about 2 months ago

LearnLM: Improving Gemini for Learning

Paper • 2412.16429 • Published Dec 21, 2024 • 22

liked a model about 2 months ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated 23 days ago • 1.12M • 3.4k

upvoted a paper about 2 months ago

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published Dec 16, 2024 • 55

liked a model 2 months ago

ibm-granite/granite-3.1-8b-instruct

Text Generation • Updated 25 days ago • 92.8k • 152

upvoted a paper 2 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 346

liked 2 models 2 months ago

answerdotai/ModernBERT-base

Fill-Mask • Updated Jan 15 • 10.1M • 765

vidore/colpali-v1.3-hf

Visual Document Retrieval • Updated 20 days ago • 1.45k • 23

liked a dataset 2 months ago

BAAI/Infinity-MM

Updated Dec 13, 2024 • 13k • 90