Dhruv Diddi PRO

ddiddi

AI & ML interests

None yet

Recent Activity

liked a model about 13 hours ago

bartowski/Mistral-Small-24B-Instruct-2501-GGUF

upvoted a paper 7 days ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

upvoted a paper 7 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

ddiddi's activity

upvoted 4 papers 7 days ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published 8 days ago • 75

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 9 days ago • 275

GPS as a Control Signal for Image Generation

Paper • 2501.12390 • Published 9 days ago • 12

Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback

Paper • 2501.12895 • Published 9 days ago • 51

upvoted a paper about 1 month ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 139

upvoted 7 papers about 2 months ago

Large Action Models: From Inception to Implementation

Paper • 2412.10047 • Published Dec 13, 2024 • 33

PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations

Paper • 2412.05994 • Published Dec 8, 2024 • 18

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published Dec 12, 2024 • 93

MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views

Paper • 2412.06767 • Published Dec 9, 2024 • 6

upvoted 4 papers 2 months ago

UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing

Paper • 2411.16781 • Published Nov 25, 2024 • 10

AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset

Paper • 2411.15640 • Published Nov 23, 2024 • 4

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published Nov 26, 2024 • 49

Material Anything: Generating Materials for Any 3D Object via Diffusion

Paper • 2411.15138 • Published Nov 22, 2024 • 42

upvoted 4 papers 3 months ago

HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems

Paper • 2411.02959 • Published Nov 5, 2024 • 66

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22, 2024 • 89

Can Knowledge Editing Really Correct Hallucinations?

Paper • 2410.16251 • Published Oct 21, 2024 • 54

ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting

Paper • 2410.17856 • Published Oct 23, 2024 • 49