Submitted by akhaliq 52 Inference Performance Optimization for Large Language Models on CPUs · 10 authors 7
Submitted by akhaliq 41 LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models · 8 authors 3
Submitted by akhaliq 15 VEnhancer: Generative Space-Time Enhancement for Video Generation · 9 authors 1
Submitted by akhaliq 13 Still-Moving: Customized Video Generation without Customized Video Data · 10 authors 2
Submitted by akhaliq 9 Do Vision and Language Models Share Concepts? A Vector Space Alignment Study · 4 authors 3
Submitted by davanstrien 7 CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging · 5 authors 1
Submitted by HikariDawn 4 This&That: Language-Gesture Controlled Video Generation for Robot Planning · 7 authors 1
Submitted by PAlbert31 4 An accurate detection is not all you need to combat label noise in web-noisy datasets · 6 authors 4