Submitted by akhaliq 64 Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context · 671 authors 5
Submitted by akhaliq 44 ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment · 6 authors 2
Submitted by akhaliq 25 Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks · 14 authors 1
Submitted by akhaliq 24 CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion · 9 authors 3
Submitted by akhaliq 22 CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model · 9 authors 2
Submitted by akhaliq 21 VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models · 8 authors 1