Submitted by akhaliq 66 Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length · 10 authors
Submitted by akhaliq 30 Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video · 4 authors
Submitted by akhaliq 21 Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model · 4 authors
Submitted by akhaliq 13 HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing · 8 authors
Submitted by akhaliq 12 Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization · 6 authors
Submitted by akhaliq 11 TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models · 6 authors
Submitted by akhaliq 7 CompGS: Efficient 3D Scene Representation via Compressed Gaussian Splatting · 6 authors
Submitted by akhaliq 7 Taming Latent Diffusion Model for Neural Radiance Field Inpainting · 8 authors