Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2401.02330

The Evolution of Multimodal Model Architectures

Paper • 2405.17927 • Published May 28, 2024 • 1
What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 102
Efficient Architectures for High Resolution Vision-Language Models

Paper • 2501.02584 • Published Jan 5
Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 126

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Paper • 2502.09620 • Published 28 days ago • 25
The Evolution of Multimodal Model Architectures

Paper • 2405.17927 • Published May 28, 2024 • 1
What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 102
Efficient Architectures for High Resolution Vision-Language Models

Paper • 2501.02584 • Published Jan 5

Small Multimodal Models

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 142
LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model

Paper • 2401.02330 • Published Jan 4, 2024 • 17
Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 87
Visual Instruction Tuning

Paper • 2304.08485 • Published Apr 17, 2023 • 13

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 147
ReFT: Reasoning with Reinforced Fine-Tuning

Paper • 2401.08967 • Published Jan 17, 2024 • 30
Tuning Language Models by Proxy

Paper • 2401.08565 • Published Jan 16, 2024 • 23
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 69

model architecture

Mixtral of Experts

Paper • 2401.04088 • Published Jan 8, 2024 • 158
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

Paper • 2401.04081 • Published Jan 8, 2024 • 71
TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 93
LLaMA Pro: Progressive LLaMA with Block Expansion

Paper • 2401.02415 • Published Jan 4, 2024 • 53

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 180
COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training

Paper • 2401.00849 • Published Jan 1, 2024 • 17
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 50
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

Paper • 2311.00571 • Published Nov 1, 2023 • 41

LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model

Paper • 2401.02330 • Published Jan 4, 2024 • 17
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 180

small_or_multimodal_llm

LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model

Paper • 2401.02330 • Published Jan 4, 2024 • 17

PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models

Paper • 2312.13964 • Published Dec 21, 2023 • 20
LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 259
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation

Paper • 2312.12491 • Published Dec 19, 2023 • 70
LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model

Paper • 2401.02330 • Published Jan 4, 2024 • 17

Extending Context Window of Large Language Models via Semantic Compression

Paper • 2312.09571 • Published Dec 15, 2023 • 15
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 50
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models

Paper • 2312.02949 • Published Dec 5, 2023 • 14
TinyLLaVA: A Framework of Small-scale Large Multimodal Models

Paper • 2402.14289 • Published Feb 22, 2024 • 20

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs