Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference • 23 days ago • 63
Article Memory-efficient Diffusion Transformers with Quanto and Diffusers • Jul 30, 2024 • 63
What matters when building vision-language models? Paper • 2405.02246 • Published May 3, 2024 • 102