LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models Paper • 2310.08659 • Published Oct 12, 2023 • 25
Article: Fine-tuning LLMs to 1.58bit: extreme quantization made easy • Sep 18, 2024 • 217
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Paper • 2405.18392 • Published May 28, 2024 • 12
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 96
Meta Llama 3 Collection • This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 711
Article: Introducing Idefics2: A Powerful 8B Vision-Language Model for the community • Apr 15, 2024 • 173
Article: Overview of natively supported quantization schemes in 🤗 Transformers • Sep 12, 2023 • 11
Article: A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes • Aug 17, 2022 • 69
Article: Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA • May 24, 2023 • 112
Article: Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval • Mar 22, 2024 • 71