We are releasing an empirical study, "How Good Are Low-bit Quantized #LLaMA3 🦙 Models?", evaluating existing LLM quantization techniques!
In this study, the low-bit LLaMA3 models (especially LLaMA3-70B) deliver impressively strong performance. 🚀 However, the results also expose significant degradation with existing quantization techniques on LLaMA3, especially at ultra-low bit-widths.
We hope this study serves as a reference for the LLM quantization community and spurs the development of stronger LLM quantization methods in the wake of LLaMA3's release. More work is on the way...