Haotong Qin

HaotongQin
AI & ML interests

Model Compression, Efficient AIGC

Organizations

Efficient Intelligence and Systems

HaotongQin's activity

posted an update 10 months ago
We release an empirical study to showcase "How Good Are Low-bit Quantized #LLaMA3 🦙 Models" with existing LLM quantization techniques!

In this study, the low-bit LLaMA3 models (especially LLaMA3-70B) perform impressively well. 🚀 However, the results also expose significant performance degradation in existing quantization techniques when applied to LLaMA3, especially at ultra-low bit-widths.
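To give a rough feel for why lower bit-widths are harder, here is a minimal sketch of symmetric round-to-nearest quantization on synthetic weights. This is an illustrative baseline only, not necessarily one of the methods evaluated in the study; the function names `quantize_rtn` and `mse`, the Gaussian weight distribution, and the per-tensor scale are all assumptions made for the example.

```python
import random


def quantize_rtn(weights, bits):
    """Symmetric per-tensor round-to-nearest: quantize, then dequantize.

    Hypothetical helper for illustration; real LLM quantizers typically use
    per-channel or per-group scales and calibration data.
    """
    qmax = 2 ** (bits - 1) - 1                # largest positive integer level
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    dequantized = []
    for w in weights:
        q = round(w / scale)                  # nearest integer level
        q = max(-qmax - 1, min(qmax, q))      # clamp to the signed range
        dequantized.append(q * scale)
    return dequantized


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Synthetic stand-in for a weight tensor (assumption: zero-mean Gaussian).
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

errors = {bits: mse(weights, quantize_rtn(weights, bits)) for bits in (8, 4, 3, 2)}
for bits, err in errors.items():
    print(f"{bits}-bit reconstruction MSE: {err:.3e}")
```

On this toy tensor the reconstruction error grows as the bit-width shrinks, because fewer levels have to cover the same weight range. Reconstruction error on random weights is only a loose proxy for the model-level accuracy that the study measures.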

We hope this study can serve as a reference for the LLM quantization community and promote the emergence of stronger LLM quantization methods in the context of LLaMA3's release. More work is on the way...

How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study (2404.14047)

https://huggingface.co/collections/LLMQ/llama3-quantization-66251258525135aeda16513c