AI & ML interests

Low-bit Quantization of Large Language Models (LLMs)


Welcome to the official Hugging Face organization for LLMQ. Here you can find LLMs quantized with cutting-edge quantization methods. To get started, browse the models hosted here and select the one that best fits your use case, as sketched below.
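For example, a quantized checkpoint from this organization can usually be loaded directly with the Transformers library. The minimal sketch below assumes a GPTQ-for-Qwen3 repository hosted under this organization and that the GPTQ runtime dependencies (e.g. `optimum` plus `auto-gptq` or `gptqmodel`, and `accelerate` for automatic device placement) are installed; adapt the repo id to the model you select.

```python
# Minimal sketch: load a quantized checkpoint from this organization with
# Hugging Face Transformers. The repo id below is an example; swap in the
# model that fits your use case.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Efficient-ML/GPTQ-for-Qwen3"  # example quantized model repo (adjust as needed)

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# device_map="auto" requires `accelerate`; GPTQ kernels typically need a GPU.
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

prompt = "Low-bit quantization makes large language models"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```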

We are dedicated to advancing the field of Artificial Intelligence with a focus on efficiency. Our primary research interests include quantization, binarization, and efficient learning. We are committed to developing cutting-edge techniques that make large language models (LLMs) more accessible and sustainable, minimizing computational costs while maximizing performance. Our interdisciplinary approach leverages global expertise to push the boundaries of efficient AI technologies.

Recent Works:

[22.04.2024] How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study. arXiv, 2024. [arXiv] [GitHub]
