AI & ML interests

Low-bit Quantization of Large Language Models (LLMs)


Welcome to the official Hugging Face organization for LLMQ. Here you can find LLMs quantized with cutting-edge quantization methods. To get started, browse the models hosted here and select the one that best fits your use case, as sketched below.
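For example, a quantized checkpoint from this organization can usually be loaded directly with the Transformers library. The minimal sketch below assumes a GPTQ-for-Qwen3 repository hosted under this organization and that the GPTQ runtime dependencies (e.g. `optimum` plus `auto-gptq` or `gptqmodel`, and `accelerate` for automatic device placement) are installed; adapt the repo id to the model you select.

```python
# Minimal sketch: load a quantized checkpoint from this organization with
# Hugging Face Transformers. The repo id below is an example; swap in the
# model that fits your use case.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Efficient-ML/GPTQ-for-Qwen3"  # example quantized model repo (adjust as needed)

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# device_map="auto" requires `accelerate`; GPTQ kernels typically need a GPU.
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

prompt = "Low-bit quantization makes large language models"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```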

We are dedicated to advancing the field of Artificial Intelligence with a focus on efficiency. Our primary research interests include quantization, binarization, and efficient learning. We are committed to developing cutting-edge techniques that make large language models (LLMs) more accessible and sustainable, minimizing computational costs while maximizing performance. Our interdisciplinary approach leverages global expertise to push the boundaries of efficient AI technologies.

Recent Works:

[22.04.2024] How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study. arXiv, 2024. [arXiv] [GitHub]
