MIT HAN Lab

university

https://hanlab.mit.edu/

SongHan_MIT

mit-han-lab

AI & ML interests

Efficient algorithms, systems, and hardware for machine learning.

mit-han-lab's activity

Ligeng-Zhu

updated a collection 4 days ago

llama4-family

5 items • Updated 4 days ago

Lmxyy

in mit-han-lab/svdq-int4-flux.1-dev 7 days ago

possible ipadapter support?

#2 opened 7 days ago by reverentelusarca

Lmxyy

updated a model 8 days ago

mit-han-lab/nunchaku

Updated 8 days ago • 30

Lmxyy

updated a dataset 9 days ago

mit-han-lab/nunchaku-test

Viewer • Updated 9 days ago • 440 • 163

Lmxyy

updated a dataset 11 days ago

mit-han-lab/svdquant-datasets

Preview • Updated 11 days ago • 458 • 2

Lmxyy

updated a collection 27 days ago

SVDQuant

Models and datasets for "SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models" • 20 items • Updated 27 days ago • 22

Lmxyy

updated 2 models 27 days ago

mit-han-lab/svdq-int4-flux.1-fill-dev

Image-to-Image • Updated 27 days ago • 41k • 9

mit-han-lab/svdq-int4-flux.1-depth-dev

Image-to-Image • Updated 27 days ago • 3.27k • 2

songhan

authored a paper about 2 months ago

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

Paper • 2502.14866 • Published Feb 20 • 13

synxlin

authored a paper about 2 months ago

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

Paper • 2502.14866 • Published Feb 20 • 13

Guangxuan-Xiao

authored a paper about 2 months ago

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

Paper • 2502.14866 • Published Feb 20 • 13

kentang1998

authored a paper about 2 months ago

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

Paper • 2502.14866 • Published Feb 20 • 13

Shangy

authored a paper about 2 months ago

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Paper • 2306.00978 • Published Jun 1, 2023 • 9