Update README.md
README.md
CHANGED
@@ -1,3 +1,19 @@
----
-license: apache-2.0
----
+---
+license: apache-2.0
+---
+
+
+# [NeurIPS'24] Q-VLM: Post-training Quantization for Large Vision-Language Models
+
+*An efficient and accurate memory-saving method towards W4A4 large multi-modal models.* [[Paper](https://arxiv.org/abs/2410.08119)] [[Code](https://github.com/ChangyuanWang17/QVLM)]
+
+> Q-VLM: Post-training Quantization for Large Vision-Language Models
+> [Changyuan Wang](https://changyuanwang17.github.io), [Ziwei Wang](https://ziweiwangthu.github.io), [Xiuwei Xu](https://xuxw98.github.io/), [Yansong Tang](https://andytang15.github.io), [Jie Zhou](https://scholar.google.com/citations?user=6a79aPwAAAAJ&hl=en&authuser=1), [Jiwen Lu](http://ivg.au.tsinghua.edu.cn/Jiwen_Lu/)
+
+## Finetuning LLaVA Model on ScienceQA Dataset
+
+Thanks to [LLaVA](https://github.com/haotian-liu/LLaVA) for the amazing open-source model!
+
+We combined the [LLaVA-7B-v1.1](https://huggingface.co/liuhaotian/LLaVA-Lightning-7B-delta-v1-1) model with the [projector from LLaVA-7B-v1.3](https://huggingface.co/liuhaotian/llava-pretrain-vicuna-7b-v1.3/tree/main), then finetuned the combined model on the ScienceQA dataset. We use this model to evaluate the effectiveness of our quantization method on ScienceQA.
+
+