nintwentydo committed
Commit c60426f · verified · 1 Parent(s): 2371c2d

Create README.md

Files changed (1)
  1. README.md +33 -0
README.md ADDED
@@ -0,0 +1,33 @@
---
tags:
- vllm
- sparsity
language:
- en
- de
- fr
- it
- pt
- hi
- es
- th
pipeline_tag: image-text-to-text
license: apache-2.0
library_name: vllm
base_model:
- mistral-community/pixtral-12b
- mgoin/pixtral-12b
- mistralai/Pixtral-12B-2409
base_model_relation: quantized
---

# Pixtral-12B-2409: 2:4 sparse

2:4 sparse version of [mistral-community/pixtral-12b](https://huggingface.co/mistral-community/pixtral-12b), sparsified using the [kylesayrs/gptq-partition branch of LLM Compressor](https://github.com/vllm-project/llm-compressor/tree/kylesayrs/gptq-partition) for optimised inference on vLLM.

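For context, the sketch below shows how a 2:4 (semi-structured) sparsity pass is typically run with LLM Compressor's one-shot `SparseGPTModifier` flow. The exact recipe used for this checkpoint is not published here, so the calibration dataset, sample counts, and modifier arguments below are assumptions rather than the actual recipe, and a faithful run on a vision-language model like Pixtral would need image-aware calibration data that this text-only sketch skips.

```python
# Hedged sketch only: NOT the recipe used to build this checkpoint.
# Dataset, sample counts, and sequence length are placeholder assumptions.
from llmcompressor import oneshot  # older releases: from llmcompressor.transformers import oneshot
from llmcompressor.modifiers.obcq import SparseGPTModifier

recipe = SparseGPTModifier(
    sparsity=0.5,          # prune 50% of the weights...
    mask_structure="2:4",  # ...as 2 zeros in every contiguous block of 4
)

oneshot(
    model="mistral-community/pixtral-12b",  # base model named in this card
    dataset="open_platypus",                # text-only calibration set (assumption)
    recipe=recipe,
    output_dir="pixtral-12b-2409-2of4-sparse",
    max_seq_length=2048,
    num_calibration_samples=512,
)
```

The 2:4 pattern is what lets vLLM use semi-structured sparse kernels at inference time, which is the point of this checkpoint.
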
Example vLLM usage:
```
vllm serve nintwentydo/pixtral-12b-2409-2of4-sparse --max-model-len 131072 --limit-mm-per-prompt 'image=4'
```
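
The command above starts vLLM's OpenAI-compatible server (on http://localhost:8000 by default). A minimal client sketch using the `openai` Python package follows; the base URL, dummy API key, and image URL are placeholders, not anything specific to this model.

```python
# Minimal sketch of querying the vLLM OpenAI-compatible server started above.
# Assumes the default endpoint http://localhost:8000/v1; the image URL is a placeholder.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="nintwentydo/pixtral-12b-2409-2of4-sparse",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://example.com/some-image.jpg"}},
            ],
        }
    ],
    max_tokens=128,
)
print(response.choices[0].message.content)
```

Images can also be sent as base64 data URLs, and the `--limit-mm-per-prompt 'image=4'` setting above allows up to four images per prompt.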

If you want a more advanced/fully featured chat template, you can use [this Jinja template](https://raw.githubusercontent.com/nintwentydo/tabbyAPI/refs/heads/main/templates/pixtral12b.jinja).