---
license: apache-2.0
tags:
- marlin
- gptq
- pruned
---
A Mistral-7B model pruned to 50% sparsity (pruned50) and quantized with AutoGPTQ for the Marlin kernel.

Please see my tutorial on how to run this model:

https://vilsonrodrigues.medium.com/sparse-quantize-and-serving-llms-with-neuralmagic-autogptq-and-vllm-03961b72ec3a
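As a quick starting point, here is a minimal sketch of serving the model with vLLM's Marlin quantization backend. The repo id below is a placeholder assumption; replace it with this model's actual Hugging Face Hub id, and see the tutorial above for full setup details.

```python
# Hypothetical sketch: loading a Marlin-quantized GPTQ model with vLLM.
# Requires a CUDA GPU with compute capability >= 8.0 for the Marlin kernel.
from vllm import LLM, SamplingParams

# Placeholder repo id -- substitute the actual model id from the Hub.
llm = LLM(
    model="<your-username>/Mistral-7B-pruned50-GPTQ-Marlin",
    quantization="marlin",
)

params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["Explain sparse quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```

Running this downloads the weights from the Hub on first use; the Marlin kernel then accelerates the INT4 GPTQ matmuls at inference time.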