city96 committed on
Commit 6bc05ae · verified · 1 Parent(s): 9b95ba3

Create README.md

Files changed (1)
  1. README.md +17 -0
README.md ADDED
@@ -0,0 +1,17 @@
+ ---
+ base_model: xtuner/llava-llama-3-8b-v1_1-transformers
+ library_name: gguf
+ quantized_by: city96
+ tags:
+ - image-text-to-text
+ ---
+
+ This is an imatrix GGUF conversion of [xtuner/llava-llama-3-8b-v1_1-transformers](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers).
+
+ It is mainly intended for use as the text encoder for Hunyuan Video, but it can also be used for vision tasks together with the [mmproj](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-gguf/blob/main/llava-llama-3-8b-v1_1-mmproj-f16.gguf) file from the xtuner GGUF repository.
+
+ The imatrix dataset is [`calibration_datav3.txt`](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8) by [Bartowski](https://huggingface.co/bartowski); it was used for all quants below Q6_K. It was tested against a wikitext imatrix and against no imatrix, and it outperformed both.
+
+ Note that the `vocab_size` differs between the [transformers](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-transformers) (128 320) and the [hf](https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-hf) (128 256) repositories. This conversion uses the former, since that is what the official Hunyuan Video code uses.
+
+ *IQ quants will be slow in ComfyUI because they rely on a numpy fallback.*
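
The `vocab_size` note in the README can be turned into a quick sanity check. The following is a minimal sketch, not part of the commit: the names `VOCAB_SIZES`, `identify_variant`, and `extra_rows` are hypothetical, and the only data used are the two sizes quoted in the README. It assumes you have already obtained the number of rows in a model's token embedding by some other means.

```python
# vocab_size values as quoted in the README (nothing else is assumed).
VOCAB_SIZES = {
    "transformers": 128320,  # xtuner/llava-llama-3-8b-v1_1-transformers
    "hf": 128256,            # xtuner/llava-llama-3-8b-v1_1-hf
}


def identify_variant(embedding_rows: int) -> str:
    """Return which upstream repository a token-embedding row count matches.

    Hypothetical helper: raises ValueError if the count matches neither.
    """
    for name, size in VOCAB_SIZES.items():
        if embedding_rows == size:
            return name
    raise ValueError(f"unexpected vocab size: {embedding_rows}")


def extra_rows() -> int:
    """Number of additional vocabulary rows in the transformers variant."""
    return VOCAB_SIZES["transformers"] - VOCAB_SIZES["hf"]


if __name__ == "__main__":
    print(identify_variant(128320))
    print(extra_rows())
```

A check like this can catch the case where a file built from the hf repository is loaded where the transformers variant is expected.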