|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
pipeline_tag: image-text-to-text |
|
tags: |
|
- llava |
|
- vision |
|
--- |
|
|
|
## llava: computer vision GGUF |
|
|
|
- Clip Handler > mmproj-f16.gguf ([624MB](https://huggingface.co/calcuis/llava-gguf/blob/main/mmproj-f16.gguf)) |
|
- Vision Model > llava-***.gguf (opt anyone you like) |
|
- provide a link of picture as prompt |
|
- it will probably give hints to you for complying your text-to-image prompt (reverse engineering) 🎃 |
|
|
|
## how to use |
|
- compatible with [gguf-connector](https://pypi.org/project/gguf-connector/) (ggc v) 👻 |
|
- see initial test examples (with just Q2_K - [2.53GB](https://huggingface.co/calcuis/llava-gguf/blob/main/llava-q2_k.gguf)) below |
|
|
|
### prompt test#1 |
|
![screenshot](https://raw.githubusercontent.com/calcuis/comfy/master/sd3.5g.png) |
|
prompt picture URL: https://raw.githubusercontent.com/calcuis/comfy/master/sd3.5g.png |
|
|
|
response: "The image features a computer screen displaying a cartoon of a woman with a fox head, sitting at a dining table. The table is set with a cake, a knife, and |
|
a cup, giving the impression that the woman is about to enjoy a meal. The cartoon is displayed in a graphic style, making it visually appealing and |
|
engaging." |
|
|
|
### prompt test#2 |
|
<img src="https://raw.githubusercontent.com/calcuis/suko-puzzle/master/demo.png" width="300" height="500"> |
|
prompt picture URL: https://raw.githubusercontent.com/calcuis/suko-puzzle/master/demo.png |
|
|
|
response: "The image shows a close-up of a cell phone screen, displaying a game board with a puzzle. The game board is filled with numbers, and there are four |
|
different colored circles surrounding the puzzle. The game appears to be a popular one, as the image is taken from the perspective of someone looking at the |
|
screen." |
|
|
|
### prompt test#3 |
|
<img src="https://raw.githubusercontent.com/calcuis/gguf-connector/master/gguf.gif" width="128" height="128"> |
|
prompt picture URL: https://raw.githubusercontent.com/calcuis/gguf-connector/master/gguf.gif |
|
|
|
response: "The image features a green cartoon character with a mouth wide open, appearing to be yawning. The character has a toothy smile and is wearing a pair of |
|
yellow gloves. The overall scene is a colorful and fun representation of the green character." |
|
|
|
feedback from connector: "Awesome!" |