Triangle104/X-Ray_Alpha-Q4_K_M-GGUF
This model was converted to GGUF format from SicariusSicariiStuff/X-Ray_Alpha using llama.cpp via ggml.ai's GGUF-my-repo space.
Refer to the original model card for more details on the model.
This is a pre-alpha proof-of-concept of a real fully uncensored vision model.
Why do I say "real"? The few vision models we have (Qwen, Llama 3.2) are censored, and their fine-tunes touch only the text portion of the model, since training the vision component is a serious pain.
The only actually trained and uncensored vision model I am aware of is ToriiGate; the rest are just a stock vision encoder paired with a fine-tuned LLM.
Having a fully compliant vision model is a critical step toward democratizing vision capabilities for various tasks, especially image tagging. Tagging is critical both for making LoRAs for image diffusion models and for mass-tagging images to pretrain a diffusion model.
In other words, a fully compliant and accurate vision model will let the open source community easily train LoRAs and even pretrain image diffusion models.
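To make that tagging workflow concrete, here is a minimal batch-captioning sketch. It assumes a llama.cpp build with multimodal support (the llama-mtmd-cli tool) and a matching Gemma 3 vision projector file; the mmproj filename, image folder, and prompt below are placeholders, and this GGUF repo may not ship a projector.

# Hypothetical batch-tagging loop: caption every JPEG in dataset/
# and write the tags next to each image. Assumes llama-mtmd-cli
# (llama.cpp's multimodal CLI) and an mmproj projector file exist locally.
for img in dataset/*.jpg; do
  ./llama-mtmd-cli -m x-ray_alpha-q4_k_m.gguf \
    --mmproj mmproj-f16.gguf \
    --image "$img" \
    -p "List the contents of this image as comma-separated tags." \
    > "${img%.jpg}.txt"
done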
Another important task is content moderation and classification. Many use cases are not black and white: some content that a corporation might label NSFW is allowed in one context but not in another; there is nuance. Today's vision models do not let users decide, as they will straight up refuse to inference any content that Google or some other corporation has decided is not to their liking, which makes these stock models useless in a lot of cases.
What if someone wants to classify art that includes nudity? A nude statue over 1,000 years old displayed in a museum or at the city square is perfectly acceptable; however, a stock vision model will straight up refuse to inference something like that.
It's the same with the many "sensitive" topics that LLMs straight up refuse to answer even though the content is publicly available on Wikipedia. This is an attitude of cynical paternalism: I say cynical because corporations take private data to train their models, and that is "perfectly fine," yet they serve as the arbiters of morality and indirectly preach to us from a position of suggested moral superiority. This gatekeeping hurts innovation badly, vision models especially, as tagging cannot be done by a single person at scale, but a corporation can do it.
Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
brew install llama.cpp
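To verify the install, print the version; recent llama.cpp builds support a --version flag:

llama-cli --version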
Invoke the llama.cpp server or the CLI.
CLI:
llama-cli --hf-repo Triangle104/X-Ray_Alpha-Q4_K_M-GGUF --hf-file x-ray_alpha-q4_k_m.gguf -p "The meaning to life and the universe is"
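For an interactive back-and-forth chat rather than a one-shot completion, recent llama-cli builds also accept a conversation flag (-cnv); a minimal sketch with the same model arguments:

llama-cli --hf-repo Triangle104/X-Ray_Alpha-Q4_K_M-GGUF --hf-file x-ray_alpha-q4_k_m.gguf -cnv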
Server:
llama-server --hf-repo Triangle104/X-Ray_Alpha-Q4_K_M-GGUF --hf-file x-ray_alpha-q4_k_m.gguf -c 2048
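Once the server is running (it listens on port 8080 by default), it exposes an OpenAI-compatible HTTP API. A minimal request sketch with curl; the prompt is illustrative:

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{"messages": [{"role": "user", "content": "Describe this model in one sentence."}], "max_tokens": 64}'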
Note: You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.
Step 1: Clone llama.cpp from GitHub.
git clone https://github.com/ggerganov/llama.cpp
Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag along with other hardware-specific flags (e.g., LLAMA_CUDA=1 for Nvidia GPUs on Linux).
cd llama.cpp && LLAMA_CURL=1 make
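For example, a CUDA-enabled build on an Nvidia Linux machine combines the flags from Step 2:

cd llama.cpp && LLAMA_CURL=1 LLAMA_CUDA=1 make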
Step 3: Run inference through the main binary.
./llama-cli --hf-repo Triangle104/X-Ray_Alpha-Q4_K_M-GGUF --hf-file x-ray_alpha-q4_k_m.gguf -p "The meaning to life and the universe is"
or
./llama-server --hf-repo Triangle104/X-Ray_Alpha-Q4_K_M-GGUF --hf-file x-ray_alpha-q4_k_m.gguf -c 2048
Model tree for Triangle104/X-Ray_Alpha-Q4_K_M-GGUF
Base model: google/gemma-3-4b-pt (this repo is a 4-bit Q4_K_M quantization)