This is an upload of Airoboros-13B-SuperHOT as a 4-bit GPTQ model, converted using GPTQ-for-LLaMa. The source model is https://huggingface.co/Peeepy/Airoboros-13b-SuperHOT-8k.
It uses the Airoboros-13B (v1.2) model with the SuperHOT 8K LoRA applied on top, which improves coherence at larger context lengths and makes Airoboros's output more verbose.
You will need a monkey patch at inference time to use the 8k context; see the patch file in this repo. If you are using a different inference engine (such as llama.cpp or exllama), you will need to add the monkey patch there.
Note: If you are using exllama, the monkey patch is built into the engine. Use -cpe to set the scaling factor; for example, to run at 4k context, pass -cpe 2 -l 4096.
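The 4k example above suggests the scaling factor is the target context divided by the base model's context. A minimal sketch of that arithmetic follows, assuming a 2048-token base context for LLaMA; the helper names `compression_factor` and `exllama_args` are hypothetical and not part of exllama.

```python
# Assumption: the base LLaMA model was trained with a 2048-token context.
BASE_CONTEXT = 2048


def compression_factor(target_context: int, base_context: int = BASE_CONTEXT) -> float:
    """Return the positional scaling factor for a target context length."""
    if target_context <= 0 or base_context <= 0:
        raise ValueError("context lengths must be positive")
    # Contexts at or below the base length need no compression.
    return max(1.0, target_context / base_context)


def exllama_args(target_context: int) -> list[str]:
    """Build the -cpe / -l arguments mentioned in the note above (hypothetical helper)."""
    factor = compression_factor(target_context)
    return ["-cpe", f"{factor:g}", "-l", str(target_context)]


print(exllama_args(4096))  # ['-cpe', '2', '-l', '4096']
print(exllama_args(8192))  # ['-cpe', '4', '-l', '8192']
```

Under this assumption, running at the full 8k context would correspond to a factor of 4.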
The patch file is in this repo and is also available here: https://huggingface.co/kaiokendev/superhot-13b-8k-no-rlhf-test/raw/main/llama_rope_scaled_monkey_patch.py
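For readers adding the patch to another engine, the idea behind scaled rotary embeddings can be illustrated in plain Python. This is a sketch of the general technique (compressing position indices before computing rotary angles), not the contents of the patch file; the function name `rotary_angles`, the head dimension of 128, and the scale of 0.25 (2048 / 8192) are assumptions made for illustration.

```python
def rotary_angles(position: int, dim: int, base: float = 10000.0, scale: float = 1.0) -> list[float]:
    """Rotary embedding angles for one position.

    A scale below 1 compresses positions so that a longer sequence maps
    into the position range the base model saw during training.
    """
    # One inverse frequency per pair of dimensions, as in standard RoPE.
    inv_freq = [1.0 / (base ** (i / dim)) for i in range(0, dim, 2)]
    scaled_position = position * scale
    return [scaled_position * f for f in inv_freq]


# Assumption: 8k context on a 2048-token base model gives scale = 0.25.
long_ctx = rotary_angles(8188, dim=128, scale=0.25)
base_ctx = rotary_angles(2047, dim=128)

# Position 8188 at scale 0.25 lands on the same angles as position 2047 unscaled.
print(long_ctx == base_ctx)  # True
```

With this scaling, every position in an 8k sequence falls within the angle range of the original 2048 positions, which is why the same factor has to be applied in whichever inference engine is used.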