LLaMA 33B fine-tuned on wikitext_document_level
with a linear RoPE scaling factor of 8, for a 16k-token context length.
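A minimal sketch of loading the model with the matching linear RoPE scaling via the transformers `rope_scaling` option; the repository id below is a placeholder, not taken from this card.

```python
# Load with linear RoPE scaling (factor 8 -> roughly 16k-token context).
# "llama33b-16k" is a placeholder repo id; substitute the actual one.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "llama33b-16k"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    rope_scaling={"type": "linear", "factor": 8.0},  # stretch position ids 8x
    torch_dtype="auto",
    device_map="auto",
)
```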
This is a merged version of llama33b-16k-qlora.
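A rough sketch of how such a merged checkpoint can be produced from the llama33b-16k-qlora adapter with peft's `merge_and_unload()`; the base-model and output paths below are assumptions for illustration only.

```python
# Merge the QLoRA adapter weights into the base model and save a standalone checkpoint.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-30b",  # assumed base checkpoint for LLaMA 33B
    torch_dtype=torch.float16,
)
merged = PeftModel.from_pretrained(base, "llama33b-16k-qlora").merge_and_unload()
merged.save_pretrained("llama33b-16k-merged")  # assumed output directory
```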
Note that this is not an instruct model - this is base LLaMA with an extended sequence length.