Same as h2oai/h2ogpt-16k-codellama-34b-instruct, but with config.json modified to be 32k for embeddings. The model still functions fine as a 16k model, and the change allows stretching to 32k in vLLM, which otherwise cannot modify the maximum sequence length.
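As a rough illustration of this kind of config.json change, here is a minimal sketch. It assumes the modified field is `max_position_embeddings` (the usual Hugging Face name for the position-embedding limit in Llama-style configs) and that 32k means 32768; the card itself does not name the key or the exact value. The helper `set_max_positions` is hypothetical and not part of any library.

```python
import json
from pathlib import Path


def set_max_positions(config_path, max_positions=32768):
    """Overwrite the position-embedding limit in a local config.json.

    Assumption: the relevant key is `max_position_embeddings`; the model
    card only says the config was "modified to be 32k for embeddings".
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    # All other fields are preserved; only this one value changes.
    config["max_position_embeddings"] = max_positions
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `set_max_positions("path/to/config.json")` on a local copy of the 16k model's config would produce a file equivalent in spirit to the one shipped here; the model weights are untouched.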