philschmid
/

codegen-6B-mono-sharded-bnb

philschmid HF staff

added pipeline

17e379b about 2 years ago


	---
	license: bsd-3-clause
	tags:
	- endpoints-template
	pipeline_tag: text-generation
	---
	# Sharded fork of [Salesforce/codegen-6B-mono](https://huggingface.co/Salesforce/codegen-6B-mono) with a custom pipeline.py

	This repository implements a custom `pipeline` for the `text-generation` task on 🤗 Inference Endpoints, running LLM inference with bitsandbytes quantization. The code for the custom pipeline is in [pipeline.py](https://huggingface.co/philschmid/codegen-6B-mono-sharded-bnb/blob/main/pipeline.py).

	There is also a [notebook](https://huggingface.co/philschmid/codegen-6B-mono-sharded-bnb/blob/main/create_handler.ipynb) included.

	### Expected request payload
	```json
	{
	  "inputs": "# load distilbert model and initialize text-classification pipeline\nmodel_id = 'distil",
	  "parameters": {
	    "top_k": 100,
	    "max_length": 64,
	    "early_stopping": true,
	    "do_sample": true,
	    "eos_token_id": 50256
	  }
	}
	```
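As a minimal sketch, the same payload can be built in Python and serialized with the standard `json` module, which writes booleans as lowercase `true` and never emits a trailing comma. The `build_payload` helper is a hypothetical name used only for illustration; it is not part of this repository.

```python
import json


def build_payload(prompt: str, **parameters) -> dict:
    """Hypothetical helper: wrap a prompt and generation parameters
    in the request shape shown above."""
    return {"inputs": prompt, "parameters": parameters}


payload = build_payload(
    "# load distilbert model and initialize text-classification pipeline\nmodel_id = 'distil",
    top_k=100,
    max_length=64,
    early_stopping=True,
    do_sample=True,
    eos_token_id=50256,
)

# Serialize to the JSON body the endpoint expects.
body = json.dumps(payload, indent=2)
print(body)
```

Passing the dictionary through `json=payload` in `requests` performs the same serialization, so the explicit `json.dumps` call is mainly useful for inspecting or logging the body.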

	Below is an example of how to send a request using Python and `requests`.

	## Run Request
	```python
	import requests as r

	# Fill in your endpoint URL and a Hugging Face access token.
	ENDPOINT_URL = ""
	HF_TOKEN = ""

	# Generation parameters sent along with every request.
	parameters = {
	    "top_k": 100,
	    "max_length": 64,
	    "early_stopping": True,
	    "do_sample": True,
	    "eos_token_id": 50256,
	}


	def predict(code_snippet: str = None):
	    # Send the prompt and the generation parameters to the endpoint.
	    payload = {"inputs": code_snippet, "parameters": parameters}
	    response = r.post(
	        ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload
	    )
	    return response.json()


	prediction = predict(
	    code_snippet="# load distilbert model and initialize text-classification pipeline\nmodel_id = 'distil"
	)
	```
	Expected output:
	```python
	{'generated_text': "# load distilbert model and initialize text-classification pipeline\nmodel_id = 'distilbert-base-uncased'\nmodel_url = 'https://tfhub.dev/tensorflow/small_bert/1'\n\nmodel_dir = './distilBERT'"}
	```
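The expected output above repeats the prompt at the start of `generated_text`. If only the newly generated code is needed, the prompt can be stripped off on the client side. The sketch below assumes the response is a dictionary with a `generated_text` key, as in the example; `extract_completion` is a hypothetical helper, not part of this repository.

```python
def extract_completion(prompt: str, response: dict) -> str:
    """Hypothetical helper: return only the text generated after the prompt."""
    text = response["generated_text"]
    # Assumption: the endpoint echoes the prompt at the start of generated_text.
    # If it does not, return the full text unchanged.
    return text[len(prompt):] if text.startswith(prompt) else text


prompt = "# load distilbert model and initialize text-classification pipeline\nmodel_id = 'distil"

# The response copied from the expected output above.
response = {
    "generated_text": "# load distilbert model and initialize text-classification pipeline\nmodel_id = 'distilbert-base-uncased'\nmodel_url = 'https://tfhub.dev/tensorflow/small_bert/1'\n\nmodel_dir = './distilBERT'"
}

completion = extract_completion(prompt, response)
print(completion)
```

Because `do_sample` is enabled in the request parameters, the generated text will generally differ between calls, so the completion shown in the example should not be expected verbatim.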