Instructions for using Primeness/primelive22 with libraries and local apps.

Libraries

How to use Primeness/primelive22 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Primeness/primelive22")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result; a bare call shows nothing when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Primeness/primelive22")
model = AutoModelForCausalLM.from_pretrained("Primeness/primelive22")

Local Apps

vLLM

How to use Primeness/primelive22 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Primeness/primelive22"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Primeness/primelive22",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
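
The same request can be sent from Python. The sketch below uses only the standard library and assumes the server started above is listening on `http://localhost:8000`; the helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(user_message, model="Primeness/primelive22",
                       base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the text of the first returned choice."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible responses put the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

With the server running, `send_chat_request(build_chat_request("What is the capital of France?"))` returns the reply text. Passing `base_url="http://localhost:30000"` should target an SGLang server started on that port in the same way.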

Use Docker

docker model run hf.co/Primeness/primelive22

SGLang

How to use Primeness/primelive22 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Primeness/primelive22" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Primeness/primelive22",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Primeness/primelive22" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Primeness/primelive22",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Primeness/primelive22 with Docker Model Runner:
```
docker model run hf.co/Primeness/primelive22
```

Pygmalion 1.3B

Model description

Pygmalion 1.3B is a proof-of-concept dialogue model based on EleutherAI's pythia-1.3b-deduped.

Warning: This model is NOT suitable for use by minors. It will output X-rated content under certain circumstances.

Training data

The fine-tuning dataset consisted of 56MB of dialogue data gathered from multiple sources, including both real and partially machine-generated conversations.

Training procedure

Fine-tuning was done using ColossalAI (specifically, with a slightly modified version of their OPT fine-tune example) for around 11.4 million tokens over 5440 steps on a single 24GB GPU. The run took just under 21 hours.
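
For a rough sense of scale, the figures above can be turned into per-step numbers. This is back-of-the-envelope arithmetic only: the token count is approximate, and 21 hours is taken as an upper bound since the run finished just under that.

```python
# Figures quoted in the training procedure above
TOTAL_TOKENS = 11.4e6   # "around 11.4 million tokens"
TOTAL_STEPS = 5440
RUN_HOURS = 21          # upper bound: the run took "just under 21 hours"

run_seconds = RUN_HOURS * 3600

tokens_per_step = TOTAL_TOKENS / TOTAL_STEPS      # roughly 2,096 tokens per step
seconds_per_step = run_seconds / TOTAL_STEPS      # at most about 13.9 s per step
tokens_per_second = TOTAL_TOKENS / run_seconds    # at least about 151 tokens/s

print(f"{tokens_per_step:.0f} tokens/step")
print(f"{seconds_per_step:.1f} s/step (upper bound)")
print(f"{tokens_per_second:.0f} tokens/s (lower bound)")
```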

Intended use

The easy way

We provide a notebook with a Gradio UI for playing around with the model without having to manually format inputs. This notebook can be found here.

The manual way

The model can be used as a regular text generation model, but it'll perform best if the input prompt adheres to the following format:

[CHARACTER]'s Persona: [A few sentences about the character you want the model to play]

[DIALOGUE HISTORY]
You: [Your input message here]
[CHARACTER]:

Here [CHARACTER] is, as you can probably guess, the name of the character you want the model to portray, and [DIALOGUE HISTORY] is the chat history, which gives the model some conversational context to draw from. Ideally it consists of pairs of messages like:

[CHARACTER]: [some dialogue here]
You: [your response to the dialogue above]

Apart from chat history, you can also add example conversations to [DIALOGUE HISTORY] to show how the character should speak. Ideally, place them at the beginning so the model doesn't confuse conversation history with character definition.
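
The format described above can be assembled with a small helper. This is a minimal sketch in plain Python: the function names `build_prompt` and `extract_reply`, the character "Aria", and her persona are all made up for illustration. Cutting the output at the next "You:" line is an assumption about how one might trim a reply, not something the model card prescribes.

```python
def build_prompt(character, persona, history, user_message, examples=()):
    """Assemble a prompt in the persona / dialogue-history format.

    `history` and `examples` are sequences of (speaker, text) pairs.
    Example conversations are placed before the real chat history.
    """
    lines = [f"{character}'s Persona: {persona}", ""]
    for speaker, text in list(examples) + list(history):
        lines.append(f"{speaker}: {text}")
    lines.append(f"You: {user_message}")
    # Leave the character's turn open for the model to complete
    lines.append(f"{character}:")
    return "\n".join(lines)


def extract_reply(generated_text, prompt):
    """Drop the echoed prompt and anything after the next 'You:' turn."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.split("\nYou:", 1)[0].strip()


prompt = build_prompt(
    character="Aria",                      # hypothetical character
    persona="A cheerful librarian who loves science fiction.",
    history=[("You", "Hi there."), ("Aria", "Hello! Looking for a book?")],
    user_message="What do you recommend?",
)
print(prompt)
```

The resulting string can be passed as ordinary text to any text-generation interface. `extract_reply` is useful when the generated text echoes the prompt or runs on into the next turn.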

Known issues

The model can get stuck repeating certain phrases, or sometimes even entire sentences. We believe this is because that behavior is present in the training data itself, and we plan to investigate and adjust accordingly for future versions.


Safetensors

Model size: 2B params

Tensor type: F16