TRL Model
This is a TRL language model that has been fine-tuned with reinforcement learning to align its outputs with a reward signal, such as a value function or human feedback. The model can be used for text generation.
Usage
To use this model for inference, first install the TRL library:
python -m pip install trl
You can then generate text as follows:
from transformers import pipeline

# Note: the auto-generated card embedded a temporary local path in the model id;
# the repository id is assumed to be IrwinD/log_sage_ppo_model.
generator = pipeline("text-generation", model="IrwinD/log_sage_ppo_model")
outputs = generator("Hello, my llama is cute")
print(outputs[0]["generated_text"])
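The pipeline accepts the standard transformers generation parameters, so decoding behaviour can be tuned per call. A short example follows; the parameter values are illustrative, not tuned for this model:

outputs = generator(
    "Hello, my llama is cute",
    max_new_tokens=50,   # cap the length of the generated continuation
    do_sample=True,      # sample instead of greedy decoding
    top_k=50,            # restrict sampling to the 50 most likely tokens
    temperature=0.7,     # soften the distribution to reduce repetition
)
print(outputs[0]["generated_text"])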
If you want to use the model for further training or to obtain the value-head outputs, load it as follows:
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead

# Same assumed repository id as above.
tokenizer = AutoTokenizer.from_pretrained("IrwinD/log_sage_ppo_model")
model = AutoModelForCausalLMWithValueHead.from_pretrained("IrwinD/log_sage_ppo_model")

inputs = tokenizer("Hello, my llama is cute", return_tensors="pt")
# Passing labels makes the forward pass also compute the language-modeling loss.
outputs = model(**inputs, labels=inputs["input_ids"])
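Unlike a plain transformers model, the value-head model's forward pass returns a tuple rather than a ModelOutput. A minimal sketch of unpacking it, assuming TRL's default return signature of (lm_logits, loss, value):

lm_logits, loss, values = outputs
print(lm_logits.shape)  # (batch_size, sequence_length, vocab_size)
print(loss)             # language-modeling loss; present because labels were passed
print(values.shape)     # (batch_size, sequence_length): the value head's per-token estimates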