TRL documentation

Reward Functions

You are viewing version v0.20.0. A newer version, v0.24.0, is available.

This module contains some useful reward functions, primarily intended for use with the GRPOTrainer.

Format rewards

think_format_reward

trl.rewards.think_format_reward

( completions: list, **kwargs ) → list[float]

Parameters

  • completions (list[list[dict[str, str]]]) — List of completions to be evaluated. Each completion must be a list containing exactly one message, where the message is a dictionary whose "content" key holds the text of the completion.
  • **kwargs — Additional keyword arguments. This function does not use them, but they are required in the function signature to ensure compatibility with trainers like GRPOTrainer.

Returns

list[float]

A list of rewards, where each reward is 1.0 if the completion matches the expected format, otherwise 0.0.

Reward function that checks if the reasoning process is enclosed within "<think>" and "</think>" tags. The function returns a reward of 1.0 if the format is correct, otherwise 0.0.
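
The check described above can be approximated with a regular expression. The following is a minimal sketch, not the TRL implementation: the name `sketch_think_format_reward` and the pattern are hypothetical, and the pattern assumes that the completion must start with "<think>" and must not contain a second opening tag. The text above only states that the reasoning must be enclosed within the two tags.

```python
import re

# Hypothetical pattern (an assumption, not taken from the TRL source):
# the text must start with "<think>", contain no second "<think>",
# and close the reasoning block with "</think>".
THINK_PATTERN = re.compile(r"^<think>(?!.*<think>)(.*?)</think>.*$", re.DOTALL)


def sketch_think_format_reward(completions, **kwargs):
    """Return 1.0 for each completion whose text matches THINK_PATTERN, else 0.0."""
    # Each completion is a list holding one message dict; read its "content".
    contents = [completion[0]["content"] for completion in completions]
    return [1.0 if THINK_PATTERN.match(text) else 0.0 for text in contents]
```

Under these assumptions, a completion that opens the tag but never closes it scores 0.0, as does one that puts text before "<think>".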

Example:

>>> from trl.rewards import think_format_reward

>>> completions = [
...     [{"content": "<think>\nThis is my reasoning.\n</think>\nThis is my answer."}],
...     [{"content": "<think>\nThis is my reasoning.\nThis is my answer."}],
... ]
>>> think_format_reward(completions)
[1.0, 0.0]
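
The same calling convention can be followed when writing a custom reward function: take `completions`, accept and ignore `**kwargs`, and return one float per completion. The sketch below is illustrative only. `answer_length_reward` and its `max_chars` parameter are hypothetical and not part of TRL, and the `prompts` keyword stands in for whatever extra arguments a trainer might pass.

```python
def answer_length_reward(completions, max_chars=200, **kwargs):
    """Hypothetical reward: 1.0 if the completion text has at most max_chars characters."""
    # Each completion is a list holding one message dict; read its "content".
    contents = [completion[0]["content"] for completion in completions]
    return [1.0 if len(text) <= max_chars else 0.0 for text in contents]


completions = [
    [{"content": "<think>\nShort.\n</think>\nDone."}],
    [{"content": "x" * 500}],
]
# Extra keyword arguments (here a made-up `prompts` list) are accepted and ignored.
rewards = answer_length_reward(completions, prompts=["p1", "p2"])
print(rewards)  # [1.0, 0.0]
```

Because unused keyword arguments are absorbed by `**kwargs`, the function does not raise a `TypeError` when the caller supplies arguments it does not need.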