This repository hosts the under-trained Detoxify models used as value models in the experiments of the paper "Language Model Decoding as Likelihood-Utility Alignment". Those experiments test the robustness to noise of value-guided decoding algorithms (MCTS and VGBS). For more details, see the paper or the project's homepage and GitHub repository.
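To make the role of a value model in decoding concrete, here is a minimal toy sketch of a beam search whose candidates are ranked by log-likelihood plus a weighted value score. It is not the paper's implementation. The vocabulary `TOY_LM`, the scorer `toy_value`, the `weight` parameter, and the `noisy` wrapper are all hypothetical stand-ins; in the actual experiments the value scores come from the models in this repository.

```python
import math
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical toy "language model": one fixed next-token distribution.
TOY_LM: Dict[str, float] = {"nice": 0.3, "rude": 0.5, "okay": 0.2}

ValueFn = Callable[[List[str]], float]


def toy_value(tokens: List[str]) -> float:
    """Hypothetical value model: fraction of tokens that are not 'rude'."""
    if not tokens:
        return 1.0
    return 1.0 - tokens.count("rude") / len(tokens)


def noisy(value_fn: ValueFn, sigma: float, seed: int = 0) -> ValueFn:
    """Wrap a value function with Gaussian noise (an assumed noise model)."""
    rng = random.Random(seed)
    return lambda tokens: value_fn(tokens) + rng.gauss(0.0, sigma)


def value_guided_beam_search(
    value_fn: ValueFn, beam_size: int = 2, steps: int = 3, weight: float = 1.0
) -> List[Tuple[List[str], float]]:
    """Keep the top candidates by log-likelihood + weight * value at each step."""
    beams: List[Tuple[List[str], float]] = [([], 0.0)]
    for _ in range(steps):
        candidates = [
            (tokens + [tok], logp + math.log(p))
            for tokens, logp in beams
            for tok, p in TOY_LM.items()
        ]
        candidates.sort(key=lambda c: c[1] + weight * value_fn(c[0]), reverse=True)
        beams = candidates[:beam_size]
    return beams


likelihood_only = value_guided_beam_search(toy_value, weight=0.0)[0][0]
value_guided = value_guided_beam_search(toy_value, weight=5.0)[0][0]
```

With `weight=0.0` the search follows likelihood alone and picks the most probable token at every step. A larger `weight` lets the value score steer the beam away from tokens the scorer penalizes. Passing `noisy(toy_value, sigma)` instead of `toy_value` gives a rough way to explore how a less reliable value function changes the result.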