This repository hosts the under-trained detoxify models used as value models in the experiments of the paper "Language Model Decoding as Likelihood-Utility Alignment". These experiments test how robust value-guided decoding algorithms, namely Monte Carlo tree search (MCTS) and value-guided beam search (VGBS), are to noise. For more details, see the paper or the project's homepage and GitHub repository.