hamishivi committed on
Commit b8fdd2e · verified · 1 Parent(s): 7f33f1a

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -19,7 +19,7 @@ license: apache-2.0
 Tulu is a series of language models that are trained to act as helpful assistants.
 Tulu V2.5 is a series of models trained using DPO and PPO starting from the [Tulu 2 suite](https://huggingface.co/collections/allenai/tulu-v2-suite-6551b56e743e6349aab45101).
 This is a **value** model produced during the PPO training of [this](https://huggingface.co/hamishivi/tulu-v2.5-ppo-7b-uf-mean) model.
-It was initialised from the [Tulu v2.573B UltraFeedback RM](https://huggingface.co/hamishivi/tulu-v2.5-7b-uf-rm).
+It was initialised from the [Tulu v2.5 7B UltraFeedback RM](https://huggingface.co/hamishivi/tulu-v2.5-7b-uf-rm).
 We release the value model as it may provide a good starting point for additional research or improved decoding with our released PPO models.
 
 At time of writing, you may have to [install transformers from source](https://huggingface.co/docs/transformers/en/installation#install-from-source) to get the `LlamaForTokenClassification` class.
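
The README's last line says `LlamaForTokenClassification` may only be available after installing transformers from source. A minimal sketch of how one might check for the class and then load the value model follows. The helper names and the `repo_id` parameter are hypothetical (this page does not state the value model's repository id), and reading the `logits` output as per-token value estimates is an assumption, not something the README confirms.

```python
def has_llama_token_classification() -> bool:
    """Return True if the installed transformers resolves the class name.

    Assumption: this only checks that the name can be looked up; it does
    not prove the class is usable (e.g. if torch is missing).
    """
    try:
        import transformers
        getattr(transformers, "LlamaForTokenClassification")
    except Exception:
        # transformers is missing, too old, or failed to import.
        return False
    return True


def per_token_values(repo_id: str, text: str):
    """Hypothetical helper: run the value model and return its raw logits.

    `repo_id` is a placeholder for the value model's Hub id, which is an
    assumption here. The returned tensor has shape
    (batch, sequence_length, num_labels).
    """
    import torch
    from transformers import AutoTokenizer, LlamaForTokenClassification

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = LlamaForTokenClassification.from_pretrained(repo_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits


if __name__ == "__main__":
    if has_llama_token_classification():
        print("LlamaForTokenClassification is available.")
    else:
        print("Not available; consider installing transformers from source.")
```

Running the check first avoids downloading 7B weights only to hit an import error; `per_token_values` is not called here because it needs the actual repository id.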