Pythia 1.4B Based Reward Model

Compute was generously provided by Stability AI

How to use

# install open assistant model_training module (e.g. run `pip install -e .` in `model/` directory of open-assistant repository)
import model_training.models.reward_model  # noqa: F401 (registers reward model for AutoModel loading)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
input_text = "<|prompter|>Hi how are you?<|endoftext|><|assistant|>Hi, I am Open-Assistant a large open-source language model trained by LAION AI. How can I help you today?<|endoftext|>"
inputs = tokenizer(input_text, return_tensors="pt")
score = rm(**inputs).logits[0].cpu().detach()
print(score)

Datasets

datasets:
    - oasst_export:
        lang: "en,es,de,fr"
        input_file_path: 2023-03-27_oasst_research_ready_synth.jsonl.gz
        val_split: 0.1
    - augment_oasst:
        input_file_path: augmented_latin_cyrillic_oasst_2023-03-27_v2.jsonl
    - anthropic_rlhf:
        fraction: 0.1
        max_val_set: 1000
    - shp:
        max_val_set: 1000
    - hellaswag:
        fraction: 0.5
        max_val_set: 1000
    - webgpt:
        val_split: 0.05
        max_val_set: 1000
    - hf_summary_pairs:
        fraction: 0.1
        max_val_set: 250

(internal note: ignore (high) eval accuracy values of oasst_export, oasst-eval samples were part of training set)

Downloads last month
127
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The model has no pipeline_tag.

Space using OpenAssistant/oasst-rm-2.1-pythia-1.4b-epoch-2.5 1