Commit 89d2495 by Felladrin · 1 Parent(s): e6a52e4

Add training info to the Readme

Files changed (1): README.md (+94 −11)

README.md CHANGED
@@ -47,17 +47,10 @@ inference:
 # A Pythia Chat Model of 31M Parameters
 
 - Base model: [EleutherAI/pythia-31m](https://huggingface.co/EleutherAI/pythia-31m)
-- Datasets:
-  - [totally-not-an-llm/EverythingLM-data-V3](https://huggingface.co/datasets/totally-not-an-llm/EverythingLM-data-V3)
-  - [databricks/databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k)
-  - [THUDM/webglm-qa](https://huggingface.co/datasets/THUDM/webglm-qa)
-  - [starfishmedical/webGPT_x_dolly](https://huggingface.co/datasets/starfishmedical/webGPT_x_dolly)
-  - [Amod/mental_health_counseling_conversations](https://huggingface.co/datasets/Amod/mental_health_counseling_conversations)
-  - [sablo/oasst2_curated](https://huggingface.co/datasets/sablo/oasst2_curated)
-  - [cognitivecomputations/wizard_vicuna_70k_unfiltered](https://huggingface.co/datasets/cognitivecomputations/wizard_vicuna_70k_unfiltered)
-  - [mlabonne/chatml_dpo_pairs](https://huggingface.co/datasets/mlabonne/chatml_dpo_pairs)
 
-## Recommended Prompt Format
 
 ```
 <|im_start|>system
@@ -67,7 +60,7 @@ inference:
 <|im_start|>assistant
 ```
 
-## Recommended Inference Parameters
 
 ```yml
 do_sample: true
@@ -76,3 +69,93 @@ top_p: 0.25
 top_k: 7
 repetition_penalty: 1.0016
 ```
 # A Pythia Chat Model of 31M Parameters
 
 - Base model: [EleutherAI/pythia-31m](https://huggingface.co/EleutherAI/pythia-31m)
+- Availability in other ML formats:
+  - ONNX: [Felladrin/onnx-Pythia-31M-Chat-v1](https://huggingface.co/Felladrin/onnx-Pythia-31M-Chat-v1)
 
+## Recommended prompt format
 
 ```
 <|im_start|>system
 …
 <|im_start|>assistant
 ```
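The ChatML-style prompt format above can be assembled with a small helper. This is a minimal sketch, assuming the standard ChatML turn structure (`<|im_end|>` terminators and a user turn, which the diff view elides); `build_chatml_prompt` is a hypothetical helper name, not part of this repo:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt string matching the format above.

    Assumes the standard ChatML layout: each turn opens with
    <|im_start|>{role} and closes with <|im_end|>, and the prompt
    ends with an open assistant turn for the model to complete.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("You are a helpful assistant.", "What is Pythia?")
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the assistant turn open so generation continues from there.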
 
+## Recommended inference parameters
 
 ```yml
 do_sample: true
 …
 top_p: 0.25
 top_k: 7
 repetition_penalty: 1.0016
 ```
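To see what `top_k: 7` and `top_p: 0.25` do to the sampling distribution, here is a pure-Python sketch of the standard top-k / nucleus filtering logic (an illustration of the technique, not code from this repo):

```python
def filter_top_k_top_p(probs, top_k=7, top_p=0.25):
    """Keep the top_k most likely tokens, then the smallest prefix of
    them whose cumulative probability reaches top_p (nucleus filtering).
    Sampling would then renormalize over the returned tokens."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append(token)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

# With top_p as low as 0.25, sampling often collapses to the single
# most likely token, which keeps a tiny chat model on-topic.
probs = {"a": 0.5, "b": 0.2, "c": 0.1, "d": 0.1, "e": 0.05, "f": 0.03, "g": 0.02}
print(filter_top_k_top_p(probs))  # ['a']
```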
+
+## Datasets and parameters used for training
+
+| Dataset | License Type |
+|---------|--------------|
+| [totally-not-an-llm/EverythingLM-data-V3](https://huggingface.co/datasets/totally-not-an-llm/EverythingLM-data-V3) | mit |
+| [databricks/databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) | cc-by-sa-3.0 |
+| [THUDM/webglm-qa](https://huggingface.co/datasets/THUDM/webglm-qa) | apache-2.0 |
+| [starfishmedical/webGPT_x_dolly](https://huggingface.co/datasets/starfishmedical/webGPT_x_dolly) | cc-by-sa-3.0 |
+| [Amod/mental_health_counseling_conversations](https://huggingface.co/datasets/Amod/mental_health_counseling_conversations) | openrail |
+| [sablo/oasst2_curated](https://huggingface.co/datasets/sablo/oasst2_curated) | apache-2.0 |
+| [cognitivecomputations/wizard_vicuna_70k_unfiltered](https://huggingface.co/datasets/cognitivecomputations/wizard_vicuna_70k_unfiltered) | apache-2.0 |
+| [mlabonne/chatml_dpo_pairs](https://huggingface.co/datasets/mlabonne/chatml_dpo_pairs) | apache-2.0 |
+
+```python
+SFTTrainer(
+    model,
+    train_dataset=train_dataset,
+    dataset_text_field="text",
+    eval_dataset=eval_dataset,
+    max_seq_length=2048,
+    packing=True,
+    args=TrainingArguments(
+        learning_rate=2e-6,
+        per_device_train_batch_size=1,
+        per_device_eval_batch_size=1,
+        gradient_accumulation_steps=16,
+        lr_scheduler_type="cosine",
+        num_train_epochs=1,
+        logging_strategy="steps",
+        save_strategy="steps",
+        evaluation_strategy="steps",
+        logging_steps=10,
+        eval_steps=10,
+        save_steps=10,
+        warmup_steps=50,
+        load_best_model_at_end=True,
+        metric_for_best_model="eval_loss",
+        greater_is_better=False,
+        weight_decay=0.01,
+        save_total_limit=10,
+        neftune_noise_alpha=5,
+    ),
+    callbacks=[
+        EarlyStoppingCallback(
+            early_stopping_patience=3,
+            early_stopping_threshold=0.005
+        ),
+    ],
+)
+```
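With `per_device_train_batch_size=1` and `gradient_accumulation_steps=16`, each optimizer step in the SFT run above sees an effective batch of 16 packed 2048-token sequences, and `lr_scheduler_type="cosine"` with `warmup_steps=50` gives a linear warmup followed by cosine decay. A sketch of that schedule shape (the `total_steps` value here is illustrative, not from the training run):

```python
import math

def lr_at(step, base_lr=2e-6, warmup_steps=50, total_steps=1000):
    """Linear warmup for warmup_steps, then cosine decay to zero,
    mirroring the shape of lr_scheduler_type='cosine' with warmup."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1 + math.cos(math.pi * progress))

# Effective batch size: micro-batch * accumulation steps.
effective_batch = 1 * 16
print(effective_batch)  # 16
print(lr_at(50))        # peak learning rate: 2e-06
```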
+
+```python
+DPOTrainer(
+    model,
+    beta=0.1,
+    train_dataset=dataset,
+    tokenizer=tokenizer,
+    eval_dataset=eval_dataset,
+    max_length=1536,
+    max_prompt_length=1024,
+    args=TrainingArguments(
+        learning_rate=2e-6,
+        per_device_train_batch_size=1,
+        per_device_eval_batch_size=1,
+        gradient_accumulation_steps=1,
+        lr_scheduler_type="cosine",
+        num_train_epochs=1,
+        logging_strategy="steps",
+        save_strategy="steps",
+        evaluation_strategy="steps",
+        logging_steps=1,
+        eval_steps=1,
+        save_steps=1,
+        warmup_steps=0,
+        load_best_model_at_end=True,
+        metric_for_best_model="eval_loss",
+        greater_is_better=False,
+        weight_decay=0.0,
+        neftune_noise_alpha=5,
+        remove_unused_columns=False,
+    ),
+    callbacks=[
+        EarlyStoppingCallback(
+            early_stopping_patience=3,
+            early_stopping_threshold=0.005
+        ),
+    ],
+)
+```
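The `beta=0.1` in the DPO run above scales the standard per-pair DPO objective. This is a pure-Python sketch of that loss for one preference pair (an illustration of the formula, not the trl implementation):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss: -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))).

    The margin rewards the policy for raising the chosen answer's
    log-probability relative to the reference more than the rejected one's.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# When the policy matches the reference, the margin is 0 and the loss is log(2).
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # 0.6931
```

A small `beta` like 0.1 flattens the sigmoid, so the policy is penalized only gently for drifting from the reference model.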