Add training info to the Readme
README.md
# A Pythia Chat Model of 31M Parameters

- Base model: [EleutherAI/pythia-31m](https://huggingface.co/EleutherAI/pythia-31m)
- Availability in other ML formats:
  - ONNX: [Felladrin/onnx-Pythia-31M-Chat-v1](https://huggingface.co/Felladrin/onnx-Pythia-31M-Chat-v1)

## Recommended prompt format

```
<|im_start|>system
{system message}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>assistant
```

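For illustration, the ChatML-style template above can be assembled with a small helper. This is a sketch, not code from this repository; when the tokenizer ships a chat template, `tokenizer.apply_chat_template` from `transformers` is the usual way to produce the same string.

```python
def to_chatml(messages: list) -> str:
    """Render a list of {role, content} messages in the ChatML format shown
    above, ending with the assistant header so the model completes it."""
    prompt = ""
    for message in messages:
        prompt += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    return prompt + "<|im_start|>assistant\n"

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is a language model?"},
])
print(prompt)
```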
## Recommended inference parameters

```yml
do_sample: true
top_p: 0.25
top_k: 7
repetition_penalty: 1.0016
```

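To make these parameters concrete, here is a minimal, framework-free sketch of how `top_k` and `top_p` restrict the candidate tokens before sampling. This is the standard top-k/nucleus filtering idea, not code from this repository, and library implementations differ in edge-case handling:

```python
def filter_top_k_top_p(probs, top_k=7, top_p=0.25):
    """Keep the top_k most likely tokens, then keep the smallest prefix of
    them whose cumulative probability reaches top_p; renormalize the rest."""
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token_id, p in ranked:
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token_id: p / total for token_id, p in kept}
```

With `top_p: 0.25` this low, the most likely token usually exceeds the nucleus threshold on its own, so sampling stays close to greedy decoding while `do_sample: true` preserves a little variety on flatter distributions.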
72 |
+
|
73 |
+
## Datasets and parameters used for training
|
74 |
+
|
75 |
+
| Dataset | License Type |
|
76 |
+
|---------|--------------|
|
77 |
+
| [totally-not-an-llm/EverythingLM-data-V3](https://huggingface.co/datasets/totally-not-an-llm/EverythingLM-data-V3) | mit |
|
78 |
+
| [databricks/databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) | cc-by-sa-3.0 |
|
79 |
+
| [THUDM/webglm-qa](https://huggingface.co/datasets/THUDM/webglm-qa) | apache-2.0 |
|
80 |
+
| [starfishmedical/webGPT_x_dolly](https://huggingface.co/datasets/starfishmedical/webGPT_x_dolly) | cc-by-sa-3.0 |
|
81 |
+
| [Amod/mental_health_counseling_conversations](https://huggingface.co/datasets/Amod/mental_health_counseling_conversations) | openrail |
|
82 |
+
| [sablo/oasst2_curated](https://huggingface.co/datasets/sablo/oasst2_curated) | apache-2.0 |
|
83 |
+
| [cognitivecomputations/wizard_vicuna_70k_unfiltered](https://huggingface.co/datasets/cognitivecomputations/wizard_vicuna_70k_unfiltered) | apache-2.0 |
|
84 |
+
| [mlabonne/chatml_dpo_pairs](https://huggingface.co/datasets/mlabonne/chatml_dpo_pairs) | apache-2.0 |
|
85 |
+
|
86 |
```python
SFTTrainer(
    model,
    train_dataset=train_dataset,
    dataset_text_field="text",
    eval_dataset=eval_dataset,
    max_seq_length=2048,
    packing=True,
    args=TrainingArguments(
        learning_rate=2e-6,
        per_device_train_batch_size=1,
        per_device_eval_batch_size=1,
        gradient_accumulation_steps=16,
        lr_scheduler_type="cosine",
        num_train_epochs=1,
        logging_strategy="steps",
        save_strategy="steps",
        evaluation_strategy="steps",
        logging_steps=10,
        eval_steps=10,
        save_steps=10,
        warmup_steps=50,
        load_best_model_at_end=True,
        metric_for_best_model="eval_loss",
        greater_is_better=False,
        weight_decay=0.01,
        save_total_limit=10,
        neftune_noise_alpha=5,
    ),
    callbacks=[
        EarlyStoppingCallback(
            early_stopping_patience=3,
            early_stopping_threshold=0.005
        ),
    ],
)
```

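Some of these numbers interact: `per_device_train_batch_size=1` with `gradient_accumulation_steps=16` gives an effective batch of 16 sequences per optimizer step, and `lr_scheduler_type="cosine"` with `warmup_steps=50` ramps the learning rate linearly to `2e-6` before decaying it. A rough, self-contained sketch of that schedule (the exact curve produced by `transformers` may differ slightly):

```python
import math

# per_device_train_batch_size * gradient_accumulation_steps
EFFECTIVE_BATCH = 1 * 16

def cosine_lr(step, total_steps, base_lr=2e-6, warmup_steps=50):
    """Linear warmup to base_lr, then cosine decay toward zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```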
```python
DPOTrainer(
    model,
    beta=0.1,
    train_dataset=dataset,
    tokenizer=tokenizer,
    eval_dataset=eval_dataset,
    max_length=1536,
    max_prompt_length=1024,
    args=TrainingArguments(
        learning_rate=2e-6,
        per_device_train_batch_size=1,
        per_device_eval_batch_size=1,
        gradient_accumulation_steps=1,
        lr_scheduler_type="cosine",
        num_train_epochs=1,
        logging_strategy="steps",
        save_strategy="steps",
        evaluation_strategy="steps",
        logging_steps=1,
        eval_steps=1,
        save_steps=1,
        warmup_steps=0,
        load_best_model_at_end=True,
        metric_for_best_model="eval_loss",
        greater_is_better=False,
        weight_decay=0.0,
        neftune_noise_alpha=5,
        remove_unused_columns=False,
    ),
    callbacks=[
        EarlyStoppingCallback(
            early_stopping_patience=3,
            early_stopping_threshold=0.005
        ),
    ],
)
```
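
For context on `beta=0.1`: `DPOTrainer` optimizes the Direct Preference Optimization objective, where beta scales how strongly the policy's preference margin over the reference model is rewarded. A minimal numeric sketch of the standard per-pair DPO loss (an illustration, not code from this repository):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """-log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r)))"""
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy matches the reference, the margin is zero and the loss is log 2; the loss falls as the policy assigns more probability to the chosen completion than the rejected one, relative to the reference model.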