Commit 89d2495 by Felladrin · 1 Parent(s): e6a52e4

Add training info to the Readme

Files changed (1): README.md (+94 −11)

README.md CHANGED
@@ -47,17 +47,10 @@ inference:
 # A Pythia Chat Model of 31M Parameters
 
 - Base model: [EleutherAI/pythia-31m](https://huggingface.co/EleutherAI/pythia-31m)
-- Datasets:
-  - [totally-not-an-llm/EverythingLM-data-V3](https://huggingface.co/datasets/totally-not-an-llm/EverythingLM-data-V3)
-  - [databricks/databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k)
-  - [THUDM/webglm-qa](https://huggingface.co/datasets/THUDM/webglm-qa)
-  - [starfishmedical/webGPT_x_dolly](https://huggingface.co/datasets/starfishmedical/webGPT_x_dolly)
-  - [Amod/mental_health_counseling_conversations](https://huggingface.co/datasets/Amod/mental_health_counseling_conversations)
-  - [sablo/oasst2_curated](https://huggingface.co/datasets/sablo/oasst2_curated)
-  - [cognitivecomputations/wizard_vicuna_70k_unfiltered](https://huggingface.co/datasets/cognitivecomputations/wizard_vicuna_70k_unfiltered)
-  - [mlabonne/chatml_dpo_pairs](https://huggingface.co/datasets/mlabonne/chatml_dpo_pairs)
 
-## Recommended Prompt Format
 
 ```
 <|im_start|>system
@@ -67,7 +60,7 @@ inference:
 <|im_start|>assistant
 ```
 
-## Recommended Inference Parameters
 
 ```yml
 do_sample: true
@@ -76,3 +69,93 @@ top_p: 0.25
 top_k: 7
 repetition_penalty: 1.0016
 ```
 # A Pythia Chat Model of 31M Parameters
 
 - Base model: [EleutherAI/pythia-31m](https://huggingface.co/EleutherAI/pythia-31m)
+- Availability in other ML formats:
+  - ONNX: [Felladrin/onnx-Pythia-31M-Chat-v1](https://huggingface.co/Felladrin/onnx-Pythia-31M-Chat-v1)
 
+## Recommended prompt format
 
 ```
 <|im_start|>system
 …
 <|im_start|>assistant
 ```
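The ChatML-style prompt format above can be assembled with a small helper. This is a minimal sketch, assuming the standard ChatML turn structure (`<|im_end|>` terminators and a user turn, which the diff view elides); `build_chatml_prompt` is a hypothetical helper name, not part of this repo:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt string matching the format above.

    Assumes the standard ChatML layout: each turn opens with
    <|im_start|>{role} and closes with <|im_end|>, and the prompt
    ends with an open assistant turn for the model to complete.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("You are a helpful assistant.", "What is Pythia?")
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the assistant turn open so generation continues from there.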
 
+## Recommended inference parameters
 
 ```yml
 do_sample: true
 …
 top_p: 0.25
 top_k: 7
 repetition_penalty: 1.0016
 ```
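To see what `top_k: 7` and `top_p: 0.25` do to the sampling distribution, here is a pure-Python sketch of the standard top-k / nucleus filtering logic (an illustration of the technique, not code from this repo):

```python
def filter_top_k_top_p(probs, top_k=7, top_p=0.25):
    """Keep the top_k most likely tokens, then the smallest prefix of
    them whose cumulative probability reaches top_p (nucleus filtering).
    Sampling would then renormalize over the returned tokens."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append(token)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

# With top_p as low as 0.25, sampling often collapses to the single
# most likely token, which keeps a tiny chat model on-topic.
probs = {"a": 0.5, "b": 0.2, "c": 0.1, "d": 0.1, "e": 0.05, "f": 0.03, "g": 0.02}
print(filter_top_k_top_p(probs))  # ['a']
```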
+
+## Datasets and parameters used for training
+
+| Dataset | License Type |
+|---------|--------------|
+| [totally-not-an-llm/EverythingLM-data-V3](https://huggingface.co/datasets/totally-not-an-llm/EverythingLM-data-V3) | mit |
+| [databricks/databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) | cc-by-sa-3.0 |
+| [THUDM/webglm-qa](https://huggingface.co/datasets/THUDM/webglm-qa) | apache-2.0 |
+| [starfishmedical/webGPT_x_dolly](https://huggingface.co/datasets/starfishmedical/webGPT_x_dolly) | cc-by-sa-3.0 |
+| [Amod/mental_health_counseling_conversations](https://huggingface.co/datasets/Amod/mental_health_counseling_conversations) | openrail |
+| [sablo/oasst2_curated](https://huggingface.co/datasets/sablo/oasst2_curated) | apache-2.0 |
+| [cognitivecomputations/wizard_vicuna_70k_unfiltered](https://huggingface.co/datasets/cognitivecomputations/wizard_vicuna_70k_unfiltered) | apache-2.0 |
+| [mlabonne/chatml_dpo_pairs](https://huggingface.co/datasets/mlabonne/chatml_dpo_pairs) | apache-2.0 |
+
+```python
+SFTTrainer(
+    model,
+    train_dataset=train_dataset,
+    dataset_text_field="text",
+    eval_dataset=eval_dataset,
+    max_seq_length=2048,
+    packing=True,
+    args=TrainingArguments(
+        learning_rate=2e-6,
+        per_device_train_batch_size=1,
+        per_device_eval_batch_size=1,
+        gradient_accumulation_steps=16,
+        lr_scheduler_type="cosine",
+        num_train_epochs=1,
+        logging_strategy="steps",
+        save_strategy="steps",
+        evaluation_strategy="steps",
+        logging_steps=10,
+        eval_steps=10,
+        save_steps=10,
+        warmup_steps=50,
+        load_best_model_at_end=True,
+        metric_for_best_model="eval_loss",
+        greater_is_better=False,
+        weight_decay=0.01,
+        save_total_limit=10,
+        neftune_noise_alpha=5,
+    ),
+    callbacks=[
+        EarlyStoppingCallback(
+            early_stopping_patience=3,
+            early_stopping_threshold=0.005
+        ),
+    ],
+)
+```
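With `per_device_train_batch_size=1` and `gradient_accumulation_steps=16`, each optimizer step in the SFT run above sees an effective batch of 16 packed 2048-token sequences, and `lr_scheduler_type="cosine"` with `warmup_steps=50` gives a linear warmup followed by cosine decay. A sketch of that schedule shape (the `total_steps` value here is illustrative, not from the training run):

```python
import math

def lr_at(step, base_lr=2e-6, warmup_steps=50, total_steps=1000):
    """Linear warmup for warmup_steps, then cosine decay to zero,
    mirroring the shape of lr_scheduler_type='cosine' with warmup."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1 + math.cos(math.pi * progress))

# Effective batch size: micro-batch * accumulation steps.
effective_batch = 1 * 16
print(effective_batch)  # 16
print(lr_at(50))        # peak learning rate: 2e-06
```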
+
+```python
+DPOTrainer(
+    model,
+    beta=0.1,
+    train_dataset=dataset,
+    tokenizer=tokenizer,
+    eval_dataset=eval_dataset,
+    max_length=1536,
+    max_prompt_length=1024,
+    args=TrainingArguments(
+        learning_rate=2e-6,
+        per_device_train_batch_size=1,
+        per_device_eval_batch_size=1,
+        gradient_accumulation_steps=1,
+        lr_scheduler_type="cosine",
+        num_train_epochs=1,
+        logging_strategy="steps",
+        save_strategy="steps",
+        evaluation_strategy="steps",
+        logging_steps=1,
+        eval_steps=1,
+        save_steps=1,
+        warmup_steps=0,
+        load_best_model_at_end=True,
+        metric_for_best_model="eval_loss",
+        greater_is_better=False,
+        weight_decay=0.0,
+        neftune_noise_alpha=5,
+        remove_unused_columns=False,
+    ),
+    callbacks=[
+        EarlyStoppingCallback(
+            early_stopping_patience=3,
+            early_stopping_threshold=0.005
+        ),
+    ],
+)
+```
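The `beta=0.1` in the DPO run above scales the standard per-pair DPO objective. This is a pure-Python sketch of that loss for one preference pair (an illustration of the formula, not the trl implementation):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss: -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))).

    The margin rewards the policy for raising the chosen answer's
    log-probability relative to the reference more than the rejected one's.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# When the policy matches the reference, the margin is 0 and the loss is log(2).
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # 0.6931
```

A small `beta` like 0.1 flattens the sigmoid, so the policy is penalized only gently for drifting from the reference model.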