End of training
README.md CHANGED

@@ -2,6 +2,7 @@
 license: other
 library_name: peft
 tags:
+- axolotl
 - generated_from_trainer
 base_model: Qwen/Qwen1.5-0.5B-Chat
 model-index:
@@ -119,7 +120,9 @@ special_tokens:
 
 # Qwen1.5-Capybara-0.5B-Chat
 
-This model is a fine-tuned version of [Qwen/Qwen1.5-0.5B-Chat](https://huggingface.co/Qwen/Qwen1.5-0.5B-Chat) on
+This model is a fine-tuned version of [Qwen/Qwen1.5-0.5B-Chat](https://huggingface.co/Qwen/Qwen1.5-0.5B-Chat) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.0419
 
 ## Model description
 
@@ -152,6 +155,16 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 15
 - num_epochs: 1
 
+### Training results
+
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 1.164         | 0.0   | 1    | 1.2662          |
+| 0.759         | 0.25  | 343  | 1.0705          |
+| 0.6798        | 0.5   | 686  | 1.0525          |
+| 1.2828        | 0.75  | 1029 | 1.0419          |
+
+
 ### Framework versions
 
 - PEFT 0.9.1.dev0
|