End of training
README.md CHANGED
@@ -32,7 +32,7 @@ data_seed: 49
 seed: 49
 
 datasets:
-  - path: _synth_data/
+  - path: _synth_data/alpaca_synth_queries_healed_sample.jsonl
     type: sharegpt
     conversation: alpaca
 dataset_prepared_path: last_run_prepared
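The dataset entry above points axolotl at a JSONL file and declares it `type: sharegpt` with `conversation: alpaca`. For orientation, here is a minimal sketch of what one record in the sharegpt schema conventionally looks like; the field contents below are invented, and the actual records in `alpaca_synth_queries_healed_sample.jsonl` may differ.

```python
# Illustrative only: one record in the sharegpt-style schema that
# axolotl's `type: sharegpt` loader conventionally expects. The text
# values are invented placeholders, not real dataset contents.
import json

record = {
    "conversations": [
        {"from": "human", "value": "Example user query"},
        {"from": "gpt", "value": "Example assistant answer"},
    ]
}

# JSONL means one JSON object per line.
print(json.dumps(record))
```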
@@ -66,11 +66,11 @@ lora_target_modules:
 
 gradient_accumulation_steps: 4
 micro_batch_size: 16
-eval_batch_size:
-num_epochs:
+eval_batch_size: 1
+num_epochs: 2
 optimizer: adamw_bnb_8bit
 lr_scheduler: cosine
-learning_rate: 0.
+learning_rate: 0.0002
 max_grad_norm: 1.0
 adam_beta2: 0.95
 adam_epsilon: 0.00001
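The batch settings above combine into the effective batch size reported later in the card. A quick check, assuming a single-GPU run (with more devices, the device count multiplies in as well):

```python
# Effective batch size implied by the config above, assuming one GPU.
micro_batch_size = 16
gradient_accumulation_steps = 4

total_train_batch_size = micro_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 64, matching total_train_batch_size in the card
```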
@@ -116,7 +116,7 @@ save_safetensors: true
 
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss:
+- Loss: 0.0318
 
 ## Model description
 
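Since the config lists `lora_target_modules`, this run trains a LoRA adapter on top of the base model. A minimal loading sketch, assuming the repo hosts a PEFT adapter; `your-username/your-model` is a placeholder, not the real repo id:

```python
# Minimal loading sketch, assuming a PEFT (LoRA) adapter on top of
# mistralai/Mistral-7B-v0.1. The repo id below is a placeholder.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained("your-username/your-model")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
```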
@@ -135,22 +135,29 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.
+- learning_rate: 0.0002
 - train_batch_size: 16
-- eval_batch_size:
+- eval_batch_size: 1
 - seed: 49
 - gradient_accumulation_steps: 4
 - total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-05
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 20
-- num_epochs:
+- num_epochs: 2
 
 ### Training results
 
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.
+| 1.1214        | 0.0011 | 1    | 1.1843          |
+| 0.0856        | 0.2501 | 225  | 0.0910          |
+| 0.0599        | 0.5001 | 450  | 0.0561          |
+| 0.0326        | 0.7502 | 675  | 0.0447          |
+| 0.0393        | 1.0003 | 900  | 0.0372          |
+| 0.0255        | 1.2503 | 1125 | 0.0341          |
+| 0.0261        | 1.5004 | 1350 | 0.0324          |
+| 0.0392        | 1.7505 | 1575 | 0.0318          |
 
 
 ### Framework versions
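The epoch and step columns of the results table imply roughly 900 optimizer steps per epoch, which at the effective batch size of 64 suggests a training set of roughly 57,600 samples. This is an inference from the table, not a figure stated in the card:

```python
# Back-of-envelope check from the results table: step 900 lands at epoch
# ~1.0003, so one epoch is ~900 optimizer steps. At 64 samples per step
# that implies roughly 900 * 64 = 57,600 training samples (an inference,
# not a number reported anywhere in the card).
steps_per_epoch = 900 / 1.0003
approx_samples = round(steps_per_epoch) * 64
print(round(steps_per_epoch), approx_samples)  # 900 57600
```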