Krisbiantoro/mistral_mix

Generated from Trainer


Krisbiantoro committed on Apr 6, 2024

Commit 9f6814a (verified) · 1 Parent(s): e101746

Model save

Files changed (1)

README.md +16 -7


@@ -20,7 +20,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.8933
 ## Model description
@@ -40,23 +40,32 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
-- train_batch_size: 2
-- eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 32
-- total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.03
-- num_epochs: 1
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.9453        | 0.38  | 50   | 0.9288          |
-| 0.8869        | 0.76  | 100  | 0.8933          |
 ### Framework versions

 This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the generator dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.9585
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
+- train_batch_size: 4
+- eval_batch_size: 2
 - seed: 42
 - gradient_accumulation_steps: 32
+- total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.03
+- num_epochs: 2
 - mixed_precision_training: Native AMP
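
The `total_train_batch_size` values in this diff follow from the per-device batch size and the gradient accumulation steps. The sketch below reproduces that arithmetic for both sides of the diff. The helper name `effective_batch_size` is hypothetical, and `num_devices=1` is an assumption: the card lists no device count, and a single device is what makes the listed numbers agree.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples that contribute to one optimizer step.

    num_devices=1 is an assumption; the model card does not state it.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values copied from the hyperparameter lists in the diff.
old_total = effective_batch_size(per_device_batch_size=2, gradient_accumulation_steps=32)
new_total = effective_batch_size(per_device_batch_size=4, gradient_accumulation_steps=32)

print(old_total, new_total)
```

Under that assumption, doubling `train_batch_size` from 2 to 4 while keeping `gradient_accumulation_steps` at 32 doubles the number of samples per optimizer step.
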
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 1.1281        | 0.18  | 20   | 1.0894          |
+| 1.0534        | 0.36  | 40   | 1.0328          |
+| 1.0235        | 0.54  | 60   | 1.0056          |
+| 1.0012        | 0.72  | 80   | 0.9886          |
+| 0.9931        | 0.9   | 100  | 0.9764          |
+| 0.9241        | 1.08  | 120  | 0.9711          |
+| 0.8974        | 1.26  | 140  | 0.9663          |
+| 0.8971        | 1.44  | 160  | 0.9624          |
+| 0.8978        | 1.62  | 180  | 0.9598          |
+| 0.8786        | 1.8   | 200  | 0.9588          |
+| 0.8886        | 1.98  | 220  | 0.9585          |
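
The Epoch and Step columns above allow a rough estimate of how many optimizer steps make up one epoch. The sketch below is only an estimate: the epoch values are rounded to two decimals in the table, and the samples-per-epoch figure assumes every step consumed the full `total_train_batch_size` of 128.

```python
# (step, epoch) pairs copied from the new training-results table.
rows = [
    (20, 0.18), (40, 0.36), (60, 0.54), (80, 0.72), (100, 0.9),
    (120, 1.08), (140, 1.26), (160, 1.44), (180, 1.62), (200, 1.8),
    (220, 1.98),
]

# Each row gives its own estimate of steps per epoch; average them.
ratios = [step / epoch for step, epoch in rows]
steps_per_epoch = sum(ratios) / len(ratios)

# Assumption: each optimizer step processed total_train_batch_size samples.
total_train_batch_size = 128
samples_per_epoch = steps_per_epoch * total_train_batch_size

print(f"~{steps_per_epoch:.0f} steps per epoch, ~{samples_per_epoch:.0f} samples per epoch")
```

The last logged row (step 220, epoch 1.98) sits just short of the configured `num_epochs: 2`, which fits this per-epoch step estimate.
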
 ### Framework versions