Sorour/cls_fomc_phi3_v1

Generated from Trainer


Sorour committed on May 25, 2024

Commit ad662de · verified · 1 Parent(s): d189e55

Model save

Files changed (1)

README.md +8 -11

README.md CHANGED

@@ -20,7 +20,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6130
 ## Model description
@@ -40,11 +40,11 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 1
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
 - lr_scheduler_warmup_ratio: 0.03
@@ -55,18 +55,15 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 0.6352        | 0.3089 | 20   | 0.6405          |
-| 0.6186        | 0.6178 | 40   | 0.6192          |
-| 0.6122        | 0.9266 | 60   | 0.6108          |
-| 0.5154        | 1.2355 | 80   | 0.6209          |
-| 0.5187        | 1.5444 | 100  | 0.6251          |
-| 0.515         | 1.8533 | 120  | 0.6130          |
 ### Framework versions
 - PEFT 0.11.1
-- Transformers 4.41.0
-- Pytorch 2.2.1+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1

 This model is a fine-tuned version of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) on the generator dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.6211
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
+- train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
+- total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: constant
 - lr_scheduler_warmup_ratio: 0.03
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
+| 0.639         | 0.6299 | 20   | 0.6435          |
+| 0.5796        | 1.2598 | 40   | 0.6208          |
+| 0.5579        | 1.8898 | 60   | 0.6211          |
 ### Framework versions
 - PEFT 0.11.1
+- Transformers 4.41.1
+- Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1
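
The relationship between the changed hyperparameters and the shorter results table can be checked with a little arithmetic. The sketch below is illustrative only: the helper names `effective_batch_size` and `steps_per_epoch` are hypothetical, and it assumes a single training device, since the card lists no device count. It multiplies the per-step batch size by the gradient accumulation steps to recover `total_train_batch_size`, then estimates optimizer steps per epoch from the first logged row of each table.

```python
def effective_batch_size(train_batch_size, grad_accum_steps, num_devices=1):
    """Samples consumed per optimizer step.

    num_devices=1 is an assumption; the model card does not list a device count.
    """
    return train_batch_size * grad_accum_steps * num_devices


def steps_per_epoch(step, epoch):
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return step / epoch


# Values taken from the removed (-) and added (+) hyperparameter lines.
old_total = effective_batch_size(train_batch_size=1, grad_accum_steps=4)
new_total = effective_batch_size(train_batch_size=2, grad_accum_steps=4)

# First row of each results table: step 20 at epoch 0.3089 (old) and 0.6299 (new).
old_steps = steps_per_epoch(20, 0.3089)
new_steps = steps_per_epoch(20, 0.6299)

print(f"total_train_batch_size: {old_total} -> {new_total}")
print(f"approx. steps per epoch: {old_steps:.1f} -> {new_steps:.1f}")
```

Under these assumptions, doubling `train_batch_size` doubles the effective batch size and roughly halves the optimizer steps per epoch, which is consistent with the new table ending at step 60 rather than step 120 for a similar number of epochs.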