thorirhrafn/gpt_icesum

Text Generation

Generated from Trainer

text-generation-inference

Inference Endpoints

thorirhrafn committed on Mar 25, 2024

Commit f21e813 (verified) · 1 parent: 2dbaa6b

End of training

Files changed (1)

README.md +12 -16

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [AI-Sweden-Models/gpt-sw3-1.3b](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.7824
+- Loss: 1.7838
 ## Model description
@@ -35,31 +35,27 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
+- learning_rate: 5e-05
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 3
+- num_epochs: 2
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.9553        | 0.22  | 50   | 1.8683          |
-| 1.9425        | 0.44  | 100  | 1.8240          |
-| 1.8376        | 0.67  | 150  | 1.8040          |
-| 2.0224        | 0.89  | 200  | 1.7953          |
-| 1.8172        | 1.11  | 250  | 1.7903          |
-| 1.9457        | 1.33  | 300  | 1.7875          |
-| 1.8177        | 1.56  | 350  | 1.7853          |
-| 1.82          | 1.78  | 400  | 1.7837          |
-| 1.9207        | 2.0   | 450  | 1.7830          |
-| 1.7946        | 2.22  | 500  | 1.7832          |
-| 1.8675        | 2.44  | 550  | 1.7828          |
-| 1.8384        | 2.67  | 600  | 1.7826          |
-| 1.9814        | 2.89  | 650  | 1.7824          |
+| 1.9006        | 0.22  | 50   | 1.8021          |
+| 1.907         | 0.44  | 100  | 1.7894          |
+| 1.815         | 0.67  | 150  | 1.7845          |
+| 2.0118        | 0.89  | 200  | 1.7850          |
+| 1.7555        | 1.11  | 250  | 1.7863          |
+| 1.8844        | 1.33  | 300  | 1.7857          |
+| 1.7689        | 1.56  | 350  | 1.7851          |
+| 1.7703        | 1.78  | 400  | 1.7838          |
+| 1.8758        | 2.0   | 450  | 1.7838          |
 ### Framework versions
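
To put the two runs in this commit side by side, the numbers in the diff can be turned into a few derived quantities. The sketch below is illustrative only and is not part of the repository: the `runs` dictionary and the helper names are hypothetical, the perplexity figure assumes the reported loss is a mean token-level cross-entropy in nats, and the schedule helper assumes linear decay to zero with no warmup, since the card lists no warmup setting.

```python
import math

# Values copied from the README diff; the dictionary layout is hypothetical.
runs = {
    "before": {"learning_rate": 2e-05, "num_epochs": 3,
               "last_step": 650, "last_epoch": 2.89, "eval_loss": 1.7824},
    "after": {"learning_rate": 5e-05, "num_epochs": 2,
              "last_step": 450, "last_epoch": 2.0, "eval_loss": 1.7838},
}


def perplexity(loss: float) -> float:
    """Perplexity, assuming the loss is mean cross-entropy in nats."""
    return math.exp(loss)


def steps_per_epoch(step: int, epoch: float) -> int:
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return round(step / epoch)


def linear_lr(base_lr: float, step: int, total_steps: int) -> float:
    """Linear decay to zero, assuming no warmup (warmup is not listed in the card)."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)


for name, run in runs.items():
    per_epoch = steps_per_epoch(run["last_step"], run["last_epoch"])
    total = per_epoch * run["num_epochs"]  # assumes every epoch has the same length
    print(
        f"{name}: ~{per_epoch} steps/epoch, ~{total} total steps, "
        f"perplexity ~{perplexity(run['eval_loss']):.2f}, "
        f"lr at step 200 ~{linear_lr(run['learning_rate'], 200, total):.2e}"
    )
```

Under these assumptions both runs work out to about 225 steps per epoch, and the final evaluation losses differ by only 0.0014, so the shorter run with the higher learning rate ends at nearly the same validation loss as the longer one.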