thorirhrafn committed on
Commit f21e813 · verified · 1 Parent(s): 2dbaa6b

End of training

Files changed (1)
  1. README.md +12 -16
README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [AI-Sweden-Models/gpt-sw3-1.3b](https://huggingface.co/AI-Sweden-Models/gpt-sw3-1.3b) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.7824
+ - Loss: 1.7838

  ## Model description

@@ -35,31 +35,27 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 2e-05
+ - learning_rate: 5e-05
  - train_batch_size: 4
  - eval_batch_size: 4
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 3
+ - num_epochs: 2

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 1.9553 | 0.22 | 50 | 1.8683 |
- | 1.9425 | 0.44 | 100 | 1.8240 |
- | 1.8376 | 0.67 | 150 | 1.8040 |
- | 2.0224 | 0.89 | 200 | 1.7953 |
- | 1.8172 | 1.11 | 250 | 1.7903 |
- | 1.9457 | 1.33 | 300 | 1.7875 |
- | 1.8177 | 1.56 | 350 | 1.7853 |
- | 1.82 | 1.78 | 400 | 1.7837 |
- | 1.9207 | 2.0 | 450 | 1.7830 |
- | 1.7946 | 2.22 | 500 | 1.7832 |
- | 1.8675 | 2.44 | 550 | 1.7828 |
- | 1.8384 | 2.67 | 600 | 1.7826 |
- | 1.9814 | 2.89 | 650 | 1.7824 |
+ | 1.9006 | 0.22 | 50 | 1.8021 |
+ | 1.907 | 0.44 | 100 | 1.7894 |
+ | 1.815 | 0.67 | 150 | 1.7845 |
+ | 2.0118 | 0.89 | 200 | 1.7850 |
+ | 1.7555 | 1.11 | 250 | 1.7863 |
+ | 1.8844 | 1.33 | 300 | 1.7857 |
+ | 1.7689 | 1.56 | 350 | 1.7851 |
+ | 1.7703 | 1.78 | 400 | 1.7838 |
+ | 1.8758 | 2.0 | 450 | 1.7838 |


  ### Framework versions
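
As a quick check on the two runs recorded in this diff, the sketch below copies the validation losses from both "Training results" tables and compares them. The names `old_run`, `new_run`, `summarize`, `first_step_at_or_below`, and `steps_per_epoch` are illustrative and not part of the commit. The steps-per-epoch figure is inferred from the table (step 450 falls at epoch 2.0) rather than stated in the README.

```python
# Validation loss per evaluation step, copied from the two "Training results" tables.
old_run = {  # before this commit: learning_rate 2e-05, num_epochs 3
    50: 1.8683, 100: 1.8240, 150: 1.8040, 200: 1.7953, 250: 1.7903,
    300: 1.7875, 350: 1.7853, 400: 1.7837, 450: 1.7830, 500: 1.7832,
    550: 1.7828, 600: 1.7826, 650: 1.7824,
}
new_run = {  # after this commit: learning_rate 5e-05, num_epochs 2
    50: 1.8021, 100: 1.7894, 150: 1.7845, 200: 1.7850, 250: 1.7863,
    300: 1.7857, 350: 1.7851, 400: 1.7838, 450: 1.7838,
}


def summarize(run):
    """Return (final_step, final_loss, best_step, best_loss) for one run."""
    final_step = max(run)
    # On ties in loss, the earliest step is reported as the best one.
    best_step = min(run, key=lambda step: (run[step], step))
    return final_step, run[final_step], best_step, run[best_step]


def first_step_at_or_below(run, threshold):
    """Earliest evaluation step whose validation loss is <= threshold, else None."""
    for step in sorted(run):
        if run[step] <= threshold:
            return step
    return None


def steps_per_epoch(step, epoch):
    """Optimizer steps per epoch implied by one (step, epoch) table row."""
    return step / epoch


for name, run in (("old", old_run), ("new", new_run)):
    final_step, final_loss, best_step, best_loss = summarize(run)
    print(f"{name}: final {final_loss} at step {final_step}, "
          f"best {best_loss} at step {best_step}")

# Step 450 is logged at epoch 2.0 in both tables.
print("steps per epoch:", steps_per_epoch(450, 2.0))
```

On these numbers, the 5e-05 run reaches a validation loss of 1.7845 by step 150, a level the 2e-05 run first matches at step 400, but its final loss (1.7838) is slightly higher than the 2e-05 run's final 1.7824.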