bilkultheek committed on
Commit
c9388ec
·
verified ·
1 Parent(s): ebff4ba

End of training

README.md CHANGED
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [ahxt/LiteLlama-460M-1T](https://huggingface.co/ahxt/LiteLlama-460M-1T) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.3021
+ - Loss: 2.0471
 
  ## Model description
 
@@ -46,18 +46,24 @@ The following hyperparameters were used during training:
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_ratio: 0.03
- - num_epochs: 5
+ - num_epochs: 10
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 4.1436 | 0.8 | 25 | 3.8815 |
- | 3.6028 | 1.6 | 50 | 3.2639 |
- | 2.9395 | 2.4 | 75 | 2.5905 |
- | 2.4548 | 3.2 | 100 | 2.3582 |
- | 2.337 | 4.0 | 125 | 2.3102 |
- | 2.3125 | 4.8 | 150 | 2.3024 |
+ | 4.1747 | 0.8 | 25 | 3.9257 |
+ | 3.626 | 1.6 | 50 | 3.2474 |
+ | 2.8441 | 2.4 | 75 | 2.4490 |
+ | 2.3365 | 3.2 | 100 | 2.2482 |
+ | 2.2153 | 4.0 | 125 | 2.1758 |
+ | 2.1591 | 4.8 | 150 | 2.1316 |
+ | 2.1214 | 5.6 | 175 | 2.1011 |
+ | 2.0946 | 6.4 | 200 | 2.0781 |
+ | 2.0818 | 7.2 | 225 | 2.0622 |
+ | 2.0614 | 8.0 | 250 | 2.0528 |
+ | 2.0571 | 8.8 | 275 | 2.0485 |
+ | 2.0522 | 9.6 | 300 | 2.0471 |
 
 
  ### Framework versions
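
As a rough aid for comparing the two evaluation losses in this README.md change, the sketch below converts each to perplexity and reports the relative change. It assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language-model fine-tuning but is not stated in the model card. The function names are illustrative and not part of the repository.

```python
import math

# Final evaluation losses copied from the README.md diff.
OLD_EVAL_LOSS = 2.3021  # reported with num_epochs: 5
NEW_EVAL_LOSS = 2.0471  # reported with num_epochs: 10


def perplexity(loss: float) -> float:
    """Perplexity, ASSUMING `loss` is mean cross-entropy in nats per token."""
    return math.exp(loss)


def relative_change(old: float, new: float) -> float:
    """Signed fractional change from `old` to `new` (negative means a decrease)."""
    return (new - old) / old


if __name__ == "__main__":
    print(f"perplexity before: {perplexity(OLD_EVAL_LOSS):.3f}")
    print(f"perplexity after:  {perplexity(NEW_EVAL_LOSS):.3f}")
    print(f"relative loss change: {relative_change(OLD_EVAL_LOSS, NEW_EVAL_LOSS):+.1%}")
```

If the loss were reported in a different unit (for example bits per token), the exponent base would need to change accordingly.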
runs/Aug03_09-18-38_fastgpuserv/events.out.tfevents.1722690531.fastgpuserv.2914688.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:01f6462e33b14e20a451c94459665f31bb87ad7d037deac0f025e8b117299d64
+ size 359
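
The added file is a Git LFS pointer rather than the TensorBoard event data itself: three `key value` lines giving the spec version, the content hash, and the byte size of the real object. A minimal sketch of reading such a pointer follows. The helper name `parse_lfs_pointer` is hypothetical, and the parser only handles the three keys seen in this commit.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file.

    Illustrative helper only: it expects the `version`, `oid`, and `size`
    keys shown in this commit and raises KeyError if one is missing.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size in bytes of the real object
    }


# Pointer contents copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:01f6462e33b14e20a451c94459665f31bb87ad7d037deac0f025e8b117299d64
size 359
"""

if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["hash_algo"], info["size"])
```

The pointer implies the tracked event file is 359 bytes. Fetching the actual content requires Git LFS, which this sketch does not attempt.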