alexander-hm committed
Commit 8649ab2 · verified · 1 parent: 69e4edc

End of training

Files changed (7)
  1. README.md +113 -0
  2. all_results.json +12 -0
  3. completed +0 -0
  4. eval_results.json +7 -0
  5. metrics.json +1 -0
  6. train_results.json +8 -0
  7. trainer_state.json +0 -0
README.md ADDED
@@ -0,0 +1,113 @@
+ ---
+ base_model: huggyllama/llama-7b
+ library_name: peft
+ license: other
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: llama-7b_alpaca_l0.0002_64
+ results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # llama-7b_alpaca_l0.0002_64
+
+ This model is a fine-tuned version of [huggyllama/llama-7b](https://huggingface.co/huggyllama/llama-7b) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.8634
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0002
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 0
+ - gradient_accumulation_steps: 16
+ - total_train_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: constant
+ - lr_scheduler_warmup_ratio: 0.03
+ - training_steps: 0
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:------:|:----:|:---------------:|
+ | 1.6644 | 0.0003 | 1 | 2.9705 |
+ | 2.6972 | 0.0587 | 187 | 1.9034 |
+ | 1.8541 | 0.1174 | 374 | 1.8795 |
+ | 1.2478 | 0.1761 | 561 | 1.8640 |
+ | 1.8178 | 0.2348 | 748 | 1.8499 |
+ | 2.5147 | 0.2935 | 935 | 1.8168 |
+ | 1.4811 | 0.3522 | 1122 | 1.8192 |
+ | 1.5617 | 0.4108 | 1309 | 1.8117 |
+ | 2.6811 | 0.4695 | 1496 | 1.8032 |
+ | 1.9665 | 0.5282 | 1683 | 1.7964 |
+ | 1.5309 | 0.5869 | 1870 | 1.7978 |
+ | 1.5708 | 0.6456 | 2057 | 1.8058 |
+ | 2.2185 | 0.7043 | 2244 | 1.7870 |
+ | 2.6575 | 0.7630 | 2431 | 1.7815 |
+ | 1.6179 | 0.8217 | 2618 | 1.7880 |
+ | 1.4937 | 0.8804 | 2805 | 1.7864 |
+ | 2.2623 | 0.9391 | 2992 | 1.7782 |
+ | 2.4709 | 0.9978 | 3179 | 1.7754 |
+ | 1.9605 | 1.0565 | 3366 | 1.7927 |
+ | 1.5008 | 1.1151 | 3553 | 1.7958 |
+ | 1.4912 | 1.1738 | 3740 | 1.8099 |
+ | 1.9176 | 1.2325 | 3927 | 1.8007 |
+ | 1.5569 | 1.2912 | 4114 | 1.7962 |
+ | 1.3717 | 1.3499 | 4301 | 1.8071 |
+ | 1.5241 | 1.4086 | 4488 | 1.8020 |
+ | 2.1042 | 1.4673 | 4675 | 1.7964 |
+ | 1.6643 | 1.5260 | 4862 | 1.7947 |
+ | 1.3815 | 1.5847 | 5049 | 1.7994 |
+ | 2.5619 | 1.6434 | 5236 | 1.7989 |
+ | 1.7651 | 1.7021 | 5423 | 1.7948 |
+ | 1.4931 | 1.7608 | 5610 | 1.7908 |
+ | 1.5089 | 1.8195 | 5797 | 1.7957 |
+ | 1.768 | 1.8781 | 5984 | 1.7989 |
+ | 1.769 | 1.9368 | 6171 | 1.7915 |
+ | 1.5345 | 1.9955 | 6358 | 1.7887 |
+ | 1.2575 | 2.0542 | 6545 | 1.8514 |
+ | 1.1761 | 2.1129 | 6732 | 1.8809 |
+ | 1.4524 | 2.1716 | 6919 | 1.8932 |
+ | 1.5745 | 2.2303 | 7106 | 1.8655 |
+ | 1.1251 | 2.2890 | 7293 | 1.8609 |
+ | 1.2381 | 2.3477 | 7480 | 1.8901 |
+ | 1.7963 | 2.4064 | 7667 | 1.8743 |
+ | 1.4293 | 2.4651 | 7854 | 1.8580 |
+ | 1.3278 | 2.5238 | 8041 | 1.8687 |
+ | 1.2364 | 2.5824 | 8228 | 1.9165 |
+ | 1.5239 | 2.6411 | 8415 | 1.8834 |
+ | 1.3108 | 2.6998 | 8602 | 1.8617 |
+ | 1.2084 | 2.7585 | 8789 | 1.8702 |
+ | 1.3279 | 2.8172 | 8976 | 1.8786 |
+ | 1.7506 | 2.8759 | 9163 | 1.8734 |
+ | 1.4208 | 2.9346 | 9350 | 1.8601 |
+ | 1.2449 | 2.9933 | 9537 | 1.8668 |
+
+
+ ### Framework versions
+
+ - PEFT 0.12.1.dev0
+ - Transformers 4.45.0.dev0
+ - Pytorch 2.3.0+cu121
+ - Datasets 2.19.0
+ - Tokenizers 0.19.1
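
The card's usage sections are still placeholders, but the metadata above (`library_name: peft`, `base_model: huggyllama/llama-7b`) is enough to load the adapter; note also that the effective batch size of 16 is just `train_batch_size` (1) × `gradient_accumulation_steps` (16). A minimal loading sketch follows, assuming the adapter is published at `alexander-hm/llama-7b_alpaca_l0.0002_64` (inferred from the committer and model name, not confirmed by the card) and that the "alpaca" in the name implies the standard Alpaca prompt template:

```python
# Minimal sketch, not the author's verified usage code.
# Assumptions: the adapter repo id below (inferred from committer + model name)
# and the standard Alpaca prompt format (suggested by "alpaca" in the name).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_id = "huggyllama/llama-7b"                         # from base_model in the card
adapter_id = "alexander-hm/llama-7b_alpaca_l0.0002_64"  # assumption: actual repo path

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_id)     # attach the trained adapter
model.eval()

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nName three primary colors.\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```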
all_results.json ADDED
@@ -0,0 +1,12 @@
+ {
+ "epoch": 2.9998823021695635,
+ "eval_loss": 1.8633852005004883,
+ "eval_runtime": 122.5195,
+ "eval_samples_per_second": 8.162,
+ "eval_steps_per_second": 8.162,
+ "total_flos": 5.173921474210529e+17,
+ "train_loss": 1.6460282540416138,
+ "train_runtime": 173208.0103,
+ "train_samples_per_second": 0.883,
+ "train_steps_per_second": 0.055
+ }
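
The summary above is internally consistent: 173208 s of training at 0.883 samples/s comes to roughly 48 hours over ~153k samples, and `eval_samples_per_second` equals `eval_steps_per_second` because `eval_batch_size` is 1. A small sketch deriving these figures from the file:

```python
# Minimal sketch: derive human-readable figures from all_results.json.
import json

with open("all_results.json") as f:
    r = json.load(f)

hours = r["train_runtime"] / 3600                             # 173208 s -> ~48.1 h
samples = r["train_samples_per_second"] * r["train_runtime"]  # ~153k samples
print(f"train loss {r['train_loss']:.4f} | eval loss {r['eval_loss']:.4f}")
print(f"wall clock {hours:.1f} h | ~{samples:,.0f} training samples processed")
```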
completed ADDED
File without changes
eval_results.json ADDED
@@ -0,0 +1,7 @@
+ {
+ "epoch": 2.9998823021695635,
+ "eval_loss": 1.8633852005004883,
+ "eval_runtime": 122.5195,
+ "eval_samples_per_second": 8.162,
+ "eval_steps_per_second": 8.162
+ }
metrics.json ADDED
@@ -0,0 +1 @@
+ {"run_name": "huggyllama/llama-7b_alpaca_l0.0002_64", "train_runtime": 173208.0103, "train_samples_per_second": 0.883, "train_steps_per_second": 0.055, "total_flos": 5.173921474210529e+17, "train_loss": 1.6460282540416138, "epoch": 2.9998823021695635, "eval_loss": 1.8633852005004883, "eval_runtime": 122.5195, "eval_samples_per_second": 8.162, "eval_steps_per_second": 8.162}
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+ "epoch": 2.9998823021695635,
+ "total_flos": 5.173921474210529e+17,
+ "train_loss": 1.6460282540416138,
+ "train_runtime": 173208.0103,
+ "train_samples_per_second": 0.883,
+ "train_steps_per_second": 0.055
+ }
trainer_state.json ADDED
The diff for this file is too large to render.