Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- final-model.pt +3 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697294459.c8b2203b18a8.2923.14 +3 -0
- test.tsv +0 -0
- training.log +260 -0
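For context, a commit like this is typically produced with the upload_folder API of huggingface_hub. A minimal sketch, assuming hypothetical local and repository paths that are not taken from this commit:

```python
from huggingface_hub import HfApi

# Sketch of the kind of call that produces a commit like this one.
# folder_path and repo_id below are hypothetical placeholders.
api = HfApi()
api.upload_folder(
    folder_path="path/to/training-output",      # local dir with best-model.pt, loss.tsv, training.log, ...
    repo_id="user/hmbench-letemps-fr-hmbyt5",   # hypothetical target repository
    repo_type="model",
    commit_message="Upload folder using huggingface_hub",
)
```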
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1f55a50a185754f70852d874145cdc5f660d11d1c76fbdf64691bffe2ce4f0e8
+size 870793839
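The .pt files in this commit are Git LFS pointers; the checkpoint itself is downloaded on demand. A minimal sketch of fetching and loading the best model with huggingface_hub and Flair, assuming a hypothetical repo id:

```python
from huggingface_hub import hf_hub_download
from flair.data import Sentence
from flair.models import SequenceTagger

# repo_id is a hypothetical placeholder; the actual repository id is not shown in this diff.
model_path = hf_hub_download(
    repo_id="user/hmbench-letemps-fr-hmbyt5",
    filename="best-model.pt",
)

tagger = SequenceTagger.load(model_path)            # load the Flair SequenceTagger checkpoint
sentence = Sentence("Le Temps est un journal suisse publié à Genève.")
tagger.predict(sentence)
print(sentence.get_spans(tagger.label_type))        # predicted loc/pers/org spans
```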
dev.tsv
ADDED
The diff for this file is too large to render.
final-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:676d28b8f4620bae348c185612a396ef58cf3b1933a994c24d2d111e9a43d98e
+size 870793956
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH	TIMESTAMP	LEARNING_RATE	TRAIN_LOSS	DEV_LOSS	DEV_PRECISION	DEV_RECALL	DEV_F1	DEV_ACCURACY
+1	14:58:53	0.0001	0.6239	0.1172	0.5034	0.6831	0.5796	0.4207
+2	15:16:51	0.0001	0.0909	0.1110	0.5634	0.6911	0.6208	0.4593
+3	15:34:42	0.0001	0.0648	0.1631	0.5482	0.7025	0.6158	0.4538
+4	15:52:18	0.0001	0.0458	0.2121	0.5599	0.7963	0.6575	0.5014
+5	16:09:49	0.0001	0.0306	0.2789	0.5407	0.7071	0.6128	0.4504
+6	16:27:20	0.0001	0.0212	0.3139	0.5503	0.7323	0.6284	0.4672
+7	16:45:30	0.0001	0.0158	0.3258	0.5606	0.7094	0.6263	0.4665
+8	17:03:08	0.0000	0.0097	0.3519	0.5533	0.7540	0.6383	0.4779
+9	17:21:13	0.0000	0.0072	0.3742	0.5560	0.7609	0.6425	0.4826
+10	17:38:48	0.0000	0.0045	0.3830	0.5539	0.7586	0.6403	0.4808
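A small sketch for inspecting these per-epoch metrics, assuming pandas and that loss.tsv is tab-separated as written by Flair:

```python
import pandas as pd

# Load the per-epoch metrics and report the epoch with the best dev F1.
df = pd.read_csv("loss.tsv", sep="\t")
best = df.loc[df["DEV_F1"].idxmax()]
print(f"Best dev F1 {best['DEV_F1']:.4f} at epoch {int(best['EPOCH'])}")  # epoch 4, F1 0.6575
```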
runs/events.out.tfevents.1697294459.c8b2203b18a8.2923.14
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:324afaf79fbee2310838ac9fd03f1e505ce05565c49b95b407b63a64364ab787
+size 2030580
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,260 @@
+2023-10-14 14:40:59,478 ----------------------------------------------------------------------------------------------------
+2023-10-14 14:40:59,480 Model: "SequenceTagger(
+(embeddings): ByT5Embeddings(
+(model): T5EncoderModel(
+(shared): Embedding(384, 1472)
+(encoder): T5Stack(
+(embed_tokens): Embedding(384, 1472)
+(block): ModuleList(
+(0): T5Block(
+(layer): ModuleList(
+(0): T5LayerSelfAttention(
+(SelfAttention): T5Attention(
+(q): Linear(in_features=1472, out_features=384, bias=False)
+(k): Linear(in_features=1472, out_features=384, bias=False)
+(v): Linear(in_features=1472, out_features=384, bias=False)
+(o): Linear(in_features=384, out_features=1472, bias=False)
+(relative_attention_bias): Embedding(32, 6)
+)
+(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+(dropout): Dropout(p=0.1, inplace=False)
+)
+(1): T5LayerFF(
+(DenseReluDense): T5DenseGatedActDense(
+(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+(wo): Linear(in_features=3584, out_features=1472, bias=False)
+(dropout): Dropout(p=0.1, inplace=False)
+(act): NewGELUActivation()
+)
+(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+(dropout): Dropout(p=0.1, inplace=False)
+)
+)
+)
+(1-11): 11 x T5Block(
+(layer): ModuleList(
+(0): T5LayerSelfAttention(
+(SelfAttention): T5Attention(
+(q): Linear(in_features=1472, out_features=384, bias=False)
+(k): Linear(in_features=1472, out_features=384, bias=False)
+(v): Linear(in_features=1472, out_features=384, bias=False)
+(o): Linear(in_features=384, out_features=1472, bias=False)
+)
+(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+(dropout): Dropout(p=0.1, inplace=False)
+)
+(1): T5LayerFF(
+(DenseReluDense): T5DenseGatedActDense(
+(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
+(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
+(wo): Linear(in_features=3584, out_features=1472, bias=False)
+(dropout): Dropout(p=0.1, inplace=False)
+(act): NewGELUActivation()
+)
+(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+(dropout): Dropout(p=0.1, inplace=False)
+)
+)
+)
+)
+(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
+(dropout): Dropout(p=0.1, inplace=False)
+)
+)
+)
+(locked_dropout): LockedDropout(p=0.5)
+(linear): Linear(in_features=1472, out_features=13, bias=True)
+(loss_function): CrossEntropyLoss()
+)"
+2023-10-14 14:40:59,481 ----------------------------------------------------------------------------------------------------
+2023-10-14 14:40:59,481 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
+ - NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
+2023-10-14 14:40:59,481 ----------------------------------------------------------------------------------------------------
+2023-10-14 14:40:59,481 Train: 14465 sentences
+2023-10-14 14:40:59,481 (train_with_dev=False, train_with_test=False)
+2023-10-14 14:40:59,481 ----------------------------------------------------------------------------------------------------
+2023-10-14 14:40:59,481 Training Params:
+2023-10-14 14:40:59,481 - learning_rate: "0.00015"
+2023-10-14 14:40:59,481 - mini_batch_size: "4"
+2023-10-14 14:40:59,481 - max_epochs: "10"
+2023-10-14 14:40:59,481 - shuffle: "True"
+2023-10-14 14:40:59,481 ----------------------------------------------------------------------------------------------------
+2023-10-14 14:40:59,481 Plugins:
+2023-10-14 14:40:59,481 - TensorboardLogger
+2023-10-14 14:40:59,482 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-14 14:40:59,482 ----------------------------------------------------------------------------------------------------
+2023-10-14 14:40:59,482 Final evaluation on model from best epoch (best-model.pt)
+2023-10-14 14:40:59,482 - metric: "('micro avg', 'f1-score')"
+2023-10-14 14:40:59,482 ----------------------------------------------------------------------------------------------------
+2023-10-14 14:40:59,482 Computation:
+2023-10-14 14:40:59,482 - compute on device: cuda:0
+2023-10-14 14:40:59,482 - embedding storage: none
+2023-10-14 14:40:59,482 ----------------------------------------------------------------------------------------------------
+2023-10-14 14:40:59,482 Model training base path: "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-4"
+2023-10-14 14:40:59,482 ----------------------------------------------------------------------------------------------------
+2023-10-14 14:40:59,482 ----------------------------------------------------------------------------------------------------
+2023-10-14 14:40:59,482 Logging anything other than scalars to TensorBoard is currently not supported.
+2023-10-14 14:42:41,066 epoch 1 - iter 361/3617 - loss 2.50686900 - time (sec): 101.58 - samples/sec: 360.10 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 14:44:22,406 epoch 1 - iter 722/3617 - loss 2.10752883 - time (sec): 202.92 - samples/sec: 369.39 - lr: 0.000030 - momentum: 0.000000
+2023-10-14 14:46:12,979 epoch 1 - iter 1083/3617 - loss 1.65683609 - time (sec): 313.49 - samples/sec: 363.76 - lr: 0.000045 - momentum: 0.000000
+2023-10-14 14:47:57,110 epoch 1 - iter 1444/3617 - loss 1.31501097 - time (sec): 417.63 - samples/sec: 364.72 - lr: 0.000060 - momentum: 0.000000
+2023-10-14 14:49:44,051 epoch 1 - iter 1805/3617 - loss 1.09620008 - time (sec): 524.57 - samples/sec: 361.95 - lr: 0.000075 - momentum: 0.000000
+2023-10-14 14:51:28,614 epoch 1 - iter 2166/3617 - loss 0.93978591 - time (sec): 629.13 - samples/sec: 364.30 - lr: 0.000090 - momentum: 0.000000
+2023-10-14 14:53:09,495 epoch 1 - iter 2527/3617 - loss 0.82889421 - time (sec): 730.01 - samples/sec: 366.34 - lr: 0.000105 - momentum: 0.000000
+2023-10-14 14:54:50,772 epoch 1 - iter 2888/3617 - loss 0.74534435 - time (sec): 831.29 - samples/sec: 367.53 - lr: 0.000120 - momentum: 0.000000
+2023-10-14 14:56:30,263 epoch 1 - iter 3249/3617 - loss 0.67952585 - time (sec): 930.78 - samples/sec: 367.24 - lr: 0.000135 - momentum: 0.000000
+2023-10-14 14:58:13,235 epoch 1 - iter 3610/3617 - loss 0.62441070 - time (sec): 1033.75 - samples/sec: 366.94 - lr: 0.000150 - momentum: 0.000000
+2023-10-14 14:58:14,910 ----------------------------------------------------------------------------------------------------
+2023-10-14 14:58:14,910 EPOCH 1 done: loss 0.6239 - lr: 0.000150
+2023-10-14 14:58:53,380 DEV : loss 0.11717528849840164 - f1-score (micro avg) 0.5796
+2023-10-14 14:58:53,438 saving best model
+2023-10-14 14:58:54,356 ----------------------------------------------------------------------------------------------------
+2023-10-14 15:00:36,235 epoch 2 - iter 361/3617 - loss 0.10878939 - time (sec): 101.88 - samples/sec: 370.35 - lr: 0.000148 - momentum: 0.000000
+2023-10-14 15:02:20,856 epoch 2 - iter 722/3617 - loss 0.10496323 - time (sec): 206.50 - samples/sec: 369.37 - lr: 0.000147 - momentum: 0.000000
+2023-10-14 15:04:10,294 epoch 2 - iter 1083/3617 - loss 0.10512196 - time (sec): 315.94 - samples/sec: 358.08 - lr: 0.000145 - momentum: 0.000000
+2023-10-14 15:05:52,218 epoch 2 - iter 1444/3617 - loss 0.10247324 - time (sec): 417.86 - samples/sec: 360.21 - lr: 0.000143 - momentum: 0.000000
+2023-10-14 15:07:32,739 epoch 2 - iter 1805/3617 - loss 0.10015258 - time (sec): 518.38 - samples/sec: 362.60 - lr: 0.000142 - momentum: 0.000000
+2023-10-14 15:09:12,203 epoch 2 - iter 2166/3617 - loss 0.09832530 - time (sec): 617.84 - samples/sec: 366.60 - lr: 0.000140 - momentum: 0.000000
+2023-10-14 15:10:55,993 epoch 2 - iter 2527/3617 - loss 0.09600180 - time (sec): 721.63 - samples/sec: 368.59 - lr: 0.000138 - momentum: 0.000000
+2023-10-14 15:12:42,577 epoch 2 - iter 2888/3617 - loss 0.09489153 - time (sec): 828.22 - samples/sec: 367.36 - lr: 0.000137 - momentum: 0.000000
+2023-10-14 15:14:25,871 epoch 2 - iter 3249/3617 - loss 0.09188276 - time (sec): 931.51 - samples/sec: 367.52 - lr: 0.000135 - momentum: 0.000000
+2023-10-14 15:16:09,934 epoch 2 - iter 3610/3617 - loss 0.09054350 - time (sec): 1035.58 - samples/sec: 366.37 - lr: 0.000133 - momentum: 0.000000
+2023-10-14 15:16:11,806 ----------------------------------------------------------------------------------------------------
+2023-10-14 15:16:11,807 EPOCH 2 done: loss 0.0909 - lr: 0.000133
+2023-10-14 15:16:51,726 DEV : loss 0.11103517562150955 - f1-score (micro avg) 0.6208
+2023-10-14 15:16:51,794 saving best model
+2023-10-14 15:16:57,345 ----------------------------------------------------------------------------------------------------
+2023-10-14 15:18:47,930 epoch 3 - iter 361/3617 - loss 0.06378214 - time (sec): 110.58 - samples/sec: 346.95 - lr: 0.000132 - momentum: 0.000000
+2023-10-14 15:20:27,753 epoch 3 - iter 722/3617 - loss 0.06540151 - time (sec): 210.40 - samples/sec: 356.29 - lr: 0.000130 - momentum: 0.000000
+2023-10-14 15:22:11,280 epoch 3 - iter 1083/3617 - loss 0.06698159 - time (sec): 313.93 - samples/sec: 359.05 - lr: 0.000128 - momentum: 0.000000
+2023-10-14 15:23:52,034 epoch 3 - iter 1444/3617 - loss 0.06526504 - time (sec): 414.68 - samples/sec: 366.94 - lr: 0.000127 - momentum: 0.000000
+2023-10-14 15:25:35,189 epoch 3 - iter 1805/3617 - loss 0.06525753 - time (sec): 517.84 - samples/sec: 368.07 - lr: 0.000125 - momentum: 0.000000
+2023-10-14 15:27:21,741 epoch 3 - iter 2166/3617 - loss 0.06493734 - time (sec): 624.39 - samples/sec: 367.06 - lr: 0.000123 - momentum: 0.000000
+2023-10-14 15:29:02,837 epoch 3 - iter 2527/3617 - loss 0.06477906 - time (sec): 725.49 - samples/sec: 368.41 - lr: 0.000122 - momentum: 0.000000
+2023-10-14 15:30:43,338 epoch 3 - iter 2888/3617 - loss 0.06486032 - time (sec): 825.99 - samples/sec: 367.73 - lr: 0.000120 - momentum: 0.000000
+2023-10-14 15:32:21,513 epoch 3 - iter 3249/3617 - loss 0.06429360 - time (sec): 924.16 - samples/sec: 369.65 - lr: 0.000118 - momentum: 0.000000
+2023-10-14 15:34:00,097 epoch 3 - iter 3610/3617 - loss 0.06473516 - time (sec): 1022.75 - samples/sec: 370.86 - lr: 0.000117 - momentum: 0.000000
+2023-10-14 15:34:01,771 ----------------------------------------------------------------------------------------------------
+2023-10-14 15:34:01,772 EPOCH 3 done: loss 0.0648 - lr: 0.000117
+2023-10-14 15:34:42,878 DEV : loss 0.1630961298942566 - f1-score (micro avg) 0.6158
+2023-10-14 15:34:42,948 ----------------------------------------------------------------------------------------------------
+2023-10-14 15:36:23,601 epoch 4 - iter 361/3617 - loss 0.04594123 - time (sec): 100.65 - samples/sec: 363.48 - lr: 0.000115 - momentum: 0.000000
+2023-10-14 15:38:03,903 epoch 4 - iter 722/3617 - loss 0.04874699 - time (sec): 200.95 - samples/sec: 370.90 - lr: 0.000113 - momentum: 0.000000
+2023-10-14 15:39:47,633 epoch 4 - iter 1083/3617 - loss 0.04891474 - time (sec): 304.68 - samples/sec: 372.99 - lr: 0.000112 - momentum: 0.000000
+2023-10-14 15:41:28,223 epoch 4 - iter 1444/3617 - loss 0.04690533 - time (sec): 405.27 - samples/sec: 370.54 - lr: 0.000110 - momentum: 0.000000
+2023-10-14 15:43:08,244 epoch 4 - iter 1805/3617 - loss 0.04706884 - time (sec): 505.29 - samples/sec: 371.67 - lr: 0.000108 - momentum: 0.000000
+2023-10-14 15:44:53,122 epoch 4 - iter 2166/3617 - loss 0.04732382 - time (sec): 610.17 - samples/sec: 371.22 - lr: 0.000107 - momentum: 0.000000
+2023-10-14 15:46:34,153 epoch 4 - iter 2527/3617 - loss 0.04755205 - time (sec): 711.20 - samples/sec: 372.43 - lr: 0.000105 - momentum: 0.000000
+2023-10-14 15:48:16,775 epoch 4 - iter 2888/3617 - loss 0.04692890 - time (sec): 813.82 - samples/sec: 373.54 - lr: 0.000103 - momentum: 0.000000
+2023-10-14 15:49:58,502 epoch 4 - iter 3249/3617 - loss 0.04641577 - time (sec): 915.55 - samples/sec: 372.42 - lr: 0.000102 - momentum: 0.000000
+2023-10-14 15:51:37,654 epoch 4 - iter 3610/3617 - loss 0.04580765 - time (sec): 1014.70 - samples/sec: 373.81 - lr: 0.000100 - momentum: 0.000000
+2023-10-14 15:51:39,377 ----------------------------------------------------------------------------------------------------
+2023-10-14 15:51:39,378 EPOCH 4 done: loss 0.0458 - lr: 0.000100
+2023-10-14 15:52:18,475 DEV : loss 0.21207064390182495 - f1-score (micro avg) 0.6575
+2023-10-14 15:52:18,532 saving best model
+2023-10-14 15:52:21,265 ----------------------------------------------------------------------------------------------------
+2023-10-14 15:53:59,089 epoch 5 - iter 361/3617 - loss 0.02339736 - time (sec): 97.82 - samples/sec: 384.30 - lr: 0.000098 - momentum: 0.000000
+2023-10-14 15:55:43,726 epoch 5 - iter 722/3617 - loss 0.02553085 - time (sec): 202.46 - samples/sec: 387.78 - lr: 0.000097 - momentum: 0.000000
+2023-10-14 15:57:28,602 epoch 5 - iter 1083/3617 - loss 0.02625521 - time (sec): 307.33 - samples/sec: 381.45 - lr: 0.000095 - momentum: 0.000000
+2023-10-14 15:59:07,613 epoch 5 - iter 1444/3617 - loss 0.02682998 - time (sec): 406.34 - samples/sec: 378.67 - lr: 0.000093 - momentum: 0.000000
+2023-10-14 16:00:54,191 epoch 5 - iter 1805/3617 - loss 0.02777093 - time (sec): 512.92 - samples/sec: 371.65 - lr: 0.000092 - momentum: 0.000000
+2023-10-14 16:02:34,896 epoch 5 - iter 2166/3617 - loss 0.02927279 - time (sec): 613.63 - samples/sec: 373.11 - lr: 0.000090 - momentum: 0.000000
+2023-10-14 16:04:12,788 epoch 5 - iter 2527/3617 - loss 0.02908364 - time (sec): 711.52 - samples/sec: 374.65 - lr: 0.000088 - momentum: 0.000000
+2023-10-14 16:05:50,506 epoch 5 - iter 2888/3617 - loss 0.02954096 - time (sec): 809.24 - samples/sec: 375.84 - lr: 0.000087 - momentum: 0.000000
+2023-10-14 16:07:30,269 epoch 5 - iter 3249/3617 - loss 0.03004065 - time (sec): 909.00 - samples/sec: 375.39 - lr: 0.000085 - momentum: 0.000000
+2023-10-14 16:09:09,000 epoch 5 - iter 3610/3617 - loss 0.03067994 - time (sec): 1007.73 - samples/sec: 376.30 - lr: 0.000083 - momentum: 0.000000
+2023-10-14 16:09:10,704 ----------------------------------------------------------------------------------------------------
+2023-10-14 16:09:10,705 EPOCH 5 done: loss 0.0306 - lr: 0.000083
+2023-10-14 16:09:49,418 DEV : loss 0.2788721024990082 - f1-score (micro avg) 0.6128
+2023-10-14 16:09:49,475 ----------------------------------------------------------------------------------------------------
+2023-10-14 16:11:32,473 epoch 6 - iter 361/3617 - loss 0.02077887 - time (sec): 103.00 - samples/sec: 381.48 - lr: 0.000082 - momentum: 0.000000
+2023-10-14 16:13:14,265 epoch 6 - iter 722/3617 - loss 0.02070222 - time (sec): 204.79 - samples/sec: 373.48 - lr: 0.000080 - momentum: 0.000000
+2023-10-14 16:14:54,315 epoch 6 - iter 1083/3617 - loss 0.01942465 - time (sec): 304.84 - samples/sec: 374.09 - lr: 0.000078 - momentum: 0.000000
+2023-10-14 16:16:32,568 epoch 6 - iter 1444/3617 - loss 0.02052837 - time (sec): 403.09 - samples/sec: 375.08 - lr: 0.000077 - momentum: 0.000000
+2023-10-14 16:18:15,025 epoch 6 - iter 1805/3617 - loss 0.02128468 - time (sec): 505.55 - samples/sec: 372.79 - lr: 0.000075 - momentum: 0.000000
+2023-10-14 16:19:57,182 epoch 6 - iter 2166/3617 - loss 0.02145412 - time (sec): 607.70 - samples/sec: 373.59 - lr: 0.000073 - momentum: 0.000000
+2023-10-14 16:21:37,405 epoch 6 - iter 2527/3617 - loss 0.02127559 - time (sec): 707.93 - samples/sec: 373.85 - lr: 0.000072 - momentum: 0.000000
+2023-10-14 16:23:17,748 epoch 6 - iter 2888/3617 - loss 0.02174407 - time (sec): 808.27 - samples/sec: 375.50 - lr: 0.000070 - momentum: 0.000000
+2023-10-14 16:24:58,379 epoch 6 - iter 3249/3617 - loss 0.02165837 - time (sec): 908.90 - samples/sec: 375.13 - lr: 0.000068 - momentum: 0.000000
+2023-10-14 16:26:39,103 epoch 6 - iter 3610/3617 - loss 0.02119253 - time (sec): 1009.63 - samples/sec: 375.80 - lr: 0.000067 - momentum: 0.000000
+2023-10-14 16:26:41,074 ----------------------------------------------------------------------------------------------------
+2023-10-14 16:26:41,074 EPOCH 6 done: loss 0.0212 - lr: 0.000067
+2023-10-14 16:27:20,798 DEV : loss 0.313930869102478 - f1-score (micro avg) 0.6284
+2023-10-14 16:27:20,855 ----------------------------------------------------------------------------------------------------
+2023-10-14 16:29:06,678 epoch 7 - iter 361/3617 - loss 0.01422485 - time (sec): 105.82 - samples/sec: 383.75 - lr: 0.000065 - momentum: 0.000000
+2023-10-14 16:30:54,879 epoch 7 - iter 722/3617 - loss 0.01422789 - time (sec): 214.02 - samples/sec: 363.13 - lr: 0.000063 - momentum: 0.000000
+2023-10-14 16:32:43,802 epoch 7 - iter 1083/3617 - loss 0.01393561 - time (sec): 322.94 - samples/sec: 357.57 - lr: 0.000062 - momentum: 0.000000
+2023-10-14 16:34:23,506 epoch 7 - iter 1444/3617 - loss 0.01434256 - time (sec): 422.65 - samples/sec: 361.08 - lr: 0.000060 - momentum: 0.000000
+2023-10-14 16:36:06,389 epoch 7 - iter 1805/3617 - loss 0.01398391 - time (sec): 525.53 - samples/sec: 361.44 - lr: 0.000058 - momentum: 0.000000
+2023-10-14 16:37:52,259 epoch 7 - iter 2166/3617 - loss 0.01422903 - time (sec): 631.40 - samples/sec: 360.88 - lr: 0.000057 - momentum: 0.000000
+2023-10-14 16:39:35,485 epoch 7 - iter 2527/3617 - loss 0.01414726 - time (sec): 734.63 - samples/sec: 362.33 - lr: 0.000055 - momentum: 0.000000
+2023-10-14 16:41:19,107 epoch 7 - iter 2888/3617 - loss 0.01450497 - time (sec): 838.25 - samples/sec: 364.58 - lr: 0.000053 - momentum: 0.000000
+2023-10-14 16:43:02,248 epoch 7 - iter 3249/3617 - loss 0.01550406 - time (sec): 941.39 - samples/sec: 364.18 - lr: 0.000052 - momentum: 0.000000
+2023-10-14 16:44:48,987 epoch 7 - iter 3610/3617 - loss 0.01571456 - time (sec): 1048.13 - samples/sec: 361.60 - lr: 0.000050 - momentum: 0.000000
+2023-10-14 16:44:51,074 ----------------------------------------------------------------------------------------------------
+2023-10-14 16:44:51,074 EPOCH 7 done: loss 0.0158 - lr: 0.000050
+2023-10-14 16:45:30,018 DEV : loss 0.3257623016834259 - f1-score (micro avg) 0.6263
+2023-10-14 16:45:30,075 ----------------------------------------------------------------------------------------------------
+2023-10-14 16:47:09,244 epoch 8 - iter 361/3617 - loss 0.01142318 - time (sec): 99.17 - samples/sec: 387.29 - lr: 0.000048 - momentum: 0.000000
+2023-10-14 16:48:57,539 epoch 8 - iter 722/3617 - loss 0.01106006 - time (sec): 207.46 - samples/sec: 374.10 - lr: 0.000047 - momentum: 0.000000
+2023-10-14 16:50:47,572 epoch 8 - iter 1083/3617 - loss 0.01148474 - time (sec): 317.49 - samples/sec: 363.91 - lr: 0.000045 - momentum: 0.000000
+2023-10-14 16:52:29,277 epoch 8 - iter 1444/3617 - loss 0.01109820 - time (sec): 419.20 - samples/sec: 363.08 - lr: 0.000043 - momentum: 0.000000
+2023-10-14 16:54:07,215 epoch 8 - iter 1805/3617 - loss 0.01060742 - time (sec): 517.14 - samples/sec: 368.74 - lr: 0.000042 - momentum: 0.000000
+2023-10-14 16:55:45,728 epoch 8 - iter 2166/3617 - loss 0.01013602 - time (sec): 615.65 - samples/sec: 368.66 - lr: 0.000040 - momentum: 0.000000
+2023-10-14 16:57:28,600 epoch 8 - iter 2527/3617 - loss 0.01003184 - time (sec): 718.52 - samples/sec: 370.30 - lr: 0.000038 - momentum: 0.000000
+2023-10-14 16:59:08,971 epoch 8 - iter 2888/3617 - loss 0.01030870 - time (sec): 818.89 - samples/sec: 370.18 - lr: 0.000037 - momentum: 0.000000
+2023-10-14 17:00:48,363 epoch 8 - iter 3249/3617 - loss 0.01013006 - time (sec): 918.29 - samples/sec: 371.85 - lr: 0.000035 - momentum: 0.000000
+2023-10-14 17:02:27,074 epoch 8 - iter 3610/3617 - loss 0.00973135 - time (sec): 1017.00 - samples/sec: 372.91 - lr: 0.000033 - momentum: 0.000000
+2023-10-14 17:02:28,759 ----------------------------------------------------------------------------------------------------
+2023-10-14 17:02:28,759 EPOCH 8 done: loss 0.0097 - lr: 0.000033
+2023-10-14 17:03:08,171 DEV : loss 0.3519401252269745 - f1-score (micro avg) 0.6383
+2023-10-14 17:03:08,238 ----------------------------------------------------------------------------------------------------
+2023-10-14 17:04:56,121 epoch 9 - iter 361/3617 - loss 0.00400351 - time (sec): 107.88 - samples/sec: 337.43 - lr: 0.000032 - momentum: 0.000000
+2023-10-14 17:06:44,549 epoch 9 - iter 722/3617 - loss 0.00553294 - time (sec): 216.31 - samples/sec: 337.08 - lr: 0.000030 - momentum: 0.000000
+2023-10-14 17:08:26,029 epoch 9 - iter 1083/3617 - loss 0.00718701 - time (sec): 317.79 - samples/sec: 350.01 - lr: 0.000028 - momentum: 0.000000
+2023-10-14 17:10:14,915 epoch 9 - iter 1444/3617 - loss 0.00747014 - time (sec): 426.67 - samples/sec: 351.11 - lr: 0.000027 - momentum: 0.000000
+2023-10-14 17:12:13,127 epoch 9 - iter 1805/3617 - loss 0.00700743 - time (sec): 544.89 - samples/sec: 346.36 - lr: 0.000025 - momentum: 0.000000
+2023-10-14 17:13:56,710 epoch 9 - iter 2166/3617 - loss 0.00753126 - time (sec): 648.47 - samples/sec: 348.52 - lr: 0.000023 - momentum: 0.000000
+2023-10-14 17:15:35,725 epoch 9 - iter 2527/3617 - loss 0.00713809 - time (sec): 747.48 - samples/sec: 353.09 - lr: 0.000022 - momentum: 0.000000
+2023-10-14 17:17:15,835 epoch 9 - iter 2888/3617 - loss 0.00710439 - time (sec): 847.60 - samples/sec: 358.03 - lr: 0.000020 - momentum: 0.000000
+2023-10-14 17:18:53,848 epoch 9 - iter 3249/3617 - loss 0.00691053 - time (sec): 945.61 - samples/sec: 361.20 - lr: 0.000018 - momentum: 0.000000
+2023-10-14 17:20:32,410 epoch 9 - iter 3610/3617 - loss 0.00723383 - time (sec): 1044.17 - samples/sec: 363.12 - lr: 0.000017 - momentum: 0.000000
+2023-10-14 17:20:34,181 ----------------------------------------------------------------------------------------------------
+2023-10-14 17:20:34,181 EPOCH 9 done: loss 0.0072 - lr: 0.000017
+2023-10-14 17:21:13,627 DEV : loss 0.37418004870414734 - f1-score (micro avg) 0.6425
+2023-10-14 17:21:13,685 ----------------------------------------------------------------------------------------------------
+2023-10-14 17:22:51,217 epoch 10 - iter 361/3617 - loss 0.00288085 - time (sec): 97.53 - samples/sec: 383.84 - lr: 0.000015 - momentum: 0.000000
+2023-10-14 17:24:32,451 epoch 10 - iter 722/3617 - loss 0.00430136 - time (sec): 198.76 - samples/sec: 381.28 - lr: 0.000013 - momentum: 0.000000
+2023-10-14 17:26:18,156 epoch 10 - iter 1083/3617 - loss 0.00516123 - time (sec): 304.47 - samples/sec: 376.22 - lr: 0.000012 - momentum: 0.000000
+2023-10-14 17:27:59,531 epoch 10 - iter 1444/3617 - loss 0.00455380 - time (sec): 405.84 - samples/sec: 374.69 - lr: 0.000010 - momentum: 0.000000
+2023-10-14 17:29:41,578 epoch 10 - iter 1805/3617 - loss 0.00417121 - time (sec): 507.89 - samples/sec: 373.65 - lr: 0.000008 - momentum: 0.000000
+2023-10-14 17:31:24,980 epoch 10 - iter 2166/3617 - loss 0.00427900 - time (sec): 611.29 - samples/sec: 373.25 - lr: 0.000007 - momentum: 0.000000
+2023-10-14 17:33:04,407 epoch 10 - iter 2527/3617 - loss 0.00423939 - time (sec): 710.72 - samples/sec: 374.13 - lr: 0.000005 - momentum: 0.000000
+2023-10-14 17:34:44,494 epoch 10 - iter 2888/3617 - loss 0.00423096 - time (sec): 810.81 - samples/sec: 376.32 - lr: 0.000003 - momentum: 0.000000
+2023-10-14 17:36:23,538 epoch 10 - iter 3249/3617 - loss 0.00456365 - time (sec): 909.85 - samples/sec: 376.25 - lr: 0.000002 - momentum: 0.000000
+2023-10-14 17:38:04,514 epoch 10 - iter 3610/3617 - loss 0.00445018 - time (sec): 1010.83 - samples/sec: 375.28 - lr: 0.000000 - momentum: 0.000000
+2023-10-14 17:38:06,359 ----------------------------------------------------------------------------------------------------
+2023-10-14 17:38:06,359 EPOCH 10 done: loss 0.0045 - lr: 0.000000
+2023-10-14 17:38:48,489 DEV : loss 0.383007675409317 - f1-score (micro avg) 0.6403
+2023-10-14 17:38:49,482 ----------------------------------------------------------------------------------------------------
+2023-10-14 17:38:49,484 Loading model from best epoch ...
+2023-10-14 17:38:53,352 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
+2023-10-14 17:39:53,980
+Results:
+- F-score (micro) 0.6356
+- F-score (macro) 0.4981
+- Accuracy 0.4788
+
+By class:
+              precision    recall  f1-score   support
+
+         loc     0.6276    0.7699    0.6915       591
+        pers     0.5664    0.7171    0.6329       357
+         org     0.1757    0.1646    0.1699        79
+
+   micro avg     0.5787    0.7050    0.6356      1027
+   macro avg     0.4565    0.5505    0.4981      1027
+weighted avg     0.5715    0.7050    0.6310      1027
+
+2023-10-14 17:39:53,980 ----------------------------------------------------------------------------------------------------
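The log above records the full hyper-parameter set (learning_rate 0.00015, mini_batch_size 4, max_epochs 10, no CRF) and the NER_HIPE_2022 letemps/fr corpus. A minimal sketch of a comparable Flair fine-tuning run; the embedding checkpoint name is inferred from the training base path, and the standard TransformerWordEmbeddings class stands in for the custom ByT5Embeddings class shown in the log:

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus and hyper-parameters taken from the log above; the exact training script
# is not part of this upload, so treat this as an approximation.
corpus = NER_HIPE_2022(dataset_name="letemps", language="fr")
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",  # inferred from base path
    layers="-1",                 # "layers-1" in the base path
    subtoken_pooling="first",    # "poolingfirst" in the base path
    fine_tune=True,
)
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,               # "crfFalse" in the base path
    use_rnn=False,
)
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-letemps-output",    # hypothetical output directory
    learning_rate=0.00015,       # matches learning_rate: "0.00015"
    mini_batch_size=4,           # matches mini_batch_size: "4"
    max_epochs=10,               # matches max_epochs: "10"
)
```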