2023-10-17 11:13:56,603 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:56,605 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 11:13:56,605 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:56,606 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 11:13:56,606 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:56,606 Train: 6183 sentences
2023-10-17 11:13:56,606 (train_with_dev=False, train_with_test=False)
2023-10-17 11:13:56,606 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:56,606 Training Params:
2023-10-17 11:13:56,606 - learning_rate: "3e-05"
2023-10-17 11:13:56,606 - mini_batch_size: "4"
2023-10-17 11:13:56,606 - max_epochs: "10"
2023-10-17 11:13:56,606 - shuffle: "True"
2023-10-17 11:13:56,606 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:56,606 Plugins:
2023-10-17 11:13:56,607 - TensorboardLogger
2023-10-17 11:13:56,607 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 11:13:56,607 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:56,607 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 11:13:56,607 - metric: "('micro avg', 'f1-score')"
2023-10-17 11:13:56,607 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:56,607 Computation:
2023-10-17 11:13:56,607 - compute on device: cuda:0
2023-10-17 11:13:56,607 - embedding storage: none
2023-10-17 11:13:56,607 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:56,607 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 11:13:56,607 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:56,607 ----------------------------------------------------------------------------------------------------
2023-10-17 11:13:56,608 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 11:14:08,890 epoch 1 - iter 154/1546 - loss 2.29547994 - time (sec): 12.28 - samples/sec: 1002.18 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:14:21,057 epoch 1 - iter 308/1546 - loss 1.29346448 - time (sec): 24.45 - samples/sec: 1007.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:14:34,224 epoch 1 - iter 462/1546 - loss 0.90808138 - time (sec): 37.61 - samples/sec: 995.33 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:14:46,652 epoch 1 - iter 616/1546 - loss 0.70493184 - time (sec): 50.04 - samples/sec: 1001.84 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:14:59,593 epoch 1 - iter 770/1546 - loss 0.58385751 - time (sec): 62.98 - samples/sec: 992.94 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:15:12,599 epoch 1 - iter 924/1546 - loss 0.51470217 - time (sec): 75.99 - samples/sec: 975.42 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:15:25,294 epoch 1 - iter 1078/1546 - loss 0.46185738 - time (sec): 88.68 - samples/sec: 964.67 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:15:37,971 epoch 1 - iter 1232/1546 - loss 0.41727367 - time (sec): 101.36 - samples/sec: 968.07 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:15:50,195 epoch 1 - iter 1386/1546 - loss 0.38182017 - time (sec): 113.59 - samples/sec: 979.04 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:16:02,470 epoch 1 - iter 1540/1546 - loss 0.35440131 - time (sec): 125.86 - samples/sec: 982.92 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:16:02,944 ----------------------------------------------------------------------------------------------------
2023-10-17 11:16:02,945 EPOCH 1 done: loss 0.3530 - lr: 0.000030
2023-10-17 11:16:05,529 DEV : loss 0.05910523235797882 - f1-score (micro avg) 0.7659
2023-10-17 11:16:05,566 saving best model
2023-10-17 11:16:06,167 ----------------------------------------------------------------------------------------------------
2023-10-17 11:16:18,356 epoch 2 - iter 154/1546 - loss 0.08768355 - time (sec): 12.19 - samples/sec: 972.96 - lr: 0.000030 - momentum: 0.000000
2023-10-17 11:16:30,262 epoch 2 - iter 308/1546 - loss 0.08328678 - time (sec): 24.09 - samples/sec: 996.17 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:16:42,966 epoch 2 - iter 462/1546 - loss 0.08465471 - time (sec): 36.80 - samples/sec: 990.15 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:16:55,659 epoch 2 - iter 616/1546 - loss 0.08609058 - time (sec): 49.49 - samples/sec: 999.87 - lr: 0.000029 - momentum: 0.000000
2023-10-17 11:17:08,034 epoch 2 - iter 770/1546 - loss 0.08402493 - time (sec): 61.86 - samples/sec: 994.12 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:17:19,827 epoch 2 - iter 924/1546 - loss 0.08269652 - time (sec): 73.66 - samples/sec: 1011.66 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:17:32,826 epoch 2 - iter 1078/1546 - loss 0.08211590 - time (sec): 86.66 - samples/sec: 998.57 - lr: 0.000028 - momentum: 0.000000
2023-10-17 11:17:45,948 epoch 2 - iter 1232/1546 - loss 0.08093449 - time (sec): 99.78 - samples/sec: 983.42 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:17:58,852 epoch 2 - iter 1386/1546 - loss 0.08134950 - time (sec): 112.68 - samples/sec: 994.24 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:18:11,277 epoch 2 - iter 1540/1546 - loss 0.08052732 - time (sec): 125.11 - samples/sec: 990.05 - lr: 0.000027 - momentum: 0.000000
2023-10-17 11:18:11,777 ----------------------------------------------------------------------------------------------------
2023-10-17 11:18:11,777 EPOCH 2 done: loss 0.0812 - lr: 0.000027
2023-10-17 11:18:14,902 DEV : loss 0.06066319718956947 - f1-score (micro avg) 0.7671
2023-10-17 11:18:14,938 saving best model
2023-10-17 11:18:16,489 ----------------------------------------------------------------------------------------------------
2023-10-17 11:18:29,564 epoch 3 - iter 154/1546 - loss 0.05252984 - time (sec): 13.07 - samples/sec: 908.49 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:18:42,028 epoch 3 - iter 308/1546 - loss 0.05150808 - time (sec): 25.54 - samples/sec: 902.92 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:18:54,060 epoch 3 - iter 462/1546 - loss 0.05008931 - time (sec): 37.57 - samples/sec: 945.68 - lr: 0.000026 - momentum: 0.000000
2023-10-17 11:19:06,497 epoch 3 - iter 616/1546 - loss 0.05637854 - time (sec): 50.01 - samples/sec: 969.28 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:19:18,427 epoch 3 - iter 770/1546 - loss 0.05511052 - time (sec): 61.94 - samples/sec: 989.20 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:19:30,271 epoch 3 - iter 924/1546 - loss 0.05463507 - time (sec): 73.78 - samples/sec: 999.35 - lr: 0.000025 - momentum: 0.000000
2023-10-17 11:19:42,021 epoch 3 - iter 1078/1546 - loss 0.05491450 - time (sec): 85.53 - samples/sec: 1007.21 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:19:53,763 epoch 3 - iter 1232/1546 - loss 0.05459277 - time (sec): 97.27 - samples/sec: 1013.08 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:20:05,476 epoch 3 - iter 1386/1546 - loss 0.05518890 - time (sec): 108.98 - samples/sec: 1020.20 - lr: 0.000024 - momentum: 0.000000
2023-10-17 11:20:17,203 epoch 3 - iter 1540/1546 - loss 0.05554067 - time (sec): 120.71 - samples/sec: 1026.37 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:20:17,646 ----------------------------------------------------------------------------------------------------
2023-10-17 11:20:17,647 EPOCH 3 done: loss 0.0556 - lr: 0.000023
2023-10-17 11:20:20,446 DEV : loss 0.08556292951107025 - f1-score (micro avg) 0.7732
2023-10-17 11:20:20,475 saving best model
2023-10-17 11:20:21,890 ----------------------------------------------------------------------------------------------------
2023-10-17 11:20:33,722 epoch 4 - iter 154/1546 - loss 0.03577965 - time (sec): 11.83 - samples/sec: 1097.27 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:20:45,706 epoch 4 - iter 308/1546 - loss 0.03488418 - time (sec): 23.81 - samples/sec: 1107.14 - lr: 0.000023 - momentum: 0.000000
2023-10-17 11:20:58,160 epoch 4 - iter 462/1546 - loss 0.03705614 - time (sec): 36.27 - samples/sec: 1054.19 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:21:10,800 epoch 4 - iter 616/1546 - loss 0.03656234 - time (sec): 48.91 - samples/sec: 1031.28 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:21:22,899 epoch 4 - iter 770/1546 - loss 0.03807395 - time (sec): 61.00 - samples/sec: 1027.23 - lr: 0.000022 - momentum: 0.000000
2023-10-17 11:21:35,839 epoch 4 - iter 924/1546 - loss 0.03823139 - time (sec): 73.94 - samples/sec: 1013.14 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:21:47,505 epoch 4 - iter 1078/1546 - loss 0.03665500 - time (sec): 85.61 - samples/sec: 1012.27 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:21:59,565 epoch 4 - iter 1232/1546 - loss 0.03711619 - time (sec): 97.67 - samples/sec: 1006.84 - lr: 0.000021 - momentum: 0.000000
2023-10-17 11:22:11,925 epoch 4 - iter 1386/1546 - loss 0.03726587 - time (sec): 110.03 - samples/sec: 1012.36 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:22:24,289 epoch 4 - iter 1540/1546 - loss 0.03708960 - time (sec): 122.40 - samples/sec: 1010.95 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:22:24,761 ----------------------------------------------------------------------------------------------------
2023-10-17 11:22:24,762 EPOCH 4 done: loss 0.0369 - lr: 0.000020
2023-10-17 11:22:27,688 DEV : loss 0.0917343944311142 - f1-score (micro avg) 0.8195
2023-10-17 11:22:27,719 saving best model
2023-10-17 11:22:29,184 ----------------------------------------------------------------------------------------------------
2023-10-17 11:22:41,658 epoch 5 - iter 154/1546 - loss 0.02368522 - time (sec): 12.47 - samples/sec: 982.48 - lr: 0.000020 - momentum: 0.000000
2023-10-17 11:22:54,113 epoch 5 - iter 308/1546 - loss 0.01941446 - time (sec): 24.92 - samples/sec: 1027.45 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:23:06,811 epoch 5 - iter 462/1546 - loss 0.02121999 - time (sec): 37.62 - samples/sec: 984.80 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:23:19,317 epoch 5 - iter 616/1546 - loss 0.02123932 - time (sec): 50.13 - samples/sec: 974.54 - lr: 0.000019 - momentum: 0.000000
2023-10-17 11:23:31,434 epoch 5 - iter 770/1546 - loss 0.02250424 - time (sec): 62.25 - samples/sec: 979.67 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:23:43,680 epoch 5 - iter 924/1546 - loss 0.02230769 - time (sec): 74.49 - samples/sec: 984.33 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:23:56,140 epoch 5 - iter 1078/1546 - loss 0.02319952 - time (sec): 86.95 - samples/sec: 990.92 - lr: 0.000018 - momentum: 0.000000
2023-10-17 11:24:09,092 epoch 5 - iter 1232/1546 - loss 0.02412054 - time (sec): 99.90 - samples/sec: 989.89 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:24:21,282 epoch 5 - iter 1386/1546 - loss 0.02598636 - time (sec): 112.09 - samples/sec: 992.74 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:24:33,226 epoch 5 - iter 1540/1546 - loss 0.02551503 - time (sec): 124.04 - samples/sec: 995.81 - lr: 0.000017 - momentum: 0.000000
2023-10-17 11:24:33,716 ----------------------------------------------------------------------------------------------------
2023-10-17 11:24:33,716 EPOCH 5 done: loss 0.0257 - lr: 0.000017
2023-10-17 11:24:36,785 DEV : loss 0.10130724310874939 - f1-score (micro avg) 0.7893
2023-10-17 11:24:36,819 ----------------------------------------------------------------------------------------------------
2023-10-17 11:24:49,940 epoch 6 - iter 154/1546 - loss 0.01652449 - time (sec): 13.12 - samples/sec: 903.13 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:25:02,121 epoch 6 - iter 308/1546 - loss 0.01442618 - time (sec): 25.30 - samples/sec: 913.93 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:25:14,373 epoch 6 - iter 462/1546 - loss 0.01784155 - time (sec): 37.55 - samples/sec: 954.25 - lr: 0.000016 - momentum: 0.000000
2023-10-17 11:25:27,405 epoch 6 - iter 616/1546 - loss 0.01777913 - time (sec): 50.58 - samples/sec: 971.08 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:25:39,347 epoch 6 - iter 770/1546 - loss 0.01795534 - time (sec): 62.53 - samples/sec: 984.87 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:25:51,305 epoch 6 - iter 924/1546 - loss 0.01739042 - time (sec): 74.48 - samples/sec: 991.11 - lr: 0.000015 - momentum: 0.000000
2023-10-17 11:26:03,459 epoch 6 - iter 1078/1546 - loss 0.01775013 - time (sec): 86.64 - samples/sec: 998.09 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:26:16,160 epoch 6 - iter 1232/1546 - loss 0.01809639 - time (sec): 99.34 - samples/sec: 994.38 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:26:29,256 epoch 6 - iter 1386/1546 - loss 0.01849270 - time (sec): 112.43 - samples/sec: 990.80 - lr: 0.000014 - momentum: 0.000000
2023-10-17 11:26:41,336 epoch 6 - iter 1540/1546 - loss 0.01830477 - time (sec): 124.51 - samples/sec: 993.88 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:26:41,807 ----------------------------------------------------------------------------------------------------
2023-10-17 11:26:41,808 EPOCH 6 done: loss 0.0182 - lr: 0.000013
2023-10-17 11:26:44,694 DEV : loss 0.10719826072454453 - f1-score (micro avg) 0.7934
2023-10-17 11:26:44,728 ----------------------------------------------------------------------------------------------------
2023-10-17 11:26:57,550 epoch 7 - iter 154/1546 - loss 0.01081941 - time (sec): 12.82 - samples/sec: 918.18 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:27:10,095 epoch 7 - iter 308/1546 - loss 0.00812489 - time (sec): 25.36 - samples/sec: 946.51 - lr: 0.000013 - momentum: 0.000000
2023-10-17 11:27:22,185 epoch 7 - iter 462/1546 - loss 0.00814902 - time (sec): 37.45 - samples/sec: 979.98 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:27:34,148 epoch 7 - iter 616/1546 - loss 0.00838999 - time (sec): 49.42 - samples/sec: 982.44 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:27:46,310 epoch 7 - iter 770/1546 - loss 0.00970640 - time (sec): 61.58 - samples/sec: 979.40 - lr: 0.000012 - momentum: 0.000000
2023-10-17 11:27:58,186 epoch 7 - iter 924/1546 - loss 0.01186251 - time (sec): 73.46 - samples/sec: 991.95 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:28:10,494 epoch 7 - iter 1078/1546 - loss 0.01264509 - time (sec): 85.76 - samples/sec: 1016.31 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:28:22,874 epoch 7 - iter 1232/1546 - loss 0.01279804 - time (sec): 98.14 - samples/sec: 1010.40 - lr: 0.000011 - momentum: 0.000000
2023-10-17 11:28:35,168 epoch 7 - iter 1386/1546 - loss 0.01293158 - time (sec): 110.44 - samples/sec: 1007.01 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:28:47,480 epoch 7 - iter 1540/1546 - loss 0.01261237 - time (sec): 122.75 - samples/sec: 1006.53 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:28:47,956 ----------------------------------------------------------------------------------------------------
2023-10-17 11:28:47,956 EPOCH 7 done: loss 0.0127 - lr: 0.000010
2023-10-17 11:28:50,876 DEV : loss 0.11056238412857056 - f1-score (micro avg) 0.7773
2023-10-17 11:28:50,910 ----------------------------------------------------------------------------------------------------
2023-10-17 11:29:03,762 epoch 8 - iter 154/1546 - loss 0.01168277 - time (sec): 12.85 - samples/sec: 908.17 - lr: 0.000010 - momentum: 0.000000
2023-10-17 11:29:16,091 epoch 8 - iter 308/1546 - loss 0.00829209 - time (sec): 25.18 - samples/sec: 961.42 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:29:28,416 epoch 8 - iter 462/1546 - loss 0.00887281 - time (sec): 37.50 - samples/sec: 949.77 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:29:40,478 epoch 8 - iter 616/1546 - loss 0.00942431 - time (sec): 49.57 - samples/sec: 954.38 - lr: 0.000009 - momentum: 0.000000
2023-10-17 11:29:53,170 epoch 8 - iter 770/1546 - loss 0.00837839 - time (sec): 62.26 - samples/sec: 971.20 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:30:05,612 epoch 8 - iter 924/1546 - loss 0.00809697 - time (sec): 74.70 - samples/sec: 988.36 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:30:17,628 epoch 8 - iter 1078/1546 - loss 0.00792721 - time (sec): 86.72 - samples/sec: 994.36 - lr: 0.000008 - momentum: 0.000000
2023-10-17 11:30:29,873 epoch 8 - iter 1232/1546 - loss 0.00741324 - time (sec): 98.96 - samples/sec: 1001.21 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:30:42,467 epoch 8 - iter 1386/1546 - loss 0.00743982 - time (sec): 111.56 - samples/sec: 996.95 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:30:54,660 epoch 8 - iter 1540/1546 - loss 0.00794718 - time (sec): 123.75 - samples/sec: 999.99 - lr: 0.000007 - momentum: 0.000000
2023-10-17 11:30:55,169 ----------------------------------------------------------------------------------------------------
2023-10-17 11:30:55,169 EPOCH 8 done: loss 0.0079 - lr: 0.000007
2023-10-17 11:30:58,242 DEV : loss 0.10748875141143799 - f1-score (micro avg) 0.8082
2023-10-17 11:30:58,271 ----------------------------------------------------------------------------------------------------
2023-10-17 11:31:10,443 epoch 9 - iter 154/1546 - loss 0.00097898 - time (sec): 12.17 - samples/sec: 1034.89 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:31:22,828 epoch 9 - iter 308/1546 - loss 0.00288126 - time (sec): 24.56 - samples/sec: 1065.60 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:31:35,404 epoch 9 - iter 462/1546 - loss 0.00393364 - time (sec): 37.13 - samples/sec: 1023.72 - lr: 0.000006 - momentum: 0.000000
2023-10-17 11:31:47,705 epoch 9 - iter 616/1546 - loss 0.00335481 - time (sec): 49.43 - samples/sec: 1025.62 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:32:00,084 epoch 9 - iter 770/1546 - loss 0.00299308 - time (sec): 61.81 - samples/sec: 1019.06 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:32:12,334 epoch 9 - iter 924/1546 - loss 0.00368111 - time (sec): 74.06 - samples/sec: 1007.26 - lr: 0.000005 - momentum: 0.000000
2023-10-17 11:32:24,354 epoch 9 - iter 1078/1546 - loss 0.00488522 - time (sec): 86.08 - samples/sec: 1009.42 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:32:36,765 epoch 9 - iter 1232/1546 - loss 0.00490894 - time (sec): 98.49 - samples/sec: 1011.12 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:32:49,481 epoch 9 - iter 1386/1546 - loss 0.00510702 - time (sec): 111.21 - samples/sec: 1012.93 - lr: 0.000004 - momentum: 0.000000
2023-10-17 11:33:02,191 epoch 9 - iter 1540/1546 - loss 0.00513657 - time (sec): 123.92 - samples/sec: 998.79 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:33:02,652 ----------------------------------------------------------------------------------------------------
2023-10-17 11:33:02,653 EPOCH 9 done: loss 0.0051 - lr: 0.000003
2023-10-17 11:33:05,592 DEV : loss 0.11327943950891495 - f1-score (micro avg) 0.8167
2023-10-17 11:33:05,621 ----------------------------------------------------------------------------------------------------
2023-10-17 11:33:17,814 epoch 10 - iter 154/1546 - loss 0.00274619 - time (sec): 12.19 - samples/sec: 992.32 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:33:29,904 epoch 10 - iter 308/1546 - loss 0.00202981 - time (sec): 24.28 - samples/sec: 1061.96 - lr: 0.000003 - momentum: 0.000000
2023-10-17 11:33:42,176 epoch 10 - iter 462/1546 - loss 0.00201148 - time (sec): 36.55 - samples/sec: 1047.57 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:33:54,082 epoch 10 - iter 616/1546 - loss 0.00298200 - time (sec): 48.46 - samples/sec: 1034.92 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:34:06,280 epoch 10 - iter 770/1546 - loss 0.00292290 - time (sec): 60.66 - samples/sec: 1037.45 - lr: 0.000002 - momentum: 0.000000
2023-10-17 11:34:18,626 epoch 10 - iter 924/1546 - loss 0.00273291 - time (sec): 73.00 - samples/sec: 1038.34 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:34:30,882 epoch 10 - iter 1078/1546 - loss 0.00327913 - time (sec): 85.26 - samples/sec: 1033.74 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:34:43,145 epoch 10 - iter 1232/1546 - loss 0.00313428 - time (sec): 97.52 - samples/sec: 1014.51 - lr: 0.000001 - momentum: 0.000000
2023-10-17 11:34:55,365 epoch 10 - iter 1386/1546 - loss 0.00291812 - time (sec): 109.74 - samples/sec: 1019.93 - lr: 0.000000 - momentum: 0.000000
2023-10-17 11:35:08,505 epoch 10 - iter 1540/1546 - loss 0.00316831 - time (sec): 122.88 - samples/sec: 1007.44 - lr: 0.000000 - momentum: 0.000000
2023-10-17 11:35:08,964 ----------------------------------------------------------------------------------------------------
2023-10-17 11:35:08,965 EPOCH 10 done: loss 0.0032 - lr: 0.000000
2023-10-17 11:35:11,979 DEV : loss 0.11607711017131805 - f1-score (micro avg) 0.7951
2023-10-17 11:35:12,999 ----------------------------------------------------------------------------------------------------
2023-10-17 11:35:13,001 Loading model from best epoch ...
2023-10-17 11:35:15,647 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 11:35:24,658 Results:
- F-score (micro) 0.8219
- F-score (macro) 0.7427
- Accuracy 0.7156

By class:
              precision    recall  f1-score   support

         LOC     0.8829    0.8446    0.8633       946
    BUILDING     0.6772    0.5784    0.6239       185
      STREET     0.7692    0.7143    0.7407        56

   micro avg     0.8484    0.7970    0.8219      1187
   macro avg     0.7764    0.7124    0.7427      1187
weighted avg     0.8455    0.7970    0.8202      1187

2023-10-17 11:35:24,659 ----------------------------------------------------------------------------------------------------
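Editor's note (not part of the original log): the run above is a standard Flair fine-tuning run on the HIPE-2022 "topres19th" English corpus. The Python sketch below is a reconstruction from the logged hyperparameters only (batch size 4, 10 epochs, lr 3e-05, linear warmup fraction 0.1, last transformer layer, first-subtoken pooling, no CRF); it is not the original training script. The backbone name is inferred from the base path, the dataset and argument names are assumptions based on Flair's public fine-tuning API, and the TensorboardLogger plugin is omitted.

# Reconstruction sketch, not the original hmbench training script.
# All names and arguments are inferred from the log and may differ from the actual run.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# HIPE-2022 "topres19th" corpus, English split
# (the log reports 6183 train / 680 dev / 2113 test sentences)
corpus = NER_HIPE_2022(dataset_name="topres19th", language="en")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Transformer embeddings: last layer only, first-subtoken pooling, fine-tuned
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # inferred from the base path
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Linear tag head without CRF or RNN (matches "crfFalse" in the base path
# and the printed SequenceTagger module)
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() uses AdamW with a linear scheduler; warmup fraction 0.1 as in the log
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
)

The best checkpoint selected on dev micro F1 (epoch 4 here) is written to best-model.pt under the base path and can be reloaded for inference with SequenceTagger.load("<base_path>/best-model.pt").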