2023-10-13 15:47:36,877 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:36,879 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 15:47:36,879 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:36,879 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-13 15:47:36,879 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:36,879 Train: 14465 sentences
2023-10-13 15:47:36,879 (train_with_dev=False, train_with_test=False)
2023-10-13 15:47:36,879 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:36,879 Training Params:
2023-10-13 15:47:36,880 - learning_rate: "0.00015"
2023-10-13 15:47:36,880 - mini_batch_size: "4"
2023-10-13 15:47:36,880 - max_epochs: "10"
2023-10-13 15:47:36,880 - shuffle: "True"
2023-10-13 15:47:36,880 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:36,880 Plugins:
2023-10-13 15:47:36,880 - TensorboardLogger
2023-10-13 15:47:36,880 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 15:47:36,880 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:36,880 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 15:47:36,880 - metric: "('micro avg', 'f1-score')"
2023-10-13 15:47:36,880 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:36,880 Computation:
2023-10-13 15:47:36,880 - compute on device: cuda:0
2023-10-13 15:47:36,880 - embedding storage: none
2023-10-13 15:47:36,881 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:36,881 Model training base path: "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2"
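
Note: the parameters and plugins above match Flair's fine-tuning loop with a linear warmup schedule. A minimal sketch of the corresponding trainer call is given below; it assumes the public ModelTrainer.fine_tune API and omits the TensorboardLogger plugin, whose attachment is not shown in the log.

    # Sketch only, assuming the tagger/corpus objects from the previous sketch.
    from flair.trainers import ModelTrainer

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2",
        learning_rate=0.00015,
        mini_batch_size=4,
        max_epochs=10,
        shuffle=True,
        warmup_fraction=0.1,  # assumption: matches the logged LinearScheduler warmup_fraction
    )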
2023-10-13 15:47:36,881 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:36,881 ----------------------------------------------------------------------------------------------------
2023-10-13 15:47:36,881 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 15:49:16,380 epoch 1 - iter 361/3617 - loss 2.52765117 - time (sec): 99.50 - samples/sec: 375.46 - lr: 0.000015 - momentum: 0.000000
2023-10-13 15:50:55,095 epoch 1 - iter 722/3617 - loss 2.14450318 - time (sec): 198.21 - samples/sec: 377.11 - lr: 0.000030 - momentum: 0.000000
2023-10-13 15:52:34,986 epoch 1 - iter 1083/3617 - loss 1.68063315 - time (sec): 298.10 - samples/sec: 379.63 - lr: 0.000045 - momentum: 0.000000
2023-10-13 15:54:13,230 epoch 1 - iter 1444/3617 - loss 1.33922640 - time (sec): 396.35 - samples/sec: 381.28 - lr: 0.000060 - momentum: 0.000000
2023-10-13 15:55:49,352 epoch 1 - iter 1805/3617 - loss 1.11175819 - time (sec): 492.47 - samples/sec: 384.30 - lr: 0.000075 - momentum: 0.000000
2023-10-13 15:57:25,329 epoch 1 - iter 2166/3617 - loss 0.95966876 - time (sec): 588.45 - samples/sec: 385.53 - lr: 0.000090 - momentum: 0.000000
2023-10-13 15:59:05,198 epoch 1 - iter 2527/3617 - loss 0.84659547 - time (sec): 688.31 - samples/sec: 385.09 - lr: 0.000105 - momentum: 0.000000
2023-10-13 16:00:44,426 epoch 1 - iter 2888/3617 - loss 0.75756054 - time (sec): 787.54 - samples/sec: 384.55 - lr: 0.000120 - momentum: 0.000000
2023-10-13 16:02:21,853 epoch 1 - iter 3249/3617 - loss 0.68764398 - time (sec): 884.97 - samples/sec: 386.12 - lr: 0.000135 - momentum: 0.000000
2023-10-13 16:03:57,756 epoch 1 - iter 3610/3617 - loss 0.63260278 - time (sec): 980.87 - samples/sec: 386.66 - lr: 0.000150 - momentum: 0.000000
2023-10-13 16:03:59,433 ----------------------------------------------------------------------------------------------------
2023-10-13 16:03:59,434 EPOCH 1 done: loss 0.6317 - lr: 0.000150
2023-10-13 16:04:35,632 DEV : loss 0.1328110247850418 - f1-score (micro avg) 0.5468
2023-10-13 16:04:35,688 saving best model
2023-10-13 16:04:36,553 ----------------------------------------------------------------------------------------------------
2023-10-13 16:06:14,210 epoch 2 - iter 361/3617 - loss 0.11626753 - time (sec): 97.65 - samples/sec: 375.76 - lr: 0.000148 - momentum: 0.000000
2023-10-13 16:07:54,314 epoch 2 - iter 722/3617 - loss 0.11000947 - time (sec): 197.76 - samples/sec: 377.76 - lr: 0.000147 - momentum: 0.000000
2023-10-13 16:09:32,730 epoch 2 - iter 1083/3617 - loss 0.10626444 - time (sec): 296.17 - samples/sec: 381.60 - lr: 0.000145 - momentum: 0.000000
2023-10-13 16:11:10,144 epoch 2 - iter 1444/3617 - loss 0.10456416 - time (sec): 393.59 - samples/sec: 383.58 - lr: 0.000143 - momentum: 0.000000
2023-10-13 16:12:49,896 epoch 2 - iter 1805/3617 - loss 0.10240111 - time (sec): 493.34 - samples/sec: 385.22 - lr: 0.000142 - momentum: 0.000000
2023-10-13 16:14:24,736 epoch 2 - iter 2166/3617 - loss 0.10226438 - time (sec): 588.18 - samples/sec: 385.19 - lr: 0.000140 - momentum: 0.000000
2023-10-13 16:16:04,084 epoch 2 - iter 2527/3617 - loss 0.10035753 - time (sec): 687.53 - samples/sec: 384.17 - lr: 0.000138 - momentum: 0.000000
2023-10-13 16:17:46,721 epoch 2 - iter 2888/3617 - loss 0.09824782 - time (sec): 790.17 - samples/sec: 384.12 - lr: 0.000137 - momentum: 0.000000
2023-10-13 16:19:27,035 epoch 2 - iter 3249/3617 - loss 0.09643723 - time (sec): 890.48 - samples/sec: 383.53 - lr: 0.000135 - momentum: 0.000000
2023-10-13 16:21:07,446 epoch 2 - iter 3610/3617 - loss 0.09599864 - time (sec): 990.89 - samples/sec: 382.64 - lr: 0.000133 - momentum: 0.000000
2023-10-13 16:21:09,242 ----------------------------------------------------------------------------------------------------
2023-10-13 16:21:09,242 EPOCH 2 done: loss 0.0959 - lr: 0.000133
2023-10-13 16:21:49,050 DEV : loss 0.12169007211923599 - f1-score (micro avg) 0.5729
2023-10-13 16:21:49,110 saving best model
2023-10-13 16:21:51,717 ----------------------------------------------------------------------------------------------------
2023-10-13 16:23:32,625 epoch 3 - iter 361/3617 - loss 0.06045412 - time (sec): 100.90 - samples/sec: 389.18 - lr: 0.000132 - momentum: 0.000000
2023-10-13 16:25:10,123 epoch 3 - iter 722/3617 - loss 0.06228126 - time (sec): 198.40 - samples/sec: 382.78 - lr: 0.000130 - momentum: 0.000000
2023-10-13 16:26:50,214 epoch 3 - iter 1083/3617 - loss 0.06334489 - time (sec): 298.49 - samples/sec: 380.37 - lr: 0.000128 - momentum: 0.000000
2023-10-13 16:28:29,213 epoch 3 - iter 1444/3617 - loss 0.06457721 - time (sec): 397.49 - samples/sec: 380.35 - lr: 0.000127 - momentum: 0.000000
2023-10-13 16:30:10,356 epoch 3 - iter 1805/3617 - loss 0.06506915 - time (sec): 498.63 - samples/sec: 380.18 - lr: 0.000125 - momentum: 0.000000
2023-10-13 16:31:47,566 epoch 3 - iter 2166/3617 - loss 0.06594029 - time (sec): 595.84 - samples/sec: 379.51 - lr: 0.000123 - momentum: 0.000000
2023-10-13 16:33:27,197 epoch 3 - iter 2527/3617 - loss 0.06615646 - time (sec): 695.47 - samples/sec: 382.22 - lr: 0.000122 - momentum: 0.000000
2023-10-13 16:35:04,152 epoch 3 - iter 2888/3617 - loss 0.06725648 - time (sec): 792.43 - samples/sec: 381.27 - lr: 0.000120 - momentum: 0.000000
2023-10-13 16:36:41,646 epoch 3 - iter 3249/3617 - loss 0.06667975 - time (sec): 889.92 - samples/sec: 382.81 - lr: 0.000118 - momentum: 0.000000
2023-10-13 16:38:19,427 epoch 3 - iter 3610/3617 - loss 0.06625041 - time (sec): 987.70 - samples/sec: 384.07 - lr: 0.000117 - momentum: 0.000000
2023-10-13 16:38:21,051 ----------------------------------------------------------------------------------------------------
2023-10-13 16:38:21,051 EPOCH 3 done: loss 0.0662 - lr: 0.000117
2023-10-13 16:38:59,279 DEV : loss 0.1475251019001007 - f1-score (micro avg) 0.6326
2023-10-13 16:38:59,336 saving best model
2023-10-13 16:39:01,911 ----------------------------------------------------------------------------------------------------
2023-10-13 16:40:38,948 epoch 4 - iter 361/3617 - loss 0.04260643 - time (sec): 97.03 - samples/sec: 382.61 - lr: 0.000115 - momentum: 0.000000
2023-10-13 16:42:16,995 epoch 4 - iter 722/3617 - loss 0.04125262 - time (sec): 195.08 - samples/sec: 390.28 - lr: 0.000113 - momentum: 0.000000
2023-10-13 16:43:52,860 epoch 4 - iter 1083/3617 - loss 0.04631582 - time (sec): 290.94 - samples/sec: 390.67 - lr: 0.000112 - momentum: 0.000000
2023-10-13 16:45:27,629 epoch 4 - iter 1444/3617 - loss 0.04612849 - time (sec): 385.71 - samples/sec: 391.37 - lr: 0.000110 - momentum: 0.000000
2023-10-13 16:47:09,646 epoch 4 - iter 1805/3617 - loss 0.04673774 - time (sec): 487.73 - samples/sec: 387.18 - lr: 0.000108 - momentum: 0.000000
2023-10-13 16:48:49,185 epoch 4 - iter 2166/3617 - loss 0.04602262 - time (sec): 587.27 - samples/sec: 385.50 - lr: 0.000107 - momentum: 0.000000
2023-10-13 16:50:27,835 epoch 4 - iter 2527/3617 - loss 0.04628779 - time (sec): 685.92 - samples/sec: 384.91 - lr: 0.000105 - momentum: 0.000000
2023-10-13 16:52:06,093 epoch 4 - iter 2888/3617 - loss 0.04553124 - time (sec): 784.18 - samples/sec: 385.34 - lr: 0.000103 - momentum: 0.000000
2023-10-13 16:53:44,924 epoch 4 - iter 3249/3617 - loss 0.04561411 - time (sec): 883.01 - samples/sec: 386.56 - lr: 0.000102 - momentum: 0.000000
2023-10-13 16:55:22,268 epoch 4 - iter 3610/3617 - loss 0.04637074 - time (sec): 980.35 - samples/sec: 386.93 - lr: 0.000100 - momentum: 0.000000
2023-10-13 16:55:23,902 ----------------------------------------------------------------------------------------------------
2023-10-13 16:55:23,902 EPOCH 4 done: loss 0.0464 - lr: 0.000100
2023-10-13 16:56:03,441 DEV : loss 0.21360917389392853 - f1-score (micro avg) 0.6419
2023-10-13 16:56:03,498 saving best model
2023-10-13 16:56:06,081 ----------------------------------------------------------------------------------------------------
2023-10-13 16:57:41,184 epoch 5 - iter 361/3617 - loss 0.02867783 - time (sec): 95.10 - samples/sec: 407.12 - lr: 0.000098 - momentum: 0.000000
2023-10-13 16:59:18,313 epoch 5 - iter 722/3617 - loss 0.03093844 - time (sec): 192.23 - samples/sec: 403.57 - lr: 0.000097 - momentum: 0.000000
2023-10-13 17:00:54,026 epoch 5 - iter 1083/3617 - loss 0.03106379 - time (sec): 287.94 - samples/sec: 397.08 - lr: 0.000095 - momentum: 0.000000
2023-10-13 17:02:29,472 epoch 5 - iter 1444/3617 - loss 0.03435095 - time (sec): 383.39 - samples/sec: 400.62 - lr: 0.000093 - momentum: 0.000000
2023-10-13 17:04:05,072 epoch 5 - iter 1805/3617 - loss 0.03291689 - time (sec): 478.99 - samples/sec: 401.05 - lr: 0.000092 - momentum: 0.000000
2023-10-13 17:05:41,393 epoch 5 - iter 2166/3617 - loss 0.03481865 - time (sec): 575.31 - samples/sec: 396.71 - lr: 0.000090 - momentum: 0.000000
2023-10-13 17:07:19,475 epoch 5 - iter 2527/3617 - loss 0.03380849 - time (sec): 673.39 - samples/sec: 396.08 - lr: 0.000088 - momentum: 0.000000
2023-10-13 17:08:57,422 epoch 5 - iter 2888/3617 - loss 0.03376094 - time (sec): 771.34 - samples/sec: 395.27 - lr: 0.000087 - momentum: 0.000000
2023-10-13 17:10:33,716 epoch 5 - iter 3249/3617 - loss 0.03464489 - time (sec): 867.63 - samples/sec: 393.19 - lr: 0.000085 - momentum: 0.000000
2023-10-13 17:12:12,849 epoch 5 - iter 3610/3617 - loss 0.03480793 - time (sec): 966.76 - samples/sec: 392.27 - lr: 0.000083 - momentum: 0.000000
2023-10-13 17:12:14,580 ----------------------------------------------------------------------------------------------------
2023-10-13 17:12:14,580 EPOCH 5 done: loss 0.0349 - lr: 0.000083
2023-10-13 17:12:53,693 DEV : loss 0.2400546371936798 - f1-score (micro avg) 0.651
2023-10-13 17:12:53,753 saving best model
2023-10-13 17:12:56,347 ----------------------------------------------------------------------------------------------------
2023-10-13 17:14:34,985 epoch 6 - iter 361/3617 - loss 0.01720928 - time (sec): 98.63 - samples/sec: 386.11 - lr: 0.000082 - momentum: 0.000000
2023-10-13 17:16:13,842 epoch 6 - iter 722/3617 - loss 0.01844733 - time (sec): 197.49 - samples/sec: 380.67 - lr: 0.000080 - momentum: 0.000000
2023-10-13 17:17:52,649 epoch 6 - iter 1083/3617 - loss 0.01854603 - time (sec): 296.30 - samples/sec: 378.69 - lr: 0.000078 - momentum: 0.000000
2023-10-13 17:19:32,124 epoch 6 - iter 1444/3617 - loss 0.02106696 - time (sec): 395.77 - samples/sec: 380.99 - lr: 0.000077 - momentum: 0.000000
2023-10-13 17:21:08,674 epoch 6 - iter 1805/3617 - loss 0.02062784 - time (sec): 492.32 - samples/sec: 382.09 - lr: 0.000075 - momentum: 0.000000
2023-10-13 17:22:47,208 epoch 6 - iter 2166/3617 - loss 0.02025036 - time (sec): 590.86 - samples/sec: 382.21 - lr: 0.000073 - momentum: 0.000000
2023-10-13 17:24:27,370 epoch 6 - iter 2527/3617 - loss 0.02039739 - time (sec): 691.02 - samples/sec: 381.52 - lr: 0.000072 - momentum: 0.000000
2023-10-13 17:26:05,854 epoch 6 - iter 2888/3617 - loss 0.02102720 - time (sec): 789.50 - samples/sec: 383.40 - lr: 0.000070 - momentum: 0.000000
2023-10-13 17:27:44,896 epoch 6 - iter 3249/3617 - loss 0.02116438 - time (sec): 888.54 - samples/sec: 383.69 - lr: 0.000068 - momentum: 0.000000
2023-10-13 17:29:25,435 epoch 6 - iter 3610/3617 - loss 0.02233088 - time (sec): 989.08 - samples/sec: 383.39 - lr: 0.000067 - momentum: 0.000000
2023-10-13 17:29:27,292 ----------------------------------------------------------------------------------------------------
2023-10-13 17:29:27,292 EPOCH 6 done: loss 0.0223 - lr: 0.000067
2023-10-13 17:30:06,560 DEV : loss 0.27845296263694763 - f1-score (micro avg) 0.6351
2023-10-13 17:30:06,618 ----------------------------------------------------------------------------------------------------
2023-10-13 17:31:45,647 epoch 7 - iter 361/3617 - loss 0.01354530 - time (sec): 99.03 - samples/sec: 390.06 - lr: 0.000065 - momentum: 0.000000
2023-10-13 17:33:23,229 epoch 7 - iter 722/3617 - loss 0.01275613 - time (sec): 196.61 - samples/sec: 388.03 - lr: 0.000063 - momentum: 0.000000
2023-10-13 17:35:02,365 epoch 7 - iter 1083/3617 - loss 0.01402730 - time (sec): 295.74 - samples/sec: 389.84 - lr: 0.000062 - momentum: 0.000000
2023-10-13 17:36:40,537 epoch 7 - iter 1444/3617 - loss 0.01348114 - time (sec): 393.92 - samples/sec: 385.67 - lr: 0.000060 - momentum: 0.000000
2023-10-13 17:38:17,672 epoch 7 - iter 1805/3617 - loss 0.01492394 - time (sec): 491.05 - samples/sec: 386.77 - lr: 0.000058 - momentum: 0.000000
2023-10-13 17:39:55,040 epoch 7 - iter 2166/3617 - loss 0.01481483 - time (sec): 588.42 - samples/sec: 390.00 - lr: 0.000057 - momentum: 0.000000
2023-10-13 17:41:31,995 epoch 7 - iter 2527/3617 - loss 0.01489788 - time (sec): 685.38 - samples/sec: 389.85 - lr: 0.000055 - momentum: 0.000000
2023-10-13 17:43:07,557 epoch 7 - iter 2888/3617 - loss 0.01477959 - time (sec): 780.94 - samples/sec: 389.20 - lr: 0.000053 - momentum: 0.000000
2023-10-13 17:44:43,562 epoch 7 - iter 3249/3617 - loss 0.01557973 - time (sec): 876.94 - samples/sec: 388.67 - lr: 0.000052 - momentum: 0.000000
2023-10-13 17:46:22,754 epoch 7 - iter 3610/3617 - loss 0.01541438 - time (sec): 976.13 - samples/sec: 388.37 - lr: 0.000050 - momentum: 0.000000
2023-10-13 17:46:24,563 ----------------------------------------------------------------------------------------------------
2023-10-13 17:46:24,564 EPOCH 7 done: loss 0.0155 - lr: 0.000050
2023-10-13 17:47:02,672 DEV : loss 0.32551926374435425 - f1-score (micro avg) 0.6588
2023-10-13 17:47:02,728 saving best model
2023-10-13 17:47:05,315 ----------------------------------------------------------------------------------------------------
2023-10-13 17:48:43,299 epoch 8 - iter 361/3617 - loss 0.01360752 - time (sec): 97.98 - samples/sec: 388.59 - lr: 0.000048 - momentum: 0.000000
2023-10-13 17:50:23,431 epoch 8 - iter 722/3617 - loss 0.01121896 - time (sec): 198.11 - samples/sec: 391.82 - lr: 0.000047 - momentum: 0.000000
2023-10-13 17:52:02,062 epoch 8 - iter 1083/3617 - loss 0.01074591 - time (sec): 296.74 - samples/sec: 391.38 - lr: 0.000045 - momentum: 0.000000
2023-10-13 17:53:40,669 epoch 8 - iter 1444/3617 - loss 0.00996253 - time (sec): 395.35 - samples/sec: 391.11 - lr: 0.000043 - momentum: 0.000000
2023-10-13 17:55:19,861 epoch 8 - iter 1805/3617 - loss 0.01086557 - time (sec): 494.54 - samples/sec: 386.03 - lr: 0.000042 - momentum: 0.000000
2023-10-13 17:57:00,564 epoch 8 - iter 2166/3617 - loss 0.01081858 - time (sec): 595.24 - samples/sec: 386.15 - lr: 0.000040 - momentum: 0.000000
2023-10-13 17:58:37,728 epoch 8 - iter 2527/3617 - loss 0.01085878 - time (sec): 692.41 - samples/sec: 385.09 - lr: 0.000038 - momentum: 0.000000
2023-10-13 18:00:14,557 epoch 8 - iter 2888/3617 - loss 0.01065470 - time (sec): 789.24 - samples/sec: 385.34 - lr: 0.000037 - momentum: 0.000000
2023-10-13 18:01:51,532 epoch 8 - iter 3249/3617 - loss 0.01090509 - time (sec): 886.21 - samples/sec: 386.25 - lr: 0.000035 - momentum: 0.000000
2023-10-13 18:03:29,711 epoch 8 - iter 3610/3617 - loss 0.01057309 - time (sec): 984.39 - samples/sec: 385.52 - lr: 0.000033 - momentum: 0.000000
2023-10-13 18:03:31,317 ----------------------------------------------------------------------------------------------------
2023-10-13 18:03:31,317 EPOCH 8 done: loss 0.0106 - lr: 0.000033
2023-10-13 18:04:13,133 DEV : loss 0.3422335982322693 - f1-score (micro avg) 0.6499
2023-10-13 18:04:13,198 ----------------------------------------------------------------------------------------------------
2023-10-13 18:05:51,892 epoch 9 - iter 361/3617 - loss 0.00746127 - time (sec): 98.69 - samples/sec: 369.13 - lr: 0.000032 - momentum: 0.000000
2023-10-13 18:07:30,684 epoch 9 - iter 722/3617 - loss 0.00693418 - time (sec): 197.48 - samples/sec: 378.82 - lr: 0.000030 - momentum: 0.000000
2023-10-13 18:09:11,794 epoch 9 - iter 1083/3617 - loss 0.00835998 - time (sec): 298.59 - samples/sec: 379.21 - lr: 0.000028 - momentum: 0.000000
2023-10-13 18:10:51,226 epoch 9 - iter 1444/3617 - loss 0.00843426 - time (sec): 398.03 - samples/sec: 379.83 - lr: 0.000027 - momentum: 0.000000
2023-10-13 18:12:31,468 epoch 9 - iter 1805/3617 - loss 0.00811795 - time (sec): 498.27 - samples/sec: 380.63 - lr: 0.000025 - momentum: 0.000000
2023-10-13 18:14:10,084 epoch 9 - iter 2166/3617 - loss 0.00779077 - time (sec): 596.88 - samples/sec: 382.22 - lr: 0.000023 - momentum: 0.000000
2023-10-13 18:15:43,914 epoch 9 - iter 2527/3617 - loss 0.00773124 - time (sec): 690.71 - samples/sec: 384.09 - lr: 0.000022 - momentum: 0.000000
2023-10-13 18:17:18,514 epoch 9 - iter 2888/3617 - loss 0.00758069 - time (sec): 785.31 - samples/sec: 383.80 - lr: 0.000020 - momentum: 0.000000
2023-10-13 18:18:53,935 epoch 9 - iter 3249/3617 - loss 0.00735282 - time (sec): 880.73 - samples/sec: 385.99 - lr: 0.000018 - momentum: 0.000000
2023-10-13 18:20:35,878 epoch 9 - iter 3610/3617 - loss 0.00724054 - time (sec): 982.68 - samples/sec: 385.88 - lr: 0.000017 - momentum: 0.000000
2023-10-13 18:20:37,660 ----------------------------------------------------------------------------------------------------
2023-10-13 18:20:37,660 EPOCH 9 done: loss 0.0072 - lr: 0.000017
2023-10-13 18:21:18,008 DEV : loss 0.3737005591392517 - f1-score (micro avg) 0.6637
2023-10-13 18:21:18,067 saving best model
2023-10-13 18:21:20,653 ----------------------------------------------------------------------------------------------------
2023-10-13 18:23:01,430 epoch 10 - iter 361/3617 - loss 0.00219566 - time (sec): 100.77 - samples/sec: 379.80 - lr: 0.000015 - momentum: 0.000000
2023-10-13 18:24:41,655 epoch 10 - iter 722/3617 - loss 0.00254285 - time (sec): 201.00 - samples/sec: 378.43 - lr: 0.000013 - momentum: 0.000000
2023-10-13 18:26:18,860 epoch 10 - iter 1083/3617 - loss 0.00509300 - time (sec): 298.20 - samples/sec: 380.51 - lr: 0.000012 - momentum: 0.000000
2023-10-13 18:27:59,093 epoch 10 - iter 1444/3617 - loss 0.00524195 - time (sec): 398.44 - samples/sec: 381.66 - lr: 0.000010 - momentum: 0.000000
2023-10-13 18:29:39,266 epoch 10 - iter 1805/3617 - loss 0.00561785 - time (sec): 498.61 - samples/sec: 379.46 - lr: 0.000008 - momentum: 0.000000
2023-10-13 18:31:16,248 epoch 10 - iter 2166/3617 - loss 0.00605238 - time (sec): 595.59 - samples/sec: 380.96 - lr: 0.000007 - momentum: 0.000000
2023-10-13 18:32:54,488 epoch 10 - iter 2527/3617 - loss 0.00590583 - time (sec): 693.83 - samples/sec: 382.92 - lr: 0.000005 - momentum: 0.000000
2023-10-13 18:34:33,504 epoch 10 - iter 2888/3617 - loss 0.00607679 - time (sec): 792.85 - samples/sec: 384.33 - lr: 0.000003 - momentum: 0.000000
2023-10-13 18:36:09,760 epoch 10 - iter 3249/3617 - loss 0.00604311 - time (sec): 889.10 - samples/sec: 383.09 - lr: 0.000002 - momentum: 0.000000
2023-10-13 18:37:48,846 epoch 10 - iter 3610/3617 - loss 0.00584381 - time (sec): 988.19 - samples/sec: 383.85 - lr: 0.000000 - momentum: 0.000000
2023-10-13 18:37:50,520 ----------------------------------------------------------------------------------------------------
2023-10-13 18:37:50,521 EPOCH 10 done: loss 0.0058 - lr: 0.000000
2023-10-13 18:38:31,617 DEV : loss 0.3819710314273834 - f1-score (micro avg) 0.6651
2023-10-13 18:38:31,677 saving best model
2023-10-13 18:38:35,138 ----------------------------------------------------------------------------------------------------
2023-10-13 18:38:35,140 Loading model from best epoch ...
2023-10-13 18:38:39,133 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-13 18:39:36,439
Results:
- F-score (micro) 0.6279
- F-score (macro) 0.4773
- Accuracy 0.4701
By class:
              precision    recall  f1-score   support

         loc     0.6557    0.7411    0.6958       591
        pers     0.5661    0.6835    0.6193       357
         org     0.1200    0.1139    0.1169        79

   micro avg     0.5886    0.6728    0.6279      1027
   macro avg     0.4473    0.5128    0.4773      1027
weighted avg     0.5833    0.6728    0.6247      1027
2023-10-13 18:39:36,439 ----------------------------------------------------------------------------------------------------
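
Note: the checkpoint evaluated above (best-model.pt) can be reused for inference with standard Flair usage; the sketch below assumes the path under the logged base path and a hypothetical example sentence.

    # Sketch: loading the saved best model and tagging a sentence.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(
        "hmbench-letemps/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2/best-model.pt"
    )

    sentence = Sentence("Le Temps est un journal publié à Genève .")
    tagger.predict(sentence)
    for entity in sentence.get_spans("ner"):
        print(entity)  # prints detected loc/pers/org spans with scores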