2023-10-17 18:01:41,762 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:41,763 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:01:41,763 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:41,763 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 18:01:41,763 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:41,763 Train: 1166 sentences
2023-10-17 18:01:41,763 (train_with_dev=False, train_with_test=False)
2023-10-17 18:01:41,763 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:41,763 Training Params:
2023-10-17 18:01:41,763 - learning_rate: "3e-05"
2023-10-17 18:01:41,763 - mini_batch_size: "8"
2023-10-17 18:01:41,763 - max_epochs: "10"
2023-10-17 18:01:41,763 - shuffle: "True"
2023-10-17 18:01:41,763 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:41,763 Plugins:
2023-10-17 18:01:41,763 - TensorboardLogger
2023-10-17 18:01:41,763 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:01:41,763 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:41,763 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:01:41,763 - metric: "('micro avg', 'f1-score')"
2023-10-17 18:01:41,763 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:41,764 Computation:
2023-10-17 18:01:41,764 - compute on device: cuda:0
2023-10-17 18:01:41,764 - embedding storage: none
2023-10-17 18:01:41,764 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:41,764 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 18:01:41,764 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:41,764 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:41,764 Logging anything other than scalars to TensorBoard is currently not supported.
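For context, the architecture and hyperparameters logged above correspond to Flair's standard transformer fine-tuning workflow. The following is a minimal sketch of how such a run is typically set up, not the original training script: the backbone identifier is inferred from the base path, the HIPE-2022 corpus arguments, the "ner" label type and the `hidden_size` value are assumptions, and the linear warmup matches `fine_tune()`'s default scheduler rather than an explicit plugin.

```python
# Minimal sketch of a comparable Flair fine-tuning run (not the original script).
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Finnish NewsEye split of HIPE-2022, as in the corpus line above (assumed arguments).
corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")
label_dict = corpus.make_label_dictionary(label_type="ner")

# Last transformer layer only, "first" subtoken pooling, fine-tuned end to end
# (matching "layers-1" and "poolingfirst" in the base path).
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # inferred from the base path
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear classification head with cross-entropy loss, no CRF and no RNN,
# matching the printed architecture and "crfFalse" in the base path.
tagger = SequenceTagger(
    hidden_size=256,  # unused when use_rnn=False; value is an assumption
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() applies a linear scheduler with warmup by default,
# consistent with the LinearScheduler plugin logged above.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-5,
    mini_batch_size=8,
    max_epochs=10,
)
```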
2023-10-17 18:01:43,042 epoch 1 - iter 14/146 - loss 3.53043795 - time (sec): 1.28 - samples/sec: 2887.22 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:01:44,750 epoch 1 - iter 28/146 - loss 3.22220354 - time (sec): 2.99 - samples/sec: 2903.33 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:01:46,108 epoch 1 - iter 42/146 - loss 2.82585575 - time (sec): 4.34 - samples/sec: 2974.54 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:01:47,961 epoch 1 - iter 56/146 - loss 2.33875546 - time (sec): 6.20 - samples/sec: 2863.59 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:01:49,464 epoch 1 - iter 70/146 - loss 1.98561751 - time (sec): 7.70 - samples/sec: 2859.02 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:01:51,151 epoch 1 - iter 84/146 - loss 1.73604633 - time (sec): 9.39 - samples/sec: 2822.89 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:01:52,455 epoch 1 - iter 98/146 - loss 1.56244089 - time (sec): 10.69 - samples/sec: 2869.67 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:01:53,840 epoch 1 - iter 112/146 - loss 1.45136493 - time (sec): 12.07 - samples/sec: 2856.69 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:01:55,351 epoch 1 - iter 126/146 - loss 1.32971185 - time (sec): 13.59 - samples/sec: 2849.01 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:01:56,627 epoch 1 - iter 140/146 - loss 1.24009769 - time (sec): 14.86 - samples/sec: 2874.39 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:01:57,281 ----------------------------------------------------------------------------------------------------
2023-10-17 18:01:57,282 EPOCH 1 done: loss 1.2042 - lr: 0.000029
2023-10-17 18:01:58,453 DEV : loss 0.2369639128446579 - f1-score (micro avg) 0.3429
2023-10-17 18:01:58,459 saving best model
2023-10-17 18:01:58,784 ----------------------------------------------------------------------------------------------------
2023-10-17 18:02:00,122 epoch 2 - iter 14/146 - loss 0.29832645 - time (sec): 1.34 - samples/sec: 3233.20 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:02:01,215 epoch 2 - iter 28/146 - loss 0.27090210 - time (sec): 2.43 - samples/sec: 3144.62 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:02:02,734 epoch 2 - iter 42/146 - loss 0.24774915 - time (sec): 3.95 - samples/sec: 3163.37 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:02:04,573 epoch 2 - iter 56/146 - loss 0.24935139 - time (sec): 5.79 - samples/sec: 3031.59 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:02:05,827 epoch 2 - iter 70/146 - loss 0.24406299 - time (sec): 7.04 - samples/sec: 3021.97 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:02:07,349 epoch 2 - iter 84/146 - loss 0.23566930 - time (sec): 8.56 - samples/sec: 2980.19 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:02:08,992 epoch 2 - iter 98/146 - loss 0.22785994 - time (sec): 10.21 - samples/sec: 2986.86 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:02:10,368 epoch 2 - iter 112/146 - loss 0.22262977 - time (sec): 11.58 - samples/sec: 2997.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:02:11,870 epoch 2 - iter 126/146 - loss 0.22621003 - time (sec): 13.08 - samples/sec: 3001.66 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:02:13,129 epoch 2 - iter 140/146 - loss 0.22404336 - time (sec): 14.34 - samples/sec: 2985.96 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:02:13,639 ----------------------------------------------------------------------------------------------------
2023-10-17 18:02:13,639 EPOCH 2 done: loss 0.2206 - lr: 0.000027
2023-10-17 18:02:14,920 DEV : loss 0.1299794614315033 - f1-score (micro avg) 0.5919
2023-10-17 18:02:14,926 saving best model
2023-10-17 18:02:15,371 ----------------------------------------------------------------------------------------------------
2023-10-17 18:02:16,875 epoch 3 - iter 14/146 - loss 0.12717687 - time (sec): 1.50 - samples/sec: 3121.96 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:02:18,340 epoch 3 - iter 28/146 - loss 0.13016870 - time (sec): 2.97 - samples/sec: 3101.19 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:02:19,758 epoch 3 - iter 42/146 - loss 0.14516907 - time (sec): 4.39 - samples/sec: 3022.70 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:02:21,273 epoch 3 - iter 56/146 - loss 0.13349387 - time (sec): 5.90 - samples/sec: 2944.83 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:02:22,750 epoch 3 - iter 70/146 - loss 0.12864265 - time (sec): 7.38 - samples/sec: 2953.89 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:02:24,312 epoch 3 - iter 84/146 - loss 0.13230498 - time (sec): 8.94 - samples/sec: 2942.26 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:02:25,476 epoch 3 - iter 98/146 - loss 0.12924129 - time (sec): 10.10 - samples/sec: 2969.36 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:02:27,038 epoch 3 - iter 112/146 - loss 0.12219386 - time (sec): 11.66 - samples/sec: 2977.73 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:02:28,564 epoch 3 - iter 126/146 - loss 0.12044637 - time (sec): 13.19 - samples/sec: 2959.39 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:02:30,111 epoch 3 - iter 140/146 - loss 0.12007643 - time (sec): 14.74 - samples/sec: 2913.51 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:02:30,652 ----------------------------------------------------------------------------------------------------
2023-10-17 18:02:30,652 EPOCH 3 done: loss 0.1233 - lr: 0.000024
2023-10-17 18:02:32,289 DEV : loss 0.10894083976745605 - f1-score (micro avg) 0.7187
2023-10-17 18:02:32,299 saving best model
2023-10-17 18:02:32,723 ----------------------------------------------------------------------------------------------------
2023-10-17 18:02:34,231 epoch 4 - iter 14/146 - loss 0.07416896 - time (sec): 1.51 - samples/sec: 3010.11 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:02:35,856 epoch 4 - iter 28/146 - loss 0.07141417 - time (sec): 3.13 - samples/sec: 2907.94 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:02:37,152 epoch 4 - iter 42/146 - loss 0.08970989 - time (sec): 4.43 - samples/sec: 2908.80 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:02:38,867 epoch 4 - iter 56/146 - loss 0.07963099 - time (sec): 6.14 - samples/sec: 2874.00 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:02:40,463 epoch 4 - iter 70/146 - loss 0.07752524 - time (sec): 7.74 - samples/sec: 2873.54 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:02:41,955 epoch 4 - iter 84/146 - loss 0.08042752 - time (sec): 9.23 - samples/sec: 2870.29 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:02:43,558 epoch 4 - iter 98/146 - loss 0.08106679 - time (sec): 10.83 - samples/sec: 2818.94 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:02:45,073 epoch 4 - iter 112/146 - loss 0.08397046 - time (sec): 12.35 - samples/sec: 2831.20 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:02:46,566 epoch 4 - iter 126/146 - loss 0.08434407 - time (sec): 13.84 - samples/sec: 2828.54 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:02:47,841 epoch 4 - iter 140/146 - loss 0.08375380 - time (sec): 15.12 - samples/sec: 2812.99 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:02:48,440 ----------------------------------------------------------------------------------------------------
2023-10-17 18:02:48,440 EPOCH 4 done: loss 0.0837 - lr: 0.000020
2023-10-17 18:02:49,738 DEV : loss 0.11354158073663712 - f1-score (micro avg) 0.7364
2023-10-17 18:02:49,744 saving best model
2023-10-17 18:02:50,202 ----------------------------------------------------------------------------------------------------
2023-10-17 18:02:51,702 epoch 5 - iter 14/146 - loss 0.05382494 - time (sec): 1.50 - samples/sec: 2810.34 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:02:53,009 epoch 5 - iter 28/146 - loss 0.06071119 - time (sec): 2.80 - samples/sec: 2969.28 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:02:54,527 epoch 5 - iter 42/146 - loss 0.05523519 - time (sec): 4.32 - samples/sec: 3056.07 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:02:55,986 epoch 5 - iter 56/146 - loss 0.05750211 - time (sec): 5.78 - samples/sec: 2996.72 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:02:57,546 epoch 5 - iter 70/146 - loss 0.06410589 - time (sec): 7.34 - samples/sec: 2872.91 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:02:59,083 epoch 5 - iter 84/146 - loss 0.06215775 - time (sec): 8.88 - samples/sec: 2907.99 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:03:00,391 epoch 5 - iter 98/146 - loss 0.05920684 - time (sec): 10.18 - samples/sec: 2912.85 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:03:01,906 epoch 5 - iter 112/146 - loss 0.05729003 - time (sec): 11.70 - samples/sec: 2882.89 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:03:03,453 epoch 5 - iter 126/146 - loss 0.05501850 - time (sec): 13.25 - samples/sec: 2905.99 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:03:05,100 epoch 5 - iter 140/146 - loss 0.05545585 - time (sec): 14.89 - samples/sec: 2890.13 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:03:05,589 ----------------------------------------------------------------------------------------------------
2023-10-17 18:03:05,589 EPOCH 5 done: loss 0.0549 - lr: 0.000017
2023-10-17 18:03:06,918 DEV : loss 0.11634857952594757 - f1-score (micro avg) 0.7364
2023-10-17 18:03:06,925 saving best model
2023-10-17 18:03:07,379 ----------------------------------------------------------------------------------------------------
2023-10-17 18:03:08,799 epoch 6 - iter 14/146 - loss 0.04238142 - time (sec): 1.41 - samples/sec: 2913.49 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:03:10,367 epoch 6 - iter 28/146 - loss 0.03655978 - time (sec): 2.98 - samples/sec: 2978.87 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:03:11,678 epoch 6 - iter 42/146 - loss 0.04211034 - time (sec): 4.29 - samples/sec: 2890.83 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:03:13,075 epoch 6 - iter 56/146 - loss 0.04107637 - time (sec): 5.69 - samples/sec: 2815.81 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:03:14,598 epoch 6 - iter 70/146 - loss 0.03835445 - time (sec): 7.21 - samples/sec: 2834.14 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:03:16,185 epoch 6 - iter 84/146 - loss 0.04034492 - time (sec): 8.80 - samples/sec: 2893.99 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:03:17,290 epoch 6 - iter 98/146 - loss 0.03978206 - time (sec): 9.90 - samples/sec: 2918.91 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:03:18,683 epoch 6 - iter 112/146 - loss 0.04031278 - time (sec): 11.30 - samples/sec: 2921.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:03:20,095 epoch 6 - iter 126/146 - loss 0.04005854 - time (sec): 12.71 - samples/sec: 2955.67 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:03:21,608 epoch 6 - iter 140/146 - loss 0.03903056 - time (sec): 14.22 - samples/sec: 2987.59 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:03:22,326 ----------------------------------------------------------------------------------------------------
2023-10-17 18:03:22,326 EPOCH 6 done: loss 0.0384 - lr: 0.000014
2023-10-17 18:03:23,864 DEV : loss 0.11619787663221359 - f1-score (micro avg) 0.7478
2023-10-17 18:03:23,869 saving best model
2023-10-17 18:03:24,312 ----------------------------------------------------------------------------------------------------
2023-10-17 18:03:25,733 epoch 7 - iter 14/146 - loss 0.03971526 - time (sec): 1.41 - samples/sec: 2868.48 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:03:26,974 epoch 7 - iter 28/146 - loss 0.04113722 - time (sec): 2.65 - samples/sec: 2856.67 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:03:28,350 epoch 7 - iter 42/146 - loss 0.03581246 - time (sec): 4.03 - samples/sec: 2894.28 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:03:29,725 epoch 7 - iter 56/146 - loss 0.03125811 - time (sec): 5.41 - samples/sec: 2955.57 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:03:31,234 epoch 7 - iter 70/146 - loss 0.03390483 - time (sec): 6.92 - samples/sec: 2998.62 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:03:32,727 epoch 7 - iter 84/146 - loss 0.03434154 - time (sec): 8.41 - samples/sec: 2922.72 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:03:34,202 epoch 7 - iter 98/146 - loss 0.03222323 - time (sec): 9.88 - samples/sec: 2921.38 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:03:35,723 epoch 7 - iter 112/146 - loss 0.03097942 - time (sec): 11.40 - samples/sec: 2894.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:03:37,396 epoch 7 - iter 126/146 - loss 0.03133453 - time (sec): 13.08 - samples/sec: 2874.13 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:03:38,840 epoch 7 - iter 140/146 - loss 0.02975401 - time (sec): 14.52 - samples/sec: 2914.63 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:03:39,546 ----------------------------------------------------------------------------------------------------
2023-10-17 18:03:39,547 EPOCH 7 done: loss 0.0289 - lr: 0.000010
2023-10-17 18:03:40,808 DEV : loss 0.12387975305318832 - f1-score (micro avg) 0.7793
2023-10-17 18:03:40,813 saving best model
2023-10-17 18:03:41,234 ----------------------------------------------------------------------------------------------------
2023-10-17 18:03:42,601 epoch 8 - iter 14/146 - loss 0.02000203 - time (sec): 1.37 - samples/sec: 3044.44 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:03:44,171 epoch 8 - iter 28/146 - loss 0.01956968 - time (sec): 2.94 - samples/sec: 2897.32 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:03:45,590 epoch 8 - iter 42/146 - loss 0.01928448 - time (sec): 4.35 - samples/sec: 2916.95 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:03:47,084 epoch 8 - iter 56/146 - loss 0.02405329 - time (sec): 5.85 - samples/sec: 2973.71 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:03:48,535 epoch 8 - iter 70/146 - loss 0.02559984 - time (sec): 7.30 - samples/sec: 2977.54 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:03:49,980 epoch 8 - iter 84/146 - loss 0.02430293 - time (sec): 8.74 - samples/sec: 2997.30 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:03:51,652 epoch 8 - iter 98/146 - loss 0.02276596 - time (sec): 10.42 - samples/sec: 2971.35 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:03:52,912 epoch 8 - iter 112/146 - loss 0.02276090 - time (sec): 11.68 - samples/sec: 2946.39 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:03:54,513 epoch 8 - iter 126/146 - loss 0.02259700 - time (sec): 13.28 - samples/sec: 2960.07 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:03:55,891 epoch 8 - iter 140/146 - loss 0.02211640 - time (sec): 14.66 - samples/sec: 2938.42 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:03:56,387 ----------------------------------------------------------------------------------------------------
2023-10-17 18:03:56,387 EPOCH 8 done: loss 0.0219 - lr: 0.000007
2023-10-17 18:03:57,620 DEV : loss 0.12991590797901154 - f1-score (micro avg) 0.7723
2023-10-17 18:03:57,625 ----------------------------------------------------------------------------------------------------
2023-10-17 18:03:58,998 epoch 9 - iter 14/146 - loss 0.02121112 - time (sec): 1.37 - samples/sec: 2778.46 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:04:00,753 epoch 9 - iter 28/146 - loss 0.02062558 - time (sec): 3.13 - samples/sec: 2796.60 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:04:02,551 epoch 9 - iter 42/146 - loss 0.02599310 - time (sec): 4.92 - samples/sec: 2863.48 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:04:03,939 epoch 9 - iter 56/146 - loss 0.02190832 - time (sec): 6.31 - samples/sec: 2853.47 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:04:05,179 epoch 9 - iter 70/146 - loss 0.02164107 - time (sec): 7.55 - samples/sec: 2878.04 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:04:06,666 epoch 9 - iter 84/146 - loss 0.02212824 - time (sec): 9.04 - samples/sec: 2896.08 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:04:08,036 epoch 9 - iter 98/146 - loss 0.02065961 - time (sec): 10.41 - samples/sec: 2874.83 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:04:09,463 epoch 9 - iter 112/146 - loss 0.02049381 - time (sec): 11.84 - samples/sec: 2930.78 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:04:10,874 epoch 9 - iter 126/146 - loss 0.01964867 - time (sec): 13.25 - samples/sec: 2891.82 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:04:12,171 epoch 9 - iter 140/146 - loss 0.01888946 - time (sec): 14.54 - samples/sec: 2895.64 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:04:12,832 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:12,832 EPOCH 9 done: loss 0.0180 - lr: 0.000004
2023-10-17 18:04:14,602 DEV : loss 0.1360742151737213 - f1-score (micro avg) 0.7604
2023-10-17 18:04:14,609 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:16,465 epoch 10 - iter 14/146 - loss 0.02531276 - time (sec): 1.85 - samples/sec: 2708.78 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:04:18,092 epoch 10 - iter 28/146 - loss 0.01920590 - time (sec): 3.48 - samples/sec: 2742.03 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:04:19,521 epoch 10 - iter 42/146 - loss 0.01738091 - time (sec): 4.91 - samples/sec: 2769.85 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:04:20,878 epoch 10 - iter 56/146 - loss 0.01529737 - time (sec): 6.27 - samples/sec: 2803.00 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:04:22,307 epoch 10 - iter 70/146 - loss 0.01440336 - time (sec): 7.70 - samples/sec: 2783.17 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:04:23,876 epoch 10 - iter 84/146 - loss 0.01454689 - time (sec): 9.27 - samples/sec: 2737.19 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:04:25,406 epoch 10 - iter 98/146 - loss 0.01355592 - time (sec): 10.80 - samples/sec: 2754.28 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:04:27,080 epoch 10 - iter 112/146 - loss 0.01381817 - time (sec): 12.47 - samples/sec: 2767.65 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:04:28,326 epoch 10 - iter 126/146 - loss 0.01558305 - time (sec): 13.72 - samples/sec: 2798.28 - lr: 0.000001 - momentum: 0.000000
2023-10-17 18:04:29,771 epoch 10 - iter 140/146 - loss 0.01583718 - time (sec): 15.16 - samples/sec: 2800.43 - lr: 0.000000 - momentum: 0.000000
2023-10-17 18:04:30,590 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:30,591 EPOCH 10 done: loss 0.0152 - lr: 0.000000
2023-10-17 18:04:31,868 DEV : loss 0.1429719179868698 - f1-score (micro avg) 0.7533
2023-10-17 18:04:32,223 ----------------------------------------------------------------------------------------------------
2023-10-17 18:04:32,224 Loading model from best epoch ...
2023-10-17 18:04:33,624 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 18:04:36,114
Results:
- F-score (micro) 0.7598
- F-score (macro) 0.6737
- Accuracy 0.6318

By class:
              precision    recall  f1-score   support

         PER     0.8260    0.8592    0.8423       348
         LOC     0.6483    0.8123    0.7211       261
         ORG     0.4250    0.3269    0.3696        52
   HumanProd     0.8000    0.7273    0.7619        22

   micro avg     0.7263    0.7965    0.7598       683
   macro avg     0.6748    0.6814    0.6737       683
weighted avg     0.7267    0.7965    0.7574       683

2023-10-17 18:04:36,114 ----------------------------------------------------------------------------------------------------
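The test results above come from the last checkpoint written as best-model.pt, i.e. the epoch 7 model (dev micro F1 0.7793). Below is a minimal sketch of loading that checkpoint for inference with Flair; the example sentence is made up and the "ner" label type is an assumption.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint written at the last "saving best model" step above
# (path assembled from the logged base path).
tagger = SequenceTagger.load(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-"
    "bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

# Made-up Finnish example sentence.
sentence = Sentence("Helsingin Sanomat kertoi eilen uutisen Suomesta.")
tagger.predict(sentence)

# The BIOES tags from the 17-tag dictionary are decoded into spans
# (PER, LOC, ORG, HumanProd) by get_spans().
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, round(span.get_label("ner").score, 4))
```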