/home/chaeyun/.conda/envs/risall/lib/python3.9/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
  warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
Local Rank: 0, World Size: 1
RANK and WORLD_SIZE in environment: 0/1
Image size: 480
metric learning flag : False
Namespace(amsgrad=False, batch_size=16, bert_tokenizer='bert-base-uncased', ck_bert='bert-base-uncased', dataset='ref-zom', ddp_trained_weights=False, device='cuda:0', epochs=40, fusion_drop=0.0, img_size=480, lr=5e-05, mha='', model='lavt_one', model_id='refzom_lavt_bs16_repro', output_dir='./models/refzom_lavt_bs16_repro', pin_mem=False, pretrained_swin_weights='./pretrained_weights/swin_base_patch4_window12_384_22k.pth', print_freq=10, refer_data_root='./refer/data/', resume='', split='test', splitBy='final', swin_type='base', weight_decay=0.01, window12=False, workers=8, metric_learning=False, metric_loss_weight=0.1, metric_mode='hardpos_rev3', exclude_multiobj=False, hn_prob=0.0, hp_selection='naive', margin_value=10, temperature=0.05, addzero=False, local_rank=0)
loading dataset ref-zom into memory...
loading dataset split final
creating index...
index created.
DONE (t=6.09s)
loading dataset ref-zom into memory...
loading dataset split final
creating index...
index created.
DONE (t=6.79s)
local rank 0 / global rank 0 successfully built train dataset.
/home/chaeyun/.conda/envs/risall/lib/python3.9/site-packages/torch/utils/data/dataloader.py:557: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
lavt_one
Window size 12!
/home/chaeyun/.conda/envs/risall/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3526.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Initializing Multi-modal Swin Transformer weights from ./pretrained_weights/swin_base_patch4_window12_384_22k.pth
Epoch: [0] [ 0/4276] eta: 11:00:22 lr: 4.999973690357136e-05 loss: 0.8191 (0.8191) time: 9.2662 data: 2.3629 max mem: 31199
Epoch: [0] [ 10/4276] eta: 4:11:30 lr: 4.9997105930824546e-05 loss: 0.7876 (0.7858) time: 3.5374 data: 0.2208 max mem: 33293
Epoch: [0] [ 20/4276] eta: 3:51:48 lr: 4.999447494269448e-05 loss: 0.7170 (0.7298) time: 2.9681 data: 0.0064 max mem: 33293
Epoch: [0] [ 30/4276] eta: 3:44:43 lr: 4.9991843939180193e-05 loss: 0.6004 (0.6728) time: 2.9764 data: 0.0067 max mem: 33293
Epoch: [0] [ 40/4276] eta: 3:40:47 lr: 4.998921292028069e-05 loss: 0.4740 (0.6149) time: 2.9795 data: 0.0069 max mem: 33293
Epoch: [0] [ 50/4276] eta: 3:37:53 lr: 4.998658188599496e-05 loss: 0.4142 (0.5703) time: 2.9664 data: 0.0068 max mem: 33293
Epoch: [0] [ 60/4276] eta: 3:35:42 lr: 4.9983950836322044e-05 loss: 0.3817 (0.5379) time: 2.9524 data: 0.0068 max mem: 33293
Epoch: [0] [ 70/4276] eta: 3:34:10 lr: 4.998131977126093e-05 loss: 0.3365 (0.5087) time: 2.9577 data: 0.0070 max mem: 33293
Epoch: [0] [ 80/4276] eta: 3:32:50 lr: 4.997868869081064e-05 loss: 0.3426 (0.4882) time: 2.9632 data: 0.0070 max mem: 33293
Epoch: [0] [ 90/4276] eta: 3:31:46 lr: 4.997605759497018e-05 loss: 0.3399 (0.4719) time: 2.9650 data: 0.0067 max mem: 33293
Epoch: [0] [ 100/4276] eta: 3:30:40 lr: 4.997342648373855e-05 loss: 0.3237 (0.4614) time: 2.9590 data: 0.0067 max mem: 33293
Epoch: [0] [ 110/4276] eta: 3:29:21 lr: 4.997079535711478e-05 loss: 0.3239 (0.4483) time: 2.9232 data: 0.0066 max mem: 33293
Epoch: [0] [ 120/4276] eta: 3:28:09 lr: 4.996816421509786e-05 loss: 0.2960 (0.4366) time: 2.8957 data: 0.0064 max mem: 33293
Epoch: [0] [ 130/4276] eta: 3:27:13 lr: 4.996553305768681e-05 loss: 0.2965 (0.4286) time: 2.9090 data: 0.0065 max mem: 33293
Epoch: [0] [ 140/4276] eta: 3:26:18 lr: 4.996290188488064e-05 loss: 0.3104 (0.4198) time: 2.9178 data: 0.0064 max mem: 33294
Epoch: [0] [ 150/4276] eta: 3:25:27 lr: 4.996027069667836e-05 loss: 0.2984 (0.4114) time: 2.9146 data: 0.0065 max mem: 33294
Epoch: [0] [ 160/4276] eta: 3:24:36 lr: 4.9957639493078964e-05 loss: 0.2983 (0.4050) time: 2.9115 data: 0.0066 max mem: 33294
Epoch: [0] [ 170/4276] eta: 3:23:47 lr: 4.995500827408148e-05 loss: 0.3237 (0.4007) time: 2.9046 data: 0.0066 max mem: 33294
Epoch: [0] [ 180/4276] eta: 3:23:05 lr: 4.99523770396849e-05 loss: 0.3311 (0.3968) time: 2.9123 data: 0.0065 max mem: 33294
Epoch: [0] [ 190/4276] eta: 3:22:20 lr: 4.994974578988825e-05 loss: 0.3183 (0.3924) time: 2.9124 data: 0.0066 max mem: 33294
Epoch: [0] [ 200/4276] eta: 3:21:37 lr: 4.994711452469053e-05 loss: 0.3114 (0.3884) time: 2.9046 data: 0.0066 max mem: 33294
Epoch: [0] [ 210/4276] eta: 3:20:56 lr: 4.994448324409074e-05 loss: 0.3062 (0.3844) time: 2.9072 data: 0.0064 max mem: 33294
Epoch: [0] [ 220/4276] eta: 3:20:16 lr: 4.9941851948087904e-05 loss: 0.2794 (0.3803) time: 2.9089 data: 0.0064 max mem: 33294
Epoch: [0] [ 230/4276] eta: 3:19:39 lr: 4.9939220636681015e-05 loss: 0.2725 (0.3756) time: 2.9149 data: 0.0064 max mem: 33294
Epoch: [0] [ 240/4276] eta: 3:19:01 lr: 4.993658930986909e-05 loss: 0.2842 (0.3726) time: 2.9181 data: 0.0064 max mem: 33294
Epoch: [0] [ 250/4276] eta: 3:18:25 lr: 4.993395796765113e-05 loss: 0.2928 (0.3703) time: 2.9162 data: 0.0066 max mem: 33294
Epoch: [0] [ 260/4276] eta: 3:17:50 lr: 4.993132661002616e-05 loss: 0.3064 (0.3677) time: 2.9203 data: 0.0065 max mem: 33294
Epoch: [0] [ 270/4276] eta: 3:17:20 lr: 4.992869523699317e-05 loss: 0.3064 (0.3653) time: 2.9367 data: 0.0067 max mem: 33294
Epoch: [0] [ 280/4276] eta: 3:16:50 lr: 4.992606384855117e-05 loss: 0.2700 (0.3619) time: 2.9528 data: 0.0069 max mem: 33294
Epoch: [0] [ 290/4276] eta: 3:16:21 lr: 4.9923432444699166e-05 loss: 0.2592 (0.3587) time: 2.9547 data: 0.0066 max mem: 33294
Epoch: [0] [ 300/4276] eta: 3:15:51 lr: 4.9920801025436183e-05 loss: 0.2614 (0.3556) time: 2.9559 data: 0.0066 max mem: 33295
Epoch: [0] [ 310/4276] eta: 3:15:15 lr: 4.991816959076121e-05 loss: 0.2614 (0.3534) time: 2.9295 data: 0.0066 max mem: 33295
Epoch: [0] [ 320/4276] eta: 3:14:44 lr: 4.991553814067325e-05 loss: 0.2774 (0.3518) time: 2.9210 data: 0.0065 max mem: 33295
Epoch: [0] [ 330/4276] eta: 3:14:10 lr: 4.991290667517133e-05 loss: 0.2774 (0.3494) time: 2.9277 data: 0.0064 max mem: 33295
Epoch: [0] [ 340/4276] eta: 3:13:40 lr: 4.9910275194254444e-05 loss: 0.2652 (0.3471) time: 2.9336 data: 0.0064 max mem: 33295
Epoch: [0] [ 350/4276] eta: 3:13:08 lr: 4.990764369792159e-05 loss: 0.2645 (0.3451) time: 2.9404 data: 0.0064 max mem: 33295
Epoch: [0] [ 360/4276] eta: 3:12:35 lr: 4.990501218617179e-05 loss: 0.2670 (0.3439) time: 2.9270 data: 0.0064 max mem: 33295
Epoch: [0] [ 370/4276] eta: 3:12:04 lr: 4.9902380659004044e-05 loss: 0.2592 (0.3417) time: 2.9307 data: 0.0064 max mem: 33295
Epoch: [0] [ 380/4276] eta: 3:11:36 lr: 4.989974911641736e-05 loss: 0.2384 (0.3399) time: 2.9501 data: 0.0064 max mem: 33295
Epoch: [0] [ 390/4276] eta: 3:11:04 lr: 4.9897117558410747e-05 loss: 0.2829 (0.3387) time: 2.9420 data: 0.0064 max mem: 33295
Epoch: [0] [ 400/4276] eta: 3:10:31 lr: 4.98944859849832e-05 loss: 0.2791 (0.3372) time: 2.9206 data: 0.0066 max mem: 33295
Epoch: [0] [ 410/4276] eta: 3:09:59 lr: 4.989185439613374e-05 loss: 0.2606 (0.3355) time: 2.9218 data: 0.0066 max mem: 33295
Epoch: [0] [ 420/4276] eta: 3:09:27 lr: 4.9889222791861365e-05 loss: 0.2738 (0.3348) time: 2.9215 data: 0.0063 max mem: 33295
Epoch: [0] [ 430/4276] eta: 3:08:54 lr: 4.988659117216507e-05 loss: 0.2746 (0.3334) time: 2.9149 data: 0.0065 max mem: 33295
Epoch: [0] [ 440/4276] eta: 3:08:23 lr: 4.988395953704388e-05 loss: 0.2671 (0.3320) time: 2.9166 data: 0.0068 max mem: 33295
Epoch: [0] [ 450/4276] eta: 3:07:50 lr: 4.988132788649679e-05 loss: 0.2588 (0.3309) time: 2.9128 data: 0.0065 max mem: 33295
Epoch: [0] [ 460/4276] eta: 3:07:21 lr: 4.98786962205228e-05 loss: 0.2576 (0.3294) time: 2.9295 data: 0.0066 max mem: 33295
Epoch: [0] [ 470/4276] eta: 3:06:52 lr: 4.987606453912093e-05 loss: 0.2499 (0.3279) time: 2.9538 data: 0.0067 max mem: 33295
Epoch: [0] [ 480/4276] eta: 3:06:23 lr: 4.9873432842290176e-05 loss: 0.2632 (0.3269) time: 2.9486 data: 0.0064 max mem: 33295
Epoch: [0] [ 490/4276] eta: 3:05:53 lr: 4.9870801130029545e-05 loss: 0.2707 (0.3258) time: 2.9446 data: 0.0064 max mem: 33295
Epoch: [0] [ 500/4276] eta: 3:05:23 lr: 4.986816940233804e-05 loss: 0.2794 (0.3250) time: 2.9408 data: 0.0064 max mem: 33295
Epoch: [0] [ 510/4276] eta: 3:04:54 lr: 4.986553765921467e-05 loss: 0.2553 (0.3235) time: 2.9471 data: 0.0065 max mem: 33295
Epoch: [0] [ 520/4276] eta: 3:04:25 lr: 4.986290590065843e-05 loss: 0.2514 (0.3225) time: 2.9524 data: 0.0068 max mem: 33296
Epoch: [0] [ 530/4276] eta: 3:03:56 lr: 4.986027412666833e-05 loss: 0.2751 (0.3216) time: 2.9525 data: 0.0066 max mem: 33296
Epoch: [0] [ 540/4276] eta: 3:03:27 lr: 4.985764233724338e-05 loss: 0.2597 (0.3206) time: 2.9538 data: 0.0064 max mem: 33296
Epoch: [0] [ 550/4276] eta: 3:02:58 lr: 4.985501053238258e-05 loss: 0.2672 (0.3200) time: 2.9545 data: 0.0065 max mem: 33296
Epoch: [0] [ 560/4276] eta: 3:02:29 lr: 4.985237871208492e-05 loss: 0.2754 (0.3193) time: 2.9552 data: 0.0067 max mem: 33296
Epoch: [0] [ 570/4276] eta: 3:02:00 lr: 4.984974687634943e-05 loss: 0.2663 (0.3182) time: 2.9553 data: 0.0067 max mem: 33296
Epoch: [0] [ 580/4276] eta: 3:01:30 lr: 4.98471150251751e-05 loss: 0.2662 (0.3173) time: 2.9430 data: 0.0067 max mem: 33297
Epoch: [0] [ 590/4276] eta: 3:01:01 lr: 4.9844483158560936e-05 loss: 0.2483 (0.3162) time: 2.9392 data: 0.0068 max mem: 33297
Epoch: [0] [ 600/4276] eta: 3:00:31 lr: 4.984185127650594e-05 loss: 0.2539 (0.3154) time: 2.9426 data: 0.0066 max mem: 33297
Epoch: [0] [ 610/4276] eta: 3:00:00 lr: 4.983921937900911e-05 loss: 0.2558 (0.3144) time: 2.9300 data: 0.0066 max mem: 33297
Epoch: [0] [ 620/4276] eta: 2:59:29 lr: 4.983658746606946e-05 loss: 0.2447 (0.3136) time: 2.9264 data: 0.0067 max mem: 33297
Epoch: [0] [ 630/4276] eta: 2:59:00 lr: 4.9833955537685986e-05 loss: 0.2843 (0.3134) time: 2.9386 data: 0.0066 max mem: 33297
Epoch: [0] [ 640/4276] eta: 2:58:30 lr: 4.983132359385769e-05 loss: 0.2635 (0.3124) time: 2.9423 data: 0.0065 max mem: 33297
Epoch: [0] [ 650/4276] eta: 2:58:01 lr: 4.982869163458359e-05 loss: 0.2460 (0.3115) time: 2.9446 data: 0.0065 max mem: 33297
Epoch: [0] [ 660/4276] eta: 2:57:32 lr: 4.982605965986266e-05 loss: 0.2472 (0.3107) time: 2.9502 data: 0.0066 max mem: 33297
Epoch: [0] [ 670/4276] eta: 2:57:01 lr: 4.982342766969393e-05 loss: 0.2508 (0.3098) time: 2.9399 data: 0.0069 max mem: 33297
Epoch: [0] [ 680/4276] eta: 2:56:30 lr: 4.982079566407639e-05 loss: 0.2459 (0.3089) time: 2.9255 data: 0.0069 max mem: 33297
Epoch: [0] [ 690/4276] eta: 2:56:00 lr: 4.9818163643009045e-05 loss: 0.2459 (0.3081) time: 2.9251 data: 0.0066 max mem: 33297
Epoch: [0] [ 700/4276] eta: 2:55:30 lr: 4.98155316064909e-05 loss: 0.2499 (0.3075) time: 2.9295 data: 0.0067 max mem: 33297
Epoch: [0] [ 710/4276] eta: 2:54:59 lr: 4.981289955452095e-05 loss: 0.2535 (0.3071) time: 2.9263 data: 0.0068 max mem: 33297
Epoch: [0] [ 720/4276] eta: 2:54:29 lr: 4.98102674870982e-05 loss: 0.2593 (0.3064) time: 2.9213 data: 0.0068 max mem: 33297
Epoch: [0] [ 730/4276] eta: 2:53:58 lr: 4.980763540422166e-05 loss: 0.2648 (0.3061) time: 2.9200 data: 0.0067 max mem: 33297
Epoch: [0] [ 740/4276] eta: 2:53:29 lr: 4.9805003305890326e-05 loss: 0.2617 (0.3057) time: 2.9310 data: 0.0065 max mem: 33297
Epoch: [0] [ 750/4276] eta: 2:53:00 lr: 4.98023711921032e-05 loss: 0.2466 (0.3050) time: 2.9544 data: 0.0067 max mem: 33297
Epoch: [0] [ 760/4276] eta: 2:52:31 lr: 4.9799739062859274e-05 loss: 0.2323 (0.3041) time: 2.9601 data: 0.0068 max mem: 33297
Epoch: [0] [ 770/4276] eta: 2:52:00 lr: 4.9797106918157566e-05 loss: 0.2379 (0.3037) time: 2.9295 data: 0.0065 max mem: 33297
Epoch: [0] [ 780/4276] eta: 2:51:29 lr: 4.979447475799707e-05 loss: 0.2628 (0.3033) time: 2.9044 data: 0.0063 max mem: 33297
Epoch: [0] [ 790/4276] eta: 2:50:57 lr: 4.979184258237678e-05 loss: 0.2672 (0.3029) time: 2.8919 data: 0.0065 max mem: 33297
Epoch: [0] [ 800/4276] eta: 2:50:27 lr: 4.97892103912957e-05 loss: 0.2608 (0.3025) time: 2.9055 data: 0.0065 max mem: 33297
Epoch: [0] [ 810/4276] eta: 2:49:57 lr: 4.978657818475285e-05 loss: 0.2499 (0.3018) time: 2.9299 data: 0.0068 max mem: 33297
Epoch: [0] [ 820/4276] eta: 2:49:28 lr: 4.978394596274721e-05 loss: 0.2581 (0.3014) time: 2.9392 data: 0.0068 max mem: 33297
Epoch: [0] [ 830/4276] eta: 2:48:58 lr: 4.978131372527778e-05 loss: 0.2691 (0.3011) time: 2.9471 data: 0.0066 max mem: 33297
Epoch: [0] [ 840/4276] eta: 2:48:29 lr: 4.977868147234358e-05 loss: 0.2533 (0.3009) time: 2.9447 data: 0.0068 max mem: 33297
Epoch: [0] [ 850/4276] eta: 2:48:00 lr: 4.977604920394359e-05 loss: 0.2533 (0.3005) time: 2.9436 data: 0.0066 max mem: 33297
Epoch: [0] [ 860/4276] eta: 2:47:30 lr: 4.9773416920076824e-05 loss: 0.2370 (0.3000) time: 2.9459 data: 0.0065 max mem: 33297
Epoch: [0] [ 870/4276] eta: 2:47:01 lr: 4.9770784620742265e-05 loss: 0.2456 (0.2997) time: 2.9440 data: 0.0065 max mem: 33297
Epoch: [0] [ 880/4276] eta: 2:46:32 lr: 4.976815230593893e-05 loss: 0.2593 (0.2993) time: 2.9451 data: 0.0067 max mem: 33297
Epoch: [0] [ 890/4276] eta: 2:46:03 lr: 4.976551997566581e-05 loss: 0.2593 (0.2989) time: 2.9510 data: 0.0070 max mem: 33297
Epoch: [0] [ 900/4276] eta: 2:45:34 lr: 4.976288762992192e-05 loss: 0.2581 (0.2985) time: 2.9506 data: 0.0070 max mem: 33297
Epoch: [0] [ 910/4276] eta: 2:45:04 lr: 4.976025526870624e-05 loss: 0.2618 (0.2981) time: 2.9484 data: 0.0066 max mem: 33297
Epoch: [0] [ 920/4276] eta: 2:44:35 lr: 4.975762289201778e-05 loss: 0.2618 (0.2976) time: 2.9486 data: 0.0068 max mem: 33297
Epoch: [0] [ 930/4276] eta: 2:44:06 lr: 4.9754990499855535e-05 loss: 0.2517 (0.2972) time: 2.9530 data: 0.0070 max mem: 33297
Epoch: [0] [ 940/4276] eta: 2:43:37 lr: 4.975235809221851e-05 loss: 0.2468 (0.2967) time: 2.9533 data: 0.0068 max mem: 33297
Epoch: [0] [ 950/4276] eta: 2:43:08 lr: 4.97497256691057e-05 loss: 0.2436 (0.2961) time: 2.9510 data: 0.0068 max mem: 33297
Epoch: [0] [ 960/4276] eta: 2:42:39 lr: 4.97470932305161e-05 loss: 0.2458 (0.2956) time: 2.9502 data: 0.0068 max mem: 33297
Epoch: [0] [ 970/4276] eta: 2:42:09 lr: 4.974446077644872e-05 loss: 0.2496 (0.2952) time: 2.9493 data: 0.0071 max mem: 33297
Epoch: [0] [ 980/4276] eta: 2:41:40 lr: 4.974182830690255e-05 loss: 0.2540 (0.2953) time: 2.9483 data: 0.0071 max mem: 33297
Epoch: [0] [ 990/4276] eta: 2:41:11 lr: 4.97391958218766e-05 loss: 0.2564 (0.2948) time: 2.9488 data: 0.0070 max mem: 33297
Epoch: [0] [1000/4276] eta: 2:40:42 lr: 4.973656332136985e-05 loss: 0.2580 (0.2945) time: 2.9516 data: 0.0072 max mem: 33297
Epoch: [0] [1010/4276] eta: 2:40:13 lr: 4.973393080538131e-05 loss: 0.2435 (0.2940) time: 2.9514 data: 0.0068 max mem: 33297
Epoch: [0] [1020/4276] eta: 2:39:43 lr: 4.973129827390998e-05 loss: 0.2447 (0.2937) time: 2.9430 data: 0.0067 max mem: 33297
Epoch: [0] [1030/4276] eta: 2:39:13 lr: 4.9728665726954854e-05 loss: 0.2573 (0.2936) time: 2.9376 data: 0.0068 max mem: 33297
Epoch: [0] [1040/4276] eta: 2:38:44 lr: 4.972603316451494e-05 loss: 0.2459 (0.2931) time: 2.9444 data: 0.0068 max mem: 33297
Epoch: [0] [1050/4276] eta: 2:38:15 lr: 4.972340058658922e-05 loss: 0.2459 (0.2931) time: 2.9493 data: 0.0066 max mem: 33297
Epoch: [0] [1060/4276] eta: 2:37:46 lr: 4.97207679931767e-05 loss: 0.2493 (0.2930) time: 2.9553 data: 0.0065 max mem: 33297
Epoch: [0] [1070/4276] eta: 2:37:17 lr: 4.971813538427638e-05 loss: 0.2410 (0.2925) time: 2.9544 data: 0.0065 max mem: 33297
Epoch: [0] [1080/4276] eta: 2:36:48 lr: 4.971550275988726e-05 loss: 0.2362 (0.2921) time: 2.9590 data: 0.0067 max mem: 33297
Epoch: [0] [1090/4276] eta: 2:36:19 lr: 4.971287012000833e-05 loss: 0.2386 (0.2917) time: 2.9576 data: 0.0069 max mem: 33297
Epoch: [0] [1100/4276] eta: 2:35:49 lr: 4.971023746463858e-05 loss: 0.2389 (0.2913) time: 2.9450 data: 0.0067 max mem: 33297
Epoch: [0] [1110/4276] eta: 2:35:20 lr: 4.970760479377703e-05 loss: 0.2571 (0.2912) time: 2.9469 data: 0.0066 max mem: 33297
Epoch: [0] [1120/4276] eta: 2:34:51 lr: 4.970497210742266e-05 loss: 0.2620 (0.2911) time: 2.9501 data: 0.0065 max mem: 33297
Epoch: [0] [1130/4276] eta: 2:34:21 lr: 4.970233940557447e-05 loss: 0.2468 (0.2908) time: 2.9507 data: 0.0066 max mem: 33297
Epoch: [0] [1140/4276] eta: 2:33:52 lr: 4.9699706688231465e-05 loss: 0.2450 (0.2904) time: 2.9460 data: 0.0068 max mem: 33297
Epoch: [0] [1150/4276] eta: 2:33:23 lr: 4.9697073955392634e-05 loss: 0.2669 (0.2903) time: 2.9432 data: 0.0066 max mem: 33297
Epoch: [0] [1160/4276] eta: 2:32:52 lr: 4.969444120705697e-05 loss: 0.2669 (0.2901) time: 2.9251 data: 0.0066 max mem: 33297
Epoch: [0] [1170/4276] eta: 2:32:22 lr: 4.969180844322347e-05 loss: 0.2510 (0.2897) time: 2.9199 data: 0.0069 max mem: 33297
Epoch: [0] [1180/4276] eta: 2:31:53 lr: 4.968917566389114e-05 loss: 0.2454 (0.2892) time: 2.9447 data: 0.0070 max mem: 33297
Epoch: [0] [1190/4276] eta: 2:31:24 lr: 4.9686542869058976e-05 loss: 0.2454 (0.2889) time: 2.9528 data: 0.0070 max mem: 33297
Epoch: [0] [1200/4276] eta: 2:30:55 lr: 4.968391005872597e-05 loss: 0.2496 (0.2886) time: 2.9556 data: 0.0070 max mem: 33297
Epoch: [0] [1210/4276] eta: 2:30:26 lr: 4.968127723289112e-05 loss: 0.2381 (0.2883) time: 2.9551 data: 0.0068 max mem: 33297
Epoch: [0] [1220/4276] eta: 2:29:56 lr: 4.967864439155341e-05 loss: 0.2339 (0.2879) time: 2.9497 data: 0.0066 max mem: 33297
Epoch: [0] [1230/4276] eta: 2:29:27 lr: 4.967601153471185e-05 loss: 0.2318 (0.2877) time: 2.9497 data: 0.0065 max mem: 33297
Epoch: [0] [1240/4276] eta: 2:28:57 lr: 4.967337866236543e-05 loss: 0.2562 (0.2875) time: 2.9421 data: 0.0066 max mem: 33297
Epoch: [0] [1250/4276] eta: 2:28:27 lr: 4.9670745774513146e-05 loss: 0.2563 (0.2872) time: 2.9144 data: 0.0066 max mem: 33297
Epoch: [0] [1260/4276] eta: 2:27:56 lr: 4.966811287115399e-05 loss: 0.2285 (0.2868) time: 2.9000 data: 0.0069 max mem: 33297
Epoch: [0] [1270/4276] eta: 2:27:26 lr: 4.966547995228697e-05 loss: 0.2553 (0.2868) time: 2.8981 data: 0.0069 max mem: 33297
Epoch: [0] [1280/4276] eta: 2:26:55 lr: 4.966284701791107e-05 loss: 0.2601 (0.2866) time: 2.8980 data: 0.0064 max mem: 33297
Epoch: [0] [1290/4276] eta: 2:26:26 lr: 4.9660214068025285e-05 loss: 0.2456 (0.2865) time: 2.9142 data: 0.0067 max mem: 33297
Epoch: [0] [1300/4276] eta: 2:25:56 lr: 4.965758110262861e-05 loss: 0.2558 (0.2861) time: 2.9323 data: 0.0067 max mem: 33297
Epoch: [0] [1310/4276] eta: 2:25:27 lr: 4.965494812172005e-05 loss: 0.2216 (0.2857) time: 2.9485 data: 0.0066 max mem: 33297
Epoch: [0] [1320/4276] eta: 2:24:58 lr: 4.9652315125298596e-05 loss: 0.2378 (0.2857) time: 2.9542 data: 0.0068 max mem: 33297
Epoch: [0] [1330/4276] eta: 2:24:29 lr: 4.964968211336323e-05 loss: 0.2374 (0.2853) time: 2.9522 data: 0.0069 max mem: 33297
Epoch: [0] [1340/4276] eta: 2:23:59 lr: 4.964704908591295e-05 loss: 0.2330 (0.2849) time: 2.9516 data: 0.0068 max mem: 33297
Epoch: [0] [1350/4276] eta: 2:23:30 lr: 4.9644416042946765e-05 loss: 0.2636 (0.2847) time: 2.9523 data: 0.0065 max mem: 33297
Epoch: [0] [1360/4276] eta: 2:23:01 lr: 4.964178298446365e-05 loss: 0.2394 (0.2845) time: 2.9532 data: 0.0067 max mem: 33297
Epoch: [0] [1370/4276] eta: 2:22:32 lr: 4.963914991046262e-05 loss: 0.2375 (0.2841) time: 2.9540 data: 0.0070 max mem: 33297
Epoch: [0] [1380/4276] eta: 2:22:02 lr: 4.963651682094265e-05 loss: 0.2385 (0.2840) time: 2.9531 data: 0.0075 max mem: 33297
Epoch: [0] [1390/4276] eta: 2:21:32 lr: 4.9633883715902746e-05 loss: 0.2413 (0.2837) time: 2.9308 data: 0.0076 max mem: 33297
Epoch: [0] [1400/4276] eta: 2:21:02 lr: 4.963125059534189e-05 loss: 0.2413 (0.2836) time: 2.9087 data: 0.0069 max mem: 33297
Epoch: [0] [1410/4276] eta: 2:20:32 lr: 4.962861745925908e-05 loss: 0.2433 (0.2834) time: 2.9090 data: 0.0067 max mem: 33297
Epoch: [0] [1420/4276] eta: 2:20:03 lr: 4.962598430765332e-05 loss: 0.2565 (0.2833) time: 2.9282 data: 0.0065 max mem: 33297
Epoch: [0] [1430/4276] eta: 2:19:34 lr: 4.96233511405236e-05 loss: 0.2446 (0.2830) time: 2.9499 data: 0.0066 max mem: 33297
Epoch: [0] [1440/4276] eta: 2:19:04 lr: 4.96207179578689e-05 loss: 0.2347 (0.2831) time: 2.9531 data: 0.0067 max mem: 33297
Epoch: [0] [1450/4276] eta: 2:18:35 lr: 4.961808475968822e-05 loss: 0.2529 (0.2829) time: 2.9556 data: 0.0065 max mem: 33297
Epoch: [0] [1460/4276] eta: 2:18:06 lr: 4.961545154598057e-05 loss: 0.2529 (0.2828) time: 2.9526 data: 0.0065 max mem: 33297
Epoch: [0] [1470/4276] eta: 2:17:37 lr: 4.961281831674491e-05 loss: 0.2483 (0.2826) time: 2.9508 data: 0.0066 max mem: 33297
Epoch: [0] [1480/4276] eta: 2:17:07 lr: 4.961018507198025e-05 loss: 0.2461 (0.2825) time: 2.9503 data: 0.0069 max mem: 33297
Epoch: [0] [1490/4276] eta: 2:16:38 lr: 4.960755181168559e-05 loss: 0.2453 (0.2822) time: 2.9456 data: 0.0070 max mem: 33297
Epoch: [0] [1500/4276] eta: 2:16:09 lr: 4.9604918535859916e-05 loss: 0.2324 (0.2819) time: 2.9486 data: 0.0068 max mem: 33297
Epoch: [0] [1510/4276] eta: 2:15:40 lr: 4.9602285244502217e-05 loss: 0.2318 (0.2818) time: 2.9530 data: 0.0068 max mem: 33297
Epoch: [0] [1520/4276] eta: 2:15:10 lr: 4.9599651937611484e-05 loss: 0.2382 (0.2815) time: 2.9524 data: 0.0071 max mem: 33297
Epoch: [0] [1530/4276] eta: 2:14:41 lr: 4.9597018615186716e-05 loss: 0.2382 (0.2814) time: 2.9489 data: 0.0071 max mem: 33297
Epoch: [0] [1540/4276] eta: 2:14:11 lr: 4.9594385277226904e-05 loss: 0.2406 (0.2811) time: 2.9445 data: 0.0071 max mem: 33297
Epoch: [0] [1550/4276] eta: 2:13:42 lr: 4.9591751923731036e-05 loss: 0.2432 (0.2809) time: 2.9323 data: 0.0072 max mem: 33297
Epoch: [0] [1560/4276] eta: 2:13:11 lr: 4.95891185546981e-05 loss: 0.2333 (0.2806) time: 2.9095 data: 0.0070 max mem: 33297
Epoch: [0] [1570/4276] eta: 2:12:41 lr: 4.958648517012709e-05 loss: 0.2248 (0.2802) time: 2.9043 data: 0.0069 max mem: 33297
Epoch: [0] [1580/4276] eta: 2:12:11 lr: 4.9583851770017016e-05 loss: 0.2228 (0.2801) time: 2.9046 data: 0.0073 max mem: 33297
Epoch: [0] [1590/4276] eta: 2:11:41 lr: 4.958121835436684e-05 loss: 0.2448 (0.2799) time: 2.9078 data: 0.0070 max mem: 33297
Epoch: [0] [1600/4276] eta: 2:11:12 lr: 4.957858492317557e-05 loss: 0.2510 (0.2800) time: 2.9349 data: 0.0068 max mem: 33297
Epoch: [0] [1610/4276] eta: 2:10:43 lr: 4.9575951476442196e-05 loss: 0.2257 (0.2796) time: 2.9496 data: 0.0069 max mem: 33297
Epoch: [0] [1620/4276] eta: 2:10:14 lr: 4.95733180141657e-05 loss: 0.2257 (0.2794) time: 2.9485 data: 0.0070 max mem: 33297
Epoch: [0] [1630/4276] eta: 2:09:44 lr: 4.957068453634508e-05 loss: 0.2445 (0.2795) time: 2.9485 data: 0.0074 max mem: 33297
Epoch: [0] [1640/4276] eta: 2:09:15 lr: 4.9568051042979336e-05 loss: 0.2494 (0.2794) time: 2.9492 data: 0.0069 max mem: 33297
Epoch: [0] [1650/4276] eta: 2:08:46 lr: 4.9565417534067444e-05 loss: 0.2494 (0.2793) time: 2.9468 data: 0.0067 max mem: 33297
Epoch: [0] [1660/4276] eta: 2:08:16 lr: 4.95627840096084e-05 loss: 0.2335 (0.2790) time: 2.9463 data: 0.0068 max mem: 33297
Epoch: [0] [1670/4276] eta: 2:07:47 lr: 4.956015046960119e-05 loss: 0.2361 (0.2789) time: 2.9483 data: 0.0066 max mem: 33297
Epoch: [0] [1680/4276] eta: 2:07:18 lr: 4.9557516914044805e-05 loss: 0.2649 (0.2789) time: 2.9530 data: 0.0070 max mem: 33297
Epoch: [0] [1690/4276] eta: 2:06:48 lr: 4.955488334293824e-05 loss: 0.2531 (0.2787) time: 2.9546 data: 0.0069 max mem: 33297
Epoch: [0] [1700/4276] eta: 2:06:19 lr: 4.9552249756280486e-05 loss: 0.2376 (0.2786) time: 2.9536 data: 0.0069 max mem: 33297
Epoch: [0] [1710/4276] eta: 2:05:50 lr: 4.954961615407053e-05 loss: 0.2746 (0.2785) time: 2.9517 data: 0.0068 max mem: 33297
Epoch: [0] [1720/4276] eta: 2:05:20 lr: 4.954698253630735e-05 loss: 0.2704 (0.2786) time: 2.9458 data: 0.0067 max mem: 33297
Epoch: [0] [1730/4276] eta: 2:04:51 lr: 4.9544348902989954e-05 loss: 0.2485 (0.2783) time: 2.9469 data: 0.0070 max mem: 33297
Epoch: [0] [1740/4276] eta: 2:04:22 lr: 4.954171525411733e-05 loss: 0.2254 (0.2782) time: 2.9522 data: 0.0071 max mem: 33297
Epoch: [0] [1750/4276] eta: 2:03:53 lr: 4.953908158968846e-05 loss: 0.2630 (0.2782) time: 2.9511 data: 0.0073 max mem: 33297
Epoch: [0] [1760/4276] eta: 2:03:23 lr: 4.9536447909702324e-05 loss: 0.2630 (0.2779) time: 2.9466 data: 0.0072 max mem: 33297
Epoch: [0] [1770/4276] eta: 2:02:54 lr: 4.953381421415793e-05 loss: 0.2362 (0.2777) time: 2.9463 data: 0.0069 max mem: 33297
Epoch: [0] [1780/4276] eta: 2:02:24 lr: 4.953118050305426e-05 loss: 0.2516 (0.2776) time: 2.9473 data: 0.0068 max mem: 33297
Epoch: [0] [1790/4276] eta: 2:01:55 lr: 4.9528546776390295e-05 loss: 0.2287 (0.2773) time: 2.9509 data: 0.0070 max mem: 33297
Epoch: [0] [1800/4276] eta: 2:01:26 lr: 4.9525913034165035e-05 loss: 0.2263 (0.2772) time: 2.9501 data: 0.0070 max mem: 33297
Epoch: [0] [1810/4276] eta: 2:00:57 lr: 4.952327927637747e-05 loss: 0.2489 (0.2771) time: 2.9503 data: 0.0068 max mem: 33297
Epoch: [0] [1820/4276] eta: 2:00:27 lr: 4.9520645503026567e-05 loss: 0.2422 (0.2769) time: 2.9501 data: 0.0070 max mem: 33297
Epoch: [0] [1830/4276] eta: 1:59:58 lr: 4.951801171411133e-05 loss: 0.2422 (0.2768) time: 2.9488 data: 0.0072 max mem: 33297
Epoch: [0] [1840/4276] eta: 1:59:28 lr: 4.9515377909630756e-05 loss: 0.2343 (0.2767) time: 2.9490 data: 0.0070 max mem: 33297
Epoch: [0] [1850/4276] eta: 1:58:59 lr: 4.951274408958383e-05 loss: 0.2509 (0.2766) time: 2.9408 data: 0.0067 max mem: 33297
Epoch: [0] [1860/4276] eta: 1:58:29 lr: 4.951011025396952e-05 loss: 0.2443 (0.2764) time: 2.9291 data: 0.0071 max mem: 33297
Epoch: [0] [1870/4276] eta: 1:58:00 lr: 4.950747640278683e-05 loss: 0.2373 (0.2764) time: 2.9219 data: 0.0078 max mem: 33297
Epoch: [0] [1880/4276] eta: 1:57:30 lr: 4.950484253603475e-05 loss: 0.2539 (0.2762) time: 2.9238 data: 0.0075 max mem: 33297
Epoch: [0] [1890/4276] eta: 1:57:00 lr: 4.950220865371226e-05 loss: 0.2433 (0.2759) time: 2.9109 data: 0.0070 max mem: 33297
Epoch: [0] [1900/4276] eta: 1:56:30 lr: 4.949957475581835e-05 loss: 0.2374 (0.2757) time: 2.9214 data: 0.0069 max mem: 33297
Epoch: [0] [1910/4276] eta: 1:56:01 lr: 4.9496940842352e-05 loss: 0.2374 (0.2756) time: 2.9461 data: 0.0067 max mem: 33297
Epoch: [0] [1920/4276] eta: 1:55:31 lr: 4.949430691331222e-05 loss: 0.2358 (0.2753) time: 2.9309 data: 0.0066 max mem: 33297
Epoch: [0] [1930/4276] eta: 1:55:02 lr: 4.949167296869797e-05 loss: 0.2280 (0.2751) time: 2.9223 data: 0.0069 max mem: 33297
Epoch: [0] [1940/4276] eta: 1:54:32 lr: 4.9489039008508256e-05 loss: 0.2574 (0.2751) time: 2.9096 data: 0.0068 max mem: 33297
Epoch: [0] [1950/4276] eta: 1:54:02 lr: 4.948640503274205e-05 loss: 0.2496 (0.2749) time: 2.8917 data: 0.0066 max mem: 33297
Epoch: [0] [1960/4276] eta: 1:53:32 lr: 4.9483771041398345e-05 loss: 0.2193 (0.2747) time: 2.9079 data: 0.0068 max mem: 33297
Epoch: [0] [1970/4276] eta: 1:53:03 lr: 4.9481137034476136e-05 loss: 0.2109 (0.2744) time: 2.9336 data: 0.0070 max mem: 33297
Epoch: [0] [1980/4276] eta: 1:52:33 lr: 4.9478503011974395e-05 loss: 0.2109 (0.2741) time: 2.9461 data: 0.0069 max mem: 33297
Epoch: [0] [1990/4276] eta: 1:52:04 lr: 4.947586897389212e-05 loss: 0.2252 (0.2740) time: 2.9490 data: 0.0072 max mem: 33297
Epoch: [0] [2000/4276] eta: 1:51:35 lr: 4.9473234920228295e-05 loss: 0.2395 (0.2739) time: 2.9487 data: 0.0072 max mem: 33297
Epoch: [0] [2010/4276] eta: 1:51:05 lr: 4.94706008509819e-05 loss: 0.2395 (0.2737) time: 2.9492 data: 0.0072 max mem: 33297
Epoch: [0] [2020/4276] eta: 1:50:36 lr: 4.946796676615192e-05 loss: 0.2497 (0.2737) time: 2.9491 data: 0.0076 max mem: 33297
Epoch: [0] [2030/4276] eta: 1:50:07 lr: 4.946533266573735e-05 loss: 0.2280 (0.2734) time: 2.9483 data: 0.0077 max mem: 33297
Epoch: [0] [2040/4276] eta: 1:49:37 lr: 4.946269854973718e-05 loss: 0.2137 (0.2733) time: 2.9495 data: 0.0072 max mem: 33297
Epoch: [0] [2050/4276] eta: 1:49:08 lr: 4.946006441815038e-05 loss: 0.2293 (0.2732) time: 2.9487 data: 0.0071 max mem: 33297
Epoch: [0] [2060/4276] eta: 1:48:38 lr: 4.945743027097593e-05 loss: 0.2427 (0.2731) time: 2.9352 data: 0.0074 max mem: 33297
Epoch: [0] [2070/4276] eta: 1:48:09 lr: 4.945479610821284e-05 loss: 0.2442 (0.2730) time: 2.9226 data: 0.0076 max mem: 33297
Epoch: [0] [2080/4276] eta: 1:47:39 lr: 4.945216192986008e-05 loss: 0.2442 (0.2729) time: 2.9116 data: 0.0077 max mem: 33297
Epoch: [0] [2090/4276] eta: 1:47:09 lr: 4.944952773591663e-05 loss: 0.2410 (0.2727) time: 2.9016 data: 0.0071 max mem: 33297
Epoch: [0] [2100/4276] eta: 1:46:39 lr: 4.9446893526381486e-05 loss: 0.2371 (0.2726) time: 2.9076 data: 0.0070 max mem: 33297
Epoch: [0] [2110/4276] eta: 1:46:10 lr: 4.944425930125363e-05 loss: 0.2346 (0.2724) time: 2.9115 data: 0.0073 max mem: 33297
Epoch: [0] [2120/4276] eta: 1:45:40 lr: 4.944162506053206e-05 loss: 0.2323 (0.2723) time: 2.9052 data: 0.0071 max mem: 33297
Epoch: [0] [2130/4276] eta: 1:45:10 lr: 4.943899080421572e-05 loss: 0.2456 (0.2722) time: 2.9237 data: 0.0070 max mem: 33297
Epoch: [0] [2140/4276] eta: 1:44:41 lr: 4.943635653230363e-05 loss: 0.2497 (0.2721) time: 2.9492 data: 0.0070 max mem: 33297
Epoch: [0] [2150/4276] eta: 1:44:12 lr: 4.943372224479477e-05 loss: 0.2360 (0.2720) time: 2.9503 data: 0.0073 max mem: 33297
Epoch: [0] [2160/4276] eta: 1:43:43 lr: 4.9431087941688126e-05 loss: 0.2337 (0.2719) time: 2.9557 data: 0.0074 max mem: 33297
Epoch: [0] [2170/4276] eta: 1:43:13 lr: 4.942845362298267e-05 loss: 0.2393 (0.2719) time: 2.9566 data: 0.0076 max mem: 33297
Epoch: [0] [2180/4276] eta: 1:42:44 lr: 4.942581928867738e-05 loss: 0.2517 (0.2718) time: 2.9497 data: 0.0074 max mem: 33297
Epoch: [0] [2190/4276] eta: 1:42:15 lr: 4.942318493877126e-05 loss: 0.2488 (0.2718) time: 2.9494 data: 0.0072 max mem: 33297
Epoch: [0] [2200/4276] eta: 1:41:45 lr: 4.942055057326329e-05 loss: 0.2401 (0.2716) time: 2.9460 data: 0.0070 max mem: 33297
Epoch: [0] [2210/4276] eta: 1:41:16 lr: 4.9417916192152437e-05 loss: 0.2365 (0.2716) time: 2.9506 data: 0.0068 max mem: 33297
Epoch: [0] [2220/4276] eta: 1:40:47 lr: 4.941528179543769e-05 loss: 0.2389 (0.2714) time: 2.9527 data: 0.0071 max mem: 33297
Epoch: [0] [2230/4276] eta: 1:40:17 lr: 4.9412647383118054e-05 loss: 0.2323 (0.2713) time: 2.9472 data: 0.0074 max mem: 33297
Epoch: [0] [2240/4276] eta: 1:39:48 lr: 4.941001295519249e-05 loss: 0.2250 (0.2711) time: 2.9455 data: 0.0073 max mem: 33297
Epoch: [0] [2250/4276] eta: 1:39:19 lr: 4.9407378511659986e-05 loss: 0.2375 (0.2710) time: 2.9428 data: 0.0071 max mem: 33297
Epoch: [0] [2260/4276] eta: 1:38:49 lr: 4.9404744052519524e-05 loss: 0.2450 (0.2710) time: 2.9460 data: 0.0071 max mem: 33297
Epoch: [0] [2270/4276] eta: 1:38:20 lr: 4.940210957777009e-05 loss: 0.2381 (0.2709) time: 2.9479 data: 0.0069 max mem: 33297
Epoch: [0] [2280/4276] eta: 1:37:50 lr: 4.939947508741067e-05 loss: 0.2237 (0.2707) time: 2.9441 data: 0.0067 max mem: 33297
Epoch: [0] [2290/4276] eta: 1:37:21 lr: 4.939684058144023e-05 loss: 0.2291 (0.2706) time: 2.9456 data: 0.0069 max mem: 33297
Epoch: [0] [2300/4276] eta: 1:36:52 lr: 4.939420605985777e-05 loss: 0.2121 (0.2703) time: 2.9479 data: 0.0071 max mem: 33297
Epoch: [0] [2310/4276] eta: 1:36:22 lr: 4.939157152266227e-05 loss: 0.2182 (0.2702) time: 2.9456 data: 0.0071 max mem: 33297
Epoch: [0] [2320/4276] eta: 1:35:53 lr: 4.938893696985271e-05 loss: 0.2435 (0.2701) time: 2.9278 data: 0.0074 max mem: 33297
Epoch: [0] [2330/4276] eta: 1:35:23 lr: 4.938630240142806e-05 loss: 0.2213 (0.2699) time: 2.9185 data: 0.0074 max mem: 33297
Epoch: [0] [2340/4276] eta: 1:34:54 lr: 4.938366781738731e-05 loss: 0.2339 (0.2698) time: 2.9312 data: 0.0071 max mem: 33297
Epoch: [0] [2350/4276] eta: 1:34:24 lr: 4.9381033217729464e-05 loss: 0.2290 (0.2696) time: 2.9449 data: 0.0071 max mem: 33297
Epoch: [0] [2360/4276] eta: 1:33:55 lr: 4.937839860245347e-05 loss: 0.2279 (0.2695) time: 2.9500 data: 0.0070 max mem: 33297
Epoch: [0] [2370/4276] eta: 1:33:26 lr: 4.937576397155832e-05 loss: 0.2337 (0.2694) time: 2.9463 data: 0.0065 max mem: 33297
Epoch: [0] [2380/4276] eta: 1:32:56 lr: 4.9373129325043e-05 loss: 0.2150 (0.2692) time: 2.9455 data: 0.0064 max mem: 33297
Epoch: [0] [2390/4276] eta: 1:32:27 lr: 4.93704946629065e-05 loss: 0.2146 (0.2690) time: 2.9463 data: 0.0063 max mem: 33297
Epoch: [0] [2400/4276] eta: 1:31:57 lr: 4.936785998514778e-05 loss: 0.2194 (0.2690) time: 2.9485 data: 0.0064 max mem: 33297
Epoch: [0] [2410/4276] eta: 1:31:28 lr: 4.936522529176583e-05 loss: 0.2382 (0.2688) time: 2.9463 data: 0.0064 max mem: 33297
Epoch: [0] [2420/4276] eta: 1:30:59 lr: 4.9362590582759643e-05 loss: 0.2317 (0.2686) time: 2.9397 data: 0.0068 max mem: 33297
Epoch: [0] [2430/4276] eta: 1:30:29 lr: 4.935995585812819e-05 loss: 0.2384 (0.2686) time: 2.9406 data: 0.0068 max mem: 33297
Epoch: [0] [2440/4276] eta: 1:30:00 lr: 4.935732111787044e-05 loss: 0.2459 (0.2685) time: 2.9450 data: 0.0067 max mem: 33297
Epoch: [0] [2450/4276] eta: 1:29:31 lr: 4.935468636198539e-05 loss: 0.2321 (0.2683) time: 2.9473 data: 0.0069 max mem: 33298
Epoch: [0] [2460/4276] eta: 1:29:01 lr: 4.935205159047202e-05 loss: 0.2321 (0.2682) time: 2.9499 data: 0.0067 max mem: 33298
Epoch: [0] [2470/4276] eta: 1:28:32 lr: 4.9349416803329296e-05 loss: 0.2545 (0.2682) time: 2.9548 data: 0.0065 max mem: 33298
Epoch: [0] [2480/4276] eta: 1:28:03 lr: 4.934678200055621e-05 loss: 0.2506 (0.2681) time: 2.9532 data: 0.0066 max mem: 33298
Epoch: [0] [2490/4276] eta: 1:27:33 lr: 4.934414718215174e-05 loss: 0.2395 (0.2680) time: 2.9446 data: 0.0069 max mem: 33298
Epoch: [0] [2500/4276] eta: 1:27:04 lr: 4.934151234811486e-05 loss: 0.2601 (0.2680) time: 2.9351 data: 0.0066 max mem: 33298
Epoch: [0] [2510/4276] eta: 1:26:34 lr: 4.933887749844457e-05 loss: 0.2446 (0.2679) time: 2.9384 data: 0.0065 max mem: 33298
Epoch: [0] [2520/4276] eta: 1:26:05 lr: 4.933624263313982e-05 loss: 0.2271 (0.2677) time: 2.9473 data: 0.0069 max mem: 33298
Epoch: [0] [2530/4276] eta: 1:25:35 lr: 4.933360775219961e-05 loss: 0.2068 (0.2675) time: 2.9449 data: 0.0069 max mem: 33298
Epoch: [0] [2540/4276] eta: 1:25:06 lr: 4.933097285562291e-05 loss: 0.2081 (0.2673) time: 2.9444 data: 0.0069 max mem: 33298
Epoch: [0] [2550/4276] eta: 1:24:37 lr: 4.932833794340871e-05 loss: 0.2390 (0.2673) time: 2.9433 data: 0.0068 max mem: 33298
Epoch: [0] [2560/4276] eta: 1:24:07 lr: 4.932570301555597e-05 loss: 0.2150 (0.2671) time: 2.9425 data: 0.0066 max mem: 33298
Epoch: [0] [2570/4276] eta: 1:23:38 lr: 4.932306807206369e-05 loss: 0.2082 (0.2670) time: 2.9431 data: 0.0067 max mem: 33298
Epoch: [0] [2580/4276] eta: 1:23:08 lr: 4.932043311293083e-05 loss: 0.2260 (0.2669) time: 2.9438 data: 0.0068 max mem: 33298
Epoch: [0] [2590/4276] eta: 1:22:39 lr: 4.931779813815639e-05 loss: 0.2401 (0.2668) time: 2.9463 data: 0.0068 max mem: 33298
Epoch: [0] [2600/4276] eta: 1:22:10 lr: 4.931516314773932e-05 loss: 0.2433 (0.2668) time: 2.9511 data: 0.0068 max mem: 33298
Epoch: [0] [2610/4276] eta: 1:21:40 lr: 4.931252814167863e-05 loss: 0.2419 (0.2667) time: 2.9485 data: 0.0068 max mem: 33298
Epoch: [0] [2620/4276] eta: 1:21:11 lr: 4.930989311997328e-05 loss: 0.2261 (0.2666) time: 2.9495 data: 0.0069 max mem: 33298
Epoch: [0] [2630/4276] eta: 1:20:42 lr: 4.930725808262225e-05 loss: 0.2363 (0.2665) time: 2.9521 data: 0.0070 max mem: 33298
Epoch: [0] [2640/4276] eta: 1:20:12 lr: 4.930462302962452e-05 loss: 0.2201 (0.2663) time: 2.9512 data: 0.0068 max mem: 33298
Epoch: [0] [2650/4276] eta: 1:19:43 lr: 4.930198796097906e-05 loss: 0.2201 (0.2662) time: 2.9499 data: 0.0068 max mem: 33298
Epoch: [0] [2660/4276] eta: 1:19:14 lr: 4.929935287668487e-05 loss: 0.2365 (0.2661) time: 2.9474 data: 0.0074 max mem: 33298
Epoch: [0] [2670/4276] eta: 1:18:44 lr: 4.92967177674e-05 loss: 0.2451 (0.2661)
time: 2.9474 data: 0.0073 max mem: 33298 Epoch: [0] [2680/4276] eta: 1:18:15 lr: 4.929408266114614e-05 loss: 0.2362 (0.2660) time: 2.9476 data: 0.0066 max mem: 33298 Epoch: [0] [2690/4276] eta: 1:17:45 lr: 4.929144752989958e-05 loss: 0.2426 (0.2659) time: 2.9466 data: 0.0066 max mem: 33298 Epoch: [0] [2700/4276] eta: 1:17:16 lr: 4.9288812383000185e-05 loss: 0.2272 (0.2657) time: 2.9473 data: 0.0068 max mem: 33298 Epoch: [0] [2710/4276] eta: 1:16:47 lr: 4.9286177220446924e-05 loss: 0.2272 (0.2657) time: 2.9491 data: 0.0069 max mem: 33298 Epoch: [0] [2720/4276] eta: 1:16:17 lr: 4.928354204223878e-05 loss: 0.2600 (0.2656) time: 2.9478 data: 0.0071 max mem: 33298 Epoch: [0] [2730/4276] eta: 1:15:48 lr: 4.9280906848374736e-05 loss: 0.2353 (0.2656) time: 2.9510 data: 0.0073 max mem: 33298 Epoch: [0] [2740/4276] eta: 1:15:18 lr: 4.927827163885377e-05 loss: 0.2353 (0.2655) time: 2.9525 data: 0.0074 max mem: 33298 Epoch: [0] [2750/4276] eta: 1:14:49 lr: 4.927563641367485e-05 loss: 0.2450 (0.2655) time: 2.9546 data: 0.0072 max mem: 33298 Epoch: [0] [2760/4276] eta: 1:14:20 lr: 4.9273001172836954e-05 loss: 0.2397 (0.2654) time: 2.9384 data: 0.0071 max mem: 33298 Epoch: [0] [2770/4276] eta: 1:13:50 lr: 4.927036591633906e-05 loss: 0.2265 (0.2653) time: 2.9349 data: 0.0073 max mem: 33298 Epoch: [0] [2780/4276] eta: 1:13:21 lr: 4.9267730644180155e-05 loss: 0.2332 (0.2651) time: 2.9507 data: 0.0071 max mem: 33298 Epoch: [0] [2790/4276] eta: 1:12:51 lr: 4.926509535635919e-05 loss: 0.2502 (0.2651) time: 2.9512 data: 0.0067 max mem: 33298 Epoch: [0] [2800/4276] eta: 1:12:22 lr: 4.926246005287517e-05 loss: 0.2447 (0.2650) time: 2.9496 data: 0.0064 max mem: 33298 Epoch: [0] [2810/4276] eta: 1:11:53 lr: 4.925982473372705e-05 loss: 0.2210 (0.2649) time: 2.9468 data: 0.0066 max mem: 33298 Epoch: [0] [2820/4276] eta: 1:11:23 lr: 4.9257189398913815e-05 loss: 0.2337 (0.2648) time: 2.9470 data: 0.0068 max mem: 33298 Epoch: [0] [2830/4276] eta: 1:10:54 lr: 4.925455404843444e-05 loss: 0.2419 
(0.2647) time: 2.9576 data: 0.0066 max mem: 33298 Epoch: [0] [2840/4276] eta: 1:10:25 lr: 4.9251918682287896e-05 loss: 0.2445 (0.2646) time: 2.9816 data: 0.0064 max mem: 33298 Epoch: [0] [2850/4276] eta: 1:09:56 lr: 4.924928330047316e-05 loss: 0.2465 (0.2647) time: 3.0116 data: 0.0063 max mem: 33298 Epoch: [0] [2860/4276] eta: 1:09:27 lr: 4.924664790298922e-05 loss: 0.2449 (0.2645) time: 3.0314 data: 0.0065 max mem: 33298 Epoch: [0] [2870/4276] eta: 1:08:58 lr: 4.9244012489835024e-05 loss: 0.2395 (0.2645) time: 3.0195 data: 0.0063 max mem: 33298 Epoch: [0] [2880/4276] eta: 1:08:29 lr: 4.924137706100957e-05 loss: 0.2502 (0.2645) time: 3.0164 data: 0.0061 max mem: 33298 Epoch: [0] [2890/4276] eta: 1:08:00 lr: 4.9238741616511825e-05 loss: 0.2426 (0.2644) time: 3.0340 data: 0.0061 max mem: 33298 Epoch: [0] [2900/4276] eta: 1:07:31 lr: 4.9236106156340765e-05 loss: 0.2235 (0.2642) time: 3.0324 data: 0.0059 max mem: 33298 Epoch: [0] [2910/4276] eta: 1:07:02 lr: 4.923347068049537e-05 loss: 0.2225 (0.2642) time: 3.0300 data: 0.0058 max mem: 33298 Epoch: [0] [2920/4276] eta: 1:06:33 lr: 4.92308351889746e-05 loss: 0.2284 (0.2641) time: 3.0199 data: 0.0057 max mem: 33298 Epoch: [0] [2930/4276] eta: 1:06:03 lr: 4.922819968177744e-05 loss: 0.2302 (0.2641) time: 2.9985 data: 0.0057 max mem: 33298 Epoch: [0] [2940/4276] eta: 1:05:34 lr: 4.9225564158902866e-05 loss: 0.2159 (0.2639) time: 3.0034 data: 0.0056 max mem: 33298 Epoch: [0] [2950/4276] eta: 1:05:05 lr: 4.9222928620349843e-05 loss: 0.2117 (0.2638) time: 3.0126 data: 0.0056 max mem: 33298 Epoch: [0] [2960/4276] eta: 1:04:36 lr: 4.9220293066117353e-05 loss: 0.2387 (0.2638) time: 2.9976 data: 0.0056 max mem: 33298 Epoch: [0] [2970/4276] eta: 1:04:07 lr: 4.9217657496204366e-05 loss: 0.2684 (0.2639) time: 3.0129 data: 0.0057 max mem: 33298 Epoch: [0] [2980/4276] eta: 1:03:38 lr: 4.921502191060986e-05 loss: 0.2533 (0.2638) time: 3.0288 data: 0.0057 max mem: 33298 Epoch: [0] [2990/4276] eta: 1:03:08 lr: 4.9212386309332805e-05 
loss: 0.2242 (0.2636) time: 3.0313 data: 0.0058 max mem: 33298 Epoch: [0] [3000/4276] eta: 1:02:39 lr: 4.9209750692372166e-05 loss: 0.2222 (0.2635) time: 3.0350 data: 0.0060 max mem: 33298 Epoch: [0] [3010/4276] eta: 1:02:10 lr: 4.9207115059726935e-05 loss: 0.2313 (0.2635) time: 3.0344 data: 0.0060 max mem: 33298 Epoch: [0] [3020/4276] eta: 1:01:41 lr: 4.920447941139607e-05 loss: 0.2382 (0.2634) time: 3.0351 data: 0.0057 max mem: 33298 Epoch: [0] [3030/4276] eta: 1:01:12 lr: 4.920184374737855e-05 loss: 0.2392 (0.2634) time: 3.0172 data: 0.0055 max mem: 33298 Epoch: [0] [3040/4276] eta: 1:00:43 lr: 4.9199208067673355e-05 loss: 0.2568 (0.2634) time: 3.0004 data: 0.0057 max mem: 33298 Epoch: [0] [3050/4276] eta: 1:00:13 lr: 4.919657237227944e-05 loss: 0.2512 (0.2633) time: 2.9978 data: 0.0058 max mem: 33298 Epoch: [0] [3060/4276] eta: 0:59:44 lr: 4.919393666119579e-05 loss: 0.2274 (0.2632) time: 2.9989 data: 0.0056 max mem: 33298 Epoch: [0] [3070/4276] eta: 0:59:15 lr: 4.919130093442138e-05 loss: 0.2410 (0.2632) time: 3.0064 data: 0.0056 max mem: 33298 Epoch: [0] [3080/4276] eta: 0:58:46 lr: 4.918866519195517e-05 loss: 0.2308 (0.2631) time: 3.0260 data: 0.0060 max mem: 33298 Epoch: [0] [3090/4276] eta: 0:58:17 lr: 4.918602943379615e-05 loss: 0.2254 (0.2631) time: 3.0416 data: 0.0064 max mem: 33298 Epoch: [0] [3100/4276] eta: 0:57:48 lr: 4.9183393659943286e-05 loss: 0.2379 (0.2630) time: 3.0427 data: 0.0063 max mem: 33298 Epoch: [0] [3110/4276] eta: 0:57:18 lr: 4.918075787039553e-05 loss: 0.2287 (0.2629) time: 3.0507 data: 0.0062 max mem: 33298 Epoch: [0] [3120/4276] eta: 0:56:49 lr: 4.917812206515188e-05 loss: 0.2135 (0.2628) time: 3.0571 data: 0.0066 max mem: 33298 Epoch: [0] [3130/4276] eta: 0:56:20 lr: 4.917548624421131e-05 loss: 0.2276 (0.2627) time: 3.0555 data: 0.0067 max mem: 33298 Epoch: [0] [3140/4276] eta: 0:55:51 lr: 4.917285040757276e-05 loss: 0.2317 (0.2627) time: 3.0732 data: 0.0070 max mem: 33298 Epoch: [0] [3150/4276] eta: 0:55:22 lr: 
4.917021455523523e-05 loss: 0.2398 (0.2627) time: 3.0841 data: 0.0077 max mem: 33298 Epoch: [0] [3160/4276] eta: 0:54:53 lr: 4.9167578687197674e-05 loss: 0.2362 (0.2626) time: 3.0868 data: 0.0079 max mem: 33298 Epoch: [0] [3170/4276] eta: 0:54:24 lr: 4.916494280345909e-05 loss: 0.2238 (0.2626) time: 3.0978 data: 0.0080 max mem: 33298 Epoch: [0] [3180/4276] eta: 0:53:55 lr: 4.9162306904018415e-05 loss: 0.2415 (0.2626) time: 3.1106 data: 0.0078 max mem: 33298 Epoch: [0] [3190/4276] eta: 0:53:26 lr: 4.915967098887464e-05 loss: 0.2526 (0.2625) time: 3.1149 data: 0.0074 max mem: 33298 Epoch: [0] [3200/4276] eta: 0:52:57 lr: 4.915703505802674e-05 loss: 0.2253 (0.2624) time: 3.1197 data: 0.0072 max mem: 33298 Epoch: [0] [3210/4276] eta: 0:52:28 lr: 4.915439911147367e-05 loss: 0.2203 (0.2624) time: 3.1214 data: 0.0073 max mem: 33298 Epoch: [0] [3220/4276] eta: 0:51:59 lr: 4.9151763149214406e-05 loss: 0.2369 (0.2623) time: 3.1219 data: 0.0076 max mem: 33298 Epoch: [0] [3230/4276] eta: 0:51:30 lr: 4.9149127171247925e-05 loss: 0.2399 (0.2623) time: 3.1116 data: 0.0074 max mem: 33298 Epoch: [0] [3240/4276] eta: 0:51:01 lr: 4.9146491177573196e-05 loss: 0.2399 (0.2622) time: 3.0886 data: 0.0073 max mem: 33298 Epoch: [0] [3250/4276] eta: 0:50:32 lr: 4.9143855168189185e-05 loss: 0.2350 (0.2621) time: 3.0975 data: 0.0076 max mem: 33298 Epoch: [0] [3260/4276] eta: 0:50:03 lr: 4.914121914309486e-05 loss: 0.2415 (0.2621) time: 3.1270 data: 0.0080 max mem: 33298 Epoch: [0] [3270/4276] eta: 0:49:34 lr: 4.91385831022892e-05 loss: 0.2512 (0.2620) time: 3.1189 data: 0.0082 max mem: 33298 Epoch: [0] [3280/4276] eta: 0:49:05 lr: 4.913594704577117e-05 loss: 0.2512 (0.2620) time: 3.0982 data: 0.0074 max mem: 33298 Epoch: [0] [3290/4276] eta: 0:48:36 lr: 4.913331097353974e-05 loss: 0.2580 (0.2620) time: 3.1118 data: 0.0077 max mem: 33298 Epoch: [0] [3300/4276] eta: 0:48:07 lr: 4.9130674885593874e-05 loss: 0.2587 (0.2620) time: 3.1347 data: 0.0084 max mem: 33298 Epoch: [0] [3310/4276] eta: 
0:47:38 lr: 4.912803878193255e-05 loss: 0.2587 (0.2620) time: 3.1400 data: 0.0076 max mem: 33298 Epoch: [0] [3320/4276] eta: 0:47:09 lr: 4.912540266255473e-05 loss: 0.2582 (0.2620) time: 3.1323 data: 0.0073 max mem: 33298 Epoch: [0] [3330/4276] eta: 0:46:39 lr: 4.9122766527459394e-05 loss: 0.2351 (0.2619) time: 3.1213 data: 0.0077 max mem: 33298 Epoch: [0] [3340/4276] eta: 0:46:10 lr: 4.91201303766455e-05 loss: 0.2351 (0.2619) time: 3.1179 data: 0.0076 max mem: 33298 Epoch: [0] [3350/4276] eta: 0:45:41 lr: 4.9117494210112014e-05 loss: 0.2306 (0.2617) time: 3.1230 data: 0.0073 max mem: 33298 Epoch: [0] [3360/4276] eta: 0:45:12 lr: 4.911485802785792e-05 loss: 0.2248 (0.2617) time: 3.1222 data: 0.0078 max mem: 33298 Epoch: [0] [3370/4276] eta: 0:44:43 lr: 4.9112221829882175e-05 loss: 0.2576 (0.2617) time: 3.1281 data: 0.0080 max mem: 33298 Epoch: [0] [3380/4276] eta: 0:44:13 lr: 4.9109585616183754e-05 loss: 0.2396 (0.2617) time: 3.1007 data: 0.0077 max mem: 33298 Epoch: [0] [3390/4276] eta: 0:43:44 lr: 4.9106949386761617e-05 loss: 0.2502 (0.2617) time: 3.0963 data: 0.0077 max mem: 33298 Epoch: [0] [3400/4276] eta: 0:43:15 lr: 4.9104313141614746e-05 loss: 0.2502 (0.2616) time: 3.1380 data: 0.0077 max mem: 33298 Epoch: [0] [3410/4276] eta: 0:42:46 lr: 4.9101676880742106e-05 loss: 0.2383 (0.2616) time: 3.1380 data: 0.0078 max mem: 33298 Epoch: [0] [3420/4276] eta: 0:42:17 lr: 4.9099040604142646e-05 loss: 0.2438 (0.2615) time: 3.1266 data: 0.0085 max mem: 33298 Epoch: [0] [3430/4276] eta: 0:41:47 lr: 4.909640431181535e-05 loss: 0.2447 (0.2615) time: 3.1314 data: 0.0086 max mem: 33298 Epoch: [0] [3440/4276] eta: 0:41:18 lr: 4.909376800375919e-05 loss: 0.2445 (0.2615) time: 3.1310 data: 0.0080 max mem: 33298 Epoch: [0] [3450/4276] eta: 0:40:49 lr: 4.909113167997313e-05 loss: 0.2413 (0.2614) time: 3.1238 data: 0.0076 max mem: 33298 Epoch: [0] [3460/4276] eta: 0:40:20 lr: 4.908849534045613e-05 loss: 0.2413 (0.2613) time: 3.1216 data: 0.0074 max mem: 33298 Epoch: [0] 
[3470/4276] eta: 0:39:50 lr: 4.9085858985207164e-05 loss: 0.2130 (0.2612) time: 3.1302 data: 0.0073 max mem: 33298 Epoch: [0] [3480/4276] eta: 0:39:21 lr: 4.9083222614225194e-05 loss: 0.2436 (0.2612) time: 3.1322 data: 0.0074 max mem: 33298 Epoch: [0] [3490/4276] eta: 0:38:52 lr: 4.9080586227509195e-05 loss: 0.2456 (0.2611) time: 3.1293 data: 0.0074 max mem: 33298 Epoch: [0] [3500/4276] eta: 0:38:22 lr: 4.907794982505813e-05 loss: 0.2470 (0.2611) time: 3.1238 data: 0.0074 max mem: 33298 Epoch: [0] [3510/4276] eta: 0:37:53 lr: 4.907531340687096e-05 loss: 0.2230 (0.2610) time: 3.1078 data: 0.0076 max mem: 33298 Epoch: [0] [3520/4276] eta: 0:37:24 lr: 4.907267697294666e-05 loss: 0.2329 (0.2610) time: 3.1026 data: 0.0081 max mem: 33298 Epoch: [0] [3530/4276] eta: 0:36:54 lr: 4.90700405232842e-05 loss: 0.2351 (0.2609) time: 3.1049 data: 0.0081 max mem: 33298 Epoch: [0] [3540/4276] eta: 0:36:25 lr: 4.906740405788254e-05 loss: 0.2297 (0.2609) time: 3.1262 data: 0.0078 max mem: 33298 Epoch: [0] [3550/4276] eta: 0:35:56 lr: 4.906476757674064e-05 loss: 0.2418 (0.2608) time: 3.1607 data: 0.0078 max mem: 33298 Epoch: [0] [3560/4276] eta: 0:35:26 lr: 4.9062131079857484e-05 loss: 0.2401 (0.2608) time: 3.1680 data: 0.0082 max mem: 33298 Epoch: [0] [3570/4276] eta: 0:34:57 lr: 4.9059494567232015e-05 loss: 0.2352 (0.2607) time: 3.1532 data: 0.0082 max mem: 33298 Epoch: [0] [3580/4276] eta: 0:34:28 lr: 4.905685803886322e-05 loss: 0.2309 (0.2606) time: 3.1604 data: 0.0079 max mem: 33298 Epoch: [0] [3590/4276] eta: 0:33:58 lr: 4.905422149475005e-05 loss: 0.2223 (0.2606) time: 3.1749 data: 0.0080 max mem: 33298 Epoch: [0] [3600/4276] eta: 0:33:29 lr: 4.905158493489148e-05 loss: 0.2400 (0.2605) time: 3.1669 data: 0.0078 max mem: 33298 Epoch: [0] [3610/4276] eta: 0:33:00 lr: 4.9048948359286475e-05 loss: 0.2559 (0.2605) time: 3.1473 data: 0.0073 max mem: 33298 Epoch: [0] [3620/4276] eta: 0:32:30 lr: 4.9046311767934e-05 loss: 0.2408 (0.2604) time: 3.1339 data: 0.0072 max mem: 33298 Epoch: 
[0] [3630/4276] eta: 0:32:01 lr: 4.9043675160833014e-05 loss: 0.2394 (0.2605) time: 3.1357 data: 0.0072 max mem: 33298 Epoch: [0] [3640/4276] eta: 0:31:31 lr: 4.9041038537982484e-05 loss: 0.2400 (0.2604) time: 3.1280 data: 0.0071 max mem: 33298 Epoch: [0] [3650/4276] eta: 0:31:02 lr: 4.903840189938138e-05 loss: 0.2297 (0.2604) time: 3.1177 data: 0.0068 max mem: 33298 Epoch: [0] [3660/4276] eta: 0:30:32 lr: 4.903576524502866e-05 loss: 0.2278 (0.2603) time: 3.1081 data: 0.0064 max mem: 33298 Epoch: [0] [3670/4276] eta: 0:30:03 lr: 4.90331285749233e-05 loss: 0.2287 (0.2603) time: 3.1094 data: 0.0072 max mem: 33298 Epoch: [0] [3680/4276] eta: 0:29:33 lr: 4.903049188906426e-05 loss: 0.2307 (0.2603) time: 3.1226 data: 0.0076 max mem: 33298 Epoch: [0] [3690/4276] eta: 0:29:04 lr: 4.90278551874505e-05 loss: 0.2311 (0.2602) time: 3.1274 data: 0.0074 max mem: 33298 Epoch: [0] [3700/4276] eta: 0:28:34 lr: 4.902521847008099e-05 loss: 0.2425 (0.2602) time: 3.1280 data: 0.0077 max mem: 33298 Epoch: [0] [3710/4276] eta: 0:28:05 lr: 4.902258173695469e-05 loss: 0.2339 (0.2601) time: 3.1241 data: 0.0077 max mem: 33298 Epoch: [0] [3720/4276] eta: 0:27:35 lr: 4.901994498807056e-05 loss: 0.2233 (0.2600) time: 3.1286 data: 0.0079 max mem: 33298 Epoch: [0] [3730/4276] eta: 0:27:05 lr: 4.901730822342757e-05 loss: 0.2428 (0.2600) time: 3.1201 data: 0.0081 max mem: 33298 Epoch: [0] [3740/4276] eta: 0:26:36 lr: 4.9014671443024683e-05 loss: 0.2458 (0.2600) time: 3.1243 data: 0.0078 max mem: 33298 Epoch: [0] [3750/4276] eta: 0:26:06 lr: 4.901203464686087e-05 loss: 0.2505 (0.2600) time: 3.1503 data: 0.0072 max mem: 33298 Epoch: [0] [3760/4276] eta: 0:25:37 lr: 4.900939783493509e-05 loss: 0.2442 (0.2599) time: 3.1379 data: 0.0077 max mem: 33298 Epoch: [0] [3770/4276] eta: 0:25:07 lr: 4.900676100724629e-05 loss: 0.2442 (0.2599) time: 3.0977 data: 0.0077 max mem: 33298 Epoch: [0] [3780/4276] eta: 0:24:37 lr: 4.9004124163793464e-05 loss: 0.2349 (0.2598) time: 3.0866 data: 0.0071 max mem: 33298 
Epoch: [0] [3790/4276] eta: 0:24:08 lr: 4.900148730457556e-05 loss: 0.2258 (0.2597) time: 3.0993 data: 0.0074 max mem: 33298 Epoch: [0] [3800/4276] eta: 0:23:38 lr: 4.899885042959152e-05 loss: 0.2301 (0.2598) time: 3.0942 data: 0.0074 max mem: 33298 Epoch: [0] [3810/4276] eta: 0:23:08 lr: 4.899621353884034e-05 loss: 0.2285 (0.2597) time: 3.0895 data: 0.0072 max mem: 33298 Epoch: [0] [3820/4276] eta: 0:22:39 lr: 4.899357663232097e-05 loss: 0.2202 (0.2596) time: 3.1084 data: 0.0076 max mem: 33298 Epoch: [0] [3830/4276] eta: 0:22:09 lr: 4.899093971003238e-05 loss: 0.2269 (0.2596) time: 3.1035 data: 0.0075 max mem: 33298 Epoch: [0] [3840/4276] eta: 0:21:39 lr: 4.8988302771973514e-05 loss: 0.2316 (0.2595) time: 3.0806 data: 0.0069 max mem: 33298 Epoch: [0] [3850/4276] eta: 0:21:10 lr: 4.898566581814335e-05 loss: 0.2278 (0.2594) time: 3.1072 data: 0.0076 max mem: 33298 Epoch: [0] [3860/4276] eta: 0:20:40 lr: 4.898302884854084e-05 loss: 0.2199 (0.2593) time: 3.1248 data: 0.0082 max mem: 33298 Epoch: [0] [3870/4276] eta: 0:20:10 lr: 4.8980391863164966e-05 loss: 0.2401 (0.2593) time: 3.1143 data: 0.0077 max mem: 33298 Epoch: [0] [3880/4276] eta: 0:19:41 lr: 4.897775486201467e-05 loss: 0.2383 (0.2592) time: 3.1318 data: 0.0080 max mem: 33298 Epoch: [0] [3890/4276] eta: 0:19:11 lr: 4.8975117845088916e-05 loss: 0.2321 (0.2592) time: 3.1473 data: 0.0085 max mem: 33298 Epoch: [0] [3900/4276] eta: 0:18:41 lr: 4.8972480812386675e-05 loss: 0.2560 (0.2592) time: 3.1301 data: 0.0083 max mem: 33298 Epoch: [0] [3910/4276] eta: 0:18:12 lr: 4.89698437639069e-05 loss: 0.2209 (0.2591) time: 3.1204 data: 0.0081 max mem: 33298 Epoch: [0] [3920/4276] eta: 0:17:42 lr: 4.896720669964856e-05 loss: 0.2188 (0.2590) time: 3.0960 data: 0.0079 max mem: 33298 Epoch: [0] [3930/4276] eta: 0:17:12 lr: 4.896456961961061e-05 loss: 0.2274 (0.2590) time: 3.0935 data: 0.0077 max mem: 33298 Epoch: [0] [3940/4276] eta: 0:16:43 lr: 4.896193252379202e-05 loss: 0.2374 (0.2589) time: 3.1470 data: 0.0081 max mem: 
33298 Epoch: [0] [3950/4276] eta: 0:16:13 lr: 4.895929541219174e-05 loss: 0.2374 (0.2588) time: 3.1737 data: 0.0085 max mem: 33298 Epoch: [0] [3960/4276] eta: 0:15:43 lr: 4.895665828480874e-05 loss: 0.2495 (0.2588) time: 3.1759 data: 0.0083 max mem: 33298 Epoch: [0] [3970/4276] eta: 0:15:13 lr: 4.895402114164197e-05 loss: 0.2472 (0.2588) time: 3.1734 data: 0.0080 max mem: 33298 Epoch: [0] [3980/4276] eta: 0:14:44 lr: 4.895138398269041e-05 loss: 0.2346 (0.2588) time: 3.1671 data: 0.0081 max mem: 33298 Epoch: [0] [3990/4276] eta: 0:14:14 lr: 4.8948746807953e-05 loss: 0.2253 (0.2587) time: 3.1603 data: 0.0078 max mem: 33298 Epoch: [0] [4000/4276] eta: 0:13:44 lr: 4.894610961742871e-05 loss: 0.2223 (0.2587) time: 3.1495 data: 0.0075 max mem: 33298 Epoch: [0] [4010/4276] eta: 0:13:14 lr: 4.89434724111165e-05 loss: 0.2366 (0.2587) time: 3.1359 data: 0.0078 max mem: 33298 Epoch: [0] [4020/4276] eta: 0:12:45 lr: 4.8940835189015334e-05 loss: 0.2346 (0.2586) time: 3.1331 data: 0.0076 max mem: 33298 Epoch: [0] [4030/4276] eta: 0:12:15 lr: 4.8938197951124166e-05 loss: 0.2346 (0.2585) time: 3.1505 data: 0.0073 max mem: 33298 Epoch: [0] [4040/4276] eta: 0:11:45 lr: 4.893556069744196e-05 loss: 0.2396 (0.2585) time: 3.1675 data: 0.0078 max mem: 33298 Epoch: [0] [4050/4276] eta: 0:11:15 lr: 4.893292342796766e-05 loss: 0.2216 (0.2584) time: 3.1669 data: 0.0083 max mem: 33298 Epoch: [0] [4060/4276] eta: 0:10:45 lr: 4.893028614270026e-05 loss: 0.2174 (0.2584) time: 3.1742 data: 0.0088 max mem: 33298 Epoch: [0] [4070/4276] eta: 0:10:16 lr: 4.892764884163869e-05 loss: 0.2208 (0.2583) time: 3.1772 data: 0.0086 max mem: 33298 Epoch: [0] [4080/4276] eta: 0:09:46 lr: 4.892501152478192e-05 loss: 0.2159 (0.2583) time: 3.1677 data: 0.0084 max mem: 33298 Epoch: [0] [4090/4276] eta: 0:09:16 lr: 4.89223741921289e-05 loss: 0.2239 (0.2582) time: 3.1687 data: 0.0086 max mem: 33298 Epoch: [0] [4100/4276] eta: 0:08:46 lr: 4.891973684367861e-05 loss: 0.2435 (0.2582) time: 3.1512 data: 0.0083 max mem: 
33298 Epoch: [0] [4110/4276] eta: 0:08:16 lr: 4.891709947942999e-05 loss: 0.2456 (0.2582) time: 3.1417 data: 0.0088 max mem: 33298 Epoch: [0] [4120/4276] eta: 0:07:46 lr: 4.8914462099382e-05 loss: 0.2332 (0.2581) time: 3.1582 data: 0.0091 max mem: 33298 Epoch: [0] [4130/4276] eta: 0:07:17 lr: 4.891182470353361e-05 loss: 0.2264 (0.2581) time: 3.1784 data: 0.0091 max mem: 33298 Epoch: [0] [4140/4276] eta: 0:06:47 lr: 4.890918729188378e-05 loss: 0.2296 (0.2580) time: 3.1867 data: 0.0096 max mem: 33298 Epoch: [0] [4150/4276] eta: 0:06:17 lr: 4.8906549864431455e-05 loss: 0.2296 (0.2580) time: 3.1799 data: 0.0092 max mem: 33298 Epoch: [0] [4160/4276] eta: 0:05:47 lr: 4.89039124211756e-05 loss: 0.2374 (0.2580) time: 3.1747 data: 0.0082 max mem: 33298 Epoch: [0] [4170/4276] eta: 0:05:17 lr: 4.890127496211517e-05 loss: 0.2529 (0.2580) time: 3.1383 data: 0.0075 max mem: 33298 Epoch: [0] [4180/4276] eta: 0:04:47 lr: 4.8898637487249124e-05 loss: 0.2503 (0.2579) time: 3.1377 data: 0.0075 max mem: 33298 Epoch: [0] [4190/4276] eta: 0:04:17 lr: 4.889599999657643e-05 loss: 0.2334 (0.2579) time: 3.1541 data: 0.0085 max mem: 33298 Epoch: [0] [4200/4276] eta: 0:03:47 lr: 4.889336249009603e-05 loss: 0.2334 (0.2579) time: 3.1428 data: 0.0085 max mem: 33298 Epoch: [0] [4210/4276] eta: 0:03:17 lr: 4.88907249678069e-05 loss: 0.2422 (0.2579) time: 3.1394 data: 0.0085 max mem: 33298 Epoch: [0] [4220/4276] eta: 0:02:47 lr: 4.888808742970799e-05 loss: 0.2539 (0.2579) time: 3.1295 data: 0.0086 max mem: 33298 Epoch: [0] [4230/4276] eta: 0:02:17 lr: 4.888544987579824e-05 loss: 0.2704 (0.2580) time: 3.1302 data: 0.0083 max mem: 33298 Epoch: [0] [4240/4276] eta: 0:01:47 lr: 4.888281230607663e-05 loss: 0.2662 (0.2580) time: 3.1380 data: 0.0079 max mem: 33298 Epoch: [0] [4250/4276] eta: 0:01:17 lr: 4.888017472054211e-05 loss: 0.2455 (0.2579) time: 3.1191 data: 0.0078 max mem: 33298 Epoch: [0] [4260/4276] eta: 0:00:47 lr: 4.887753711919363e-05 loss: 0.2455 (0.2579) time: 3.1022 data: 0.0079 max mem: 
33298 Epoch: [0] [4270/4276] eta: 0:00:17 lr: 4.8874899502030166e-05 loss: 0.2345 (0.2579) time: 3.1047 data: 0.0072 max mem: 33298 Epoch: [0] Total time: 3:33:41 Test: [ 0/21770] eta: 9:13:33 time: 1.5256 data: 1.3477 max mem: 33298 Test: [ 100/21770] eta: 0:19:29 time: 0.0390 data: 0.0012 max mem: 33298 Test: [ 200/21770] eta: 0:16:43 time: 0.0389 data: 0.0012 max mem: 33298 Test: [ 300/21770] eta: 0:15:44 time: 0.0390 data: 0.0011 max mem: 33298 Test: [ 400/21770] eta: 0:15:13 time: 0.0390 data: 0.0012 max mem: 33298 Test: [ 500/21770] eta: 0:14:54 time: 0.0395 data: 0.0012 max mem: 33298 Test: [ 600/21770] eta: 0:14:40 time: 0.0392 data: 0.0012 max mem: 33298 Test: [ 700/21770] eta: 0:14:28 time: 0.0393 data: 0.0012 max mem: 33298 Test: [ 800/21770] eta: 0:14:19 time: 0.0388 data: 0.0012 max mem: 33298 Test: [ 900/21770] eta: 0:14:10 time: 0.0389 data: 0.0012 max mem: 33298 Test: [ 1000/21770] eta: 0:14:02 time: 0.0389 data: 0.0012 max mem: 33298 Test: [ 1100/21770] eta: 0:13:55 time: 0.0395 data: 0.0012 max mem: 33298 Test: [ 1200/21770] eta: 0:13:49 time: 0.0387 data: 0.0012 max mem: 33298 Test: [ 1300/21770] eta: 0:13:42 time: 0.0387 data: 0.0012 max mem: 33298 Test: [ 1400/21770] eta: 0:13:36 time: 0.0388 data: 0.0012 max mem: 33298 Test: [ 1500/21770] eta: 0:13:31 time: 0.0386 data: 0.0012 max mem: 33298 Test: [ 1600/21770] eta: 0:13:25 time: 0.0385 data: 0.0012 max mem: 33298 Test: [ 1700/21770] eta: 0:13:19 time: 0.0385 data: 0.0011 max mem: 33298 Test: [ 1800/21770] eta: 0:13:14 time: 0.0389 data: 0.0012 max mem: 33298 Test: [ 1900/21770] eta: 0:13:10 time: 0.0390 data: 0.0013 max mem: 33298 Test: [ 2000/21770] eta: 0:13:05 time: 0.0392 data: 0.0012 max mem: 33298 Test: [ 2100/21770] eta: 0:13:00 time: 0.0389 data: 0.0011 max mem: 33298 Test: [ 2200/21770] eta: 0:12:56 time: 0.0390 data: 0.0012 max mem: 33298 Test: [ 2300/21770] eta: 0:12:51 time: 0.0389 data: 0.0012 max mem: 33298 Test: [ 2400/21770] eta: 0:12:47 time: 0.0392 data: 0.0012 max mem: 
33298 Test: [ 2500/21770] eta: 0:12:43 time: 0.0391 data: 0.0012 max mem: 33298 Test: [ 2600/21770] eta: 0:12:38 time: 0.0392 data: 0.0012 max mem: 33298 Test: [ 2700/21770] eta: 0:12:34 time: 0.0388 data: 0.0012 max mem: 33298 Test: [ 2800/21770] eta: 0:12:30 time: 0.0386 data: 0.0011 max mem: 33298 Test: [ 2900/21770] eta: 0:12:25 time: 0.0387 data: 0.0011 max mem: 33298 Test: [ 3000/21770] eta: 0:12:21 time: 0.0390 data: 0.0012 max mem: 33298 Test: [ 3100/21770] eta: 0:12:17 time: 0.0388 data: 0.0012 max mem: 33298 Test: [ 3200/21770] eta: 0:12:13 time: 0.0390 data: 0.0012 max mem: 33298 Test: [ 3300/21770] eta: 0:12:08 time: 0.0392 data: 0.0012 max mem: 33298 Test: [ 3400/21770] eta: 0:12:04 time: 0.0396 data: 0.0012 max mem: 33298 Test: [ 3500/21770] eta: 0:12:01 time: 0.0403 data: 0.0012 max mem: 33298 Test: [ 3600/21770] eta: 0:11:57 time: 0.0400 data: 0.0012 max mem: 33298 Test: [ 3700/21770] eta: 0:11:53 time: 0.0398 data: 0.0012 max mem: 33298 Test: [ 3800/21770] eta: 0:11:49 time: 0.0398 data: 0.0012 max mem: 33298 Test: [ 3900/21770] eta: 0:11:46 time: 0.0394 data: 0.0011 max mem: 33298 Test: [ 4000/21770] eta: 0:11:42 time: 0.0400 data: 0.0012 max mem: 33298 Test: [ 4100/21770] eta: 0:11:38 time: 0.0399 data: 0.0012 max mem: 33298 Test: [ 4200/21770] eta: 0:11:34 time: 0.0399 data: 0.0012 max mem: 33298 Test: [ 4300/21770] eta: 0:11:30 time: 0.0392 data: 0.0012 max mem: 33298 Test: [ 4400/21770] eta: 0:11:26 time: 0.0391 data: 0.0013 max mem: 33298 Test: [ 4500/21770] eta: 0:11:22 time: 0.0391 data: 0.0013 max mem: 33298 Test: [ 4600/21770] eta: 0:11:18 time: 0.0392 data: 0.0013 max mem: 33298 Test: [ 4700/21770] eta: 0:11:14 time: 0.0389 data: 0.0012 max mem: 33298 Test: [ 4800/21770] eta: 0:11:10 time: 0.0389 data: 0.0012 max mem: 33298 Test: [ 4900/21770] eta: 0:11:06 time: 0.0391 data: 0.0013 max mem: 33298 Test: [ 5000/21770] eta: 0:11:02 time: 0.0398 data: 0.0012 max mem: 33298 Test: [ 5100/21770] eta: 0:10:58 time: 0.0391 data: 0.0012 max mem: 
33298 Test: [ 5200/21770] eta: 0:10:54 time: 0.0393 data: 0.0012 max mem: 33298 Test: [ 5300/21770] eta: 0:10:50 time: 0.0399 data: 0.0012 max mem: 33298 Test: [ 5400/21770] eta: 0:10:46 time: 0.0400 data: 0.0012 max mem: 33298 Test: [ 5500/21770] eta: 0:10:42 time: 0.0400 data: 0.0013 max mem: 33298 Test: [ 5600/21770] eta: 0:10:38 time: 0.0400 data: 0.0012 max mem: 33298 Test: [ 5700/21770] eta: 0:10:34 time: 0.0399 data: 0.0012 max mem: 33298 Test: [ 5800/21770] eta: 0:10:31 time: 0.0401 data: 0.0012 max mem: 33298 Test: [ 5900/21770] eta: 0:10:27 time: 0.0396 data: 0.0012 max mem: 33298 Test: [ 6000/21770] eta: 0:10:23 time: 0.0394 data: 0.0012 max mem: 33298 Test: [ 6100/21770] eta: 0:10:19 time: 0.0397 data: 0.0012 max mem: 33298 Test: [ 6200/21770] eta: 0:10:15 time: 0.0391 data: 0.0012 max mem: 33298 Test: [ 6300/21770] eta: 0:10:11 time: 0.0393 data: 0.0012 max mem: 33298 Test: [ 6400/21770] eta: 0:10:07 time: 0.0392 data: 0.0012 max mem: 33298 Test: [ 6500/21770] eta: 0:10:03 time: 0.0393 data: 0.0011 max mem: 33298 Test: [ 6600/21770] eta: 0:09:59 time: 0.0392 data: 0.0012 max mem: 33298 Test: [ 6700/21770] eta: 0:09:55 time: 0.0400 data: 0.0012 max mem: 33298 Test: [ 6800/21770] eta: 0:09:51 time: 0.0400 data: 0.0012 max mem: 33298 Test: [ 6900/21770] eta: 0:09:47 time: 0.0396 data: 0.0012 max mem: 33298 Test: [ 7000/21770] eta: 0:09:43 time: 0.0395 data: 0.0012 max mem: 33298 Test: [ 7100/21770] eta: 0:09:39 time: 0.0392 data: 0.0012 max mem: 33298 Test: [ 7200/21770] eta: 0:09:35 time: 0.0393 data: 0.0012 max mem: 33298 Test: [ 7300/21770] eta: 0:09:31 time: 0.0395 data: 0.0012 max mem: 33298 Test: [ 7400/21770] eta: 0:09:27 time: 0.0402 data: 0.0013 max mem: 33298 Test: [ 7500/21770] eta: 0:09:23 time: 0.0390 data: 0.0012 max mem: 33298 Test: [ 7600/21770] eta: 0:09:19 time: 0.0390 data: 0.0012 max mem: 33298 Test: [ 7700/21770] eta: 0:09:15 time: 0.0389 data: 0.0012 max mem: 33298 Test: [ 7800/21770] eta: 0:09:11 time: 0.0397 data: 0.0012 max mem: 
33298 Test: [ 7900/21770] eta: 0:09:07 time: 0.0391 data: 0.0012 max mem: 33298 Test: [ 8000/21770] eta: 0:09:03 time: 0.0393 data: 0.0012 max mem: 33298 Test: [ 8100/21770] eta: 0:08:59 time: 0.0389 data: 0.0012 max mem: 33298 Test: [ 8200/21770] eta: 0:08:55 time: 0.0397 data: 0.0013 max mem: 33298 Test: [ 8300/21770] eta: 0:08:51 time: 0.0391 data: 0.0011 max mem: 33298 Test: [ 8400/21770] eta: 0:08:47 time: 0.0391 data: 0.0012 max mem: 33298 Test: [ 8500/21770] eta: 0:08:43 time: 0.0391 data: 0.0011 max mem: 33298 Test: [ 8600/21770] eta: 0:08:39 time: 0.0390 data: 0.0012 max mem: 33298 Test: [ 8700/21770] eta: 0:08:35 time: 0.0391 data: 0.0011 max mem: 33298 Test: [ 8800/21770] eta: 0:08:31 time: 0.0390 data: 0.0012 max mem: 33298 Test: [ 8900/21770] eta: 0:08:27 time: 0.0393 data: 0.0012 max mem: 33298 Test: [ 9000/21770] eta: 0:08:23 time: 0.0387 data: 0.0012 max mem: 33298 Test: [ 9100/21770] eta: 0:08:19 time: 0.0387 data: 0.0012 max mem: 33298 Test: [ 9200/21770] eta: 0:08:15 time: 0.0387 data: 0.0011 max mem: 33298 Test: [ 9300/21770] eta: 0:08:11 time: 0.0391 data: 0.0011 max mem: 33298 Test: [ 9400/21770] eta: 0:08:07 time: 0.0387 data: 0.0012 max mem: 33298 Test: [ 9500/21770] eta: 0:08:03 time: 0.0387 data: 0.0012 max mem: 33298 Test: [ 9600/21770] eta: 0:07:59 time: 0.0388 data: 0.0012 max mem: 33298 Test: [ 9700/21770] eta: 0:07:55 time: 0.0391 data: 0.0011 max mem: 33298 Test: [ 9800/21770] eta: 0:07:51 time: 0.0392 data: 0.0012 max mem: 33298 Test: [ 9900/21770] eta: 0:07:47 time: 0.0394 data: 0.0012 max mem: 33298 Test: [10000/21770] eta: 0:07:43 time: 0.0387 data: 0.0012 max mem: 33298 Test: [10100/21770] eta: 0:07:39 time: 0.0388 data: 0.0012 max mem: 33298 Test: [10200/21770] eta: 0:07:35 time: 0.0390 data: 0.0011 max mem: 33298 Test: [10300/21770] eta: 0:07:31 time: 0.0394 data: 0.0012 max mem: 33298 Test: [10400/21770] eta: 0:07:27 time: 0.0390 data: 0.0012 max mem: 33298 Test: [10500/21770] eta: 0:07:23 time: 0.0392 data: 0.0012 max mem: 
33298 Test: [10600/21770] eta: 0:07:19 time: 0.0389 data: 0.0012 max mem: 33298 Test: [10700/21770] eta: 0:07:15 time: 0.0387 data: 0.0012 max mem: 33298 Test: [10800/21770] eta: 0:07:11 time: 0.0387 data: 0.0012 max mem: 33298 Test: [10900/21770] eta: 0:07:07 time: 0.0388 data: 0.0011 max mem: 33298 Test: [11000/21770] eta: 0:07:03 time: 0.0389 data: 0.0012 max mem: 33298 Test: [11100/21770] eta: 0:06:59 time: 0.0389 data: 0.0012 max mem: 33298 Test: [11200/21770] eta: 0:06:55 time: 0.0390 data: 0.0011 max mem: 33298 Test: [11300/21770] eta: 0:06:51 time: 0.0393 data: 0.0012 max mem: 33298 Test: [11400/21770] eta: 0:06:48 time: 0.0389 data: 0.0012 max mem: 33298 Test: [11500/21770] eta: 0:06:44 time: 0.0396 data: 0.0012 max mem: 33298 Test: [11600/21770] eta: 0:06:40 time: 0.0397 data: 0.0012 max mem: 33298 Test: [11700/21770] eta: 0:06:36 time: 0.0397 data: 0.0012 max mem: 33298 Test: [11800/21770] eta: 0:06:32 time: 0.0394 data: 0.0012 max mem: 33298 Test: [11900/21770] eta: 0:06:28 time: 0.0394 data: 0.0011 max mem: 33298 Test: [12000/21770] eta: 0:06:24 time: 0.0392 data: 0.0012 max mem: 33298 Test: [12100/21770] eta: 0:06:20 time: 0.0398 data: 0.0012 max mem: 33298 Test: [12200/21770] eta: 0:06:16 time: 0.0389 data: 0.0012 max mem: 33298 Test: [12300/21770] eta: 0:06:12 time: 0.0388 data: 0.0012 max mem: 33298 Test: [12400/21770] eta: 0:06:08 time: 0.0388 data: 0.0012 max mem: 33298 Test: [12500/21770] eta: 0:06:04 time: 0.0387 data: 0.0012 max mem: 33298 Test: [12600/21770] eta: 0:06:00 time: 0.0387 data: 0.0012 max mem: 33298 Test: [12700/21770] eta: 0:05:56 time: 0.0388 data: 0.0012 max mem: 33298 Test: [12800/21770] eta: 0:05:52 time: 0.0387 data: 0.0011 max mem: 33298 Test: [12900/21770] eta: 0:05:48 time: 0.0387 data: 0.0012 max mem: 33298 Test: [13000/21770] eta: 0:05:44 time: 0.0395 data: 0.0012 max mem: 33298 Test: [13100/21770] eta: 0:05:40 time: 0.0391 data: 0.0012 max mem: 33298 Test: [13200/21770] eta: 0:05:36 time: 0.0398 data: 0.0011 max mem: 
33298 Test: [13300/21770] eta: 0:05:33 time: 0.0402 data: 0.0011 max mem: 33298 Test: [13400/21770] eta: 0:05:29 time: 0.0398 data: 0.0012 max mem: 33298 Test: [13500/21770] eta: 0:05:25 time: 0.0403 data: 0.0012 max mem: 33298 Test: [13600/21770] eta: 0:05:21 time: 0.0397 data: 0.0011 max mem: 33298 Test: [13700/21770] eta: 0:05:17 time: 0.0389 data: 0.0012 max mem: 33298 Test: [13800/21770] eta: 0:05:13 time: 0.0393 data: 0.0012 max mem: 33298 Test: [13900/21770] eta: 0:05:09 time: 0.0394 data: 0.0012 max mem: 33298 Test: [14000/21770] eta: 0:05:05 time: 0.0403 data: 0.0012 max mem: 33298 Test: [14100/21770] eta: 0:05:01 time: 0.0395 data: 0.0011 max mem: 33298 Test: [14200/21770] eta: 0:04:57 time: 0.0397 data: 0.0012 max mem: 33298 Test: [14300/21770] eta: 0:04:53 time: 0.0387 data: 0.0012 max mem: 33298 Test: [14400/21770] eta: 0:04:49 time: 0.0391 data: 0.0012 max mem: 33298 Test: [14500/21770] eta: 0:04:46 time: 0.0387 data: 0.0011 max mem: 33298 Test: [14600/21770] eta: 0:04:42 time: 0.0386 data: 0.0011 max mem: 33298 Test: [14700/21770] eta: 0:04:38 time: 0.0386 data: 0.0011 max mem: 33298 Test: [14800/21770] eta: 0:04:34 time: 0.0398 data: 0.0011 max mem: 33298 Test: [14900/21770] eta: 0:04:30 time: 0.0397 data: 0.0011 max mem: 33298 Test: [15000/21770] eta: 0:04:26 time: 0.0391 data: 0.0011 max mem: 33298 Test: [15100/21770] eta: 0:04:22 time: 0.0400 data: 0.0011 max mem: 33298 Test: [15200/21770] eta: 0:04:18 time: 0.0401 data: 0.0011 max mem: 33298 Test: [15300/21770] eta: 0:04:14 time: 0.0388 data: 0.0012 max mem: 33298 Test: [15400/21770] eta: 0:04:10 time: 0.0389 data: 0.0012 max mem: 33298 Test: [15500/21770] eta: 0:04:06 time: 0.0389 data: 0.0012 max mem: 33298 Test: [15600/21770] eta: 0:04:02 time: 0.0388 data: 0.0012 max mem: 33298 Test: [15700/21770] eta: 0:03:58 time: 0.0388 data: 0.0012 max mem: 33298 Test: [15800/21770] eta: 0:03:54 time: 0.0387 data: 0.0012 max mem: 33298 Test: [15900/21770] eta: 0:03:50 time: 0.0390 data: 0.0012 max mem: 
33298 Test: [16000/21770] eta: 0:03:46 time: 0.0392 data: 0.0012 max mem: 33298 Test: [16100/21770] eta: 0:03:42 time: 0.0391 data: 0.0012 max mem: 33298 Test: [16200/21770] eta: 0:03:39 time: 0.0392 data: 0.0012 max mem: 33298 Test: [16300/21770] eta: 0:03:35 time: 0.0390 data: 0.0012 max mem: 33298 Test: [16400/21770] eta: 0:03:31 time: 0.0394 data: 0.0012 max mem: 33298 Test: [16500/21770] eta: 0:03:27 time: 0.0398 data: 0.0012 max mem: 33298 Test: [16600/21770] eta: 0:03:23 time: 0.0387 data: 0.0011 max mem: 33298 Test: [16700/21770] eta: 0:03:19 time: 0.0387 data: 0.0012 max mem: 33298 Test: [16800/21770] eta: 0:03:15 time: 0.0393 data: 0.0012 max mem: 33298 Test: [16900/21770] eta: 0:03:11 time: 0.0387 data: 0.0012 max mem: 33298 Test: [17000/21770] eta: 0:03:07 time: 0.0386 data: 0.0011 max mem: 33298 Test: [17100/21770] eta: 0:03:03 time: 0.0392 data: 0.0012 max mem: 33298 Test: [17200/21770] eta: 0:02:59 time: 0.0389 data: 0.0012 max mem: 33298 Test: [17300/21770] eta: 0:02:55 time: 0.0390 data: 0.0012 max mem: 33298 Test: [17400/21770] eta: 0:02:51 time: 0.0392 data: 0.0012 max mem: 33298 Test: [17500/21770] eta: 0:02:47 time: 0.0391 data: 0.0012 max mem: 33298 Test: [17600/21770] eta: 0:02:43 time: 0.0390 data: 0.0012 max mem: 33298 Test: [17700/21770] eta: 0:02:39 time: 0.0391 data: 0.0012 max mem: 33298 Test: [17800/21770] eta: 0:02:36 time: 0.0390 data: 0.0012 max mem: 33298 Test: [17900/21770] eta: 0:02:32 time: 0.0390 data: 0.0011 max mem: 33298 Test: [18000/21770] eta: 0:02:28 time: 0.0393 data: 0.0012 max mem: 33298 Test: [18100/21770] eta: 0:02:24 time: 0.0395 data: 0.0012 max mem: 33298 Test: [18200/21770] eta: 0:02:20 time: 0.0393 data: 0.0012 max mem: 33298 Test: [18300/21770] eta: 0:02:16 time: 0.0396 data: 0.0011 max mem: 33298 Test: [18400/21770] eta: 0:02:12 time: 0.0392 data: 0.0012 max mem: 33298 Test: [18500/21770] eta: 0:02:08 time: 0.0392 data: 0.0012 max mem: 33298 Test: [18600/21770] eta: 0:02:04 time: 0.0390 data: 0.0012 max mem: 
33298 Test: [18700/21770] eta: 0:02:00 time: 0.0390 data: 0.0011 max mem: 33298 Test: [18800/21770] eta: 0:01:56 time: 0.0388 data: 0.0012 max mem: 33298 Test: [18900/21770] eta: 0:01:52 time: 0.0387 data: 0.0011 max mem: 33298 Test: [19000/21770] eta: 0:01:48 time: 0.0387 data: 0.0011 max mem: 33298 Test: [19100/21770] eta: 0:01:44 time: 0.0390 data: 0.0012 max mem: 33298 Test: [19200/21770] eta: 0:01:40 time: 0.0387 data: 0.0011 max mem: 33298 Test: [19300/21770] eta: 0:01:37 time: 0.0390 data: 0.0012 max mem: 33298 Test: [19400/21770] eta: 0:01:33 time: 0.0390 data: 0.0012 max mem: 33298 Test: [19500/21770] eta: 0:01:29 time: 0.0389 data: 0.0012 max mem: 33298 Test: [19600/21770] eta: 0:01:25 time: 0.0392 data: 0.0012 max mem: 33298 Test: [19700/21770] eta: 0:01:21 time: 0.0394 data: 0.0012 max mem: 33298 Test: [19800/21770] eta: 0:01:17 time: 0.0400 data: 0.0011 max mem: 33298 Test: [19900/21770] eta: 0:01:13 time: 0.0397 data: 0.0011 max mem: 33298 Test: [20000/21770] eta: 0:01:09 time: 0.0390 data: 0.0012 max mem: 33298 Test: [20100/21770] eta: 0:01:05 time: 0.0398 data: 0.0012 max mem: 33298 Test: [20200/21770] eta: 0:01:01 time: 0.0391 data: 0.0012 max mem: 33298 Test: [20300/21770] eta: 0:00:57 time: 0.0390 data: 0.0012 max mem: 33298 Test: [20400/21770] eta: 0:00:53 time: 0.0390 data: 0.0012 max mem: 33298 Test: [20500/21770] eta: 0:00:49 time: 0.0390 data: 0.0012 max mem: 33298 Test: [20600/21770] eta: 0:00:45 time: 0.0390 data: 0.0012 max mem: 33298 Test: [20700/21770] eta: 0:00:42 time: 0.0390 data: 0.0012 max mem: 33298 Test: [20800/21770] eta: 0:00:38 time: 0.0390 data: 0.0012 max mem: 33298 Test: [20900/21770] eta: 0:00:34 time: 0.0391 data: 0.0012 max mem: 33298 Test: [21000/21770] eta: 0:00:30 time: 0.0390 data: 0.0012 max mem: 33298 Test: [21100/21770] eta: 0:00:26 time: 0.0388 data: 0.0012 max mem: 33298 Test: [21200/21770] eta: 0:00:22 time: 0.0389 data: 0.0012 max mem: 33298 Test: [21300/21770] eta: 0:00:18 time: 0.0390 data: 0.0012 max mem: 
33298
Test: [21400/21770] eta: 0:00:14 time: 0.0393 data: 0.0012 max mem: 33298
Test: [21500/21770] eta: 0:00:10 time: 0.0394 data: 0.0012 max mem: 33298
Test: [21600/21770] eta: 0:00:06 time: 0.0391 data: 0.0012 max mem: 33298
Test: [21700/21770] eta: 0:00:02 time: 0.0391 data: 0.0013 max mem: 33298
Test: Total time: 0:14:15
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Better epoch: 0
Epoch: [1] [ 0/4276] eta: 6:44:18 lr: 4.887331692414045e-05 loss: 0.2073 (0.2073) time: 5.6732 data: 2.1749 max mem: 33298
Epoch: [1] [ 10/4276] eta: 4:11:40 lr: 4.8870679281670814e-05 loss: 0.2174 (0.2258) time: 3.5397 data: 0.2038 max mem: 33298
Epoch: [1] [ 20/4276] eta: 4:01:51 lr: 4.886804162338348e-05 loss: 0.2288 (0.2278) time: 3.2965 data: 0.0065 max mem: 33298
Epoch: [1] [ 30/4276] eta: 3:55:34 lr: 4.886540394927738e-05 loss: 0.2292 (0.2302) time: 3.2131 data: 0.0072 max mem: 33298
Epoch: [1] [ 40/4276] eta: 3:52:30 lr: 4.886276625935149e-05 loss: 0.2308 (0.2282) time: 3.1709 data: 0.0078 max mem: 33298
Epoch: [1] [ 50/4276] eta: 3:49:41 lr: 4.8860128553604756e-05 loss: 0.2308 (0.2287) time: 3.1558 data: 0.0073 max mem: 33298
Epoch: [1] [ 60/4276] eta: 3:48:07 lr: 4.8857490832036145e-05 loss: 0.2173 (0.2312) time: 3.1509 data: 0.0075 max mem: 33298
Epoch: [1] [ 70/4276] eta: 3:46:42 lr: 4.8854853094644606e-05 loss: 0.2139 (0.2298) time: 3.1657 data: 0.0081 max mem: 33298
Epoch: [1] [ 80/4276] eta: 3:45:16 lr: 4.8852215341429095e-05 loss: 0.2139 (0.2294) time: 3.1443 data: 0.0083 max mem: 33298
Epoch: [1] [ 90/4276] eta: 3:43:47 lr: 4.884957757238856e-05 loss: 0.2152 (0.2273) time: 3.1135 data: 0.0080 max mem: 33298
Epoch: [1] [ 100/4276] eta: 3:42:42 lr: 4.884693978752198e-05 loss: 0.2234 (0.2321) time: 3.1126 data: 0.0075 max mem: 33298
Epoch: [1] [ 110/4276] eta: 
3:41:33 lr: 4.884430198682829e-05 loss: 0.2312 (0.2332) time: 3.1143 data: 0.0070 max mem: 33298 Epoch: [1] [ 120/4276] eta: 3:40:21 lr: 4.884166417030644e-05 loss: 0.2243 (0.2332) time: 3.0878 data: 0.0073 max mem: 33298 Epoch: [1] [ 130/4276] eta: 3:39:14 lr: 4.8839026337955394e-05 loss: 0.2293 (0.2347) time: 3.0731 data: 0.0073 max mem: 33298 Epoch: [1] [ 140/4276] eta: 3:38:27 lr: 4.883638848977413e-05 loss: 0.2549 (0.2351) time: 3.0952 data: 0.0070 max mem: 33298 Epoch: [1] [ 150/4276] eta: 3:37:45 lr: 4.883375062576156e-05 loss: 0.2175 (0.2338) time: 3.1252 data: 0.0070 max mem: 33298 Epoch: [1] [ 160/4276] eta: 3:37:08 lr: 4.883111274591667e-05 loss: 0.2180 (0.2340) time: 3.1381 data: 0.0074 max mem: 33298 Epoch: [1] [ 170/4276] eta: 3:36:38 lr: 4.88284748502384e-05 loss: 0.2302 (0.2346) time: 3.1594 data: 0.0083 max mem: 33298 Epoch: [1] [ 180/4276] eta: 3:36:04 lr: 4.8825836938725715e-05 loss: 0.2407 (0.2352) time: 3.1656 data: 0.0086 max mem: 33298 Epoch: [1] [ 190/4276] eta: 3:35:32 lr: 4.8823199011377564e-05 loss: 0.2495 (0.2363) time: 3.1600 data: 0.0084 max mem: 33298 Epoch: [1] [ 200/4276] eta: 3:35:01 lr: 4.88205610681929e-05 loss: 0.2471 (0.2371) time: 3.1661 data: 0.0085 max mem: 33298 Epoch: [1] [ 210/4276] eta: 3:34:37 lr: 4.881792310917068e-05 loss: 0.2370 (0.2368) time: 3.1874 data: 0.0090 max mem: 33298 Epoch: [1] [ 220/4276] eta: 3:34:08 lr: 4.881528513430986e-05 loss: 0.2314 (0.2363) time: 3.1928 data: 0.0091 max mem: 33298 Epoch: [1] [ 230/4276] eta: 3:33:35 lr: 4.881264714360938e-05 loss: 0.2208 (0.2357) time: 3.1687 data: 0.0091 max mem: 33298 Epoch: [1] [ 240/4276] eta: 3:33:02 lr: 4.8810009137068206e-05 loss: 0.2228 (0.2362) time: 3.1600 data: 0.0091 max mem: 33298 Epoch: [1] [ 250/4276] eta: 3:32:30 lr: 4.880737111468529e-05 loss: 0.2637 (0.2372) time: 3.1621 data: 0.0088 max mem: 33298 Epoch: [1] [ 260/4276] eta: 3:31:55 lr: 4.880473307645959e-05 loss: 0.2593 (0.2378) time: 3.1557 data: 0.0087 max mem: 33298 Epoch: [1] [ 270/4276] 
eta: 3:31:22 lr: 4.880209502239005e-05 loss: 0.2593 (0.2382) time: 3.1520 data: 0.0087 max mem: 33298 Epoch: [1] [ 280/4276] eta: 3:30:46 lr: 4.879945695247563e-05 loss: 0.2296 (0.2378) time: 3.1434 data: 0.0085 max mem: 33298 Epoch: [1] [ 290/4276] eta: 3:30:11 lr: 4.8796818866715285e-05 loss: 0.2211 (0.2373) time: 3.1389 data: 0.0079 max mem: 33298 Epoch: [1] [ 300/4276] eta: 3:29:35 lr: 4.879418076510796e-05 loss: 0.2164 (0.2369) time: 3.1391 data: 0.0076 max mem: 33298 Epoch: [1] [ 310/4276] eta: 3:29:03 lr: 4.879154264765261e-05 loss: 0.2187 (0.2369) time: 3.1438 data: 0.0084 max mem: 33298 Epoch: [1] [ 320/4276] eta: 3:28:27 lr: 4.878890451434819e-05 loss: 0.2321 (0.2374) time: 3.1409 data: 0.0090 max mem: 33298 Epoch: [1] [ 330/4276] eta: 3:27:52 lr: 4.878626636519366e-05 loss: 0.2384 (0.2372) time: 3.1330 data: 0.0092 max mem: 33298 Epoch: [1] [ 340/4276] eta: 3:27:17 lr: 4.8783628200187945e-05 loss: 0.2339 (0.2368) time: 3.1341 data: 0.0091 max mem: 33298 Epoch: [1] [ 350/4276] eta: 3:26:38 lr: 4.878099001933003e-05 loss: 0.2123 (0.2364) time: 3.1117 data: 0.0081 max mem: 33298 Epoch: [1] [ 360/4276] eta: 3:26:08 lr: 4.877835182261885e-05 loss: 0.2269 (0.2372) time: 3.1328 data: 0.0083 max mem: 33298 Epoch: [1] [ 370/4276] eta: 3:25:38 lr: 4.8775713610053366e-05 loss: 0.2269 (0.2368) time: 3.1712 data: 0.0085 max mem: 33298 Epoch: [1] [ 380/4276] eta: 3:25:07 lr: 4.877307538163252e-05 loss: 0.2092 (0.2368) time: 3.1702 data: 0.0079 max mem: 33298 Epoch: [1] [ 390/4276] eta: 3:24:37 lr: 4.877043713735526e-05 loss: 0.2407 (0.2372) time: 3.1704 data: 0.0082 max mem: 33298 Epoch: [1] [ 400/4276] eta: 3:24:06 lr: 4.8767798877220556e-05 loss: 0.2427 (0.2375) time: 3.1677 data: 0.0083 max mem: 33298 Epoch: [1] [ 410/4276] eta: 3:23:35 lr: 4.8765160601227356e-05 loss: 0.2320 (0.2373) time: 3.1641 data: 0.0080 max mem: 33298 Epoch: [1] [ 420/4276] eta: 3:22:59 lr: 4.876252230937459e-05 loss: 0.2262 (0.2377) time: 3.1391 data: 0.0080 max mem: 33298 Epoch: [1] [ 
430/4276] eta: 3:22:18 lr: 4.875988400166123e-05 loss: 0.2306 (0.2376) time: 3.0848 data: 0.0073 max mem: 33298 Epoch: [1] [ 440/4276] eta: 3:21:38 lr: 4.875724567808622e-05 loss: 0.2357 (0.2373) time: 3.0540 data: 0.0068 max mem: 33298 Epoch: [1] [ 450/4276] eta: 3:21:00 lr: 4.875460733864852e-05 loss: 0.2364 (0.2376) time: 3.0654 data: 0.0075 max mem: 33298 Epoch: [1] [ 460/4276] eta: 3:20:25 lr: 4.875196898334706e-05 loss: 0.2252 (0.2372) time: 3.0954 data: 0.0078 max mem: 33298 Epoch: [1] [ 470/4276] eta: 3:19:51 lr: 4.874933061218082e-05 loss: 0.2128 (0.2370) time: 3.1159 data: 0.0081 max mem: 33298 Epoch: [1] [ 480/4276] eta: 3:19:18 lr: 4.874669222514871e-05 loss: 0.2208 (0.2369) time: 3.1273 data: 0.0084 max mem: 33298 Epoch: [1] [ 490/4276] eta: 3:18:48 lr: 4.874405382224973e-05 loss: 0.2208 (0.2369) time: 3.1504 data: 0.0092 max mem: 33298 Epoch: [1] [ 500/4276] eta: 3:18:16 lr: 4.874141540348279e-05 loss: 0.2243 (0.2368) time: 3.1586 data: 0.0098 max mem: 33298 Epoch: [1] [ 510/4276] eta: 3:17:45 lr: 4.873877696884686e-05 loss: 0.2250 (0.2364) time: 3.1512 data: 0.0092 max mem: 33298 Epoch: [1] [ 520/4276] eta: 3:17:08 lr: 4.8736138518340883e-05 loss: 0.2228 (0.2364) time: 3.1175 data: 0.0083 max mem: 33298 Epoch: [1] [ 530/4276] eta: 3:16:33 lr: 4.8733500051963815e-05 loss: 0.2398 (0.2364) time: 3.0919 data: 0.0075 max mem: 33298 Epoch: [1] [ 540/4276] eta: 3:15:57 lr: 4.8730861569714595e-05 loss: 0.2327 (0.2363) time: 3.0876 data: 0.0073 max mem: 33298 Epoch: [1] [ 550/4276] eta: 3:15:26 lr: 4.872822307159218e-05 loss: 0.2395 (0.2366) time: 3.1162 data: 0.0080 max mem: 33298 Epoch: [1] [ 560/4276] eta: 3:14:54 lr: 4.8725584557595524e-05 loss: 0.2499 (0.2369) time: 3.1473 data: 0.0086 max mem: 33298 Epoch: [1] [ 570/4276] eta: 3:14:23 lr: 4.872294602772358e-05 loss: 0.2345 (0.2369) time: 3.1453 data: 0.0085 max mem: 33298 Epoch: [1] [ 580/4276] eta: 3:13:53 lr: 4.872030748197527e-05 loss: 0.2411 (0.2368) time: 3.1638 data: 0.0094 max mem: 33298 Epoch: 
[1] [ 590/4276] eta: 3:13:23 lr: 4.871766892034957e-05 loss: 0.2251 (0.2364) time: 3.1749 data: 0.0094 max mem: 33298 Epoch: [1] [ 600/4276] eta: 3:12:53 lr: 4.871503034284543e-05 loss: 0.2192 (0.2364) time: 3.1742 data: 0.0087 max mem: 33300 Epoch: [1] [ 610/4276] eta: 3:12:22 lr: 4.871239174946177e-05 loss: 0.2133 (0.2360) time: 3.1585 data: 0.0085 max mem: 33300 Epoch: [1] [ 620/4276] eta: 3:11:50 lr: 4.870975314019757e-05 loss: 0.2077 (0.2359) time: 3.1403 data: 0.0088 max mem: 33300 Epoch: [1] [ 630/4276] eta: 3:11:17 lr: 4.8707114515051765e-05 loss: 0.2333 (0.2362) time: 3.1346 data: 0.0087 max mem: 33300 Epoch: [1] [ 640/4276] eta: 3:10:45 lr: 4.8704475874023306e-05 loss: 0.2326 (0.2360) time: 3.1350 data: 0.0089 max mem: 33300 Epoch: [1] [ 650/4276] eta: 3:10:11 lr: 4.870183721711114e-05 loss: 0.2174 (0.2359) time: 3.1230 data: 0.0094 max mem: 33300 Epoch: [1] [ 660/4276] eta: 3:09:36 lr: 4.869919854431422e-05 loss: 0.2223 (0.2359) time: 3.0960 data: 0.0091 max mem: 33300 Epoch: [1] [ 670/4276] eta: 3:09:01 lr: 4.869655985563148e-05 loss: 0.2186 (0.2356) time: 3.0784 data: 0.0085 max mem: 33300 Epoch: [1] [ 680/4276] eta: 3:08:26 lr: 4.8693921151061884e-05 loss: 0.2162 (0.2354) time: 3.0724 data: 0.0078 max mem: 33300 Epoch: [1] [ 690/4276] eta: 3:07:55 lr: 4.8691282430604374e-05 loss: 0.2162 (0.2352) time: 3.1136 data: 0.0079 max mem: 33300 Epoch: [1] [ 700/4276] eta: 3:07:25 lr: 4.868864369425789e-05 loss: 0.2236 (0.2353) time: 3.1700 data: 0.0079 max mem: 33300 Epoch: [1] [ 710/4276] eta: 3:06:52 lr: 4.868600494202139e-05 loss: 0.2474 (0.2355) time: 3.1451 data: 0.0076 max mem: 33300 Epoch: [1] [ 720/4276] eta: 3:06:20 lr: 4.868336617389383e-05 loss: 0.2415 (0.2355) time: 3.1145 data: 0.0078 max mem: 33300 Epoch: [1] [ 730/4276] eta: 3:05:49 lr: 4.868072738987413e-05 loss: 0.2415 (0.2358) time: 3.1389 data: 0.0079 max mem: 33300 Epoch: [1] [ 740/4276] eta: 3:05:19 lr: 4.867808858996126e-05 loss: 0.2375 (0.2359) time: 3.1685 data: 0.0081 max mem: 33300 
Epoch: [1] [ 750/4276] eta: 3:04:50 lr: 4.867544977415415e-05 loss: 0.2251 (0.2357) time: 3.1891 data: 0.0083 max mem: 33300 Epoch: [1] [ 760/4276] eta: 3:04:21 lr: 4.867281094245177e-05 loss: 0.2072 (0.2354) time: 3.1969 data: 0.0085 max mem: 33300 Epoch: [1] [ 770/4276] eta: 3:03:53 lr: 4.867017209485304e-05 loss: 0.2160 (0.2355) time: 3.2055 data: 0.0086 max mem: 33300 Epoch: [1] [ 780/4276] eta: 3:03:23 lr: 4.8667533231356924e-05 loss: 0.2282 (0.2356) time: 3.2065 data: 0.0085 max mem: 33300 Epoch: [1] [ 790/4276] eta: 3:02:54 lr: 4.866489435196236e-05 loss: 0.2395 (0.2359) time: 3.1990 data: 0.0088 max mem: 33300 Epoch: [1] [ 800/4276] eta: 3:02:24 lr: 4.8662255456668307e-05 loss: 0.2398 (0.2359) time: 3.1874 data: 0.0085 max mem: 33300 Epoch: [1] [ 810/4276] eta: 3:01:52 lr: 4.86596165454737e-05 loss: 0.2342 (0.2359) time: 3.1584 data: 0.0083 max mem: 33300 Epoch: [1] [ 820/4276] eta: 3:01:18 lr: 4.865697761837748e-05 loss: 0.2450 (0.2359) time: 3.1079 data: 0.0084 max mem: 33300 Epoch: [1] [ 830/4276] eta: 3:00:44 lr: 4.8654338675378614e-05 loss: 0.2360 (0.2361) time: 3.0917 data: 0.0083 max mem: 33300 Epoch: [1] [ 840/4276] eta: 3:00:10 lr: 4.865169971647603e-05 loss: 0.2263 (0.2362) time: 3.0913 data: 0.0090 max mem: 33300 Epoch: [1] [ 850/4276] eta: 2:59:36 lr: 4.8649060741668674e-05 loss: 0.2145 (0.2362) time: 3.0739 data: 0.0091 max mem: 33300 Epoch: [1] [ 860/4276] eta: 2:59:01 lr: 4.864642175095549e-05 loss: 0.2118 (0.2362) time: 3.0677 data: 0.0083 max mem: 33300 Epoch: [1] [ 870/4276] eta: 2:58:26 lr: 4.8643782744335434e-05 loss: 0.2287 (0.2363) time: 3.0615 data: 0.0081 max mem: 33300 Epoch: [1] [ 880/4276] eta: 2:57:51 lr: 4.8641143721807456e-05 loss: 0.2357 (0.2364) time: 3.0600 data: 0.0079 max mem: 33300 Epoch: [1] [ 890/4276] eta: 2:57:17 lr: 4.863850468337047e-05 loss: 0.2289 (0.2364) time: 3.0609 data: 0.0078 max mem: 33300 Epoch: [1] [ 900/4276] eta: 2:56:44 lr: 4.8635865629023455e-05 loss: 0.2239 (0.2363) time: 3.0760 data: 0.0084 max mem: 
33300 Epoch: [1] [ 910/4276] eta: 2:56:11 lr: 4.8633226558765346e-05 loss: 0.2339 (0.2364) time: 3.1019 data: 0.0088 max mem: 33300 Epoch: [1] [ 920/4276] eta: 2:55:39 lr: 4.863058747259508e-05 loss: 0.2410 (0.2365) time: 3.1209 data: 0.0082 max mem: 33300 Epoch: [1] [ 930/4276] eta: 2:55:06 lr: 4.8627948370511605e-05 loss: 0.2287 (0.2364) time: 3.1029 data: 0.0083 max mem: 33300 Epoch: [1] [ 940/4276] eta: 2:54:34 lr: 4.8625309252513865e-05 loss: 0.2101 (0.2363) time: 3.1027 data: 0.0087 max mem: 33300 Epoch: [1] [ 950/4276] eta: 2:54:04 lr: 4.862267011860081e-05 loss: 0.2130 (0.2361) time: 3.1575 data: 0.0090 max mem: 33300 Epoch: [1] [ 960/4276] eta: 2:53:35 lr: 4.862003096877138e-05 loss: 0.2220 (0.2361) time: 3.2010 data: 0.0088 max mem: 33300 Epoch: [1] [ 970/4276] eta: 2:53:05 lr: 4.861739180302451e-05 loss: 0.2265 (0.2360) time: 3.1995 data: 0.0080 max mem: 33300 Epoch: [1] [ 980/4276] eta: 2:52:34 lr: 4.861475262135917e-05 loss: 0.2331 (0.2363) time: 3.1615 data: 0.0083 max mem: 33300 Epoch: [1] [ 990/4276] eta: 2:52:02 lr: 4.8612113423774274e-05 loss: 0.2281 (0.2362) time: 3.1304 data: 0.0085 max mem: 33300 Epoch: [1] [1000/4276] eta: 2:51:30 lr: 4.8609474210268785e-05 loss: 0.2278 (0.2362) time: 3.1311 data: 0.0079 max mem: 33300 Epoch: [1] [1010/4276] eta: 2:50:59 lr: 4.860683498084163e-05 loss: 0.2201 (0.2360) time: 3.1436 data: 0.0076 max mem: 33300 Epoch: [1] [1020/4276] eta: 2:50:27 lr: 4.860419573549177e-05 loss: 0.2162 (0.2359) time: 3.1395 data: 0.0088 max mem: 33300 Epoch: [1] [1030/4276] eta: 2:49:56 lr: 4.860155647421814e-05 loss: 0.2278 (0.2361) time: 3.1427 data: 0.0092 max mem: 33300 Epoch: [1] [1040/4276] eta: 2:49:26 lr: 4.859891719701969e-05 loss: 0.2349 (0.2359) time: 3.1604 data: 0.0087 max mem: 33300 Epoch: [1] [1050/4276] eta: 2:48:54 lr: 4.859627790389535e-05 loss: 0.2326 (0.2360) time: 3.1532 data: 0.0090 max mem: 33300 Epoch: [1] [1060/4276] eta: 2:48:22 lr: 4.8593638594844065e-05 loss: 0.2326 (0.2362) time: 3.1319 data: 0.0086 
max mem: 33300 Epoch: [1] [1070/4276] eta: 2:47:50 lr: 4.859099926986479e-05 loss: 0.2267 (0.2361) time: 3.1206 data: 0.0081 max mem: 33300 Epoch: [1] [1080/4276] eta: 2:47:18 lr: 4.8588359928956454e-05 loss: 0.2189 (0.2359) time: 3.1090 data: 0.0078 max mem: 33300 Epoch: [1] [1090/4276] eta: 2:46:46 lr: 4.858572057211801e-05 loss: 0.2202 (0.2358) time: 3.1218 data: 0.0083 max mem: 33300 Epoch: [1] [1100/4276] eta: 2:46:14 lr: 4.858308119934839e-05 loss: 0.2135 (0.2358) time: 3.1344 data: 0.0087 max mem: 33300 Epoch: [1] [1110/4276] eta: 2:45:43 lr: 4.858044181064655e-05 loss: 0.2191 (0.2359) time: 3.1267 data: 0.0082 max mem: 33300 Epoch: [1] [1120/4276] eta: 2:45:11 lr: 4.857780240601142e-05 loss: 0.2222 (0.2360) time: 3.1264 data: 0.0080 max mem: 33300 Epoch: [1] [1130/4276] eta: 2:44:40 lr: 4.8575162985441944e-05 loss: 0.2222 (0.2359) time: 3.1398 data: 0.0084 max mem: 33300 Epoch: [1] [1140/4276] eta: 2:44:09 lr: 4.857252354893706e-05 loss: 0.2317 (0.2358) time: 3.1585 data: 0.0090 max mem: 33300 Epoch: [1] [1150/4276] eta: 2:43:38 lr: 4.856988409649573e-05 loss: 0.2376 (0.2358) time: 3.1639 data: 0.0087 max mem: 33300 Epoch: [1] [1160/4276] eta: 2:43:07 lr: 4.856724462811687e-05 loss: 0.2378 (0.2358) time: 3.1609 data: 0.0084 max mem: 33300 Epoch: [1] [1170/4276] eta: 2:42:37 lr: 4.856460514379943e-05 loss: 0.2280 (0.2358) time: 3.1719 data: 0.0087 max mem: 33300 Epoch: [1] [1180/4276] eta: 2:42:06 lr: 4.856196564354236e-05 loss: 0.2273 (0.2356) time: 3.1768 data: 0.0089 max mem: 33300 Epoch: [1] [1190/4276] eta: 2:41:35 lr: 4.855932612734459e-05 loss: 0.2219 (0.2356) time: 3.1486 data: 0.0087 max mem: 33300 Epoch: [1] [1200/4276] eta: 2:41:03 lr: 4.855668659520506e-05 loss: 0.2276 (0.2356) time: 3.1392 data: 0.0087 max mem: 33300 Epoch: [1] [1210/4276] eta: 2:40:32 lr: 4.8554047047122725e-05 loss: 0.2150 (0.2355) time: 3.1559 data: 0.0089 max mem: 33300 Epoch: [1] [1220/4276] eta: 2:40:01 lr: 4.855140748309652e-05 loss: 0.2110 (0.2354) time: 3.1581 data: 
0.0086 max mem: 33300 Epoch: [1] [1230/4276] eta: 2:39:30 lr: 4.854876790312537e-05 loss: 0.2168 (0.2353) time: 3.1498 data: 0.0080 max mem: 33300 Epoch: [1] [1240/4276] eta: 2:38:58 lr: 4.8546128307208234e-05 loss: 0.2408 (0.2354) time: 3.1417 data: 0.0076 max mem: 33300 Epoch: [1] [1250/4276] eta: 2:38:27 lr: 4.854348869534404e-05 loss: 0.2408 (0.2354) time: 3.1481 data: 0.0072 max mem: 33300 Epoch: [1] [1260/4276] eta: 2:37:56 lr: 4.854084906753174e-05 loss: 0.2129 (0.2353) time: 3.1528 data: 0.0071 max mem: 33300 Epoch: [1] [1270/4276] eta: 2:37:25 lr: 4.853820942377028e-05 loss: 0.2270 (0.2355) time: 3.1482 data: 0.0074 max mem: 33300 Epoch: [1] [1280/4276] eta: 2:36:54 lr: 4.853556976405857e-05 loss: 0.2385 (0.2355) time: 3.1470 data: 0.0081 max mem: 33300 Epoch: [1] [1290/4276] eta: 2:36:22 lr: 4.853293008839557e-05 loss: 0.2385 (0.2356) time: 3.1414 data: 0.0084 max mem: 33300 Epoch: [1] [1300/4276] eta: 2:35:50 lr: 4.853029039678023e-05 loss: 0.2233 (0.2355) time: 3.1324 data: 0.0083 max mem: 33300 Epoch: [1] [1310/4276] eta: 2:35:18 lr: 4.852765068921146e-05 loss: 0.2161 (0.2353) time: 3.1265 data: 0.0084 max mem: 33300 Epoch: [1] [1320/4276] eta: 2:34:46 lr: 4.852501096568823e-05 loss: 0.2334 (0.2355) time: 3.1143 data: 0.0080 max mem: 33300 Epoch: [1] [1330/4276] eta: 2:34:13 lr: 4.8522371226209455e-05 loss: 0.2201 (0.2353) time: 3.0883 data: 0.0081 max mem: 33300 Epoch: [1] [1340/4276] eta: 2:33:40 lr: 4.8519731470774096e-05 loss: 0.2098 (0.2352) time: 3.0694 data: 0.0082 max mem: 33300 Epoch: [1] [1350/4276] eta: 2:33:07 lr: 4.851709169938107e-05 loss: 0.2231 (0.2352) time: 3.0612 data: 0.0076 max mem: 33300 Epoch: [1] [1360/4276] eta: 2:32:35 lr: 4.8514451912029326e-05 loss: 0.2271 (0.2352) time: 3.0863 data: 0.0089 max mem: 33300 Epoch: [1] [1370/4276] eta: 2:32:03 lr: 4.851181210871781e-05 loss: 0.2229 (0.2350) time: 3.1105 data: 0.0098 max mem: 33300 Epoch: [1] [1380/4276] eta: 2:31:30 lr: 4.8509172289445446e-05 loss: 0.2230 (0.2350) time: 3.0811 
data: 0.0089 max mem: 33300 Epoch: [1] [1390/4276] eta: 2:30:57 lr: 4.850653245421119e-05 loss: 0.2230 (0.2350) time: 3.0712 data: 0.0080 max mem: 33300 Epoch: [1] [1400/4276] eta: 2:30:25 lr: 4.850389260301396e-05 loss: 0.2206 (0.2351) time: 3.0995 data: 0.0080 max mem: 33300 Epoch: [1] [1410/4276] eta: 2:29:54 lr: 4.850125273585271e-05 loss: 0.2309 (0.2352) time: 3.1160 data: 0.0082 max mem: 33300 Epoch: [1] [1420/4276] eta: 2:29:22 lr: 4.8498612852726374e-05 loss: 0.2374 (0.2353) time: 3.1206 data: 0.0080 max mem: 33300 Epoch: [1] [1430/4276] eta: 2:28:50 lr: 4.849597295363388e-05 loss: 0.2203 (0.2352) time: 3.1298 data: 0.0078 max mem: 33300 Epoch: [1] [1440/4276] eta: 2:28:19 lr: 4.849333303857418e-05 loss: 0.2201 (0.2354) time: 3.1389 data: 0.0076 max mem: 33300 Epoch: [1] [1450/4276] eta: 2:27:48 lr: 4.84906931075462e-05 loss: 0.2329 (0.2354) time: 3.1407 data: 0.0074 max mem: 33300 Epoch: [1] [1460/4276] eta: 2:27:17 lr: 4.8488053160548894e-05 loss: 0.2374 (0.2355) time: 3.1512 data: 0.0077 max mem: 33300 Epoch: [1] [1470/4276] eta: 2:26:46 lr: 4.848541319758118e-05 loss: 0.2374 (0.2355) time: 3.1578 data: 0.0078 max mem: 33300 Epoch: [1] [1480/4276] eta: 2:26:15 lr: 4.848277321864201e-05 loss: 0.2323 (0.2356) time: 3.1494 data: 0.0077 max mem: 33300 Epoch: [1] [1490/4276] eta: 2:25:43 lr: 4.848013322373031e-05 loss: 0.2291 (0.2355) time: 3.1443 data: 0.0083 max mem: 33300 Epoch: [1] [1500/4276] eta: 2:25:12 lr: 4.847749321284502e-05 loss: 0.2107 (0.2354) time: 3.1320 data: 0.0087 max mem: 33300 Epoch: [1] [1510/4276] eta: 2:24:39 lr: 4.847485318598508e-05 loss: 0.2059 (0.2353) time: 3.0982 data: 0.0079 max mem: 33300 Epoch: [1] [1520/4276] eta: 2:24:07 lr: 4.847221314314942e-05 loss: 0.2226 (0.2352) time: 3.0903 data: 0.0080 max mem: 33300 Epoch: [1] [1530/4276] eta: 2:23:35 lr: 4.846957308433699e-05 loss: 0.2234 (0.2353) time: 3.1079 data: 0.0086 max mem: 33300 Epoch: [1] [1540/4276] eta: 2:23:03 lr: 4.846693300954671e-05 loss: 0.2225 (0.2352) time: 
3.1042 data: 0.0088 max mem: 33300 Epoch: [1] [1550/4276] eta: 2:22:31 lr: 4.846429291877753e-05 loss: 0.2313 (0.2352) time: 3.0963 data: 0.0082 max mem: 33300 Epoch: [1] [1560/4276] eta: 2:21:58 lr: 4.8461652812028375e-05 loss: 0.2154 (0.2350) time: 3.0775 data: 0.0075 max mem: 33300 Epoch: [1] [1570/4276] eta: 2:21:26 lr: 4.845901268929819e-05 loss: 0.2028 (0.2349) time: 3.0797 data: 0.0075 max mem: 33300 Epoch: [1] [1580/4276] eta: 2:20:55 lr: 4.845637255058591e-05 loss: 0.1979 (0.2347) time: 3.1080 data: 0.0078 max mem: 33300 Epoch: [1] [1590/4276] eta: 2:20:23 lr: 4.845373239589046e-05 loss: 0.2157 (0.2347) time: 3.1145 data: 0.0081 max mem: 33300 Epoch: [1] [1600/4276] eta: 2:19:51 lr: 4.8451092225210786e-05 loss: 0.2157 (0.2349) time: 3.1067 data: 0.0080 max mem: 33300 Epoch: [1] [1610/4276] eta: 2:19:18 lr: 4.8448452038545824e-05 loss: 0.2147 (0.2347) time: 3.0791 data: 0.0083 max mem: 33300 Epoch: [1] [1620/4276] eta: 2:18:46 lr: 4.8445811835894504e-05 loss: 0.2202 (0.2347) time: 3.0718 data: 0.0087 max mem: 33300 Epoch: [1] [1630/4276] eta: 2:18:14 lr: 4.844317161725576e-05 loss: 0.2440 (0.2350) time: 3.1005 data: 0.0092 max mem: 33300 Epoch: [1] [1640/4276] eta: 2:17:42 lr: 4.8440531382628536e-05 loss: 0.2372 (0.2351) time: 3.0923 data: 0.0087 max mem: 33300 Epoch: [1] [1650/4276] eta: 2:17:10 lr: 4.843789113201176e-05 loss: 0.2241 (0.2351) time: 3.0830 data: 0.0086 max mem: 33300 Epoch: [1] [1660/4276] eta: 2:16:38 lr: 4.8435250865404364e-05 loss: 0.2241 (0.2350) time: 3.0910 data: 0.0089 max mem: 33300 Epoch: [1] [1670/4276] eta: 2:16:07 lr: 4.8432610582805283e-05 loss: 0.2269 (0.2350) time: 3.1131 data: 0.0084 max mem: 33300 Epoch: [1] [1680/4276] eta: 2:15:35 lr: 4.842997028421346e-05 loss: 0.2438 (0.2352) time: 3.1405 data: 0.0089 max mem: 33300 Epoch: [1] [1690/4276] eta: 2:15:04 lr: 4.842732996962783e-05 loss: 0.2386 (0.2351) time: 3.1381 data: 0.0087 max mem: 33300 Epoch: [1] [1700/4276] eta: 2:14:33 lr: 4.842468963904731e-05 loss: 0.2333 
(0.2352) time: 3.1522 data: 0.0084 max mem: 33300 Epoch: [1] [1710/4276] eta: 2:14:02 lr: 4.842204929247085e-05 loss: 0.2439 (0.2353) time: 3.1567 data: 0.0084 max mem: 33300 Epoch: [1] [1720/4276] eta: 2:13:31 lr: 4.841940892989738e-05 loss: 0.2508 (0.2355) time: 3.1379 data: 0.0082 max mem: 33300 Epoch: [1] [1730/4276] eta: 2:12:59 lr: 4.841676855132584e-05 loss: 0.2414 (0.2354) time: 3.1152 data: 0.0081 max mem: 33300 Epoch: [1] [1740/4276] eta: 2:12:27 lr: 4.8414128156755154e-05 loss: 0.2168 (0.2354) time: 3.1130 data: 0.0081 max mem: 33300 Epoch: [1] [1750/4276] eta: 2:11:56 lr: 4.841148774618425e-05 loss: 0.2234 (0.2356) time: 3.1309 data: 0.0081 max mem: 33300 Epoch: [1] [1760/4276] eta: 2:11:25 lr: 4.840884731961207e-05 loss: 0.2366 (0.2355) time: 3.1424 data: 0.0082 max mem: 33300 Epoch: [1] [1770/4276] eta: 2:10:54 lr: 4.840620687703756e-05 loss: 0.2366 (0.2354) time: 3.1459 data: 0.0079 max mem: 33300 Epoch: [1] [1780/4276] eta: 2:10:22 lr: 4.840356641845963e-05 loss: 0.2276 (0.2354) time: 3.1255 data: 0.0080 max mem: 33300 Epoch: [1] [1790/4276] eta: 2:09:50 lr: 4.840092594387722e-05 loss: 0.2192 (0.2354) time: 3.0895 data: 0.0080 max mem: 33300 Epoch: [1] [1800/4276] eta: 2:09:17 lr: 4.8398285453289275e-05 loss: 0.2247 (0.2354) time: 3.0554 data: 0.0081 max mem: 33300 Epoch: [1] [1810/4276] eta: 2:08:45 lr: 4.839564494669472e-05 loss: 0.2478 (0.2355) time: 3.0557 data: 0.0079 max mem: 33300 Epoch: [1] [1820/4276] eta: 2:08:13 lr: 4.839300442409248e-05 loss: 0.2358 (0.2354) time: 3.0710 data: 0.0075 max mem: 33300 Epoch: [1] [1830/4276] eta: 2:07:41 lr: 4.8390363885481496e-05 loss: 0.2259 (0.2354) time: 3.0898 data: 0.0077 max mem: 33300 Epoch: [1] [1840/4276] eta: 2:07:10 lr: 4.83877233308607e-05 loss: 0.2259 (0.2354) time: 3.1167 data: 0.0080 max mem: 33300 Epoch: [1] [1850/4276] eta: 2:06:38 lr: 4.838508276022902e-05 loss: 0.2259 (0.2354) time: 3.1212 data: 0.0083 max mem: 33300 Epoch: [1] [1860/4276] eta: 2:06:07 lr: 4.838244217358539e-05 loss: 
0.2259 (0.2353) time: 3.1283 data: 0.0086 max mem: 33300 Epoch: [1] [1870/4276] eta: 2:05:36 lr: 4.8379801570928745e-05 loss: 0.2306 (0.2354) time: 3.1267 data: 0.0088 max mem: 33300 Epoch: [1] [1880/4276] eta: 2:05:04 lr: 4.837716095225801e-05 loss: 0.2362 (0.2354) time: 3.1165 data: 0.0083 max mem: 33300 Epoch: [1] [1890/4276] eta: 2:04:32 lr: 4.8374520317572124e-05 loss: 0.2241 (0.2353) time: 3.1145 data: 0.0078 max mem: 33300 Epoch: [1] [1900/4276] eta: 2:04:01 lr: 4.837187966687002e-05 loss: 0.2202 (0.2352) time: 3.1100 data: 0.0084 max mem: 33300 Epoch: [1] [1910/4276] eta: 2:03:29 lr: 4.836923900015062e-05 loss: 0.2248 (0.2351) time: 3.0900 data: 0.0086 max mem: 33300 Epoch: [1] [1920/4276] eta: 2:02:57 lr: 4.836659831741286e-05 loss: 0.2119 (0.2350) time: 3.0836 data: 0.0087 max mem: 33300 Epoch: [1] [1930/4276] eta: 2:02:25 lr: 4.836395761865566e-05 loss: 0.2119 (0.2349) time: 3.0864 data: 0.0089 max mem: 33300 Epoch: [1] [1940/4276] eta: 2:01:53 lr: 4.8361316903877977e-05 loss: 0.2430 (0.2351) time: 3.0640 data: 0.0081 max mem: 33300 Epoch: [1] [1950/4276] eta: 2:01:21 lr: 4.8358676173078724e-05 loss: 0.2333 (0.2350) time: 3.0586 data: 0.0077 max mem: 33300 Epoch: [1] [1960/4276] eta: 2:00:49 lr: 4.8356035426256833e-05 loss: 0.1980 (0.2349) time: 3.0903 data: 0.0079 max mem: 33300 Epoch: [1] [1970/4276] eta: 2:00:18 lr: 4.8353394663411235e-05 loss: 0.1905 (0.2347) time: 3.1146 data: 0.0081 max mem: 33300 Epoch: [1] [1980/4276] eta: 1:59:47 lr: 4.8350753884540856e-05 loss: 0.1880 (0.2345) time: 3.1360 data: 0.0085 max mem: 33300 Epoch: [1] [1990/4276] eta: 1:59:15 lr: 4.8348113089644635e-05 loss: 0.2092 (0.2346) time: 3.1353 data: 0.0084 max mem: 33300 Epoch: [1] [2000/4276] eta: 1:58:44 lr: 4.834547227872151e-05 loss: 0.2206 (0.2346) time: 3.1087 data: 0.0086 max mem: 33300 Epoch: [1] [2010/4276] eta: 1:58:12 lr: 4.834283145177039e-05 loss: 0.2186 (0.2345) time: 3.0843 data: 0.0084 max mem: 33300 Epoch: [1] [2020/4276] eta: 1:57:40 lr: 
4.834019060879021e-05 loss: 0.2312 (0.2345) time: 3.0642 data: 0.0077 max mem: 33300 Epoch: [1] [2030/4276] eta: 1:57:08 lr: 4.833754974977991e-05 loss: 0.2279 (0.2344) time: 3.0707 data: 0.0080 max mem: 33300 Epoch: [1] [2040/4276] eta: 1:56:36 lr: 4.833490887473842e-05 loss: 0.2131 (0.2344) time: 3.0885 data: 0.0083 max mem: 33300 Epoch: [1] [2050/4276] eta: 1:56:05 lr: 4.833226798366465e-05 loss: 0.2381 (0.2344) time: 3.1044 data: 0.0085 max mem: 33300 Epoch: [1] [2060/4276] eta: 1:55:33 lr: 4.832962707655755e-05 loss: 0.2381 (0.2344) time: 3.1067 data: 0.0084 max mem: 33300 Epoch: [1] [2070/4276] eta: 1:55:01 lr: 4.8326986153416046e-05 loss: 0.2205 (0.2343) time: 3.1040 data: 0.0080 max mem: 33300 Epoch: [1] [2080/4276] eta: 1:54:30 lr: 4.832434521423905e-05 loss: 0.2149 (0.2343) time: 3.1100 data: 0.0078 max mem: 33300 Epoch: [1] [2090/4276] eta: 1:53:59 lr: 4.8321704259025514e-05 loss: 0.2149 (0.2343) time: 3.1128 data: 0.0079 max mem: 33300 Epoch: [1] [2100/4276] eta: 1:53:27 lr: 4.831906328777435e-05 loss: 0.2198 (0.2342) time: 3.1032 data: 0.0078 max mem: 33300 Epoch: [1] [2110/4276] eta: 1:52:55 lr: 4.83164223004845e-05 loss: 0.2198 (0.2341) time: 3.0912 data: 0.0077 max mem: 33300 Epoch: [1] [2120/4276] eta: 1:52:24 lr: 4.831378129715488e-05 loss: 0.2232 (0.2341) time: 3.0878 data: 0.0080 max mem: 33300 Epoch: [1] [2130/4276] eta: 1:51:52 lr: 4.831114027778443e-05 loss: 0.2133 (0.2340) time: 3.1011 data: 0.0079 max mem: 33300 Epoch: [1] [2140/4276] eta: 1:51:21 lr: 4.830849924237207e-05 loss: 0.2133 (0.2340) time: 3.1129 data: 0.0079 max mem: 33300 Epoch: [1] [2150/4276] eta: 1:50:49 lr: 4.830585819091673e-05 loss: 0.2203 (0.2339) time: 3.1206 data: 0.0079 max mem: 33300 Epoch: [1] [2160/4276] eta: 1:50:18 lr: 4.830321712341734e-05 loss: 0.2183 (0.2339) time: 3.1164 data: 0.0075 max mem: 33300 Epoch: [1] [2170/4276] eta: 1:49:47 lr: 4.830057603987282e-05 loss: 0.2264 (0.2340) time: 3.1213 data: 0.0073 max mem: 33300 Epoch: [1] [2180/4276] eta: 1:49:15 
lr: 4.829793494028211e-05 loss: 0.2264 (0.2340) time: 3.1367 data: 0.0073 max mem: 33300 Epoch: [1] [2190/4276] eta: 1:48:44 lr: 4.829529382464413e-05 loss: 0.2398 (0.2341) time: 3.1313 data: 0.0072 max mem: 33300 Epoch: [1] [2200/4276] eta: 1:48:13 lr: 4.8292652692957805e-05 loss: 0.2300 (0.2340) time: 3.1194 data: 0.0071 max mem: 33300 Epoch: [1] [2210/4276] eta: 1:47:41 lr: 4.829001154522207e-05 loss: 0.2314 (0.2341) time: 3.1141 data: 0.0072 max mem: 33300 Epoch: [1] [2220/4276] eta: 1:47:10 lr: 4.828737038143585e-05 loss: 0.2348 (0.2340) time: 3.1226 data: 0.0072 max mem: 33300 Epoch: [1] [2230/4276] eta: 1:46:39 lr: 4.8284729201598064e-05 loss: 0.2242 (0.2340) time: 3.1320 data: 0.0073 max mem: 33300 Epoch: [1] [2240/4276] eta: 1:46:07 lr: 4.828208800570765e-05 loss: 0.2229 (0.2339) time: 3.0942 data: 0.0082 max mem: 33300 Epoch: [1] [2250/4276] eta: 1:45:35 lr: 4.827944679376353e-05 loss: 0.2249 (0.2339) time: 3.0638 data: 0.0083 max mem: 33300 Epoch: [1] [2260/4276] eta: 1:45:03 lr: 4.8276805565764624e-05 loss: 0.2353 (0.2340) time: 3.0719 data: 0.0085 max mem: 33300 Epoch: [1] [2270/4276] eta: 1:44:32 lr: 4.827416432170988e-05 loss: 0.2163 (0.2339) time: 3.0760 data: 0.0092 max mem: 33300 Epoch: [1] [2280/4276] eta: 1:44:00 lr: 4.827152306159819e-05 loss: 0.2150 (0.2339) time: 3.0955 data: 0.0091 max mem: 33300 Epoch: [1] [2290/4276] eta: 1:43:29 lr: 4.826888178542851e-05 loss: 0.2246 (0.2339) time: 3.1046 data: 0.0088 max mem: 33300 Epoch: [1] [2300/4276] eta: 1:42:58 lr: 4.826624049319975e-05 loss: 0.2040 (0.2337) time: 3.1147 data: 0.0090 max mem: 33300 Epoch: [1] [2310/4276] eta: 1:42:26 lr: 4.8263599184910844e-05 loss: 0.2040 (0.2336) time: 3.1268 data: 0.0085 max mem: 33300 Epoch: [1] [2320/4276] eta: 1:41:55 lr: 4.8260957860560714e-05 loss: 0.2187 (0.2336) time: 3.1222 data: 0.0079 max mem: 33300 Epoch: [1] [2330/4276] eta: 1:41:24 lr: 4.825831652014829e-05 loss: 0.2187 (0.2336) time: 3.1475 data: 0.0089 max mem: 33300 Epoch: [1] [2340/4276] eta: 
1:40:53 lr: 4.8255675163672486e-05 loss: 0.2162 (0.2335) time: 3.1592 data: 0.0089 max mem: 33300 Epoch: [1] [2350/4276] eta: 1:40:21 lr: 4.8253033791132246e-05 loss: 0.2162 (0.2334) time: 3.1154 data: 0.0086 max mem: 33300 Epoch: [1] [2360/4276] eta: 1:39:50 lr: 4.825039240252648e-05 loss: 0.2181 (0.2334) time: 3.0861 data: 0.0088 max mem: 33300 Epoch: [1] [2370/4276] eta: 1:39:18 lr: 4.8247750997854115e-05 loss: 0.2181 (0.2334) time: 3.0862 data: 0.0083 max mem: 33300 Epoch: [1] [2380/4276] eta: 1:38:47 lr: 4.824510957711408e-05 loss: 0.1940 (0.2333) time: 3.0924 data: 0.0078 max mem: 33300 Epoch: [1] [2390/4276] eta: 1:38:15 lr: 4.82424681403053e-05 loss: 0.1940 (0.2332) time: 3.0975 data: 0.0080 max mem: 33300 Epoch: [1] [2400/4276] eta: 1:37:44 lr: 4.8239826687426695e-05 loss: 0.2150 (0.2332) time: 3.0987 data: 0.0084 max mem: 33300 Epoch: [1] [2410/4276] eta: 1:37:12 lr: 4.823718521847719e-05 loss: 0.2186 (0.2331) time: 3.0938 data: 0.0086 max mem: 33300 Epoch: [1] [2420/4276] eta: 1:36:41 lr: 4.823454373345572e-05 loss: 0.2143 (0.2330) time: 3.0885 data: 0.0081 max mem: 33300 Epoch: [1] [2430/4276] eta: 1:36:09 lr: 4.8231902232361195e-05 loss: 0.2182 (0.2331) time: 3.0947 data: 0.0080 max mem: 33300 Epoch: [1] [2440/4276] eta: 1:35:38 lr: 4.822926071519255e-05 loss: 0.2361 (0.2331) time: 3.0990 data: 0.0082 max mem: 33300 Epoch: [1] [2450/4276] eta: 1:35:06 lr: 4.82266191819487e-05 loss: 0.2182 (0.2330) time: 3.0944 data: 0.0084 max mem: 33300 Epoch: [1] [2460/4276] eta: 1:34:35 lr: 4.822397763262858e-05 loss: 0.2182 (0.2330) time: 3.0983 data: 0.0082 max mem: 33300 Epoch: [1] [2470/4276] eta: 1:34:03 lr: 4.82213360672311e-05 loss: 0.2360 (0.2331) time: 3.0988 data: 0.0079 max mem: 33300 Epoch: [1] [2480/4276] eta: 1:33:32 lr: 4.82186944857552e-05 loss: 0.2372 (0.2330) time: 3.0782 data: 0.0087 max mem: 33300 Epoch: [1] [2490/4276] eta: 1:33:00 lr: 4.8216052888199774e-05 loss: 0.2278 (0.2330) time: 3.0808 data: 0.0088 max mem: 33300 Epoch: [1] [2500/4276] 
eta: 1:32:29 lr: 4.821341127456378e-05 loss: 0.2355 (0.2331) time: 3.0956 data: 0.0079 max mem: 33300 Epoch: [1] [2510/4276] eta: 1:31:58 lr: 4.8210769644846126e-05 loss: 0.2311 (0.2330) time: 3.1118 data: 0.0075 max mem: 33300 Epoch: [1] [2520/4276] eta: 1:31:26 lr: 4.820812799904573e-05 loss: 0.2191 (0.2329) time: 3.1407 data: 0.0078 max mem: 33300 Epoch: [1] [2530/4276] eta: 1:30:55 lr: 4.8205486337161525e-05 loss: 0.1950 (0.2328) time: 3.1509 data: 0.0085 max mem: 33300 Epoch: [1] [2540/4276] eta: 1:30:24 lr: 4.820284465919243e-05 loss: 0.1962 (0.2328) time: 3.1560 data: 0.0086 max mem: 33300 Epoch: [1] [2550/4276] eta: 1:29:53 lr: 4.8200202965137365e-05 loss: 0.2258 (0.2328) time: 3.1497 data: 0.0077 max mem: 33300 Epoch: [1] [2560/4276] eta: 1:29:22 lr: 4.819756125499525e-05 loss: 0.1979 (0.2326) time: 3.1259 data: 0.0077 max mem: 33300 Epoch: [1] [2570/4276] eta: 1:28:50 lr: 4.819491952876501e-05 loss: 0.1989 (0.2326) time: 3.0993 data: 0.0082 max mem: 33300 Epoch: [1] [2580/4276] eta: 1:28:19 lr: 4.8192277786445575e-05 loss: 0.2053 (0.2326) time: 3.0683 data: 0.0079 max mem: 33300 Epoch: [1] [2590/4276] eta: 1:27:47 lr: 4.818963602803586e-05 loss: 0.2091 (0.2325) time: 3.0888 data: 0.0080 max mem: 33300 Epoch: [1] [2600/4276] eta: 1:27:17 lr: 4.818699425353478e-05 loss: 0.2219 (0.2326) time: 3.1889 data: 0.0095 max mem: 33300 Epoch: [1] [2610/4276] eta: 1:26:46 lr: 4.818435246294127e-05 loss: 0.2222 (0.2325) time: 3.2411 data: 0.0101 max mem: 33300 Epoch: [1] [2620/4276] eta: 1:26:16 lr: 4.818171065625425e-05 loss: 0.2166 (0.2325) time: 3.2392 data: 0.0095 max mem: 33300 Epoch: [1] [2630/4276] eta: 1:25:45 lr: 4.817906883347262e-05 loss: 0.2166 (0.2325) time: 3.2303 data: 0.0091 max mem: 33300 Epoch: [1] [2640/4276] eta: 1:25:14 lr: 4.8176426994595336e-05 loss: 0.1977 (0.2324) time: 3.1991 data: 0.0095 max mem: 33300 Epoch: [1] [2650/4276] eta: 1:24:44 lr: 4.8173785139621294e-05 loss: 0.2047 (0.2323) time: 3.2176 data: 0.0103 max mem: 33300 Epoch: [1] 
[2660/4276] eta: 1:24:13 lr: 4.817114326854943e-05 loss: 0.2134 (0.2323) time: 3.2443 data: 0.0103 max mem: 33300 Epoch: [1] [2670/4276] eta: 1:23:43 lr: 4.816850138137865e-05 loss: 0.2219 (0.2323) time: 3.2319 data: 0.0094 max mem: 33300 Epoch: [1] [2680/4276] eta: 1:23:12 lr: 4.816585947810788e-05 loss: 0.2172 (0.2323) time: 3.2313 data: 0.0091 max mem: 33300 Epoch: [1] [2690/4276] eta: 1:22:41 lr: 4.8163217558736054e-05 loss: 0.2221 (0.2322) time: 3.2444 data: 0.0095 max mem: 33300 Epoch: [1] [2700/4276] eta: 1:22:10 lr: 4.8160575623262075e-05 loss: 0.2171 (0.2321) time: 3.2150 data: 0.0095 max mem: 33300 Epoch: [1] [2710/4276] eta: 1:21:39 lr: 4.8157933671684874e-05 loss: 0.2080 (0.2322) time: 3.1812 data: 0.0098 max mem: 33300 Epoch: [1] [2720/4276] eta: 1:21:09 lr: 4.815529170400336e-05 loss: 0.2344 (0.2322) time: 3.2041 data: 0.0099 max mem: 33300 Epoch: [1] [2730/4276] eta: 1:20:38 lr: 4.8152649720216466e-05 loss: 0.2236 (0.2322) time: 3.2380 data: 0.0098 max mem: 33300 Epoch: [1] [2740/4276] eta: 1:20:07 lr: 4.815000772032312e-05 loss: 0.2197 (0.2321) time: 3.2516 data: 0.0104 max mem: 33300 Epoch: [1] [2750/4276] eta: 1:19:37 lr: 4.8147365704322204e-05 loss: 0.2188 (0.2321) time: 3.2645 data: 0.0102 max mem: 33300 Epoch: [1] [2760/4276] eta: 1:19:07 lr: 4.8144723672212675e-05 loss: 0.2123 (0.2321) time: 3.3060 data: 0.0100 max mem: 33300 Epoch: [1] [2770/4276] eta: 1:18:36 lr: 4.814208162399344e-05 loss: 0.2183 (0.2321) time: 3.3080 data: 0.0104 max mem: 33300 Epoch: [1] [2780/4276] eta: 1:18:06 lr: 4.8139439559663415e-05 loss: 0.2262 (0.2321) time: 3.2548 data: 0.0101 max mem: 33300 Epoch: [1] [2790/4276] eta: 1:17:35 lr: 4.813679747922152e-05 loss: 0.2302 (0.2321) time: 3.2628 data: 0.0100 max mem: 33300 Epoch: [1] [2800/4276] eta: 1:17:04 lr: 4.8134155382666684e-05 loss: 0.2176 (0.2320) time: 3.2542 data: 0.0106 max mem: 33300 Epoch: [1] [2810/4276] eta: 1:16:33 lr: 4.813151326999782e-05 loss: 0.1931 (0.2319) time: 3.2278 data: 0.0107 max mem: 33300 
Epoch: [1] [2820/4276] eta: 1:16:03 lr: 4.8128871141213836e-05 loss: 0.2071 (0.2319) time: 3.2450 data: 0.0097 max mem: 33300 Epoch: [1] [2830/4276] eta: 1:15:32 lr: 4.812622899631366e-05 loss: 0.2081 (0.2318) time: 3.2491 data: 0.0096 max mem: 33300 Epoch: [1] [2840/4276] eta: 1:15:01 lr: 4.812358683529621e-05 loss: 0.2248 (0.2318) time: 3.2446 data: 0.0094 max mem: 33300 Epoch: [1] [2850/4276] eta: 1:14:30 lr: 4.8120944658160425e-05 loss: 0.2315 (0.2319) time: 3.2187 data: 0.0094 max mem: 33300 Epoch: [1] [2860/4276] eta: 1:13:59 lr: 4.811830246490518e-05 loss: 0.2310 (0.2319) time: 3.2045 data: 0.0096 max mem: 33300 Epoch: [1] [2870/4276] eta: 1:13:28 lr: 4.8115660255529425e-05 loss: 0.2185 (0.2319) time: 3.2163 data: 0.0100 max mem: 33300 Epoch: [1] [2880/4276] eta: 1:12:57 lr: 4.811301803003207e-05 loss: 0.2274 (0.2319) time: 3.2103 data: 0.0099 max mem: 33300 Epoch: [1] [2890/4276] eta: 1:12:26 lr: 4.811037578841204e-05 loss: 0.2274 (0.2319) time: 3.2076 data: 0.0099 max mem: 33300 Epoch: [1] [2900/4276] eta: 1:11:55 lr: 4.810773353066823e-05 loss: 0.2155 (0.2318) time: 3.2030 data: 0.0103 max mem: 33300 Epoch: [1] [2910/4276] eta: 1:11:24 lr: 4.810509125679958e-05 loss: 0.2150 (0.2318) time: 3.1972 data: 0.0098 max mem: 33300 Epoch: [1] [2920/4276] eta: 1:10:53 lr: 4.8102448966805005e-05 loss: 0.2102 (0.2318) time: 3.2171 data: 0.0096 max mem: 33300 Epoch: [1] [2930/4276] eta: 1:10:22 lr: 4.8099806660683415e-05 loss: 0.2088 (0.2318) time: 3.2321 data: 0.0099 max mem: 33300 Epoch: [1] [2940/4276] eta: 1:09:51 lr: 4.809716433843373e-05 loss: 0.2019 (0.2316) time: 3.2606 data: 0.0099 max mem: 33300 Epoch: [1] [2950/4276] eta: 1:09:20 lr: 4.809452200005486e-05 loss: 0.1957 (0.2316) time: 3.2547 data: 0.0103 max mem: 33300 Epoch: [1] [2960/4276] eta: 1:08:50 lr: 4.809187964554573e-05 loss: 0.2260 (0.2316) time: 3.2450 data: 0.0100 max mem: 33300 Epoch: [1] [2970/4276] eta: 1:08:19 lr: 4.808923727490527e-05 loss: 0.2415 (0.2317) time: 3.2681 data: 0.0095 max mem: 
33300 Epoch: [1] [2980/4276] eta: 1:07:48 lr: 4.8086594888132366e-05 loss: 0.2415 (0.2317) time: 3.2456 data: 0.0102 max mem: 33300 Epoch: [1] [2990/4276] eta: 1:07:17 lr: 4.8083952485225966e-05 loss: 0.2077 (0.2316) time: 3.2066 data: 0.0103 max mem: 33300 Epoch: [1] [3000/4276] eta: 1:06:45 lr: 4.808131006618496e-05 loss: 0.1981 (0.2315) time: 3.1341 data: 0.0093 max mem: 33300 Epoch: [1] [3010/4276] eta: 1:06:14 lr: 4.807866763100828e-05 loss: 0.2106 (0.2315) time: 3.1024 data: 0.0082 max mem: 33300 Epoch: [1] [3020/4276] eta: 1:05:42 lr: 4.807602517969484e-05 loss: 0.2106 (0.2315) time: 3.1229 data: 0.0081 max mem: 33300 Epoch: [1] [3030/4276] eta: 1:05:11 lr: 4.807338271224355e-05 loss: 0.2326 (0.2315) time: 3.1106 data: 0.0085 max mem: 33300 Epoch: [1] [3040/4276] eta: 1:04:39 lr: 4.807074022865332e-05 loss: 0.2444 (0.2317) time: 3.0886 data: 0.0091 max mem: 33300 Epoch: [1] [3050/4276] eta: 1:04:07 lr: 4.806809772892309e-05 loss: 0.2388 (0.2316) time: 3.0835 data: 0.0090 max mem: 33300 Epoch: [1] [3060/4276] eta: 1:03:36 lr: 4.806545521305176e-05 loss: 0.1891 (0.2315) time: 3.1104 data: 0.0083 max mem: 33300 Epoch: [1] [3070/4276] eta: 1:03:04 lr: 4.8062812681038236e-05 loss: 0.2030 (0.2316) time: 3.1270 data: 0.0083 max mem: 33300 Epoch: [1] [3080/4276] eta: 1:02:33 lr: 4.8060170132881454e-05 loss: 0.2215 (0.2316) time: 3.1233 data: 0.0087 max mem: 33300 Epoch: [1] [3090/4276] eta: 1:02:01 lr: 4.805752756858031e-05 loss: 0.2258 (0.2316) time: 3.1159 data: 0.0087 max mem: 33300 Epoch: [1] [3100/4276] eta: 1:01:30 lr: 4.805488498813373e-05 loss: 0.2304 (0.2316) time: 3.1166 data: 0.0084 max mem: 33300 Epoch: [1] [3110/4276] eta: 1:00:59 lr: 4.805224239154062e-05 loss: 0.2064 (0.2315) time: 3.1189 data: 0.0078 max mem: 33300 Epoch: [1] [3120/4276] eta: 1:00:27 lr: 4.8049599778799916e-05 loss: 0.1942 (0.2315) time: 3.1186 data: 0.0076 max mem: 33300 Epoch: [1] [3130/4276] eta: 0:59:56 lr: 4.804695714991051e-05 loss: 0.2194 (0.2314) time: 3.1281 data: 0.0078 max 
mem: 33300 Epoch: [1] [3140/4276] eta: 0:59:24 lr: 4.804431450487132e-05 loss: 0.2275 (0.2315) time: 3.1512 data: 0.0078 max mem: 33300 Epoch: [1] [3150/4276] eta: 0:58:53 lr: 4.804167184368127e-05 loss: 0.2331 (0.2315) time: 3.1608 data: 0.0077 max mem: 33300 Epoch: [1] [3160/4276] eta: 0:58:22 lr: 4.8039029166339264e-05 loss: 0.2219 (0.2315) time: 3.1550 data: 0.0078 max mem: 33300 Epoch: [1] [3170/4276] eta: 0:57:50 lr: 4.8036386472844225e-05 loss: 0.2161 (0.2315) time: 3.1545 data: 0.0078 max mem: 33300 Epoch: [1] [3180/4276] eta: 0:57:19 lr: 4.8033743763195054e-05 loss: 0.2237 (0.2316) time: 3.1426 data: 0.0080 max mem: 33300 Epoch: [1] [3190/4276] eta: 0:56:48 lr: 4.803110103739068e-05 loss: 0.2400 (0.2316) time: 3.1323 data: 0.0080 max mem: 33300 Epoch: [1] [3200/4276] eta: 0:56:16 lr: 4.802845829543001e-05 loss: 0.2123 (0.2315) time: 3.1371 data: 0.0078 max mem: 33300 Epoch: [1] [3210/4276] eta: 0:55:45 lr: 4.802581553731195e-05 loss: 0.2076 (0.2315) time: 3.1364 data: 0.0076 max mem: 33300 Epoch: [1] [3220/4276] eta: 0:55:13 lr: 4.802317276303542e-05 loss: 0.2189 (0.2315) time: 3.1308 data: 0.0074 max mem: 33300 Epoch: [1] [3230/4276] eta: 0:54:42 lr: 4.8020529972599346e-05 loss: 0.2196 (0.2315) time: 3.1080 data: 0.0075 max mem: 33300 Epoch: [1] [3240/4276] eta: 0:54:10 lr: 4.8017887166002626e-05 loss: 0.2258 (0.2315) time: 3.0882 data: 0.0076 max mem: 33300 Epoch: [1] [3250/4276] eta: 0:53:39 lr: 4.8015244343244165e-05 loss: 0.2258 (0.2315) time: 3.0837 data: 0.0080 max mem: 33300 Epoch: [1] [3260/4276] eta: 0:53:07 lr: 4.801260150432289e-05 loss: 0.2298 (0.2315) time: 3.0822 data: 0.0084 max mem: 33300 Epoch: [1] [3270/4276] eta: 0:52:36 lr: 4.8009958649237704e-05 loss: 0.2152 (0.2314) time: 3.1051 data: 0.0081 max mem: 33300 Epoch: [1] [3280/4276] eta: 0:52:04 lr: 4.8007315777987536e-05 loss: 0.2152 (0.2315) time: 3.1151 data: 0.0081 max mem: 33300 Epoch: [1] [3290/4276] eta: 0:51:33 lr: 4.800467289057128e-05 loss: 0.2345 (0.2315) time: 3.1116 data: 
0.0081 max mem: 33300 Epoch: [1] [3300/4276] eta: 0:51:02 lr: 4.800202998698786e-05 loss: 0.2373 (0.2315) time: 3.1170 data: 0.0078 max mem: 33300 Epoch: [1] [3310/4276] eta: 0:50:30 lr: 4.799938706723619e-05 loss: 0.2377 (0.2315) time: 3.1257 data: 0.0079 max mem: 33300 Epoch: [1] [3320/4276] eta: 0:49:59 lr: 4.7996744131315167e-05 loss: 0.2377 (0.2316) time: 3.1420 data: 0.0080 max mem: 33300 Epoch: [1] [3330/4276] eta: 0:49:28 lr: 4.799410117922371e-05 loss: 0.2039 (0.2316) time: 3.1591 data: 0.0079 max mem: 33300 Epoch: [1] [3340/4276] eta: 0:48:56 lr: 4.799145821096074e-05 loss: 0.2127 (0.2316) time: 3.1665 data: 0.0078 max mem: 33300 Epoch: [1] [3350/4276] eta: 0:48:25 lr: 4.798881522652515e-05 loss: 0.2127 (0.2315) time: 3.1681 data: 0.0078 max mem: 33300 Epoch: [1] [3360/4276] eta: 0:47:54 lr: 4.798617222591587e-05 loss: 0.2090 (0.2315) time: 3.1564 data: 0.0083 max mem: 33300 Epoch: [1] [3370/4276] eta: 0:47:22 lr: 4.79835292091318e-05 loss: 0.2344 (0.2315) time: 3.1353 data: 0.0082 max mem: 33300 Epoch: [1] [3380/4276] eta: 0:46:51 lr: 4.798088617617186e-05 loss: 0.2329 (0.2315) time: 3.1206 data: 0.0081 max mem: 33300 Epoch: [1] [3390/4276] eta: 0:46:19 lr: 4.797824312703495e-05 loss: 0.2398 (0.2316) time: 3.1105 data: 0.0082 max mem: 33300 Epoch: [1] [3400/4276] eta: 0:45:48 lr: 4.7975600061719986e-05 loss: 0.2497 (0.2316) time: 3.1132 data: 0.0081 max mem: 33300 Epoch: [1] [3410/4276] eta: 0:45:16 lr: 4.797295698022587e-05 loss: 0.2361 (0.2316) time: 3.1191 data: 0.0080 max mem: 33300 Epoch: [1] [3420/4276] eta: 0:44:45 lr: 4.797031388255154e-05 loss: 0.2299 (0.2316) time: 3.1202 data: 0.0082 max mem: 33300 Epoch: [1] [3430/4276] eta: 0:44:14 lr: 4.7967670768695874e-05 loss: 0.2317 (0.2317) time: 3.1193 data: 0.0083 max mem: 33300 Epoch: [1] [3440/4276] eta: 0:43:42 lr: 4.79650276386578e-05 loss: 0.2318 (0.2316) time: 3.1158 data: 0.0078 max mem: 33300 Epoch: [1] [3450/4276] eta: 0:43:11 lr: 4.796238449243623e-05 loss: 0.2314 (0.2316) time: 3.1177 
data: 0.0080 max mem: 33300 Epoch: [1] [3460/4276] eta: 0:42:39 lr: 4.795974133003005e-05 loss: 0.2314 (0.2316) time: 3.1208 data: 0.0085 max mem: 33300 Epoch: [1] [3470/4276] eta: 0:42:08 lr: 4.79570981514382e-05 loss: 0.2089 (0.2316) time: 3.1088 data: 0.0081 max mem: 33300 Epoch: [1] [3480/4276] eta: 0:41:36 lr: 4.795445495665958e-05 loss: 0.2190 (0.2316) time: 3.0811 data: 0.0081 max mem: 33300 Epoch: [1] [3490/4276] eta: 0:41:05 lr: 4.7951811745693086e-05 loss: 0.2323 (0.2316) time: 3.0905 data: 0.0090 max mem: 33300 Epoch: [1] [3500/4276] eta: 0:40:34 lr: 4.7949168518537643e-05 loss: 0.2324 (0.2316) time: 3.1255 data: 0.0097 max mem: 33300 Epoch: [1] [3510/4276] eta: 0:40:02 lr: 4.794652527519216e-05 loss: 0.2153 (0.2315) time: 3.1325 data: 0.0092 max mem: 33300 Epoch: [1] [3520/4276] eta: 0:39:31 lr: 4.794388201565554e-05 loss: 0.2137 (0.2315) time: 3.1266 data: 0.0081 max mem: 33300 Epoch: [1] [3530/4276] eta: 0:38:59 lr: 4.7941238739926686e-05 loss: 0.2165 (0.2315) time: 3.1378 data: 0.0085 max mem: 33300 Epoch: [1] [3540/4276] eta: 0:38:28 lr: 4.7938595448004516e-05 loss: 0.2165 (0.2315) time: 3.1566 data: 0.0096 max mem: 33300 Epoch: [1] [3550/4276] eta: 0:37:57 lr: 4.793595213988795e-05 loss: 0.2307 (0.2315) time: 3.1403 data: 0.0093 max mem: 33300 Epoch: [1] [3560/4276] eta: 0:37:25 lr: 4.793330881557587e-05 loss: 0.2292 (0.2315) time: 3.1225 data: 0.0091 max mem: 33300 Epoch: [1] [3570/4276] eta: 0:36:54 lr: 4.7930665475067206e-05 loss: 0.2292 (0.2315) time: 3.1236 data: 0.0099 max mem: 33300 Epoch: [1] [3580/4276] eta: 0:36:23 lr: 4.7928022118360846e-05 loss: 0.2134 (0.2314) time: 3.1273 data: 0.0100 max mem: 33300 Epoch: [1] [3590/4276] eta: 0:35:51 lr: 4.7925378745455726e-05 loss: 0.2121 (0.2314) time: 3.1290 data: 0.0095 max mem: 33300 Epoch: [1] [3600/4276] eta: 0:35:20 lr: 4.7922735356350735e-05 loss: 0.2227 (0.2314) time: 3.1243 data: 0.0093 max mem: 33300 Epoch: [1] [3610/4276] eta: 0:34:48 lr: 4.7920091951044776e-05 loss: 0.2301 (0.2314) 
time: 3.1067 data: 0.0088 max mem: 33300 Epoch: [1] [3620/4276] eta: 0:34:17 lr: 4.7917448529536776e-05 loss: 0.2356 (0.2314) time: 3.0731 data: 0.0080 max mem: 33300 Epoch: [1] [3630/4276] eta: 0:33:45 lr: 4.791480509182562e-05 loss: 0.2356 (0.2315) time: 3.0519 data: 0.0075 max mem: 33300 Epoch: [1] [3640/4276] eta: 0:33:14 lr: 4.791216163791023e-05 loss: 0.2245 (0.2315) time: 3.0532 data: 0.0073 max mem: 33300 Epoch: [1] [3650/4276] eta: 0:32:42 lr: 4.790951816778951e-05 loss: 0.2202 (0.2315) time: 3.0772 data: 0.0080 max mem: 33300 Epoch: [1] [3660/4276] eta: 0:32:11 lr: 4.790687468146237e-05 loss: 0.2166 (0.2315) time: 3.1041 data: 0.0087 max mem: 33300 Epoch: [1] [3670/4276] eta: 0:31:40 lr: 4.7904231178927713e-05 loss: 0.2375 (0.2315) time: 3.1172 data: 0.0091 max mem: 33300 Epoch: [1] [3680/4276] eta: 0:31:08 lr: 4.790158766018445e-05 loss: 0.2338 (0.2315) time: 3.1241 data: 0.0094 max mem: 33300 Epoch: [1] [3690/4276] eta: 0:30:37 lr: 4.789894412523148e-05 loss: 0.2255 (0.2315) time: 3.1324 data: 0.0094 max mem: 33300 Epoch: [1] [3700/4276] eta: 0:30:06 lr: 4.789630057406772e-05 loss: 0.2244 (0.2315) time: 3.1415 data: 0.0089 max mem: 33300 Epoch: [1] [3710/4276] eta: 0:29:34 lr: 4.789365700669207e-05 loss: 0.2219 (0.2314) time: 3.1441 data: 0.0093 max mem: 33300 Epoch: [1] [3720/4276] eta: 0:29:03 lr: 4.789101342310343e-05 loss: 0.1998 (0.2314) time: 3.1474 data: 0.0094 max mem: 33300 Epoch: [1] [3730/4276] eta: 0:28:32 lr: 4.788836982330072e-05 loss: 0.2056 (0.2314) time: 3.1449 data: 0.0087 max mem: 33300 Epoch: [1] [3740/4276] eta: 0:28:00 lr: 4.788572620728284e-05 loss: 0.2097 (0.2314) time: 3.1424 data: 0.0087 max mem: 33300 Epoch: [1] [3750/4276] eta: 0:27:29 lr: 4.7883082575048696e-05 loss: 0.2187 (0.2314) time: 3.1323 data: 0.0088 max mem: 33300 Epoch: [1] [3760/4276] eta: 0:26:57 lr: 4.7880438926597185e-05 loss: 0.2187 (0.2314) time: 3.1138 data: 0.0088 max mem: 33300 Epoch: [1] [3770/4276] eta: 0:26:26 lr: 4.787779526192722e-05 loss: 0.2307 
(0.2314) time: 3.0890 data: 0.0084 max mem: 33300 Epoch: [1] [3780/4276] eta: 0:25:55 lr: 4.787515158103772e-05 loss: 0.2243 (0.2313) time: 3.0804 data: 0.0084 max mem: 33300 Epoch: [1] [3790/4276] eta: 0:25:23 lr: 4.7872507883927567e-05 loss: 0.2188 (0.2313) time: 3.1097 data: 0.0091 max mem: 33300 Epoch: [1] [3800/4276] eta: 0:24:52 lr: 4.786986417059568e-05 loss: 0.2240 (0.2314) time: 3.1536 data: 0.0095 max mem: 33300 Epoch: [1] [3810/4276] eta: 0:24:21 lr: 4.786722044104096e-05 loss: 0.2152 (0.2314) time: 3.1354 data: 0.0089 max mem: 33300 Epoch: [1] [3820/4276] eta: 0:23:49 lr: 4.786457669526231e-05 loss: 0.2021 (0.2313) time: 3.0761 data: 0.0082 max mem: 33300 Epoch: [1] [3830/4276] eta: 0:23:18 lr: 4.786193293325865e-05 loss: 0.2198 (0.2313) time: 3.0818 data: 0.0085 max mem: 33300 Epoch: [1] [3840/4276] eta: 0:22:46 lr: 4.785928915502885e-05 loss: 0.2207 (0.2313) time: 3.0881 data: 0.0088 max mem: 33300 Epoch: [1] [3850/4276] eta: 0:22:15 lr: 4.785664536057185e-05 loss: 0.2111 (0.2312) time: 3.0581 data: 0.0088 max mem: 33300 Epoch: [1] [3860/4276] eta: 0:21:43 lr: 4.785400154988654e-05 loss: 0.2119 (0.2312) time: 3.0532 data: 0.0089 max mem: 33300 Epoch: [1] [3870/4276] eta: 0:21:12 lr: 4.785135772297182e-05 loss: 0.2119 (0.2312) time: 3.0942 data: 0.0087 max mem: 33300 Epoch: [1] [3880/4276] eta: 0:20:41 lr: 4.7848713879826604e-05 loss: 0.2119 (0.2311) time: 3.1686 data: 0.0088 max mem: 33300 Epoch: [1] [3890/4276] eta: 0:20:09 lr: 4.784607002044979e-05 loss: 0.2166 (0.2311) time: 3.1949 data: 0.0088 max mem: 33300 Epoch: [1] [3900/4276] eta: 0:19:38 lr: 4.7843426144840284e-05 loss: 0.2331 (0.2311) time: 3.1792 data: 0.0090 max mem: 33300 Epoch: [1] [3910/4276] eta: 0:19:07 lr: 4.7840782252996986e-05 loss: 0.2260 (0.2311) time: 3.1856 data: 0.0097 max mem: 33300 Epoch: [1] [3920/4276] eta: 0:18:36 lr: 4.78381383449188e-05 loss: 0.2135 (0.2310) time: 3.1578 data: 0.0090 max mem: 33300 Epoch: [1] [3930/4276] eta: 0:18:04 lr: 4.7835494420604635e-05 loss: 
0.2136 (0.2310) time: 3.1175 data: 0.0089 max mem: 33300 Epoch: [1] [3940/4276] eta: 0:17:33 lr: 4.783285048005338e-05 loss: 0.2206 (0.2310) time: 3.1235 data: 0.0094 max mem: 33300 Epoch: [1] [3950/4276] eta: 0:17:01 lr: 4.783020652326395e-05 loss: 0.2168 (0.2309) time: 3.1295 data: 0.0087 max mem: 33300 Epoch: [1] [3960/4276] eta: 0:16:30 lr: 4.782756255023526e-05 loss: 0.2236 (0.2310) time: 3.1239 data: 0.0088 max mem: 33300 Epoch: [1] [3970/4276] eta: 0:15:59 lr: 4.782491856096618e-05 loss: 0.2496 (0.2310) time: 3.1179 data: 0.0091 max mem: 33300 Epoch: [1] [3980/4276] eta: 0:15:27 lr: 4.782227455545565e-05 loss: 0.2273 (0.2309) time: 3.1227 data: 0.0091 max mem: 33300 Epoch: [1] [3990/4276] eta: 0:14:56 lr: 4.781963053370254e-05 loss: 0.2143 (0.2309) time: 3.1372 data: 0.0089 max mem: 33300 Epoch: [1] [4000/4276] eta: 0:14:25 lr: 4.781698649570577e-05 loss: 0.2080 (0.2309) time: 3.1045 data: 0.0079 max mem: 33300 Epoch: [1] [4010/4276] eta: 0:13:53 lr: 4.781434244146424e-05 loss: 0.2080 (0.2309) time: 3.0705 data: 0.0075 max mem: 33300 Epoch: [1] [4020/4276] eta: 0:13:22 lr: 4.781169837097685e-05 loss: 0.1984 (0.2309) time: 3.0915 data: 0.0086 max mem: 33300 Epoch: [1] [4030/4276] eta: 0:12:51 lr: 4.78090542842425e-05 loss: 0.2032 (0.2309) time: 3.0862 data: 0.0083 max mem: 33300 Epoch: [1] [4040/4276] eta: 0:12:19 lr: 4.78064101812601e-05 loss: 0.2259 (0.2309) time: 3.0663 data: 0.0077 max mem: 33300 Epoch: [1] [4050/4276] eta: 0:11:48 lr: 4.7803766062028546e-05 loss: 0.2054 (0.2308) time: 3.0651 data: 0.0083 max mem: 33300 Epoch: [1] [4060/4276] eta: 0:11:16 lr: 4.780112192654674e-05 loss: 0.1975 (0.2308) time: 3.0935 data: 0.0083 max mem: 33300 Epoch: [1] [4070/4276] eta: 0:10:45 lr: 4.779847777481357e-05 loss: 0.2200 (0.2308) time: 3.1354 data: 0.0088 max mem: 33300 Epoch: [1] [4080/4276] eta: 0:10:14 lr: 4.779583360682796e-05 loss: 0.2200 (0.2308) time: 3.1352 data: 0.0093 max mem: 33300 Epoch: [1] [4090/4276] eta: 0:09:42 lr: 4.77931894225888e-05 loss: 
0.2206 (0.2308) time: 3.1433 data: 0.0090 max mem: 33300 Epoch: [1] [4100/4276] eta: 0:09:11 lr: 4.7790545222095e-05 loss: 0.2360 (0.2308) time: 3.1645 data: 0.0089 max mem: 33300 Epoch: [1] [4110/4276] eta: 0:08:40 lr: 4.7787901005345435e-05 loss: 0.2360 (0.2308) time: 3.1590 data: 0.0088 max mem: 33300 Epoch: [1] [4120/4276] eta: 0:08:08 lr: 4.778525677233903e-05 loss: 0.2246 (0.2308) time: 3.1360 data: 0.0087 max mem: 33300 Epoch: [1] [4130/4276] eta: 0:07:37 lr: 4.7782612523074685e-05 loss: 0.2213 (0.2307) time: 3.1225 data: 0.0088 max mem: 33300 Epoch: [1] [4140/4276] eta: 0:07:06 lr: 4.77799682575513e-05 loss: 0.2213 (0.2307) time: 3.1310 data: 0.0084 max mem: 33300 Epoch: [1] [4150/4276] eta: 0:06:34 lr: 4.777732397576775e-05 loss: 0.2165 (0.2307) time: 3.1335 data: 0.0082 max mem: 33300 Epoch: [1] [4160/4276] eta: 0:06:03 lr: 4.7774679677722965e-05 loss: 0.2165 (0.2307) time: 3.1224 data: 0.0079 max mem: 33300 Epoch: [1] [4170/4276] eta: 0:05:32 lr: 4.777203536341584e-05 loss: 0.2361 (0.2307) time: 3.1212 data: 0.0078 max mem: 33300 Epoch: [1] [4180/4276] eta: 0:05:00 lr: 4.776939103284526e-05 loss: 0.2361 (0.2307) time: 3.1237 data: 0.0081 max mem: 33300 Epoch: [1] [4190/4276] eta: 0:04:29 lr: 4.776674668601014e-05 loss: 0.2117 (0.2307) time: 3.1249 data: 0.0084 max mem: 33300 Epoch: [1] [4200/4276] eta: 0:03:58 lr: 4.776410232290937e-05 loss: 0.2201 (0.2308) time: 3.1234 data: 0.0082 max mem: 33300 Epoch: [1] [4210/4276] eta: 0:03:26 lr: 4.776145794354185e-05 loss: 0.2292 (0.2308) time: 3.1227 data: 0.0080 max mem: 33300 Epoch: [1] [4220/4276] eta: 0:02:55 lr: 4.775881354790649e-05 loss: 0.2452 (0.2308) time: 3.1293 data: 0.0087 max mem: 33300 Epoch: [1] [4230/4276] eta: 0:02:24 lr: 4.775616913600217e-05 loss: 0.2606 (0.2309) time: 3.1291 data: 0.0092 max mem: 33300 Epoch: [1] [4240/4276] eta: 0:01:52 lr: 4.7753524707827814e-05 loss: 0.2471 (0.2309) time: 3.1239 data: 0.0086 max mem: 33300 Epoch: [1] [4250/4276] eta: 0:01:21 lr: 4.7750880263382295e-05 
loss: 0.2256 (0.2309) time: 3.1010 data: 0.0079 max mem: 33300 Epoch: [1] [4260/4276] eta: 0:00:50 lr: 4.774823580266453e-05 loss: 0.2305 (0.2309) time: 3.1172 data: 0.0089 max mem: 33300 Epoch: [1] [4270/4276] eta: 0:00:18 lr: 4.774559132567341e-05 loss: 0.2314 (0.2309) time: 3.1537 data: 0.0088 max mem: 33300 Epoch: [1] Total time: 3:43:20 Test: [ 0/21770] eta: 10:18:50 time: 1.7056 data: 1.6626 max mem: 33300 Test: [ 100/21770] eta: 0:20:13 time: 0.0401 data: 0.0012 max mem: 33300 Test: [ 200/21770] eta: 0:17:15 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 300/21770] eta: 0:16:13 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 400/21770] eta: 0:15:40 time: 0.0402 data: 0.0011 max mem: 33300 Test: [ 500/21770] eta: 0:15:19 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 600/21770] eta: 0:15:03 time: 0.0401 data: 0.0012 max mem: 33300 Test: [ 700/21770] eta: 0:14:51 time: 0.0404 data: 0.0011 max mem: 33300 Test: [ 800/21770] eta: 0:14:41 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 900/21770] eta: 0:14:32 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 1000/21770] eta: 0:14:23 time: 0.0387 data: 0.0012 max mem: 33300 Test: [ 1100/21770] eta: 0:14:14 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 1200/21770] eta: 0:14:05 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 1300/21770] eta: 0:13:58 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 1400/21770] eta: 0:13:51 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 1500/21770] eta: 0:13:44 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 1600/21770] eta: 0:13:38 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 1700/21770] eta: 0:13:33 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 1800/21770] eta: 0:13:27 time: 0.0394 data: 0.0012 max mem: 33300 Test: [ 1900/21770] eta: 0:13:22 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 2000/21770] eta: 0:13:17 time: 0.0393 data: 0.0012 max mem: 33300 Test: [ 2100/21770] eta: 0:13:11 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 2200/21770] eta: 0:13:06 time: 
0.0392 data: 0.0011 max mem: 33300 Test: [ 2300/21770] eta: 0:13:02 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 2400/21770] eta: 0:12:57 time: 0.0394 data: 0.0012 max mem: 33300 Test: [ 2500/21770] eta: 0:12:52 time: 0.0394 data: 0.0012 max mem: 33300 Test: [ 2600/21770] eta: 0:12:48 time: 0.0390 data: 0.0012 max mem: 33300 Test: [ 2700/21770] eta: 0:12:43 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 2800/21770] eta: 0:12:38 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 2900/21770] eta: 0:12:34 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 3000/21770] eta: 0:12:29 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 3100/21770] eta: 0:12:24 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 3200/21770] eta: 0:12:20 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 3300/21770] eta: 0:12:15 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 3400/21770] eta: 0:12:11 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 3500/21770] eta: 0:12:06 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 3600/21770] eta: 0:12:02 time: 0.0390 data: 0.0012 max mem: 33300 Test: [ 3700/21770] eta: 0:11:58 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 3800/21770] eta: 0:11:53 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 3900/21770] eta: 0:11:49 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 4000/21770] eta: 0:11:44 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 4100/21770] eta: 0:11:40 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 4200/21770] eta: 0:11:36 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 4300/21770] eta: 0:11:31 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 4400/21770] eta: 0:11:27 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 4500/21770] eta: 0:11:23 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 4600/21770] eta: 0:11:19 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 4700/21770] eta: 0:11:15 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 4800/21770] eta: 0:11:11 time: 0.0395 data: 0.0011 max mem: 33300 Test: [ 4900/21770] eta: 0:11:07 time: 
0.0398 data: 0.0011 max mem: 33300 Test: [ 5000/21770] eta: 0:11:03 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 5100/21770] eta: 0:10:59 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 5200/21770] eta: 0:10:55 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 5300/21770] eta: 0:10:51 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 5400/21770] eta: 0:10:47 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 5500/21770] eta: 0:10:43 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 5600/21770] eta: 0:10:39 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 5700/21770] eta: 0:10:35 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 5800/21770] eta: 0:10:30 time: 0.0388 data: 0.0012 max mem: 33300 Test: [ 5900/21770] eta: 0:10:26 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 6000/21770] eta: 0:10:22 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 6100/21770] eta: 0:10:18 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 6200/21770] eta: 0:10:14 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 6300/21770] eta: 0:10:10 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 6400/21770] eta: 0:10:06 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 6500/21770] eta: 0:10:02 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 6600/21770] eta: 0:09:58 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 6700/21770] eta: 0:09:54 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 6800/21770] eta: 0:09:50 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 6900/21770] eta: 0:09:46 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 7000/21770] eta: 0:09:42 time: 0.0395 data: 0.0011 max mem: 33300 Test: [ 7100/21770] eta: 0:09:38 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 7200/21770] eta: 0:09:34 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 7300/21770] eta: 0:09:30 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 7400/21770] eta: 0:09:26 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 7500/21770] eta: 0:09:22 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 7600/21770] eta: 0:09:18 time: 
0.0388 data: 0.0011 max mem: 33300 Test: [ 7700/21770] eta: 0:09:14 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 7800/21770] eta: 0:09:10 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 7900/21770] eta: 0:09:06 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 8000/21770] eta: 0:09:02 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 8100/21770] eta: 0:08:58 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 8200/21770] eta: 0:08:54 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 8300/21770] eta: 0:08:50 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 8400/21770] eta: 0:08:46 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 8500/21770] eta: 0:08:42 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 8600/21770] eta: 0:08:38 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 8700/21770] eta: 0:08:34 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 8800/21770] eta: 0:08:30 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 8900/21770] eta: 0:08:26 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 9000/21770] eta: 0:08:22 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 9100/21770] eta: 0:08:18 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 9200/21770] eta: 0:08:14 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 9300/21770] eta: 0:08:10 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 9400/21770] eta: 0:08:06 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 9500/21770] eta: 0:08:02 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 9600/21770] eta: 0:07:58 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 9700/21770] eta: 0:07:54 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 9800/21770] eta: 0:07:50 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 9900/21770] eta: 0:07:46 time: 0.0394 data: 0.0011 max mem: 33300 Test: [10000/21770] eta: 0:07:42 time: 0.0392 data: 0.0011 max mem: 33300 Test: [10100/21770] eta: 0:07:38 time: 0.0391 data: 0.0011 max mem: 33300 Test: [10200/21770] eta: 0:07:34 time: 0.0391 data: 0.0011 max mem: 33300 Test: [10300/21770] eta: 0:07:30 time: 
0.0393 data: 0.0011 max mem: 33300 Test: [10400/21770] eta: 0:07:26 time: 0.0389 data: 0.0011 max mem: 33300 Test: [10500/21770] eta: 0:07:22 time: 0.0388 data: 0.0011 max mem: 33300 Test: [10600/21770] eta: 0:07:18 time: 0.0388 data: 0.0011 max mem: 33300 Test: [10700/21770] eta: 0:07:14 time: 0.0388 data: 0.0011 max mem: 33300 Test: [10800/21770] eta: 0:07:10 time: 0.0396 data: 0.0011 max mem: 33300 Test: [10900/21770] eta: 0:07:06 time: 0.0398 data: 0.0011 max mem: 33300 Test: [11000/21770] eta: 0:07:02 time: 0.0402 data: 0.0011 max mem: 33300 Test: [11100/21770] eta: 0:06:59 time: 0.0399 data: 0.0011 max mem: 33300 Test: [11200/21770] eta: 0:06:55 time: 0.0402 data: 0.0011 max mem: 33300 Test: [11300/21770] eta: 0:06:51 time: 0.0398 data: 0.0011 max mem: 33300 Test: [11400/21770] eta: 0:06:47 time: 0.0402 data: 0.0011 max mem: 33300 Test: [11500/21770] eta: 0:06:43 time: 0.0399 data: 0.0012 max mem: 33300 Test: [11600/21770] eta: 0:06:39 time: 0.0402 data: 0.0011 max mem: 33300 Test: [11700/21770] eta: 0:06:35 time: 0.0398 data: 0.0011 max mem: 33300 Test: [11800/21770] eta: 0:06:31 time: 0.0400 data: 0.0011 max mem: 33300 Test: [11900/21770] eta: 0:06:28 time: 0.0398 data: 0.0011 max mem: 33300 Test: [12000/21770] eta: 0:06:24 time: 0.0400 data: 0.0012 max mem: 33300 Test: [12100/21770] eta: 0:06:20 time: 0.0401 data: 0.0011 max mem: 33300 Test: [12200/21770] eta: 0:06:16 time: 0.0404 data: 0.0011 max mem: 33300 Test: [12300/21770] eta: 0:06:12 time: 0.0405 data: 0.0011 max mem: 33300 Test: [12400/21770] eta: 0:06:08 time: 0.0403 data: 0.0011 max mem: 33300 Test: [12500/21770] eta: 0:06:04 time: 0.0403 data: 0.0011 max mem: 33300 Test: [12600/21770] eta: 0:06:01 time: 0.0407 data: 0.0011 max mem: 33300 Test: [12700/21770] eta: 0:05:57 time: 0.0406 data: 0.0012 max mem: 33300 Test: [12800/21770] eta: 0:05:53 time: 0.0403 data: 0.0011 max mem: 33300 Test: [12900/21770] eta: 0:05:49 time: 0.0405 data: 0.0011 max mem: 33300 Test: [13000/21770] eta: 0:05:45 time: 
0.0403 data: 0.0011 max mem: 33300 Test: [13100/21770] eta: 0:05:41 time: 0.0399 data: 0.0012 max mem: 33300 Test: [13200/21770] eta: 0:05:37 time: 0.0401 data: 0.0012 max mem: 33300 Test: [13300/21770] eta: 0:05:33 time: 0.0399 data: 0.0012 max mem: 33300 Test: [13400/21770] eta: 0:05:30 time: 0.0399 data: 0.0012 max mem: 33300 Test: [13500/21770] eta: 0:05:26 time: 0.0397 data: 0.0012 max mem: 33300 Test: [13600/21770] eta: 0:05:22 time: 0.0403 data: 0.0011 max mem: 33300 Test: [13700/21770] eta: 0:05:18 time: 0.0408 data: 0.0011 max mem: 33300 Test: [13800/21770] eta: 0:05:14 time: 0.0403 data: 0.0012 max mem: 33300 Test: [13900/21770] eta: 0:05:10 time: 0.0399 data: 0.0012 max mem: 33300 Test: [14000/21770] eta: 0:05:06 time: 0.0400 data: 0.0012 max mem: 33300 Test: [14100/21770] eta: 0:05:02 time: 0.0396 data: 0.0012 max mem: 33300 Test: [14200/21770] eta: 0:04:58 time: 0.0390 data: 0.0012 max mem: 33300 Test: [14300/21770] eta: 0:04:54 time: 0.0394 data: 0.0012 max mem: 33300 Test: [14400/21770] eta: 0:04:50 time: 0.0391 data: 0.0012 max mem: 33300 Test: [14500/21770] eta: 0:04:46 time: 0.0395 data: 0.0012 max mem: 33300 Test: [14600/21770] eta: 0:04:42 time: 0.0401 data: 0.0012 max mem: 33300 Test: [14700/21770] eta: 0:04:38 time: 0.0401 data: 0.0012 max mem: 33300 Test: [14800/21770] eta: 0:04:35 time: 0.0400 data: 0.0012 max mem: 33300 Test: [14900/21770] eta: 0:04:31 time: 0.0400 data: 0.0011 max mem: 33300 Test: [15000/21770] eta: 0:04:27 time: 0.0399 data: 0.0012 max mem: 33300 Test: [15100/21770] eta: 0:04:23 time: 0.0389 data: 0.0012 max mem: 33300 Test: [15200/21770] eta: 0:04:19 time: 0.0394 data: 0.0012 max mem: 33300 Test: [15300/21770] eta: 0:04:15 time: 0.0392 data: 0.0011 max mem: 33300 Test: [15400/21770] eta: 0:04:11 time: 0.0392 data: 0.0011 max mem: 33300 Test: [15500/21770] eta: 0:04:07 time: 0.0393 data: 0.0012 max mem: 33300 Test: [15600/21770] eta: 0:04:03 time: 0.0396 data: 0.0012 max mem: 33300 Test: [15700/21770] eta: 0:03:59 time: 
0.0395 data: 0.0012 max mem: 33300 Test: [15800/21770] eta: 0:03:55 time: 0.0392 data: 0.0012 max mem: 33300 Test: [15900/21770] eta: 0:03:51 time: 0.0393 data: 0.0011 max mem: 33300 Test: [16000/21770] eta: 0:03:47 time: 0.0396 data: 0.0011 max mem: 33300 Test: [16100/21770] eta: 0:03:43 time: 0.0396 data: 0.0011 max mem: 33300 Test: [16200/21770] eta: 0:03:39 time: 0.0401 data: 0.0012 max mem: 33300 Test: [16300/21770] eta: 0:03:35 time: 0.0400 data: 0.0011 max mem: 33300 Test: [16400/21770] eta: 0:03:31 time: 0.0390 data: 0.0009 max mem: 33300 Test: [16500/21770] eta: 0:03:27 time: 0.0396 data: 0.0010 max mem: 33300 Test: [16600/21770] eta: 0:03:24 time: 0.0389 data: 0.0010 max mem: 33300 Test: [16700/21770] eta: 0:03:20 time: 0.0384 data: 0.0010 max mem: 33300 Test: [16800/21770] eta: 0:03:16 time: 0.0386 data: 0.0011 max mem: 33300 Test: [16900/21770] eta: 0:03:12 time: 0.0388 data: 0.0012 max mem: 33300 Test: [17000/21770] eta: 0:03:08 time: 0.0387 data: 0.0011 max mem: 33300 Test: [17100/21770] eta: 0:03:04 time: 0.0389 data: 0.0011 max mem: 33300 Test: [17200/21770] eta: 0:03:00 time: 0.0388 data: 0.0011 max mem: 33300 Test: [17300/21770] eta: 0:02:56 time: 0.0389 data: 0.0011 max mem: 33300 Test: [17400/21770] eta: 0:02:52 time: 0.0389 data: 0.0012 max mem: 33300 Test: [17500/21770] eta: 0:02:48 time: 0.0391 data: 0.0011 max mem: 33300 Test: [17600/21770] eta: 0:02:44 time: 0.0390 data: 0.0011 max mem: 33300 Test: [17700/21770] eta: 0:02:40 time: 0.0392 data: 0.0011 max mem: 33300 Test: [17800/21770] eta: 0:02:36 time: 0.0388 data: 0.0012 max mem: 33300 Test: [17900/21770] eta: 0:02:32 time: 0.0388 data: 0.0011 max mem: 33300 Test: [18000/21770] eta: 0:02:28 time: 0.0388 data: 0.0011 max mem: 33300 Test: [18100/21770] eta: 0:02:24 time: 0.0389 data: 0.0011 max mem: 33300 Test: [18200/21770] eta: 0:02:20 time: 0.0389 data: 0.0011 max mem: 33300 Test: [18300/21770] eta: 0:02:16 time: 0.0388 data: 0.0011 max mem: 33300 Test: [18400/21770] eta: 0:02:12 time: 
0.0387 data: 0.0011 max mem: 33300 Test: [18500/21770] eta: 0:02:08 time: 0.0387 data: 0.0011 max mem: 33300 Test: [18600/21770] eta: 0:02:04 time: 0.0388 data: 0.0011 max mem: 33300 Test: [18700/21770] eta: 0:02:00 time: 0.0386 data: 0.0011 max mem: 33300 Test: [18800/21770] eta: 0:01:56 time: 0.0387 data: 0.0011 max mem: 33300 Test: [18900/21770] eta: 0:01:53 time: 0.0387 data: 0.0011 max mem: 33300 Test: [19000/21770] eta: 0:01:49 time: 0.0387 data: 0.0011 max mem: 33300 Test: [19100/21770] eta: 0:01:45 time: 0.0388 data: 0.0011 max mem: 33300 Test: [19200/21770] eta: 0:01:41 time: 0.0387 data: 0.0011 max mem: 33300 Test: [19300/21770] eta: 0:01:37 time: 0.0386 data: 0.0011 max mem: 33300 Test: [19400/21770] eta: 0:01:33 time: 0.0388 data: 0.0012 max mem: 33300 Test: [19500/21770] eta: 0:01:29 time: 0.0388 data: 0.0011 max mem: 33300 Test: [19600/21770] eta: 0:01:25 time: 0.0389 data: 0.0011 max mem: 33300 Test: [19700/21770] eta: 0:01:21 time: 0.0388 data: 0.0011 max mem: 33300 Test: [19800/21770] eta: 0:01:17 time: 0.0389 data: 0.0012 max mem: 33300 Test: [19900/21770] eta: 0:01:13 time: 0.0387 data: 0.0011 max mem: 33300 Test: [20000/21770] eta: 0:01:09 time: 0.0388 data: 0.0011 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0387 data: 0.0011 max mem: 33300 Test: [20200/21770] eta: 0:01:01 time: 0.0386 data: 0.0011 max mem: 33300 Test: [20300/21770] eta: 0:00:57 time: 0.0388 data: 0.0012 max mem: 33300 Test: [20400/21770] eta: 0:00:53 time: 0.0388 data: 0.0011 max mem: 33300 Test: [20500/21770] eta: 0:00:49 time: 0.0388 data: 0.0012 max mem: 33300 Test: [20600/21770] eta: 0:00:46 time: 0.0388 data: 0.0011 max mem: 33300 Test: [20700/21770] eta: 0:00:42 time: 0.0388 data: 0.0011 max mem: 33300 Test: [20800/21770] eta: 0:00:38 time: 0.0388 data: 0.0012 max mem: 33300 Test: [20900/21770] eta: 0:00:34 time: 0.0388 data: 0.0012 max mem: 33300 Test: [21000/21770] eta: 0:00:30 time: 0.0388 data: 0.0012 max mem: 33300 Test: [21100/21770] eta: 0:00:26 time: 
0.0388 data: 0.0011 max mem: 33300
Test: [21200/21770] eta: 0:00:22 time: 0.0388 data: 0.0012 max mem: 33300
Test: [21300/21770] eta: 0:00:18 time: 0.0387 data: 0.0012 max mem: 33300
Test: [21400/21770] eta: 0:00:14 time: 0.0388 data: 0.0011 max mem: 33300
Test: [21500/21770] eta: 0:00:10 time: 0.0386 data: 0.0011 max mem: 33300
Test: [21600/21770] eta: 0:00:06 time: 0.0388 data: 0.0011 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0388 data: 0.0011 max mem: 33300
Test: Total time: 0:14:15
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [2] [ 0/4276] eta: 6:16:23 lr: 4.774400463166706e-05 loss: 0.1869 (0.1869) time: 5.2814 data: 2.0109 max mem: 33300
Epoch: [2] [ 10/4276] eta: 3:52:42 lr: 4.7741360128636285e-05 loss: 0.2215 (0.2167) time: 3.2730 data: 0.1892 max mem: 33300
Epoch: [2] [ 20/4276] eta: 3:45:35 lr: 4.773871560932929e-05 loss: 0.2215 (0.2166) time: 3.0752 data: 0.0072 max mem: 33300
Epoch: [2] [ 30/4276] eta: 3:41:39 lr: 4.773607107374497e-05 loss: 0.2188 (0.2187) time: 3.0549 data: 0.0075 max mem: 33300
Epoch: [2] [ 40/4276] eta: 3:40:15 lr: 4.7733426521882224e-05 loss: 0.2153 (0.2158) time: 3.0561 data: 0.0082 max mem: 33300
Epoch: [2] [ 50/4276] eta: 3:39:08 lr: 4.773078195373996e-05 loss: 0.2089 (0.2159) time: 3.0790 data: 0.0084 max mem: 33300
Epoch: [2] [ 60/4276] eta: 3:38:20 lr: 4.772813736931707e-05 loss: 0.2089 (0.2169) time: 3.0817 data: 0.0084 max mem: 33300
Epoch: [2] [ 70/4276] eta: 3:37:29 lr: 4.772549276861244e-05 loss: 0.1996 (0.2159) time: 3.0806 data: 0.0083 max mem: 33300
Epoch: [2] [ 80/4276] eta: 3:36:52 lr: 4.7722848151624984e-05 loss: 0.2017 (0.2157) time: 3.0825 data: 0.0082 max mem: 33300
Epoch: [2] [ 90/4276] eta: 3:35:52 lr: 4.7720203518353595e-05 loss: 0.1987 (0.2134) time: 3.0646 data: 0.0083 max mem:
33300 Epoch: [2] [ 100/4276] eta: 3:35:13 lr: 4.7717558868797165e-05 loss: 0.1968 (0.2181) time: 3.0570 data: 0.0084 max mem: 33300 Epoch: [2] [ 110/4276] eta: 3:34:40 lr: 4.771491420295459e-05 loss: 0.2161 (0.2190) time: 3.0807 data: 0.0086 max mem: 33300 Epoch: [2] [ 120/4276] eta: 3:34:08 lr: 4.771226952082477e-05 loss: 0.2083 (0.2183) time: 3.0872 data: 0.0090 max mem: 33300 Epoch: [2] [ 130/4276] eta: 3:33:32 lr: 4.770962482240661e-05 loss: 0.2144 (0.2196) time: 3.0813 data: 0.0092 max mem: 33300 Epoch: [2] [ 140/4276] eta: 3:33:04 lr: 4.770698010769899e-05 loss: 0.2338 (0.2206) time: 3.0881 data: 0.0091 max mem: 33300 Epoch: [2] [ 150/4276] eta: 3:32:17 lr: 4.770433537670081e-05 loss: 0.2042 (0.2194) time: 3.0660 data: 0.0088 max mem: 33300 Epoch: [2] [ 160/4276] eta: 3:31:27 lr: 4.7701690629410976e-05 loss: 0.2037 (0.2196) time: 3.0213 data: 0.0081 max mem: 33300 Epoch: [2] [ 170/4276] eta: 3:30:43 lr: 4.769904586582838e-05 loss: 0.2212 (0.2204) time: 3.0204 data: 0.0077 max mem: 33300 Epoch: [2] [ 180/4276] eta: 3:30:11 lr: 4.769640108595191e-05 loss: 0.2293 (0.2213) time: 3.0522 data: 0.0083 max mem: 33300 Epoch: [2] [ 190/4276] eta: 3:29:39 lr: 4.7693756289780475e-05 loss: 0.2352 (0.2226) time: 3.0733 data: 0.0086 max mem: 33300 Epoch: [2] [ 200/4276] eta: 3:29:07 lr: 4.7691111477312964e-05 loss: 0.2290 (0.2235) time: 3.0730 data: 0.0089 max mem: 33300 Epoch: [2] [ 210/4276] eta: 3:28:35 lr: 4.768846664854827e-05 loss: 0.2162 (0.2235) time: 3.0729 data: 0.0091 max mem: 33300 Epoch: [2] [ 220/4276] eta: 3:28:04 lr: 4.7685821803485283e-05 loss: 0.2120 (0.2230) time: 3.0750 data: 0.0089 max mem: 33300 Epoch: [2] [ 230/4276] eta: 3:27:30 lr: 4.768317694212292e-05 loss: 0.2120 (0.2222) time: 3.0679 data: 0.0087 max mem: 33300 Epoch: [2] [ 240/4276] eta: 3:27:01 lr: 4.768053206446005e-05 loss: 0.2189 (0.2228) time: 3.0714 data: 0.0080 max mem: 33300 Epoch: [2] [ 250/4276] eta: 3:26:30 lr: 4.767788717049558e-05 loss: 0.2404 (0.2239) time: 3.0826 data: 0.0080 max 
mem: 33300 Epoch: [2] [ 260/4276] eta: 3:25:56 lr: 4.767524226022841e-05 loss: 0.2404 (0.2247) time: 3.0687 data: 0.0080 max mem: 33300 Epoch: [2] [ 270/4276] eta: 3:25:22 lr: 4.7672597333657425e-05 loss: 0.2354 (0.2253) time: 3.0548 data: 0.0082 max mem: 33300 Epoch: [2] [ 280/4276] eta: 3:24:42 lr: 4.766995239078152e-05 loss: 0.2211 (0.2252) time: 3.0307 data: 0.0079 max mem: 33300 Epoch: [2] [ 290/4276] eta: 3:24:04 lr: 4.76673074315996e-05 loss: 0.2140 (0.2248) time: 3.0173 data: 0.0075 max mem: 33300 Epoch: [2] [ 300/4276] eta: 3:23:35 lr: 4.7664662456110546e-05 loss: 0.2007 (0.2245) time: 3.0529 data: 0.0083 max mem: 33300 Epoch: [2] [ 310/4276] eta: 3:23:02 lr: 4.766201746431326e-05 loss: 0.2061 (0.2244) time: 3.0682 data: 0.0086 max mem: 33300 Epoch: [2] [ 320/4276] eta: 3:22:31 lr: 4.765937245620663e-05 loss: 0.2304 (0.2252) time: 3.0646 data: 0.0084 max mem: 33300 Epoch: [2] [ 330/4276] eta: 3:22:02 lr: 4.7656727431789554e-05 loss: 0.2308 (0.2252) time: 3.0800 data: 0.0087 max mem: 33300 Epoch: [2] [ 340/4276] eta: 3:21:32 lr: 4.765408239106092e-05 loss: 0.2203 (0.2250) time: 3.0801 data: 0.0086 max mem: 33300 Epoch: [2] [ 350/4276] eta: 3:21:01 lr: 4.7651437334019634e-05 loss: 0.2137 (0.2245) time: 3.0726 data: 0.0088 max mem: 33300 Epoch: [2] [ 360/4276] eta: 3:20:31 lr: 4.764879226066458e-05 loss: 0.2199 (0.2253) time: 3.0730 data: 0.0093 max mem: 33300 Epoch: [2] [ 370/4276] eta: 3:19:59 lr: 4.7646147170994654e-05 loss: 0.2199 (0.2248) time: 3.0710 data: 0.0088 max mem: 33300 Epoch: [2] [ 380/4276] eta: 3:19:26 lr: 4.764350206500874e-05 loss: 0.2178 (0.2249) time: 3.0558 data: 0.0084 max mem: 33300 Epoch: [2] [ 390/4276] eta: 3:18:54 lr: 4.764085694270574e-05 loss: 0.2265 (0.2254) time: 3.0514 data: 0.0084 max mem: 33300 Epoch: [2] [ 400/4276] eta: 3:18:20 lr: 4.763821180408455e-05 loss: 0.2267 (0.2255) time: 3.0470 data: 0.0085 max mem: 33300 Epoch: [2] [ 410/4276] eta: 3:17:51 lr: 4.763556664914405e-05 loss: 0.2227 (0.2253) time: 3.0663 data: 0.0086 
max mem: 33300 Epoch: [2] [ 420/4276] eta: 3:17:21 lr: 4.763292147788315e-05 loss: 0.2178 (0.2255) time: 3.0852 data: 0.0081 max mem: 33300 Epoch: [2] [ 430/4276] eta: 3:16:51 lr: 4.763027629030073e-05 loss: 0.2230 (0.2255) time: 3.0764 data: 0.0075 max mem: 33300 Epoch: [2] [ 440/4276] eta: 3:16:23 lr: 4.762763108639568e-05 loss: 0.2197 (0.2253) time: 3.0923 data: 0.0077 max mem: 33300 Epoch: [2] [ 450/4276] eta: 3:15:54 lr: 4.762498586616689e-05 loss: 0.2197 (0.2257) time: 3.0980 data: 0.0087 max mem: 33300 Epoch: [2] [ 460/4276] eta: 3:15:24 lr: 4.7622340629613274e-05 loss: 0.2163 (0.2252) time: 3.0854 data: 0.0092 max mem: 33300 Epoch: [2] [ 470/4276] eta: 3:14:53 lr: 4.761969537673369e-05 loss: 0.2043 (0.2248) time: 3.0764 data: 0.0089 max mem: 33300 Epoch: [2] [ 480/4276] eta: 3:14:22 lr: 4.761705010752707e-05 loss: 0.2051 (0.2245) time: 3.0675 data: 0.0082 max mem: 33300 Epoch: [2] [ 490/4276] eta: 3:13:51 lr: 4.761440482199226e-05 loss: 0.2060 (0.2245) time: 3.0687 data: 0.0079 max mem: 33300 Epoch: [2] [ 500/4276] eta: 3:13:21 lr: 4.761175952012818e-05 loss: 0.2137 (0.2244) time: 3.0773 data: 0.0082 max mem: 33300 Epoch: [2] [ 510/4276] eta: 3:12:48 lr: 4.760911420193372e-05 loss: 0.2050 (0.2240) time: 3.0619 data: 0.0082 max mem: 33300 Epoch: [2] [ 520/4276] eta: 3:12:16 lr: 4.7606468867407774e-05 loss: 0.2050 (0.2240) time: 3.0462 data: 0.0082 max mem: 33300 Epoch: [2] [ 530/4276] eta: 3:11:45 lr: 4.760382351654921e-05 loss: 0.2248 (0.2239) time: 3.0605 data: 0.0086 max mem: 33300 Epoch: [2] [ 540/4276] eta: 3:11:15 lr: 4.760117814935695e-05 loss: 0.2050 (0.2236) time: 3.0771 data: 0.0091 max mem: 33300 Epoch: [2] [ 550/4276] eta: 3:10:44 lr: 4.759853276582985e-05 loss: 0.2154 (0.2238) time: 3.0691 data: 0.0090 max mem: 33300 Epoch: [2] [ 560/4276] eta: 3:10:10 lr: 4.759588736596683e-05 loss: 0.2165 (0.2240) time: 3.0446 data: 0.0081 max mem: 33300 Epoch: [2] [ 570/4276] eta: 3:09:39 lr: 4.7593241949766764e-05 loss: 0.2131 (0.2240) time: 3.0489 data: 
0.0079 max mem: 33300 Epoch: [2] [ 580/4276] eta: 3:09:09 lr: 4.7590596517228554e-05 loss: 0.2199 (0.2237) time: 3.0688 data: 0.0082 max mem: 33300 Epoch: [2] [ 590/4276] eta: 3:08:35 lr: 4.758795106835108e-05 loss: 0.2059 (0.2234) time: 3.0482 data: 0.0080 max mem: 33300 Epoch: [2] [ 600/4276] eta: 3:08:07 lr: 4.758530560313323e-05 loss: 0.1998 (0.2233) time: 3.0659 data: 0.0081 max mem: 33300 Epoch: [2] [ 610/4276] eta: 3:07:38 lr: 4.7582660121573905e-05 loss: 0.2086 (0.2231) time: 3.1076 data: 0.0085 max mem: 33300 Epoch: [2] [ 620/4276] eta: 3:07:08 lr: 4.758001462367198e-05 loss: 0.2140 (0.2231) time: 3.0987 data: 0.0084 max mem: 33300 Epoch: [2] [ 630/4276] eta: 3:06:40 lr: 4.757736910942637e-05 loss: 0.2283 (0.2232) time: 3.0986 data: 0.0081 max mem: 33300 Epoch: [2] [ 640/4276] eta: 3:06:10 lr: 4.757472357883593e-05 loss: 0.2121 (0.2231) time: 3.0959 data: 0.0082 max mem: 33300 Epoch: [2] [ 650/4276] eta: 3:05:39 lr: 4.757207803189957e-05 loss: 0.2098 (0.2230) time: 3.0781 data: 0.0082 max mem: 33300 Epoch: [2] [ 660/4276] eta: 3:05:09 lr: 4.7569432468616174e-05 loss: 0.2207 (0.2230) time: 3.0756 data: 0.0088 max mem: 33300 Epoch: [2] [ 670/4276] eta: 3:04:38 lr: 4.7566786888984634e-05 loss: 0.2090 (0.2228) time: 3.0754 data: 0.0093 max mem: 33300 Epoch: [2] [ 680/4276] eta: 3:04:07 lr: 4.756414129300384e-05 loss: 0.2019 (0.2225) time: 3.0697 data: 0.0089 max mem: 33300 Epoch: [2] [ 690/4276] eta: 3:03:35 lr: 4.756149568067267e-05 loss: 0.2112 (0.2227) time: 3.0565 data: 0.0090 max mem: 33300 Epoch: [2] [ 700/4276] eta: 3:03:04 lr: 4.755885005199002e-05 loss: 0.2199 (0.2228) time: 3.0593 data: 0.0088 max mem: 33300 Epoch: [2] [ 710/4276] eta: 3:02:32 lr: 4.7556204406954785e-05 loss: 0.2343 (0.2230) time: 3.0608 data: 0.0086 max mem: 33300 Epoch: [2] [ 720/4276] eta: 3:01:59 lr: 4.755355874556584e-05 loss: 0.2268 (0.2229) time: 3.0317 data: 0.0081 max mem: 33300 Epoch: [2] [ 730/4276] eta: 3:01:26 lr: 4.755091306782208e-05 loss: 0.2252 (0.2232) time: 3.0249 
data: 0.0077 max mem: 33300 Epoch: [2] [ 740/4276] eta: 3:00:56 lr: 4.754826737372239e-05 loss: 0.2250 (0.2232) time: 3.0504 data: 0.0078 max mem: 33300 Epoch: [2] [ 750/4276] eta: 3:00:23 lr: 4.7545621663265665e-05 loss: 0.2048 (0.2231) time: 3.0483 data: 0.0079 max mem: 33300 Epoch: [2] [ 760/4276] eta: 2:59:53 lr: 4.754297593645078e-05 loss: 0.1902 (0.2228) time: 3.0552 data: 0.0080 max mem: 33300 Epoch: [2] [ 770/4276] eta: 2:59:22 lr: 4.754033019327663e-05 loss: 0.1991 (0.2228) time: 3.0731 data: 0.0081 max mem: 33300 Epoch: [2] [ 780/4276] eta: 2:58:52 lr: 4.75376844337421e-05 loss: 0.2128 (0.2228) time: 3.0814 data: 0.0081 max mem: 33300 Epoch: [2] [ 790/4276] eta: 2:58:21 lr: 4.753503865784609e-05 loss: 0.2128 (0.2229) time: 3.0709 data: 0.0077 max mem: 33300 Epoch: [2] [ 800/4276] eta: 2:57:51 lr: 4.753239286558746e-05 loss: 0.2186 (0.2228) time: 3.0742 data: 0.0081 max mem: 33300 Epoch: [2] [ 810/4276] eta: 2:57:20 lr: 4.7529747056965116e-05 loss: 0.2126 (0.2228) time: 3.0781 data: 0.0078 max mem: 33300 Epoch: [2] [ 820/4276] eta: 2:56:50 lr: 4.752710123197795e-05 loss: 0.2145 (0.2227) time: 3.0647 data: 0.0074 max mem: 33300 Epoch: [2] [ 830/4276] eta: 2:56:18 lr: 4.7524455390624835e-05 loss: 0.2347 (0.2231) time: 3.0645 data: 0.0084 max mem: 33300 Epoch: [2] [ 840/4276] eta: 2:55:47 lr: 4.7521809532904654e-05 loss: 0.2265 (0.2232) time: 3.0611 data: 0.0087 max mem: 33300 Epoch: [2] [ 850/4276] eta: 2:55:15 lr: 4.751916365881631e-05 loss: 0.2055 (0.2232) time: 3.0467 data: 0.0080 max mem: 33300 Epoch: [2] [ 860/4276] eta: 2:54:42 lr: 4.751651776835867e-05 loss: 0.1978 (0.2231) time: 3.0135 data: 0.0074 max mem: 33300 Epoch: [2] [ 870/4276] eta: 2:54:09 lr: 4.751387186153064e-05 loss: 0.2087 (0.2231) time: 3.0144 data: 0.0076 max mem: 33300 Epoch: [2] [ 880/4276] eta: 2:53:37 lr: 4.751122593833109e-05 loss: 0.2130 (0.2232) time: 3.0285 data: 0.0081 max mem: 33300 Epoch: [2] [ 890/4276] eta: 2:53:08 lr: 4.7508579998758914e-05 loss: 0.2197 (0.2232) time: 
3.0674 data: 0.0081 max mem: 33300 Epoch: [2] [ 900/4276] eta: 2:52:38 lr: 4.750593404281299e-05 loss: 0.2197 (0.2232) time: 3.0923 data: 0.0084 max mem: 33300 Epoch: [2] [ 910/4276] eta: 2:52:06 lr: 4.750328807049222e-05 loss: 0.2147 (0.2232) time: 3.0648 data: 0.0083 max mem: 33300 Epoch: [2] [ 920/4276] eta: 2:51:36 lr: 4.750064208179547e-05 loss: 0.2306 (0.2233) time: 3.0636 data: 0.0080 max mem: 33300 Epoch: [2] [ 930/4276] eta: 2:51:05 lr: 4.749799607672163e-05 loss: 0.2225 (0.2232) time: 3.0715 data: 0.0081 max mem: 33300 Epoch: [2] [ 940/4276] eta: 2:50:33 lr: 4.7495350055269594e-05 loss: 0.1986 (0.2230) time: 3.0507 data: 0.0078 max mem: 33300 Epoch: [2] [ 950/4276] eta: 2:50:01 lr: 4.749270401743824e-05 loss: 0.2004 (0.2229) time: 3.0315 data: 0.0076 max mem: 33300 Epoch: [2] [ 960/4276] eta: 2:49:31 lr: 4.749005796322645e-05 loss: 0.2084 (0.2230) time: 3.0485 data: 0.0078 max mem: 33300 Epoch: [2] [ 970/4276] eta: 2:49:01 lr: 4.748741189263311e-05 loss: 0.2167 (0.2229) time: 3.0922 data: 0.0080 max mem: 33300 Epoch: [2] [ 980/4276] eta: 2:48:33 lr: 4.74847658056571e-05 loss: 0.2216 (0.2232) time: 3.1184 data: 0.0085 max mem: 33300 Epoch: [2] [ 990/4276] eta: 2:48:04 lr: 4.748211970229732e-05 loss: 0.2226 (0.2231) time: 3.1193 data: 0.0087 max mem: 33300 Epoch: [2] [1000/4276] eta: 2:47:34 lr: 4.747947358255264e-05 loss: 0.2143 (0.2231) time: 3.1142 data: 0.0084 max mem: 33300 Epoch: [2] [1010/4276] eta: 2:47:03 lr: 4.7476827446421945e-05 loss: 0.2078 (0.2229) time: 3.0779 data: 0.0081 max mem: 33300 Epoch: [2] [1020/4276] eta: 2:46:30 lr: 4.747418129390413e-05 loss: 0.2078 (0.2229) time: 3.0305 data: 0.0075 max mem: 33300 Epoch: [2] [1030/4276] eta: 2:45:58 lr: 4.7471535124998065e-05 loss: 0.2230 (0.2231) time: 3.0172 data: 0.0071 max mem: 33300 Epoch: [2] [1040/4276] eta: 2:45:28 lr: 4.746888893970264e-05 loss: 0.2330 (0.2230) time: 3.0494 data: 0.0082 max mem: 33300 Epoch: [2] [1050/4276] eta: 2:44:58 lr: 4.746624273801673e-05 loss: 0.2246 (0.2232) 
time: 3.0800 data: 0.0090 max mem: 33300 Epoch: [2] [1060/4276] eta: 2:44:27 lr: 4.7463596519939235e-05 loss: 0.2234 (0.2233) time: 3.0804 data: 0.0089 max mem: 33300 Epoch: [2] [1070/4276] eta: 2:43:55 lr: 4.7460950285469024e-05 loss: 0.2044 (0.2232) time: 3.0471 data: 0.0081 max mem: 33300 Epoch: [2] [1080/4276] eta: 2:43:23 lr: 4.7458304034604986e-05 loss: 0.2055 (0.2231) time: 3.0191 data: 0.0071 max mem: 33300 Epoch: [2] [1090/4276] eta: 2:42:52 lr: 4.7455657767346e-05 loss: 0.2068 (0.2229) time: 3.0385 data: 0.0075 max mem: 33300 Epoch: [2] [1100/4276] eta: 2:42:21 lr: 4.745301148369095e-05 loss: 0.2060 (0.2229) time: 3.0610 data: 0.0085 max mem: 33300 Epoch: [2] [1110/4276] eta: 2:41:51 lr: 4.745036518363872e-05 loss: 0.2064 (0.2229) time: 3.0749 data: 0.0085 max mem: 33300 Epoch: [2] [1120/4276] eta: 2:41:20 lr: 4.74477188671882e-05 loss: 0.2272 (0.2231) time: 3.0718 data: 0.0080 max mem: 33300 Epoch: [2] [1130/4276] eta: 2:40:50 lr: 4.744507253433825e-05 loss: 0.2262 (0.2230) time: 3.0677 data: 0.0081 max mem: 33300 Epoch: [2] [1140/4276] eta: 2:40:19 lr: 4.744242618508777e-05 loss: 0.2131 (0.2229) time: 3.0742 data: 0.0084 max mem: 33300 Epoch: [2] [1150/4276] eta: 2:39:49 lr: 4.743977981943564e-05 loss: 0.2186 (0.2229) time: 3.0807 data: 0.0084 max mem: 33300 Epoch: [2] [1160/4276] eta: 2:39:19 lr: 4.743713343738073e-05 loss: 0.2187 (0.2230) time: 3.0844 data: 0.0082 max mem: 33300 Epoch: [2] [1170/4276] eta: 2:38:49 lr: 4.7434487038921935e-05 loss: 0.2310 (0.2231) time: 3.0896 data: 0.0080 max mem: 33300 Epoch: [2] [1180/4276] eta: 2:38:19 lr: 4.743184062405814e-05 loss: 0.2227 (0.2230) time: 3.0961 data: 0.0080 max mem: 33300 Epoch: [2] [1190/4276] eta: 2:37:49 lr: 4.742919419278821e-05 loss: 0.2042 (0.2229) time: 3.0913 data: 0.0080 max mem: 33300 Epoch: [2] [1200/4276] eta: 2:37:18 lr: 4.742654774511104e-05 loss: 0.2207 (0.2229) time: 3.0751 data: 0.0081 max mem: 33300 Epoch: [2] [1210/4276] eta: 2:36:47 lr: 4.74239012810255e-05 loss: 0.2168 (0.2228) 
time: 3.0640 data: 0.0084 max mem: 33300 Epoch: [2] [1220/4276] eta: 2:36:16 lr: 4.7421254800530486e-05 loss: 0.2035 (0.2228) time: 3.0594 data: 0.0082 max mem: 33300 Epoch: [2] [1230/4276] eta: 2:35:45 lr: 4.7418608303624864e-05 loss: 0.2089 (0.2228) time: 3.0608 data: 0.0081 max mem: 33300 Epoch: [2] [1240/4276] eta: 2:35:14 lr: 4.741596179030751e-05 loss: 0.2244 (0.2228) time: 3.0619 data: 0.0078 max mem: 33300 Epoch: [2] [1250/4276] eta: 2:34:44 lr: 4.741331526057733e-05 loss: 0.2244 (0.2229) time: 3.0632 data: 0.0076 max mem: 33300 Epoch: [2] [1260/4276] eta: 2:34:13 lr: 4.7410668714433174e-05 loss: 0.2007 (0.2227) time: 3.0654 data: 0.0082 max mem: 33300 Epoch: [2] [1270/4276] eta: 2:33:42 lr: 4.740802215187394e-05 loss: 0.2007 (0.2227) time: 3.0602 data: 0.0083 max mem: 33300 Epoch: [2] [1280/4276] eta: 2:33:11 lr: 4.740537557289851e-05 loss: 0.2190 (0.2228) time: 3.0594 data: 0.0078 max mem: 33300 Epoch: [2] [1290/4276] eta: 2:32:40 lr: 4.7402728977505765e-05 loss: 0.2190 (0.2229) time: 3.0626 data: 0.0078 max mem: 33300 Epoch: [2] [1300/4276] eta: 2:32:10 lr: 4.740008236569456e-05 loss: 0.2086 (0.2228) time: 3.0625 data: 0.0080 max mem: 33300 Epoch: [2] [1310/4276] eta: 2:31:39 lr: 4.7397435737463805e-05 loss: 0.2014 (0.2227) time: 3.0630 data: 0.0081 max mem: 33300 Epoch: [2] [1320/4276] eta: 2:31:08 lr: 4.739478909281236e-05 loss: 0.2202 (0.2228) time: 3.0654 data: 0.0081 max mem: 33300 Epoch: [2] [1330/4276] eta: 2:30:38 lr: 4.7392142431739115e-05 loss: 0.2115 (0.2226) time: 3.0790 data: 0.0079 max mem: 33300 Epoch: [2] [1340/4276] eta: 2:30:08 lr: 4.738949575424294e-05 loss: 0.1932 (0.2225) time: 3.1004 data: 0.0079 max mem: 33300 Epoch: [2] [1350/4276] eta: 2:29:38 lr: 4.7386849060322724e-05 loss: 0.2044 (0.2224) time: 3.1038 data: 0.0083 max mem: 33300 Epoch: [2] [1360/4276] eta: 2:29:08 lr: 4.738420234997734e-05 loss: 0.2144 (0.2224) time: 3.0980 data: 0.0085 max mem: 33300 Epoch: [2] [1370/4276] eta: 2:28:38 lr: 4.738155562320567e-05 loss: 0.2144 
(0.2223) time: 3.0955 data: 0.0083 max mem: 33300 Epoch: [2] [1380/4276] eta: 2:28:07 lr: 4.737890888000659e-05 loss: 0.2099 (0.2224) time: 3.0795 data: 0.0083 max mem: 33300 Epoch: [2] [1390/4276] eta: 2:27:36 lr: 4.737626212037897e-05 loss: 0.2208 (0.2223) time: 3.0651 data: 0.0082 max mem: 33300 Epoch: [2] [1400/4276] eta: 2:27:06 lr: 4.737361534432171e-05 loss: 0.2195 (0.2224) time: 3.0635 data: 0.0084 max mem: 33300 Epoch: [2] [1410/4276] eta: 2:26:35 lr: 4.737096855183367e-05 loss: 0.2195 (0.2225) time: 3.0688 data: 0.0085 max mem: 33300 Epoch: [2] [1420/4276] eta: 2:26:04 lr: 4.736832174291373e-05 loss: 0.2162 (0.2226) time: 3.0722 data: 0.0085 max mem: 33300 Epoch: [2] [1430/4276] eta: 2:25:34 lr: 4.736567491756077e-05 loss: 0.2081 (0.2226) time: 3.0687 data: 0.0083 max mem: 33300 Epoch: [2] [1440/4276] eta: 2:25:02 lr: 4.736302807577367e-05 loss: 0.2249 (0.2229) time: 3.0546 data: 0.0083 max mem: 33300 Epoch: [2] [1450/4276] eta: 2:24:31 lr: 4.736038121755131e-05 loss: 0.2308 (0.2229) time: 3.0511 data: 0.0085 max mem: 33300 Epoch: [2] [1460/4276] eta: 2:24:01 lr: 4.735773434289256e-05 loss: 0.2244 (0.2230) time: 3.0599 data: 0.0085 max mem: 33300 Epoch: [2] [1470/4276] eta: 2:23:30 lr: 4.7355087451796296e-05 loss: 0.2253 (0.2231) time: 3.0676 data: 0.0082 max mem: 33300 Epoch: [2] [1480/4276] eta: 2:22:59 lr: 4.735244054426139e-05 loss: 0.2219 (0.2231) time: 3.0671 data: 0.0079 max mem: 33300 Epoch: [2] [1490/4276] eta: 2:22:28 lr: 4.734979362028675e-05 loss: 0.2133 (0.2230) time: 3.0621 data: 0.0083 max mem: 33300 Epoch: [2] [1500/4276] eta: 2:21:58 lr: 4.7347146679871224e-05 loss: 0.2133 (0.2229) time: 3.0679 data: 0.0086 max mem: 33300 Epoch: [2] [1510/4276] eta: 2:21:28 lr: 4.7344499723013694e-05 loss: 0.1963 (0.2229) time: 3.0873 data: 0.0086 max mem: 33300 Epoch: [2] [1520/4276] eta: 2:20:58 lr: 4.734185274971303e-05 loss: 0.2063 (0.2228) time: 3.1053 data: 0.0085 max mem: 33300 Epoch: [2] [1530/4276] eta: 2:20:28 lr: 4.733920575996813e-05 loss: 
0.2083 (0.2228) time: 3.1076 data: 0.0085 max mem: 33300 Epoch: [2] [1540/4276] eta: 2:19:58 lr: 4.733655875377785e-05 loss: 0.2178 (0.2228) time: 3.1179 data: 0.0085 max mem: 33300 Epoch: [2] [1550/4276] eta: 2:19:28 lr: 4.7333911731141074e-05 loss: 0.2178 (0.2228) time: 3.1183 data: 0.0085 max mem: 33300 Epoch: [2] [1560/4276] eta: 2:18:57 lr: 4.7331264692056676e-05 loss: 0.2062 (0.2227) time: 3.0767 data: 0.0083 max mem: 33300 Epoch: [2] [1570/4276] eta: 2:18:26 lr: 4.7328617636523534e-05 loss: 0.1894 (0.2226) time: 3.0579 data: 0.0084 max mem: 33300 Epoch: [2] [1580/4276] eta: 2:17:56 lr: 4.732597056454051e-05 loss: 0.1894 (0.2224) time: 3.0752 data: 0.0089 max mem: 33300 Epoch: [2] [1590/4276] eta: 2:17:25 lr: 4.7323323476106504e-05 loss: 0.2082 (0.2223) time: 3.0749 data: 0.0086 max mem: 33300 Epoch: [2] [1600/4276] eta: 2:16:54 lr: 4.732067637122038e-05 loss: 0.2149 (0.2224) time: 3.0743 data: 0.0088 max mem: 33300 Epoch: [2] [1610/4276] eta: 2:16:24 lr: 4.7318029249881004e-05 loss: 0.2002 (0.2223) time: 3.0707 data: 0.0084 max mem: 33300 Epoch: [2] [1620/4276] eta: 2:15:53 lr: 4.731538211208726e-05 loss: 0.2096 (0.2223) time: 3.0722 data: 0.0082 max mem: 33300 Epoch: [2] [1630/4276] eta: 2:15:22 lr: 4.731273495783802e-05 loss: 0.2228 (0.2225) time: 3.0738 data: 0.0090 max mem: 33300 Epoch: [2] [1640/4276] eta: 2:14:52 lr: 4.731008778713217e-05 loss: 0.2303 (0.2226) time: 3.0618 data: 0.0086 max mem: 33300 Epoch: [2] [1650/4276] eta: 2:14:21 lr: 4.7307440599968564e-05 loss: 0.2194 (0.2226) time: 3.0596 data: 0.0080 max mem: 33300 Epoch: [2] [1660/4276] eta: 2:13:50 lr: 4.730479339634609e-05 loss: 0.2145 (0.2225) time: 3.0705 data: 0.0080 max mem: 33300 Epoch: [2] [1670/4276] eta: 2:13:19 lr: 4.730214617626362e-05 loss: 0.2209 (0.2226) time: 3.0724 data: 0.0079 max mem: 33300 Epoch: [2] [1680/4276] eta: 2:12:49 lr: 4.729949893972004e-05 loss: 0.2241 (0.2227) time: 3.0756 data: 0.0081 max mem: 33300 Epoch: [2] [1690/4276] eta: 2:12:18 lr: 4.729685168671419e-05 
loss: 0.2221 (0.2226) time: 3.0857 data: 0.0088 max mem: 33300 Epoch: [2] [1700/4276] eta: 2:11:48 lr: 4.7294204417244984e-05 loss: 0.2153 (0.2226) time: 3.1011 data: 0.0090 max mem: 33300 Epoch: [2] [1710/4276] eta: 2:11:18 lr: 4.7291557131311264e-05 loss: 0.2254 (0.2227) time: 3.1124 data: 0.0086 max mem: 33300 Epoch: [2] [1720/4276] eta: 2:10:48 lr: 4.728890982891192e-05 loss: 0.2310 (0.2228) time: 3.1094 data: 0.0089 max mem: 33300 Epoch: [2] [1730/4276] eta: 2:10:18 lr: 4.728626251004583e-05 loss: 0.2231 (0.2227) time: 3.1122 data: 0.0090 max mem: 33300 Epoch: [2] [1740/4276] eta: 2:09:48 lr: 4.728361517471185e-05 loss: 0.2104 (0.2227) time: 3.0953 data: 0.0084 max mem: 33300 Epoch: [2] [1750/4276] eta: 2:09:16 lr: 4.728096782290886e-05 loss: 0.2266 (0.2228) time: 3.0423 data: 0.0077 max mem: 33300 Epoch: [2] [1760/4276] eta: 2:08:45 lr: 4.7278320454635745e-05 loss: 0.2208 (0.2227) time: 3.0167 data: 0.0074 max mem: 33300 Epoch: [2] [1770/4276] eta: 2:08:14 lr: 4.727567306989136e-05 loss: 0.2208 (0.2227) time: 3.0479 data: 0.0079 max mem: 33300 Epoch: [2] [1780/4276] eta: 2:07:43 lr: 4.727302566867459e-05 loss: 0.2153 (0.2227) time: 3.0666 data: 0.0089 max mem: 33300 Epoch: [2] [1790/4276] eta: 2:07:12 lr: 4.7270378250984306e-05 loss: 0.2128 (0.2227) time: 3.0551 data: 0.0094 max mem: 33300 Epoch: [2] [1800/4276] eta: 2:06:41 lr: 4.726773081681938e-05 loss: 0.2230 (0.2227) time: 3.0609 data: 0.0086 max mem: 33300 Epoch: [2] [1810/4276] eta: 2:06:11 lr: 4.726508336617867e-05 loss: 0.2369 (0.2228) time: 3.0714 data: 0.0078 max mem: 33300 Epoch: [2] [1820/4276] eta: 2:05:40 lr: 4.7262435899061064e-05 loss: 0.2079 (0.2227) time: 3.0734 data: 0.0079 max mem: 33300 Epoch: [2] [1830/4276] eta: 2:05:09 lr: 4.725978841546543e-05 loss: 0.2035 (0.2227) time: 3.0729 data: 0.0077 max mem: 33300 Epoch: [2] [1840/4276] eta: 2:04:39 lr: 4.725714091539065e-05 loss: 0.2035 (0.2227) time: 3.0667 data: 0.0080 max mem: 33300 Epoch: [2] [1850/4276] eta: 2:04:08 lr: 
4.725449339883557e-05 loss: 0.2209 (0.2228) time: 3.0711 data: 0.0081 max mem: 33300 Epoch: [2] [1860/4276] eta: 2:03:37 lr: 4.725184586579908e-05 loss: 0.2256 (0.2228) time: 3.0796 data: 0.0077 max mem: 33300 Epoch: [2] [1870/4276] eta: 2:03:07 lr: 4.724919831628006e-05 loss: 0.2256 (0.2230) time: 3.0965 data: 0.0079 max mem: 33300 Epoch: [2] [1880/4276] eta: 2:02:37 lr: 4.7246550750277355e-05 loss: 0.2184 (0.2229) time: 3.1174 data: 0.0077 max mem: 33300 Epoch: [2] [1890/4276] eta: 2:02:07 lr: 4.724390316778985e-05 loss: 0.2126 (0.2228) time: 3.1168 data: 0.0074 max mem: 33300 Epoch: [2] [1900/4276] eta: 2:01:37 lr: 4.7241255568816426e-05 loss: 0.2062 (0.2227) time: 3.1102 data: 0.0075 max mem: 33300 Epoch: [2] [1910/4276] eta: 2:01:07 lr: 4.723860795335594e-05 loss: 0.2062 (0.2227) time: 3.1101 data: 0.0076 max mem: 33300 Epoch: [2] [1920/4276] eta: 2:00:36 lr: 4.723596032140727e-05 loss: 0.2063 (0.2225) time: 3.1113 data: 0.0077 max mem: 33300 Epoch: [2] [1930/4276] eta: 2:00:06 lr: 4.7233312672969276e-05 loss: 0.2044 (0.2225) time: 3.0965 data: 0.0077 max mem: 33300 Epoch: [2] [1940/4276] eta: 1:59:35 lr: 4.7230665008040836e-05 loss: 0.2122 (0.2226) time: 3.0828 data: 0.0076 max mem: 33300 Epoch: [2] [1950/4276] eta: 1:59:05 lr: 4.722801732662082e-05 loss: 0.2091 (0.2225) time: 3.0821 data: 0.0079 max mem: 33300 Epoch: [2] [1960/4276] eta: 1:58:34 lr: 4.7225369628708096e-05 loss: 0.2019 (0.2225) time: 3.0809 data: 0.0081 max mem: 33300 Epoch: [2] [1970/4276] eta: 1:58:03 lr: 4.722272191430154e-05 loss: 0.1903 (0.2224) time: 3.0772 data: 0.0079 max mem: 33300 Epoch: [2] [1980/4276] eta: 1:57:33 lr: 4.7220074183400015e-05 loss: 0.1874 (0.2222) time: 3.0765 data: 0.0080 max mem: 33300 Epoch: [2] [1990/4276] eta: 1:57:02 lr: 4.7217426436002394e-05 loss: 0.2045 (0.2222) time: 3.0714 data: 0.0078 max mem: 33300 Epoch: [2] [2000/4276] eta: 1:56:31 lr: 4.721477867210754e-05 loss: 0.2202 (0.2223) time: 3.0622 data: 0.0073 max mem: 33300 Epoch: [2] [2010/4276] eta: 
1:56:00 lr: 4.721213089171433e-05 loss: 0.2209 (0.2222) time: 3.0637 data: 0.0072 max mem: 33300 Epoch: [2] [2020/4276] eta: 1:55:30 lr: 4.720948309482163e-05 loss: 0.2221 (0.2223) time: 3.0780 data: 0.0074 max mem: 33300 Epoch: [2] [2030/4276] eta: 1:54:59 lr: 4.720683528142831e-05 loss: 0.2142 (0.2222) time: 3.0768 data: 0.0075 max mem: 33300 Epoch: [2] [2040/4276] eta: 1:54:28 lr: 4.7204187451533236e-05 loss: 0.1993 (0.2221) time: 3.0596 data: 0.0073 max mem: 33300 Epoch: [2] [2050/4276] eta: 1:53:56 lr: 4.720153960513527e-05 loss: 0.2078 (0.2222) time: 3.0302 data: 0.0078 max mem: 33300 Epoch: [2] [2060/4276] eta: 1:53:25 lr: 4.7198891742233296e-05 loss: 0.2123 (0.2221) time: 3.0072 data: 0.0080 max mem: 33300 Epoch: [2] [2070/4276] eta: 1:52:54 lr: 4.7196243862826186e-05 loss: 0.2109 (0.2220) time: 3.0336 data: 0.0079 max mem: 33300 Epoch: [2] [2080/4276] eta: 1:52:24 lr: 4.719359596691278e-05 loss: 0.2034 (0.2221) time: 3.0736 data: 0.0081 max mem: 33300 Epoch: [2] [2090/4276] eta: 1:51:53 lr: 4.7190948054491974e-05 loss: 0.2117 (0.2220) time: 3.1000 data: 0.0077 max mem: 33300 Epoch: [2] [2100/4276] eta: 1:51:23 lr: 4.718830012556262e-05 loss: 0.2023 (0.2219) time: 3.1106 data: 0.0075 max mem: 33300 Epoch: [2] [2110/4276] eta: 1:50:53 lr: 4.71856521801236e-05 loss: 0.2001 (0.2218) time: 3.1126 data: 0.0078 max mem: 33300 Epoch: [2] [2120/4276] eta: 1:50:22 lr: 4.718300421817376e-05 loss: 0.2030 (0.2218) time: 3.0827 data: 0.0079 max mem: 33300 Epoch: [2] [2130/4276] eta: 1:49:51 lr: 4.718035623971199e-05 loss: 0.1977 (0.2218) time: 3.0638 data: 0.0077 max mem: 33300 Epoch: [2] [2140/4276] eta: 1:49:21 lr: 4.717770824473715e-05 loss: 0.2135 (0.2218) time: 3.0778 data: 0.0076 max mem: 33300 Epoch: [2] [2150/4276] eta: 1:48:50 lr: 4.7175060233248105e-05 loss: 0.2139 (0.2218) time: 3.0771 data: 0.0078 max mem: 33300 Epoch: [2] [2160/4276] eta: 1:48:19 lr: 4.7172412205243715e-05 loss: 0.2137 (0.2219) time: 3.0806 data: 0.0079 max mem: 33300 Epoch: [2] [2170/4276] 
eta: 1:47:49 lr: 4.7169764160722854e-05 loss: 0.2155 (0.2219) time: 3.0792 data: 0.0078 max mem: 33300 Epoch: [2] [2180/4276] eta: 1:47:18 lr: 4.71671160996844e-05 loss: 0.2332 (0.2220) time: 3.0780 data: 0.0075 max mem: 33300 Epoch: [2] [2190/4276] eta: 1:46:47 lr: 4.7164468022127195e-05 loss: 0.2332 (0.2220) time: 3.0796 data: 0.0077 max mem: 33300 Epoch: [2] [2200/4276] eta: 1:46:17 lr: 4.716181992805012e-05 loss: 0.2237 (0.2220) time: 3.0787 data: 0.0076 max mem: 33300 Epoch: [2] [2210/4276] eta: 1:45:46 lr: 4.7159171817452044e-05 loss: 0.2230 (0.2220) time: 3.0715 data: 0.0078 max mem: 33300 Epoch: [2] [2220/4276] eta: 1:45:15 lr: 4.715652369033183e-05 loss: 0.2195 (0.2220) time: 3.0623 data: 0.0082 max mem: 33300 Epoch: [2] [2230/4276] eta: 1:44:44 lr: 4.715387554668834e-05 loss: 0.2195 (0.2220) time: 3.0685 data: 0.0079 max mem: 33300 Epoch: [2] [2240/4276] eta: 1:44:14 lr: 4.7151227386520443e-05 loss: 0.2104 (0.2219) time: 3.0821 data: 0.0079 max mem: 33300 Epoch: [2] [2250/4276] eta: 1:43:43 lr: 4.714857920982701e-05 loss: 0.2007 (0.2218) time: 3.0947 data: 0.0080 max mem: 33300 Epoch: [2] [2260/4276] eta: 1:43:13 lr: 4.71459310166069e-05 loss: 0.2063 (0.2219) time: 3.1048 data: 0.0078 max mem: 33300 Epoch: [2] [2270/4276] eta: 1:42:43 lr: 4.714328280685897e-05 loss: 0.2063 (0.2219) time: 3.1039 data: 0.0081 max mem: 33300 Epoch: [2] [2280/4276] eta: 1:42:12 lr: 4.7140634580582105e-05 loss: 0.2045 (0.2218) time: 3.0973 data: 0.0081 max mem: 33300 Epoch: [2] [2290/4276] eta: 1:41:41 lr: 4.7137986337775154e-05 loss: 0.2086 (0.2218) time: 3.0925 data: 0.0076 max mem: 33300 Epoch: [2] [2300/4276] eta: 1:41:11 lr: 4.7135338078437e-05 loss: 0.2006 (0.2216) time: 3.0921 data: 0.0074 max mem: 33300 Epoch: [2] [2310/4276] eta: 1:40:40 lr: 4.7132689802566476e-05 loss: 0.2009 (0.2216) time: 3.0663 data: 0.0074 max mem: 33300 Epoch: [2] [2320/4276] eta: 1:40:09 lr: 4.713004151016248e-05 loss: 0.2166 (0.2215) time: 3.0490 data: 0.0077 max mem: 33300 Epoch: [2] 
[2330/4276] eta: 1:39:38 lr: 4.712739320122386e-05 loss: 0.2094 (0.2214) time: 3.0657 data: 0.0078 max mem: 33300 Epoch: [2] [2340/4276] eta: 1:39:08 lr: 4.7124744875749485e-05 loss: 0.2094 (0.2214) time: 3.0672 data: 0.0079 max mem: 33300 Epoch: [2] [2350/4276] eta: 1:38:37 lr: 4.712209653373821e-05 loss: 0.2129 (0.2214) time: 3.0676 data: 0.0081 max mem: 33300 Epoch: [2] [2360/4276] eta: 1:38:06 lr: 4.711944817518891e-05 loss: 0.2129 (0.2213) time: 3.0668 data: 0.0080 max mem: 33300 Epoch: [2] [2370/4276] eta: 1:37:35 lr: 4.7116799800100444e-05 loss: 0.2131 (0.2214) time: 3.0688 data: 0.0081 max mem: 33300 Epoch: [2] [2380/4276] eta: 1:37:04 lr: 4.711415140847169e-05 loss: 0.2041 (0.2213) time: 3.0455 data: 0.0077 max mem: 33300 Epoch: [2] [2390/4276] eta: 1:36:33 lr: 4.711150300030148e-05 loss: 0.1933 (0.2213) time: 3.0281 data: 0.0082 max mem: 33300 Epoch: [2] [2400/4276] eta: 1:36:02 lr: 4.71088545755887e-05 loss: 0.2095 (0.2213) time: 3.0381 data: 0.0086 max mem: 33300 Epoch: [2] [2410/4276] eta: 1:35:31 lr: 4.710620613433222e-05 loss: 0.2123 (0.2213) time: 3.0528 data: 0.0080 max mem: 33300 Epoch: [2] [2420/4276] eta: 1:35:01 lr: 4.710355767653088e-05 loss: 0.2022 (0.2212) time: 3.0597 data: 0.0081 max mem: 33300 Epoch: [2] [2430/4276] eta: 1:34:30 lr: 4.7100909202183566e-05 loss: 0.2158 (0.2212) time: 3.0598 data: 0.0080 max mem: 33300 Epoch: [2] [2440/4276] eta: 1:33:59 lr: 4.709826071128913e-05 loss: 0.2264 (0.2212) time: 3.0948 data: 0.0077 max mem: 33300 Epoch: [2] [2450/4276] eta: 1:33:29 lr: 4.709561220384643e-05 loss: 0.2121 (0.2212) time: 3.1159 data: 0.0078 max mem: 33300 Epoch: [2] [2460/4276] eta: 1:32:58 lr: 4.7092963679854336e-05 loss: 0.2088 (0.2212) time: 3.0844 data: 0.0080 max mem: 33300 Epoch: [2] [2470/4276] eta: 1:32:27 lr: 4.7090315139311705e-05 loss: 0.2113 (0.2212) time: 3.0550 data: 0.0080 max mem: 33300 Epoch: [2] [2480/4276] eta: 1:31:56 lr: 4.708766658221741e-05 loss: 0.2195 (0.2212) time: 3.0402 data: 0.0077 max mem: 33300 Epoch: 
[2] [2490/4276] eta: 1:31:25 lr: 4.7085018008570296e-05 loss: 0.2177 (0.2211) time: 2.9883 data: 0.0081 max mem: 33300 Epoch: [2] [2500/4276] eta: 1:30:53 lr: 4.708236941836924e-05 loss: 0.2157 (0.2211) time: 2.9386 data: 0.0083 max mem: 33300 Epoch: [2] [2510/4276] eta: 1:30:21 lr: 4.7079720811613096e-05 loss: 0.2129 (0.2211) time: 2.9364 data: 0.0077 max mem: 33300 Epoch: [2] [2520/4276] eta: 1:29:50 lr: 4.707707218830073e-05 loss: 0.1953 (0.2209) time: 2.9349 data: 0.0077 max mem: 33300 Epoch: [2] [2530/4276] eta: 1:29:18 lr: 4.7074423548431005e-05 loss: 0.1726 (0.2208) time: 2.9083 data: 0.0082 max mem: 33300 Epoch: [2] [2540/4276] eta: 1:28:46 lr: 4.7071774892002775e-05 loss: 0.1852 (0.2207) time: 2.8897 data: 0.0082 max mem: 33300 Epoch: [2] [2550/4276] eta: 1:28:14 lr: 4.70691262190149e-05 loss: 0.2095 (0.2207) time: 2.9025 data: 0.0078 max mem: 33300 Epoch: [2] [2560/4276] eta: 1:27:43 lr: 4.706647752946626e-05 loss: 0.1865 (0.2206) time: 2.9302 data: 0.0077 max mem: 33300 Epoch: [2] [2570/4276] eta: 1:27:11 lr: 4.706382882335569e-05 loss: 0.1865 (0.2206) time: 2.9445 data: 0.0073 max mem: 33300 Epoch: [2] [2580/4276] eta: 1:26:40 lr: 4.7061180100682064e-05 loss: 0.2048 (0.2206) time: 2.9425 data: 0.0071 max mem: 33300 Epoch: [2] [2590/4276] eta: 1:26:08 lr: 4.705853136144425e-05 loss: 0.2073 (0.2205) time: 2.9435 data: 0.0072 max mem: 33300 Epoch: [2] [2600/4276] eta: 1:25:37 lr: 4.7055882605641095e-05 loss: 0.2081 (0.2206) time: 2.9456 data: 0.0072 max mem: 33300 Epoch: [2] [2610/4276] eta: 1:25:05 lr: 4.705323383327146e-05 loss: 0.2163 (0.2206) time: 2.9455 data: 0.0072 max mem: 33300 Epoch: [2] [2620/4276] eta: 1:24:34 lr: 4.705058504433422e-05 loss: 0.2148 (0.2206) time: 2.9485 data: 0.0072 max mem: 33300 Epoch: [2] [2630/4276] eta: 1:24:03 lr: 4.704793623882821e-05 loss: 0.2083 (0.2205) time: 2.9637 data: 0.0072 max mem: 33300 Epoch: [2] [2640/4276] eta: 1:23:32 lr: 4.704528741675231e-05 loss: 0.1901 (0.2204) time: 2.9697 data: 0.0073 max mem: 33300 
Epoch: [2] [2650/4276] eta: 1:23:00 lr: 4.704263857810539e-05 loss: 0.1908 (0.2204) time: 2.9727 data: 0.0074 max mem: 33300 Epoch: [2] [2660/4276] eta: 1:22:29 lr: 4.7039989722886276e-05 loss: 0.2208 (0.2204) time: 2.9744 data: 0.0073 max mem: 33300 Epoch: [2] [2670/4276] eta: 1:21:58 lr: 4.703734085109385e-05 loss: 0.2215 (0.2204) time: 2.9674 data: 0.0071 max mem: 33300 Epoch: [2] [2680/4276] eta: 1:21:27 lr: 4.703469196272696e-05 loss: 0.1988 (0.2204) time: 2.9663 data: 0.0070 max mem: 33300 Epoch: [2] [2690/4276] eta: 1:20:56 lr: 4.703204305778448e-05 loss: 0.2201 (0.2204) time: 2.9675 data: 0.0072 max mem: 33300 Epoch: [2] [2700/4276] eta: 1:20:24 lr: 4.702939413626526e-05 loss: 0.2037 (0.2203) time: 2.9679 data: 0.0070 max mem: 33300 Epoch: [2] [2710/4276] eta: 1:19:53 lr: 4.7026745198168154e-05 loss: 0.1966 (0.2203) time: 2.9730 data: 0.0065 max mem: 33300 Epoch: [2] [2720/4276] eta: 1:21:06 lr: 4.702409624349204e-05 loss: 0.2068 (0.2203) time: 12.0791 data: 9.1211 max mem: 33300 Epoch: [2] [2730/4276] eta: 1:20:34 lr: 4.7021447272235744e-05 loss: 0.2062 (0.2203) time: 12.0557 data: 9.1208 max mem: 33300 Epoch: [2] [2740/4276] eta: 1:20:01 lr: 4.7018798284398155e-05 loss: 0.2183 (0.2203) time: 2.9080 data: 0.0067 max mem: 33300 Epoch: [2] [2750/4276] eta: 1:19:29 lr: 4.7016149279978114e-05 loss: 0.2258 (0.2203) time: 2.9202 data: 0.0081 max mem: 33300 Epoch: [2] [2760/4276] eta: 1:18:57 lr: 4.701350025897448e-05 loss: 0.2026 (0.2203) time: 2.9517 data: 0.0089 max mem: 33300 Epoch: [2] [2770/4276] eta: 1:18:25 lr: 4.701085122138613e-05 loss: 0.1964 (0.2202) time: 2.9404 data: 0.0089 max mem: 33300 Epoch: [2] [2780/4276] eta: 1:17:53 lr: 4.7008202167211894e-05 loss: 0.1970 (0.2202) time: 2.9537 data: 0.0086 max mem: 33300 Epoch: [2] [2790/4276] eta: 1:17:20 lr: 4.7005553096450646e-05 loss: 0.2303 (0.2202) time: 2.9432 data: 0.0085 max mem: 33300 Epoch: [2] [2800/4276] eta: 1:16:48 lr: 4.700290400910125e-05 loss: 0.2203 (0.2202) time: 2.9379 data: 0.0083 max 
mem: 33300 Epoch: [2] [2810/4276] eta: 1:16:16 lr: 4.700025490516254e-05 loss: 0.1877 (0.2201) time: 2.9537 data: 0.0085 max mem: 33300 Epoch: [2] [2820/4276] eta: 1:15:44 lr: 4.6997605784633384e-05 loss: 0.2024 (0.2200) time: 2.9560 data: 0.0087 max mem: 33300 Epoch: [2] [2830/4276] eta: 1:15:12 lr: 4.699495664751265e-05 loss: 0.2044 (0.2200) time: 2.9494 data: 0.0088 max mem: 33300 Epoch: [2] [2840/4276] eta: 1:14:40 lr: 4.699230749379918e-05 loss: 0.2098 (0.2199) time: 2.9260 data: 0.0082 max mem: 33300 Epoch: [2] [2850/4276] eta: 1:14:07 lr: 4.698965832349185e-05 loss: 0.2197 (0.2201) time: 2.9013 data: 0.0077 max mem: 33300 Epoch: [2] [2860/4276] eta: 1:13:35 lr: 4.698700913658949e-05 loss: 0.2197 (0.2201) time: 2.8998 data: 0.0082 max mem: 33300 Epoch: [2] [2870/4276] eta: 1:13:03 lr: 4.698435993309098e-05 loss: 0.2148 (0.2201) time: 2.9112 data: 0.0082 max mem: 33300 Epoch: [2] [2880/4276] eta: 1:12:31 lr: 4.6981710712995166e-05 loss: 0.2203 (0.2201) time: 2.9196 data: 0.0077 max mem: 33300 Epoch: [2] [2890/4276] eta: 1:11:59 lr: 4.6979061476300896e-05 loss: 0.2209 (0.2201) time: 2.9386 data: 0.0078 max mem: 33300 Epoch: [2] [2900/4276] eta: 1:11:27 lr: 4.697641222300704e-05 loss: 0.2050 (0.2201) time: 2.9419 data: 0.0087 max mem: 33300 Epoch: [2] [2910/4276] eta: 1:10:55 lr: 4.697376295311244e-05 loss: 0.2057 (0.2201) time: 2.9303 data: 0.0087 max mem: 33300 Epoch: [2] [2920/4276] eta: 1:10:23 lr: 4.697111366661597e-05 loss: 0.2057 (0.2200) time: 2.9376 data: 0.0084 max mem: 33300 Epoch: [2] [2930/4276] eta: 1:09:51 lr: 4.696846436351648e-05 loss: 0.1928 (0.2200) time: 2.9560 data: 0.0088 max mem: 33300 Epoch: [2] [2940/4276] eta: 1:09:19 lr: 4.696581504381281e-05 loss: 0.1918 (0.2199) time: 2.9231 data: 0.0088 max mem: 33300 Epoch: [2] [2950/4276] eta: 1:08:46 lr: 4.6963165707503834e-05 loss: 0.1871 (0.2199) time: 2.8837 data: 0.0086 max mem: 33300 Epoch: [2] [2960/4276] eta: 1:08:14 lr: 4.69605163545884e-05 loss: 0.2100 (0.2199) time: 2.8946 data: 0.0080 
max mem: 33300 Epoch: [2] [2970/4276] eta: 1:07:43 lr: 4.6957866985065355e-05 loss: 0.2184 (0.2200) time: 2.9155 data: 0.0075 max mem: 33300 Epoch: [2] [2980/4276] eta: 1:07:10 lr: 4.695521759893356e-05 loss: 0.2186 (0.2199) time: 2.9127 data: 0.0072 max mem: 33300 Epoch: [2] [2990/4276] eta: 1:06:39 lr: 4.695256819619187e-05 loss: 0.2076 (0.2199) time: 2.9106 data: 0.0070 max mem: 33300 Epoch: [2] [3000/4276] eta: 1:06:07 lr: 4.694991877683915e-05 loss: 0.2030 (0.2198) time: 2.9121 data: 0.0070 max mem: 33300 Epoch: [2] [3010/4276] eta: 1:05:35 lr: 4.694726934087424e-05 loss: 0.2104 (0.2198) time: 2.9056 data: 0.0070 max mem: 33300 Epoch: [2] [3020/4276] eta: 1:05:03 lr: 4.694461988829599e-05 loss: 0.2071 (0.2198) time: 2.8854 data: 0.0072 max mem: 33300 Epoch: [2] [3030/4276] eta: 1:04:30 lr: 4.694197041910327e-05 loss: 0.1952 (0.2198) time: 2.8668 data: 0.0081 max mem: 33300 Epoch: [2] [3040/4276] eta: 1:03:58 lr: 4.6939320933294926e-05 loss: 0.2211 (0.2199) time: 2.8684 data: 0.0085 max mem: 33300 Epoch: [2] [3050/4276] eta: 1:03:26 lr: 4.693667143086982e-05 loss: 0.2278 (0.2198) time: 2.8690 data: 0.0084 max mem: 33300 Epoch: [2] [3060/4276] eta: 1:02:54 lr: 4.693402191182678e-05 loss: 0.1962 (0.2197) time: 2.8675 data: 0.0080 max mem: 33300 Epoch: [2] [3070/4276] eta: 1:02:23 lr: 4.693137237616469e-05 loss: 0.1969 (0.2197) time: 2.8924 data: 0.0079 max mem: 33300 Epoch: [2] [3080/4276] eta: 1:01:51 lr: 4.692872282388239e-05 loss: 0.2033 (0.2197) time: 2.8959 data: 0.0081 max mem: 33300 Epoch: [2] [3090/4276] eta: 1:01:19 lr: 4.692607325497873e-05 loss: 0.2033 (0.2197) time: 2.8967 data: 0.0083 max mem: 33300 Epoch: [2] [3100/4276] eta: 1:00:47 lr: 4.6923423669452566e-05 loss: 0.2153 (0.2197) time: 2.9053 data: 0.0086 max mem: 33300 Epoch: [2] [3110/4276] eta: 1:00:16 lr: 4.692077406730275e-05 loss: 0.2080 (0.2196) time: 2.9109 data: 0.0088 max mem: 33300 Epoch: [2] [3120/4276] eta: 0:59:44 lr: 4.6918124448528136e-05 loss: 0.1874 (0.2196) time: 2.9476 data: 
0.0087 max mem: 33300 Epoch: [2] [3130/4276] eta: 0:59:12 lr: 4.691547481312758e-05 loss: 0.2016 (0.2195) time: 2.9565 data: 0.0086 max mem: 33300 Epoch: [2] [3140/4276] eta: 0:58:41 lr: 4.6912825161099925e-05 loss: 0.2121 (0.2196) time: 2.9600 data: 0.0084 max mem: 33300 Epoch: [2] [3150/4276] eta: 0:58:09 lr: 4.691017549244403e-05 loss: 0.2202 (0.2196) time: 2.9592 data: 0.0080 max mem: 33300 Epoch: [2] [3160/4276] eta: 0:57:38 lr: 4.6907525807158756e-05 loss: 0.2148 (0.2196) time: 2.9362 data: 0.0079 max mem: 33300 Epoch: [2] [3170/4276] eta: 0:57:06 lr: 4.6904876105242935e-05 loss: 0.2085 (0.2196) time: 2.9430 data: 0.0078 max mem: 33300 Epoch: [2] [3180/4276] eta: 0:56:35 lr: 4.690222638669543e-05 loss: 0.2138 (0.2197) time: 2.9464 data: 0.0078 max mem: 33300 Epoch: [2] [3190/4276] eta: 0:56:03 lr: 4.689957665151509e-05 loss: 0.2261 (0.2197) time: 2.9333 data: 0.0081 max mem: 33300 Epoch: [2] [3200/4276] eta: 0:55:32 lr: 4.689692689970077e-05 loss: 0.2100 (0.2197) time: 2.9474 data: 0.0081 max mem: 33300 Epoch: [2] [3210/4276] eta: 0:55:00 lr: 4.689427713125132e-05 loss: 0.1988 (0.2197) time: 2.9601 data: 0.0076 max mem: 33300 Epoch: [2] [3220/4276] eta: 0:54:29 lr: 4.6891627346165585e-05 loss: 0.2150 (0.2196) time: 2.9469 data: 0.0076 max mem: 33300 Epoch: [2] [3230/4276] eta: 0:53:57 lr: 4.688897754444242e-05 loss: 0.2150 (0.2196) time: 2.9270 data: 0.0081 max mem: 33300 Epoch: [2] [3240/4276] eta: 0:53:26 lr: 4.688632772608069e-05 loss: 0.2191 (0.2196) time: 2.8966 data: 0.0081 max mem: 33300 Epoch: [2] [3250/4276] eta: 0:52:54 lr: 4.688367789107922e-05 loss: 0.2132 (0.2196) time: 2.8699 data: 0.0078 max mem: 33300 Epoch: [2] [3260/4276] eta: 0:52:23 lr: 4.688102803943687e-05 loss: 0.2091 (0.2196) time: 2.8895 data: 0.0085 max mem: 33300 Epoch: [2] [3270/4276] eta: 0:51:51 lr: 4.68783781711525e-05 loss: 0.2025 (0.2196) time: 2.9229 data: 0.0085 max mem: 33300 Epoch: [2] [3280/4276] eta: 0:51:20 lr: 4.6875728286224954e-05 loss: 0.2025 (0.2196) time: 2.9289 
data: 0.0080 max mem: 33300 Epoch: [2] [3290/4276] eta: 0:50:48 lr: 4.687307838465309e-05 loss: 0.2205 (0.2196) time: 2.9218 data: 0.0082 max mem: 33300 Epoch: [2] [3300/4276] eta: 0:50:17 lr: 4.6870428466435736e-05 loss: 0.2353 (0.2197) time: 2.9492 data: 0.0079 max mem: 33300 Epoch: [2] [3310/4276] eta: 0:49:46 lr: 4.686777853157176e-05 loss: 0.2353 (0.2197) time: 2.9646 data: 0.0076 max mem: 33300 Epoch: [2] [3320/4276] eta: 0:49:14 lr: 4.686512858006001e-05 loss: 0.2224 (0.2197) time: 2.9431 data: 0.0075 max mem: 33300 Epoch: [2] [3330/4276] eta: 0:48:43 lr: 4.6862478611899326e-05 loss: 0.2027 (0.2197) time: 2.9283 data: 0.0077 max mem: 33300 Epoch: [2] [3340/4276] eta: 0:48:12 lr: 4.685982862708857e-05 loss: 0.2027 (0.2197) time: 2.9410 data: 0.0079 max mem: 33300 Epoch: [2] [3350/4276] eta: 0:47:40 lr: 4.685717862562658e-05 loss: 0.2046 (0.2196) time: 2.9600 data: 0.0079 max mem: 33300 Epoch: [2] [3360/4276] eta: 0:47:09 lr: 4.6854528607512216e-05 loss: 0.2034 (0.2196) time: 2.9451 data: 0.0080 max mem: 33300 Epoch: [2] [3370/4276] eta: 0:46:38 lr: 4.685187857274431e-05 loss: 0.2232 (0.2197) time: 2.9254 data: 0.0080 max mem: 33300 Epoch: [2] [3380/4276] eta: 0:46:06 lr: 4.684922852132173e-05 loss: 0.2202 (0.2196) time: 2.9294 data: 0.0077 max mem: 33300 Epoch: [2] [3390/4276] eta: 0:45:35 lr: 4.6846578453243325e-05 loss: 0.2197 (0.2197) time: 2.9277 data: 0.0073 max mem: 33300 Epoch: [2] [3400/4276] eta: 0:45:04 lr: 4.684392836850792e-05 loss: 0.2196 (0.2197) time: 2.9277 data: 0.0073 max mem: 33300 Epoch: [2] [3410/4276] eta: 0:44:33 lr: 4.684127826711438e-05 loss: 0.2175 (0.2197) time: 2.9551 data: 0.0072 max mem: 33300 Epoch: [2] [3420/4276] eta: 0:44:01 lr: 4.683862814906156e-05 loss: 0.2134 (0.2197) time: 2.9645 data: 0.0072 max mem: 33300 Epoch: [2] [3430/4276] eta: 0:43:30 lr: 4.6835978014348294e-05 loss: 0.2280 (0.2197) time: 2.9174 data: 0.0079 max mem: 33300 Epoch: [2] [3440/4276] eta: 0:42:59 lr: 4.683332786297343e-05 loss: 0.2235 (0.2197) time: 
2.8819 data: 0.0085 max mem: 33300 Epoch: [2] [3450/4276] eta: 0:42:27 lr: 4.683067769493582e-05 loss: 0.2046 (0.2197) time: 2.9044 data: 0.0081 max mem: 33300 Epoch: [2] [3460/4276] eta: 0:41:56 lr: 4.682802751023431e-05 loss: 0.2157 (0.2197) time: 2.9136 data: 0.0081 max mem: 33300 Epoch: [2] [3470/4276] eta: 0:41:25 lr: 4.682537730886776e-05 loss: 0.1994 (0.2197) time: 2.8905 data: 0.0083 max mem: 33300 Epoch: [2] [3480/4276] eta: 0:40:54 lr: 4.682272709083501e-05 loss: 0.2125 (0.2197) time: 2.8728 data: 0.0082 max mem: 33300 Epoch: [2] [3490/4276] eta: 0:40:22 lr: 4.682007685613489e-05 loss: 0.2253 (0.2197) time: 2.8950 data: 0.0082 max mem: 33300 Epoch: [2] [3500/4276] eta: 0:39:51 lr: 4.6817426604766265e-05 loss: 0.2147 (0.2197) time: 2.9132 data: 0.0082 max mem: 33300 Epoch: [2] [3510/4276] eta: 0:39:20 lr: 4.681477633672798e-05 loss: 0.2017 (0.2196) time: 2.9227 data: 0.0083 max mem: 33300 Epoch: [2] [3520/4276] eta: 0:38:49 lr: 4.6812126052018865e-05 loss: 0.2068 (0.2196) time: 2.9391 data: 0.0082 max mem: 33300 Epoch: [2] [3530/4276] eta: 0:38:18 lr: 4.680947575063779e-05 loss: 0.2143 (0.2196) time: 2.9490 data: 0.0079 max mem: 33300 Epoch: [2] [3540/4276] eta: 0:37:47 lr: 4.68068254325836e-05 loss: 0.2106 (0.2196) time: 2.9648 data: 0.0075 max mem: 33300 Epoch: [2] [3550/4276] eta: 0:37:16 lr: 4.680417509785512e-05 loss: 0.2311 (0.2196) time: 2.9657 data: 0.0072 max mem: 33300 Epoch: [2] [3560/4276] eta: 0:36:45 lr: 4.6801524746451215e-05 loss: 0.2203 (0.2196) time: 2.9450 data: 0.0077 max mem: 33300 Epoch: [2] [3570/4276] eta: 0:36:13 lr: 4.679887437837072e-05 loss: 0.2176 (0.2196) time: 2.9162 data: 0.0083 max mem: 33300 Epoch: [2] [3580/4276] eta: 0:35:42 lr: 4.6796223993612494e-05 loss: 0.2008 (0.2196) time: 2.9079 data: 0.0084 max mem: 33300 Epoch: [2] [3590/4276] eta: 0:35:11 lr: 4.679357359217536e-05 loss: 0.1949 (0.2196) time: 2.9215 data: 0.0087 max mem: 33300 Epoch: [2] [3600/4276] eta: 0:34:40 lr: 4.6790923174058185e-05 loss: 0.2147 (0.2196) 
time: 2.9336 data: 0.0095 max mem: 33300 Epoch: [2] [3610/4276] eta: 0:34:09 lr: 4.6788272739259806e-05 loss: 0.2286 (0.2196) time: 2.9423 data: 0.0095 max mem: 33300 Epoch: [2] [3620/4276] eta: 0:33:38 lr: 4.6785622287779066e-05 loss: 0.2186 (0.2196) time: 2.9505 data: 0.0089 max mem: 33300 Epoch: [2] [3630/4276] eta: 0:33:07 lr: 4.678297181961481e-05 loss: 0.2186 (0.2196) time: 2.9405 data: 0.0089 max mem: 33300 Epoch: [2] [3640/4276] eta: 0:32:36 lr: 4.678032133476589e-05 loss: 0.2151 (0.2196) time: 2.9238 data: 0.0092 max mem: 33300 Epoch: [2] [3650/4276] eta: 0:32:05 lr: 4.6777670833231144e-05 loss: 0.2112 (0.2196) time: 2.9377 data: 0.0090 max mem: 33300 Epoch: [2] [3660/4276] eta: 0:31:34 lr: 4.677502031500941e-05 loss: 0.2008 (0.2196) time: 2.9576 data: 0.0081 max mem: 33300 Epoch: [2] [3670/4276] eta: 0:31:03 lr: 4.6772369780099545e-05 loss: 0.2153 (0.2196) time: 2.9519 data: 0.0077 max mem: 33300 Epoch: [2] [3680/4276] eta: 0:30:32 lr: 4.676971922850039e-05 loss: 0.2262 (0.2197) time: 2.9432 data: 0.0078 max mem: 33300 Epoch: [2] [3690/4276] eta: 0:30:01 lr: 4.6767068660210785e-05 loss: 0.2164 (0.2197) time: 2.9418 data: 0.0083 max mem: 33300 Epoch: [2] [3700/4276] eta: 0:29:30 lr: 4.676441807522958e-05 loss: 0.2149 (0.2197) time: 2.9591 data: 0.0081 max mem: 33300 Epoch: [2] [3710/4276] eta: 0:28:59 lr: 4.676176747355561e-05 loss: 0.2111 (0.2196) time: 2.9714 data: 0.0076 max mem: 33300 Epoch: [2] [3720/4276] eta: 0:28:29 lr: 4.675911685518773e-05 loss: 0.2041 (0.2196) time: 2.9715 data: 0.0078 max mem: 33300 Epoch: [2] [3730/4276] eta: 0:27:58 lr: 4.675646622012477e-05 loss: 0.2089 (0.2196) time: 2.9621 data: 0.0078 max mem: 33300 Epoch: [2] [3740/4276] eta: 0:27:27 lr: 4.6753815568365586e-05 loss: 0.2089 (0.2196) time: 2.9638 data: 0.0077 max mem: 33300 Epoch: [2] [3750/4276] eta: 0:26:56 lr: 4.675116489990901e-05 loss: 0.2218 (0.2196) time: 2.9730 data: 0.0076 max mem: 33300 Epoch: [2] [3760/4276] eta: 0:26:25 lr: 4.674851421475389e-05 loss: 0.2218 
(0.2196) time: 2.9712 data: 0.0080 max mem: 33300 Epoch: [2] [3770/4276] eta: 0:25:54 lr: 4.6745863512899064e-05 loss: 0.2295 (0.2196) time: 2.9713 data: 0.0082 max mem: 33300 Epoch: [2] [3780/4276] eta: 0:25:23 lr: 4.674321279434339e-05 loss: 0.2158 (0.2196) time: 2.9709 data: 0.0081 max mem: 33300 Epoch: [2] [3790/4276] eta: 0:24:52 lr: 4.674056205908569e-05 loss: 0.2045 (0.2196) time: 2.9733 data: 0.0080 max mem: 33300 Epoch: [2] [3800/4276] eta: 0:24:22 lr: 4.673791130712482e-05 loss: 0.2101 (0.2196) time: 2.9734 data: 0.0082 max mem: 33300 Epoch: [2] [3810/4276] eta: 0:23:51 lr: 4.673526053845962e-05 loss: 0.2101 (0.2196) time: 2.9501 data: 0.0084 max mem: 33300 Epoch: [2] [3820/4276] eta: 0:23:20 lr: 4.673260975308893e-05 loss: 0.2053 (0.2196) time: 2.9249 data: 0.0084 max mem: 33300 Epoch: [2] [3830/4276] eta: 0:22:49 lr: 4.672995895101159e-05 loss: 0.2059 (0.2196) time: 2.9250 data: 0.0081 max mem: 33300 Epoch: [2] [3840/4276] eta: 0:22:18 lr: 4.672730813222644e-05 loss: 0.2071 (0.2196) time: 2.9436 data: 0.0084 max mem: 33300 Epoch: [2] [3850/4276] eta: 0:21:47 lr: 4.672465729673233e-05 loss: 0.2063 (0.2195) time: 2.9595 data: 0.0084 max mem: 33300 Epoch: [2] [3860/4276] eta: 0:21:17 lr: 4.6722006444528096e-05 loss: 0.2078 (0.2195) time: 2.9529 data: 0.0084 max mem: 33300 Epoch: [2] [3870/4276] eta: 0:20:46 lr: 4.671935557561258e-05 loss: 0.2078 (0.2195) time: 2.9393 data: 0.0087 max mem: 33300 Epoch: [2] [3880/4276] eta: 0:20:15 lr: 4.671670468998462e-05 loss: 0.2051 (0.2195) time: 2.9332 data: 0.0085 max mem: 33300 Epoch: [2] [3890/4276] eta: 0:19:44 lr: 4.671405378764306e-05 loss: 0.2115 (0.2195) time: 2.9320 data: 0.0083 max mem: 33300 Epoch: [2] [3900/4276] eta: 0:19:13 lr: 4.6711402868586744e-05 loss: 0.2213 (0.2195) time: 2.9365 data: 0.0083 max mem: 33300 Epoch: [2] [3910/4276] eta: 0:18:42 lr: 4.670875193281451e-05 loss: 0.2098 (0.2195) time: 2.9387 data: 0.0086 max mem: 33300 Epoch: [2] [3920/4276] eta: 0:18:12 lr: 4.670610098032519e-05 loss: 
0.1902 (0.2194) time: 2.9408 data: 0.0086 max mem: 33300 Epoch: [2] [3930/4276] eta: 0:17:41 lr: 4.670345001111765e-05 loss: 0.1902 (0.2194) time: 2.9381 data: 0.0084 max mem: 33300 Epoch: [2] [3940/4276] eta: 0:17:10 lr: 4.670079902519069e-05 loss: 0.2076 (0.2194) time: 2.9316 data: 0.0084 max mem: 33300 Epoch: [2] [3950/4276] eta: 0:16:39 lr: 4.669814802254318e-05 loss: 0.2065 (0.2193) time: 2.9378 data: 0.0082 max mem: 33300 Epoch: [2] [3960/4276] eta: 0:16:08 lr: 4.669549700317396e-05 loss: 0.2162 (0.2194) time: 2.9496 data: 0.0083 max mem: 33300 Epoch: [2] [3970/4276] eta: 0:15:38 lr: 4.669284596708185e-05 loss: 0.2314 (0.2194) time: 2.9495 data: 0.0084 max mem: 33300 Epoch: [2] [3980/4276] eta: 0:15:07 lr: 4.6690194914265706e-05 loss: 0.2114 (0.2193) time: 2.9437 data: 0.0083 max mem: 33300 Epoch: [2] [3990/4276] eta: 0:14:36 lr: 4.668754384472436e-05 loss: 0.2114 (0.2193) time: 2.9411 data: 0.0087 max mem: 33300 Epoch: [2] [4000/4276] eta: 0:14:05 lr: 4.668489275845666e-05 loss: 0.2017 (0.2193) time: 2.9274 data: 0.0089 max mem: 33300 Epoch: [2] [4010/4276] eta: 0:13:35 lr: 4.668224165546144e-05 loss: 0.1947 (0.2193) time: 2.9227 data: 0.0091 max mem: 33300 Epoch: [2] [4020/4276] eta: 0:13:04 lr: 4.667959053573753e-05 loss: 0.1947 (0.2193) time: 2.9276 data: 0.0092 max mem: 33300 Epoch: [2] [4030/4276] eta: 0:12:33 lr: 4.667693939928378e-05 loss: 0.1947 (0.2193) time: 2.9595 data: 0.0083 max mem: 33300 Epoch: [2] [4040/4276] eta: 0:12:03 lr: 4.667428824609903e-05 loss: 0.2100 (0.2193) time: 2.9789 data: 0.0078 max mem: 33300 Epoch: [2] [4050/4276] eta: 0:11:32 lr: 4.6671637076182106e-05 loss: 0.1979 (0.2192) time: 2.9519 data: 0.0083 max mem: 33300 Epoch: [2] [4060/4276] eta: 0:11:01 lr: 4.666898588953186e-05 loss: 0.1896 (0.2192) time: 2.9319 data: 0.0085 max mem: 33300 Epoch: [2] [4070/4276] eta: 0:10:30 lr: 4.666633468614712e-05 loss: 0.2158 (0.2192) time: 2.9315 data: 0.0086 max mem: 33300 Epoch: [2] [4080/4276] eta: 0:10:00 lr: 4.666368346602673e-05 
loss: 0.2047 (0.2192) time: 2.9299 data: 0.0086 max mem: 33300 Epoch: [2] [4090/4276] eta: 0:09:29 lr: 4.666103222916953e-05 loss: 0.2132 (0.2192) time: 2.9334 data: 0.0083 max mem: 33300 Epoch: [2] [4100/4276] eta: 0:08:58 lr: 4.665838097557434e-05 loss: 0.2145 (0.2192) time: 2.9405 data: 0.0088 max mem: 33300 Epoch: [2] [4110/4276] eta: 0:08:28 lr: 4.665572970524002e-05 loss: 0.2184 (0.2192) time: 2.9579 data: 0.0090 max mem: 33300 Epoch: [2] [4120/4276] eta: 0:07:57 lr: 4.665307841816541e-05 loss: 0.2184 (0.2192) time: 2.9514 data: 0.0093 max mem: 33300 Epoch: [2] [4130/4276] eta: 0:07:26 lr: 4.665042711434932e-05 loss: 0.2132 (0.2192) time: 2.9368 data: 0.0098 max mem: 33300 Epoch: [2] [4140/4276] eta: 0:06:56 lr: 4.664777579379061e-05 loss: 0.2137 (0.2191) time: 2.9443 data: 0.0097 max mem: 33300 Epoch: [2] [4150/4276] eta: 0:06:25 lr: 4.66451244564881e-05 loss: 0.2058 (0.2191) time: 2.9465 data: 0.0087 max mem: 33300 Epoch: [2] [4160/4276] eta: 0:05:55 lr: 4.664247310244065e-05 loss: 0.2063 (0.2191) time: 2.9308 data: 0.0086 max mem: 33300 Epoch: [2] [4170/4276] eta: 0:05:24 lr: 4.6639821731647074e-05 loss: 0.2176 (0.2191) time: 2.8977 data: 0.0093 max mem: 33300 Epoch: [2] [4180/4276] eta: 0:04:53 lr: 4.6637170344106226e-05 loss: 0.2147 (0.2191) time: 2.8810 data: 0.0085 max mem: 33300 Epoch: [2] [4190/4276] eta: 0:04:23 lr: 4.663451893981693e-05 loss: 0.2045 (0.2191) time: 2.8922 data: 0.0085 max mem: 33300 Epoch: [2] [4200/4276] eta: 0:03:52 lr: 4.663186751877803e-05 loss: 0.2045 (0.2191) time: 2.9131 data: 0.0088 max mem: 33300 Epoch: [2] [4210/4276] eta: 0:03:21 lr: 4.6629216080988345e-05 loss: 0.2142 (0.2192) time: 2.9304 data: 0.0088 max mem: 33300 Epoch: [2] [4220/4276] eta: 0:02:51 lr: 4.6626564626446737e-05 loss: 0.2227 (0.2192) time: 2.9523 data: 0.0093 max mem: 33300 Epoch: [2] [4230/4276] eta: 0:02:20 lr: 4.662391315515202e-05 loss: 0.2446 (0.2193) time: 2.9579 data: 0.0090 max mem: 33300 Epoch: [2] [4240/4276] eta: 0:01:50 lr: 
4.662126166710305e-05 loss: 0.2403 (0.2193) time: 2.9558 data: 0.0091 max mem: 33300
Epoch: [2] [4250/4276] eta: 0:01:19 lr: 4.661861016229864e-05 loss: 0.2156 (0.2193) time: 2.9236 data: 0.0092 max mem: 33300
Epoch: [2] [4260/4276] eta: 0:00:48 lr: 4.661595864073764e-05 loss: 0.2285 (0.2193) time: 2.9316 data: 0.0089 max mem: 33300
Epoch: [2] [4270/4276] eta: 0:00:18 lr: 4.661330710241888e-05 loss: 0.2214 (0.2193) time: 2.9502 data: 0.0081 max mem: 33300
Epoch: [2] Total time: 3:37:51
Test: [ 0/21770] eta: 13:04:05 time: 2.1610 data: 2.1218 max mem: 33300
Test: [ 100/21770] eta: 0:21:13 time: 0.0383 data: 0.0008 max mem: 33300
Test: [ 200/21770] eta: 0:17:29 time: 0.0387 data: 0.0008 max mem: 33300
Test: [ 300/21770] eta: 0:16:12 time: 0.0383 data: 0.0008 max mem: 33300
Test: [ 400/21770] eta: 0:15:26 time: 0.0374 data: 0.0009 max mem: 33300
Test: [ 500/21770] eta: 0:14:56 time: 0.0374 data: 0.0009 max mem: 33300
Test: [ 600/21770] eta: 0:14:36 time: 0.0375 data: 0.0009 max mem: 33300
Test: [ 700/21770] eta: 0:14:20 time: 0.0374 data: 0.0009 max mem: 33300
Test: [ 800/21770] eta: 0:14:09 time: 0.0390 data: 0.0009 max mem: 33300
Test: [ 900/21770] eta: 0:14:02 time: 0.0396 data: 0.0009 max mem: 33300
Test: [ 1000/21770] eta: 0:13:55 time: 0.0378 data: 0.0009 max mem: 33300
Test: [ 1100/21770] eta: 0:13:46 time: 0.0377 data: 0.0010 max mem: 33300
Test: [ 1200/21770] eta: 0:13:39 time: 0.0384 data: 0.0010 max mem: 33300
Test: [ 1300/21770] eta: 0:13:33 time: 0.0388 data: 0.0009 max mem: 33300
Test: [ 1400/21770] eta: 0:13:28 time: 0.0387 data: 0.0010 max mem: 33300
Test: [ 1500/21770] eta: 0:13:23 time: 0.0389 data: 0.0009 max mem: 33300
Test: [ 1600/21770] eta: 0:13:18 time: 0.0387 data: 0.0009 max mem: 33300
Test: [ 1700/21770] eta: 0:13:13 time: 0.0387 data: 0.0009 max mem: 33300
Test: [ 1800/21770] eta: 0:13:08 time: 0.0386 data: 0.0009 max mem: 33300
Test: [ 1900/21770] eta: 0:13:03 time: 0.0390 data: 0.0009 max mem: 33300
Test: [ 2000/21770] eta: 0:12:59 time: 
0.0387 data: 0.0009 max mem: 33300 Test: [ 2100/21770] eta: 0:12:54 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 2200/21770] eta: 0:12:50 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 2300/21770] eta: 0:12:46 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 2400/21770] eta: 0:12:42 time: 0.0394 data: 0.0009 max mem: 33300 Test: [ 2500/21770] eta: 0:12:38 time: 0.0397 data: 0.0009 max mem: 33300 Test: [ 2600/21770] eta: 0:12:35 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 2700/21770] eta: 0:12:31 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 2800/21770] eta: 0:12:28 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 2900/21770] eta: 0:12:24 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 3000/21770] eta: 0:12:20 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3100/21770] eta: 0:12:17 time: 0.0398 data: 0.0009 max mem: 33300 Test: [ 3200/21770] eta: 0:12:13 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3300/21770] eta: 0:12:09 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 3400/21770] eta: 0:12:06 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3500/21770] eta: 0:12:02 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 3600/21770] eta: 0:11:58 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 3700/21770] eta: 0:11:54 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 3800/21770] eta: 0:11:51 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 3900/21770] eta: 0:11:47 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 4000/21770] eta: 0:11:43 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 4100/21770] eta: 0:11:39 time: 0.0402 data: 0.0009 max mem: 33300 Test: [ 4200/21770] eta: 0:11:36 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 4300/21770] eta: 0:11:32 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 4400/21770] eta: 0:11:28 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 4500/21770] eta: 0:11:23 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 4600/21770] eta: 0:11:19 time: 0.0385 data: 0.0008 max mem: 33300 Test: [ 4700/21770] eta: 0:11:14 time: 
0.0385 data: 0.0009 max mem: 33300 Test: [ 4800/21770] eta: 0:11:10 time: 0.0385 data: 0.0008 max mem: 33300 Test: [ 4900/21770] eta: 0:11:06 time: 0.0384 data: 0.0008 max mem: 33300 Test: [ 5000/21770] eta: 0:11:01 time: 0.0385 data: 0.0008 max mem: 33300 Test: [ 5100/21770] eta: 0:10:57 time: 0.0385 data: 0.0008 max mem: 33300 Test: [ 5200/21770] eta: 0:10:53 time: 0.0385 data: 0.0008 max mem: 33300 Test: [ 5300/21770] eta: 0:10:49 time: 0.0385 data: 0.0008 max mem: 33300 Test: [ 5400/21770] eta: 0:10:44 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 5500/21770] eta: 0:10:40 time: 0.0385 data: 0.0008 max mem: 33300 Test: [ 5600/21770] eta: 0:10:36 time: 0.0384 data: 0.0008 max mem: 33300 Test: [ 5700/21770] eta: 0:10:32 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 5800/21770] eta: 0:10:27 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 5900/21770] eta: 0:10:23 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 6000/21770] eta: 0:10:19 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 6100/21770] eta: 0:10:14 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 6200/21770] eta: 0:10:10 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 6300/21770] eta: 0:10:06 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 6400/21770] eta: 0:10:02 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 6500/21770] eta: 0:09:57 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 6600/21770] eta: 0:09:53 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 6700/21770] eta: 0:09:49 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 6800/21770] eta: 0:09:45 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 6900/21770] eta: 0:09:41 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 7000/21770] eta: 0:09:36 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 7100/21770] eta: 0:09:32 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 7200/21770] eta: 0:09:28 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 7300/21770] eta: 0:09:24 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 7400/21770] eta: 0:09:20 time: 
0.0381 data: 0.0008 max mem: 33300 Test: [ 7500/21770] eta: 0:09:16 time: 0.0384 data: 0.0008 max mem: 33300 Test: [ 7600/21770] eta: 0:09:12 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 7700/21770] eta: 0:09:08 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 7800/21770] eta: 0:09:04 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 7900/21770] eta: 0:09:00 time: 0.0384 data: 0.0008 max mem: 33300 Test: [ 8000/21770] eta: 0:08:56 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 8100/21770] eta: 0:08:52 time: 0.0383 data: 0.0008 max mem: 33300 Test: [ 8200/21770] eta: 0:08:48 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 8300/21770] eta: 0:08:44 time: 0.0384 data: 0.0008 max mem: 33300 Test: [ 8400/21770] eta: 0:08:40 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 8500/21770] eta: 0:08:36 time: 0.0383 data: 0.0008 max mem: 33300 Test: [ 8600/21770] eta: 0:08:32 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 8700/21770] eta: 0:08:28 time: 0.0383 data: 0.0008 max mem: 33300 Test: [ 8800/21770] eta: 0:08:24 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 8900/21770] eta: 0:08:20 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 9000/21770] eta: 0:08:16 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 9100/21770] eta: 0:08:12 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 9200/21770] eta: 0:08:08 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 9300/21770] eta: 0:08:04 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 9400/21770] eta: 0:08:00 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 9500/21770] eta: 0:07:56 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 9600/21770] eta: 0:07:52 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 9700/21770] eta: 0:07:48 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 9800/21770] eta: 0:07:44 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 9900/21770] eta: 0:07:40 time: 0.0378 data: 0.0009 max mem: 33300 Test: [10000/21770] eta: 0:07:36 time: 0.0378 data: 0.0008 max mem: 33300 Test: [10100/21770] eta: 0:07:32 time: 
0.0379 data: 0.0009 max mem: 33300 Test: [10200/21770] eta: 0:07:28 time: 0.0378 data: 0.0009 max mem: 33300 Test: [10300/21770] eta: 0:07:24 time: 0.0379 data: 0.0009 max mem: 33300 Test: [10400/21770] eta: 0:07:20 time: 0.0379 data: 0.0008 max mem: 33300 Test: [10500/21770] eta: 0:07:16 time: 0.0378 data: 0.0009 max mem: 33300 Test: [10600/21770] eta: 0:07:12 time: 0.0379 data: 0.0009 max mem: 33300 Test: [10700/21770] eta: 0:07:08 time: 0.0379 data: 0.0009 max mem: 33300 Test: [10800/21770] eta: 0:07:04 time: 0.0378 data: 0.0008 max mem: 33300 Test: [10900/21770] eta: 0:07:00 time: 0.0379 data: 0.0009 max mem: 33300 Test: [11000/21770] eta: 0:06:56 time: 0.0378 data: 0.0009 max mem: 33300 Test: [11100/21770] eta: 0:06:52 time: 0.0379 data: 0.0009 max mem: 33300 Test: [11200/21770] eta: 0:06:48 time: 0.0379 data: 0.0008 max mem: 33300 Test: [11300/21770] eta: 0:06:44 time: 0.0378 data: 0.0008 max mem: 33300 Test: [11400/21770] eta: 0:06:40 time: 0.0378 data: 0.0008 max mem: 33300 Test: [11500/21770] eta: 0:06:36 time: 0.0379 data: 0.0009 max mem: 33300 Test: [11600/21770] eta: 0:06:32 time: 0.0379 data: 0.0009 max mem: 33300 Test: [11700/21770] eta: 0:06:29 time: 0.0378 data: 0.0008 max mem: 33300 Test: [11800/21770] eta: 0:06:25 time: 0.0378 data: 0.0008 max mem: 33300 Test: [11900/21770] eta: 0:06:21 time: 0.0378 data: 0.0009 max mem: 33300 Test: [12000/21770] eta: 0:06:17 time: 0.0378 data: 0.0008 max mem: 33300 Test: [12100/21770] eta: 0:06:13 time: 0.0378 data: 0.0008 max mem: 33300 Test: [12200/21770] eta: 0:06:09 time: 0.0378 data: 0.0008 max mem: 33300 Test: [12300/21770] eta: 0:06:05 time: 0.0378 data: 0.0008 max mem: 33300 Test: [12400/21770] eta: 0:06:01 time: 0.0378 data: 0.0009 max mem: 33300 Test: [12500/21770] eta: 0:05:57 time: 0.0379 data: 0.0008 max mem: 33300 Test: [12600/21770] eta: 0:05:53 time: 0.0378 data: 0.0008 max mem: 33300 Test: [12700/21770] eta: 0:05:49 time: 0.0379 data: 0.0009 max mem: 33300 Test: [12800/21770] eta: 0:05:45 time: 
0.0379 data: 0.0008 max mem: 33300 Test: [12900/21770] eta: 0:05:42 time: 0.0379 data: 0.0008 max mem: 33300 Test: [13000/21770] eta: 0:05:38 time: 0.0379 data: 0.0009 max mem: 33300 Test: [13100/21770] eta: 0:05:34 time: 0.0378 data: 0.0008 max mem: 33300 Test: [13200/21770] eta: 0:05:30 time: 0.0378 data: 0.0009 max mem: 33300 Test: [13300/21770] eta: 0:05:26 time: 0.0379 data: 0.0009 max mem: 33300 Test: [13400/21770] eta: 0:05:22 time: 0.0379 data: 0.0009 max mem: 33300 Test: [13500/21770] eta: 0:05:18 time: 0.0378 data: 0.0008 max mem: 33300 Test: [13600/21770] eta: 0:05:14 time: 0.0379 data: 0.0009 max mem: 33300 Test: [13700/21770] eta: 0:05:10 time: 0.0379 data: 0.0009 max mem: 33300 Test: [13800/21770] eta: 0:05:06 time: 0.0383 data: 0.0009 max mem: 33300 Test: [13900/21770] eta: 0:05:03 time: 0.0379 data: 0.0009 max mem: 33300 Test: [14000/21770] eta: 0:04:59 time: 0.0378 data: 0.0009 max mem: 33300 Test: [14100/21770] eta: 0:04:55 time: 0.0381 data: 0.0009 max mem: 33300 Test: [14200/21770] eta: 0:04:51 time: 0.0383 data: 0.0008 max mem: 33300 Test: [14300/21770] eta: 0:04:47 time: 0.0381 data: 0.0009 max mem: 33300 Test: [14400/21770] eta: 0:04:43 time: 0.0382 data: 0.0008 max mem: 33300 Test: [14500/21770] eta: 0:04:39 time: 0.0382 data: 0.0009 max mem: 33300 Test: [14600/21770] eta: 0:04:35 time: 0.0383 data: 0.0008 max mem: 33300 Test: [14700/21770] eta: 0:04:32 time: 0.0382 data: 0.0009 max mem: 33300 Test: [14800/21770] eta: 0:04:28 time: 0.0383 data: 0.0008 max mem: 33300 Test: [14900/21770] eta: 0:04:24 time: 0.0382 data: 0.0009 max mem: 33300 Test: [15000/21770] eta: 0:04:20 time: 0.0382 data: 0.0008 max mem: 33300 Test: [15100/21770] eta: 0:04:16 time: 0.0382 data: 0.0008 max mem: 33300 Test: [15200/21770] eta: 0:04:12 time: 0.0383 data: 0.0008 max mem: 33300 Test: [15300/21770] eta: 0:04:08 time: 0.0381 data: 0.0008 max mem: 33300 Test: [15400/21770] eta: 0:04:05 time: 0.0378 data: 0.0009 max mem: 33300 Test: [15500/21770] eta: 0:04:01 time: 
0.0379 data: 0.0009 max mem: 33300 Test: [15600/21770] eta: 0:03:57 time: 0.0379 data: 0.0009 max mem: 33300 Test: [15700/21770] eta: 0:03:53 time: 0.0378 data: 0.0009 max mem: 33300 Test: [15800/21770] eta: 0:03:49 time: 0.0379 data: 0.0009 max mem: 33300 Test: [15900/21770] eta: 0:03:45 time: 0.0378 data: 0.0009 max mem: 33300 Test: [16000/21770] eta: 0:03:41 time: 0.0383 data: 0.0009 max mem: 33300 Test: [16100/21770] eta: 0:03:38 time: 0.0401 data: 0.0008 max mem: 33300 Test: [16200/21770] eta: 0:03:34 time: 0.0399 data: 0.0008 max mem: 33300 Test: [16300/21770] eta: 0:03:30 time: 0.0401 data: 0.0008 max mem: 33300 Test: [16400/21770] eta: 0:03:26 time: 0.0400 data: 0.0008 max mem: 33300 Test: [16500/21770] eta: 0:03:22 time: 0.0401 data: 0.0008 max mem: 33300 Test: [16600/21770] eta: 0:03:19 time: 0.0398 data: 0.0008 max mem: 33300 Test: [16700/21770] eta: 0:03:15 time: 0.0400 data: 0.0008 max mem: 33300 Test: [16800/21770] eta: 0:03:11 time: 0.0399 data: 0.0008 max mem: 33300 Test: [16900/21770] eta: 0:03:07 time: 0.0399 data: 0.0008 max mem: 33300 Test: [17000/21770] eta: 0:03:03 time: 0.0378 data: 0.0008 max mem: 33300 Test: [17100/21770] eta: 0:02:59 time: 0.0378 data: 0.0009 max mem: 33300 Test: [17200/21770] eta: 0:02:56 time: 0.0379 data: 0.0008 max mem: 33300 Test: [17300/21770] eta: 0:02:52 time: 0.0378 data: 0.0008 max mem: 33300 Test: [17400/21770] eta: 0:02:48 time: 0.0378 data: 0.0008 max mem: 33300 Test: [17500/21770] eta: 0:02:44 time: 0.0379 data: 0.0008 max mem: 33300 Test: [17600/21770] eta: 0:02:40 time: 0.0379 data: 0.0008 max mem: 33300 Test: [17700/21770] eta: 0:02:36 time: 0.0378 data: 0.0008 max mem: 33300 Test: [17800/21770] eta: 0:02:32 time: 0.0378 data: 0.0008 max mem: 33300 Test: [17900/21770] eta: 0:02:29 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18000/21770] eta: 0:02:25 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18100/21770] eta: 0:02:21 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18200/21770] eta: 0:02:17 time: 
0.0378 data: 0.0009 max mem: 33300 Test: [18300/21770] eta: 0:02:13 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18400/21770] eta: 0:02:09 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18500/21770] eta: 0:02:05 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18600/21770] eta: 0:02:01 time: 0.0385 data: 0.0009 max mem: 33300 Test: [18700/21770] eta: 0:01:58 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18800/21770] eta: 0:01:54 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18900/21770] eta: 0:01:50 time: 0.0378 data: 0.0008 max mem: 33300 Test: [19000/21770] eta: 0:01:46 time: 0.0379 data: 0.0008 max mem: 33300 Test: [19100/21770] eta: 0:01:42 time: 0.0378 data: 0.0008 max mem: 33300 Test: [19200/21770] eta: 0:01:38 time: 0.0378 data: 0.0008 max mem: 33300 Test: [19300/21770] eta: 0:01:34 time: 0.0378 data: 0.0008 max mem: 33300 Test: [19400/21770] eta: 0:01:31 time: 0.0379 data: 0.0008 max mem: 33300 Test: [19500/21770] eta: 0:01:27 time: 0.0383 data: 0.0008 max mem: 33300 Test: [19600/21770] eta: 0:01:23 time: 0.0382 data: 0.0008 max mem: 33300 Test: [19700/21770] eta: 0:01:19 time: 0.0382 data: 0.0008 max mem: 33300 Test: [19800/21770] eta: 0:01:15 time: 0.0380 data: 0.0008 max mem: 33300 Test: [19900/21770] eta: 0:01:11 time: 0.0382 data: 0.0008 max mem: 33300 Test: [20000/21770] eta: 0:01:08 time: 0.0381 data: 0.0008 max mem: 33300 Test: [20100/21770] eta: 0:01:04 time: 0.0382 data: 0.0008 max mem: 33300 Test: [20200/21770] eta: 0:01:00 time: 0.0384 data: 0.0009 max mem: 33300 Test: [20300/21770] eta: 0:00:56 time: 0.0378 data: 0.0008 max mem: 33300 Test: [20400/21770] eta: 0:00:52 time: 0.0382 data: 0.0008 max mem: 33300 Test: [20500/21770] eta: 0:00:48 time: 0.0381 data: 0.0008 max mem: 33300 Test: [20600/21770] eta: 0:00:44 time: 0.0378 data: 0.0008 max mem: 33300 Test: [20700/21770] eta: 0:00:41 time: 0.0379 data: 0.0008 max mem: 33300 Test: [20800/21770] eta: 0:00:37 time: 0.0378 data: 0.0008 max mem: 33300 Test: [20900/21770] eta: 0:00:33 time: 
0.0381 data: 0.0008 max mem: 33300
Test: [21000/21770] eta: 0:00:29 time: 0.0379 data: 0.0008 max mem: 33300
Test: [21100/21770] eta: 0:00:25 time: 0.0379 data: 0.0009 max mem: 33300
Test: [21200/21770] eta: 0:00:21 time: 0.0379 data: 0.0009 max mem: 33300
Test: [21300/21770] eta: 0:00:18 time: 0.0378 data: 0.0008 max mem: 33300
Test: [21400/21770] eta: 0:00:14 time: 0.0378 data: 0.0008 max mem: 33300
Test: [21500/21770] eta: 0:00:10 time: 0.0379 data: 0.0009 max mem: 33300
Test: [21600/21770] eta: 0:00:06 time: 0.0379 data: 0.0009 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0379 data: 0.0009 max mem: 33300
Test: Total time: 0:13:56
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [3] [ 0/4276] eta: 6:29:51 lr: 4.661171617138342e-05 loss: 0.1767 (0.1767) time: 5.4704 data: 2.4459 max mem: 33300
Epoch: [3] [ 10/4276] eta: 3:43:19 lr: 4.660906460624983e-05 loss: 0.1969 (0.2049) time: 3.1410 data: 0.2285 max mem: 33300
Epoch: [3] [ 20/4276] eta: 3:36:36 lr: 4.660641302435543e-05 loss: 0.1960 (0.2046) time: 2.9329 data: 0.0072 max mem: 33300
Epoch: [3] [ 30/4276] eta: 3:33:20 lr: 4.66037614256991e-05 loss: 0.1896 (0.2064) time: 2.9453 data: 0.0077 max mem: 33300
Epoch: [3] [ 40/4276] eta: 3:30:55 lr: 4.6601109810279633e-05 loss: 0.1900 (0.2044) time: 2.9183 data: 0.0074 max mem: 33300
Epoch: [3] [ 50/4276] eta: 3:29:43 lr: 4.65984581780959e-05 loss: 0.1900 (0.2031) time: 2.9199 data: 0.0075 max mem: 33300
Epoch: [3] [ 60/4276] eta: 3:28:34 lr: 4.6595806529146695e-05 loss: 0.1927 (0.2055) time: 2.9289 data: 0.0078 max mem: 33300
Epoch: [3] [ 70/4276] eta: 3:27:33 lr: 4.6593154863430875e-05 loss: 0.1927 (0.2054) time: 2.9183 data: 0.0075 max mem: 33300
Epoch: [3] [ 80/4276] eta: 3:26:31 lr: 4.659050318094728e-05 loss: 0.2069 (0.2065) time: 2.9065
data: 0.0073 max mem: 33300 Epoch: [3] [ 90/4276] eta: 3:25:38 lr: 4.6587851481694736e-05 loss: 0.1944 (0.2042) time: 2.9007 data: 0.0073 max mem: 33300 Epoch: [3] [ 100/4276] eta: 3:24:57 lr: 4.6585199765672065e-05 loss: 0.1792 (0.2071) time: 2.9108 data: 0.0072 max mem: 33300 Epoch: [3] [ 110/4276] eta: 3:24:17 lr: 4.658254803287812e-05 loss: 0.1961 (0.2075) time: 2.9173 data: 0.0074 max mem: 33300 Epoch: [3] [ 120/4276] eta: 3:23:28 lr: 4.6579896283311725e-05 loss: 0.2026 (0.2079) time: 2.9020 data: 0.0077 max mem: 33300 Epoch: [3] [ 130/4276] eta: 3:22:38 lr: 4.6577244516971705e-05 loss: 0.2084 (0.2091) time: 2.8795 data: 0.0085 max mem: 33300 Epoch: [3] [ 140/4276] eta: 3:21:51 lr: 4.65745927338569e-05 loss: 0.2085 (0.2090) time: 2.8723 data: 0.0091 max mem: 33300 Epoch: [3] [ 150/4276] eta: 3:21:08 lr: 4.6571940933966144e-05 loss: 0.1969 (0.2080) time: 2.8748 data: 0.0090 max mem: 33300 Epoch: [3] [ 160/4276] eta: 3:20:24 lr: 4.656928911729828e-05 loss: 0.1980 (0.2084) time: 2.8721 data: 0.0090 max mem: 33300 Epoch: [3] [ 170/4276] eta: 3:19:52 lr: 4.656663728385212e-05 loss: 0.2070 (0.2087) time: 2.8882 data: 0.0095 max mem: 33300 Epoch: [3] [ 180/4276] eta: 3:19:17 lr: 4.65639854336265e-05 loss: 0.2120 (0.2101) time: 2.9029 data: 0.0093 max mem: 33300 Epoch: [3] [ 190/4276] eta: 3:18:40 lr: 4.6561333566620265e-05 loss: 0.2279 (0.2110) time: 2.8898 data: 0.0087 max mem: 33300 Epoch: [3] [ 200/4276] eta: 3:18:03 lr: 4.6558681682832236e-05 loss: 0.2242 (0.2121) time: 2.8791 data: 0.0086 max mem: 33300 Epoch: [3] [ 210/4276] eta: 3:17:23 lr: 4.655602978226125e-05 loss: 0.2121 (0.2121) time: 2.8699 data: 0.0086 max mem: 33300 Epoch: [3] [ 220/4276] eta: 3:16:45 lr: 4.655337786490613e-05 loss: 0.2054 (0.2118) time: 2.8624 data: 0.0086 max mem: 33300 Epoch: [3] [ 230/4276] eta: 3:16:14 lr: 4.655072593076572e-05 loss: 0.2024 (0.2107) time: 2.8824 data: 0.0085 max mem: 33300 Epoch: [3] [ 240/4276] eta: 3:15:46 lr: 4.6548073979838845e-05 loss: 0.2052 (0.2111) time: 
2.9087 data: 0.0087 max mem: 33300 Epoch: [3] [ 250/4276] eta: 3:15:21 lr: 4.654542201212433e-05 loss: 0.2202 (0.2120) time: 2.9250 data: 0.0090 max mem: 33300 Epoch: [3] [ 260/4276] eta: 3:14:52 lr: 4.654277002762102e-05 loss: 0.2242 (0.2131) time: 2.9250 data: 0.0087 max mem: 33300 Epoch: [3] [ 270/4276] eta: 3:14:22 lr: 4.654011802632774e-05 loss: 0.2494 (0.2135) time: 2.9097 data: 0.0078 max mem: 33300 Epoch: [3] [ 280/4276] eta: 3:13:51 lr: 4.653746600824331e-05 loss: 0.2091 (0.2134) time: 2.9022 data: 0.0074 max mem: 33300 Epoch: [3] [ 290/4276] eta: 3:13:23 lr: 4.653481397336657e-05 loss: 0.2033 (0.2127) time: 2.9063 data: 0.0077 max mem: 33300 Epoch: [3] [ 300/4276] eta: 3:12:54 lr: 4.653216192169635e-05 loss: 0.1951 (0.2123) time: 2.9151 data: 0.0074 max mem: 33300 Epoch: [3] [ 310/4276] eta: 3:12:28 lr: 4.652950985323148e-05 loss: 0.2003 (0.2120) time: 2.9262 data: 0.0071 max mem: 33300 Epoch: [3] [ 320/4276] eta: 3:12:07 lr: 4.652685776797079e-05 loss: 0.2130 (0.2127) time: 2.9545 data: 0.0081 max mem: 33300 Epoch: [3] [ 330/4276] eta: 3:11:50 lr: 4.652420566591311e-05 loss: 0.2187 (0.2126) time: 2.9954 data: 0.0086 max mem: 33300 Epoch: [3] [ 340/4276] eta: 3:11:22 lr: 4.652155354705726e-05 loss: 0.2014 (0.2121) time: 2.9714 data: 0.0085 max mem: 33300 Epoch: [3] [ 350/4276] eta: 3:10:52 lr: 4.6518901411402095e-05 loss: 0.1942 (0.2115) time: 2.9178 data: 0.0081 max mem: 33300 Epoch: [3] [ 360/4276] eta: 3:10:22 lr: 4.6516249258946416e-05 loss: 0.2043 (0.2123) time: 2.9108 data: 0.0077 max mem: 33300 Epoch: [3] [ 370/4276] eta: 3:09:52 lr: 4.6513597089689066e-05 loss: 0.2140 (0.2118) time: 2.9087 data: 0.0076 max mem: 33300 Epoch: [3] [ 380/4276] eta: 3:09:22 lr: 4.651094490362887e-05 loss: 0.1903 (0.2117) time: 2.9075 data: 0.0074 max mem: 33300 Epoch: [3] [ 390/4276] eta: 6:40:21 lr: 4.650829270076466e-05 loss: 0.2155 (0.2119) time: 66.7441 data: 63.8357 max mem: 33300 Epoch: [3] [ 400/4276] eta: 6:34:07 lr: 4.650564048109527e-05 loss: 0.2062 (0.2118) 
time: 66.7663 data: 63.8354 max mem: 33300 Epoch: [3] [ 410/4276] eta: 6:28:07 lr: 4.650298824461952e-05 loss: 0.2062 (0.2116) time: 2.9386 data: 0.0067 max mem: 33300 Epoch: [3] [ 420/4276] eta: 6:22:25 lr: 4.650033599133624e-05 loss: 0.2081 (0.2118) time: 2.9385 data: 0.0070 max mem: 33300 Epoch: [3] [ 430/4276] eta: 6:16:57 lr: 4.6497683721244266e-05 loss: 0.2142 (0.2119) time: 2.9450 data: 0.0074 max mem: 33300 Epoch: [3] [ 440/4276] eta: 6:11:42 lr: 4.6495031434342415e-05 loss: 0.2208 (0.2118) time: 2.9364 data: 0.0073 max mem: 33300 Epoch: [3] [ 450/4276] eta: 6:06:41 lr: 4.649237913062951e-05 loss: 0.2148 (0.2122) time: 2.9432 data: 0.0073 max mem: 33300 Epoch: [3] [ 460/4276] eta: 6:01:51 lr: 4.648972681010439e-05 loss: 0.1953 (0.2117) time: 2.9466 data: 0.0072 max mem: 33300 Epoch: [3] [ 470/4276] eta: 5:57:12 lr: 4.648707447276589e-05 loss: 0.1875 (0.2113) time: 2.9450 data: 0.0070 max mem: 33300 Epoch: [3] [ 480/4276] eta: 5:52:44 lr: 4.648442211861283e-05 loss: 0.1959 (0.2111) time: 2.9450 data: 0.0071 max mem: 33300 Epoch: [3] [ 490/4276] eta: 5:48:25 lr: 4.6481769747644034e-05 loss: 0.1981 (0.2111) time: 2.9424 data: 0.0073 max mem: 33300 Epoch: [3] [ 500/4276] eta: 5:44:15 lr: 4.647911735985832e-05 loss: 0.1943 (0.2110) time: 2.9408 data: 0.0073 max mem: 33300 Epoch: [3] [ 510/4276] eta: 5:40:14 lr: 4.647646495525454e-05 loss: 0.1943 (0.2106) time: 2.9408 data: 0.0071 max mem: 33300 Epoch: [3] [ 520/4276] eta: 5:36:21 lr: 4.64738125338315e-05 loss: 0.2000 (0.2107) time: 2.9413 data: 0.0070 max mem: 33300 Epoch: [3] [ 530/4276] eta: 5:32:36 lr: 4.6471160095588034e-05 loss: 0.2040 (0.2105) time: 2.9401 data: 0.0070 max mem: 33300 Epoch: [3] [ 540/4276] eta: 5:28:57 lr: 4.646850764052297e-05 loss: 0.1933 (0.2103) time: 2.9346 data: 0.0071 max mem: 33300 Epoch: [3] [ 550/4276] eta: 5:25:25 lr: 4.6465855168635134e-05 loss: 0.1972 (0.2102) time: 2.9303 data: 0.0072 max mem: 33300 Epoch: [3] [ 560/4276] eta: 5:22:00 lr: 4.6463202679923347e-05 loss: 0.2064 
(0.2104) time: 2.9311 data: 0.0073 max mem: 33300 Epoch: [3] [ 570/4276] eta: 5:18:41 lr: 4.646055017438644e-05 loss: 0.2161 (0.2104) time: 2.9308 data: 0.0075 max mem: 33300 Epoch: [3] [ 580/4276] eta: 5:15:27 lr: 4.645789765202324e-05 loss: 0.2161 (0.2104) time: 2.9298 data: 0.0077 max mem: 33300 Epoch: [3] [ 590/4276] eta: 5:12:16 lr: 4.645524511283257e-05 loss: 0.1967 (0.2102) time: 2.9032 data: 0.0089 max mem: 33300 Epoch: [3] [ 600/4276] eta: 5:09:11 lr: 4.645259255681326e-05 loss: 0.1932 (0.2100) time: 2.8865 data: 0.0093 max mem: 33300 Epoch: [3] [ 610/4276] eta: 5:06:12 lr: 4.644993998396413e-05 loss: 0.1943 (0.2098) time: 2.8967 data: 0.0086 max mem: 33300 Epoch: [3] [ 620/4276] eta: 5:03:18 lr: 4.6447287394284e-05 loss: 0.1943 (0.2098) time: 2.9020 data: 0.0087 max mem: 33300 Epoch: [3] [ 630/4276] eta: 5:00:27 lr: 4.6444634787771716e-05 loss: 0.2089 (0.2098) time: 2.8949 data: 0.0089 max mem: 33300 Epoch: [3] [ 640/4276] eta: 4:57:40 lr: 4.644198216442608e-05 loss: 0.1991 (0.2096) time: 2.8750 data: 0.0083 max mem: 33300 Epoch: [3] [ 650/4276] eta: 4:54:58 lr: 4.643932952424593e-05 loss: 0.1989 (0.2096) time: 2.8755 data: 0.0080 max mem: 33300 Epoch: [3] [ 660/4276] eta: 4:52:19 lr: 4.643667686723009e-05 loss: 0.2063 (0.2096) time: 2.8769 data: 0.0083 max mem: 33300 Epoch: [3] [ 670/4276] eta: 4:49:44 lr: 4.643402419337737e-05 loss: 0.2017 (0.2095) time: 2.8726 data: 0.0087 max mem: 33300 Epoch: [3] [ 680/4276] eta: 4:47:14 lr: 4.6431371502686615e-05 loss: 0.1848 (0.2093) time: 2.8841 data: 0.0091 max mem: 33300 Epoch: [3] [ 690/4276] eta: 4:44:47 lr: 4.642871879515664e-05 loss: 0.1858 (0.2092) time: 2.8833 data: 0.0090 max mem: 33300 Epoch: [3] [ 700/4276] eta: 4:42:23 lr: 4.6426066070786265e-05 loss: 0.2117 (0.2094) time: 2.8773 data: 0.0089 max mem: 33300 Epoch: [3] [ 710/4276] eta: 4:40:02 lr: 4.642341332957433e-05 loss: 0.2246 (0.2095) time: 2.8815 data: 0.0091 max mem: 33300 Epoch: [3] [ 720/4276] eta: 4:37:45 lr: 4.642076057151964e-05 loss: 
0.2139 (0.2094) time: 2.8779 data: 0.0090 max mem: 33300 Epoch: [3] [ 730/4276] eta: 4:35:30 lr: 4.641810779662102e-05 loss: 0.2122 (0.2096) time: 2.8783 data: 0.0094 max mem: 33300 Epoch: [3] [ 740/4276] eta: 4:33:19 lr: 4.6415455004877304e-05 loss: 0.2096 (0.2097) time: 2.8879 data: 0.0094 max mem: 33300 Epoch: [3] [ 750/4276] eta: 4:31:10 lr: 4.641280219628731e-05 loss: 0.1933 (0.2096) time: 2.8803 data: 0.0088 max mem: 33300 Epoch: [3] [ 760/4276] eta: 4:29:02 lr: 4.641014937084987e-05 loss: 0.1840 (0.2093) time: 2.8616 data: 0.0089 max mem: 33300 Epoch: [3] [ 770/4276] eta: 4:26:57 lr: 4.640749652856378e-05 loss: 0.1909 (0.2093) time: 2.8546 data: 0.0087 max mem: 33300 Epoch: [3] [ 780/4276] eta: 4:24:56 lr: 4.64048436694279e-05 loss: 0.1997 (0.2094) time: 2.8650 data: 0.0084 max mem: 33300 Epoch: [3] [ 790/4276] eta: 4:22:57 lr: 4.6402190793441024e-05 loss: 0.1997 (0.2095) time: 2.8741 data: 0.0090 max mem: 33300 Epoch: [3] [ 800/4276] eta: 4:21:00 lr: 4.6399537900601984e-05 loss: 0.1915 (0.2095) time: 2.8788 data: 0.0094 max mem: 33300 Epoch: [3] [ 810/4276] eta: 4:19:05 lr: 4.6396884990909604e-05 loss: 0.1879 (0.2094) time: 2.8752 data: 0.0089 max mem: 33300 Epoch: [3] [ 820/4276] eta: 4:17:13 lr: 4.639423206436271e-05 loss: 0.2076 (0.2093) time: 2.8814 data: 0.0086 max mem: 33300 Epoch: [3] [ 830/4276] eta: 4:15:23 lr: 4.6391579120960114e-05 loss: 0.2097 (0.2095) time: 2.8883 data: 0.0091 max mem: 33300 Epoch: [3] [ 840/4276] eta: 4:13:36 lr: 4.6388926160700646e-05 loss: 0.1987 (0.2096) time: 2.8945 data: 0.0095 max mem: 33300 Epoch: [3] [ 850/4276] eta: 4:11:50 lr: 4.638627318358312e-05 loss: 0.1874 (0.2096) time: 2.9086 data: 0.0091 max mem: 33300 Epoch: [3] [ 860/4276] eta: 4:10:06 lr: 4.6383620189606376e-05 loss: 0.1893 (0.2097) time: 2.9016 data: 0.0085 max mem: 33300 Epoch: [3] [ 870/4276] eta: 4:08:24 lr: 4.638096717876921e-05 loss: 0.2051 (0.2097) time: 2.9014 data: 0.0087 max mem: 33300 Epoch: [3] [ 880/4276] eta: 4:06:42 lr: 4.637831415107045e-05 
loss: 0.2076 (0.2098) time: 2.8865 data: 0.0091 max mem: 33300 Epoch: [3] [ 890/4276] eta: 4:05:01 lr: 4.6375661106508935e-05 loss: 0.2116 (0.2098) time: 2.8594 data: 0.0088 max mem: 33300 Epoch: [3] [ 900/4276] eta: 4:03:22 lr: 4.6373008045083464e-05 loss: 0.2032 (0.2098) time: 2.8521 data: 0.0090 max mem: 33300 Epoch: [3] [ 910/4276] eta: 4:01:44 lr: 4.6370354966792876e-05 loss: 0.2092 (0.2099) time: 2.8516 data: 0.0093 max mem: 33300 Epoch: [3] [ 920/4276] eta: 4:00:09 lr: 4.636770187163597e-05 loss: 0.2092 (0.2100) time: 2.8655 data: 0.0090 max mem: 33300 Epoch: [3] [ 930/4276] eta: 3:58:35 lr: 4.6365048759611584e-05 loss: 0.2067 (0.2099) time: 2.8808 data: 0.0086 max mem: 33300 Epoch: [3] [ 940/4276] eta: 3:57:04 lr: 4.636239563071854e-05 loss: 0.1900 (0.2097) time: 2.8904 data: 0.0084 max mem: 33300 Epoch: [3] [ 950/4276] eta: 3:55:34 lr: 4.635974248495565e-05 loss: 0.1907 (0.2096) time: 2.9063 data: 0.0083 max mem: 33300 Epoch: [3] [ 960/4276] eta: 3:54:05 lr: 4.6357089322321726e-05 loss: 0.2045 (0.2097) time: 2.9100 data: 0.0081 max mem: 33300 Epoch: [3] [ 970/4276] eta: 3:52:37 lr: 4.6354436142815606e-05 loss: 0.2062 (0.2097) time: 2.8963 data: 0.0086 max mem: 33300 Epoch: [3] [ 980/4276] eta: 3:51:09 lr: 4.6351782946436095e-05 loss: 0.2198 (0.2099) time: 2.8818 data: 0.0091 max mem: 33300 Epoch: [3] [ 990/4276] eta: 3:49:43 lr: 4.634912973318202e-05 loss: 0.2002 (0.2097) time: 2.8776 data: 0.0090 max mem: 33300 Epoch: [3] [1000/4276] eta: 3:48:18 lr: 4.63464765030522e-05 loss: 0.1933 (0.2097) time: 2.8782 data: 0.0090 max mem: 33300 Epoch: [3] [1010/4276] eta: 3:46:54 lr: 4.6343823256045454e-05 loss: 0.1967 (0.2097) time: 2.8810 data: 0.0094 max mem: 33300 Epoch: [3] [1020/4276] eta: 3:45:32 lr: 4.63411699921606e-05 loss: 0.1959 (0.2096) time: 2.8893 data: 0.0095 max mem: 33300 Epoch: [3] [1030/4276] eta: 3:44:11 lr: 4.633851671139646e-05 loss: 0.1958 (0.2098) time: 2.8937 data: 0.0093 max mem: 33300 Epoch: [3] [1040/4276] eta: 3:42:51 lr: 
4.633586341375184e-05 loss: 0.2046 (0.2097) time: 2.9044 data: 0.0086 max mem: 33300 Epoch: [3] [1050/4276] eta: 3:41:32 lr: 4.633321009922558e-05 loss: 0.1951 (0.2097) time: 2.9168 data: 0.0079 max mem: 33300 Epoch: [3] [1060/4276] eta: 3:40:15 lr: 4.633055676781648e-05 loss: 0.1992 (0.2098) time: 2.9234 data: 0.0077 max mem: 33300 Epoch: [3] [1070/4276] eta: 3:38:58 lr: 4.632790341952336e-05 loss: 0.2033 (0.2098) time: 2.9242 data: 0.0075 max mem: 33300 Epoch: [3] [1080/4276] eta: 3:37:43 lr: 4.632525005434505e-05 loss: 0.2033 (0.2097) time: 2.9247 data: 0.0075 max mem: 33300 Epoch: [3] [1090/4276] eta: 3:36:28 lr: 4.632259667228036e-05 loss: 0.2087 (0.2096) time: 2.9259 data: 0.0081 max mem: 33300 Epoch: [3] [1100/4276] eta: 3:35:13 lr: 4.631994327332811e-05 loss: 0.1920 (0.2096) time: 2.9098 data: 0.0090 max mem: 33300 Epoch: [3] [1110/4276] eta: 3:33:58 lr: 4.631728985748711e-05 loss: 0.1969 (0.2097) time: 2.8851 data: 0.0093 max mem: 33300 Epoch: [3] [1120/4276] eta: 3:32:45 lr: 4.631463642475619e-05 loss: 0.2167 (0.2098) time: 2.8874 data: 0.0095 max mem: 33300 Epoch: [3] [1130/4276] eta: 3:31:33 lr: 4.631198297513417e-05 loss: 0.2054 (0.2098) time: 2.9020 data: 0.0094 max mem: 33300 Epoch: [3] [1140/4276] eta: 3:30:21 lr: 4.630932950861984e-05 loss: 0.1968 (0.2096) time: 2.8962 data: 0.0084 max mem: 33300 Epoch: [3] [1150/4276] eta: 3:29:11 lr: 4.630667602521204e-05 loss: 0.1989 (0.2096) time: 2.8953 data: 0.0077 max mem: 33300 Epoch: [3] [1160/4276] eta: 3:28:01 lr: 4.6304022524909586e-05 loss: 0.2006 (0.2097) time: 2.9017 data: 0.0075 max mem: 33300 Epoch: [3] [1170/4276] eta: 3:26:52 lr: 4.630136900771129e-05 loss: 0.2119 (0.2098) time: 2.9078 data: 0.0077 max mem: 33300 Epoch: [3] [1180/4276] eta: 3:25:43 lr: 4.629871547361597e-05 loss: 0.1973 (0.2096) time: 2.9074 data: 0.0077 max mem: 33300 Epoch: [3] [1190/4276] eta: 3:24:35 lr: 4.6296061922622444e-05 loss: 0.1861 (0.2096) time: 2.9079 data: 0.0075 max mem: 33300 Epoch: [3] [1200/4276] eta: 3:23:28 
lr: 4.6293408354729526e-05 loss: 0.1861 (0.2094) time: 2.9164 data: 0.0075 max mem: 33300 Epoch: [3] [1210/4276] eta: 3:22:22 lr: 4.629075476993603e-05 loss: 0.1933 (0.2094) time: 2.9266 data: 0.0077 max mem: 33300 Epoch: [3] [1220/4276] eta: 3:21:17 lr: 4.628810116824077e-05 loss: 0.1932 (0.2093) time: 2.9279 data: 0.0077 max mem: 33300 Epoch: [3] [1230/4276] eta: 3:20:12 lr: 4.628544754964257e-05 loss: 0.1893 (0.2092) time: 2.9285 data: 0.0076 max mem: 33300 Epoch: [3] [1240/4276] eta: 3:19:08 lr: 4.6282793914140235e-05 loss: 0.2062 (0.2093) time: 2.9288 data: 0.0076 max mem: 33300 Epoch: [3] [1250/4276] eta: 3:18:04 lr: 4.62801402617326e-05 loss: 0.2062 (0.2094) time: 2.9196 data: 0.0076 max mem: 33300 Epoch: [3] [1260/4276] eta: 3:17:00 lr: 4.627748659241845e-05 loss: 0.1928 (0.2093) time: 2.9060 data: 0.0087 max mem: 33300 Epoch: [3] [1270/4276] eta: 3:15:56 lr: 4.627483290619663e-05 loss: 0.2070 (0.2094) time: 2.8925 data: 0.0096 max mem: 33300 Epoch: [3] [1280/4276] eta: 3:14:53 lr: 4.6272179203065934e-05 loss: 0.2091 (0.2095) time: 2.8858 data: 0.0091 max mem: 33300 Epoch: [3] [1290/4276] eta: 3:13:50 lr: 4.626952548302519e-05 loss: 0.2015 (0.2096) time: 2.8827 data: 0.0091 max mem: 33300 Epoch: [3] [1300/4276] eta: 3:12:48 lr: 4.62668717460732e-05 loss: 0.1846 (0.2094) time: 2.8815 data: 0.0091 max mem: 33300 Epoch: [3] [1310/4276] eta: 3:11:47 lr: 4.6264217992208794e-05 loss: 0.1876 (0.2093) time: 2.8788 data: 0.0092 max mem: 33300 Epoch: [3] [1320/4276] eta: 3:10:46 lr: 4.626156422143077e-05 loss: 0.2081 (0.2094) time: 2.8899 data: 0.0097 max mem: 33300 Epoch: [3] [1330/4276] eta: 3:09:46 lr: 4.625891043373796e-05 loss: 0.2008 (0.2093) time: 2.9004 data: 0.0099 max mem: 33300 Epoch: [3] [1340/4276] eta: 3:08:46 lr: 4.625625662912916e-05 loss: 0.1897 (0.2092) time: 2.8911 data: 0.0093 max mem: 33300 Epoch: [3] [1350/4276] eta: 3:07:46 lr: 4.625360280760319e-05 loss: 0.2037 (0.2091) time: 2.8856 data: 0.0093 max mem: 33300 Epoch: [3] [1360/4276] eta: 
3:06:47 lr: 4.625094896915888e-05 loss: 0.2062 (0.2092) time: 2.8873 data: 0.0087 max mem: 33300 Epoch: [3] [1370/4276] eta: 3:05:49 lr: 4.6248295113795016e-05 loss: 0.1940 (0.2090) time: 2.8999 data: 0.0080 max mem: 33300 Epoch: [3] [1380/4276] eta: 3:04:51 lr: 4.624564124151043e-05 loss: 0.2034 (0.2090) time: 2.9078 data: 0.0080 max mem: 33300 Epoch: [3] [1390/4276] eta: 3:03:53 lr: 4.624298735230392e-05 loss: 0.2118 (0.2090) time: 2.8976 data: 0.0084 max mem: 33300 Epoch: [3] [1400/4276] eta: 3:02:56 lr: 4.624033344617432e-05 loss: 0.1992 (0.2091) time: 2.9102 data: 0.0085 max mem: 33300 Epoch: [3] [1410/4276] eta: 3:01:59 lr: 4.623767952312043e-05 loss: 0.2094 (0.2092) time: 2.9080 data: 0.0084 max mem: 33300 Epoch: [3] [1420/4276] eta: 3:01:03 lr: 4.623502558314106e-05 loss: 0.2114 (0.2093) time: 2.8860 data: 0.0086 max mem: 33300 Epoch: [3] [1430/4276] eta: 3:00:06 lr: 4.623237162623503e-05 loss: 0.2001 (0.2093) time: 2.8853 data: 0.0095 max mem: 33300 Epoch: [3] [1440/4276] eta: 2:59:11 lr: 4.622971765240115e-05 loss: 0.2040 (0.2096) time: 2.8999 data: 0.0100 max mem: 33300 Epoch: [3] [1450/4276] eta: 2:58:16 lr: 4.622706366163823e-05 loss: 0.2298 (0.2097) time: 2.9053 data: 0.0088 max mem: 33300 Epoch: [3] [1460/4276] eta: 2:57:21 lr: 4.622440965394508e-05 loss: 0.2097 (0.2098) time: 2.8956 data: 0.0082 max mem: 33300 Epoch: [3] [1470/4276] eta: 2:56:27 lr: 4.622175562932052e-05 loss: 0.2124 (0.2099) time: 2.9177 data: 0.0085 max mem: 33300 Epoch: [3] [1480/4276] eta: 2:55:33 lr: 4.621910158776336e-05 loss: 0.2169 (0.2100) time: 2.9429 data: 0.0082 max mem: 33300 Epoch: [3] [1490/4276] eta: 2:54:40 lr: 4.62164475292724e-05 loss: 0.2095 (0.2101) time: 2.9464 data: 0.0082 max mem: 33300 Epoch: [3] [1500/4276] eta: 2:53:48 lr: 4.621379345384646e-05 loss: 0.2061 (0.2100) time: 2.9471 data: 0.0080 max mem: 33300 Epoch: [3] [1510/4276] eta: 2:52:55 lr: 4.621113936148436e-05 loss: 0.2061 (0.2102) time: 2.9422 data: 0.0077 max mem: 33300 Epoch: [3] [1520/4276] eta: 
2:52:03 lr: 4.62084852521849e-05 loss: 0.2140 (0.2102) time: 2.9339 data: 0.0079 max mem: 33300 Epoch: [3] [1530/4276] eta: 2:51:10 lr: 4.620583112594689e-05 loss: 0.2154 (0.2103) time: 2.9133 data: 0.0088 max mem: 33300 Epoch: [3] [1540/4276] eta: 2:50:18 lr: 4.620317698276915e-05 loss: 0.2089 (0.2102) time: 2.9107 data: 0.0089 max mem: 33300 Epoch: [3] [1550/4276] eta: 2:49:27 lr: 4.620052282265048e-05 loss: 0.2147 (0.2103) time: 2.9383 data: 0.0085 max mem: 33300 Epoch: [3] [1560/4276] eta: 2:48:36 lr: 4.61978686455897e-05 loss: 0.2043 (0.2102) time: 2.9459 data: 0.0085 max mem: 33300 Epoch: [3] [1570/4276] eta: 2:47:45 lr: 4.6195214451585613e-05 loss: 0.1921 (0.2101) time: 2.9438 data: 0.0080 max mem: 33300 Epoch: [3] [1580/4276] eta: 2:46:55 lr: 4.6192560240637025e-05 loss: 0.1870 (0.2100) time: 2.9444 data: 0.0079 max mem: 33300 Epoch: [3] [1590/4276] eta: 2:46:05 lr: 4.618990601274277e-05 loss: 0.1916 (0.2100) time: 2.9425 data: 0.0079 max mem: 33300 Epoch: [3] [1600/4276] eta: 2:45:15 lr: 4.618725176790163e-05 loss: 0.1942 (0.2100) time: 2.9572 data: 0.0077 max mem: 33300 Epoch: [3] [1610/4276] eta: 2:44:26 lr: 4.618459750611243e-05 loss: 0.1891 (0.2098) time: 2.9546 data: 0.0075 max mem: 33300 Epoch: [3] [1620/4276] eta: 2:43:36 lr: 4.618194322737397e-05 loss: 0.1869 (0.2097) time: 2.9396 data: 0.0076 max mem: 33300 Epoch: [3] [1630/4276] eta: 2:42:47 lr: 4.617928893168506e-05 loss: 0.1954 (0.2098) time: 2.9413 data: 0.0078 max mem: 33300 Epoch: [3] [1640/4276] eta: 2:41:58 lr: 4.617663461904453e-05 loss: 0.2097 (0.2099) time: 2.9430 data: 0.0077 max mem: 33300 Epoch: [3] [1650/4276] eta: 2:41:09 lr: 4.6173980289451156e-05 loss: 0.2000 (0.2100) time: 2.9420 data: 0.0080 max mem: 33300 Epoch: [3] [1660/4276] eta: 2:40:21 lr: 4.617132594290377e-05 loss: 0.2000 (0.2099) time: 2.9427 data: 0.0082 max mem: 33300 Epoch: [3] [1670/4276] eta: 2:39:32 lr: 4.6168671579401176e-05 loss: 0.2012 (0.2100) time: 2.9336 data: 0.0077 max mem: 33300 Epoch: [3] [1680/4276] 
eta: 2:38:44 lr: 4.6166017198942186e-05 loss: 0.2180 (0.2101) time: 2.9263 data: 0.0076 max mem: 33300 Epoch: [3] [1690/4276] eta: 2:37:56 lr: 4.6163362801525596e-05 loss: 0.2188 (0.2100) time: 2.9354 data: 0.0075 max mem: 33300 Epoch: [3] [1700/4276] eta: 2:37:08 lr: 4.616070838715022e-05 loss: 0.2113 (0.2101) time: 2.9360 data: 0.0076 max mem: 33300 Epoch: [3] [1710/4276] eta: 2:36:20 lr: 4.6158053955814874e-05 loss: 0.2227 (0.2102) time: 2.9095 data: 0.0092 max mem: 33300 Epoch: [3] [1720/4276] eta: 2:35:32 lr: 4.6155399507518356e-05 loss: 0.2223 (0.2103) time: 2.8872 data: 0.0092 max mem: 33300 Epoch: [3] [1730/4276] eta: 2:34:45 lr: 4.6152745042259476e-05 loss: 0.2064 (0.2103) time: 2.8852 data: 0.0076 max mem: 33300 Epoch: [3] [1740/4276] eta: 2:33:57 lr: 4.615009056003704e-05 loss: 0.2037 (0.2103) time: 2.8852 data: 0.0075 max mem: 33300 Epoch: [3] [1750/4276] eta: 2:33:10 lr: 4.614743606084987e-05 loss: 0.2125 (0.2103) time: 2.8863 data: 0.0073 max mem: 33300 Epoch: [3] [1760/4276] eta: 2:32:23 lr: 4.6144781544696755e-05 loss: 0.2105 (0.2102) time: 2.8860 data: 0.0071 max mem: 33300 Epoch: [3] [1770/4276] eta: 2:31:36 lr: 4.61421270115765e-05 loss: 0.2105 (0.2102) time: 2.8841 data: 0.0071 max mem: 33300 Epoch: [3] [1780/4276] eta: 2:30:49 lr: 4.613947246148793e-05 loss: 0.2068 (0.2103) time: 2.8850 data: 0.0072 max mem: 33300 Epoch: [3] [1790/4276] eta: 2:30:02 lr: 4.613681789442984e-05 loss: 0.2039 (0.2103) time: 2.8841 data: 0.0072 max mem: 33300 Epoch: [3] [1800/4276] eta: 2:29:16 lr: 4.6134163310401046e-05 loss: 0.2039 (0.2103) time: 2.8832 data: 0.0071 max mem: 33300 Epoch: [3] [1810/4276] eta: 2:28:30 lr: 4.6131508709400345e-05 loss: 0.2263 (0.2104) time: 2.8850 data: 0.0071 max mem: 33300 Epoch: [3] [1820/4276] eta: 2:27:44 lr: 4.6128854091426547e-05 loss: 0.2086 (0.2103) time: 2.8855 data: 0.0071 max mem: 33300 Epoch: [3] [1830/4276] eta: 2:26:58 lr: 4.612619945647845e-05 loss: 0.1939 (0.2104) time: 2.8869 data: 0.0074 max mem: 33300 Epoch: [3] 
[1840/4276] eta: 2:26:13 lr: 4.6123544804554875e-05 loss: 0.1994 (0.2103) time: 2.8871 data: 0.0078 max mem: 33300 Epoch: [3] [1850/4276] eta: 2:25:27 lr: 4.612089013565461e-05 loss: 0.2068 (0.2104) time: 2.8879 data: 0.0080 max mem: 33300 Epoch: [3] [1860/4276] eta: 2:24:42 lr: 4.611823544977648e-05 loss: 0.2153 (0.2104) time: 2.8944 data: 0.0088 max mem: 33300 Epoch: [3] [1870/4276] eta: 2:23:57 lr: 4.611558074691928e-05 loss: 0.2153 (0.2105) time: 2.8976 data: 0.0094 max mem: 33300 Epoch: [3] [1880/4276] eta: 2:23:12 lr: 4.611292602708181e-05 loss: 0.2157 (0.2105) time: 2.8923 data: 0.0088 max mem: 33300 Epoch: [3] [1890/4276] eta: 2:22:28 lr: 4.611027129026289e-05 loss: 0.1986 (0.2105) time: 2.8890 data: 0.0086 max mem: 33300 Epoch: [3] [1900/4276] eta: 2:21:43 lr: 4.610761653646132e-05 loss: 0.1870 (0.2104) time: 2.8971 data: 0.0086 max mem: 33300 Epoch: [3] [1910/4276] eta: 2:20:59 lr: 4.610496176567589e-05 loss: 0.1993 (0.2104) time: 2.9105 data: 0.0079 max mem: 33300 Epoch: [3] [1920/4276] eta: 2:20:16 lr: 4.610230697790542e-05 loss: 0.1964 (0.2103) time: 2.9203 data: 0.0079 max mem: 33300 Epoch: [3] [1930/4276] eta: 2:19:32 lr: 4.609965217314872e-05 loss: 0.1964 (0.2103) time: 2.9350 data: 0.0082 max mem: 33300 Epoch: [3] [1940/4276] eta: 2:18:49 lr: 4.6096997351404575e-05 loss: 0.2127 (0.2104) time: 2.9427 data: 0.0084 max mem: 33300 Epoch: [3] [1950/4276] eta: 2:18:06 lr: 4.6094342512671806e-05 loss: 0.2089 (0.2104) time: 2.9434 data: 0.0080 max mem: 33300 Epoch: [3] [1960/4276] eta: 2:17:23 lr: 4.609168765694921e-05 loss: 0.1953 (0.2103) time: 2.9430 data: 0.0084 max mem: 33300 Epoch: [3] [1970/4276] eta: 2:16:40 lr: 4.6089032784235584e-05 loss: 0.1846 (0.2102) time: 2.9252 data: 0.0095 max mem: 33300 Epoch: [3] [1980/4276] eta: 2:15:57 lr: 4.6086377894529754e-05 loss: 0.1846 (0.2101) time: 2.9265 data: 0.0092 max mem: 33300 Epoch: [3] [1990/4276] eta: 2:15:15 lr: 4.60837229878305e-05 loss: 0.1914 (0.2101) time: 2.9420 data: 0.0083 max mem: 33300 Epoch: 
[3] [2000/4276] eta: 2:14:32 lr: 4.608106806413663e-05 loss: 0.2146 (0.2102) time: 2.9401 data: 0.0077 max mem: 33300 Epoch: [3] [2010/4276] eta: 2:13:50 lr: 4.607841312344696e-05 loss: 0.2025 (0.2101) time: 2.9382 data: 0.0073 max mem: 33300 Epoch: [3] [2020/4276] eta: 2:13:08 lr: 4.607575816576028e-05 loss: 0.2031 (0.2102) time: 2.9382 data: 0.0072 max mem: 33300 Epoch: [3] [2030/4276] eta: 2:12:26 lr: 4.60731031910754e-05 loss: 0.1965 (0.2101) time: 2.9393 data: 0.0075 max mem: 33300 Epoch: [3] [2040/4276] eta: 2:11:44 lr: 4.607044819939112e-05 loss: 0.1825 (0.2100) time: 2.9392 data: 0.0074 max mem: 33300 Epoch: [3] [2050/4276] eta: 2:11:02 lr: 4.606779319070624e-05 loss: 0.2010 (0.2101) time: 2.9387 data: 0.0071 max mem: 33300 Epoch: [3] [2060/4276] eta: 2:10:20 lr: 4.606513816501958e-05 loss: 0.2066 (0.2101) time: 2.9392 data: 0.0078 max mem: 33300 Epoch: [3] [2070/4276] eta: 2:09:39 lr: 4.6062483122329916e-05 loss: 0.2066 (0.2100) time: 2.9430 data: 0.0078 max mem: 33300 Epoch: [3] [2080/4276] eta: 2:08:57 lr: 4.605982806263606e-05 loss: 0.1990 (0.2101) time: 2.9429 data: 0.0071 max mem: 33300 Epoch: [3] [2090/4276] eta: 2:08:16 lr: 4.6057172985936817e-05 loss: 0.1956 (0.2100) time: 2.9436 data: 0.0072 max mem: 33300 Epoch: [3] [2100/4276] eta: 2:07:35 lr: 4.6054517892230995e-05 loss: 0.1909 (0.2100) time: 2.9437 data: 0.0073 max mem: 33300 Epoch: [3] [2110/4276] eta: 2:06:54 lr: 4.605186278151738e-05 loss: 0.1888 (0.2099) time: 2.9443 data: 0.0072 max mem: 33300 Epoch: [3] [2120/4276] eta: 2:06:13 lr: 4.604920765379479e-05 loss: 0.1801 (0.2098) time: 2.9443 data: 0.0073 max mem: 33300 Epoch: [3] [2130/4276] eta: 2:05:32 lr: 4.604655250906202e-05 loss: 0.1799 (0.2097) time: 2.9391 data: 0.0074 max mem: 33300 Epoch: [3] [2140/4276] eta: 2:04:51 lr: 4.6043897347317864e-05 loss: 0.2023 (0.2098) time: 2.9442 data: 0.0072 max mem: 33300 Epoch: [3] [2150/4276] eta: 2:04:10 lr: 4.604124216856113e-05 loss: 0.2023 (0.2097) time: 2.9393 data: 0.0077 max mem: 33300 
Epoch: [3] [2160/4276] eta: 2:03:30 lr: 4.603858697279061e-05 loss: 0.1997 (0.2097) time: 2.9277 data: 0.0084 max mem: 33300 Epoch: [3] [2170/4276] eta: 2:02:49 lr: 4.6035931760005125e-05 loss: 0.2076 (0.2097) time: 2.9318 data: 0.0078 max mem: 33300 Epoch: [3] [2180/4276] eta: 2:02:09 lr: 4.6033276530203464e-05 loss: 0.2092 (0.2098) time: 2.9396 data: 0.0072 max mem: 33300 Epoch: [3] [2190/4276] eta: 2:01:29 lr: 4.603062128338442e-05 loss: 0.2156 (0.2098) time: 2.9444 data: 0.0072 max mem: 33300 Epoch: [3] [2200/4276] eta: 2:00:48 lr: 4.60279660195468e-05 loss: 0.2164 (0.2098) time: 2.9443 data: 0.0072 max mem: 33300 Epoch: [3] [2210/4276] eta: 2:00:08 lr: 4.6025310738689405e-05 loss: 0.2141 (0.2099) time: 2.9424 data: 0.0072 max mem: 33300 Epoch: [3] [2220/4276] eta: 1:59:28 lr: 4.602265544081103e-05 loss: 0.2087 (0.2099) time: 2.9320 data: 0.0077 max mem: 33300 Epoch: [3] [2230/4276] eta: 1:58:48 lr: 4.602000012591048e-05 loss: 0.2026 (0.2098) time: 2.9241 data: 0.0088 max mem: 33300 Epoch: [3] [2240/4276] eta: 1:58:08 lr: 4.601734479398655e-05 loss: 0.1913 (0.2097) time: 2.9247 data: 0.0091 max mem: 33300 Epoch: [3] [2250/4276] eta: 1:57:29 lr: 4.6014689445038044e-05 loss: 0.1975 (0.2097) time: 2.9315 data: 0.0086 max mem: 33300 Epoch: [3] [2260/4276] eta: 1:56:49 lr: 4.601203407906376e-05 loss: 0.2119 (0.2098) time: 2.9399 data: 0.0080 max mem: 33300 Epoch: [3] [2270/4276] eta: 1:56:09 lr: 4.60093786960625e-05 loss: 0.2076 (0.2098) time: 2.9367 data: 0.0074 max mem: 33300 Epoch: [3] [2280/4276] eta: 1:55:30 lr: 4.600672329603305e-05 loss: 0.2065 (0.2098) time: 2.9372 data: 0.0072 max mem: 33300 Epoch: [3] [2290/4276] eta: 1:54:51 lr: 4.600406787897423e-05 loss: 0.1946 (0.2097) time: 2.9407 data: 0.0071 max mem: 33300 Epoch: [3] [2300/4276] eta: 1:54:11 lr: 4.600141244488482e-05 loss: 0.1874 (0.2096) time: 2.9433 data: 0.0072 max mem: 33300 Epoch: [3] [2310/4276] eta: 1:53:32 lr: 4.599875699376363e-05 loss: 0.1825 (0.2095) time: 2.9412 data: 0.0071 max mem: 
33300 Epoch: [3] [2320/4276] eta: 1:52:53 lr: 4.5996101525609444e-05 loss: 0.2026 (0.2095) time: 2.9436 data: 0.0071 max mem: 33300 Epoch: [3] [2330/4276] eta: 1:52:14 lr: 4.599344604042108e-05 loss: 0.2017 (0.2095) time: 2.9436 data: 0.0072 max mem: 33300 Epoch: [3] [2340/4276] eta: 1:51:35 lr: 4.5990790538197314e-05 loss: 0.1913 (0.2094) time: 2.9434 data: 0.0072 max mem: 33300 Epoch: [3] [2350/4276] eta: 1:50:57 lr: 4.598813501893696e-05 loss: 0.1914 (0.2094) time: 2.9447 data: 0.0074 max mem: 33300 Epoch: [3] [2360/4276] eta: 1:50:18 lr: 4.5985479482638816e-05 loss: 0.1939 (0.2093) time: 2.9389 data: 0.0074 max mem: 33300 Epoch: [3] [2370/4276] eta: 1:49:39 lr: 4.598282392930167e-05 loss: 0.2009 (0.2093) time: 2.9371 data: 0.0072 max mem: 33300 Epoch: [3] [2380/4276] eta: 1:49:00 lr: 4.5980168358924326e-05 loss: 0.1957 (0.2093) time: 2.9361 data: 0.0072 max mem: 33300 Epoch: [3] [2390/4276] eta: 1:48:22 lr: 4.597751277150558e-05 loss: 0.1869 (0.2092) time: 2.9359 data: 0.0072 max mem: 33300 Epoch: [3] [2400/4276] eta: 1:47:43 lr: 4.597485716704422e-05 loss: 0.1945 (0.2093) time: 2.9310 data: 0.0077 max mem: 33300 Epoch: [3] [2410/4276] eta: 1:47:05 lr: 4.597220154553906e-05 loss: 0.1969 (0.2092) time: 2.9304 data: 0.0079 max mem: 33300 Epoch: [3] [2420/4276] eta: 1:46:27 lr: 4.5969545906988884e-05 loss: 0.1840 (0.2091) time: 2.9358 data: 0.0077 max mem: 33300 Epoch: [3] [2430/4276] eta: 1:45:48 lr: 4.596689025139249e-05 loss: 0.1962 (0.2092) time: 2.9372 data: 0.0078 max mem: 33300 Epoch: [3] [2440/4276] eta: 1:45:10 lr: 4.5964234578748674e-05 loss: 0.1989 (0.2091) time: 2.9281 data: 0.0078 max mem: 33300 Epoch: [3] [2450/4276] eta: 1:44:32 lr: 4.596157888905624e-05 loss: 0.1992 (0.2091) time: 2.9301 data: 0.0080 max mem: 33300 Epoch: [3] [2460/4276] eta: 1:43:54 lr: 4.595892318231397e-05 loss: 0.2006 (0.2091) time: 2.9405 data: 0.0079 max mem: 33300 Epoch: [3] [2470/4276] eta: 1:43:16 lr: 4.5956267458520675e-05 loss: 0.2034 (0.2091) time: 2.9196 data: 0.0083 
max mem: 33300 Epoch: [3] [2480/4276] eta: 1:42:38 lr: 4.595361171767514e-05 loss: 0.2034 (0.2091) time: 2.8929 data: 0.0090 max mem: 33300 Epoch: [3] [2490/4276] eta: 1:41:59 lr: 4.595095595977617e-05 loss: 0.1989 (0.2091) time: 2.8867 data: 0.0090 max mem: 33300 Epoch: [3] [2500/4276] eta: 1:41:21 lr: 4.594830018482255e-05 loss: 0.2086 (0.2091) time: 2.8922 data: 0.0096 max mem: 33300 Epoch: [3] [2510/4276] eta: 1:40:43 lr: 4.5945644392813084e-05 loss: 0.2100 (0.2091) time: 2.8939 data: 0.0095 max mem: 33300 Epoch: [3] [2520/4276] eta: 1:40:06 lr: 4.594298858374656e-05 loss: 0.1742 (0.2090) time: 2.9150 data: 0.0095 max mem: 33300 Epoch: [3] [2530/4276] eta: 1:39:28 lr: 4.5940332757621774e-05 loss: 0.1629 (0.2088) time: 2.9463 data: 0.0088 max mem: 33300 Epoch: [3] [2540/4276] eta: 1:38:51 lr: 4.593767691443752e-05 loss: 0.1709 (0.2087) time: 2.9465 data: 0.0079 max mem: 33300 Epoch: [3] [2550/4276] eta: 1:38:14 lr: 4.59350210541926e-05 loss: 0.1906 (0.2087) time: 2.9431 data: 0.0081 max mem: 33300 Epoch: [3] [2560/4276] eta: 1:37:36 lr: 4.593236517688581e-05 loss: 0.1873 (0.2086) time: 2.9443 data: 0.0081 max mem: 33300 Epoch: [3] [2570/4276] eta: 1:36:59 lr: 4.5929709282515927e-05 loss: 0.1782 (0.2086) time: 2.9300 data: 0.0089 max mem: 33300 Epoch: [3] [2580/4276] eta: 1:36:22 lr: 4.592705337108176e-05 loss: 0.1830 (0.2085) time: 2.9303 data: 0.0097 max mem: 33300 Epoch: [3] [2590/4276] eta: 1:35:44 lr: 4.5924397442582096e-05 loss: 0.1870 (0.2085) time: 2.9414 data: 0.0091 max mem: 33300 Epoch: [3] [2600/4276] eta: 1:35:07 lr: 4.5921741497015735e-05 loss: 0.1927 (0.2085) time: 2.9408 data: 0.0085 max mem: 33300 Epoch: [3] [2610/4276] eta: 1:34:30 lr: 4.591908553438146e-05 loss: 0.2013 (0.2085) time: 2.9424 data: 0.0082 max mem: 33300 Epoch: [3] [2620/4276] eta: 1:33:53 lr: 4.5916429554678074e-05 loss: 0.1979 (0.2085) time: 2.9410 data: 0.0080 max mem: 33300 Epoch: [3] [2630/4276] eta: 1:33:17 lr: 4.591377355790437e-05 loss: 0.1930 (0.2084) time: 2.9500 data: 
0.0082 max mem: 33300 Epoch: [3] [2640/4276] eta: 1:32:40 lr: 4.591111754405914e-05 loss: 0.1770 (0.2083) time: 2.9375 data: 0.0083 max mem: 33300 Epoch: [3] [2650/4276] eta: 1:32:03 lr: 4.590846151314117e-05 loss: 0.1838 (0.2083) time: 2.9281 data: 0.0085 max mem: 33300 Epoch: [3] [2660/4276] eta: 1:31:26 lr: 4.5905805465149265e-05 loss: 0.2039 (0.2083) time: 2.9411 data: 0.0087 max mem: 33300 Epoch: [3] [2670/4276] eta: 1:30:49 lr: 4.590314940008221e-05 loss: 0.2032 (0.2082) time: 2.9382 data: 0.0080 max mem: 33300 Epoch: [3] [2680/4276] eta: 1:30:13 lr: 4.590049331793879e-05 loss: 0.1955 (0.2082) time: 2.9396 data: 0.0076 max mem: 33300 Epoch: [3] [2690/4276] eta: 1:29:36 lr: 4.589783721871781e-05 loss: 0.1968 (0.2082) time: 2.9231 data: 0.0081 max mem: 33300 Epoch: [3] [2700/4276] eta: 1:28:59 lr: 4.589518110241806e-05 loss: 0.1911 (0.2081) time: 2.9145 data: 0.0087 max mem: 33300 Epoch: [3] [2710/4276] eta: 1:28:23 lr: 4.5892524969038327e-05 loss: 0.1871 (0.2082) time: 2.9312 data: 0.0080 max mem: 33300 Epoch: [3] [2720/4276] eta: 1:27:46 lr: 4.5889868818577406e-05 loss: 0.2103 (0.2081) time: 2.9386 data: 0.0073 max mem: 33300 Epoch: [3] [2730/4276] eta: 1:27:10 lr: 4.5887212651034087e-05 loss: 0.2055 (0.2082) time: 2.9254 data: 0.0077 max mem: 33300 Epoch: [3] [2740/4276] eta: 1:26:33 lr: 4.588455646640717e-05 loss: 0.2141 (0.2082) time: 2.9104 data: 0.0084 max mem: 33300 Epoch: [3] [2750/4276] eta: 1:25:57 lr: 4.588190026469543e-05 loss: 0.2049 (0.2082) time: 2.9246 data: 0.0081 max mem: 33300 Epoch: [3] [2760/4276] eta: 1:25:21 lr: 4.5879244045897666e-05 loss: 0.1954 (0.2081) time: 2.9416 data: 0.0071 max mem: 33300 Epoch: [3] [2770/4276] eta: 1:24:45 lr: 4.587658781001268e-05 loss: 0.1842 (0.2081) time: 2.9412 data: 0.0072 max mem: 33300 Epoch: [3] [2780/4276] eta: 1:24:09 lr: 4.5873931557039244e-05 loss: 0.1961 (0.2081) time: 2.9305 data: 0.0073 max mem: 33300 Epoch: [3] [2790/4276] eta: 1:23:33 lr: 4.587127528697615e-05 loss: 0.2118 (0.2081) time: 2.9316 
data: 0.0080 max mem: 33300 Epoch: [3] [2800/4276] eta: 1:22:56 lr: 4.586861899982221e-05 loss: 0.2006 (0.2081) time: 2.9386 data: 0.0078 max mem: 33300 Epoch: [3] [2810/4276] eta: 1:22:21 lr: 4.586596269557619e-05 loss: 0.1934 (0.2080) time: 2.9438 data: 0.0072 max mem: 33300 Epoch: [3] [2820/4276] eta: 1:21:45 lr: 4.58633063742369e-05 loss: 0.1911 (0.2080) time: 2.9484 data: 0.0073 max mem: 33300 Epoch: [3] [2830/4276] eta: 1:21:09 lr: 4.5860650035803116e-05 loss: 0.1910 (0.2079) time: 2.9489 data: 0.0073 max mem: 33300 Epoch: [3] [2840/4276] eta: 1:20:33 lr: 4.585799368027363e-05 loss: 0.1955 (0.2079) time: 2.9468 data: 0.0073 max mem: 33300 Epoch: [3] [2850/4276] eta: 1:19:57 lr: 4.585533730764723e-05 loss: 0.2124 (0.2080) time: 2.9421 data: 0.0080 max mem: 33300 Epoch: [3] [2860/4276] eta: 1:19:21 lr: 4.585268091792271e-05 loss: 0.2105 (0.2080) time: 2.9414 data: 0.0081 max mem: 33300 Epoch: [3] [2870/4276] eta: 1:18:46 lr: 4.585002451109886e-05 loss: 0.2061 (0.2080) time: 2.9485 data: 0.0079 max mem: 33300 Epoch: [3] [2880/4276] eta: 1:18:10 lr: 4.584736808717447e-05 loss: 0.2068 (0.2081) time: 2.9411 data: 0.0084 max mem: 33300 Epoch: [3] [2890/4276] eta: 1:17:35 lr: 4.5844711646148325e-05 loss: 0.2130 (0.2081) time: 2.9356 data: 0.0086 max mem: 33300 Epoch: [3] [2900/4276] eta: 1:16:59 lr: 4.584205518801921e-05 loss: 0.2011 (0.2080) time: 2.9421 data: 0.0085 max mem: 33300 Epoch: [3] [2910/4276] eta: 1:16:23 lr: 4.583939871278593e-05 loss: 0.1841 (0.2080) time: 2.9161 data: 0.0091 max mem: 33300 Epoch: [3] [2920/4276] eta: 1:15:48 lr: 4.583674222044726e-05 loss: 0.1932 (0.2080) time: 2.8949 data: 0.0094 max mem: 33300 Epoch: [3] [2930/4276] eta: 1:15:12 lr: 4.5834085711001985e-05 loss: 0.1876 (0.2080) time: 2.8939 data: 0.0087 max mem: 33300 Epoch: [3] [2940/4276] eta: 1:14:36 lr: 4.58314291844489e-05 loss: 0.1845 (0.2079) time: 2.9095 data: 0.0084 max mem: 33300 Epoch: [3] [2950/4276] eta: 1:14:01 lr: 4.5828772640786795e-05 loss: 0.1764 (0.2079) time: 
2.9343 data: 0.0080 max mem: 33300 Epoch: [3] [2960/4276] eta: 1:13:26 lr: 4.5826116080014454e-05 loss: 0.2115 (0.2079) time: 2.9411 data: 0.0078 max mem: 33300 Epoch: [3] [2970/4276] eta: 1:12:50 lr: 4.582345950213066e-05 loss: 0.2115 (0.2080) time: 2.9412 data: 0.0080 max mem: 33300 Epoch: [3] [2980/4276] eta: 1:12:15 lr: 4.5820802907134216e-05 loss: 0.2103 (0.2080) time: 2.9409 data: 0.0078 max mem: 33300 Epoch: [3] [2990/4276] eta: 1:11:40 lr: 4.581814629502389e-05 loss: 0.1975 (0.2079) time: 2.9404 data: 0.0075 max mem: 33300 Epoch: [3] [3000/4276] eta: 1:11:05 lr: 4.581548966579848e-05 loss: 0.1879 (0.2078) time: 2.9402 data: 0.0076 max mem: 33300 Epoch: [3] [3010/4276] eta: 1:10:30 lr: 4.581283301945678e-05 loss: 0.1947 (0.2078) time: 2.9411 data: 0.0078 max mem: 33300 Epoch: [3] [3020/4276] eta: 1:09:55 lr: 4.581017635599756e-05 loss: 0.1950 (0.2078) time: 2.9448 data: 0.0085 max mem: 33300 Epoch: [3] [3030/4276] eta: 1:09:20 lr: 4.580751967541962e-05 loss: 0.1892 (0.2078) time: 2.9462 data: 0.0088 max mem: 33300 Epoch: [3] [3040/4276] eta: 1:08:45 lr: 4.5804862977721744e-05 loss: 0.2053 (0.2079) time: 2.9200 data: 0.0085 max mem: 33300 Epoch: [3] [3050/4276] eta: 1:08:09 lr: 4.5802206262902706e-05 loss: 0.2104 (0.2079) time: 2.8926 data: 0.0085 max mem: 33300 Epoch: [3] [3060/4276] eta: 1:07:34 lr: 4.5799549530961304e-05 loss: 0.1801 (0.2078) time: 2.8908 data: 0.0084 max mem: 33300 Epoch: [3] [3070/4276] eta: 1:06:59 lr: 4.579689278189633e-05 loss: 0.1805 (0.2078) time: 2.8899 data: 0.0078 max mem: 33300 Epoch: [3] [3080/4276] eta: 1:06:24 lr: 4.579423601570656e-05 loss: 0.1939 (0.2077) time: 2.8887 data: 0.0075 max mem: 33300 Epoch: [3] [3090/4276] eta: 1:05:49 lr: 4.579157923239078e-05 loss: 0.1834 (0.2077) time: 2.8983 data: 0.0081 max mem: 33300 Epoch: [3] [3100/4276] eta: 1:05:14 lr: 4.578892243194778e-05 loss: 0.1948 (0.2077) time: 2.9066 data: 0.0088 max mem: 33300 Epoch: [3] [3110/4276] eta: 1:04:39 lr: 4.5786265614376344e-05 loss: 0.1948 (0.2076) 
time: 2.8982 data: 0.0088 max mem: 33300 Epoch: [3] [3120/4276] eta: 1:04:04 lr: 4.578360877967525e-05 loss: 0.1789 (0.2076) time: 2.8893 data: 0.0083 max mem: 33300 Epoch: [3] [3130/4276] eta: 1:03:29 lr: 4.578095192784329e-05 loss: 0.2010 (0.2076) time: 2.8830 data: 0.0079 max mem: 33300 Epoch: [3] [3140/4276] eta: 1:02:55 lr: 4.5778295058879254e-05 loss: 0.2050 (0.2076) time: 2.8714 data: 0.0080 max mem: 33300 Epoch: [3] [3150/4276] eta: 1:02:20 lr: 4.577563817278192e-05 loss: 0.2054 (0.2077) time: 2.8726 data: 0.0082 max mem: 33300 Epoch: [3] [3160/4276] eta: 1:01:45 lr: 4.577298126955007e-05 loss: 0.2001 (0.2077) time: 2.8830 data: 0.0080 max mem: 33300 Epoch: [3] [3170/4276] eta: 1:01:10 lr: 4.577032434918249e-05 loss: 0.1799 (0.2077) time: 2.9052 data: 0.0080 max mem: 33300 Epoch: [3] [3180/4276] eta: 1:00:36 lr: 4.5767667411677964e-05 loss: 0.2035 (0.2077) time: 2.9150 data: 0.0083 max mem: 33300 Epoch: [3] [3190/4276] eta: 1:00:01 lr: 4.576501045703529e-05 loss: 0.2080 (0.2078) time: 2.9074 data: 0.0084 max mem: 33300 Epoch: [3] [3200/4276] eta: 0:59:27 lr: 4.576235348525323e-05 loss: 0.1884 (0.2078) time: 2.9036 data: 0.0086 max mem: 33300 Epoch: [3] [3210/4276] eta: 0:58:52 lr: 4.575969649633058e-05 loss: 0.1920 (0.2078) time: 2.8901 data: 0.0081 max mem: 33300 Epoch: [3] [3220/4276] eta: 0:58:18 lr: 4.575703949026612e-05 loss: 0.1994 (0.2077) time: 2.8829 data: 0.0078 max mem: 33300 Epoch: [3] [3230/4276] eta: 0:57:43 lr: 4.575438246705864e-05 loss: 0.2002 (0.2077) time: 2.8829 data: 0.0081 max mem: 33300 Epoch: [3] [3240/4276] eta: 0:57:09 lr: 4.575172542670691e-05 loss: 0.2034 (0.2077) time: 2.8821 data: 0.0079 max mem: 33300 Epoch: [3] [3250/4276] eta: 0:56:34 lr: 4.5749068369209727e-05 loss: 0.2028 (0.2077) time: 2.8843 data: 0.0080 max mem: 33300 Epoch: [3] [3260/4276] eta: 0:56:00 lr: 4.574641129456587e-05 loss: 0.1973 (0.2077) time: 2.8872 data: 0.0084 max mem: 33300 Epoch: [3] [3270/4276] eta: 0:55:25 lr: 4.574375420277411e-05 loss: 0.2105 
(0.2077) time: 2.8867 data: 0.0086 max mem: 33300 Epoch: [3] [3280/4276] eta: 0:54:51 lr: 4.5741097093833246e-05 loss: 0.2128 (0.2077) time: 2.8870 data: 0.0087 max mem: 33300 Epoch: [3] [3290/4276] eta: 0:54:17 lr: 4.573843996774205e-05 loss: 0.2134 (0.2078) time: 2.8873 data: 0.0089 max mem: 33300 Epoch: [3] [3300/4276] eta: 0:53:42 lr: 4.573578282449931e-05 loss: 0.2122 (0.2078) time: 2.8873 data: 0.0091 max mem: 33300 Epoch: [3] [3310/4276] eta: 0:53:08 lr: 4.573312566410381e-05 loss: 0.2069 (0.2078) time: 2.8873 data: 0.0091 max mem: 33300 Epoch: [3] [3320/4276] eta: 0:52:34 lr: 4.5730468486554323e-05 loss: 0.2030 (0.2079) time: 2.8929 data: 0.0090 max mem: 33300 Epoch: [3] [3330/4276] eta: 0:52:00 lr: 4.572781129184964e-05 loss: 0.1963 (0.2078) time: 2.9100 data: 0.0092 max mem: 33300 Epoch: [3] [3340/4276] eta: 0:51:26 lr: 4.5725154079988535e-05 loss: 0.2062 (0.2079) time: 2.9132 data: 0.0090 max mem: 33300 Epoch: [3] [3350/4276] eta: 0:50:52 lr: 4.572249685096979e-05 loss: 0.1989 (0.2078) time: 2.9209 data: 0.0087 max mem: 33300 Epoch: [3] [3360/4276] eta: 0:50:18 lr: 4.571983960479219e-05 loss: 0.1934 (0.2078) time: 2.9393 data: 0.0087 max mem: 33300 Epoch: [3] [3370/4276] eta: 0:49:44 lr: 4.571718234145452e-05 loss: 0.2195 (0.2079) time: 2.9391 data: 0.0078 max mem: 33300 Epoch: [3] [3380/4276] eta: 0:49:10 lr: 4.571452506095555e-05 loss: 0.2108 (0.2079) time: 2.9359 data: 0.0071 max mem: 33300 Epoch: [3] [3390/4276] eta: 0:48:36 lr: 4.571186776329407e-05 loss: 0.2108 (0.2080) time: 2.9368 data: 0.0072 max mem: 33300 Epoch: [3] [3400/4276] eta: 0:48:02 lr: 4.570921044846886e-05 loss: 0.2119 (0.2080) time: 2.9368 data: 0.0075 max mem: 33300 Epoch: [3] [3410/4276] eta: 0:47:29 lr: 4.57065531164787e-05 loss: 0.2119 (0.2080) time: 2.9359 data: 0.0076 max mem: 33300 Epoch: [3] [3420/4276] eta: 0:46:55 lr: 4.5703895767322364e-05 loss: 0.2149 (0.2080) time: 2.9368 data: 0.0077 max mem: 33300 Epoch: [3] [3430/4276] eta: 0:46:21 lr: 4.570123840099863e-05 loss: 
0.2050 (0.2080) time: 2.9363 data: 0.0077 max mem: 33300 Epoch: [3] [3440/4276] eta: 0:45:47 lr: 4.569858101750629e-05 loss: 0.2050 (0.2080) time: 2.9353 data: 0.0073 max mem: 33300 Epoch: [3] [3450/4276] eta: 0:45:14 lr: 4.569592361684412e-05 loss: 0.2060 (0.2080) time: 2.9344 data: 0.0072 max mem: 33300 Epoch: [3] [3460/4276] eta: 0:44:40 lr: 4.56932661990109e-05 loss: 0.2060 (0.2080) time: 2.9334 data: 0.0074 max mem: 33300 Epoch: [3] [3470/4276] eta: 0:44:06 lr: 4.5690608764005397e-05 loss: 0.1957 (0.2079) time: 2.9330 data: 0.0074 max mem: 33300 Epoch: [3] [3480/4276] eta: 0:43:33 lr: 4.5687951311826404e-05 loss: 0.1977 (0.2079) time: 2.9340 data: 0.0075 max mem: 33300 Epoch: [3] [3490/4276] eta: 0:42:59 lr: 4.56852938424727e-05 loss: 0.2025 (0.2079) time: 2.9318 data: 0.0079 max mem: 33300 Epoch: [3] [3500/4276] eta: 0:42:25 lr: 4.568263635594306e-05 loss: 0.1990 (0.2079) time: 2.9098 data: 0.0089 max mem: 33300 Epoch: [3] [3510/4276] eta: 0:41:52 lr: 4.567997885223626e-05 loss: 0.1857 (0.2078) time: 2.8941 data: 0.0094 max mem: 33300 Epoch: [3] [3520/4276] eta: 0:41:18 lr: 4.567732133135108e-05 loss: 0.1940 (0.2078) time: 2.9170 data: 0.0082 max mem: 33300 Epoch: [3] [3530/4276] eta: 0:40:45 lr: 4.5674663793286305e-05 loss: 0.1992 (0.2078) time: 2.9368 data: 0.0076 max mem: 33300 Epoch: [3] [3540/4276] eta: 0:40:11 lr: 4.56720062380407e-05 loss: 0.1938 (0.2078) time: 2.9360 data: 0.0077 max mem: 33300 Epoch: [3] [3550/4276] eta: 0:39:38 lr: 4.5669348665613064e-05 loss: 0.2155 (0.2078) time: 2.9362 data: 0.0076 max mem: 33300 Epoch: [3] [3560/4276] eta: 0:39:04 lr: 4.5666691076002155e-05 loss: 0.2077 (0.2078) time: 2.9373 data: 0.0078 max mem: 33300 Epoch: [3] [3570/4276] eta: 0:38:31 lr: 4.5664033469206765e-05 loss: 0.2095 (0.2078) time: 2.9367 data: 0.0080 max mem: 33300 Epoch: [3] [3580/4276] eta: 0:37:57 lr: 4.566137584522566e-05 loss: 0.1833 (0.2078) time: 2.9370 data: 0.0077 max mem: 33300 Epoch: [3] [3590/4276] eta: 0:37:24 lr: 4.565871820405762e-05 
loss: 0.1806 (0.2078) time: 2.9381 data: 0.0075 max mem: 33300 Epoch: [3] [3600/4276] eta: 0:36:51 lr: 4.5656060545701426e-05 loss: 0.2181 (0.2078) time: 2.9385 data: 0.0076 max mem: 33300 Epoch: [3] [3610/4276] eta: 0:36:17 lr: 4.565340287015586e-05 loss: 0.2158 (0.2078) time: 2.9435 data: 0.0077 max mem: 33300 Epoch: [3] [3620/4276] eta: 0:35:44 lr: 4.565074517741969e-05 loss: 0.1953 (0.2077) time: 2.9357 data: 0.0078 max mem: 33300 Epoch: [3] [3630/4276] eta: 0:35:11 lr: 4.5648087467491685e-05 loss: 0.2023 (0.2078) time: 2.9370 data: 0.0078 max mem: 33300 Epoch: [3] [3640/4276] eta: 0:34:38 lr: 4.5645429740370636e-05 loss: 0.2023 (0.2078) time: 2.9447 data: 0.0077 max mem: 33300 Epoch: [3] [3650/4276] eta: 0:34:04 lr: 4.5642771996055326e-05 loss: 0.1963 (0.2078) time: 2.9421 data: 0.0080 max mem: 33300 Epoch: [3] [3660/4276] eta: 0:33:31 lr: 4.564011423454451e-05 loss: 0.1984 (0.2078) time: 2.9420 data: 0.0080 max mem: 33300 Epoch: [3] [3670/4276] eta: 0:32:58 lr: 4.5637456455836985e-05 loss: 0.2159 (0.2078) time: 2.9333 data: 0.0077 max mem: 33300 Epoch: [3] [3680/4276] eta: 0:32:25 lr: 4.5634798659931514e-05 loss: 0.2159 (0.2078) time: 2.9385 data: 0.0084 max mem: 33300 Epoch: [3] [3690/4276] eta: 0:31:52 lr: 4.563214084682687e-05 loss: 0.2119 (0.2078) time: 2.9443 data: 0.0087 max mem: 33300 Epoch: [3] [3700/4276] eta: 0:31:18 lr: 4.562948301652184e-05 loss: 0.2030 (0.2079) time: 2.9383 data: 0.0080 max mem: 33300 Epoch: [3] [3710/4276] eta: 0:30:45 lr: 4.562682516901519e-05 loss: 0.2004 (0.2078) time: 2.9356 data: 0.0076 max mem: 33300 Epoch: [3] [3720/4276] eta: 0:30:12 lr: 4.56241673043057e-05 loss: 0.1901 (0.2078) time: 2.9345 data: 0.0078 max mem: 33300 Epoch: [3] [3730/4276] eta: 0:29:39 lr: 4.562150942239215e-05 loss: 0.1982 (0.2078) time: 2.9360 data: 0.0082 max mem: 33300 Epoch: [3] [3740/4276] eta: 0:29:06 lr: 4.56188515232733e-05 loss: 0.2011 (0.2078) time: 2.9562 data: 0.0083 max mem: 33300 Epoch: [3] [3750/4276] eta: 0:28:33 lr: 
4.5616193606947936e-05 loss: 0.2031 (0.2078) time: 2.9560 data: 0.0084 max mem: 33300 Epoch: [3] [3760/4276] eta: 0:28:00 lr: 4.5613535673414844e-05 loss: 0.1998 (0.2078) time: 2.9195 data: 0.0084 max mem: 33300 Epoch: [3] [3770/4276] eta: 0:27:27 lr: 4.561087772267277e-05 loss: 0.1998 (0.2078) time: 2.9034 data: 0.0077 max mem: 33300 Epoch: [3] [3780/4276] eta: 0:26:54 lr: 4.5608219754720505e-05 loss: 0.2088 (0.2078) time: 2.9198 data: 0.0074 max mem: 33300 Epoch: [3] [3790/4276] eta: 0:26:21 lr: 4.5605561769556825e-05 loss: 0.2056 (0.2078) time: 2.9350 data: 0.0074 max mem: 33300 Epoch: [3] [3800/4276] eta: 0:25:48 lr: 4.56029037671805e-05 loss: 0.2086 (0.2078) time: 2.9358 data: 0.0074 max mem: 33300 Epoch: [3] [3810/4276] eta: 0:25:15 lr: 4.560024574759031e-05 loss: 0.1969 (0.2078) time: 2.9350 data: 0.0073 max mem: 33300 Epoch: [3] [3820/4276] eta: 0:24:42 lr: 4.559758771078501e-05 loss: 0.1906 (0.2078) time: 2.9358 data: 0.0074 max mem: 33300 Epoch: [3] [3830/4276] eta: 0:24:09 lr: 4.55949296567634e-05 loss: 0.1992 (0.2078) time: 2.9366 data: 0.0074 max mem: 33300 Epoch: [3] [3840/4276] eta: 0:23:37 lr: 4.559227158552423e-05 loss: 0.2085 (0.2078) time: 2.9366 data: 0.0075 max mem: 33300 Epoch: [3] [3850/4276] eta: 0:23:04 lr: 4.558961349706629e-05 loss: 0.1983 (0.2078) time: 2.9360 data: 0.0075 max mem: 33300 Epoch: [3] [3860/4276] eta: 0:22:31 lr: 4.5586955391388344e-05 loss: 0.2000 (0.2078) time: 2.9361 data: 0.0072 max mem: 33300 Epoch: [3] [3870/4276] eta: 0:21:58 lr: 4.558429726848916e-05 loss: 0.2053 (0.2078) time: 2.9375 data: 0.0072 max mem: 33300 Epoch: [3] [3880/4276] eta: 0:21:25 lr: 4.558163912836753e-05 loss: 0.2018 (0.2078) time: 2.9403 data: 0.0072 max mem: 33300 Epoch: [3] [3890/4276] eta: 0:20:53 lr: 4.55789809710222e-05 loss: 0.2013 (0.2078) time: 2.9400 data: 0.0072 max mem: 33300 Epoch: [3] [3900/4276] eta: 0:20:20 lr: 4.557632279645196e-05 loss: 0.2041 (0.2078) time: 2.9406 data: 0.0073 max mem: 33300 Epoch: [3] [3910/4276] eta: 0:19:47 
lr: 4.557366460465558e-05 loss: 0.1963 (0.2078) time: 2.9399 data: 0.0072 max mem: 33300 Epoch: [3] [3920/4276] eta: 0:19:14 lr: 4.5571006395631835e-05 loss: 0.1947 (0.2078) time: 2.9400 data: 0.0071 max mem: 33300 Epoch: [3] [3930/4276] eta: 0:18:42 lr: 4.5568348169379484e-05 loss: 0.2035 (0.2077) time: 2.9390 data: 0.0071 max mem: 33300 Epoch: [3] [3940/4276] eta: 0:18:09 lr: 4.5565689925897305e-05 loss: 0.2015 (0.2077) time: 2.9377 data: 0.0072 max mem: 33300 Epoch: [3] [3950/4276] eta: 0:17:36 lr: 4.5563031665184076e-05 loss: 0.1931 (0.2077) time: 2.9386 data: 0.0074 max mem: 33300 Epoch: [3] [3960/4276] eta: 0:17:04 lr: 4.5560373387238565e-05 loss: 0.1986 (0.2077) time: 2.9277 data: 0.0084 max mem: 33300 Epoch: [3] [3970/4276] eta: 0:16:31 lr: 4.555771509205954e-05 loss: 0.2246 (0.2077) time: 2.9270 data: 0.0084 max mem: 33300 Epoch: [3] [3980/4276] eta: 0:15:58 lr: 4.5555056779645764e-05 loss: 0.2047 (0.2077) time: 2.9150 data: 0.0084 max mem: 33300 Epoch: [3] [3990/4276] eta: 0:15:26 lr: 4.5552398449996034e-05 loss: 0.2016 (0.2077) time: 2.8890 data: 0.0086 max mem: 33300 Epoch: [3] [4000/4276] eta: 0:14:53 lr: 4.554974010310909e-05 loss: 0.1831 (0.2077) time: 2.8982 data: 0.0080 max mem: 33300 Epoch: [3] [4010/4276] eta: 0:14:20 lr: 4.554708173898372e-05 loss: 0.1851 (0.2077) time: 2.9268 data: 0.0079 max mem: 33300 Epoch: [3] [4020/4276] eta: 0:13:48 lr: 4.5544423357618686e-05 loss: 0.1998 (0.2077) time: 2.9443 data: 0.0073 max mem: 33300 Epoch: [3] [4030/4276] eta: 0:13:15 lr: 4.554176495901277e-05 loss: 0.2065 (0.2077) time: 2.9440 data: 0.0072 max mem: 33300 Epoch: [3] [4040/4276] eta: 0:12:43 lr: 4.553910654316473e-05 loss: 0.2071 (0.2077) time: 2.9352 data: 0.0073 max mem: 33300 Epoch: [3] [4050/4276] eta: 0:12:10 lr: 4.5536448110073346e-05 loss: 0.1877 (0.2077) time: 2.9219 data: 0.0082 max mem: 33300 Epoch: [3] [4060/4276] eta: 0:11:38 lr: 4.553378965973737e-05 loss: 0.1877 (0.2077) time: 2.8979 data: 0.0088 max mem: 33300 Epoch: [3] [4070/4276] 
eta: 0:11:05 lr: 4.5531131192155604e-05 loss: 0.2049 (0.2077) time: 2.8893 data: 0.0090 max mem: 33300 Epoch: [3] [4080/4276] eta: 0:10:33 lr: 4.552847270732678e-05 loss: 0.2044 (0.2077) time: 2.9011 data: 0.0091 max mem: 33300 Epoch: [3] [4090/4276] eta: 0:10:00 lr: 4.552581420524969e-05 loss: 0.2044 (0.2077) time: 2.8980 data: 0.0090 max mem: 33300 Epoch: [3] [4100/4276] eta: 0:09:28 lr: 4.552315568592309e-05 loss: 0.2077 (0.2077) time: 2.8872 data: 0.0089 max mem: 33300 Epoch: [3] [4110/4276] eta: 0:08:55 lr: 4.552049714934576e-05 loss: 0.2102 (0.2077) time: 2.8875 data: 0.0090 max mem: 33300 Epoch: [3] [4120/4276] eta: 0:08:23 lr: 4.5517838595516465e-05 loss: 0.2066 (0.2077) time: 2.8887 data: 0.0095 max mem: 33300 Epoch: [3] [4130/4276] eta: 0:07:51 lr: 4.551518002443397e-05 loss: 0.2012 (0.2077) time: 2.8895 data: 0.0095 max mem: 33300 Epoch: [3] [4140/4276] eta: 0:07:18 lr: 4.5512521436097047e-05 loss: 0.1924 (0.2077) time: 2.8895 data: 0.0095 max mem: 33300 Epoch: [3] [4150/4276] eta: 0:06:46 lr: 4.550986283050447e-05 loss: 0.1904 (0.2077) time: 2.8895 data: 0.0099 max mem: 33300 Epoch: [3] [4160/4276] eta: 0:06:14 lr: 4.5507204207654994e-05 loss: 0.1904 (0.2077) time: 2.8896 data: 0.0102 max mem: 33300 Epoch: [3] [4170/4276] eta: 0:05:41 lr: 4.550454556754739e-05 loss: 0.2184 (0.2077) time: 2.8899 data: 0.0100 max mem: 33300 Epoch: [3] [4180/4276] eta: 0:05:09 lr: 4.5501886910180426e-05 loss: 0.2048 (0.2077) time: 2.8894 data: 0.0098 max mem: 33300 Epoch: [3] [4190/4276] eta: 0:04:37 lr: 4.549922823555288e-05 loss: 0.1954 (0.2077) time: 2.8886 data: 0.0099 max mem: 33300 Epoch: [3] [4200/4276] eta: 0:04:04 lr: 4.5496569543663506e-05 loss: 0.2003 (0.2078) time: 2.8908 data: 0.0102 max mem: 33300 Epoch: [3] [4210/4276] eta: 0:03:32 lr: 4.549391083451107e-05 loss: 0.2128 (0.2078) time: 2.9024 data: 0.0096 max mem: 33300 Epoch: [3] [4220/4276] eta: 0:03:00 lr: 4.549125210809435e-05 loss: 0.2140 (0.2078) time: 2.9278 data: 0.0081 max mem: 33300 Epoch: [3] 
[4230/4276] eta: 0:02:28 lr: 4.548859336441211e-05 loss: 0.2345 (0.2079) time: 2.9421 data: 0.0073 max mem: 33300 Epoch: [3] [4240/4276] eta: 0:01:55 lr: 4.548593460346312e-05 loss: 0.2225 (0.2079) time: 2.9411 data: 0.0075 max mem: 33300 Epoch: [3] [4250/4276] eta: 0:01:23 lr: 4.5483275825246125e-05 loss: 0.2051 (0.2079) time: 2.9507 data: 0.0077 max mem: 33300 Epoch: [3] [4260/4276] eta: 0:00:51 lr: 4.548061702975991e-05 loss: 0.2158 (0.2080) time: 2.9501 data: 0.0082 max mem: 33300 Epoch: [3] [4270/4276] eta: 0:00:19 lr: 4.5477958217003254e-05 loss: 0.2126 (0.2080) time: 2.9358 data: 0.0077 max mem: 33300 Epoch: [3] Total time: 3:49:13 Test: [ 0/21770] eta: 11:04:26 time: 1.8313 data: 1.7931 max mem: 33300 Test: [ 100/21770] eta: 0:20:54 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 200/21770] eta: 0:17:23 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 300/21770] eta: 0:16:09 time: 0.0387 data: 0.0008 max mem: 33300 Test: [ 400/21770] eta: 0:15:31 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 500/21770] eta: 0:15:07 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 600/21770] eta: 0:14:50 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 700/21770] eta: 0:14:36 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 800/21770] eta: 0:14:25 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 900/21770] eta: 0:14:16 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 1000/21770] eta: 0:14:08 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 1100/21770] eta: 0:14:01 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 1200/21770] eta: 0:13:54 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 1300/21770] eta: 0:13:48 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 1400/21770] eta: 0:13:43 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 1500/21770] eta: 0:13:37 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 1600/21770] eta: 0:13:31 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 1700/21770] eta: 0:13:26 time: 0.0387 data: 0.0008 max mem: 33300 Test: [ 1800/21770] eta: 
0:13:20 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 1900/21770] eta: 0:13:15 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 2000/21770] eta: 0:13:10 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 2100/21770] eta: 0:13:05 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 2200/21770] eta: 0:13:00 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 2300/21770] eta: 0:12:55 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 2400/21770] eta: 0:12:50 time: 0.0394 data: 0.0009 max mem: 33300 Test: [ 2500/21770] eta: 0:12:46 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 2600/21770] eta: 0:12:41 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 2700/21770] eta: 0:12:37 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 2800/21770] eta: 0:12:33 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 2900/21770] eta: 0:12:29 time: 0.0395 data: 0.0009 max mem: 33300 Test: [ 3000/21770] eta: 0:12:25 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3100/21770] eta: 0:12:21 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 3200/21770] eta: 0:12:17 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 3300/21770] eta: 0:12:13 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 3400/21770] eta: 0:12:09 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 3500/21770] eta: 0:12:05 time: 0.0402 data: 0.0008 max mem: 33300 Test: [ 3600/21770] eta: 0:12:02 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 3700/21770] eta: 0:11:58 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3800/21770] eta: 0:11:54 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 3900/21770] eta: 0:11:50 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 4000/21770] eta: 0:11:46 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 4100/21770] eta: 0:11:42 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 4200/21770] eta: 0:11:38 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 4300/21770] eta: 0:11:34 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 4400/21770] eta: 0:11:31 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 4500/21770] eta: 
0:11:27 time: 0.0402 data: 0.0009 max mem: 33300 Test: [ 4600/21770] eta: 0:11:23 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 4700/21770] eta: 0:11:19 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 4800/21770] eta: 0:11:15 time: 0.0398 data: 0.0009 max mem: 33300 Test: [ 4900/21770] eta: 0:11:11 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5000/21770] eta: 0:11:07 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 5100/21770] eta: 0:11:03 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 5200/21770] eta: 0:10:59 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 5300/21770] eta: 0:10:55 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 5400/21770] eta: 0:10:51 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 5500/21770] eta: 0:10:47 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 5600/21770] eta: 0:10:43 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 5700/21770] eta: 0:10:39 time: 0.0396 data: 0.0009 max mem: 33300 Test: [ 5800/21770] eta: 0:10:36 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 5900/21770] eta: 0:10:32 time: 0.0396 data: 0.0009 max mem: 33300 Test: [ 6000/21770] eta: 0:10:28 time: 0.0395 data: 0.0009 max mem: 33300 Test: [ 6100/21770] eta: 0:10:24 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 6200/21770] eta: 0:10:20 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 6300/21770] eta: 0:10:15 time: 0.0401 data: 0.0009 max mem: 33300 Test: [ 6400/21770] eta: 0:10:12 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 6500/21770] eta: 0:10:08 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 6600/21770] eta: 0:10:04 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 6700/21770] eta: 0:10:00 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 6800/21770] eta: 0:09:56 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 6900/21770] eta: 0:09:52 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 7000/21770] eta: 0:09:48 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 7100/21770] eta: 0:09:44 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 7200/21770] eta: 
0:09:40 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 7300/21770] eta: 0:09:36 time: 0.0401 data: 0.0009 max mem: 33300 Test: [ 7400/21770] eta: 0:09:32 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 7500/21770] eta: 0:09:28 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 7600/21770] eta: 0:09:24 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 7700/21770] eta: 0:09:20 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 7800/21770] eta: 0:09:16 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 7900/21770] eta: 0:09:12 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 8000/21770] eta: 0:09:08 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 8100/21770] eta: 0:09:04 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 8200/21770] eta: 0:09:00 time: 0.0397 data: 0.0008 max mem: 33300 Test: [ 8300/21770] eta: 0:08:56 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 8400/21770] eta: 0:08:52 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 8500/21770] eta: 0:08:48 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 8600/21770] eta: 0:08:44 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 8700/21770] eta: 0:08:40 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 8800/21770] eta: 0:08:36 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 8900/21770] eta: 0:08:32 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 9000/21770] eta: 0:08:28 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 9100/21770] eta: 0:08:24 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 9200/21770] eta: 0:08:20 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 9300/21770] eta: 0:08:16 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 9400/21770] eta: 0:08:12 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 9500/21770] eta: 0:08:09 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 9600/21770] eta: 0:08:05 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 9700/21770] eta: 0:08:01 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 9800/21770] eta: 0:07:57 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 9900/21770] eta: 
0:07:53 time: 0.0401 data: 0.0008 max mem: 33300 Test: [10000/21770] eta: 0:07:49 time: 0.0400 data: 0.0008 max mem: 33300 Test: [10100/21770] eta: 0:07:45 time: 0.0401 data: 0.0008 max mem: 33300 Test: [10200/21770] eta: 0:07:41 time: 0.0400 data: 0.0008 max mem: 33300 Test: [10300/21770] eta: 0:07:37 time: 0.0393 data: 0.0008 max mem: 33300 Test: [10400/21770] eta: 0:07:33 time: 0.0378 data: 0.0009 max mem: 33300 Test: [10500/21770] eta: 0:07:28 time: 0.0378 data: 0.0009 max mem: 33300 Test: [10600/21770] eta: 0:07:24 time: 0.0378 data: 0.0008 max mem: 33300 Test: [10700/21770] eta: 0:07:20 time: 0.0378 data: 0.0009 max mem: 33300 Test: [10800/21770] eta: 0:07:16 time: 0.0377 data: 0.0009 max mem: 33300 Test: [10900/21770] eta: 0:07:12 time: 0.0384 data: 0.0008 max mem: 33300 Test: [11000/21770] eta: 0:07:07 time: 0.0386 data: 0.0008 max mem: 33300 Test: [11100/21770] eta: 0:07:03 time: 0.0392 data: 0.0009 max mem: 33300 Test: [11200/21770] eta: 0:06:59 time: 0.0387 data: 0.0009 max mem: 33300 Test: [11300/21770] eta: 0:06:55 time: 0.0393 data: 0.0009 max mem: 33300 Test: [11400/21770] eta: 0:06:51 time: 0.0387 data: 0.0008 max mem: 33300 Test: [11500/21770] eta: 0:06:47 time: 0.0393 data: 0.0009 max mem: 33300 Test: [11600/21770] eta: 0:06:43 time: 0.0385 data: 0.0009 max mem: 33300 Test: [11700/21770] eta: 0:06:39 time: 0.0389 data: 0.0009 max mem: 33300 Test: [11800/21770] eta: 0:06:35 time: 0.0389 data: 0.0008 max mem: 33300 Test: [11900/21770] eta: 0:06:31 time: 0.0390 data: 0.0008 max mem: 33300 Test: [12000/21770] eta: 0:06:27 time: 0.0388 data: 0.0008 max mem: 33300 Test: [12100/21770] eta: 0:06:23 time: 0.0386 data: 0.0009 max mem: 33300 Test: [12200/21770] eta: 0:06:19 time: 0.0384 data: 0.0009 max mem: 33300 Test: [12300/21770] eta: 0:06:15 time: 0.0378 data: 0.0009 max mem: 33300 Test: [12400/21770] eta: 0:06:11 time: 0.0382 data: 0.0009 max mem: 33300 Test: [12500/21770] eta: 0:06:07 time: 0.0391 data: 0.0009 max mem: 33300 Test: [12600/21770] eta: 
0:06:03 time: 0.0391 data: 0.0009 max mem: 33300 Test: [12700/21770] eta: 0:05:59 time: 0.0391 data: 0.0009 max mem: 33300 Test: [12800/21770] eta: 0:05:55 time: 0.0390 data: 0.0009 max mem: 33300 Test: [12900/21770] eta: 0:05:51 time: 0.0391 data: 0.0009 max mem: 33300 Test: [13000/21770] eta: 0:05:47 time: 0.0393 data: 0.0009 max mem: 33300 Test: [13100/21770] eta: 0:05:43 time: 0.0391 data: 0.0009 max mem: 33300 Test: [13200/21770] eta: 0:05:39 time: 0.0393 data: 0.0009 max mem: 33300 Test: [13300/21770] eta: 0:05:35 time: 0.0390 data: 0.0009 max mem: 33300 Test: [13400/21770] eta: 0:05:31 time: 0.0390 data: 0.0008 max mem: 33300 Test: [13500/21770] eta: 0:05:27 time: 0.0393 data: 0.0009 max mem: 33300 Test: [13600/21770] eta: 0:05:23 time: 0.0389 data: 0.0009 max mem: 33300 Test: [13700/21770] eta: 0:05:19 time: 0.0391 data: 0.0009 max mem: 33300 Test: [13800/21770] eta: 0:05:15 time: 0.0392 data: 0.0009 max mem: 33300 Test: [13900/21770] eta: 0:05:11 time: 0.0393 data: 0.0008 max mem: 33300 Test: [14000/21770] eta: 0:05:07 time: 0.0390 data: 0.0009 max mem: 33300 Test: [14100/21770] eta: 0:05:03 time: 0.0392 data: 0.0009 max mem: 33300 Test: [14200/21770] eta: 0:04:59 time: 0.0397 data: 0.0008 max mem: 33300 Test: [14300/21770] eta: 0:04:55 time: 0.0393 data: 0.0009 max mem: 33300 Test: [14400/21770] eta: 0:04:51 time: 0.0394 data: 0.0008 max mem: 33300 Test: [14500/21770] eta: 0:04:47 time: 0.0393 data: 0.0009 max mem: 33300 Test: [14600/21770] eta: 0:04:43 time: 0.0396 data: 0.0008 max mem: 33300 Test: [14700/21770] eta: 0:04:39 time: 0.0391 data: 0.0009 max mem: 33300 Test: [14800/21770] eta: 0:04:35 time: 0.0394 data: 0.0009 max mem: 33300 Test: [14900/21770] eta: 0:04:31 time: 0.0396 data: 0.0009 max mem: 33300 Test: [15000/21770] eta: 0:04:27 time: 0.0395 data: 0.0008 max mem: 33300 Test: [15100/21770] eta: 0:04:23 time: 0.0393 data: 0.0009 max mem: 33300 Test: [15200/21770] eta: 0:04:19 time: 0.0394 data: 0.0008 max mem: 33300 Test: [15300/21770] eta: 
0:04:15 time: 0.0394 data: 0.0008 max mem: 33300 Test: [15400/21770] eta: 0:04:12 time: 0.0396 data: 0.0009 max mem: 33300 Test: [15500/21770] eta: 0:04:08 time: 0.0394 data: 0.0008 max mem: 33300 Test: [15600/21770] eta: 0:04:04 time: 0.0397 data: 0.0009 max mem: 33300 Test: [15700/21770] eta: 0:04:00 time: 0.0395 data: 0.0009 max mem: 33300 Test: [15800/21770] eta: 0:03:56 time: 0.0394 data: 0.0009 max mem: 33300 Test: [15900/21770] eta: 0:03:52 time: 0.0393 data: 0.0009 max mem: 33300 Test: [16000/21770] eta: 0:03:48 time: 0.0397 data: 0.0008 max mem: 33300 Test: [16100/21770] eta: 0:03:44 time: 0.0394 data: 0.0008 max mem: 33300 Test: [16200/21770] eta: 0:03:40 time: 0.0392 data: 0.0008 max mem: 33300 Test: [16300/21770] eta: 0:03:36 time: 0.0391 data: 0.0008 max mem: 33300 Test: [16400/21770] eta: 0:03:32 time: 0.0393 data: 0.0008 max mem: 33300 Test: [16500/21770] eta: 0:03:28 time: 0.0392 data: 0.0008 max mem: 33300 Test: [16600/21770] eta: 0:03:24 time: 0.0393 data: 0.0008 max mem: 33300 Test: [16700/21770] eta: 0:03:20 time: 0.0391 data: 0.0009 max mem: 33300 Test: [16800/21770] eta: 0:03:16 time: 0.0393 data: 0.0008 max mem: 33300 Test: [16900/21770] eta: 0:03:12 time: 0.0391 data: 0.0008 max mem: 33300 Test: [17000/21770] eta: 0:03:08 time: 0.0395 data: 0.0008 max mem: 33300 Test: [17100/21770] eta: 0:03:04 time: 0.0392 data: 0.0008 max mem: 33300 Test: [17200/21770] eta: 0:03:00 time: 0.0393 data: 0.0008 max mem: 33300 Test: [17300/21770] eta: 0:02:56 time: 0.0391 data: 0.0008 max mem: 33300 Test: [17400/21770] eta: 0:02:52 time: 0.0394 data: 0.0008 max mem: 33300 Test: [17500/21770] eta: 0:02:48 time: 0.0395 data: 0.0008 max mem: 33300 Test: [17600/21770] eta: 0:02:44 time: 0.0393 data: 0.0008 max mem: 33300 Test: [17700/21770] eta: 0:02:40 time: 0.0393 data: 0.0008 max mem: 33300 Test: [17800/21770] eta: 0:02:36 time: 0.0393 data: 0.0008 max mem: 33300 Test: [17900/21770] eta: 0:02:32 time: 0.0392 data: 0.0008 max mem: 33300 Test: [18000/21770] eta: 
0:02:29 time: 0.0386 data: 0.0009 max mem: 33300 Test: [18100/21770] eta: 0:02:25 time: 0.0391 data: 0.0009 max mem: 33300 Test: [18200/21770] eta: 0:02:21 time: 0.0394 data: 0.0008 max mem: 33300 Test: [18300/21770] eta: 0:02:17 time: 0.0391 data: 0.0008 max mem: 33300 Test: [18400/21770] eta: 0:02:13 time: 0.0391 data: 0.0008 max mem: 33300 Test: [18500/21770] eta: 0:02:09 time: 0.0387 data: 0.0009 max mem: 33300 Test: [18600/21770] eta: 0:02:05 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18700/21770] eta: 0:02:01 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18800/21770] eta: 0:01:57 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18900/21770] eta: 0:01:53 time: 0.0378 data: 0.0008 max mem: 33300 Test: [19000/21770] eta: 0:01:49 time: 0.0378 data: 0.0009 max mem: 33300 Test: [19100/21770] eta: 0:01:45 time: 0.0378 data: 0.0008 max mem: 33300 Test: [19200/21770] eta: 0:01:41 time: 0.0377 data: 0.0008 max mem: 33300 Test: [19300/21770] eta: 0:01:37 time: 0.0377 data: 0.0008 max mem: 33300 Test: [19400/21770] eta: 0:01:33 time: 0.0378 data: 0.0008 max mem: 33300 Test: [19500/21770] eta: 0:01:29 time: 0.0377 data: 0.0008 max mem: 33300 Test: [19600/21770] eta: 0:01:25 time: 0.0382 data: 0.0008 max mem: 33300 Test: [19700/21770] eta: 0:01:21 time: 0.0380 data: 0.0008 max mem: 33300 Test: [19800/21770] eta: 0:01:17 time: 0.0381 data: 0.0008 max mem: 33300 Test: [19900/21770] eta: 0:01:13 time: 0.0380 data: 0.0008 max mem: 33300 Test: [20000/21770] eta: 0:01:09 time: 0.0382 data: 0.0008 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0380 data: 0.0008 max mem: 33300 Test: [20200/21770] eta: 0:01:01 time: 0.0378 data: 0.0008 max mem: 33300 Test: [20300/21770] eta: 0:00:57 time: 0.0377 data: 0.0008 max mem: 33300 Test: [20400/21770] eta: 0:00:53 time: 0.0380 data: 0.0008 max mem: 33300 Test: [20500/21770] eta: 0:00:49 time: 0.0381 data: 0.0008 max mem: 33300 Test: [20600/21770] eta: 0:00:46 time: 0.0382 data: 0.0008 max mem: 33300 Test: [20700/21770] eta: 
0:00:42 time: 0.0380 data: 0.0008 max mem: 33300
Test: [20800/21770] eta: 0:00:38 time: 0.0381 data: 0.0008 max mem: 33300
Test: [20900/21770] eta: 0:00:34 time: 0.0380 data: 0.0009 max mem: 33300
Test: [21000/21770] eta: 0:00:30 time: 0.0381 data: 0.0008 max mem: 33300
Test: [21100/21770] eta: 0:00:26 time: 0.0381 data: 0.0009 max mem: 33300
Test: [21200/21770] eta: 0:00:22 time: 0.0382 data: 0.0009 max mem: 33300
Test: [21300/21770] eta: 0:00:18 time: 0.0381 data: 0.0009 max mem: 33300
Test: [21400/21770] eta: 0:00:14 time: 0.0382 data: 0.0009 max mem: 33300
Test: [21500/21770] eta: 0:00:10 time: 0.0380 data: 0.0009 max mem: 33300
Test: [21600/21770] eta: 0:00:06 time: 0.0382 data: 0.0009 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0380 data: 0.0009 max mem: 33300
Test: Total time: 0:14:15
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [4] [ 0/4276] eta: 6:33:29 lr: 4.5476362921058924e-05 loss: 0.1850 (0.1850) time: 5.5214 data: 2.4344 max mem: 33300
Epoch: [4] [ 10/4276] eta: 3:46:14 lr: 4.5473704080666955e-05 loss: 0.1933 (0.1966) time: 3.1819 data: 0.2276 max mem: 33300
Epoch: [4] [ 20/4276] eta: 3:37:59 lr: 4.547104522300133e-05 loss: 0.1933 (0.2011) time: 2.9507 data: 0.0069 max mem: 33300
Epoch: [4] [ 30/4276] eta: 3:34:26 lr: 4.5468386348060795e-05 loss: 0.1857 (0.2003) time: 2.9470 data: 0.0075 max mem: 33300
Epoch: [4] [ 40/4276] eta: 3:32:27 lr: 4.5465727455844124e-05 loss: 0.1857 (0.1991) time: 2.9426 data: 0.0078 max mem: 33300
Epoch: [4] [ 50/4276] eta: 3:31:06 lr: 4.546306854635009e-05 loss: 0.1938 (0.1987) time: 2.9461 data: 0.0078 max mem: 33300
Epoch: [4] [ 60/4276] eta: 3:29:56 lr: 4.546040961957745e-05 loss: 0.1938 (0.1997) time: 2.9437 data: 0.0080 max mem: 33300
Epoch: [4] [ 70/4276] eta: 3:28:59 lr:
4.545775067552497e-05 loss: 0.1909 (0.2001) time: 2.9407 data: 0.0080 max mem: 33300 Epoch: [4] [ 80/4276] eta: 3:28:09 lr: 4.545509171419141e-05 loss: 0.1966 (0.2002) time: 2.9418 data: 0.0079 max mem: 33300 Epoch: [4] [ 90/4276] eta: 3:27:20 lr: 4.545243273557555e-05 loss: 0.1865 (0.1976) time: 2.9384 data: 0.0077 max mem: 33300 Epoch: [4] [ 100/4276] eta: 3:26:15 lr: 4.544977373967613e-05 loss: 0.1800 (0.2001) time: 2.9111 data: 0.0077 max mem: 33300 Epoch: [4] [ 110/4276] eta: 3:25:09 lr: 4.544711472649193e-05 loss: 0.1897 (0.2007) time: 2.8767 data: 0.0082 max mem: 33300 Epoch: [4] [ 120/4276] eta: 3:24:17 lr: 4.5444455696021716e-05 loss: 0.1897 (0.2004) time: 2.8783 data: 0.0081 max mem: 33300 Epoch: [4] [ 130/4276] eta: 3:23:27 lr: 4.544179664826424e-05 loss: 0.2018 (0.2012) time: 2.8868 data: 0.0082 max mem: 33300 Epoch: [4] [ 140/4276] eta: 3:22:33 lr: 4.543913758321828e-05 loss: 0.2086 (0.2013) time: 2.8721 data: 0.0087 max mem: 33300 Epoch: [4] [ 150/4276] eta: 3:21:46 lr: 4.543647850088259e-05 loss: 0.1933 (0.2008) time: 2.8678 data: 0.0086 max mem: 33300 Epoch: [4] [ 160/4276] eta: 3:21:11 lr: 4.543381940125594e-05 loss: 0.1912 (0.2010) time: 2.8919 data: 0.0087 max mem: 33300 Epoch: [4] [ 170/4276] eta: 3:20:46 lr: 4.543116028433708e-05 loss: 0.2053 (0.2012) time: 2.9312 data: 0.0088 max mem: 33300 Epoch: [4] [ 180/4276] eta: 3:20:06 lr: 4.542850115012478e-05 loss: 0.2053 (0.2020) time: 2.9187 data: 0.0086 max mem: 33300 Epoch: [4] [ 190/4276] eta: 3:19:22 lr: 4.542584199861781e-05 loss: 0.2186 (0.2028) time: 2.8750 data: 0.0086 max mem: 33300 Epoch: [4] [ 200/4276] eta: 3:18:40 lr: 4.542318282981493e-05 loss: 0.2222 (0.2038) time: 2.8648 data: 0.0086 max mem: 33300 Epoch: [4] [ 210/4276] eta: 3:18:00 lr: 4.54205236437149e-05 loss: 0.2140 (0.2039) time: 2.8662 data: 0.0083 max mem: 33300 Epoch: [4] [ 220/4276] eta: 3:17:18 lr: 4.541786444031647e-05 loss: 0.1998 (0.2034) time: 2.8614 data: 0.0089 max mem: 33300 Epoch: [4] [ 230/4276] eta: 3:16:46 lr: 
4.5415205219618424e-05 loss: 0.1896 (0.2023) time: 2.8769 data: 0.0098 max mem: 33300 Epoch: [4] [ 240/4276] eta: 3:16:16 lr: 4.5412545981619505e-05 loss: 0.1871 (0.2022) time: 2.9066 data: 0.0092 max mem: 33300 Epoch: [4] [ 250/4276] eta: 3:15:42 lr: 4.5409886726318496e-05 loss: 0.2046 (0.2033) time: 2.9023 data: 0.0089 max mem: 33300 Epoch: [4] [ 260/4276] eta: 3:15:09 lr: 4.5407227453714135e-05 loss: 0.2160 (0.2038) time: 2.8909 data: 0.0094 max mem: 33300 Epoch: [4] [ 270/4276] eta: 3:14:39 lr: 4.54045681638052e-05 loss: 0.2085 (0.2041) time: 2.9003 data: 0.0094 max mem: 33300 Epoch: [4] [ 280/4276] eta: 3:14:10 lr: 4.5401908856590445e-05 loss: 0.1974 (0.2040) time: 2.9115 data: 0.0083 max mem: 33300 Epoch: [4] [ 290/4276] eta: 3:13:39 lr: 4.539924953206863e-05 loss: 0.1974 (0.2036) time: 2.9077 data: 0.0078 max mem: 33300 Epoch: [4] [ 300/4276] eta: 3:13:09 lr: 4.539659019023852e-05 loss: 0.1876 (0.2034) time: 2.9049 data: 0.0081 max mem: 33300 Epoch: [4] [ 310/4276] eta: 3:12:37 lr: 4.539393083109888e-05 loss: 0.1877 (0.2029) time: 2.9038 data: 0.0084 max mem: 33300 Epoch: [4] [ 320/4276] eta: 3:12:07 lr: 4.5391271454648466e-05 loss: 0.2010 (0.2035) time: 2.9022 data: 0.0087 max mem: 33300 Epoch: [4] [ 330/4276] eta: 3:11:33 lr: 4.538861206088603e-05 loss: 0.2010 (0.2036) time: 2.8882 data: 0.0087 max mem: 33300 Epoch: [4] [ 340/4276] eta: 3:10:59 lr: 4.538595264981034e-05 loss: 0.1950 (0.2033) time: 2.8732 data: 0.0085 max mem: 33300 Epoch: [4] [ 350/4276] eta: 3:10:27 lr: 4.538329322142016e-05 loss: 0.1889 (0.2029) time: 2.8787 data: 0.0085 max mem: 33300 Epoch: [4] [ 360/4276] eta: 3:09:55 lr: 4.538063377571425e-05 loss: 0.1960 (0.2034) time: 2.8811 data: 0.0084 max mem: 33300 Epoch: [4] [ 370/4276] eta: 3:09:23 lr: 4.5377974312691354e-05 loss: 0.1926 (0.2028) time: 2.8848 data: 0.0085 max mem: 33300 Epoch: [4] [ 380/4276] eta: 3:08:50 lr: 4.537531483235025e-05 loss: 0.1871 (0.2028) time: 2.8775 data: 0.0080 max mem: 33300 Epoch: [4] [ 390/4276] eta: 
3:08:19 lr: 4.53726553346897e-05 loss: 0.2042 (0.2030) time: 2.8757 data: 0.0076 max mem: 33300 Epoch: [4] [ 400/4276] eta: 3:07:47 lr: 4.536999581970843e-05 loss: 0.2036 (0.2032) time: 2.8853 data: 0.0078 max mem: 33300 Epoch: [4] [ 410/4276] eta: 3:07:17 lr: 4.536733628740524e-05 loss: 0.2003 (0.2030) time: 2.8868 data: 0.0075 max mem: 33300 Epoch: [4] [ 420/4276] eta: 3:06:46 lr: 4.536467673777887e-05 loss: 0.2014 (0.2032) time: 2.8901 data: 0.0075 max mem: 33300 Epoch: [4] [ 430/4276] eta: 3:06:15 lr: 4.536201717082808e-05 loss: 0.2016 (0.2033) time: 2.8866 data: 0.0077 max mem: 33300 Epoch: [4] [ 440/4276] eta: 3:05:44 lr: 4.535935758655162e-05 loss: 0.2015 (0.2032) time: 2.8837 data: 0.0077 max mem: 33300 Epoch: [4] [ 450/4276] eta: 3:05:13 lr: 4.5356697984948265e-05 loss: 0.2045 (0.2036) time: 2.8867 data: 0.0077 max mem: 33300 Epoch: [4] [ 460/4276] eta: 3:04:43 lr: 4.5354038366016764e-05 loss: 0.1997 (0.2030) time: 2.8864 data: 0.0075 max mem: 33300 Epoch: [4] [ 470/4276] eta: 3:04:13 lr: 4.5351378729755876e-05 loss: 0.1838 (0.2026) time: 2.8874 data: 0.0075 max mem: 33300 Epoch: [4] [ 480/4276] eta: 3:03:43 lr: 4.5348719076164356e-05 loss: 0.1759 (0.2023) time: 2.8915 data: 0.0078 max mem: 33300 Epoch: [4] [ 490/4276] eta: 3:03:12 lr: 4.534605940524097e-05 loss: 0.1790 (0.2022) time: 2.8905 data: 0.0078 max mem: 33300 Epoch: [4] [ 500/4276] eta: 3:02:42 lr: 4.534339971698447e-05 loss: 0.2061 (0.2023) time: 2.8863 data: 0.0075 max mem: 33300 Epoch: [4] [ 510/4276] eta: 3:02:12 lr: 4.534074001139362e-05 loss: 0.1917 (0.2019) time: 2.8887 data: 0.0077 max mem: 33300 Epoch: [4] [ 520/4276] eta: 3:01:42 lr: 4.533808028846716e-05 loss: 0.1855 (0.2020) time: 2.8897 data: 0.0080 max mem: 33300 Epoch: [4] [ 530/4276] eta: 3:01:12 lr: 4.533542054820386e-05 loss: 0.1906 (0.2019) time: 2.8854 data: 0.0079 max mem: 33300 Epoch: [4] [ 540/4276] eta: 3:00:42 lr: 4.533276079060248e-05 loss: 0.1919 (0.2018) time: 2.8870 data: 0.0078 max mem: 33300 Epoch: [4] [ 550/4276] 
eta: 3:00:11 lr: 4.5330101015661766e-05 loss: 0.2059 (0.2019) time: 2.8850 data: 0.0079 max mem: 33300 Epoch: [4] [ 560/4276] eta: 2:59:41 lr: 4.5327441223380486e-05 loss: 0.2067 (0.2021) time: 2.8801 data: 0.0079 max mem: 33300 Epoch: [4] [ 570/4276] eta: 2:59:10 lr: 4.5324781413757385e-05 loss: 0.2022 (0.2021) time: 2.8736 data: 0.0076 max mem: 33300 Epoch: [4] [ 580/4276] eta: 2:58:43 lr: 4.532212158679123e-05 loss: 0.1887 (0.2020) time: 2.9014 data: 0.0086 max mem: 33300 Epoch: [4] [ 590/4276] eta: 2:58:16 lr: 4.531946174248077e-05 loss: 0.1863 (0.2016) time: 2.9355 data: 0.0085 max mem: 33300 Epoch: [4] [ 600/4276] eta: 2:57:49 lr: 4.531680188082476e-05 loss: 0.1863 (0.2016) time: 2.9361 data: 0.0073 max mem: 33300 Epoch: [4] [ 610/4276] eta: 2:57:22 lr: 4.5314142001821956e-05 loss: 0.1827 (0.2014) time: 2.9363 data: 0.0072 max mem: 33300 Epoch: [4] [ 620/4276] eta: 2:56:55 lr: 4.5311482105471125e-05 loss: 0.1849 (0.2014) time: 2.9363 data: 0.0072 max mem: 33300 Epoch: [4] [ 630/4276] eta: 2:56:26 lr: 4.5308822191771006e-05 loss: 0.1977 (0.2014) time: 2.9234 data: 0.0077 max mem: 33300 Epoch: [4] [ 640/4276] eta: 2:55:57 lr: 4.5306162260720366e-05 loss: 0.1998 (0.2014) time: 2.9063 data: 0.0085 max mem: 33300 Epoch: [4] [ 650/4276] eta: 2:55:28 lr: 4.5303502312317944e-05 loss: 0.1919 (0.2013) time: 2.9008 data: 0.0087 max mem: 33300 Epoch: [4] [ 660/4276] eta: 2:54:59 lr: 4.5300842346562514e-05 loss: 0.2075 (0.2016) time: 2.9047 data: 0.0084 max mem: 33300 Epoch: [4] [ 670/4276] eta: 2:54:32 lr: 4.529818236345282e-05 loss: 0.1950 (0.2014) time: 2.9244 data: 0.0084 max mem: 33300 Epoch: [4] [ 680/4276] eta: 2:54:04 lr: 4.529552236298762e-05 loss: 0.1793 (0.2011) time: 2.9363 data: 0.0080 max mem: 33300 Epoch: [4] [ 690/4276] eta: 2:53:36 lr: 4.529286234516568e-05 loss: 0.1914 (0.2010) time: 2.9273 data: 0.0075 max mem: 33300 Epoch: [4] [ 700/4276] eta: 2:53:07 lr: 4.529020230998572e-05 loss: 0.2060 (0.2012) time: 2.9101 data: 0.0076 max mem: 33300 Epoch: [4] [ 
710/4276] eta: 2:52:37 lr: 4.528754225744653e-05 loss: 0.2228 (0.2014) time: 2.8971 data: 0.0081 max mem: 33300 Epoch: [4] [ 720/4276] eta: 2:52:09 lr: 4.528488218754684e-05 loss: 0.1983 (0.2013) time: 2.9090 data: 0.0084 max mem: 33300 Epoch: [4] [ 730/4276] eta: 2:51:40 lr: 4.528222210028541e-05 loss: 0.1921 (0.2016) time: 2.9154 data: 0.0084 max mem: 33300 Epoch: [4] [ 740/4276] eta: 2:51:12 lr: 4.5279561995661005e-05 loss: 0.1966 (0.2016) time: 2.9171 data: 0.0086 max mem: 33300 Epoch: [4] [ 750/4276] eta: 2:50:44 lr: 4.5276901873672374e-05 loss: 0.1907 (0.2015) time: 2.9227 data: 0.0086 max mem: 33300 Epoch: [4] [ 760/4276] eta: 2:50:20 lr: 4.527424173431825e-05 loss: 0.1839 (0.2014) time: 2.9717 data: 0.0081 max mem: 33300 Epoch: [4] [ 770/4276] eta: 2:50:02 lr: 4.5271581577597415e-05 loss: 0.1864 (0.2015) time: 3.0825 data: 0.0078 max mem: 33300 Epoch: [4] [ 780/4276] eta: 2:49:47 lr: 4.52689214035086e-05 loss: 0.2050 (0.2014) time: 3.1831 data: 0.0078 max mem: 33300 Epoch: [4] [ 790/4276] eta: 2:49:29 lr: 4.526626121205057e-05 loss: 0.1992 (0.2014) time: 3.1948 data: 0.0076 max mem: 33300 Epoch: [4] [ 800/4276] eta: 2:49:15 lr: 4.5263601003222074e-05 loss: 0.1984 (0.2014) time: 3.2229 data: 0.0079 max mem: 33300 Epoch: [4] [ 810/4276] eta: 2:48:58 lr: 4.5260940777021854e-05 loss: 0.1820 (0.2014) time: 3.2365 data: 0.0083 max mem: 33300 Epoch: [4] [ 820/4276] eta: 2:48:42 lr: 4.5258280533448684e-05 loss: 0.2011 (0.2013) time: 3.2155 data: 0.0080 max mem: 33300 Epoch: [4] [ 830/4276] eta: 2:48:25 lr: 4.5255620272501296e-05 loss: 0.1870 (0.2015) time: 3.2316 data: 0.0081 max mem: 33300 Epoch: [4] [ 840/4276] eta: 2:48:07 lr: 4.525295999417845e-05 loss: 0.1901 (0.2017) time: 3.2167 data: 0.0086 max mem: 33300 Epoch: [4] [ 850/4276] eta: 2:47:48 lr: 4.5250299698478906e-05 loss: 0.1859 (0.2016) time: 3.2093 data: 0.0083 max mem: 33300 Epoch: [4] [ 860/4276] eta: 2:47:26 lr: 4.524763938540139e-05 loss: 0.1889 (0.2017) time: 3.1638 data: 0.0075 max mem: 33300 
Epoch: [4] [ 870/4276] eta: 2:47:09 lr: 4.524497905494468e-05 loss: 0.1970 (0.2016) time: 3.1878 data: 0.0077 max mem: 33300 Epoch: [4] [ 880/4276] eta: 2:46:46 lr: 4.524231870710751e-05 loss: 0.1933 (0.2018) time: 3.1816 data: 0.0081 max mem: 33300 Epoch: [4] [ 890/4276] eta: 2:46:27 lr: 4.5239658341888645e-05 loss: 0.2103 (0.2019) time: 3.1599 data: 0.0079 max mem: 33300 Epoch: [4] [ 900/4276] eta: 2:46:04 lr: 4.523699795928682e-05 loss: 0.2179 (0.2019) time: 3.1679 data: 0.0075 max mem: 33300 Epoch: [4] [ 910/4276] eta: 2:45:43 lr: 4.52343375593008e-05 loss: 0.2004 (0.2020) time: 3.1577 data: 0.0077 max mem: 33300 Epoch: [4] [ 920/4276] eta: 2:45:22 lr: 4.523167714192932e-05 loss: 0.2004 (0.2020) time: 3.1899 data: 0.0087 max mem: 33300 Epoch: [4] [ 930/4276] eta: 2:45:02 lr: 4.5229016707171145e-05 loss: 0.1907 (0.2019) time: 3.2015 data: 0.0096 max mem: 33300 Epoch: [4] [ 940/4276] eta: 2:44:42 lr: 4.522635625502502e-05 loss: 0.1794 (0.2017) time: 3.2280 data: 0.0095 max mem: 33300 Epoch: [4] [ 950/4276] eta: 2:44:19 lr: 4.5223695785489686e-05 loss: 0.1858 (0.2016) time: 3.2018 data: 0.0083 max mem: 33300 Epoch: [4] [ 960/4276] eta: 2:44:00 lr: 4.522103529856391e-05 loss: 0.1979 (0.2017) time: 3.2179 data: 0.0082 max mem: 33300 Epoch: [4] [ 970/4276] eta: 2:43:37 lr: 4.521837479424643e-05 loss: 0.1957 (0.2017) time: 3.2103 data: 0.0085 max mem: 33300 Epoch: [4] [ 980/4276] eta: 2:43:18 lr: 4.5215714272535995e-05 loss: 0.1973 (0.2018) time: 3.2150 data: 0.0081 max mem: 33300 Epoch: [4] [ 990/4276] eta: 2:42:55 lr: 4.5213053733431356e-05 loss: 0.1981 (0.2017) time: 3.2271 data: 0.0076 max mem: 33300 Epoch: [4] [1000/4276] eta: 2:42:34 lr: 4.521039317693126e-05 loss: 0.1863 (0.2016) time: 3.2102 data: 0.0076 max mem: 33300 Epoch: [4] [1010/4276] eta: 2:42:11 lr: 4.5207732603034466e-05 loss: 0.1892 (0.2015) time: 3.2286 data: 0.0080 max mem: 33300 Epoch: [4] [1020/4276] eta: 2:41:48 lr: 4.5205072011739716e-05 loss: 0.1892 (0.2015) time: 3.1979 data: 0.0079 max mem: 
33300 Epoch: [4] [1030/4276] eta: 2:41:25 lr: 4.520241140304575e-05 loss: 0.1898 (0.2015) time: 3.1923 data: 0.0080 max mem: 33300 Epoch: [4] [1040/4276] eta: 2:41:01 lr: 4.519975077695132e-05 loss: 0.1831 (0.2014) time: 3.1871 data: 0.0083 max mem: 33300 Epoch: [4] [1050/4276] eta: 2:40:40 lr: 4.5197090133455185e-05 loss: 0.1868 (0.2015) time: 3.2273 data: 0.0088 max mem: 33300 Epoch: [4] [1060/4276] eta: 2:40:16 lr: 4.5194429472556085e-05 loss: 0.1964 (0.2016) time: 3.2172 data: 0.0087 max mem: 33300 Epoch: [4] [1070/4276] eta: 2:39:54 lr: 4.519176879425276e-05 loss: 0.1878 (0.2015) time: 3.2185 data: 0.0092 max mem: 33300 Epoch: [4] [1080/4276] eta: 2:39:30 lr: 4.518910809854398e-05 loss: 0.1979 (0.2014) time: 3.2239 data: 0.0095 max mem: 33300 Epoch: [4] [1090/4276] eta: 2:39:08 lr: 4.518644738542847e-05 loss: 0.1979 (0.2013) time: 3.2134 data: 0.0091 max mem: 33300 Epoch: [4] [1100/4276] eta: 2:38:43 lr: 4.5183786654904985e-05 loss: 0.1712 (0.2012) time: 3.2299 data: 0.0088 max mem: 33300 Epoch: [4] [1110/4276] eta: 2:38:16 lr: 4.518112590697228e-05 loss: 0.1889 (0.2012) time: 3.1506 data: 0.0087 max mem: 33300 Epoch: [4] [1120/4276] eta: 2:37:49 lr: 4.5178465141629085e-05 loss: 0.1974 (0.2012) time: 3.1027 data: 0.0085 max mem: 33300 Epoch: [4] [1130/4276] eta: 2:37:19 lr: 4.517580435887417e-05 loss: 0.1979 (0.2011) time: 3.0427 data: 0.0087 max mem: 33300 Epoch: [4] [1140/4276] eta: 2:36:46 lr: 4.517314355870625e-05 loss: 0.1842 (0.2010) time: 2.9493 data: 0.0085 max mem: 33300 Epoch: [4] [1150/4276] eta: 2:36:14 lr: 4.51704827411241e-05 loss: 0.1899 (0.2010) time: 2.9051 data: 0.0086 max mem: 33300 Epoch: [4] [1160/4276] eta: 2:35:40 lr: 4.516782190612646e-05 loss: 0.1980 (0.2011) time: 2.8872 data: 0.0087 max mem: 33300 Epoch: [4] [1170/4276] eta: 2:35:08 lr: 4.5165161053712066e-05 loss: 0.2069 (0.2012) time: 2.8850 data: 0.0084 max mem: 33300 Epoch: [4] [1180/4276] eta: 2:34:35 lr: 4.516250018387967e-05 loss: 0.2069 (0.2011) time: 2.8896 data: 0.0084 max 
mem: 33300 Epoch: [4] [1190/4276] eta: 2:34:02 lr: 4.515983929662801e-05 loss: 0.1943 (0.2011) time: 2.8844 data: 0.0081 max mem: 33300 Epoch: [4] [1200/4276] eta: 2:33:29 lr: 4.515717839195586e-05 loss: 0.2036 (0.2010) time: 2.8806 data: 0.0082 max mem: 33300 Epoch: [4] [1210/4276] eta: 2:32:57 lr: 4.515451746986193e-05 loss: 0.1829 (0.2010) time: 2.8899 data: 0.0082 max mem: 33300 Epoch: [4] [1220/4276] eta: 2:32:26 lr: 4.515185653034498e-05 loss: 0.1848 (0.2009) time: 2.9340 data: 0.0080 max mem: 33300 Epoch: [4] [1230/4276] eta: 2:31:56 lr: 4.514919557340375e-05 loss: 0.1851 (0.2009) time: 2.9699 data: 0.0087 max mem: 33300 Epoch: [4] [1240/4276] eta: 2:31:25 lr: 4.5146534599037e-05 loss: 0.1851 (0.2008) time: 2.9648 data: 0.0085 max mem: 33300 Epoch: [4] [1250/4276] eta: 2:30:53 lr: 4.5143873607243455e-05 loss: 0.1881 (0.2007) time: 2.9425 data: 0.0078 max mem: 33300 Epoch: [4] [1260/4276] eta: 2:30:23 lr: 4.514121259802187e-05 loss: 0.1738 (0.2006) time: 2.9516 data: 0.0080 max mem: 33300 Epoch: [4] [1270/4276] eta: 2:29:53 lr: 4.513855157137099e-05 loss: 0.1813 (0.2005) time: 2.9755 data: 0.0076 max mem: 33300 Epoch: [4] [1280/4276] eta: 2:29:22 lr: 4.513589052728956e-05 loss: 0.1881 (0.2005) time: 2.9567 data: 0.0075 max mem: 33300 Epoch: [4] [1290/4276] eta: 2:28:49 lr: 4.513322946577632e-05 loss: 0.2089 (0.2006) time: 2.9146 data: 0.0081 max mem: 33300 Epoch: [4] [1300/4276] eta: 2:28:17 lr: 4.513056838683001e-05 loss: 0.1880 (0.2005) time: 2.8859 data: 0.0085 max mem: 33300 Epoch: [4] [1310/4276] eta: 2:27:44 lr: 4.5127907290449384e-05 loss: 0.1880 (0.2005) time: 2.8806 data: 0.0083 max mem: 33300 Epoch: [4] [1320/4276] eta: 2:27:12 lr: 4.5125246176633175e-05 loss: 0.2068 (0.2006) time: 2.8811 data: 0.0085 max mem: 33300 Epoch: [4] [1330/4276] eta: 2:26:40 lr: 4.5122585045380135e-05 loss: 0.1840 (0.2004) time: 2.8848 data: 0.0084 max mem: 33300 Epoch: [4] [1340/4276] eta: 2:26:08 lr: 4.5119923896689e-05 loss: 0.1765 (0.2003) time: 2.8947 data: 0.0083 max 
mem: 33300 Epoch: [4] [1350/4276] eta: 2:25:37 lr: 4.5117262730558516e-05 loss: 0.1858 (0.2003) time: 2.9167 data: 0.0081 max mem: 33300 Epoch: [4] [1360/4276] eta: 2:25:07 lr: 4.511460154698743e-05 loss: 0.1967 (0.2003) time: 2.9441 data: 0.0083 max mem: 33300 Epoch: [4] [1370/4276] eta: 2:24:36 lr: 4.511194034597448e-05 loss: 0.1867 (0.2002) time: 2.9522 data: 0.0090 max mem: 33300 Epoch: [4] [1380/4276] eta: 2:24:06 lr: 4.5109279127518406e-05 loss: 0.1981 (0.2003) time: 2.9514 data: 0.0096 max mem: 33300 Epoch: [4] [1390/4276] eta: 2:23:35 lr: 4.5106617891617955e-05 loss: 0.2000 (0.2003) time: 2.9517 data: 0.0092 max mem: 33300 Epoch: [4] [1400/4276] eta: 2:23:04 lr: 4.510395663827187e-05 loss: 0.2001 (0.2004) time: 2.9471 data: 0.0084 max mem: 33300 Epoch: [4] [1410/4276] eta: 2:22:34 lr: 4.510129536747888e-05 loss: 0.2001 (0.2004) time: 2.9518 data: 0.0086 max mem: 33300 Epoch: [4] [1420/4276] eta: 2:22:02 lr: 4.509863407923775e-05 loss: 0.1962 (0.2005) time: 2.9312 data: 0.0089 max mem: 33300 Epoch: [4] [1430/4276] eta: 2:21:32 lr: 4.50959727735472e-05 loss: 0.1888 (0.2004) time: 2.9276 data: 0.0092 max mem: 33300 Epoch: [4] [1440/4276] eta: 2:21:01 lr: 4.509331145040599e-05 loss: 0.1974 (0.2006) time: 2.9318 data: 0.0092 max mem: 33300 Epoch: [4] [1450/4276] eta: 2:20:30 lr: 4.509065010981284e-05 loss: 0.2070 (0.2005) time: 2.9221 data: 0.0084 max mem: 33300 Epoch: [4] [1460/4276] eta: 2:19:59 lr: 4.5087988751766515e-05 loss: 0.1890 (0.2005) time: 2.9223 data: 0.0078 max mem: 33300 Epoch: [4] [1470/4276] eta: 2:19:27 lr: 4.508532737626574e-05 loss: 0.2030 (0.2006) time: 2.9014 data: 0.0077 max mem: 33300 Epoch: [4] [1480/4276] eta: 2:18:55 lr: 4.508266598330925e-05 loss: 0.2042 (0.2006) time: 2.8822 data: 0.0081 max mem: 33300 Epoch: [4] [1490/4276] eta: 2:18:24 lr: 4.5080004572895795e-05 loss: 0.1844 (0.2005) time: 2.8829 data: 0.0080 max mem: 33300 Epoch: [4] [1500/4276] eta: 2:17:52 lr: 4.507734314502412e-05 loss: 0.1844 (0.2005) time: 2.8814 data: 0.0075 
max mem: 33300 Epoch: [4] [1510/4276] eta: 2:17:21 lr: 4.507468169969296e-05 loss: 0.1798 (0.2005) time: 2.8909 data: 0.0075 max mem: 33300 Epoch: [4] [1520/4276] eta: 2:16:50 lr: 4.507202023690106e-05 loss: 0.1806 (0.2004) time: 2.9144 data: 0.0076 max mem: 33300 Epoch: [4] [1530/4276] eta: 2:16:19 lr: 4.506935875664714e-05 loss: 0.1868 (0.2004) time: 2.9152 data: 0.0078 max mem: 33300 Epoch: [4] [1540/4276] eta: 2:15:47 lr: 4.506669725892996e-05 loss: 0.1915 (0.2004) time: 2.8972 data: 0.0078 max mem: 33300 Epoch: [4] [1550/4276] eta: 2:15:16 lr: 4.506403574374827e-05 loss: 0.1993 (0.2005) time: 2.8830 data: 0.0079 max mem: 33300 Epoch: [4] [1560/4276] eta: 2:14:45 lr: 4.5061374211100774e-05 loss: 0.1835 (0.2004) time: 2.8865 data: 0.0078 max mem: 33300 Epoch: [4] [1570/4276] eta: 2:14:14 lr: 4.5058712660986235e-05 loss: 0.1830 (0.2003) time: 2.8999 data: 0.0079 max mem: 33300 Epoch: [4] [1580/4276] eta: 2:13:44 lr: 4.505605109340339e-05 loss: 0.1804 (0.2003) time: 2.9454 data: 0.0083 max mem: 33300 Epoch: [4] [1590/4276] eta: 2:13:14 lr: 4.505338950835098e-05 loss: 0.1921 (0.2003) time: 2.9646 data: 0.0085 max mem: 33300 Epoch: [4] [1600/4276] eta: 2:12:44 lr: 4.5050727905827726e-05 loss: 0.1921 (0.2003) time: 2.9612 data: 0.0079 max mem: 33300 Epoch: [4] [1610/4276] eta: 2:12:14 lr: 4.5048066285832386e-05 loss: 0.1747 (0.2002) time: 2.9574 data: 0.0076 max mem: 33300 Epoch: [4] [1620/4276] eta: 2:11:43 lr: 4.50454046483637e-05 loss: 0.1832 (0.2001) time: 2.9368 data: 0.0077 max mem: 33300 Epoch: [4] [1630/4276] eta: 2:11:13 lr: 4.504274299342038e-05 loss: 0.1926 (0.2002) time: 2.9426 data: 0.0079 max mem: 33300 Epoch: [4] [1640/4276] eta: 2:10:43 lr: 4.504008132100119e-05 loss: 0.1947 (0.2003) time: 2.9561 data: 0.0079 max mem: 33300 Epoch: [4] [1650/4276] eta: 2:10:13 lr: 4.503741963110486e-05 loss: 0.1887 (0.2003) time: 2.9499 data: 0.0077 max mem: 33300 Epoch: [4] [1660/4276] eta: 2:09:43 lr: 4.503475792373013e-05 loss: 0.1887 (0.2003) time: 2.9416 data: 
0.0076 max mem: 33300 Epoch: [4] [1670/4276] eta: 2:09:12 lr: 4.5032096198875734e-05 loss: 0.1931 (0.2003) time: 2.9302 data: 0.0078 max mem: 33300 Epoch: [4] [1680/4276] eta: 2:08:41 lr: 4.5029434456540404e-05 loss: 0.2001 (0.2003) time: 2.9029 data: 0.0083 max mem: 33300 Epoch: [4] [1690/4276] eta: 2:08:10 lr: 4.502677269672288e-05 loss: 0.2056 (0.2003) time: 2.8952 data: 0.0086 max mem: 33300 Epoch: [4] [1700/4276] eta: 2:07:39 lr: 4.502411091942191e-05 loss: 0.2014 (0.2004) time: 2.9018 data: 0.0092 max mem: 33300 Epoch: [4] [1710/4276] eta: 2:07:09 lr: 4.502144912463622e-05 loss: 0.2063 (0.2005) time: 2.9104 data: 0.0096 max mem: 33300 Epoch: [4] [1720/4276] eta: 2:06:38 lr: 4.501878731236454e-05 loss: 0.2093 (0.2006) time: 2.9184 data: 0.0090 max mem: 33300 Epoch: [4] [1730/4276] eta: 2:06:08 lr: 4.501612548260562e-05 loss: 0.2022 (0.2005) time: 2.9528 data: 0.0085 max mem: 33300 Epoch: [4] [1740/4276] eta: 2:05:40 lr: 4.501346363535819e-05 loss: 0.2046 (0.2006) time: 3.0219 data: 0.0082 max mem: 33300 Epoch: [4] [1750/4276] eta: 2:05:12 lr: 4.5010801770620996e-05 loss: 0.2046 (0.2006) time: 3.0810 data: 0.0079 max mem: 33300 Epoch: [4] [1760/4276] eta: 2:04:43 lr: 4.500813988839276e-05 loss: 0.1946 (0.2005) time: 3.0701 data: 0.0080 max mem: 33300 Epoch: [4] [1770/4276] eta: 2:04:15 lr: 4.5005477988672216e-05 loss: 0.2000 (0.2005) time: 3.0637 data: 0.0082 max mem: 33300 Epoch: [4] [1780/4276] eta: 2:03:47 lr: 4.500281607145812e-05 loss: 0.2005 (0.2005) time: 3.0910 data: 0.0081 max mem: 33300 Epoch: [4] [1790/4276] eta: 2:03:19 lr: 4.500015413674918e-05 loss: 0.1953 (0.2005) time: 3.0991 data: 0.0081 max mem: 33300 Epoch: [4] [1800/4276] eta: 2:02:51 lr: 4.499749218454414e-05 loss: 0.2034 (0.2005) time: 3.0989 data: 0.0084 max mem: 33300 Epoch: [4] [1810/4276] eta: 2:02:23 lr: 4.499483021484175e-05 loss: 0.2056 (0.2005) time: 3.1107 data: 0.0083 max mem: 33300 Epoch: [4] [1820/4276] eta: 2:01:56 lr: 4.499216822764073e-05 loss: 0.1877 (0.2004) time: 3.1507 
data: 0.0090 max mem: 33300 Epoch: [4] [1830/4276] eta: 2:01:31 lr: 4.498950622293983e-05 loss: 0.1830 (0.2005) time: 3.2692 data: 0.0090 max mem: 33300 Epoch: [4] [1840/4276] eta: 2:01:06 lr: 4.498684420073776e-05 loss: 0.1867 (0.2005) time: 3.3320 data: 0.0085 max mem: 33300 Epoch: [4] [1850/4276] eta: 2:00:39 lr: 4.498418216103327e-05 loss: 0.1930 (0.2006) time: 3.2579 data: 0.0082 max mem: 33300 Epoch: [4] [1860/4276] eta: 2:00:15 lr: 4.4981520103825095e-05 loss: 0.2072 (0.2006) time: 3.3164 data: 0.0084 max mem: 33300 Epoch: [4] [1870/4276] eta: 1:59:48 lr: 4.4978858029111956e-05 loss: 0.2003 (0.2006) time: 3.3107 data: 0.0087 max mem: 33300 Epoch: [4] [1880/4276] eta: 1:59:22 lr: 4.4976195936892604e-05 loss: 0.1964 (0.2006) time: 3.2672 data: 0.0085 max mem: 33300 Epoch: [4] [1890/4276] eta: 1:58:56 lr: 4.4973533827165765e-05 loss: 0.1952 (0.2006) time: 3.2917 data: 0.0085 max mem: 33300 Epoch: [4] [1900/4276] eta: 1:58:29 lr: 4.497087169993018e-05 loss: 0.1811 (0.2004) time: 3.2436 data: 0.0089 max mem: 33300 Epoch: [4] [1910/4276] eta: 1:58:03 lr: 4.496820955518456e-05 loss: 0.1811 (0.2004) time: 3.2608 data: 0.0097 max mem: 33300 Epoch: [4] [1920/4276] eta: 1:57:35 lr: 4.496554739292766e-05 loss: 0.1855 (0.2004) time: 3.2480 data: 0.0106 max mem: 33300 Epoch: [4] [1930/4276] eta: 1:57:09 lr: 4.496288521315821e-05 loss: 0.1908 (0.2004) time: 3.2505 data: 0.0098 max mem: 33300 Epoch: [4] [1940/4276] eta: 1:56:42 lr: 4.496022301587492e-05 loss: 0.2100 (0.2005) time: 3.2761 data: 0.0087 max mem: 33300 Epoch: [4] [1950/4276] eta: 1:56:17 lr: 4.495756080107655e-05 loss: 0.1973 (0.2005) time: 3.3241 data: 0.0090 max mem: 33300 Epoch: [4] [1960/4276] eta: 1:55:55 lr: 4.495489856876183e-05 loss: 0.1933 (0.2004) time: 3.5270 data: 0.0097 max mem: 33300 Epoch: [4] [1970/4276] eta: 1:55:29 lr: 4.495223631892947e-05 loss: 0.1764 (0.2003) time: 3.5077 data: 0.0095 max mem: 33300 Epoch: [4] [1980/4276] eta: 1:55:05 lr: 4.494957405157824e-05 loss: 0.1745 (0.2002) time: 
3.4372 data: 0.0096 max mem: 33300 Epoch: [4] [1990/4276] eta: 1:54:40 lr: 4.4946911766706826e-05 loss: 0.1882 (0.2003) time: 3.5182 data: 0.0103 max mem: 33300 Epoch: [4] [2000/4276] eta: 1:54:16 lr: 4.4944249464313995e-05 loss: 0.2092 (0.2003) time: 3.5189 data: 0.0106 max mem: 33300 Epoch: [4] [2010/4276] eta: 1:53:52 lr: 4.4941587144398465e-05 loss: 0.2052 (0.2003) time: 3.5123 data: 0.0110 max mem: 33300 Epoch: [4] [2020/4276] eta: 1:53:27 lr: 4.493892480695896e-05 loss: 0.1934 (0.2003) time: 3.5200 data: 0.0107 max mem: 33300 Epoch: [4] [2030/4276] eta: 1:53:03 lr: 4.4936262451994224e-05 loss: 0.1883 (0.2002) time: 3.5414 data: 0.0105 max mem: 33300 Epoch: [4] [2040/4276] eta: 1:52:38 lr: 4.493360007950298e-05 loss: 0.1852 (0.2002) time: 3.5260 data: 0.0103 max mem: 33300 Epoch: [4] [2050/4276] eta: 1:52:14 lr: 4.493093768948397e-05 loss: 0.1938 (0.2002) time: 3.5302 data: 0.0099 max mem: 33300 Epoch: [4] [2060/4276] eta: 1:51:48 lr: 4.492827528193591e-05 loss: 0.1947 (0.2002) time: 3.5173 data: 0.0101 max mem: 33300 Epoch: [4] [2070/4276] eta: 1:51:22 lr: 4.4925612856857535e-05 loss: 0.1895 (0.2001) time: 3.4508 data: 0.0102 max mem: 33300 Epoch: [4] [2080/4276] eta: 1:50:57 lr: 4.492295041424758e-05 loss: 0.1885 (0.2002) time: 3.4764 data: 0.0106 max mem: 33300 Epoch: [4] [2090/4276] eta: 1:50:31 lr: 4.492028795410478e-05 loss: 0.1880 (0.2001) time: 3.4702 data: 0.0112 max mem: 33300 Epoch: [4] [2100/4276] eta: 1:50:06 lr: 4.4917625476427844e-05 loss: 0.1957 (0.2001) time: 3.4747 data: 0.0104 max mem: 33300 Epoch: [4] [2110/4276] eta: 1:49:39 lr: 4.491496298121552e-05 loss: 0.1853 (0.2001) time: 3.4542 data: 0.0101 max mem: 33300 Epoch: [4] [2120/4276] eta: 1:49:14 lr: 4.4912300468466534e-05 loss: 0.1853 (0.2000) time: 3.4900 data: 0.0114 max mem: 33300 Epoch: [4] [2130/4276] eta: 1:48:48 lr: 4.490963793817961e-05 loss: 0.1734 (0.2000) time: 3.5289 data: 0.0119 max mem: 33300 Epoch: [4] [2140/4276] eta: 1:48:22 lr: 4.490697539035348e-05 loss: 0.1970 
(0.2000) time: 3.4492 data: 0.0108 max mem: 33300 Epoch: [4] [2150/4276] eta: 1:47:57 lr: 4.490431282498688e-05 loss: 0.2005 (0.1999) time: 3.5606 data: 0.0106 max mem: 33300 Epoch: [4] [2160/4276] eta: 1:47:30 lr: 4.490165024207853e-05 loss: 0.1889 (0.2000) time: 3.5265 data: 0.0098 max mem: 33300 Epoch: [4] [2170/4276] eta: 1:47:05 lr: 4.489898764162716e-05 loss: 0.1975 (0.2000) time: 3.4748 data: 0.0097 max mem: 33300 Epoch: [4] [2180/4276] eta: 1:46:39 lr: 4.4896325023631495e-05 loss: 0.2188 (0.2000) time: 3.5408 data: 0.0110 max mem: 33300 Epoch: [4] [2190/4276] eta: 1:46:12 lr: 4.489366238809027e-05 loss: 0.1973 (0.2001) time: 3.4975 data: 0.0101 max mem: 33300 Epoch: [4] [2200/4276] eta: 1:45:47 lr: 4.489099973500221e-05 loss: 0.2019 (0.2001) time: 3.5493 data: 0.0092 max mem: 33300 Epoch: [4] [2210/4276] eta: 1:45:19 lr: 4.488833706436606e-05 loss: 0.2046 (0.2002) time: 3.4856 data: 0.0094 max mem: 33300 Epoch: [4] [2220/4276] eta: 1:44:52 lr: 4.488567437618052e-05 loss: 0.2047 (0.2002) time: 3.3914 data: 0.0096 max mem: 33300 Epoch: [4] [2230/4276] eta: 1:44:23 lr: 4.488301167044432e-05 loss: 0.1895 (0.2001) time: 3.3568 data: 0.0096 max mem: 33300 Epoch: [4] [2240/4276] eta: 1:43:56 lr: 4.488034894715621e-05 loss: 0.1762 (0.2000) time: 3.3319 data: 0.0099 max mem: 33300 Epoch: [4] [2250/4276] eta: 1:43:27 lr: 4.48776862063149e-05 loss: 0.1899 (0.2000) time: 3.3142 data: 0.0094 max mem: 33300 Epoch: [4] [2260/4276] eta: 1:42:58 lr: 4.4875023447919116e-05 loss: 0.1943 (0.2001) time: 3.2645 data: 0.0087 max mem: 33300 Epoch: [4] [2270/4276] eta: 1:42:30 lr: 4.487236067196759e-05 loss: 0.2025 (0.2001) time: 3.3006 data: 0.0088 max mem: 33300 Epoch: [4] [2280/4276] eta: 1:42:01 lr: 4.4869697878459056e-05 loss: 0.1837 (0.2001) time: 3.2953 data: 0.0082 max mem: 33300 Epoch: [4] [2290/4276] eta: 1:41:32 lr: 4.486703506739223e-05 loss: 0.1954 (0.2001) time: 3.2980 data: 0.0082 max mem: 33300 Epoch: [4] [2300/4276] eta: 1:41:03 lr: 4.486437223876584e-05 loss: 
0.1714 (0.1999) time: 3.2889 data: 0.0090 max mem: 33300 Epoch: [4] [2310/4276] eta: 1:40:35 lr: 4.486170939257862e-05 loss: 0.1792 (0.1999) time: 3.3118 data: 0.0098 max mem: 33300 Epoch: [4] [2320/4276] eta: 1:40:06 lr: 4.485904652882929e-05 loss: 0.1989 (0.1999) time: 3.3177 data: 0.0098 max mem: 33300 Epoch: [4] [2330/4276] eta: 1:39:37 lr: 4.485638364751657e-05 loss: 0.1962 (0.1999) time: 3.2940 data: 0.0095 max mem: 33300 Epoch: [4] [2340/4276] eta: 1:39:08 lr: 4.485372074863919e-05 loss: 0.1946 (0.1998) time: 3.2488 data: 0.0094 max mem: 33300 Epoch: [4] [2350/4276] eta: 1:38:38 lr: 4.4851057832195875e-05 loss: 0.1923 (0.1999) time: 3.1972 data: 0.0088 max mem: 33300 Epoch: [4] [2360/4276] eta: 1:38:08 lr: 4.4848394898185354e-05 loss: 0.1915 (0.1998) time: 3.1754 data: 0.0087 max mem: 33300 Epoch: [4] [2370/4276] eta: 1:37:38 lr: 4.4845731946606354e-05 loss: 0.1957 (0.1998) time: 3.1475 data: 0.0084 max mem: 33300 Epoch: [4] [2380/4276] eta: 1:37:07 lr: 4.48430689774576e-05 loss: 0.2005 (0.1998) time: 3.1376 data: 0.0083 max mem: 33300 Epoch: [4] [2390/4276] eta: 1:36:37 lr: 4.48404059907378e-05 loss: 0.1891 (0.1997) time: 3.0864 data: 0.0085 max mem: 33300 Epoch: [4] [2400/4276] eta: 1:36:06 lr: 4.48377429864457e-05 loss: 0.1891 (0.1998) time: 3.0510 data: 0.0083 max mem: 33300 Epoch: [4] [2410/4276] eta: 1:35:33 lr: 4.4835079964580015e-05 loss: 0.1868 (0.1997) time: 2.9629 data: 0.0085 max mem: 33300 Epoch: [4] [2420/4276] eta: 1:35:01 lr: 4.4832416925139466e-05 loss: 0.1723 (0.1996) time: 2.8986 data: 0.0091 max mem: 33300 Epoch: [4] [2430/4276] eta: 1:34:30 lr: 4.482975386812278e-05 loss: 0.1851 (0.1997) time: 2.9247 data: 0.0089 max mem: 33300 Epoch: [4] [2440/4276] eta: 1:33:58 lr: 4.482709079352869e-05 loss: 0.1932 (0.1996) time: 2.9371 data: 0.0089 max mem: 33300 Epoch: [4] [2450/4276] eta: 1:33:26 lr: 4.482442770135591e-05 loss: 0.1802 (0.1996) time: 2.9360 data: 0.0092 max mem: 33300 Epoch: [4] [2460/4276] eta: 1:32:54 lr: 4.482176459160316e-05 
loss: 0.1783 (0.1996) time: 2.9272 data: 0.0092 max mem: 33300 Epoch: [4] [2470/4276] eta: 1:32:23 lr: 4.481910146426917e-05 loss: 0.1950 (0.1996) time: 2.9292 data: 0.0092 max mem: 33300 Epoch: [4] [2480/4276] eta: 1:31:51 lr: 4.481643831935267e-05 loss: 0.2063 (0.1996) time: 2.9384 data: 0.0096 max mem: 33300 Epoch: [4] [2490/4276] eta: 1:31:19 lr: 4.481377515685237e-05 loss: 0.2063 (0.1996) time: 2.9368 data: 0.0096 max mem: 33300 Epoch: [4] [2500/4276] eta: 1:30:48 lr: 4.481111197676699e-05 loss: 0.2058 (0.1996) time: 2.9355 data: 0.0091 max mem: 33300 Epoch: [4] [2510/4276] eta: 1:30:16 lr: 4.480844877909526e-05 loss: 0.1939 (0.1996) time: 2.9265 data: 0.0094 max mem: 33300 Epoch: [4] [2520/4276] eta: 1:29:44 lr: 4.4805785563835914e-05 loss: 0.1580 (0.1995) time: 2.9259 data: 0.0096 max mem: 33300 Epoch: [4] [2530/4276] eta: 1:29:13 lr: 4.4803122330987657e-05 loss: 0.1551 (0.1993) time: 2.9323 data: 0.0090 max mem: 33300 Epoch: [4] [2540/4276] eta: 1:28:41 lr: 4.4800459080549215e-05 loss: 0.1662 (0.1993) time: 2.9303 data: 0.0083 max mem: 33300 Epoch: [4] [2550/4276] eta: 1:28:10 lr: 4.4797795812519315e-05 loss: 0.1880 (0.1992) time: 2.9214 data: 0.0088 max mem: 33300 Epoch: [4] [2560/4276] eta: 1:27:38 lr: 4.479513252689668e-05 loss: 0.1718 (0.1991) time: 2.9235 data: 0.0092 max mem: 33300 Epoch: [4] [2570/4276] eta: 1:27:07 lr: 4.479246922368002e-05 loss: 0.1687 (0.1991) time: 2.9338 data: 0.0087 max mem: 33300 Epoch: [4] [2580/4276] eta: 1:26:35 lr: 4.478980590286807e-05 loss: 0.1687 (0.1991) time: 2.9345 data: 0.0084 max mem: 33300 Epoch: [4] [2590/4276] eta: 1:26:04 lr: 4.4787142564459536e-05 loss: 0.1801 (0.1990) time: 2.9355 data: 0.0088 max mem: 33300 Epoch: [4] [2600/4276] eta: 1:25:32 lr: 4.4784479208453155e-05 loss: 0.1887 (0.1991) time: 2.9249 data: 0.0087 max mem: 33300 Epoch: [4] [2610/4276] eta: 1:25:00 lr: 4.478181583484763e-05 loss: 0.2015 (0.1990) time: 2.9189 data: 0.0086 max mem: 33300 Epoch: [4] [2620/4276] eta: 1:24:29 lr: 
4.477915244364171e-05 loss: 0.2045 (0.1991) time: 2.9288 data: 0.0085 max mem: 33300 Epoch: [4] [2630/4276] eta: 1:23:58 lr: 4.477648903483409e-05 loss: 0.2019 (0.1990) time: 2.9349 data: 0.0084 max mem: 33300 Epoch: [4] [2640/4276] eta: 1:23:26 lr: 4.47738256084235e-05 loss: 0.1774 (0.1990) time: 2.9330 data: 0.0086 max mem: 33300 Epoch: [4] [2650/4276] eta: 1:22:55 lr: 4.477116216440865e-05 loss: 0.1762 (0.1990) time: 2.9121 data: 0.0085 max mem: 33300 Epoch: [4] [2660/4276] eta: 1:22:23 lr: 4.4768498702788276e-05 loss: 0.1918 (0.1990) time: 2.8975 data: 0.0087 max mem: 33300 Epoch: [4] [2670/4276] eta: 1:21:51 lr: 4.476583522356109e-05 loss: 0.1917 (0.1990) time: 2.8984 data: 0.0084 max mem: 33300 Epoch: [4] [2680/4276] eta: 1:21:20 lr: 4.4763171726725814e-05 loss: 0.1858 (0.1989) time: 2.9055 data: 0.0084 max mem: 33300 Epoch: [4] [2690/4276] eta: 1:20:49 lr: 4.476050821228117e-05 loss: 0.1916 (0.1989) time: 2.9216 data: 0.0088 max mem: 33300 Epoch: [4] [2700/4276] eta: 1:20:17 lr: 4.4757844680225855e-05 loss: 0.1879 (0.1988) time: 2.9305 data: 0.0087 max mem: 33300 Epoch: [4] [2710/4276] eta: 1:19:46 lr: 4.4755181130558625e-05 loss: 0.1890 (0.1988) time: 2.9084 data: 0.0082 max mem: 33300 Epoch: [4] [2720/4276] eta: 1:19:14 lr: 4.4752517563278166e-05 loss: 0.1944 (0.1988) time: 2.8843 data: 0.0082 max mem: 33300 Epoch: [4] [2730/4276] eta: 1:18:43 lr: 4.4749853978383215e-05 loss: 0.1865 (0.1989) time: 2.8980 data: 0.0089 max mem: 33300 Epoch: [4] [2740/4276] eta: 1:18:11 lr: 4.4747190375872487e-05 loss: 0.2064 (0.1989) time: 2.8968 data: 0.0086 max mem: 33300 Epoch: [4] [2750/4276] eta: 1:17:40 lr: 4.4744526755744706e-05 loss: 0.2035 (0.1989) time: 2.8814 data: 0.0081 max mem: 33300 Epoch: [4] [2760/4276] eta: 1:17:08 lr: 4.474186311799858e-05 loss: 0.1917 (0.1989) time: 2.8780 data: 0.0080 max mem: 33300 Epoch: [4] [2770/4276] eta: 1:16:37 lr: 4.4739199462632825e-05 loss: 0.1784 (0.1988) time: 2.8794 data: 0.0077 max mem: 33300 Epoch: [4] [2780/4276] eta: 
1:16:06 lr: 4.4736535789646164e-05 loss: 0.1773 (0.1988) time: 2.8914 data: 0.0079 max mem: 33300 Epoch: [4] [2790/4276] eta: 1:15:34 lr: 4.473387209903732e-05 loss: 0.1995 (0.1988) time: 2.9017 data: 0.0086 max mem: 33300 Epoch: [4] [2800/4276] eta: 1:15:03 lr: 4.473120839080501e-05 loss: 0.1948 (0.1988) time: 2.8948 data: 0.0090 max mem: 33300 Epoch: [4] [2810/4276] eta: 1:14:32 lr: 4.472854466494794e-05 loss: 0.1802 (0.1987) time: 2.8923 data: 0.0092 max mem: 33300 Epoch: [4] [2820/4276] eta: 1:14:00 lr: 4.472588092146483e-05 loss: 0.1805 (0.1986) time: 2.9222 data: 0.0095 max mem: 33300 Epoch: [4] [2830/4276] eta: 1:13:29 lr: 4.472321716035441e-05 loss: 0.1827 (0.1986) time: 2.9363 data: 0.0092 max mem: 33300 Epoch: [4] [2840/4276] eta: 1:12:58 lr: 4.4720553381615385e-05 loss: 0.1890 (0.1986) time: 2.9050 data: 0.0081 max mem: 33300 Epoch: [4] [2850/4276] eta: 1:12:27 lr: 4.471788958524647e-05 loss: 0.2004 (0.1987) time: 2.8810 data: 0.0075 max mem: 33300 Epoch: [4] [2860/4276] eta: 1:11:55 lr: 4.471522577124639e-05 loss: 0.1905 (0.1986) time: 2.8924 data: 0.0080 max mem: 33300 Epoch: [4] [2870/4276] eta: 1:11:24 lr: 4.4712561939613855e-05 loss: 0.1813 (0.1986) time: 2.9077 data: 0.0090 max mem: 33300 Epoch: [4] [2880/4276] eta: 1:10:53 lr: 4.470989809034759e-05 loss: 0.1931 (0.1986) time: 2.9221 data: 0.0094 max mem: 33300 Epoch: [4] [2890/4276] eta: 1:10:22 lr: 4.470723422344629e-05 loss: 0.2049 (0.1987) time: 2.9356 data: 0.0091 max mem: 33300 Epoch: [4] [2900/4276] eta: 1:09:51 lr: 4.4704570338908694e-05 loss: 0.1897 (0.1986) time: 2.9366 data: 0.0089 max mem: 33300 Epoch: [4] [2910/4276] eta: 1:09:20 lr: 4.470190643673352e-05 loss: 0.1757 (0.1986) time: 2.9355 data: 0.0086 max mem: 33300 Epoch: [4] [2920/4276] eta: 1:08:49 lr: 4.4699242516919456e-05 loss: 0.1875 (0.1986) time: 2.9327 data: 0.0082 max mem: 33300 Epoch: [4] [2930/4276] eta: 1:08:18 lr: 4.469657857946523e-05 loss: 0.1835 (0.1986) time: 2.9378 data: 0.0083 max mem: 33300 Epoch: [4] [2940/4276] 
eta: 1:07:47 lr: 4.4693914624369575e-05 loss: 0.1801 (0.1984) time: 2.9392 data: 0.0082 max mem: 33300 Epoch: [4] [2950/4276] eta: 1:07:17 lr: 4.469125065163118e-05 loss: 0.1801 (0.1984) time: 2.9389 data: 0.0080 max mem: 33300 Epoch: [4] [2960/4276] eta: 1:06:46 lr: 4.468858666124877e-05 loss: 0.1895 (0.1984) time: 2.9402 data: 0.0081 max mem: 33300 Epoch: [4] [2970/4276] eta: 1:06:15 lr: 4.468592265322106e-05 loss: 0.1954 (0.1985) time: 2.9413 data: 0.0086 max mem: 33300 Epoch: [4] [2980/4276] eta: 1:05:44 lr: 4.468325862754677e-05 loss: 0.2033 (0.1985) time: 2.9398 data: 0.0084 max mem: 33300 Epoch: [4] [2990/4276] eta: 1:05:13 lr: 4.468059458422461e-05 loss: 0.1765 (0.1984) time: 2.9368 data: 0.0080 max mem: 33300 Epoch: [4] [3000/4276] eta: 1:04:42 lr: 4.467793052325328e-05 loss: 0.1742 (0.1984) time: 2.9383 data: 0.0082 max mem: 33300 Epoch: [4] [3010/4276] eta: 1:04:11 lr: 4.467526644463151e-05 loss: 0.1908 (0.1984) time: 2.9380 data: 0.0084 max mem: 33300 Epoch: [4] [3020/4276] eta: 1:03:40 lr: 4.4672602348358016e-05 loss: 0.1854 (0.1983) time: 2.9386 data: 0.0082 max mem: 33300 Epoch: [4] [3030/4276] eta: 1:03:09 lr: 4.4669938234431494e-05 loss: 0.1763 (0.1984) time: 2.9353 data: 0.0080 max mem: 33300 Epoch: [4] [3040/4276] eta: 1:02:39 lr: 4.466727410285068e-05 loss: 0.2034 (0.1985) time: 2.9190 data: 0.0083 max mem: 33300 Epoch: [4] [3050/4276] eta: 1:02:08 lr: 4.466460995361427e-05 loss: 0.2099 (0.1985) time: 2.9066 data: 0.0084 max mem: 33300 Epoch: [4] [3060/4276] eta: 1:01:37 lr: 4.466194578672098e-05 loss: 0.1703 (0.1984) time: 2.9083 data: 0.0082 max mem: 33300 Epoch: [4] [3070/4276] eta: 1:01:06 lr: 4.465928160216952e-05 loss: 0.1745 (0.1984) time: 2.8964 data: 0.0082 max mem: 33300 Epoch: [4] [3080/4276] eta: 1:00:35 lr: 4.465661739995862e-05 loss: 0.1951 (0.1983) time: 2.8836 data: 0.0080 max mem: 33300 Epoch: [4] [3090/4276] eta: 1:00:04 lr: 4.4653953180086965e-05 loss: 0.1909 (0.1983) time: 2.8798 data: 0.0075 max mem: 33300 Epoch: [4] 
[3100/4276] eta: 0:59:33 lr: 4.465128894255329e-05 loss: 0.1950 (0.1983) time: 2.8779 data: 0.0073 max mem: 33300 Epoch: [4] [3110/4276] eta: 0:59:02 lr: 4.46486246873563e-05 loss: 0.1871 (0.1982) time: 2.8965 data: 0.0075 max mem: 33300 Epoch: [4] [3120/4276] eta: 0:58:31 lr: 4.4645960414494695e-05 loss: 0.1752 (0.1982) time: 2.8951 data: 0.0073 max mem: 33300 Epoch: [4] [3130/4276] eta: 0:58:00 lr: 4.4643296123967204e-05 loss: 0.1825 (0.1982) time: 2.8767 data: 0.0071 max mem: 33300 Epoch: [4] [3140/4276] eta: 0:57:29 lr: 4.464063181577253e-05 loss: 0.1857 (0.1982) time: 2.8780 data: 0.0071 max mem: 33300 Epoch: [4] [3150/4276] eta: 0:56:58 lr: 4.463796748990939e-05 loss: 0.1917 (0.1983) time: 2.8788 data: 0.0070 max mem: 33300 Epoch: [4] [3160/4276] eta: 0:56:27 lr: 4.4635303146376486e-05 loss: 0.1917 (0.1983) time: 2.8784 data: 0.0070 max mem: 33300 Epoch: [4] [3170/4276] eta: 0:55:56 lr: 4.463263878517253e-05 loss: 0.1926 (0.1983) time: 2.8781 data: 0.0071 max mem: 33300 Epoch: [4] [3180/4276] eta: 0:55:25 lr: 4.4629974406296247e-05 loss: 0.1944 (0.1983) time: 2.8781 data: 0.0072 max mem: 33300 Epoch: [4] [3190/4276] eta: 0:54:54 lr: 4.462731000974632e-05 loss: 0.1951 (0.1983) time: 2.8835 data: 0.0073 max mem: 33300 Epoch: [4] [3200/4276] eta: 0:54:24 lr: 4.4624645595521485e-05 loss: 0.1861 (0.1983) time: 2.8829 data: 0.0072 max mem: 33300 Epoch: [4] [3210/4276] eta: 0:53:53 lr: 4.462198116362044e-05 loss: 0.1948 (0.1983) time: 2.8771 data: 0.0070 max mem: 33300 Epoch: [4] [3220/4276] eta: 0:53:22 lr: 4.46193167140419e-05 loss: 0.1948 (0.1983) time: 2.8792 data: 0.0073 max mem: 33300 Epoch: [4] [3230/4276] eta: 0:52:51 lr: 4.461665224678458e-05 loss: 0.1903 (0.1983) time: 2.8917 data: 0.0077 max mem: 33300 Epoch: [4] [3240/4276] eta: 0:52:20 lr: 4.461398776184717e-05 loss: 0.1977 (0.1983) time: 2.9194 data: 0.0085 max mem: 33300 Epoch: [4] [3250/4276] eta: 0:51:50 lr: 4.46113232592284e-05 loss: 0.1950 (0.1983) time: 2.9359 data: 0.0089 max mem: 33300 Epoch: 
[4] [3260/4276] eta: 0:51:19 lr: 4.460865873892697e-05 loss: 0.1891 (0.1983) time: 2.9357 data: 0.0082 max mem: 33300 Epoch: [4] [3270/4276] eta: 0:50:49 lr: 4.460599420094159e-05 loss: 0.1900 (0.1983) time: 2.9378 data: 0.0080 max mem: 33300 Epoch: [4] [3280/4276] eta: 0:50:18 lr: 4.4603329645270966e-05 loss: 0.1856 (0.1983) time: 2.9367 data: 0.0078 max mem: 33300 Epoch: [4] [3290/4276] eta: 0:49:47 lr: 4.460066507191381e-05 loss: 0.1898 (0.1983) time: 2.9366 data: 0.0077 max mem: 33300 Epoch: [4] [3300/4276] eta: 0:49:17 lr: 4.459800048086883e-05 loss: 0.2052 (0.1983) time: 2.9356 data: 0.0078 max mem: 33300 Epoch: [4] [3310/4276] eta: 0:48:46 lr: 4.4595335872134734e-05 loss: 0.2033 (0.1984) time: 2.9377 data: 0.0081 max mem: 33300 Epoch: [4] [3320/4276] eta: 0:48:16 lr: 4.4592671245710236e-05 loss: 0.1961 (0.1984) time: 2.9392 data: 0.0082 max mem: 33300 Epoch: [4] [3330/4276] eta: 0:47:45 lr: 4.459000660159404e-05 loss: 0.1861 (0.1984) time: 2.9413 data: 0.0080 max mem: 33300 Epoch: [4] [3340/4276] eta: 0:47:15 lr: 4.458734193978485e-05 loss: 0.1886 (0.1984) time: 2.9276 data: 0.0081 max mem: 33300 Epoch: [4] [3350/4276] eta: 0:46:44 lr: 4.458467726028137e-05 loss: 0.1886 (0.1983) time: 2.8968 data: 0.0078 max mem: 33300 Epoch: [4] [3360/4276] eta: 0:46:13 lr: 4.458201256308232e-05 loss: 0.1844 (0.1983) time: 2.8910 data: 0.0080 max mem: 33300 Epoch: [4] [3370/4276] eta: 0:45:43 lr: 4.457934784818641e-05 loss: 0.2042 (0.1984) time: 2.9113 data: 0.0083 max mem: 33300 Epoch: [4] [3380/4276] eta: 0:45:12 lr: 4.457668311559231e-05 loss: 0.2021 (0.1984) time: 2.9245 data: 0.0081 max mem: 33300 Epoch: [4] [3390/4276] eta: 0:44:42 lr: 4.457401836529878e-05 loss: 0.2056 (0.1984) time: 2.9137 data: 0.0084 max mem: 33300 Epoch: [4] [3400/4276] eta: 0:44:11 lr: 4.4571353597304496e-05 loss: 0.2118 (0.1984) time: 2.8996 data: 0.0084 max mem: 33300 Epoch: [4] [3410/4276] eta: 0:43:40 lr: 4.4568688811608165e-05 loss: 0.1914 (0.1984) time: 2.8995 data: 0.0088 max mem: 33300 
Epoch: [4] [3420/4276] eta: 0:43:10 lr: 4.456602400820851e-05 loss: 0.1877 (0.1984) time: 2.9011 data: 0.0090 max mem: 33300 Epoch: [4] [3430/4276] eta: 0:42:39 lr: 4.4563359187104213e-05 loss: 0.1873 (0.1985) time: 2.9153 data: 0.0086 max mem: 33300 Epoch: [4] [3440/4276] eta: 0:42:09 lr: 4.4560694348294e-05 loss: 0.1691 (0.1984) time: 2.9347 data: 0.0085 max mem: 33300 Epoch: [4] [3450/4276] eta: 0:41:38 lr: 4.455802949177657e-05 loss: 0.1913 (0.1984) time: 2.9358 data: 0.0081 max mem: 33300 Epoch: [4] [3460/4276] eta: 0:41:08 lr: 4.455536461755062e-05 loss: 0.1981 (0.1984) time: 2.9361 data: 0.0079 max mem: 33300 Epoch: [4] [3470/4276] eta: 0:40:37 lr: 4.455269972561487e-05 loss: 0.1777 (0.1984) time: 2.9367 data: 0.0077 max mem: 33300 Epoch: [4] [3480/4276] eta: 0:40:07 lr: 4.4550034815968026e-05 loss: 0.1946 (0.1984) time: 2.9284 data: 0.0072 max mem: 33300 Epoch: [4] [3490/4276] eta: 0:39:36 lr: 4.4547369888608775e-05 loss: 0.2010 (0.1984) time: 2.8956 data: 0.0077 max mem: 33300 Epoch: [4] [3500/4276] eta: 0:39:06 lr: 4.4544704943535836e-05 loss: 0.1998 (0.1984) time: 2.8886 data: 0.0086 max mem: 33300 Epoch: [4] [3510/4276] eta: 0:38:35 lr: 4.4542039980747914e-05 loss: 0.1734 (0.1983) time: 2.9096 data: 0.0092 max mem: 33300 Epoch: [4] [3520/4276] eta: 0:38:05 lr: 4.45393750002437e-05 loss: 0.1975 (0.1983) time: 2.9105 data: 0.0090 max mem: 33300 Epoch: [4] [3530/4276] eta: 0:37:34 lr: 4.453671000202193e-05 loss: 0.2002 (0.1983) time: 2.9312 data: 0.0087 max mem: 33300 Epoch: [4] [3540/4276] eta: 0:37:04 lr: 4.453404498608128e-05 loss: 0.1830 (0.1983) time: 2.9456 data: 0.0082 max mem: 33300 Epoch: [4] [3550/4276] eta: 0:36:34 lr: 4.453137995242045e-05 loss: 0.1921 (0.1983) time: 2.9367 data: 0.0074 max mem: 33300 Epoch: [4] [3560/4276] eta: 0:36:03 lr: 4.452871490103816e-05 loss: 0.1921 (0.1983) time: 2.9189 data: 0.0075 max mem: 33300 Epoch: [4] [3570/4276] eta: 0:35:33 lr: 4.452604983193311e-05 loss: 0.2032 (0.1983) time: 2.9080 data: 0.0075 max mem: 
33300 Epoch: [4] [3580/4276] eta: 0:35:02 lr: 4.4523384745104e-05 loss: 0.2000 (0.1983) time: 2.9168 data: 0.0072 max mem: 33300 Epoch: [4] [3590/4276] eta: 0:34:32 lr: 4.452071964054954e-05 loss: 0.1806 (0.1983) time: 2.9160 data: 0.0072 max mem: 33300 Epoch: [4] [3600/4276] eta: 0:34:02 lr: 4.451805451826843e-05 loss: 0.1954 (0.1984) time: 2.9145 data: 0.0077 max mem: 33300 Epoch: [4] [3610/4276] eta: 0:33:31 lr: 4.451538937825936e-05 loss: 0.2052 (0.1984) time: 2.8986 data: 0.0088 max mem: 33300 Epoch: [4] [3620/4276] eta: 0:33:01 lr: 4.4512724220521054e-05 loss: 0.1946 (0.1984) time: 2.8831 data: 0.0092 max mem: 33300 Epoch: [4] [3630/4276] eta: 0:32:30 lr: 4.45100590450522e-05 loss: 0.1991 (0.1984) time: 2.8831 data: 0.0089 max mem: 33300 Epoch: [4] [3640/4276] eta: 0:32:00 lr: 4.450739385185151e-05 loss: 0.2070 (0.1984) time: 2.9027 data: 0.0087 max mem: 33300 Epoch: [4] [3650/4276] eta: 0:31:29 lr: 4.4504728640917675e-05 loss: 0.1815 (0.1984) time: 2.9324 data: 0.0086 max mem: 33300 Epoch: [4] [3660/4276] eta: 0:30:59 lr: 4.4502063412249405e-05 loss: 0.1781 (0.1984) time: 2.9372 data: 0.0083 max mem: 33300 Epoch: [4] [3670/4276] eta: 0:30:29 lr: 4.44993981658454e-05 loss: 0.1970 (0.1984) time: 2.9342 data: 0.0083 max mem: 33300 Epoch: [4] [3680/4276] eta: 0:29:58 lr: 4.449673290170437e-05 loss: 0.2095 (0.1984) time: 2.9154 data: 0.0088 max mem: 33300 Epoch: [4] [3690/4276] eta: 0:29:28 lr: 4.4494067619825e-05 loss: 0.2159 (0.1985) time: 2.8906 data: 0.0092 max mem: 33300 Epoch: [4] [3700/4276] eta: 0:28:58 lr: 4.4491402320206e-05 loss: 0.2017 (0.1985) time: 2.8825 data: 0.0089 max mem: 33300 Epoch: [4] [3710/4276] eta: 0:28:27 lr: 4.448873700284608e-05 loss: 0.1944 (0.1984) time: 2.8900 data: 0.0083 max mem: 33300 Epoch: [4] [3720/4276] eta: 0:27:57 lr: 4.4486071667743924e-05 loss: 0.1809 (0.1984) time: 2.9196 data: 0.0086 max mem: 33300 Epoch: [4] [3730/4276] eta: 0:27:27 lr: 4.448340631489824e-05 loss: 0.1905 (0.1984) time: 2.9374 data: 0.0089 max mem: 
33300 Epoch: [4] [3740/4276] eta: 0:26:56 lr: 4.448074094430773e-05 loss: 0.1931 (0.1984) time: 2.9270 data: 0.0089 max mem: 33300 Epoch: [4] [3750/4276] eta: 0:26:26 lr: 4.447807555597109e-05 loss: 0.1827 (0.1984) time: 2.9111 data: 0.0090 max mem: 33300 Epoch: [4] [3760/4276] eta: 0:25:56 lr: 4.447541014988703e-05 loss: 0.1920 (0.1984) time: 2.8843 data: 0.0087 max mem: 33300 Epoch: [4] [3770/4276] eta: 0:25:25 lr: 4.4472744726054246e-05 loss: 0.1975 (0.1984) time: 2.8707 data: 0.0090 max mem: 33300 Epoch: [4] [3780/4276] eta: 0:24:55 lr: 4.447007928447143e-05 loss: 0.2012 (0.1984) time: 2.8792 data: 0.0091 max mem: 33300 Epoch: [4] [3790/4276] eta: 0:24:25 lr: 4.446741382513729e-05 loss: 0.1927 (0.1984) time: 2.8895 data: 0.0087 max mem: 33300 Epoch: [4] [3800/4276] eta: 0:23:54 lr: 4.446474834805052e-05 loss: 0.1943 (0.1984) time: 2.8960 data: 0.0088 max mem: 33300 Epoch: [4] [3810/4276] eta: 0:23:24 lr: 4.4462082853209816e-05 loss: 0.1909 (0.1984) time: 2.8949 data: 0.0089 max mem: 33300 Epoch: [4] [3820/4276] eta: 0:22:54 lr: 4.445941734061389e-05 loss: 0.1821 (0.1984) time: 2.8990 data: 0.0087 max mem: 33300 Epoch: [4] [3830/4276] eta: 0:22:24 lr: 4.4456751810261435e-05 loss: 0.1821 (0.1984) time: 2.9083 data: 0.0083 max mem: 33300 Epoch: [4] [3840/4276] eta: 0:21:53 lr: 4.4454086262151154e-05 loss: 0.1837 (0.1984) time: 2.9110 data: 0.0084 max mem: 33300 Epoch: [4] [3850/4276] eta: 0:21:23 lr: 4.445142069628174e-05 loss: 0.1837 (0.1984) time: 2.9092 data: 0.0087 max mem: 33300 Epoch: [4] [3860/4276] eta: 0:20:53 lr: 4.444875511265188e-05 loss: 0.1965 (0.1984) time: 2.9131 data: 0.0091 max mem: 33300 Epoch: [4] [3870/4276] eta: 0:20:23 lr: 4.4446089511260296e-05 loss: 0.1978 (0.1984) time: 2.9142 data: 0.0087 max mem: 33300 Epoch: [4] [3880/4276] eta: 0:19:52 lr: 4.444342389210567e-05 loss: 0.1875 (0.1984) time: 2.9250 data: 0.0082 max mem: 33300 Epoch: [4] [3890/4276] eta: 0:19:22 lr: 4.44407582551867e-05 loss: 0.1848 (0.1984) time: 2.9349 data: 0.0080 max 
mem: 33300 Epoch: [4] [3900/4276] eta: 0:18:52 lr: 4.443809260050209e-05 loss: 0.1943 (0.1984) time: 2.9328 data: 0.0074 max mem: 33300 Epoch: [4] [3910/4276] eta: 0:18:22 lr: 4.443542692805054e-05 loss: 0.1875 (0.1984) time: 2.9325 data: 0.0072 max mem: 33300 Epoch: [4] [3920/4276] eta: 0:17:52 lr: 4.443276123783074e-05 loss: 0.1831 (0.1983) time: 2.9317 data: 0.0074 max mem: 33300 Epoch: [4] [3930/4276] eta: 0:17:21 lr: 4.443009552984138e-05 loss: 0.1849 (0.1983) time: 2.9321 data: 0.0076 max mem: 33300 Epoch: [4] [3940/4276] eta: 0:16:51 lr: 4.442742980408118e-05 loss: 0.1864 (0.1983) time: 2.9323 data: 0.0074 max mem: 33300 Epoch: [4] [3950/4276] eta: 0:16:21 lr: 4.442476406054882e-05 loss: 0.1777 (0.1983) time: 2.9336 data: 0.0073 max mem: 33300 Epoch: [4] [3960/4276] eta: 0:15:51 lr: 4.4422098299242995e-05 loss: 0.1914 (0.1983) time: 2.9357 data: 0.0073 max mem: 33300 Epoch: [4] [3970/4276] eta: 0:15:21 lr: 4.441943252016241e-05 loss: 0.2084 (0.1983) time: 2.9345 data: 0.0081 max mem: 33300 Epoch: [4] [3980/4276] eta: 0:14:51 lr: 4.441676672330575e-05 loss: 0.1913 (0.1983) time: 2.9111 data: 0.0087 max mem: 33300 Epoch: [4] [3990/4276] eta: 0:14:20 lr: 4.441410090867173e-05 loss: 0.1913 (0.1983) time: 2.9093 data: 0.0085 max mem: 33300 Epoch: [4] [4000/4276] eta: 0:13:50 lr: 4.441143507625903e-05 loss: 0.1773 (0.1983) time: 2.9179 data: 0.0086 max mem: 33300 Epoch: [4] [4010/4276] eta: 0:13:20 lr: 4.440876922606635e-05 loss: 0.1747 (0.1983) time: 2.8988 data: 0.0087 max mem: 33300 Epoch: [4] [4020/4276] eta: 0:12:50 lr: 4.4406103358092384e-05 loss: 0.1834 (0.1983) time: 2.8857 data: 0.0083 max mem: 33300 Epoch: [4] [4030/4276] eta: 0:12:20 lr: 4.440343747233583e-05 loss: 0.1922 (0.1983) time: 2.8812 data: 0.0078 max mem: 33300 Epoch: [4] [4040/4276] eta: 0:11:50 lr: 4.440077156879538e-05 loss: 0.1937 (0.1983) time: 2.8824 data: 0.0079 max mem: 33300 Epoch: [4] [4050/4276] eta: 0:11:19 lr: 4.4398105647469726e-05 loss: 0.1827 (0.1983) time: 2.8746 data: 0.0079 
max mem: 33300 Epoch: [4] [4060/4276] eta: 0:10:49 lr: 4.439543970835758e-05 loss: 0.1738 (0.1983) time: 2.8695 data: 0.0082 max mem: 33300 Epoch: [4] [4070/4276] eta: 0:10:19 lr: 4.439277375145762e-05 loss: 0.1918 (0.1983) time: 2.8786 data: 0.0084 max mem: 33300 Epoch: [4] [4080/4276] eta: 0:09:49 lr: 4.439010777676854e-05 loss: 0.1914 (0.1983) time: 2.8857 data: 0.0082 max mem: 33300 Epoch: [4] [4090/4276] eta: 0:09:19 lr: 4.438744178428904e-05 loss: 0.1951 (0.1983) time: 2.8882 data: 0.0077 max mem: 33300 Epoch: [4] [4100/4276] eta: 0:08:49 lr: 4.438477577401782e-05 loss: 0.2059 (0.1983) time: 2.8883 data: 0.0075 max mem: 33300 Epoch: [4] [4110/4276] eta: 0:08:19 lr: 4.438210974595355e-05 loss: 0.2043 (0.1983) time: 2.8859 data: 0.0077 max mem: 33300 Epoch: [4] [4120/4276] eta: 0:07:48 lr: 4.437944370009496e-05 loss: 0.1954 (0.1983) time: 2.8839 data: 0.0080 max mem: 33300 Epoch: [4] [4130/4276] eta: 0:07:18 lr: 4.437677763644071e-05 loss: 0.1954 (0.1983) time: 2.8916 data: 0.0085 max mem: 33300 Epoch: [4] [4140/4276] eta: 0:06:48 lr: 4.437411155498951e-05 loss: 0.1910 (0.1983) time: 2.9165 data: 0.0087 max mem: 33300 Epoch: [4] [4150/4276] eta: 0:06:18 lr: 4.437144545574005e-05 loss: 0.1824 (0.1983) time: 2.9173 data: 0.0087 max mem: 33300 Epoch: [4] [4160/4276] eta: 0:05:48 lr: 4.436877933869103e-05 loss: 0.1808 (0.1983) time: 2.9015 data: 0.0088 max mem: 33300 Epoch: [4] [4170/4276] eta: 0:05:18 lr: 4.4366113203841126e-05 loss: 0.2090 (0.1983) time: 2.9186 data: 0.0084 max mem: 33300 Epoch: [4] [4180/4276] eta: 0:04:48 lr: 4.4363447051189045e-05 loss: 0.1976 (0.1983) time: 2.9371 data: 0.0081 max mem: 33300 Epoch: [4] [4190/4276] eta: 0:04:18 lr: 4.436078088073347e-05 loss: 0.1907 (0.1983) time: 2.9399 data: 0.0080 max mem: 33300 Epoch: [4] [4200/4276] eta: 0:03:48 lr: 4.43581146924731e-05 loss: 0.1973 (0.1983) time: 2.9408 data: 0.0078 max mem: 33300 Epoch: [4] [4210/4276] eta: 0:03:18 lr: 4.435544848640663e-05 loss: 0.2045 (0.1984) time: 2.9393 data: 
0.0077 max mem: 33300 Epoch: [4] [4220/4276] eta: 0:02:48 lr: 4.435278226253274e-05 loss: 0.2234 (0.1984) time: 2.9391 data: 0.0077 max mem: 33300 Epoch: [4] [4230/4276] eta: 0:02:18 lr: 4.4350116020850134e-05 loss: 0.2234 (0.1985) time: 2.9379 data: 0.0078 max mem: 33300 Epoch: [4] [4240/4276] eta: 0:01:48 lr: 4.434744976135749e-05 loss: 0.2180 (0.1985) time: 2.9370 data: 0.0079 max mem: 33300 Epoch: [4] [4250/4276] eta: 0:01:18 lr: 4.434478348405351e-05 loss: 0.2025 (0.1985) time: 2.9368 data: 0.0077 max mem: 33300 Epoch: [4] [4260/4276] eta: 0:00:48 lr: 4.434211718893689e-05 loss: 0.1908 (0.1985) time: 2.9352 data: 0.0077 max mem: 33300 Epoch: [4] [4270/4276] eta: 0:00:18 lr: 4.433945087600631e-05 loss: 0.1951 (0.1985) time: 2.9305 data: 0.0074 max mem: 33300 Epoch: [4] Total time: 3:34:03 Test: [ 0/21770] eta: 6:40:12 time: 1.1030 data: 1.0601 max mem: 33300 Test: [ 100/21770] eta: 0:19:18 time: 0.0387 data: 0.0008 max mem: 33300 Test: [ 200/21770] eta: 0:16:34 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 300/21770] eta: 0:15:37 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 400/21770] eta: 0:15:06 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 500/21770] eta: 0:14:46 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 600/21770] eta: 0:14:32 time: 0.0386 data: 0.0008 max mem: 33300 Test: [ 700/21770] eta: 0:14:21 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 800/21770] eta: 0:14:11 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 900/21770] eta: 0:14:03 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 1000/21770] eta: 0:13:56 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 1100/21770] eta: 0:13:49 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 1200/21770] eta: 0:13:43 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 1300/21770] eta: 0:13:37 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 1400/21770] eta: 0:13:32 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 1500/21770] eta: 0:13:27 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 1600/21770] 
eta: 0:13:22 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 1700/21770] eta: 0:13:18 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 1800/21770] eta: 0:13:13 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 1900/21770] eta: 0:13:09 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 2000/21770] eta: 0:13:04 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 2100/21770] eta: 0:13:00 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 2200/21770] eta: 0:12:56 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 2300/21770] eta: 0:12:52 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 2400/21770] eta: 0:12:48 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 2500/21770] eta: 0:12:44 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 2600/21770] eta: 0:12:40 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 2700/21770] eta: 0:12:36 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 2800/21770] eta: 0:12:32 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 2900/21770] eta: 0:12:28 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3000/21770] eta: 0:12:24 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 3100/21770] eta: 0:12:19 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 3200/21770] eta: 0:12:14 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 3300/21770] eta: 0:12:09 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 3400/21770] eta: 0:12:04 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 3500/21770] eta: 0:11:59 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 3600/21770] eta: 0:11:54 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 3700/21770] eta: 0:11:50 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 3800/21770] eta: 0:11:45 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 3900/21770] eta: 0:11:41 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 4000/21770] eta: 0:11:37 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 4100/21770] eta: 0:11:33 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 4200/21770] eta: 0:11:29 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 4300/21770] 
eta: 0:11:25 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 4400/21770] eta: 0:11:21 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 4500/21770] eta: 0:11:18 time: 0.0397 data: 0.0008 max mem: 33300 Test: [ 4600/21770] eta: 0:11:14 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 4700/21770] eta: 0:11:10 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 4800/21770] eta: 0:11:05 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 4900/21770] eta: 0:11:01 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 5000/21770] eta: 0:10:57 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 5100/21770] eta: 0:10:53 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 5200/21770] eta: 0:10:49 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 5300/21770] eta: 0:10:45 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 5400/21770] eta: 0:10:40 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 5500/21770] eta: 0:10:36 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 5600/21770] eta: 0:10:32 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 5700/21770] eta: 0:10:28 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 5800/21770] eta: 0:10:24 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 5900/21770] eta: 0:10:20 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 6000/21770] eta: 0:10:16 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 6100/21770] eta: 0:10:12 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 6200/21770] eta: 0:10:08 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 6300/21770] eta: 0:10:04 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 6400/21770] eta: 0:10:00 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 6500/21770] eta: 0:09:56 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 6600/21770] eta: 0:09:52 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 6700/21770] eta: 0:09:48 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 6800/21770] eta: 0:09:44 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 6900/21770] eta: 0:09:40 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 7000/21770] 
eta: 0:09:36 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 7100/21770] eta: 0:09:32 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 7200/21770] eta: 0:09:29 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 7300/21770] eta: 0:09:25 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 7400/21770] eta: 0:09:21 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 7500/21770] eta: 0:09:17 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 7600/21770] eta: 0:09:13 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 7700/21770] eta: 0:09:09 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 7800/21770] eta: 0:09:05 time: 0.0398 data: 0.0009 max mem: 33300 Test: [ 7900/21770] eta: 0:09:01 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 8000/21770] eta: 0:08:57 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 8100/21770] eta: 0:08:54 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 8200/21770] eta: 0:08:50 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 8300/21770] eta: 0:08:46 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 8400/21770] eta: 0:08:42 time: 0.0397 data: 0.0008 max mem: 33300 Test: [ 8500/21770] eta: 0:08:38 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 8600/21770] eta: 0:08:35 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 8700/21770] eta: 0:08:31 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 8800/21770] eta: 0:08:27 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 8900/21770] eta: 0:08:23 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 9000/21770] eta: 0:08:20 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 9100/21770] eta: 0:08:16 time: 0.0401 data: 0.0009 max mem: 33300 Test: [ 9200/21770] eta: 0:08:12 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 9300/21770] eta: 0:08:08 time: 0.0397 data: 0.0008 max mem: 33300 Test: [ 9400/21770] eta: 0:08:04 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 9500/21770] eta: 0:08:00 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 9600/21770] eta: 0:07:56 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 9700/21770] 
eta: 0:07:52 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 9800/21770] eta: 0:07:48 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 9900/21770] eta: 0:07:44 time: 0.0379 data: 0.0009 max mem: 33300 Test: [10000/21770] eta: 0:07:40 time: 0.0401 data: 0.0008 max mem: 33300 Test: [10100/21770] eta: 0:07:36 time: 0.0399 data: 0.0008 max mem: 33300 Test: [10200/21770] eta: 0:07:33 time: 0.0401 data: 0.0008 max mem: 33300 Test: [10300/21770] eta: 0:07:29 time: 0.0400 data: 0.0008 max mem: 33300 Test: [10400/21770] eta: 0:07:25 time: 0.0400 data: 0.0008 max mem: 33300 Test: [10500/21770] eta: 0:07:21 time: 0.0398 data: 0.0008 max mem: 33300 Test: [10600/21770] eta: 0:07:17 time: 0.0400 data: 0.0008 max mem: 33300 Test: [10700/21770] eta: 0:07:13 time: 0.0386 data: 0.0008 max mem: 33300 Test: [10800/21770] eta: 0:07:09 time: 0.0387 data: 0.0009 max mem: 33300 Test: [10900/21770] eta: 0:07:05 time: 0.0388 data: 0.0009 max mem: 33300 Test: [11000/21770] eta: 0:07:01 time: 0.0379 data: 0.0009 max mem: 33300 Test: [11100/21770] eta: 0:06:57 time: 0.0379 data: 0.0008 max mem: 33300 Test: [11200/21770] eta: 0:06:53 time: 0.0378 data: 0.0008 max mem: 33300 Test: [11300/21770] eta: 0:06:49 time: 0.0396 data: 0.0008 max mem: 33300 Test: [11400/21770] eta: 0:06:45 time: 0.0385 data: 0.0008 max mem: 33300 Test: [11500/21770] eta: 0:06:41 time: 0.0387 data: 0.0008 max mem: 33300 Test: [11600/21770] eta: 0:06:37 time: 0.0386 data: 0.0008 max mem: 33300 Test: [11700/21770] eta: 0:06:34 time: 0.0385 data: 0.0008 max mem: 33300 Test: [11800/21770] eta: 0:06:30 time: 0.0386 data: 0.0008 max mem: 33300 Test: [11900/21770] eta: 0:06:26 time: 0.0384 data: 0.0008 max mem: 33300 Test: [12000/21770] eta: 0:06:22 time: 0.0386 data: 0.0008 max mem: 33300 Test: [12100/21770] eta: 0:06:18 time: 0.0387 data: 0.0008 max mem: 33300 Test: [12200/21770] eta: 0:06:14 time: 0.0386 data: 0.0009 max mem: 33300 Test: [12300/21770] eta: 0:06:10 time: 0.0384 data: 0.0008 max mem: 33300 Test: [12400/21770] 
eta: 0:06:06 time: 0.0386 data: 0.0008 max mem: 33300 Test: [12500/21770] eta: 0:06:02 time: 0.0391 data: 0.0009 max mem: 33300 Test: [12600/21770] eta: 0:05:58 time: 0.0392 data: 0.0009 max mem: 33300 Test: [12700/21770] eta: 0:05:54 time: 0.0394 data: 0.0009 max mem: 33300 Test: [12800/21770] eta: 0:05:50 time: 0.0396 data: 0.0008 max mem: 33300 Test: [12900/21770] eta: 0:05:46 time: 0.0394 data: 0.0008 max mem: 33300 Test: [13000/21770] eta: 0:05:42 time: 0.0396 data: 0.0008 max mem: 33300 Test: [13100/21770] eta: 0:05:39 time: 0.0395 data: 0.0009 max mem: 33300 Test: [13200/21770] eta: 0:05:35 time: 0.0398 data: 0.0008 max mem: 33300 Test: [13300/21770] eta: 0:05:31 time: 0.0393 data: 0.0008 max mem: 33300 Test: [13400/21770] eta: 0:05:27 time: 0.0398 data: 0.0009 max mem: 33300 Test: [13500/21770] eta: 0:05:23 time: 0.0397 data: 0.0009 max mem: 33300 Test: [13600/21770] eta: 0:05:19 time: 0.0392 data: 0.0008 max mem: 33300 Test: [13700/21770] eta: 0:05:15 time: 0.0401 data: 0.0009 max mem: 33300 Test: [13800/21770] eta: 0:05:11 time: 0.0397 data: 0.0008 max mem: 33300 Test: [13900/21770] eta: 0:05:07 time: 0.0394 data: 0.0008 max mem: 33300 Test: [14000/21770] eta: 0:05:04 time: 0.0393 data: 0.0009 max mem: 33300 Test: [14100/21770] eta: 0:05:00 time: 0.0401 data: 0.0008 max mem: 33300 Test: [14200/21770] eta: 0:04:56 time: 0.0398 data: 0.0008 max mem: 33300 Test: [14300/21770] eta: 0:04:52 time: 0.0401 data: 0.0008 max mem: 33300 Test: [14400/21770] eta: 0:04:48 time: 0.0399 data: 0.0008 max mem: 33300 Test: [14500/21770] eta: 0:04:44 time: 0.0401 data: 0.0008 max mem: 33300 Test: [14600/21770] eta: 0:04:40 time: 0.0400 data: 0.0008 max mem: 33300 Test: [14700/21770] eta: 0:04:36 time: 0.0401 data: 0.0008 max mem: 33300 Test: [14800/21770] eta: 0:04:33 time: 0.0397 data: 0.0008 max mem: 33300 Test: [14900/21770] eta: 0:04:29 time: 0.0400 data: 0.0008 max mem: 33300 Test: [15000/21770] eta: 0:04:25 time: 0.0400 data: 0.0008 max mem: 33300 Test: [15100/21770] 
eta: 0:04:21 time: 0.0400 data: 0.0008 max mem: 33300 Test: [15200/21770] eta: 0:04:17 time: 0.0400 data: 0.0008 max mem: 33300 Test: [15300/21770] eta: 0:04:13 time: 0.0400 data: 0.0008 max mem: 33300 Test: [15400/21770] eta: 0:04:09 time: 0.0400 data: 0.0009 max mem: 33300 Test: [15500/21770] eta: 0:04:05 time: 0.0399 data: 0.0008 max mem: 33300 Test: [15600/21770] eta: 0:04:01 time: 0.0405 data: 0.0008 max mem: 33300 Test: [15700/21770] eta: 0:03:58 time: 0.0405 data: 0.0008 max mem: 33300 Test: [15800/21770] eta: 0:03:54 time: 0.0400 data: 0.0008 max mem: 33300 Test: [15900/21770] eta: 0:03:50 time: 0.0407 data: 0.0008 max mem: 33300 Test: [16000/21770] eta: 0:03:46 time: 0.0401 data: 0.0008 max mem: 33300 Test: [16100/21770] eta: 0:03:42 time: 0.0401 data: 0.0008 max mem: 33300 Test: [16200/21770] eta: 0:03:38 time: 0.0401 data: 0.0008 max mem: 33300 Test: [16300/21770] eta: 0:03:34 time: 0.0399 data: 0.0008 max mem: 33300 Test: [16400/21770] eta: 0:03:30 time: 0.0400 data: 0.0008 max mem: 33300 Test: [16500/21770] eta: 0:03:26 time: 0.0400 data: 0.0008 max mem: 33300 Test: [16600/21770] eta: 0:03:23 time: 0.0400 data: 0.0008 max mem: 33300 Test: [16700/21770] eta: 0:03:19 time: 0.0400 data: 0.0008 max mem: 33300 Test: [16800/21770] eta: 0:03:15 time: 0.0399 data: 0.0008 max mem: 33300 Test: [16900/21770] eta: 0:03:11 time: 0.0400 data: 0.0008 max mem: 33300 Test: [17000/21770] eta: 0:03:07 time: 0.0400 data: 0.0008 max mem: 33300 Test: [17100/21770] eta: 0:03:03 time: 0.0400 data: 0.0008 max mem: 33300 Test: [17200/21770] eta: 0:02:59 time: 0.0400 data: 0.0008 max mem: 33300 Test: [17300/21770] eta: 0:02:55 time: 0.0400 data: 0.0008 max mem: 33300 Test: [17400/21770] eta: 0:02:51 time: 0.0399 data: 0.0008 max mem: 33300 Test: [17500/21770] eta: 0:02:47 time: 0.0400 data: 0.0008 max mem: 33300 Test: [17600/21770] eta: 0:02:43 time: 0.0400 data: 0.0008 max mem: 33300 Test: [17700/21770] eta: 0:02:40 time: 0.0400 data: 0.0008 max mem: 33300 Test: [17800/21770] 
eta: 0:02:36 time: 0.0400 data: 0.0008 max mem: 33300 Test: [17900/21770] eta: 0:02:32 time: 0.0400 data: 0.0008 max mem: 33300 Test: [18000/21770] eta: 0:02:28 time: 0.0400 data: 0.0008 max mem: 33300 Test: [18100/21770] eta: 0:02:24 time: 0.0400 data: 0.0008 max mem: 33300 Test: [18200/21770] eta: 0:02:20 time: 0.0379 data: 0.0008 max mem: 33300 Test: [18300/21770] eta: 0:02:16 time: 0.0379 data: 0.0008 max mem: 33300 Test: [18400/21770] eta: 0:02:12 time: 0.0379 data: 0.0008 max mem: 33300 Test: [18500/21770] eta: 0:02:08 time: 0.0380 data: 0.0008 max mem: 33300 Test: [18600/21770] eta: 0:02:04 time: 0.0379 data: 0.0008 max mem: 33300 Test: [18700/21770] eta: 0:02:00 time: 0.0379 data: 0.0008 max mem: 33300 Test: [18800/21770] eta: 0:01:56 time: 0.0379 data: 0.0008 max mem: 33300 Test: [18900/21770] eta: 0:01:52 time: 0.0380 data: 0.0008 max mem: 33300 Test: [19000/21770] eta: 0:01:48 time: 0.0379 data: 0.0008 max mem: 33300 Test: [19100/21770] eta: 0:01:44 time: 0.0379 data: 0.0008 max mem: 33300 Test: [19200/21770] eta: 0:01:40 time: 0.0379 data: 0.0008 max mem: 33300 Test: [19300/21770] eta: 0:01:36 time: 0.0379 data: 0.0008 max mem: 33300 Test: [19400/21770] eta: 0:01:33 time: 0.0379 data: 0.0008 max mem: 33300 Test: [19500/21770] eta: 0:01:29 time: 0.0380 data: 0.0008 max mem: 33300 Test: [19600/21770] eta: 0:01:25 time: 0.0379 data: 0.0008 max mem: 33300 Test: [19700/21770] eta: 0:01:21 time: 0.0383 data: 0.0009 max mem: 33300 Test: [19800/21770] eta: 0:01:17 time: 0.0401 data: 0.0008 max mem: 33300 Test: [19900/21770] eta: 0:01:13 time: 0.0401 data: 0.0008 max mem: 33300 Test: [20000/21770] eta: 0:01:09 time: 0.0400 data: 0.0008 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0400 data: 0.0008 max mem: 33300 Test: [20200/21770] eta: 0:01:01 time: 0.0401 data: 0.0008 max mem: 33300 Test: [20300/21770] eta: 0:00:57 time: 0.0400 data: 0.0008 max mem: 33300 Test: [20400/21770] eta: 0:00:53 time: 0.0400 data: 0.0008 max mem: 33300 Test: [20500/21770] 
eta: 0:00:49 time: 0.0398 data: 0.0008 max mem: 33300 Test: [20600/21770] eta: 0:00:45 time: 0.0400 data: 0.0008 max mem: 33300 Test: [20700/21770] eta: 0:00:42 time: 0.0400 data: 0.0008 max mem: 33300 Test: [20800/21770] eta: 0:00:38 time: 0.0400 data: 0.0008 max mem: 33300 Test: [20900/21770] eta: 0:00:34 time: 0.0401 data: 0.0008 max mem: 33300 Test: [21000/21770] eta: 0:00:30 time: 0.0400 data: 0.0008 max mem: 33300 Test: [21100/21770] eta: 0:00:26 time: 0.0402 data: 0.0008 max mem: 33300 Test: [21200/21770] eta: 0:00:22 time: 0.0398 data: 0.0008 max mem: 33300 Test: [21300/21770] eta: 0:00:18 time: 0.0400 data: 0.0008 max mem: 33300 Test: [21400/21770] eta: 0:00:14 time: 0.0400 data: 0.0008 max mem: 33300 Test: [21500/21770] eta: 0:00:10 time: 0.0400 data: 0.0008 max mem: 33300 Test: [21600/21770] eta: 0:00:06 time: 0.0407 data: 0.0008 max mem: 33300 Test: [21700/21770] eta: 0:00:02 time: 0.0404 data: 0.0008 max mem: 33300 Test: Total time: 0:14:15 Final results: Mean IoU is 0.00 precision@0.5 = 0.00 precision@0.6 = 0.00 precision@0.7 = 0.00 precision@0.8 = 0.00 precision@0.9 = 0.00 overall IoU = 0.00 mean IoU = 0.00 Mean accuracy for one-to-zero sample is 100.00 Average object IoU 0.0 Overall IoU 0.0 Epoch: [5] [ 0/4276] eta: 6:52:14 lr: 4.4337851079696714e-05 loss: 0.1787 (0.1787) time: 5.7845 data: 2.7058 max mem: 33300 Epoch: [5] [ 10/4276] eta: 3:45:42 lr: 4.4335184738261074e-05 loss: 0.1796 (0.1878) time: 3.1746 data: 0.2535 max mem: 33300 Epoch: [5] [ 20/4276] eta: 3:36:18 lr: 4.433251837900807e-05 loss: 0.1828 (0.1938) time: 2.9127 data: 0.0086 max mem: 33300 Epoch: [5] [ 30/4276] eta: 3:32:02 lr: 4.43298520019364e-05 loss: 0.1887 (0.1955) time: 2.8985 data: 0.0081 max mem: 33300 Epoch: [5] [ 40/4276] eta: 3:29:38 lr: 4.4327185607044746e-05 loss: 0.1771 (0.1924) time: 2.8856 data: 0.0072 max mem: 33300 Epoch: [5] [ 50/4276] eta: 3:28:01 lr: 4.4324519194331805e-05 loss: 0.1763 (0.1893) time: 2.8872 data: 0.0071 max mem: 33300 Epoch: [5] [ 60/4276] eta: 
3:26:41 lr: 4.4321852763796255e-05 loss: 0.1707 (0.1888) time: 2.8842 data: 0.0073 max mem: 33300 Epoch: [5] [ 70/4276] eta: 3:25:37 lr: 4.43191863154368e-05 loss: 0.1746 (0.1887) time: 2.8816 data: 0.0075 max mem: 33300 Epoch: [5] [ 80/4276] eta: 3:24:42 lr: 4.431651984925212e-05 loss: 0.1861 (0.1889) time: 2.8835 data: 0.0079 max mem: 33300 Epoch: [5] [ 90/4276] eta: 3:23:56 lr: 4.4313853365240906e-05 loss: 0.1821 (0.1872) time: 2.8871 data: 0.0078 max mem: 33300 Epoch: [5] [ 100/4276] eta: 3:23:29 lr: 4.431118686340185e-05 loss: 0.1737 (0.1892) time: 2.9097 data: 0.0077 max mem: 33300 Epoch: [5] [ 110/4276] eta: 3:23:02 lr: 4.4308520343733626e-05 loss: 0.1838 (0.1899) time: 2.9289 data: 0.0084 max mem: 33300 Epoch: [5] [ 120/4276] eta: 3:22:38 lr: 4.4305853806234945e-05 loss: 0.1867 (0.1905) time: 2.9338 data: 0.0086 max mem: 33300 Epoch: [5] [ 130/4276] eta: 3:22:13 lr: 4.4303187250904485e-05 loss: 0.1997 (0.1919) time: 2.9389 data: 0.0085 max mem: 33300 Epoch: [5] [ 140/4276] eta: 3:21:47 lr: 4.4300520677740926e-05 loss: 0.1979 (0.1916) time: 2.9389 data: 0.0080 max mem: 33300 Epoch: [5] [ 150/4276] eta: 3:21:20 lr: 4.429785408674297e-05 loss: 0.1865 (0.1914) time: 2.9382 data: 0.0076 max mem: 33300 Epoch: [5] [ 160/4276] eta: 3:20:54 lr: 4.42951874779093e-05 loss: 0.1896 (0.1917) time: 2.9389 data: 0.0078 max mem: 33300 Epoch: [5] [ 170/4276] eta: 3:20:24 lr: 4.42925208512386e-05 loss: 0.1956 (0.1921) time: 2.9329 data: 0.0080 max mem: 33300 Epoch: [5] [ 180/4276] eta: 3:19:54 lr: 4.428985420672957e-05 loss: 0.1935 (0.1923) time: 2.9243 data: 0.0079 max mem: 33300 Epoch: [5] [ 190/4276] eta: 3:19:27 lr: 4.4287187544380875e-05 loss: 0.1909 (0.1928) time: 2.9316 data: 0.0079 max mem: 33300 Epoch: [5] [ 200/4276] eta: 3:19:00 lr: 4.428452086419121e-05 loss: 0.1968 (0.1941) time: 2.9399 data: 0.0080 max mem: 33300 Epoch: [5] [ 210/4276] eta: 3:18:32 lr: 4.428185416615928e-05 loss: 0.1968 (0.1942) time: 2.9383 data: 0.0080 max mem: 33300 Epoch: [5] [ 220/4276] 
eta: 3:18:05 lr: 4.427918745028376e-05 loss: 0.1866 (0.1940) time: 2.9386 data: 0.0077 max mem: 33300 Epoch: [5] [ 230/4276] eta: 3:17:37 lr: 4.427652071656333e-05 loss: 0.1866 (0.1932) time: 2.9394 data: 0.0079 max mem: 33300 Epoch: [5] [ 240/4276] eta: 3:17:09 lr: 4.427385396499667e-05 loss: 0.1889 (0.1933) time: 2.9379 data: 0.0083 max mem: 33300 Epoch: [5] [ 250/4276] eta: 3:16:40 lr: 4.4271187195582484e-05 loss: 0.1988 (0.1942) time: 2.9376 data: 0.0081 max mem: 33300 Epoch: [5] [ 260/4276] eta: 3:16:10 lr: 4.4268520408319455e-05 loss: 0.2002 (0.1946) time: 2.9305 data: 0.0083 max mem: 33300 Epoch: [5] [ 270/4276] eta: 3:15:42 lr: 4.4265853603206264e-05 loss: 0.2018 (0.1948) time: 2.9313 data: 0.0087 max mem: 33300 Epoch: [5] [ 280/4276] eta: 3:15:15 lr: 4.42631867802416e-05 loss: 0.1985 (0.1953) time: 2.9451 data: 0.0088 max mem: 33300 Epoch: [5] [ 290/4276] eta: 3:14:46 lr: 4.4260519939424136e-05 loss: 0.1976 (0.1948) time: 2.9425 data: 0.0086 max mem: 33300 Epoch: [5] [ 300/4276] eta: 3:14:19 lr: 4.425785308075258e-05 loss: 0.1783 (0.1946) time: 2.9391 data: 0.0084 max mem: 33300 Epoch: [5] [ 310/4276] eta: 3:13:50 lr: 4.4255186204225604e-05 loss: 0.1809 (0.1944) time: 2.9406 data: 0.0082 max mem: 33300 Epoch: [5] [ 320/4276] eta: 3:13:22 lr: 4.425251930984188e-05 loss: 0.1918 (0.1951) time: 2.9428 data: 0.0082 max mem: 33300 Epoch: [5] [ 330/4276] eta: 3:12:49 lr: 4.424985239760012e-05 loss: 0.1960 (0.1949) time: 2.9242 data: 0.0088 max mem: 33300 Epoch: [5] [ 340/4276] eta: 3:12:15 lr: 4.424718546749899e-05 loss: 0.1847 (0.1945) time: 2.8935 data: 0.0092 max mem: 33300 Epoch: [5] [ 350/4276] eta: 3:11:40 lr: 4.4244518519537174e-05 loss: 0.1847 (0.1943) time: 2.8864 data: 0.0086 max mem: 33300 Epoch: [5] [ 360/4276] eta: 3:11:12 lr: 4.424185155371337e-05 loss: 0.2058 (0.1948) time: 2.9138 data: 0.0086 max mem: 33300 Epoch: [5] [ 370/4276] eta: 3:10:44 lr: 4.423918457002625e-05 loss: 0.1844 (0.1942) time: 2.9396 data: 0.0087 max mem: 33300 Epoch: [5] [ 
380/4276] eta: 3:10:16 lr: 4.42365175684745e-05 loss: 0.1779 (0.1942) time: 2.9420 data: 0.0082 max mem: 33300 Epoch: [5] [ 390/4276] eta: 3:09:49 lr: 4.42338505490568e-05 loss: 0.1879 (0.1944) time: 2.9475 data: 0.0081 max mem: 33300 Epoch: [5] [ 400/4276] eta: 3:09:21 lr: 4.423118351177184e-05 loss: 0.1918 (0.1945) time: 2.9463 data: 0.0082 max mem: 33300 Epoch: [5] [ 410/4276] eta: 3:08:52 lr: 4.422851645661831e-05 loss: 0.1918 (0.1944) time: 2.9396 data: 0.0082 max mem: 33300 Epoch: [5] [ 420/4276] eta: 3:08:22 lr: 4.4225849383594876e-05 loss: 0.1964 (0.1952) time: 2.9319 data: 0.0086 max mem: 33300 Epoch: [5] [ 430/4276] eta: 3:07:49 lr: 4.422318229270023e-05 loss: 0.1891 (0.1953) time: 2.9056 data: 0.0089 max mem: 33300 Epoch: [5] [ 440/4276] eta: 3:07:18 lr: 4.422051518393305e-05 loss: 0.1827 (0.1951) time: 2.8958 data: 0.0091 max mem: 33300 Epoch: [5] [ 450/4276] eta: 3:06:48 lr: 4.421784805729202e-05 loss: 0.1916 (0.1952) time: 2.9190 data: 0.0086 max mem: 33300 Epoch: [5] [ 460/4276] eta: 3:06:19 lr: 4.421518091277584e-05 loss: 0.1740 (0.1947) time: 2.9288 data: 0.0079 max mem: 33300 Epoch: [5] [ 470/4276] eta: 3:05:51 lr: 4.421251375038316e-05 loss: 0.1727 (0.1945) time: 2.9360 data: 0.0078 max mem: 33300 Epoch: [5] [ 480/4276] eta: 3:05:22 lr: 4.420984657011269e-05 loss: 0.1703 (0.1940) time: 2.9385 data: 0.0078 max mem: 33300 Epoch: [5] [ 490/4276] eta: 3:04:53 lr: 4.420717937196309e-05 loss: 0.1750 (0.1939) time: 2.9370 data: 0.0076 max mem: 33300 Epoch: [5] [ 500/4276] eta: 3:04:24 lr: 4.420451215593306e-05 loss: 0.1811 (0.1939) time: 2.9364 data: 0.0078 max mem: 33300 Epoch: [5] [ 510/4276] eta: 3:03:55 lr: 4.420184492202127e-05 loss: 0.1758 (0.1936) time: 2.9367 data: 0.0079 max mem: 33300 Epoch: [5] [ 520/4276] eta: 3:03:26 lr: 4.4199177670226406e-05 loss: 0.1731 (0.1936) time: 2.9365 data: 0.0077 max mem: 33300 Epoch: [5] [ 530/4276] eta: 3:02:57 lr: 4.419651040054714e-05 loss: 0.1839 (0.1935) time: 2.9323 data: 0.0075 max mem: 33300 Epoch: [5] [ 
540/4276] eta: 3:02:29 lr: 4.419384311298217e-05 loss: 0.1799 (0.1934) time: 2.9392 data: 0.0077 max mem: 33300 Epoch: [5] [ 550/4276] eta: 3:01:59 lr: 4.419117580753016e-05 loss: 0.1818 (0.1933) time: 2.9332 data: 0.0083 max mem: 33300 Epoch: [5] [ 560/4276] eta: 3:01:30 lr: 4.41885084841898e-05 loss: 0.1895 (0.1935) time: 2.9264 data: 0.0081 max mem: 33300 Epoch: [5] [ 570/4276] eta: 3:01:01 lr: 4.418584114295977e-05 loss: 0.1987 (0.1935) time: 2.9382 data: 0.0076 max mem: 33300 Epoch: [5] [ 580/4276] eta: 3:00:31 lr: 4.418317378383875e-05 loss: 0.1834 (0.1935) time: 2.9333 data: 0.0077 max mem: 33300 Epoch: [5] [ 590/4276] eta: 3:00:02 lr: 4.41805064068254e-05 loss: 0.1684 (0.1932) time: 2.9277 data: 0.0079 max mem: 33300 Epoch: [5] [ 600/4276] eta: 2:59:33 lr: 4.4177839011918434e-05 loss: 0.1764 (0.1931) time: 2.9332 data: 0.0082 max mem: 33300 Epoch: [5] [ 610/4276] eta: 2:59:04 lr: 4.417517159911651e-05 loss: 0.1866 (0.1929) time: 2.9348 data: 0.0084 max mem: 33300 Epoch: [5] [ 620/4276] eta: 2:58:35 lr: 4.4172504168418314e-05 loss: 0.1810 (0.1929) time: 2.9361 data: 0.0084 max mem: 33300 Epoch: [5] [ 630/4276] eta: 2:58:06 lr: 4.416983671982252e-05 loss: 0.1802 (0.1929) time: 2.9371 data: 0.0086 max mem: 33300 Epoch: [5] [ 640/4276] eta: 2:57:37 lr: 4.416716925332781e-05 loss: 0.1783 (0.1927) time: 2.9391 data: 0.0088 max mem: 33300 Epoch: [5] [ 650/4276] eta: 2:57:09 lr: 4.416450176893287e-05 loss: 0.1802 (0.1928) time: 2.9408 data: 0.0089 max mem: 33300 Epoch: [5] [ 660/4276] eta: 2:56:39 lr: 4.416183426663636e-05 loss: 0.2002 (0.1929) time: 2.9370 data: 0.0090 max mem: 33300 Epoch: [5] [ 670/4276] eta: 2:56:10 lr: 4.415916674643697e-05 loss: 0.1895 (0.1929) time: 2.9337 data: 0.0087 max mem: 33300 Epoch: [5] [ 680/4276] eta: 2:55:41 lr: 4.4156499208333385e-05 loss: 0.1784 (0.1926) time: 2.9336 data: 0.0083 max mem: 33300 Epoch: [5] [ 690/4276] eta: 2:55:12 lr: 4.415383165232428e-05 loss: 0.1784 (0.1925) time: 2.9327 data: 0.0079 max mem: 33300 Epoch: [5] 
[ 700/4276] eta: 2:54:43 lr: 4.415116407840832e-05 loss: 0.1968 (0.1927) time: 2.9422 data: 0.0079 max mem: 33300 Epoch: [5] [ 710/4276] eta: 2:54:14 lr: 4.41484964865842e-05 loss: 0.1985 (0.1927) time: 2.9426 data: 0.0081 max mem: 33300 Epoch: [5] [ 720/4276] eta: 2:53:45 lr: 4.4145828876850584e-05 loss: 0.1889 (0.1926) time: 2.9382 data: 0.0080 max mem: 33300 Epoch: [5] [ 730/4276] eta: 2:53:16 lr: 4.414316124920615e-05 loss: 0.1805 (0.1927) time: 2.9381 data: 0.0079 max mem: 33300 Epoch: [5] [ 740/4276] eta: 2:52:47 lr: 4.414049360364958e-05 loss: 0.1856 (0.1927) time: 2.9366 data: 0.0076 max mem: 33300 Epoch: [5] [ 750/4276] eta: 2:52:18 lr: 4.413782594017955e-05 loss: 0.1830 (0.1926) time: 2.9362 data: 0.0078 max mem: 33300 Epoch: [5] [ 760/4276] eta: 2:51:49 lr: 4.413515825879474e-05 loss: 0.1709 (0.1924) time: 2.9365 data: 0.0078 max mem: 33300 Epoch: [5] [ 770/4276] eta: 2:51:19 lr: 4.413249055949383e-05 loss: 0.1733 (0.1925) time: 2.9331 data: 0.0082 max mem: 33300 Epoch: [5] [ 780/4276] eta: 2:50:50 lr: 4.412982284227548e-05 loss: 0.1878 (0.1925) time: 2.9238 data: 0.0085 max mem: 33300 Epoch: [5] [ 790/4276] eta: 2:50:21 lr: 4.412715510713838e-05 loss: 0.1908 (0.1926) time: 2.9292 data: 0.0081 max mem: 33300 Epoch: [5] [ 800/4276] eta: 2:49:52 lr: 4.4124487354081204e-05 loss: 0.1908 (0.1926) time: 2.9387 data: 0.0078 max mem: 33300 Epoch: [5] [ 810/4276] eta: 2:49:22 lr: 4.412181958310262e-05 loss: 0.1816 (0.1926) time: 2.9360 data: 0.0077 max mem: 33300 Epoch: [5] [ 820/4276] eta: 2:48:53 lr: 4.4119151794201316e-05 loss: 0.1723 (0.1925) time: 2.9343 data: 0.0076 max mem: 33300 Epoch: [5] [ 830/4276] eta: 2:48:24 lr: 4.411648398737595e-05 loss: 0.1723 (0.1926) time: 2.9353 data: 0.0078 max mem: 33300 Epoch: [5] [ 840/4276] eta: 2:47:55 lr: 4.411381616262522e-05 loss: 0.1789 (0.1927) time: 2.9346 data: 0.0082 max mem: 33300 Epoch: [5] [ 850/4276] eta: 2:47:25 lr: 4.411114831994778e-05 loss: 0.1793 (0.1927) time: 2.9318 data: 0.0079 max mem: 33300 Epoch: 
[5] [ 860/4276] eta: 2:46:56 lr: 4.410848045934231e-05 loss: 0.1793 (0.1928) time: 2.9273 data: 0.0076 max mem: 33300 Epoch: [5] [ 870/4276] eta: 2:46:26 lr: 4.4105812580807504e-05 loss: 0.1833 (0.1928) time: 2.9250 data: 0.0080 max mem: 33300 Epoch: [5] [ 880/4276] eta: 2:45:56 lr: 4.4103144684342016e-05 loss: 0.1833 (0.1928) time: 2.9234 data: 0.0091 max mem: 33300 Epoch: [5] [ 890/4276] eta: 2:45:27 lr: 4.410047676994452e-05 loss: 0.1959 (0.1929) time: 2.9290 data: 0.0090 max mem: 33300 Epoch: [5] [ 900/4276] eta: 2:44:58 lr: 4.4097808837613695e-05 loss: 0.1990 (0.1929) time: 2.9331 data: 0.0082 max mem: 33300 Epoch: [5] [ 910/4276] eta: 2:44:29 lr: 4.409514088734822e-05 loss: 0.1932 (0.1929) time: 2.9343 data: 0.0082 max mem: 33300 Epoch: [5] [ 920/4276] eta: 2:43:59 lr: 4.409247291914676e-05 loss: 0.1910 (0.1929) time: 2.9365 data: 0.0091 max mem: 33300 Epoch: [5] [ 930/4276] eta: 2:43:29 lr: 4.4089804933008e-05 loss: 0.1859 (0.1928) time: 2.9172 data: 0.0094 max mem: 33300 Epoch: [5] [ 940/4276] eta: 2:42:58 lr: 4.40871369289306e-05 loss: 0.1717 (0.1926) time: 2.8959 data: 0.0090 max mem: 33300 Epoch: [5] [ 950/4276] eta: 2:42:28 lr: 4.408446890691324e-05 loss: 0.1827 (0.1927) time: 2.9030 data: 0.0090 max mem: 33300 Epoch: [5] [ 960/4276] eta: 2:41:59 lr: 4.4081800866954595e-05 loss: 0.1870 (0.1928) time: 2.9244 data: 0.0084 max mem: 33300 Epoch: [5] [ 970/4276] eta: 2:41:29 lr: 4.407913280905333e-05 loss: 0.1877 (0.1928) time: 2.9251 data: 0.0079 max mem: 33300 Epoch: [5] [ 980/4276] eta: 2:40:58 lr: 4.407646473320812e-05 loss: 0.1974 (0.1930) time: 2.8969 data: 0.0079 max mem: 33300 Epoch: [5] [ 990/4276] eta: 2:40:27 lr: 4.4073796639417644e-05 loss: 0.1929 (0.1929) time: 2.8812 data: 0.0077 max mem: 33300 Epoch: [5] [1000/4276] eta: 2:39:56 lr: 4.407112852768057e-05 loss: 0.1869 (0.1929) time: 2.8810 data: 0.0078 max mem: 33300 Epoch: [5] [1010/4276] eta: 2:39:25 lr: 4.406846039799558e-05 loss: 0.1831 (0.1929) time: 2.8790 data: 0.0077 max mem: 33300 
Epoch: [5] [1020/4276] eta: 2:38:55 lr: 4.406579225036132e-05 loss: 0.1831 (0.1928) time: 2.8789 data: 0.0079 max mem: 33300 Epoch: [5] [1030/4276] eta: 2:38:24 lr: 4.4063124084776486e-05 loss: 0.1930 (0.1929) time: 2.8785 data: 0.0081 max mem: 33300 Epoch: [5] [1040/4276] eta: 2:37:53 lr: 4.406045590123974e-05 loss: 0.1930 (0.1929) time: 2.8812 data: 0.0084 max mem: 33300 Epoch: [5] [1050/4276] eta: 2:37:22 lr: 4.405778769974975e-05 loss: 0.1848 (0.1929) time: 2.8827 data: 0.0089 max mem: 33300 Epoch: [5] [1060/4276] eta: 2:36:52 lr: 4.405511948030519e-05 loss: 0.1861 (0.1930) time: 2.8801 data: 0.0087 max mem: 33300 Epoch: [5] [1070/4276] eta: 2:36:21 lr: 4.405245124290474e-05 loss: 0.1933 (0.1929) time: 2.8777 data: 0.0083 max mem: 33300 Epoch: [5] [1080/4276] eta: 2:35:50 lr: 4.4049782987547064e-05 loss: 0.1798 (0.1928) time: 2.8750 data: 0.0081 max mem: 33300 Epoch: [5] [1090/4276] eta: 2:35:21 lr: 4.4047114714230824e-05 loss: 0.1798 (0.1927) time: 2.8992 data: 0.0088 max mem: 33300 Epoch: [5] [1100/4276] eta: 2:34:52 lr: 4.40444464229547e-05 loss: 0.1785 (0.1928) time: 2.9277 data: 0.0091 max mem: 33300 Epoch: [5] [1110/4276] eta: 2:34:23 lr: 4.404177811371737e-05 loss: 0.1785 (0.1927) time: 2.9327 data: 0.0088 max mem: 33300 Epoch: [5] [1120/4276] eta: 2:33:54 lr: 4.4039109786517486e-05 loss: 0.1882 (0.1929) time: 2.9327 data: 0.0088 max mem: 33300 Epoch: [5] [1130/4276] eta: 2:33:24 lr: 4.403644144135372e-05 loss: 0.1855 (0.1927) time: 2.9279 data: 0.0087 max mem: 33300 Epoch: [5] [1140/4276] eta: 2:32:55 lr: 4.403377307822476e-05 loss: 0.1814 (0.1925) time: 2.9207 data: 0.0090 max mem: 33300 Epoch: [5] [1150/4276] eta: 2:32:26 lr: 4.403110469712925e-05 loss: 0.1849 (0.1925) time: 2.9241 data: 0.0086 max mem: 33300 Epoch: [5] [1160/4276] eta: 2:31:56 lr: 4.4028436298065885e-05 loss: 0.1909 (0.1926) time: 2.9269 data: 0.0077 max mem: 33300 Epoch: [5] [1170/4276] eta: 2:31:27 lr: 4.4025767881033315e-05 loss: 0.1949 (0.1928) time: 2.9255 data: 0.0080 max mem: 
33300 Epoch: [5] [1180/4276] eta: 2:30:58 lr: 4.4023099446030216e-05 loss: 0.1943 (0.1926) time: 2.9284 data: 0.0078 max mem: 33300 Epoch: [5] [1190/4276] eta: 2:30:29 lr: 4.402043099305526e-05 loss: 0.1721 (0.1925) time: 2.9278 data: 0.0076 max mem: 33300 Epoch: [5] [1200/4276] eta: 2:29:59 lr: 4.40177625221071e-05 loss: 0.1721 (0.1925) time: 2.9276 data: 0.0076 max mem: 33300 Epoch: [5] [1210/4276] eta: 2:29:30 lr: 4.401509403318442e-05 loss: 0.1718 (0.1924) time: 2.9291 data: 0.0072 max mem: 33300 Epoch: [5] [1220/4276] eta: 2:29:01 lr: 4.4012425526285884e-05 loss: 0.1706 (0.1923) time: 2.9296 data: 0.0072 max mem: 33300 Epoch: [5] [1230/4276] eta: 2:28:32 lr: 4.400975700141016e-05 loss: 0.1807 (0.1923) time: 2.9300 data: 0.0080 max mem: 33300 Epoch: [5] [1240/4276] eta: 2:28:02 lr: 4.400708845855592e-05 loss: 0.1873 (0.1924) time: 2.9112 data: 0.0087 max mem: 33300 Epoch: [5] [1250/4276] eta: 2:27:32 lr: 4.400441989772182e-05 loss: 0.1873 (0.1923) time: 2.8882 data: 0.0087 max mem: 33300 Epoch: [5] [1260/4276] eta: 2:27:01 lr: 4.400175131890654e-05 loss: 0.1719 (0.1921) time: 2.8818 data: 0.0083 max mem: 33300 Epoch: [5] [1270/4276] eta: 2:26:31 lr: 4.3999082722108735e-05 loss: 0.1717 (0.1921) time: 2.8787 data: 0.0079 max mem: 33300 Epoch: [5] [1280/4276] eta: 2:26:01 lr: 4.399641410732708e-05 loss: 0.1865 (0.1921) time: 2.8772 data: 0.0080 max mem: 33300 Epoch: [5] [1290/4276] eta: 2:25:30 lr: 4.3993745474560246e-05 loss: 0.1820 (0.1921) time: 2.8789 data: 0.0081 max mem: 33300 Epoch: [5] [1300/4276] eta: 2:25:00 lr: 4.3991076823806884e-05 loss: 0.1710 (0.1920) time: 2.8790 data: 0.0083 max mem: 33300 Epoch: [5] [1310/4276] eta: 2:24:30 lr: 4.398840815506568e-05 loss: 0.1658 (0.1918) time: 2.8781 data: 0.0082 max mem: 33300 Epoch: [5] [1320/4276] eta: 2:24:00 lr: 4.398573946833528e-05 loss: 0.1777 (0.1919) time: 2.8803 data: 0.0080 max mem: 33300 Epoch: [5] [1330/4276] eta: 2:23:31 lr: 4.3983070763614365e-05 loss: 0.1758 (0.1917) time: 2.9197 data: 0.0089 max 
mem: 33300 Epoch: [5] [1340/4276] eta: 2:23:03 lr: 4.39804020409016e-05 loss: 0.1723 (0.1917) time: 2.9634 data: 0.0099 max mem: 33300 Epoch: [5] [1350/4276] eta: 2:22:34 lr: 4.397773330019564e-05 loss: 0.1838 (0.1916) time: 2.9460 data: 0.0095 max mem: 33300 Epoch: [5] [1360/4276] eta: 2:22:05 lr: 4.397506454149516e-05 loss: 0.1926 (0.1916) time: 2.9251 data: 0.0088 max mem: 33300 Epoch: [5] [1370/4276] eta: 2:21:35 lr: 4.397239576479882e-05 loss: 0.1879 (0.1915) time: 2.9297 data: 0.0087 max mem: 33300 Epoch: [5] [1380/4276] eta: 2:21:06 lr: 4.396972697010529e-05 loss: 0.1889 (0.1916) time: 2.9324 data: 0.0085 max mem: 33300 Epoch: [5] [1390/4276] eta: 2:20:37 lr: 4.396705815741323e-05 loss: 0.2014 (0.1916) time: 2.9318 data: 0.0087 max mem: 33300 Epoch: [5] [1400/4276] eta: 2:20:08 lr: 4.396438932672131e-05 loss: 0.1980 (0.1917) time: 2.9318 data: 0.0088 max mem: 33300 Epoch: [5] [1410/4276] eta: 2:19:38 lr: 4.3961720478028185e-05 loss: 0.1934 (0.1916) time: 2.9099 data: 0.0081 max mem: 33300 Epoch: [5] [1420/4276] eta: 2:19:08 lr: 4.3959051611332535e-05 loss: 0.1909 (0.1918) time: 2.8846 data: 0.0082 max mem: 33300 Epoch: [5] [1430/4276] eta: 2:18:38 lr: 4.395638272663301e-05 loss: 0.1829 (0.1917) time: 2.8790 data: 0.0081 max mem: 33300 Epoch: [5] [1440/4276] eta: 2:18:08 lr: 4.3953713823928275e-05 loss: 0.1879 (0.1919) time: 2.8788 data: 0.0075 max mem: 33300 Epoch: [5] [1450/4276] eta: 2:17:39 lr: 4.3951044903217e-05 loss: 0.1909 (0.1918) time: 2.9010 data: 0.0085 max mem: 33300 Epoch: [5] [1460/4276] eta: 2:17:09 lr: 4.394837596449784e-05 loss: 0.1835 (0.1919) time: 2.9193 data: 0.0101 max mem: 33300 Epoch: [5] [1470/4276] eta: 2:16:40 lr: 4.394570700776948e-05 loss: 0.1861 (0.1918) time: 2.9158 data: 0.0104 max mem: 33300 Epoch: [5] [1480/4276] eta: 2:16:10 lr: 4.3943038033030565e-05 loss: 0.1909 (0.1918) time: 2.9027 data: 0.0099 max mem: 33300 Epoch: [5] [1490/4276] eta: 2:15:41 lr: 4.394036904027975e-05 loss: 0.1833 (0.1918) time: 2.8976 data: 0.0094 
max mem: 33300 Epoch: [5] [1500/4276] eta: 2:15:11 lr: 4.393770002951572e-05 loss: 0.1794 (0.1917) time: 2.8920 data: 0.0090 max mem: 33300 Epoch: [5] [1510/4276] eta: 2:14:41 lr: 4.393503100073712e-05 loss: 0.1823 (0.1918) time: 2.8800 data: 0.0085 max mem: 33300 Epoch: [5] [1520/4276] eta: 2:14:11 lr: 4.393236195394262e-05 loss: 0.1913 (0.1918) time: 2.8796 data: 0.0079 max mem: 33300 Epoch: [5] [1530/4276] eta: 2:13:41 lr: 4.3929692889130876e-05 loss: 0.1970 (0.1918) time: 2.8795 data: 0.0079 max mem: 33300 Epoch: [5] [1540/4276] eta: 2:13:11 lr: 4.3927023806300553e-05 loss: 0.1867 (0.1918) time: 2.8811 data: 0.0085 max mem: 33300 Epoch: [5] [1550/4276] eta: 2:12:41 lr: 4.392435470545033e-05 loss: 0.1883 (0.1919) time: 2.8818 data: 0.0088 max mem: 33300 Epoch: [5] [1560/4276] eta: 2:12:11 lr: 4.3921685586578836e-05 loss: 0.1804 (0.1918) time: 2.8916 data: 0.0088 max mem: 33300 Epoch: [5] [1570/4276] eta: 2:11:42 lr: 4.391901644968476e-05 loss: 0.1713 (0.1917) time: 2.8921 data: 0.0083 max mem: 33300 Epoch: [5] [1580/4276] eta: 2:11:12 lr: 4.3916347294766754e-05 loss: 0.1635 (0.1916) time: 2.8805 data: 0.0079 max mem: 33300 Epoch: [5] [1590/4276] eta: 2:10:42 lr: 4.391367812182347e-05 loss: 0.1696 (0.1916) time: 2.8794 data: 0.0083 max mem: 33300 Epoch: [5] [1600/4276] eta: 2:10:12 lr: 4.391100893085358e-05 loss: 0.1789 (0.1916) time: 2.8927 data: 0.0084 max mem: 33300 Epoch: [5] [1610/4276] eta: 2:09:43 lr: 4.390833972185574e-05 loss: 0.1794 (0.1915) time: 2.9140 data: 0.0084 max mem: 33300 Epoch: [5] [1620/4276] eta: 2:09:14 lr: 4.390567049482862e-05 loss: 0.1772 (0.1914) time: 2.9343 data: 0.0087 max mem: 33300 Epoch: [5] [1630/4276] eta: 2:08:45 lr: 4.390300124977087e-05 loss: 0.1835 (0.1914) time: 2.9404 data: 0.0092 max mem: 33300 Epoch: [5] [1640/4276] eta: 2:08:16 lr: 4.390033198668115e-05 loss: 0.1812 (0.1915) time: 2.9343 data: 0.0092 max mem: 33300 Epoch: [5] [1650/4276] eta: 2:07:48 lr: 4.3897662705558124e-05 loss: 0.1724 (0.1915) time: 2.9372 data: 
0.0090 max mem: 33300 Epoch: [5] [1660/4276] eta: 2:07:18 lr: 4.389499340640045e-05 loss: 0.1855 (0.1914) time: 2.9135 data: 0.0089 max mem: 33300 Epoch: [5] [1670/4276] eta: 2:06:48 lr: 4.3892324089206785e-05 loss: 0.1855 (0.1915) time: 2.8842 data: 0.0081 max mem: 33300 Epoch: [5] [1680/4276] eta: 2:06:18 lr: 4.3889654753975795e-05 loss: 0.2070 (0.1916) time: 2.8819 data: 0.0081 max mem: 33300 Epoch: [5] [1690/4276] eta: 2:05:49 lr: 4.3886985400706135e-05 loss: 0.2054 (0.1916) time: 2.8830 data: 0.0080 max mem: 33300 Epoch: [5] [1700/4276] eta: 2:05:19 lr: 4.388431602939647e-05 loss: 0.1896 (0.1917) time: 2.8838 data: 0.0078 max mem: 33300 Epoch: [5] [1710/4276] eta: 2:04:49 lr: 4.3881646640045447e-05 loss: 0.2046 (0.1917) time: 2.8831 data: 0.0079 max mem: 33300 Epoch: [5] [1720/4276] eta: 2:04:20 lr: 4.387897723265173e-05 loss: 0.2065 (0.1918) time: 2.8925 data: 0.0083 max mem: 33300 Epoch: [5] [1730/4276] eta: 2:03:51 lr: 4.387630780721399e-05 loss: 0.1939 (0.1918) time: 2.9238 data: 0.0090 max mem: 33300 Epoch: [5] [1740/4276] eta: 2:03:22 lr: 4.387363836373086e-05 loss: 0.1844 (0.1918) time: 2.9418 data: 0.0093 max mem: 33300 Epoch: [5] [1750/4276] eta: 2:02:53 lr: 4.3870968902201016e-05 loss: 0.1894 (0.1918) time: 2.9393 data: 0.0094 max mem: 33300 Epoch: [5] [1760/4276] eta: 2:02:24 lr: 4.386829942262311e-05 loss: 0.1894 (0.1918) time: 2.9404 data: 0.0094 max mem: 33300 Epoch: [5] [1770/4276] eta: 2:01:55 lr: 4.38656299249958e-05 loss: 0.1924 (0.1918) time: 2.9405 data: 0.0094 max mem: 33300 Epoch: [5] [1780/4276] eta: 2:01:26 lr: 4.3862960409317756e-05 loss: 0.2066 (0.1918) time: 2.9376 data: 0.0096 max mem: 33300 Epoch: [5] [1790/4276] eta: 2:00:57 lr: 4.386029087558761e-05 loss: 0.1835 (0.1918) time: 2.9380 data: 0.0096 max mem: 33300 Epoch: [5] [1800/4276] eta: 2:00:29 lr: 4.3857621323804035e-05 loss: 0.1900 (0.1919) time: 2.9390 data: 0.0090 max mem: 33300 Epoch: [5] [1810/4276] eta: 2:00:00 lr: 4.38549517539657e-05 loss: 0.2116 (0.1920) time: 2.9394 
data: 0.0085 max mem: 33300 Epoch: [5] [1820/4276] eta: 1:59:31 lr: 4.3852282166071236e-05 loss: 0.1825 (0.1919) time: 2.9404 data: 0.0083 max mem: 33300 Epoch: [5] [1830/4276] eta: 1:59:02 lr: 4.3849612560119314e-05 loss: 0.1673 (0.1919) time: 2.9407 data: 0.0086 max mem: 33300 Epoch: [5] [1840/4276] eta: 1:58:33 lr: 4.384694293610859e-05 loss: 0.1696 (0.1918) time: 2.9398 data: 0.0086 max mem: 33300 Epoch: [5] [1850/4276] eta: 1:58:04 lr: 4.3844273294037705e-05 loss: 0.1974 (0.1919) time: 2.9385 data: 0.0085 max mem: 33300 Epoch: [5] [1860/4276] eta: 1:57:35 lr: 4.384160363390535e-05 loss: 0.1974 (0.1919) time: 2.9434 data: 0.0085 max mem: 33300 Epoch: [5] [1870/4276] eta: 1:57:06 lr: 4.383893395571014e-05 loss: 0.1984 (0.1920) time: 2.9429 data: 0.0083 max mem: 33300 Epoch: [5] [1880/4276] eta: 1:56:37 lr: 4.383626425945075e-05 loss: 0.1905 (0.1920) time: 2.9436 data: 0.0084 max mem: 33300 Epoch: [5] [1890/4276] eta: 1:56:08 lr: 4.383359454512585e-05 loss: 0.1830 (0.1919) time: 2.9400 data: 0.0087 max mem: 33300 Epoch: [5] [1900/4276] eta: 1:55:39 lr: 4.383092481273407e-05 loss: 0.1754 (0.1919) time: 2.9327 data: 0.0090 max mem: 33300 Epoch: [5] [1910/4276] eta: 1:55:10 lr: 4.382825506227407e-05 loss: 0.1765 (0.1919) time: 2.9332 data: 0.0083 max mem: 33300 Epoch: [5] [1920/4276] eta: 1:54:41 lr: 4.3825585293744516e-05 loss: 0.1865 (0.1918) time: 2.9360 data: 0.0076 max mem: 33300 Epoch: [5] [1930/4276] eta: 1:54:12 lr: 4.3822915507144055e-05 loss: 0.1865 (0.1918) time: 2.9391 data: 0.0079 max mem: 33300 Epoch: [5] [1940/4276] eta: 1:53:43 lr: 4.382024570247134e-05 loss: 0.1920 (0.1918) time: 2.9359 data: 0.0078 max mem: 33300 Epoch: [5] [1950/4276] eta: 1:53:14 lr: 4.3817575879725035e-05 loss: 0.1917 (0.1918) time: 2.9330 data: 0.0080 max mem: 33300 Epoch: [5] [1960/4276] eta: 1:52:45 lr: 4.381490603890378e-05 loss: 0.1694 (0.1918) time: 2.9384 data: 0.0082 max mem: 33300 Epoch: [5] [1970/4276] eta: 1:52:16 lr: 4.381223618000624e-05 loss: 0.1660 (0.1917) time: 
2.9391 data: 0.0081 max mem: 33300 Epoch: [5] [1980/4276] eta: 1:51:47 lr: 4.380956630303106e-05 loss: 0.1667 (0.1916) time: 2.9337 data: 0.0079 max mem: 33300 Epoch: [5] [1990/4276] eta: 1:51:18 lr: 4.380689640797689e-05 loss: 0.1956 (0.1917) time: 2.9342 data: 0.0077 max mem: 33300 Epoch: [5] [2000/4276] eta: 1:50:49 lr: 4.38042264948424e-05 loss: 0.1923 (0.1917) time: 2.9390 data: 0.0077 max mem: 33300 Epoch: [5] [2010/4276] eta: 1:50:20 lr: 4.380155656362624e-05 loss: 0.1858 (0.1916) time: 2.9359 data: 0.0083 max mem: 33300 Epoch: [5] [2020/4276] eta: 1:49:51 lr: 4.379888661432705e-05 loss: 0.1832 (0.1916) time: 2.9310 data: 0.0080 max mem: 33300 Epoch: [5] [2030/4276] eta: 1:49:22 lr: 4.3796216646943494e-05 loss: 0.1832 (0.1916) time: 2.9358 data: 0.0072 max mem: 33300 Epoch: [5] [2040/4276] eta: 1:48:53 lr: 4.379354666147422e-05 loss: 0.1745 (0.1915) time: 2.9367 data: 0.0075 max mem: 33300 Epoch: [5] [2050/4276] eta: 1:48:23 lr: 4.3790876657917875e-05 loss: 0.1780 (0.1915) time: 2.9340 data: 0.0075 max mem: 33300 Epoch: [5] [2060/4276] eta: 1:47:54 lr: 4.378820663627312e-05 loss: 0.1808 (0.1915) time: 2.9330 data: 0.0073 max mem: 33300 Epoch: [5] [2070/4276] eta: 1:47:25 lr: 4.37855365965386e-05 loss: 0.1826 (0.1914) time: 2.9332 data: 0.0077 max mem: 33300 Epoch: [5] [2080/4276] eta: 1:46:56 lr: 4.378286653871297e-05 loss: 0.1841 (0.1915) time: 2.9325 data: 0.0079 max mem: 33300 Epoch: [5] [2090/4276] eta: 1:46:27 lr: 4.378019646279489e-05 loss: 0.1923 (0.1914) time: 2.9351 data: 0.0078 max mem: 33300 Epoch: [5] [2100/4276] eta: 1:45:58 lr: 4.3777526368783004e-05 loss: 0.1823 (0.1914) time: 2.9380 data: 0.0078 max mem: 33300 Epoch: [5] [2110/4276] eta: 1:45:29 lr: 4.377485625667595e-05 loss: 0.1770 (0.1914) time: 2.9393 data: 0.0076 max mem: 33300 Epoch: [5] [2120/4276] eta: 1:45:00 lr: 4.37721861264724e-05 loss: 0.1583 (0.1912) time: 2.9376 data: 0.0076 max mem: 33300 Epoch: [5] [2130/4276] eta: 1:44:31 lr: 4.376951597817099e-05 loss: 0.1446 (0.1912) time: 
2.9221 data: 0.0078 max mem: 33300 Epoch: [5] [2140/4276] eta: 1:44:01 lr: 4.376684581177038e-05 loss: 0.1796 (0.1912) time: 2.9085 data: 0.0080 max mem: 33300 Epoch: [5] [2150/4276] eta: 1:43:32 lr: 4.376417562726922e-05 loss: 0.1814 (0.1911) time: 2.9147 data: 0.0080 max mem: 33300 Epoch: [5] [2160/4276] eta: 1:43:03 lr: 4.376150542466615e-05 loss: 0.1814 (0.1911) time: 2.9211 data: 0.0081 max mem: 33300 Epoch: [5] [2170/4276] eta: 1:42:33 lr: 4.375883520395984e-05 loss: 0.1908 (0.1912) time: 2.9195 data: 0.0080 max mem: 33300 Epoch: [5] [2180/4276] eta: 1:42:04 lr: 4.375616496514892e-05 loss: 0.1992 (0.1912) time: 2.9183 data: 0.0078 max mem: 33300 Epoch: [5] [2190/4276] eta: 1:41:35 lr: 4.3753494708232054e-05 loss: 0.1838 (0.1912) time: 2.9187 data: 0.0078 max mem: 33300 Epoch: [5] [2200/4276] eta: 1:41:06 lr: 4.375082443320788e-05 loss: 0.1843 (0.1912) time: 2.9200 data: 0.0078 max mem: 33300 Epoch: [5] [2210/4276] eta: 1:40:36 lr: 4.374815414007504e-05 loss: 0.1910 (0.1913) time: 2.9207 data: 0.0080 max mem: 33300 Epoch: [5] [2220/4276] eta: 1:40:07 lr: 4.3745483828832205e-05 loss: 0.1910 (0.1913) time: 2.9222 data: 0.0081 max mem: 33300 Epoch: [5] [2230/4276] eta: 1:39:38 lr: 4.3742813499478015e-05 loss: 0.1801 (0.1912) time: 2.9100 data: 0.0085 max mem: 33300 Epoch: [5] [2240/4276] eta: 1:39:08 lr: 4.374014315201112e-05 loss: 0.1769 (0.1911) time: 2.9003 data: 0.0090 max mem: 33300 Epoch: [5] [2250/4276] eta: 1:38:39 lr: 4.3737472786430155e-05 loss: 0.1861 (0.1911) time: 2.9110 data: 0.0089 max mem: 33300 Epoch: [5] [2260/4276] eta: 1:38:10 lr: 4.373480240273378e-05 loss: 0.1861 (0.1912) time: 2.9208 data: 0.0087 max mem: 33300 Epoch: [5] [2270/4276] eta: 1:37:41 lr: 4.373213200092065e-05 loss: 0.1911 (0.1912) time: 2.9225 data: 0.0082 max mem: 33300 Epoch: [5] [2280/4276] eta: 1:37:12 lr: 4.37294615809894e-05 loss: 0.1886 (0.1912) time: 2.9240 data: 0.0085 max mem: 33300 Epoch: [5] [2290/4276] eta: 1:36:42 lr: 4.3726791142938685e-05 loss: 0.1735 (0.1912) 
time: 2.9234 data: 0.0087 max mem: 33300 Epoch: [5] [2300/4276] eta: 1:36:13 lr: 4.372412068676714e-05 loss: 0.1721 (0.1911) time: 2.9184 data: 0.0086 max mem: 33300 Epoch: [5] [2310/4276] eta: 1:35:44 lr: 4.372145021247343e-05 loss: 0.1818 (0.1910) time: 2.9120 data: 0.0089 max mem: 33300 Epoch: [5] [2320/4276] eta: 1:35:15 lr: 4.3718779720056205e-05 loss: 0.1838 (0.1910) time: 2.9149 data: 0.0083 max mem: 33300 Epoch: [5] [2330/4276] eta: 1:34:45 lr: 4.371610920951409e-05 loss: 0.1755 (0.1910) time: 2.9310 data: 0.0078 max mem: 33300 Epoch: [5] [2340/4276] eta: 1:34:16 lr: 4.3713438680845745e-05 loss: 0.1857 (0.1909) time: 2.9428 data: 0.0080 max mem: 33300 Epoch: [5] [2350/4276] eta: 1:33:47 lr: 4.371076813404981e-05 loss: 0.1837 (0.1909) time: 2.9422 data: 0.0080 max mem: 33300 Epoch: [5] [2360/4276] eta: 1:33:18 lr: 4.370809756912494e-05 loss: 0.1770 (0.1908) time: 2.9383 data: 0.0082 max mem: 33300 Epoch: [5] [2370/4276] eta: 1:32:49 lr: 4.370542698606978e-05 loss: 0.1859 (0.1908) time: 2.9394 data: 0.0084 max mem: 33300 Epoch: [5] [2380/4276] eta: 1:32:20 lr: 4.370275638488297e-05 loss: 0.1972 (0.1909) time: 2.9404 data: 0.0082 max mem: 33300 Epoch: [5] [2390/4276] eta: 1:31:51 lr: 4.3700085765563154e-05 loss: 0.1676 (0.1908) time: 2.9395 data: 0.0080 max mem: 33300 Epoch: [5] [2400/4276] eta: 1:31:22 lr: 4.369741512810899e-05 loss: 0.1696 (0.1909) time: 2.9404 data: 0.0084 max mem: 33300 Epoch: [5] [2410/4276] eta: 1:30:53 lr: 4.369474447251912e-05 loss: 0.1858 (0.1908) time: 2.9372 data: 0.0087 max mem: 33300 Epoch: [5] [2420/4276] eta: 1:30:24 lr: 4.369207379879218e-05 loss: 0.1770 (0.1908) time: 2.9295 data: 0.0082 max mem: 33300 Epoch: [5] [2430/4276] eta: 1:29:54 lr: 4.368940310692682e-05 loss: 0.1848 (0.1908) time: 2.9242 data: 0.0080 max mem: 33300 Epoch: [5] [2440/4276] eta: 1:29:25 lr: 4.368673239692168e-05 loss: 0.1897 (0.1908) time: 2.9250 data: 0.0081 max mem: 33300 Epoch: [5] [2450/4276] eta: 1:28:56 lr: 4.3684061668775416e-05 loss: 0.1794 
(0.1908) time: 2.9156 data: 0.0087 max mem: 33300 Epoch: [5] [2460/4276] eta: 1:28:26 lr: 4.368139092248667e-05 loss: 0.1794 (0.1908) time: 2.9060 data: 0.0091 max mem: 33300 Epoch: [5] [2470/4276] eta: 1:27:57 lr: 4.3678720158054065e-05 loss: 0.1880 (0.1908) time: 2.8957 data: 0.0091 max mem: 33300 Epoch: [5] [2480/4276] eta: 1:27:28 lr: 4.367604937547628e-05 loss: 0.1880 (0.1908) time: 2.8900 data: 0.0090 max mem: 33300 Epoch: [5] [2490/4276] eta: 1:26:58 lr: 4.367337857475193e-05 loss: 0.1770 (0.1908) time: 2.9074 data: 0.0088 max mem: 33300 Epoch: [5] [2500/4276] eta: 1:26:29 lr: 4.3670707755879674e-05 loss: 0.1818 (0.1908) time: 2.9200 data: 0.0083 max mem: 33300 Epoch: [5] [2510/4276] eta: 1:26:00 lr: 4.366803691885816e-05 loss: 0.1854 (0.1908) time: 2.9227 data: 0.0078 max mem: 33300 Epoch: [5] [2520/4276] eta: 1:25:31 lr: 4.366536606368601e-05 loss: 0.1593 (0.1907) time: 2.9242 data: 0.0076 max mem: 33300 Epoch: [5] [2530/4276] eta: 1:25:01 lr: 4.3662695190361884e-05 loss: 0.1473 (0.1905) time: 2.9242 data: 0.0078 max mem: 33300 Epoch: [5] [2540/4276] eta: 1:24:32 lr: 4.366002429888442e-05 loss: 0.1585 (0.1905) time: 2.9227 data: 0.0078 max mem: 33300 Epoch: [5] [2550/4276] eta: 1:24:03 lr: 4.365735338925227e-05 loss: 0.1720 (0.1904) time: 2.9226 data: 0.0076 max mem: 33300 Epoch: [5] [2560/4276] eta: 1:23:34 lr: 4.3654682461464056e-05 loss: 0.1625 (0.1903) time: 2.9232 data: 0.0078 max mem: 33300 Epoch: [5] [2570/4276] eta: 1:23:05 lr: 4.365201151551843e-05 loss: 0.1592 (0.1903) time: 2.9237 data: 0.0082 max mem: 33300 Epoch: [5] [2580/4276] eta: 1:22:35 lr: 4.364934055141405e-05 loss: 0.1748 (0.1903) time: 2.9237 data: 0.0079 max mem: 33300 Epoch: [5] [2590/4276] eta: 1:22:06 lr: 4.364666956914953e-05 loss: 0.1748 (0.1902) time: 2.9230 data: 0.0078 max mem: 33300 Epoch: [5] [2600/4276] eta: 1:21:37 lr: 4.364399856872353e-05 loss: 0.1855 (0.1903) time: 2.9220 data: 0.0078 max mem: 33300 Epoch: [5] [2610/4276] eta: 1:21:08 lr: 4.3641327550134684e-05 loss: 
0.1808 (0.1902) time: 2.9207 data: 0.0080 max mem: 33300 Epoch: [5] [2620/4276] eta: 1:20:38 lr: 4.363865651338164e-05 loss: 0.1859 (0.1902) time: 2.9193 data: 0.0079 max mem: 33300 Epoch: [5] [2630/4276] eta: 1:20:09 lr: 4.363598545846304e-05 loss: 0.1760 (0.1901) time: 2.9234 data: 0.0075 max mem: 33300 Epoch: [5] [2640/4276] eta: 1:19:40 lr: 4.3633314385377514e-05 loss: 0.1574 (0.1901) time: 2.9258 data: 0.0074 max mem: 33300 Epoch: [5] [2650/4276] eta: 1:19:11 lr: 4.363064329412371e-05 loss: 0.1732 (0.1901) time: 2.9270 data: 0.0072 max mem: 33300 Epoch: [5] [2660/4276] eta: 1:18:42 lr: 4.362797218470027e-05 loss: 0.1811 (0.1901) time: 2.9303 data: 0.0077 max mem: 33300 Epoch: [5] [2670/4276] eta: 1:18:13 lr: 4.362530105710583e-05 loss: 0.1796 (0.1901) time: 2.9306 data: 0.0078 max mem: 33300 Epoch: [5] [2680/4276] eta: 1:17:43 lr: 4.3622629911339035e-05 loss: 0.1823 (0.1901) time: 2.9271 data: 0.0076 max mem: 33300 Epoch: [5] [2690/4276] eta: 1:17:14 lr: 4.361995874739852e-05 loss: 0.1868 (0.1901) time: 2.9158 data: 0.0076 max mem: 33300 Epoch: [5] [2700/4276] eta: 1:16:45 lr: 4.3617287565282926e-05 loss: 0.1790 (0.1900) time: 2.9083 data: 0.0075 max mem: 33300 Epoch: [5] [2710/4276] eta: 1:16:15 lr: 4.36146163649909e-05 loss: 0.1735 (0.1899) time: 2.9040 data: 0.0073 max mem: 33300 Epoch: [5] [2720/4276] eta: 1:15:46 lr: 4.361194514652107e-05 loss: 0.1843 (0.1899) time: 2.9122 data: 0.0072 max mem: 33300 Epoch: [5] [2730/4276] eta: 1:15:17 lr: 4.360927390987208e-05 loss: 0.1867 (0.1899) time: 2.9161 data: 0.0072 max mem: 33300 Epoch: [5] [2740/4276] eta: 1:14:48 lr: 4.360660265504257e-05 loss: 0.1896 (0.1900) time: 2.9089 data: 0.0075 max mem: 33300 Epoch: [5] [2750/4276] eta: 1:14:18 lr: 4.360393138203118e-05 loss: 0.1876 (0.1900) time: 2.9156 data: 0.0075 max mem: 33300 Epoch: [5] [2760/4276] eta: 1:13:49 lr: 4.360126009083654e-05 loss: 0.1833 (0.1900) time: 2.9252 data: 0.0072 max mem: 33300 Epoch: [5] [2770/4276] eta: 1:13:20 lr: 4.35985887814573e-05 
loss: 0.1697 (0.1900) time: 2.9239 data: 0.0072 max mem: 33300 Epoch: [5] [2780/4276] eta: 1:12:51 lr: 4.359591745389209e-05 loss: 0.1761 (0.1900) time: 2.9235 data: 0.0072 max mem: 33300 Epoch: [5] [2790/4276] eta: 1:12:22 lr: 4.3593246108139565e-05 loss: 0.1913 (0.1900) time: 2.9231 data: 0.0072 max mem: 33300 Epoch: [5] [2800/4276] eta: 1:11:52 lr: 4.3590574744198334e-05 loss: 0.1872 (0.1900) time: 2.9228 data: 0.0072 max mem: 33300 Epoch: [5] [2810/4276] eta: 1:11:23 lr: 4.358790336206706e-05 loss: 0.1638 (0.1898) time: 2.9238 data: 0.0072 max mem: 33300 Epoch: [5] [2820/4276] eta: 1:10:54 lr: 4.358523196174436e-05 loss: 0.1638 (0.1898) time: 2.9247 data: 0.0071 max mem: 33300 Epoch: [5] [2830/4276] eta: 1:10:25 lr: 4.358256054322888e-05 loss: 0.1644 (0.1897) time: 2.9260 data: 0.0072 max mem: 33300 Epoch: [5] [2840/4276] eta: 1:09:56 lr: 4.357988910651927e-05 loss: 0.1741 (0.1897) time: 2.9242 data: 0.0072 max mem: 33300 Epoch: [5] [2850/4276] eta: 1:09:26 lr: 4.3577217651614145e-05 loss: 0.1911 (0.1897) time: 2.9244 data: 0.0071 max mem: 33300 Epoch: [5] [2860/4276] eta: 1:08:57 lr: 4.357454617851216e-05 loss: 0.1911 (0.1897) time: 2.9208 data: 0.0072 max mem: 33300 Epoch: [5] [2870/4276] eta: 1:08:28 lr: 4.3571874687211936e-05 loss: 0.1846 (0.1898) time: 2.9054 data: 0.0085 max mem: 33300 Epoch: [5] [2880/4276] eta: 1:07:58 lr: 4.356920317771212e-05 loss: 0.1904 (0.1898) time: 2.8875 data: 0.0090 max mem: 33300 Epoch: [5] [2890/4276] eta: 1:07:29 lr: 4.356653165001135e-05 loss: 0.1895 (0.1898) time: 2.8970 data: 0.0087 max mem: 33300 Epoch: [5] [2900/4276] eta: 1:07:00 lr: 4.356386010410825e-05 loss: 0.1708 (0.1897) time: 2.9121 data: 0.0087 max mem: 33300 Epoch: [5] [2910/4276] eta: 1:06:30 lr: 4.356118854000146e-05 loss: 0.1626 (0.1897) time: 2.9088 data: 0.0083 max mem: 33300 Epoch: [5] [2920/4276] eta: 1:06:01 lr: 4.355851695768962e-05 loss: 0.1731 (0.1896) time: 2.8869 data: 0.0084 max mem: 33300 Epoch: [5] [2930/4276] eta: 1:05:32 lr: 
4.355584535717137e-05 loss: 0.1640 (0.1896) time: 2.8673 data: 0.0080 max mem: 33300 Epoch: [5] [2940/4276] eta: 1:05:02 lr: 4.355317373844533e-05 loss: 0.1610 (0.1895) time: 2.8656 data: 0.0075 max mem: 33300 Epoch: [5] [2950/4276] eta: 1:04:33 lr: 4.355050210151015e-05 loss: 0.1642 (0.1895) time: 2.8658 data: 0.0077 max mem: 33300 Epoch: [5] [2960/4276] eta: 1:04:03 lr: 4.354783044636445e-05 loss: 0.1811 (0.1895) time: 2.8656 data: 0.0077 max mem: 33300 Epoch: [5] [2970/4276] eta: 1:03:34 lr: 4.354515877300688e-05 loss: 0.1872 (0.1896) time: 2.8654 data: 0.0077 max mem: 33300 Epoch: [5] [2980/4276] eta: 1:03:05 lr: 4.3542487081436055e-05 loss: 0.1991 (0.1897) time: 2.9184 data: 0.0087 max mem: 33300 Epoch: [5] [2990/4276] eta: 1:02:36 lr: 4.3539815371650634e-05 loss: 0.1922 (0.1896) time: 2.9480 data: 0.0096 max mem: 33300 Epoch: [5] [3000/4276] eta: 1:02:06 lr: 4.353714364364923e-05 loss: 0.1743 (0.1896) time: 2.9232 data: 0.0095 max mem: 33300 Epoch: [5] [3010/4276] eta: 1:01:37 lr: 4.353447189743048e-05 loss: 0.1828 (0.1895) time: 2.9228 data: 0.0092 max mem: 33300 Epoch: [5] [3020/4276] eta: 1:01:08 lr: 4.353180013299303e-05 loss: 0.1859 (0.1895) time: 2.9242 data: 0.0092 max mem: 33300 Epoch: [5] [3030/4276] eta: 1:00:39 lr: 4.35291283503355e-05 loss: 0.1921 (0.1896) time: 2.9185 data: 0.0097 max mem: 33300 Epoch: [5] [3040/4276] eta: 1:00:09 lr: 4.3526456549456534e-05 loss: 0.2054 (0.1897) time: 2.9110 data: 0.0100 max mem: 33300 Epoch: [5] [3050/4276] eta: 0:59:40 lr: 4.352378473035476e-05 loss: 0.1971 (0.1896) time: 2.9185 data: 0.0091 max mem: 33300 Epoch: [5] [3060/4276] eta: 0:59:11 lr: 4.352111289302881e-05 loss: 0.1487 (0.1895) time: 2.9260 data: 0.0080 max mem: 33300 Epoch: [5] [3070/4276] eta: 0:58:42 lr: 4.351844103747731e-05 loss: 0.1677 (0.1895) time: 2.9232 data: 0.0076 max mem: 33300 Epoch: [5] [3080/4276] eta: 0:58:13 lr: 4.35157691636989e-05 loss: 0.1844 (0.1895) time: 2.9267 data: 0.0078 max mem: 33300 Epoch: [5] [3090/4276] eta: 0:57:43 
lr: 4.351309727169221e-05 loss: 0.1695 (0.1894) time: 2.9194 data: 0.0083 max mem: 33300 Epoch: [5] [3100/4276] eta: 0:57:14 lr: 4.351042536145588e-05 loss: 0.1757 (0.1894) time: 2.9223 data: 0.0083 max mem: 33300 Epoch: [5] [3110/4276] eta: 0:56:45 lr: 4.3507753432988526e-05 loss: 0.1731 (0.1893) time: 2.9281 data: 0.0080 max mem: 33300 Epoch: [5] [3120/4276] eta: 0:56:16 lr: 4.3505081486288804e-05 loss: 0.1589 (0.1893) time: 2.9198 data: 0.0078 max mem: 33300 Epoch: [5] [3130/4276] eta: 0:55:47 lr: 4.3502409521355315e-05 loss: 0.1743 (0.1892) time: 2.9218 data: 0.0075 max mem: 33300 Epoch: [5] [3140/4276] eta: 0:55:17 lr: 4.349973753818671e-05 loss: 0.1817 (0.1893) time: 2.9230 data: 0.0073 max mem: 33300 Epoch: [5] [3150/4276] eta: 0:54:48 lr: 4.349706553678161e-05 loss: 0.1952 (0.1893) time: 2.9224 data: 0.0072 max mem: 33300 Epoch: [5] [3160/4276] eta: 0:54:19 lr: 4.3494393517138655e-05 loss: 0.1809 (0.1893) time: 2.9289 data: 0.0072 max mem: 33300 Epoch: [5] [3170/4276] eta: 0:53:50 lr: 4.349172147925647e-05 loss: 0.1764 (0.1894) time: 2.9307 data: 0.0074 max mem: 33300 Epoch: [5] [3180/4276] eta: 0:53:21 lr: 4.3489049423133686e-05 loss: 0.1768 (0.1894) time: 2.9288 data: 0.0077 max mem: 33300 Epoch: [5] [3190/4276] eta: 0:52:52 lr: 4.3486377348768934e-05 loss: 0.1806 (0.1894) time: 2.9305 data: 0.0077 max mem: 33300 Epoch: [5] [3200/4276] eta: 0:52:22 lr: 4.3483705256160855e-05 loss: 0.1822 (0.1894) time: 2.9296 data: 0.0076 max mem: 33300 Epoch: [5] [3210/4276] eta: 0:51:53 lr: 4.348103314530805e-05 loss: 0.1892 (0.1894) time: 2.9309 data: 0.0078 max mem: 33300 Epoch: [5] [3220/4276] eta: 0:51:24 lr: 4.347836101620918e-05 loss: 0.1892 (0.1894) time: 2.9301 data: 0.0077 max mem: 33300 Epoch: [5] [3230/4276] eta: 0:50:55 lr: 4.347568886886285e-05 loss: 0.1886 (0.1894) time: 2.9282 data: 0.0075 max mem: 33300 Epoch: [5] [3240/4276] eta: 0:50:26 lr: 4.3473016703267704e-05 loss: 0.1931 (0.1894) time: 2.9279 data: 0.0075 max mem: 33300 Epoch: [5] [3250/4276] eta: 
0:49:56 lr: 4.347034451942237e-05 loss: 0.1839 (0.1894) time: 2.9282 data: 0.0080 max mem: 33300 Epoch: [5] [3260/4276] eta: 0:49:27 lr: 4.346767231732548e-05 loss: 0.1839 (0.1894) time: 2.9285 data: 0.0084 max mem: 33300 Epoch: [5] [3270/4276] eta: 0:48:58 lr: 4.346500009697564e-05 loss: 0.1906 (0.1894) time: 2.9275 data: 0.0080 max mem: 33300 Epoch: [5] [3280/4276] eta: 0:48:29 lr: 4.346232785837151e-05 loss: 0.1802 (0.1894) time: 2.9287 data: 0.0079 max mem: 33300 Epoch: [5] [3290/4276] eta: 0:48:00 lr: 4.3459655601511696e-05 loss: 0.1810 (0.1894) time: 2.9306 data: 0.0083 max mem: 33300 Epoch: [5] [3300/4276] eta: 0:47:31 lr: 4.345698332639483e-05 loss: 0.1814 (0.1894) time: 2.9305 data: 0.0079 max mem: 33300 Epoch: [5] [3310/4276] eta: 0:47:01 lr: 4.3454311033019545e-05 loss: 0.1897 (0.1895) time: 2.9271 data: 0.0080 max mem: 33300 Epoch: [5] [3320/4276] eta: 0:46:32 lr: 4.3451638721384466e-05 loss: 0.1903 (0.1895) time: 2.9144 data: 0.0087 max mem: 33300 Epoch: [5] [3330/4276] eta: 0:46:03 lr: 4.344896639148823e-05 loss: 0.1827 (0.1895) time: 2.8918 data: 0.0090 max mem: 33300 Epoch: [5] [3340/4276] eta: 0:45:33 lr: 4.344629404332944e-05 loss: 0.1884 (0.1895) time: 2.8797 data: 0.0088 max mem: 33300 Epoch: [5] [3350/4276] eta: 0:45:04 lr: 4.344362167690675e-05 loss: 0.1841 (0.1895) time: 2.8868 data: 0.0091 max mem: 33300 Epoch: [5] [3360/4276] eta: 0:44:35 lr: 4.344094929221878e-05 loss: 0.1716 (0.1895) time: 2.8809 data: 0.0089 max mem: 33300 Epoch: [5] [3370/4276] eta: 0:44:05 lr: 4.3438276889264135e-05 loss: 0.1891 (0.1895) time: 2.8619 data: 0.0080 max mem: 33300 Epoch: [5] [3380/4276] eta: 0:43:36 lr: 4.343560446804147e-05 loss: 0.1839 (0.1896) time: 2.8565 data: 0.0078 max mem: 33300 Epoch: [5] [3390/4276] eta: 0:43:07 lr: 4.34329320285494e-05 loss: 0.1941 (0.1896) time: 2.8572 data: 0.0078 max mem: 33300 Epoch: [5] [3400/4276] eta: 0:42:37 lr: 4.343025957078653e-05 loss: 0.1969 (0.1896) time: 2.8520 data: 0.0078 max mem: 33300 Epoch: [5] [3410/4276] 
eta: 0:42:08 lr: 4.342758709475154e-05 loss: 0.1896 (0.1896) time: 2.8619 data: 0.0080 max mem: 33300 Epoch: [5] [3420/4276] eta: 0:41:39 lr: 4.3424914600443e-05 loss: 0.1850 (0.1897) time: 2.8817 data: 0.0090 max mem: 33300 Epoch: [5] [3430/4276] eta: 0:41:09 lr: 4.3422242087859564e-05 loss: 0.1789 (0.1897) time: 2.8894 data: 0.0092 max mem: 33300 Epoch: [5] [3440/4276] eta: 0:40:40 lr: 4.341956955699985e-05 loss: 0.1711 (0.1896) time: 2.8816 data: 0.0087 max mem: 33300 Epoch: [5] [3450/4276] eta: 0:40:11 lr: 4.3416897007862485e-05 loss: 0.1729 (0.1897) time: 2.8891 data: 0.0089 max mem: 33300 Epoch: [5] [3460/4276] eta: 0:39:42 lr: 4.341422444044609e-05 loss: 0.1819 (0.1897) time: 2.9085 data: 0.0087 max mem: 33300 Epoch: [5] [3470/4276] eta: 0:39:12 lr: 4.341155185474929e-05 loss: 0.1728 (0.1896) time: 2.9039 data: 0.0082 max mem: 33300 Epoch: [5] [3480/4276] eta: 0:38:43 lr: 4.3408879250770724e-05 loss: 0.1813 (0.1896) time: 2.9016 data: 0.0079 max mem: 33300 Epoch: [5] [3490/4276] eta: 0:38:14 lr: 4.340620662850899e-05 loss: 0.1929 (0.1896) time: 2.8880 data: 0.0085 max mem: 33300 Epoch: [5] [3500/4276] eta: 0:37:45 lr: 4.340353398796274e-05 loss: 0.1929 (0.1896) time: 2.8877 data: 0.0087 max mem: 33300 Epoch: [5] [3510/4276] eta: 0:37:15 lr: 4.3400861329130576e-05 loss: 0.1790 (0.1896) time: 2.9048 data: 0.0082 max mem: 33300 Epoch: [5] [3520/4276] eta: 0:36:46 lr: 4.339818865201113e-05 loss: 0.1931 (0.1896) time: 2.9017 data: 0.0087 max mem: 33300 Epoch: [5] [3530/4276] eta: 0:36:17 lr: 4.3395515956603024e-05 loss: 0.1909 (0.1896) time: 2.9008 data: 0.0085 max mem: 33300 Epoch: [5] [3540/4276] eta: 0:35:48 lr: 4.339284324290489e-05 loss: 0.1839 (0.1896) time: 2.8920 data: 0.0087 max mem: 33300 Epoch: [5] [3550/4276] eta: 0:35:19 lr: 4.3390170510915333e-05 loss: 0.1904 (0.1896) time: 2.8844 data: 0.0092 max mem: 33300 Epoch: [5] [3560/4276] eta: 0:34:49 lr: 4.3387497760633e-05 loss: 0.1907 (0.1897) time: 2.8823 data: 0.0092 max mem: 33300 Epoch: [5] 
[3570/4276] eta: 0:34:20 lr: 4.338482499205649e-05 loss: 0.2004 (0.1897) time: 2.8860 data: 0.0089 max mem: 33300 Epoch: [5] [3580/4276] eta: 0:33:51 lr: 4.338215220518444e-05 loss: 0.1798 (0.1897) time: 2.8842 data: 0.0082 max mem: 33300 Epoch: [5] [3590/4276] eta: 0:33:21 lr: 4.337947940001547e-05 loss: 0.1725 (0.1897) time: 2.8725 data: 0.0081 max mem: 33300 Epoch: [5] [3600/4276] eta: 0:32:52 lr: 4.33768065765482e-05 loss: 0.1794 (0.1897) time: 2.8705 data: 0.0083 max mem: 33300 Epoch: [5] [3610/4276] eta: 0:32:23 lr: 4.337413373478125e-05 loss: 0.1838 (0.1897) time: 2.8669 data: 0.0079 max mem: 33300 Epoch: [5] [3620/4276] eta: 0:31:54 lr: 4.3371460874713246e-05 loss: 0.1842 (0.1897) time: 2.8667 data: 0.0075 max mem: 33300 Epoch: [5] [3630/4276] eta: 0:31:24 lr: 4.33687879963428e-05 loss: 0.1883 (0.1898) time: 2.8686 data: 0.0073 max mem: 33300 Epoch: [5] [3640/4276] eta: 0:30:55 lr: 4.336611509966856e-05 loss: 0.1928 (0.1898) time: 2.8762 data: 0.0073 max mem: 33300 Epoch: [5] [3650/4276] eta: 0:30:26 lr: 4.336344218468911e-05 loss: 0.1695 (0.1897) time: 2.8749 data: 0.0072 max mem: 33300 Epoch: [5] [3660/4276] eta: 0:29:57 lr: 4.3360769251403094e-05 loss: 0.1809 (0.1897) time: 2.8755 data: 0.0077 max mem: 33300 Epoch: [5] [3670/4276] eta: 0:29:27 lr: 4.335809629980913e-05 loss: 0.2027 (0.1897) time: 2.8925 data: 0.0087 max mem: 33300 Epoch: [5] [3680/4276] eta: 0:28:58 lr: 4.335542332990584e-05 loss: 0.2027 (0.1897) time: 2.8884 data: 0.0088 max mem: 33300 Epoch: [5] [3690/4276] eta: 0:28:29 lr: 4.335275034169183e-05 loss: 0.1848 (0.1897) time: 2.8779 data: 0.0090 max mem: 33300 Epoch: [5] [3700/4276] eta: 0:28:00 lr: 4.3350077335165734e-05 loss: 0.1885 (0.1897) time: 2.8905 data: 0.0092 max mem: 33300 Epoch: [5] [3710/4276] eta: 0:27:31 lr: 4.334740431032617e-05 loss: 0.1852 (0.1897) time: 2.9103 data: 0.0089 max mem: 33300 Epoch: [5] [3720/4276] eta: 0:27:01 lr: 4.334473126717176e-05 loss: 0.1662 (0.1897) time: 2.9205 data: 0.0092 max mem: 33300 Epoch: 
[5] [3730/4276] eta: 0:26:32 lr: 4.3342058205701116e-05 loss: 0.1772 (0.1897) time: 2.9216 data: 0.0094 max mem: 33300 Epoch: [5] [3740/4276] eta: 0:26:03 lr: 4.333938512591286e-05 loss: 0.1979 (0.1898) time: 2.9199 data: 0.0091 max mem: 33300 Epoch: [5] [3750/4276] eta: 0:25:34 lr: 4.333671202780561e-05 loss: 0.2207 (0.1898) time: 2.9211 data: 0.0092 max mem: 33300 Epoch: [5] [3760/4276] eta: 0:25:05 lr: 4.3334038911377994e-05 loss: 0.1921 (0.1898) time: 2.9242 data: 0.0094 max mem: 33300 Epoch: [5] [3770/4276] eta: 0:24:36 lr: 4.333136577662862e-05 loss: 0.1908 (0.1898) time: 2.9246 data: 0.0092 max mem: 33300 Epoch: [5] [3780/4276] eta: 0:24:06 lr: 4.3328692623556113e-05 loss: 0.1856 (0.1898) time: 2.9236 data: 0.0090 max mem: 33300 Epoch: [5] [3790/4276] eta: 0:23:37 lr: 4.3326019452159086e-05 loss: 0.1824 (0.1898) time: 2.9248 data: 0.0091 max mem: 33300 Epoch: [5] [3800/4276] eta: 0:23:08 lr: 4.332334626243616e-05 loss: 0.1926 (0.1898) time: 2.9248 data: 0.0094 max mem: 33300 Epoch: [5] [3810/4276] eta: 0:22:39 lr: 4.332067305438595e-05 loss: 0.1786 (0.1898) time: 2.9221 data: 0.0092 max mem: 33300 Epoch: [5] [3820/4276] eta: 0:22:10 lr: 4.331799982800709e-05 loss: 0.1643 (0.1898) time: 2.9216 data: 0.0089 max mem: 33300 Epoch: [5] [3830/4276] eta: 0:21:41 lr: 4.331532658329817e-05 loss: 0.1693 (0.1898) time: 2.9219 data: 0.0092 max mem: 33300 Epoch: [5] [3840/4276] eta: 0:21:11 lr: 4.331265332025783e-05 loss: 0.1829 (0.1898) time: 2.9101 data: 0.0094 max mem: 33300 Epoch: [5] [3850/4276] eta: 0:20:42 lr: 4.3309980038884676e-05 loss: 0.1699 (0.1897) time: 2.8873 data: 0.0092 max mem: 33300 Epoch: [5] [3860/4276] eta: 0:20:13 lr: 4.330730673917732e-05 loss: 0.1765 (0.1897) time: 2.8747 data: 0.0090 max mem: 33300 Epoch: [5] [3870/4276] eta: 0:19:44 lr: 4.33046334211344e-05 loss: 0.1859 (0.1897) time: 2.8829 data: 0.0088 max mem: 33300 Epoch: [5] [3880/4276] eta: 0:19:15 lr: 4.3301960084754513e-05 loss: 0.1866 (0.1897) time: 2.9086 data: 0.0087 max mem: 33300 
Epoch: [5] [3890/4276] eta: 0:18:45 lr: 4.329928673003627e-05 loss: 0.1762 (0.1897) time: 2.9246 data: 0.0084 max mem: 33300 Epoch: [5] [3900/4276] eta: 0:18:16 lr: 4.329661335697832e-05 loss: 0.1815 (0.1897) time: 2.9260 data: 0.0081 max mem: 33300 Epoch: [5] [3910/4276] eta: 0:17:47 lr: 4.329393996557924e-05 loss: 0.1790 (0.1897) time: 2.9240 data: 0.0081 max mem: 33300 Epoch: [5] [3920/4276] eta: 0:17:18 lr: 4.3291266555837674e-05 loss: 0.1812 (0.1897) time: 2.9258 data: 0.0082 max mem: 33300 Epoch: [5] [3930/4276] eta: 0:16:49 lr: 4.328859312775222e-05 loss: 0.1896 (0.1896) time: 2.9271 data: 0.0082 max mem: 33300 Epoch: [5] [3940/4276] eta: 0:16:20 lr: 4.32859196813215e-05 loss: 0.1881 (0.1896) time: 2.9301 data: 0.0082 max mem: 33300 Epoch: [5] [3950/4276] eta: 0:15:50 lr: 4.328324621654414e-05 loss: 0.1702 (0.1896) time: 2.9301 data: 0.0082 max mem: 33300 Epoch: [5] [3960/4276] eta: 0:15:21 lr: 4.328057273341873e-05 loss: 0.1746 (0.1896) time: 2.9223 data: 0.0083 max mem: 33300 Epoch: [5] [3970/4276] eta: 0:14:52 lr: 4.3277899231943904e-05 loss: 0.1937 (0.1896) time: 2.9220 data: 0.0085 max mem: 33300 Epoch: [5] [3980/4276] eta: 0:14:23 lr: 4.3275225712118275e-05 loss: 0.1820 (0.1896) time: 2.9247 data: 0.0086 max mem: 33300 Epoch: [5] [3990/4276] eta: 0:13:54 lr: 4.327255217394045e-05 loss: 0.1820 (0.1896) time: 2.9112 data: 0.0085 max mem: 33300 Epoch: [5] [4000/4276] eta: 0:13:25 lr: 4.326987861740905e-05 loss: 0.1672 (0.1896) time: 2.8903 data: 0.0081 max mem: 33300 Epoch: [5] [4010/4276] eta: 0:12:55 lr: 4.326720504252268e-05 loss: 0.1590 (0.1896) time: 2.8846 data: 0.0085 max mem: 33300 Epoch: [5] [4020/4276] eta: 0:12:26 lr: 4.326453144927996e-05 loss: 0.1682 (0.1896) time: 2.8995 data: 0.0089 max mem: 33300 Epoch: [5] [4030/4276] eta: 0:11:57 lr: 4.326185783767952e-05 loss: 0.1868 (0.1896) time: 2.9160 data: 0.0083 max mem: 33300 Epoch: [5] [4040/4276] eta: 0:11:28 lr: 4.325918420771993e-05 loss: 0.1945 (0.1896) time: 2.9182 data: 0.0081 max mem: 
33300 Epoch: [5] [4050/4276] eta: 0:10:59 lr: 4.325651055939985e-05 loss: 0.1845 (0.1896) time: 2.9178 data: 0.0081 max mem: 33300 Epoch: [5] [4060/4276] eta: 0:10:30 lr: 4.325383689271787e-05 loss: 0.1795 (0.1896) time: 2.9026 data: 0.0085 max mem: 33300 Epoch: [5] [4070/4276] eta: 0:10:00 lr: 4.325116320767259e-05 loss: 0.1889 (0.1896) time: 2.8765 data: 0.0087 max mem: 33300 Epoch: [5] [4080/4276] eta: 0:09:31 lr: 4.3248489504262655e-05 loss: 0.1887 (0.1896) time: 2.8717 data: 0.0088 max mem: 33300 Epoch: [5] [4090/4276] eta: 0:09:02 lr: 4.324581578248665e-05 loss: 0.1864 (0.1896) time: 2.8709 data: 0.0086 max mem: 33300 Epoch: [5] [4100/4276] eta: 0:08:33 lr: 4.324314204234321e-05 loss: 0.1864 (0.1896) time: 2.8861 data: 0.0079 max mem: 33300 Epoch: [5] [4110/4276] eta: 0:08:04 lr: 4.324046828383093e-05 loss: 0.1873 (0.1896) time: 2.8939 data: 0.0086 max mem: 33300 Epoch: [5] [4120/4276] eta: 0:07:34 lr: 4.323779450694842e-05 loss: 0.1875 (0.1897) time: 2.8789 data: 0.0091 max mem: 33300 Epoch: [5] [4130/4276] eta: 0:07:05 lr: 4.32351207116943e-05 loss: 0.1875 (0.1896) time: 2.8769 data: 0.0091 max mem: 33300 Epoch: [5] [4140/4276] eta: 0:06:36 lr: 4.323244689806718e-05 loss: 0.1798 (0.1896) time: 2.8811 data: 0.0088 max mem: 33300 Epoch: [5] [4150/4276] eta: 0:06:07 lr: 4.322977306606567e-05 loss: 0.1672 (0.1896) time: 2.9096 data: 0.0091 max mem: 33300 Epoch: [5] [4160/4276] eta: 0:05:38 lr: 4.322709921568838e-05 loss: 0.1777 (0.1896) time: 2.9200 data: 0.0099 max mem: 33300 Epoch: [5] [4170/4276] eta: 0:05:09 lr: 4.322442534693392e-05 loss: 0.1947 (0.1896) time: 2.9094 data: 0.0096 max mem: 33300 Epoch: [5] [4180/4276] eta: 0:04:39 lr: 4.322175145980091e-05 loss: 0.1801 (0.1896) time: 2.9184 data: 0.0089 max mem: 33300 Epoch: [5] [4190/4276] eta: 0:04:10 lr: 4.321907755428795e-05 loss: 0.1744 (0.1896) time: 2.9221 data: 0.0080 max mem: 33300 Epoch: [5] [4200/4276] eta: 0:03:41 lr: 4.321640363039365e-05 loss: 0.1991 (0.1897) time: 2.9207 data: 0.0076 max mem: 
33300
Epoch: [5] [4210/4276] eta: 0:03:12 lr: 4.3213729688116626e-05 loss: 0.1991 (0.1897) time: 2.9188 data: 0.0076 max mem: 33300
Epoch: [5] [4220/4276] eta: 0:02:43 lr: 4.3211055727455484e-05 loss: 0.2087 (0.1898) time: 2.9173 data: 0.0076 max mem: 33300
Epoch: [5] [4230/4276] eta: 0:02:14 lr: 4.3208381748408835e-05 loss: 0.2202 (0.1898) time: 2.9189 data: 0.0077 max mem: 33300
Epoch: [5] [4240/4276] eta: 0:01:44 lr: 4.320570775097528e-05 loss: 0.2162 (0.1898) time: 2.9185 data: 0.0077 max mem: 33300
Epoch: [5] [4250/4276] eta: 0:01:15 lr: 4.3203033735153444e-05 loss: 0.1938 (0.1899) time: 2.9165 data: 0.0076 max mem: 33300
Epoch: [5] [4260/4276] eta: 0:00:46 lr: 4.320035970094193e-05 loss: 0.2018 (0.1899) time: 2.9134 data: 0.0076 max mem: 33300
Epoch: [5] [4270/4276] eta: 0:00:17 lr: 4.319768564833934e-05 loss: 0.1885 (0.1899) time: 2.9079 data: 0.0073 max mem: 33300
Epoch: [5] Total time: 3:27:50
Test: [ 0/21770] eta: 12:00:24 time: 1.9855 data: 1.9467 max mem: 33300
Test: [ 100/21770] eta: 0:20:53 time: 0.0388 data: 0.0009 max mem: 33300
Test: [ 200/21770] eta: 0:17:22 time: 0.0387 data: 0.0008 max mem: 33300
Test: [ 300/21770] eta: 0:16:03 time: 0.0379 data: 0.0008 max mem: 33300
Test: [ 400/21770] eta: 0:15:22 time: 0.0380 data: 0.0008 max mem: 33300
Test: [ 500/21770] eta: 0:14:57 time: 0.0384 data: 0.0009 max mem: 33300
Test: [ 600/21770] eta: 0:14:39 time: 0.0384 data: 0.0008 max mem: 33300
Test: [ 700/21770] eta: 0:14:25 time: 0.0380 data: 0.0008 max mem: 33300
Test: [ 800/21770] eta: 0:14:13 time: 0.0384 data: 0.0008 max mem: 33300
Test: [ 900/21770] eta: 0:14:03 time: 0.0380 data: 0.0008 max mem: 33300
Test: [ 1000/21770] eta: 0:13:54 time: 0.0383 data: 0.0008 max mem: 33300
Test: [ 1100/21770] eta: 0:13:47 time: 0.0381 data: 0.0009 max mem: 33300
Test: [ 1200/21770] eta: 0:13:40 time: 0.0385 data: 0.0008 max mem: 33300
Test: [ 1300/21770] eta: 0:13:34 time: 0.0384 data: 0.0008 max mem: 33300
Test: [ 1400/21770] eta: 0:13:27 time: 0.0381 data: 0.0008 
max mem: 33300 Test: [ 1500/21770] eta: 0:13:21 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 1600/21770] eta: 0:13:15 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 1700/21770] eta: 0:13:10 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 1800/21770] eta: 0:13:04 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 1900/21770] eta: 0:12:59 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 2000/21770] eta: 0:12:53 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 2100/21770] eta: 0:12:48 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 2200/21770] eta: 0:12:43 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 2300/21770] eta: 0:12:38 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 2400/21770] eta: 0:12:34 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 2500/21770] eta: 0:12:29 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 2600/21770] eta: 0:12:24 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 2700/21770] eta: 0:12:20 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 2800/21770] eta: 0:12:16 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 2900/21770] eta: 0:12:11 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 3000/21770] eta: 0:12:07 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 3100/21770] eta: 0:12:03 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 3200/21770] eta: 0:11:58 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 3300/21770] eta: 0:11:54 time: 0.0377 data: 0.0009 max mem: 33300 Test: [ 3400/21770] eta: 0:11:50 time: 0.0377 data: 0.0009 max mem: 33300 Test: [ 3500/21770] eta: 0:11:45 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 3600/21770] eta: 0:11:41 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 3700/21770] eta: 0:11:37 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 3800/21770] eta: 0:11:32 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 3900/21770] eta: 0:11:28 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 4000/21770] eta: 0:11:24 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 4100/21770] eta: 0:11:20 time: 0.0390 data: 0.0009 
max mem: 33300 Test: [ 4200/21770] eta: 0:11:17 time: 0.0397 data: 0.0009 max mem: 33300 Test: [ 4300/21770] eta: 0:11:13 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 4400/21770] eta: 0:11:10 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 4500/21770] eta: 0:11:06 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 4600/21770] eta: 0:11:02 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 4700/21770] eta: 0:10:59 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 4800/21770] eta: 0:10:55 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 4900/21770] eta: 0:10:51 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 5000/21770] eta: 0:10:47 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 5100/21770] eta: 0:10:43 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 5200/21770] eta: 0:10:40 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 5300/21770] eta: 0:10:36 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 5400/21770] eta: 0:10:32 time: 0.0394 data: 0.0009 max mem: 33300 Test: [ 5500/21770] eta: 0:10:28 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 5600/21770] eta: 0:10:25 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 5700/21770] eta: 0:10:21 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 5800/21770] eta: 0:10:17 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 5900/21770] eta: 0:10:13 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 6000/21770] eta: 0:10:10 time: 0.0397 data: 0.0008 max mem: 33300 Test: [ 6100/21770] eta: 0:10:06 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 6200/21770] eta: 0:10:02 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 6300/21770] eta: 0:09:58 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 6400/21770] eta: 0:09:55 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 6500/21770] eta: 0:09:51 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 6600/21770] eta: 0:09:47 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 6700/21770] eta: 0:09:43 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 6800/21770] eta: 0:09:39 time: 0.0384 data: 0.0009 
max mem: 33300 Test: [ 6900/21770] eta: 0:09:35 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 7000/21770] eta: 0:09:31 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 7100/21770] eta: 0:09:27 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 7200/21770] eta: 0:09:23 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 7300/21770] eta: 0:09:19 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 7400/21770] eta: 0:09:16 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 7500/21770] eta: 0:09:12 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 7600/21770] eta: 0:09:08 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 7700/21770] eta: 0:09:04 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 7800/21770] eta: 0:09:00 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 7900/21770] eta: 0:08:56 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 8000/21770] eta: 0:08:52 time: 0.0397 data: 0.0009 max mem: 33300 Test: [ 8100/21770] eta: 0:08:49 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 8200/21770] eta: 0:08:45 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 8300/21770] eta: 0:08:41 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 8400/21770] eta: 0:08:38 time: 0.0397 data: 0.0008 max mem: 33300 Test: [ 8500/21770] eta: 0:08:34 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 8600/21770] eta: 0:08:30 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 8700/21770] eta: 0:08:26 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 8800/21770] eta: 0:08:22 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 8900/21770] eta: 0:08:19 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 9000/21770] eta: 0:08:15 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 9100/21770] eta: 0:08:11 time: 0.0387 data: 0.0008 max mem: 33300 Test: [ 9200/21770] eta: 0:08:07 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 9300/21770] eta: 0:08:03 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 9400/21770] eta: 0:07:59 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 9500/21770] eta: 0:07:55 time: 0.0385 data: 0.0009 
max mem: 33300 Test: [ 9600/21770] eta: 0:07:51 time: 0.0403 data: 0.0008 max mem: 33300 Test: [ 9700/21770] eta: 0:07:48 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 9800/21770] eta: 0:07:44 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 9900/21770] eta: 0:07:40 time: 0.0386 data: 0.0009 max mem: 33300 Test: [10000/21770] eta: 0:07:36 time: 0.0387 data: 0.0009 max mem: 33300 Test: [10100/21770] eta: 0:07:32 time: 0.0391 data: 0.0009 max mem: 33300 Test: [10200/21770] eta: 0:07:28 time: 0.0387 data: 0.0009 max mem: 33300 Test: [10300/21770] eta: 0:07:24 time: 0.0389 data: 0.0009 max mem: 33300 Test: [10400/21770] eta: 0:07:20 time: 0.0388 data: 0.0009 max mem: 33300 Test: [10500/21770] eta: 0:07:17 time: 0.0378 data: 0.0009 max mem: 33300 Test: [10600/21770] eta: 0:07:13 time: 0.0395 data: 0.0008 max mem: 33300 Test: [10700/21770] eta: 0:07:09 time: 0.0391 data: 0.0009 max mem: 33300 Test: [10800/21770] eta: 0:07:05 time: 0.0385 data: 0.0009 max mem: 33300 Test: [10900/21770] eta: 0:07:01 time: 0.0386 data: 0.0009 max mem: 33300 Test: [11000/21770] eta: 0:06:57 time: 0.0391 data: 0.0008 max mem: 33300 Test: [11100/21770] eta: 0:06:54 time: 0.0384 data: 0.0009 max mem: 33300 Test: [11200/21770] eta: 0:06:50 time: 0.0402 data: 0.0008 max mem: 33300 Test: [11300/21770] eta: 0:06:46 time: 0.0385 data: 0.0009 max mem: 33300 Test: [11400/21770] eta: 0:06:42 time: 0.0388 data: 0.0008 max mem: 33300 Test: [11500/21770] eta: 0:06:38 time: 0.0389 data: 0.0009 max mem: 33300 Test: [11600/21770] eta: 0:06:34 time: 0.0389 data: 0.0009 max mem: 33300 Test: [11700/21770] eta: 0:06:30 time: 0.0387 data: 0.0009 max mem: 33300 Test: [11800/21770] eta: 0:06:26 time: 0.0390 data: 0.0008 max mem: 33300 Test: [11900/21770] eta: 0:06:23 time: 0.0385 data: 0.0009 max mem: 33300 Test: [12000/21770] eta: 0:06:19 time: 0.0390 data: 0.0009 max mem: 33300 Test: [12100/21770] eta: 0:06:15 time: 0.0387 data: 0.0008 max mem: 33300 Test: [12200/21770] eta: 0:06:11 time: 0.0389 data: 0.0009 
max mem: 33300 Test: [12300/21770] eta: 0:06:07 time: 0.0387 data: 0.0009 max mem: 33300 Test: [12400/21770] eta: 0:06:03 time: 0.0390 data: 0.0009 max mem: 33300 Test: [12500/21770] eta: 0:05:59 time: 0.0389 data: 0.0009 max mem: 33300 Test: [12600/21770] eta: 0:05:55 time: 0.0393 data: 0.0009 max mem: 33300 Test: [12700/21770] eta: 0:05:52 time: 0.0390 data: 0.0009 max mem: 33300 Test: [12800/21770] eta: 0:05:48 time: 0.0396 data: 0.0008 max mem: 33300 Test: [12900/21770] eta: 0:05:44 time: 0.0391 data: 0.0008 max mem: 33300 Test: [13000/21770] eta: 0:05:40 time: 0.0396 data: 0.0008 max mem: 33300 Test: [13100/21770] eta: 0:05:36 time: 0.0391 data: 0.0008 max mem: 33300 Test: [13200/21770] eta: 0:05:32 time: 0.0394 data: 0.0009 max mem: 33300 Test: [13300/21770] eta: 0:05:28 time: 0.0397 data: 0.0009 max mem: 33300 Test: [13400/21770] eta: 0:05:25 time: 0.0402 data: 0.0008 max mem: 33300 Test: [13500/21770] eta: 0:05:21 time: 0.0397 data: 0.0008 max mem: 33300 Test: [13600/21770] eta: 0:05:17 time: 0.0396 data: 0.0008 max mem: 33300 Test: [13700/21770] eta: 0:05:13 time: 0.0398 data: 0.0009 max mem: 33300 Test: [13800/21770] eta: 0:05:09 time: 0.0399 data: 0.0009 max mem: 33300 Test: [13900/21770] eta: 0:05:05 time: 0.0397 data: 0.0008 max mem: 33300 Test: [14000/21770] eta: 0:05:02 time: 0.0401 data: 0.0008 max mem: 33300 Test: [14100/21770] eta: 0:04:58 time: 0.0395 data: 0.0009 max mem: 33300 Test: [14200/21770] eta: 0:04:54 time: 0.0401 data: 0.0009 max mem: 33300 Test: [14300/21770] eta: 0:04:50 time: 0.0404 data: 0.0009 max mem: 33300 Test: [14400/21770] eta: 0:04:46 time: 0.0400 data: 0.0008 max mem: 33300 Test: [14500/21770] eta: 0:04:42 time: 0.0396 data: 0.0009 max mem: 33300 Test: [14600/21770] eta: 0:04:39 time: 0.0379 data: 0.0009 max mem: 33300 Test: [14700/21770] eta: 0:04:35 time: 0.0388 data: 0.0008 max mem: 33300 Test: [14800/21770] eta: 0:04:31 time: 0.0383 data: 0.0008 max mem: 33300 Test: [14900/21770] eta: 0:04:27 time: 0.0385 data: 0.0009 
max mem: 33300 Test: [15000/21770] eta: 0:04:23 time: 0.0384 data: 0.0009 max mem: 33300 Test: [15100/21770] eta: 0:04:19 time: 0.0385 data: 0.0008 max mem: 33300 Test: [15200/21770] eta: 0:04:15 time: 0.0382 data: 0.0008 max mem: 33300 Test: [15300/21770] eta: 0:04:11 time: 0.0384 data: 0.0009 max mem: 33300 Test: [15400/21770] eta: 0:04:07 time: 0.0384 data: 0.0009 max mem: 33300 Test: [15500/21770] eta: 0:04:03 time: 0.0385 data: 0.0009 max mem: 33300 Test: [15600/21770] eta: 0:03:59 time: 0.0402 data: 0.0009 max mem: 33300 Test: [15700/21770] eta: 0:03:56 time: 0.0399 data: 0.0008 max mem: 33300 Test: [15800/21770] eta: 0:03:52 time: 0.0399 data: 0.0009 max mem: 33300 Test: [15900/21770] eta: 0:03:48 time: 0.0407 data: 0.0009 max mem: 33300 Test: [16000/21770] eta: 0:03:44 time: 0.0405 data: 0.0009 max mem: 33300 Test: [16100/21770] eta: 0:03:40 time: 0.0401 data: 0.0009 max mem: 33300 Test: [16200/21770] eta: 0:03:36 time: 0.0401 data: 0.0008 max mem: 33300 Test: [16300/21770] eta: 0:03:33 time: 0.0401 data: 0.0009 max mem: 33300 Test: [16400/21770] eta: 0:03:29 time: 0.0401 data: 0.0009 max mem: 33300 Test: [16500/21770] eta: 0:03:25 time: 0.0399 data: 0.0008 max mem: 33300 Test: [16600/21770] eta: 0:03:21 time: 0.0401 data: 0.0008 max mem: 33300 Test: [16700/21770] eta: 0:03:17 time: 0.0399 data: 0.0009 max mem: 33300 Test: [16800/21770] eta: 0:03:13 time: 0.0399 data: 0.0008 max mem: 33300 Test: [16900/21770] eta: 0:03:09 time: 0.0401 data: 0.0009 max mem: 33300 Test: [17000/21770] eta: 0:03:05 time: 0.0401 data: 0.0008 max mem: 33300 Test: [17100/21770] eta: 0:03:02 time: 0.0397 data: 0.0009 max mem: 33300 Test: [17200/21770] eta: 0:02:58 time: 0.0395 data: 0.0009 max mem: 33300 Test: [17300/21770] eta: 0:02:54 time: 0.0392 data: 0.0008 max mem: 33300 Test: [17400/21770] eta: 0:02:50 time: 0.0393 data: 0.0009 max mem: 33300 Test: [17500/21770] eta: 0:02:46 time: 0.0390 data: 0.0009 max mem: 33300 Test: [17600/21770] eta: 0:02:42 time: 0.0393 data: 0.0009 
max mem: 33300 Test: [17700/21770] eta: 0:02:38 time: 0.0391 data: 0.0008 max mem: 33300 Test: [17800/21770] eta: 0:02:34 time: 0.0388 data: 0.0008 max mem: 33300 Test: [17900/21770] eta: 0:02:30 time: 0.0383 data: 0.0008 max mem: 33300 Test: [18000/21770] eta: 0:02:27 time: 0.0387 data: 0.0009 max mem: 33300 Test: [18100/21770] eta: 0:02:23 time: 0.0389 data: 0.0008 max mem: 33300 Test: [18200/21770] eta: 0:02:19 time: 0.0389 data: 0.0009 max mem: 33300 Test: [18300/21770] eta: 0:02:15 time: 0.0390 data: 0.0008 max mem: 33300 Test: [18400/21770] eta: 0:02:11 time: 0.0387 data: 0.0009 max mem: 33300 Test: [18500/21770] eta: 0:02:07 time: 0.0393 data: 0.0009 max mem: 33300 Test: [18600/21770] eta: 0:02:03 time: 0.0392 data: 0.0009 max mem: 33300 Test: [18700/21770] eta: 0:01:59 time: 0.0391 data: 0.0008 max mem: 33300 Test: [18800/21770] eta: 0:01:55 time: 0.0394 data: 0.0008 max mem: 33300 Test: [18900/21770] eta: 0:01:51 time: 0.0391 data: 0.0008 max mem: 33300 Test: [19000/21770] eta: 0:01:48 time: 0.0399 data: 0.0008 max mem: 33300 Test: [19100/21770] eta: 0:01:44 time: 0.0393 data: 0.0008 max mem: 33300 Test: [19200/21770] eta: 0:01:40 time: 0.0395 data: 0.0008 max mem: 33300 Test: [19300/21770] eta: 0:01:36 time: 0.0396 data: 0.0008 max mem: 33300 Test: [19400/21770] eta: 0:01:32 time: 0.0394 data: 0.0008 max mem: 33300 Test: [19500/21770] eta: 0:01:28 time: 0.0392 data: 0.0008 max mem: 33300 Test: [19600/21770] eta: 0:01:24 time: 0.0393 data: 0.0008 max mem: 33300 Test: [19700/21770] eta: 0:01:20 time: 0.0394 data: 0.0008 max mem: 33300 Test: [19800/21770] eta: 0:01:16 time: 0.0395 data: 0.0008 max mem: 33300 Test: [19900/21770] eta: 0:01:12 time: 0.0394 data: 0.0008 max mem: 33300 Test: [20000/21770] eta: 0:01:09 time: 0.0398 data: 0.0009 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0393 data: 0.0008 max mem: 33300 Test: [20200/21770] eta: 0:01:01 time: 0.0401 data: 0.0008 max mem: 33300 Test: [20300/21770] eta: 0:00:57 time: 0.0392 data: 0.0008 
max mem: 33300 Test: [20400/21770] eta: 0:00:53 time: 0.0397 data: 0.0008 max mem: 33300 Test: [20500/21770] eta: 0:00:49 time: 0.0392 data: 0.0008 max mem: 33300 Test: [20600/21770] eta: 0:00:45 time: 0.0400 data: 0.0008 max mem: 33300 Test: [20700/21770] eta: 0:00:41 time: 0.0393 data: 0.0008 max mem: 33300 Test: [20800/21770] eta: 0:00:37 time: 0.0400 data: 0.0008 max mem: 33300 Test: [20900/21770] eta: 0:00:33 time: 0.0401 data: 0.0008 max mem: 33300 Test: [21000/21770] eta: 0:00:30 time: 0.0400 data: 0.0008 max mem: 33300 Test: [21100/21770] eta: 0:00:26 time: 0.0400 data: 0.0008 max mem: 33300 Test: [21200/21770] eta: 0:00:22 time: 0.0400 data: 0.0008 max mem: 33300 Test: [21300/21770] eta: 0:00:18 time: 0.0401 data: 0.0008 max mem: 33300 Test: [21400/21770] eta: 0:00:14 time: 0.0401 data: 0.0008 max mem: 33300 Test: [21500/21770] eta: 0:00:10 time: 0.0400 data: 0.0008 max mem: 33300 Test: [21600/21770] eta: 0:00:06 time: 0.0400 data: 0.0008 max mem: 33300 Test: [21700/21770] eta: 0:00:02 time: 0.0399 data: 0.0008 max mem: 33300 Test: Total time: 0:14:11 Final results: Mean IoU is 0.00 precision@0.5 = 0.00 precision@0.6 = 0.00 precision@0.7 = 0.00 precision@0.8 = 0.00 precision@0.9 = 0.00 overall IoU = 0.00 mean IoU = 0.00 Mean accuracy for one-to-zero sample is 100.00 Average object IoU 0.0 Overall IoU 0.0 Epoch: [6] [ 0/4276] eta: 6:30:03 lr: 4.3196081207949495e-05 loss: 0.1499 (0.1499) time: 5.4732 data: 2.4258 max mem: 33300 Epoch: [6] [ 10/4276] eta: 3:42:03 lr: 4.3193407125918296e-05 loss: 0.1903 (0.1974) time: 3.1232 data: 0.2264 max mem: 33300 Epoch: [6] [ 20/4276] eta: 3:34:27 lr: 4.31907330254924e-05 loss: 0.1841 (0.1904) time: 2.9009 data: 0.0072 max mem: 33300 Epoch: [6] [ 30/4276] eta: 3:30:48 lr: 4.3188058906670436e-05 loss: 0.1748 (0.1913) time: 2.8997 data: 0.0081 max mem: 33300 Epoch: [6] [ 40/4276] eta: 3:28:29 lr: 4.3185384769450995e-05 loss: 0.1884 (0.1885) time: 2.8794 data: 0.0079 max mem: 33300 Epoch: [6] [ 50/4276] eta: 3:26:51 lr: 
4.3182710613832684e-05 loss: 0.1802 (0.1860) time: 2.8715 data: 0.0075 max mem: 33300 Epoch: [6] [ 60/4276] eta: 3:25:31 lr: 4.318003643981412e-05 loss: 0.1797 (0.1855) time: 2.8668 data: 0.0074 max mem: 33300 Epoch: [6] [ 70/4276] eta: 3:24:32 lr: 4.3177362247393896e-05 loss: 0.1762 (0.1849) time: 2.8690 data: 0.0079 max mem: 33300 Epoch: [6] [ 80/4276] eta: 3:23:36 lr: 4.317468803657063e-05 loss: 0.1950 (0.1858) time: 2.8709 data: 0.0092 max mem: 33300 Epoch: [6] [ 90/4276] eta: 3:22:49 lr: 4.317201380734292e-05 loss: 0.1595 (0.1833) time: 2.8701 data: 0.0094 max mem: 33300 Epoch: [6] [ 100/4276] eta: 3:22:13 lr: 4.316933955970939e-05 loss: 0.1623 (0.1867) time: 2.8807 data: 0.0092 max mem: 33300 Epoch: [6] [ 110/4276] eta: 3:21:34 lr: 4.316666529366863e-05 loss: 0.1885 (0.1882) time: 2.8843 data: 0.0091 max mem: 33300 Epoch: [6] [ 120/4276] eta: 3:20:56 lr: 4.316399100921926e-05 loss: 0.1885 (0.1881) time: 2.8780 data: 0.0086 max mem: 33300 Epoch: [6] [ 130/4276] eta: 3:20:21 lr: 4.316131670635987e-05 loss: 0.1906 (0.1889) time: 2.8794 data: 0.0084 max mem: 33300 Epoch: [6] [ 140/4276] eta: 3:19:45 lr: 4.315864238508907e-05 loss: 0.1909 (0.1886) time: 2.8802 data: 0.0090 max mem: 33300 Epoch: [6] [ 150/4276] eta: 3:19:19 lr: 4.3155968045405476e-05 loss: 0.1852 (0.1886) time: 2.8924 data: 0.0097 max mem: 33300 Epoch: [6] [ 160/4276] eta: 3:18:44 lr: 4.315329368730768e-05 loss: 0.1855 (0.1884) time: 2.8907 data: 0.0094 max mem: 33300 Epoch: [6] [ 170/4276] eta: 3:18:07 lr: 4.3150619310794296e-05 loss: 0.1909 (0.1892) time: 2.8697 data: 0.0082 max mem: 33300 Epoch: [6] [ 180/4276] eta: 3:17:32 lr: 4.314794491586392e-05 loss: 0.1947 (0.1892) time: 2.8659 data: 0.0085 max mem: 33300 Epoch: [6] [ 190/4276] eta: 3:16:58 lr: 4.314527050251518e-05 loss: 0.2027 (0.1894) time: 2.8678 data: 0.0089 max mem: 33300 Epoch: [6] [ 200/4276] eta: 3:16:23 lr: 4.314259607074665e-05 loss: 0.1992 (0.1902) time: 2.8679 data: 0.0087 max mem: 33300 Epoch: [6] [ 210/4276] eta: 3:15:50 lr: 
4.313992162055696e-05 loss: 0.1991 (0.1905) time: 2.8660 data: 0.0085 max mem: 33300 Epoch: [6] [ 220/4276] eta: 3:15:14 lr: 4.3137247151944686e-05 loss: 0.1902 (0.1902) time: 2.8594 data: 0.0081 max mem: 33300 Epoch: [6] [ 230/4276] eta: 3:14:38 lr: 4.313457266490846e-05 loss: 0.1810 (0.1893) time: 2.8515 data: 0.0083 max mem: 33300 Epoch: [6] [ 240/4276] eta: 3:14:03 lr: 4.3131898159446865e-05 loss: 0.1779 (0.1895) time: 2.8474 data: 0.0081 max mem: 33300 Epoch: [6] [ 250/4276] eta: 3:13:33 lr: 4.312922363555852e-05 loss: 0.1950 (0.1909) time: 2.8625 data: 0.0081 max mem: 33300 Epoch: [6] [ 260/4276] eta: 3:13:11 lr: 4.3126549093242026e-05 loss: 0.1949 (0.1908) time: 2.9036 data: 0.0086 max mem: 33300 Epoch: [6] [ 270/4276] eta: 3:12:46 lr: 4.3123874532495974e-05 loss: 0.1921 (0.1908) time: 2.9201 data: 0.0079 max mem: 33300 Epoch: [6] [ 280/4276] eta: 3:12:25 lr: 4.3121199953318974e-05 loss: 0.1975 (0.1909) time: 2.9292 data: 0.0075 max mem: 33300 Epoch: [6] [ 290/4276] eta: 3:12:03 lr: 4.311852535570964e-05 loss: 0.1975 (0.1904) time: 2.9407 data: 0.0076 max mem: 33300 Epoch: [6] [ 300/4276] eta: 3:11:39 lr: 4.311585073966656e-05 loss: 0.1796 (0.1900) time: 2.9353 data: 0.0084 max mem: 33300 Epoch: [6] [ 310/4276] eta: 3:11:16 lr: 4.311317610518834e-05 loss: 0.1697 (0.1896) time: 2.9339 data: 0.0082 max mem: 33300 Epoch: [6] [ 320/4276] eta: 3:10:52 lr: 4.3110501452273584e-05 loss: 0.1790 (0.1901) time: 2.9327 data: 0.0077 max mem: 33300 Epoch: [6] [ 330/4276] eta: 3:10:27 lr: 4.310782678092089e-05 loss: 0.1808 (0.1900) time: 2.9313 data: 0.0075 max mem: 33300 Epoch: [6] [ 340/4276] eta: 3:10:02 lr: 4.310515209112888e-05 loss: 0.1767 (0.1894) time: 2.9312 data: 0.0071 max mem: 33300 Epoch: [6] [ 350/4276] eta: 3:09:37 lr: 4.3102477382896114e-05 loss: 0.1701 (0.1893) time: 2.9319 data: 0.0074 max mem: 33300 Epoch: [6] [ 360/4276] eta: 3:09:11 lr: 4.309980265622123e-05 loss: 0.1872 (0.1897) time: 2.9304 data: 0.0078 max mem: 33300 Epoch: [6] [ 370/4276] eta: 
3:08:46 lr: 4.309712791110282e-05 loss: 0.1718 (0.1890) time: 2.9315 data: 0.0077 max mem: 33300 Epoch: [6] [ 380/4276] eta: 3:08:20 lr: 4.3094453147539475e-05 loss: 0.1580 (0.1888) time: 2.9314 data: 0.0075 max mem: 33300 Epoch: [6] [ 390/4276] eta: 3:07:54 lr: 4.3091778365529804e-05 loss: 0.1800 (0.1887) time: 2.9300 data: 0.0073 max mem: 33300 Epoch: [6] [ 400/4276] eta: 3:07:28 lr: 4.308910356507241e-05 loss: 0.1802 (0.1885) time: 2.9319 data: 0.0074 max mem: 33300 Epoch: [6] [ 410/4276] eta: 3:07:04 lr: 4.308642874616588e-05 loss: 0.1773 (0.1882) time: 2.9458 data: 0.0076 max mem: 33300 Epoch: [6] [ 420/4276] eta: 3:06:38 lr: 4.3083753908808836e-05 loss: 0.1773 (0.1885) time: 2.9449 data: 0.0076 max mem: 33300 Epoch: [6] [ 430/4276] eta: 3:06:11 lr: 4.308107905299986e-05 loss: 0.1800 (0.1885) time: 2.9317 data: 0.0076 max mem: 33300 Epoch: [6] [ 440/4276] eta: 3:05:44 lr: 4.307840417873757e-05 loss: 0.1800 (0.1886) time: 2.9309 data: 0.0076 max mem: 33300 Epoch: [6] [ 450/4276] eta: 3:05:16 lr: 4.307572928602054e-05 loss: 0.1731 (0.1887) time: 2.9216 data: 0.0075 max mem: 33300 Epoch: [6] [ 460/4276] eta: 3:04:49 lr: 4.3073054374847384e-05 loss: 0.1731 (0.1882) time: 2.9185 data: 0.0080 max mem: 33300 Epoch: [6] [ 470/4276] eta: 3:04:19 lr: 4.3070379445216695e-05 loss: 0.1719 (0.1879) time: 2.9101 data: 0.0084 max mem: 33300 Epoch: [6] [ 480/4276] eta: 3:03:48 lr: 4.3067704497127084e-05 loss: 0.1678 (0.1874) time: 2.8888 data: 0.0079 max mem: 33300 Epoch: [6] [ 490/4276] eta: 3:03:18 lr: 4.3065029530577136e-05 loss: 0.1591 (0.1869) time: 2.8893 data: 0.0083 max mem: 33300 Epoch: [6] [ 500/4276] eta: 3:02:51 lr: 4.306235454556546e-05 loss: 0.1632 (0.1869) time: 2.9141 data: 0.0085 max mem: 33300 Epoch: [6] [ 510/4276] eta: 3:02:24 lr: 4.305967954209065e-05 loss: 0.1685 (0.1866) time: 2.9307 data: 0.0083 max mem: 33300 Epoch: [6] [ 520/4276] eta: 3:01:56 lr: 4.305700452015131e-05 loss: 0.1685 (0.1866) time: 2.9300 data: 0.0084 max mem: 33300 Epoch: [6] [ 
530/4276] eta: 3:01:29 lr: 4.305432947974602e-05 loss: 0.1786 (0.1865) time: 2.9280 data: 0.0082 max mem: 33300 Epoch: [6] [ 540/4276] eta: 3:01:01 lr: 4.305165442087339e-05 loss: 0.1786 (0.1863) time: 2.9280 data: 0.0078 max mem: 33300 Epoch: [6] [ 550/4276] eta: 3:00:32 lr: 4.304897934353202e-05 loss: 0.1794 (0.1863) time: 2.9194 data: 0.0077 max mem: 33300 Epoch: [6] [ 560/4276] eta: 3:00:02 lr: 4.304630424772051e-05 loss: 0.1805 (0.1862) time: 2.9008 data: 0.0086 max mem: 33300 Epoch: [6] [ 570/4276] eta: 2:59:31 lr: 4.3043629133437445e-05 loss: 0.1777 (0.1862) time: 2.8853 data: 0.0088 max mem: 33300 Epoch: [6] [ 580/4276] eta: 2:59:00 lr: 4.304095400068143e-05 loss: 0.1783 (0.1860) time: 2.8776 data: 0.0081 max mem: 33300 Epoch: [6] [ 590/4276] eta: 2:58:29 lr: 4.303827884945106e-05 loss: 0.1746 (0.1856) time: 2.8741 data: 0.0078 max mem: 33300 Epoch: [6] [ 600/4276] eta: 2:58:00 lr: 4.303560367974494e-05 loss: 0.1710 (0.1855) time: 2.8848 data: 0.0077 max mem: 33300 Epoch: [6] [ 610/4276] eta: 2:57:32 lr: 4.3032928491561655e-05 loss: 0.1721 (0.1854) time: 2.9124 data: 0.0077 max mem: 33300 Epoch: [6] [ 620/4276] eta: 2:57:03 lr: 4.3030253284899794e-05 loss: 0.1807 (0.1855) time: 2.9200 data: 0.0081 max mem: 33300 Epoch: [6] [ 630/4276] eta: 2:56:35 lr: 4.302757805975797e-05 loss: 0.1809 (0.1854) time: 2.9210 data: 0.0083 max mem: 33300 Epoch: [6] [ 640/4276] eta: 2:56:08 lr: 4.302490281613477e-05 loss: 0.1757 (0.1853) time: 2.9314 data: 0.0083 max mem: 33300 Epoch: [6] [ 650/4276] eta: 2:55:40 lr: 4.30222275540288e-05 loss: 0.1747 (0.1852) time: 2.9316 data: 0.0082 max mem: 33300 Epoch: [6] [ 660/4276] eta: 2:55:12 lr: 4.301955227343863e-05 loss: 0.1776 (0.1853) time: 2.9287 data: 0.0076 max mem: 33300 Epoch: [6] [ 670/4276] eta: 2:54:45 lr: 4.301687697436288e-05 loss: 0.1776 (0.1852) time: 2.9305 data: 0.0074 max mem: 33300 Epoch: [6] [ 680/4276] eta: 2:54:17 lr: 4.3014201656800144e-05 loss: 0.1772 (0.1851) time: 2.9319 data: 0.0074 max mem: 33300 Epoch: 
[6] [ 690/4276] eta: 2:53:49 lr: 4.3011526320749e-05 loss: 0.1776 (0.1850) time: 2.9331 data: 0.0076 max mem: 33300 Epoch: [6] [ 700/4276] eta: 2:53:21 lr: 4.3008850966208054e-05 loss: 0.1822 (0.1849) time: 2.9324 data: 0.0076 max mem: 33300 Epoch: [6] [ 710/4276] eta: 2:52:53 lr: 4.300617559317589e-05 loss: 0.1839 (0.1849) time: 2.9288 data: 0.0076 max mem: 33300 Epoch: [6] [ 720/4276] eta: 2:52:25 lr: 4.300350020165112e-05 loss: 0.1801 (0.1848) time: 2.9271 data: 0.0074 max mem: 33300 Epoch: [6] [ 730/4276] eta: 2:51:56 lr: 4.300082479163232e-05 loss: 0.1686 (0.1849) time: 2.9262 data: 0.0074 max mem: 33300 Epoch: [6] [ 740/4276] eta: 2:51:28 lr: 4.2998149363118096e-05 loss: 0.1810 (0.1849) time: 2.9263 data: 0.0074 max mem: 33300 Epoch: [6] [ 750/4276] eta: 2:51:00 lr: 4.299547391610704e-05 loss: 0.1810 (0.1850) time: 2.9272 data: 0.0074 max mem: 33300 Epoch: [6] [ 760/4276] eta: 2:50:32 lr: 4.299279845059774e-05 loss: 0.1645 (0.1848) time: 2.9282 data: 0.0078 max mem: 33300 Epoch: [6] [ 770/4276] eta: 2:50:04 lr: 4.299012296658878e-05 loss: 0.1694 (0.1850) time: 2.9298 data: 0.0080 max mem: 33300 Epoch: [6] [ 780/4276] eta: 2:49:35 lr: 4.298744746407877e-05 loss: 0.1817 (0.1850) time: 2.9319 data: 0.0081 max mem: 33300 Epoch: [6] [ 790/4276] eta: 2:49:07 lr: 4.29847719430663e-05 loss: 0.1867 (0.1851) time: 2.9316 data: 0.0081 max mem: 33300 Epoch: [6] [ 800/4276] eta: 2:48:39 lr: 4.298209640354996e-05 loss: 0.1867 (0.1852) time: 2.9296 data: 0.0078 max mem: 33300 Epoch: [6] [ 810/4276] eta: 2:48:10 lr: 4.297942084552834e-05 loss: 0.1840 (0.1853) time: 2.9290 data: 0.0076 max mem: 33300 Epoch: [6] [ 820/4276] eta: 2:47:42 lr: 4.2976745269000027e-05 loss: 0.1840 (0.1851) time: 2.9287 data: 0.0080 max mem: 33300 Epoch: [6] [ 830/4276] eta: 2:47:14 lr: 4.2974069673963626e-05 loss: 0.1682 (0.1853) time: 2.9294 data: 0.0084 max mem: 33300 Epoch: [6] [ 840/4276] eta: 2:46:45 lr: 4.297139406041771e-05 loss: 0.1826 (0.1854) time: 2.9294 data: 0.0079 max mem: 33300 
Epoch: [6] [ 850/4276] eta: 2:46:17 lr: 4.2968718428360896e-05 loss: 0.1826 (0.1854) time: 2.9286 data: 0.0076 max mem: 33300 Epoch: [6] [ 860/4276] eta: 2:45:48 lr: 4.296604277779175e-05 loss: 0.1836 (0.1855) time: 2.9283 data: 0.0079 max mem: 33300 Epoch: [6] [ 870/4276] eta: 2:45:20 lr: 4.296336710870887e-05 loss: 0.1830 (0.1856) time: 2.9307 data: 0.0078 max mem: 33300 Epoch: [6] [ 880/4276] eta: 2:44:52 lr: 4.2960691421110865e-05 loss: 0.1880 (0.1858) time: 2.9327 data: 0.0076 max mem: 33300 Epoch: [6] [ 890/4276] eta: 2:44:24 lr: 4.2958015714996305e-05 loss: 0.2035 (0.1859) time: 2.9419 data: 0.0081 max mem: 33300 Epoch: [6] [ 900/4276] eta: 2:43:55 lr: 4.295533999036378e-05 loss: 0.1935 (0.1859) time: 2.9373 data: 0.0083 max mem: 33300 Epoch: [6] [ 910/4276] eta: 2:43:27 lr: 4.29526642472119e-05 loss: 0.1824 (0.1860) time: 2.9252 data: 0.0080 max mem: 33300 Epoch: [6] [ 920/4276] eta: 2:42:57 lr: 4.2949988485539236e-05 loss: 0.1893 (0.1861) time: 2.9197 data: 0.0083 max mem: 33300 Epoch: [6] [ 930/4276] eta: 2:42:27 lr: 4.294731270534438e-05 loss: 0.1887 (0.1861) time: 2.8984 data: 0.0088 max mem: 33300 Epoch: [6] [ 940/4276] eta: 2:41:58 lr: 4.294463690662592e-05 loss: 0.1704 (0.1859) time: 2.8994 data: 0.0091 max mem: 33300 Epoch: [6] [ 950/4276] eta: 2:41:30 lr: 4.294196108938246e-05 loss: 0.1743 (0.1860) time: 2.9226 data: 0.0086 max mem: 33300 Epoch: [6] [ 960/4276] eta: 2:41:02 lr: 4.293928525361258e-05 loss: 0.1902 (0.1861) time: 2.9383 data: 0.0079 max mem: 33300 Epoch: [6] [ 970/4276] eta: 2:40:33 lr: 4.293660939931486e-05 loss: 0.1902 (0.1861) time: 2.9381 data: 0.0079 max mem: 33300 Epoch: [6] [ 980/4276] eta: 2:40:05 lr: 4.293393352648789e-05 loss: 0.1910 (0.1863) time: 2.9321 data: 0.0080 max mem: 33300 Epoch: [6] [ 990/4276] eta: 2:39:36 lr: 4.293125763513029e-05 loss: 0.1840 (0.1862) time: 2.9315 data: 0.0080 max mem: 33300 Epoch: [6] [1000/4276] eta: 2:39:07 lr: 4.292858172524061e-05 loss: 0.1740 (0.1861) time: 2.9326 data: 0.0078 max mem: 
33300 Epoch: [6] [1010/4276] eta: 2:38:39 lr: 4.292590579681745e-05 loss: 0.1647 (0.1861) time: 2.9333 data: 0.0075 max mem: 33300 Epoch: [6] [1020/4276] eta: 2:38:11 lr: 4.29232298498594e-05 loss: 0.1749 (0.1860) time: 2.9378 data: 0.0077 max mem: 33300 Epoch: [6] [1030/4276] eta: 2:37:42 lr: 4.292055388436505e-05 loss: 0.1749 (0.1861) time: 2.9375 data: 0.0076 max mem: 33300 Epoch: [6] [1040/4276] eta: 2:37:14 lr: 4.2917877900332985e-05 loss: 0.1880 (0.1861) time: 2.9344 data: 0.0072 max mem: 33300 Epoch: [6] [1050/4276] eta: 2:36:44 lr: 4.29152018977618e-05 loss: 0.1880 (0.1861) time: 2.9158 data: 0.0077 max mem: 33300 Epoch: [6] [1060/4276] eta: 2:36:14 lr: 4.2912525876650065e-05 loss: 0.1834 (0.1862) time: 2.8865 data: 0.0083 max mem: 33300 Epoch: [6] [1070/4276] eta: 2:35:43 lr: 4.290984983699638e-05 loss: 0.1737 (0.1862) time: 2.8802 data: 0.0083 max mem: 33300 Epoch: [6] [1080/4276] eta: 2:35:13 lr: 4.290717377879932e-05 loss: 0.1737 (0.1861) time: 2.8807 data: 0.0084 max mem: 33300 Epoch: [6] [1090/4276] eta: 2:34:43 lr: 4.2904497702057486e-05 loss: 0.1823 (0.1860) time: 2.8808 data: 0.0087 max mem: 33300 Epoch: [6] [1100/4276] eta: 2:34:13 lr: 4.290182160676946e-05 loss: 0.1711 (0.1859) time: 2.8806 data: 0.0087 max mem: 33300 Epoch: [6] [1110/4276] eta: 2:33:43 lr: 4.2899145492933826e-05 loss: 0.1764 (0.1859) time: 2.8807 data: 0.0085 max mem: 33300 Epoch: [6] [1120/4276] eta: 2:33:13 lr: 4.289646936054916e-05 loss: 0.1759 (0.1859) time: 2.8844 data: 0.0083 max mem: 33300 Epoch: [6] [1130/4276] eta: 2:32:45 lr: 4.2893793209614064e-05 loss: 0.1731 (0.1857) time: 2.9160 data: 0.0085 max mem: 33300 Epoch: [6] [1140/4276] eta: 2:32:16 lr: 4.289111704012712e-05 loss: 0.1751 (0.1855) time: 2.9280 data: 0.0087 max mem: 33300 Epoch: [6] [1150/4276] eta: 2:31:47 lr: 4.2888440852086905e-05 loss: 0.1747 (0.1855) time: 2.9177 data: 0.0085 max mem: 33300 Epoch: [6] [1160/4276] eta: 2:31:18 lr: 4.288576464549201e-05 loss: 0.1719 (0.1854) time: 2.9295 data: 0.0082 max 
mem: 33300 Epoch: [6] [1170/4276] eta: 2:30:49 lr: 4.288308842034102e-05 loss: 0.1840 (0.1855) time: 2.9243 data: 0.0086 max mem: 33300 Epoch: [6] [1180/4276] eta: 2:30:19 lr: 4.288041217663251e-05 loss: 0.1858 (0.1855) time: 2.8973 data: 0.0091 max mem: 33300 Epoch: [6] [1190/4276] eta: 2:29:49 lr: 4.287773591436509e-05 loss: 0.1723 (0.1854) time: 2.8830 data: 0.0089 max mem: 33300 Epoch: [6] [1200/4276] eta: 2:29:19 lr: 4.287505963353731e-05 loss: 0.1593 (0.1853) time: 2.8835 data: 0.0085 max mem: 33300 Epoch: [6] [1210/4276] eta: 2:28:49 lr: 4.287238333414778e-05 loss: 0.1692 (0.1853) time: 2.8826 data: 0.0085 max mem: 33300 Epoch: [6] [1220/4276] eta: 2:28:20 lr: 4.286970701619507e-05 loss: 0.1692 (0.1852) time: 2.8932 data: 0.0088 max mem: 33300 Epoch: [6] [1230/4276] eta: 2:27:51 lr: 4.286703067967778e-05 loss: 0.1749 (0.1852) time: 2.9061 data: 0.0094 max mem: 33300 Epoch: [6] [1240/4276] eta: 2:27:22 lr: 4.286435432459447e-05 loss: 0.1765 (0.1852) time: 2.9162 data: 0.0090 max mem: 33300 Epoch: [6] [1250/4276] eta: 2:26:53 lr: 4.286167795094374e-05 loss: 0.1753 (0.1851) time: 2.9143 data: 0.0083 max mem: 33300 Epoch: [6] [1260/4276] eta: 2:26:24 lr: 4.2859001558724155e-05 loss: 0.1678 (0.1849) time: 2.9087 data: 0.0083 max mem: 33300 Epoch: [6] [1270/4276] eta: 2:25:55 lr: 4.2856325147934325e-05 loss: 0.1761 (0.1849) time: 2.9145 data: 0.0085 max mem: 33300 Epoch: [6] [1280/4276] eta: 2:25:25 lr: 4.2853648718572815e-05 loss: 0.1873 (0.1849) time: 2.9114 data: 0.0091 max mem: 33300 Epoch: [6] [1290/4276] eta: 2:24:55 lr: 4.2850972270638204e-05 loss: 0.1816 (0.1850) time: 2.8944 data: 0.0091 max mem: 33300 Epoch: [6] [1300/4276] eta: 2:24:26 lr: 4.284829580412909e-05 loss: 0.1712 (0.1850) time: 2.8813 data: 0.0087 max mem: 33300 Epoch: [6] [1310/4276] eta: 2:23:56 lr: 4.284561931904404e-05 loss: 0.1673 (0.1849) time: 2.8812 data: 0.0087 max mem: 33300 Epoch: [6] [1320/4276] eta: 2:23:26 lr: 4.284294281538164e-05 loss: 0.1816 (0.1849) time: 2.8830 data: 0.0085 
max mem: 33300 Epoch: [6] [1330/4276] eta: 2:22:56 lr: 4.2840266293140475e-05 loss: 0.1698 (0.1848) time: 2.8828 data: 0.0085 max mem: 33300 Epoch: [6] [1340/4276] eta: 2:22:27 lr: 4.283758975231913e-05 loss: 0.1615 (0.1846) time: 2.8829 data: 0.0087 max mem: 33300 Epoch: [6] [1350/4276] eta: 2:21:57 lr: 4.283491319291617e-05 loss: 0.1681 (0.1846) time: 2.8815 data: 0.0087 max mem: 33300 Epoch: [6] [1360/4276] eta: 2:21:27 lr: 4.283223661493019e-05 loss: 0.1794 (0.1847) time: 2.8797 data: 0.0085 max mem: 33300 Epoch: [6] [1370/4276] eta: 2:20:57 lr: 4.282956001835977e-05 loss: 0.1653 (0.1845) time: 2.8801 data: 0.0085 max mem: 33300 Epoch: [6] [1380/4276] eta: 2:20:29 lr: 4.282688340320348e-05 loss: 0.1748 (0.1846) time: 2.9078 data: 0.0088 max mem: 33300 Epoch: [6] [1390/4276] eta: 2:20:00 lr: 4.2824206769459914e-05 loss: 0.1947 (0.1846) time: 2.9374 data: 0.0098 max mem: 33300 Epoch: [6] [1400/4276] eta: 2:19:32 lr: 4.282153011712764e-05 loss: 0.1889 (0.1847) time: 2.9359 data: 0.0095 max mem: 33300 Epoch: [6] [1410/4276] eta: 2:19:03 lr: 4.281885344620524e-05 loss: 0.1876 (0.1847) time: 2.9356 data: 0.0082 max mem: 33300 Epoch: [6] [1420/4276] eta: 2:18:34 lr: 4.281617675669131e-05 loss: 0.1780 (0.1848) time: 2.9209 data: 0.0079 max mem: 33300 Epoch: [6] [1430/4276] eta: 2:18:05 lr: 4.2813500048584406e-05 loss: 0.1765 (0.1848) time: 2.9087 data: 0.0082 max mem: 33300 Epoch: [6] [1440/4276] eta: 2:17:35 lr: 4.281082332188312e-05 loss: 0.1858 (0.1849) time: 2.9032 data: 0.0094 max mem: 33300 Epoch: [6] [1450/4276] eta: 2:17:05 lr: 4.280814657658604e-05 loss: 0.1891 (0.1849) time: 2.8858 data: 0.0095 max mem: 33300 Epoch: [6] [1460/4276] eta: 2:16:36 lr: 4.2805469812691715e-05 loss: 0.1860 (0.1849) time: 2.8789 data: 0.0087 max mem: 33300 Epoch: [6] [1470/4276] eta: 2:16:06 lr: 4.2802793030198754e-05 loss: 0.1860 (0.1849) time: 2.8783 data: 0.0087 max mem: 33300 Epoch: [6] [1480/4276] eta: 2:15:36 lr: 4.2800116229105716e-05 loss: 0.1689 (0.1849) time: 2.8776 data: 
0.0087 max mem: 33300 Epoch: [6] [1490/4276] eta: 2:15:07 lr: 4.279743940941119e-05 loss: 0.1578 (0.1848) time: 2.8803 data: 0.0083 max mem: 33300 Epoch: [6] [1500/4276] eta: 2:14:37 lr: 4.2794762571113755e-05 loss: 0.1711 (0.1848) time: 2.8842 data: 0.0083 max mem: 33300 Epoch: [6] [1510/4276] eta: 2:14:07 lr: 4.279208571421198e-05 loss: 0.1714 (0.1848) time: 2.8833 data: 0.0087 max mem: 33300 Epoch: [6] [1520/4276] eta: 2:13:38 lr: 4.278940883870445e-05 loss: 0.1717 (0.1848) time: 2.8810 data: 0.0086 max mem: 33300 Epoch: [6] [1530/4276] eta: 2:13:09 lr: 4.278673194458974e-05 loss: 0.1747 (0.1848) time: 2.8970 data: 0.0086 max mem: 33300 Epoch: [6] [1540/4276] eta: 2:12:40 lr: 4.278405503186642e-05 loss: 0.1747 (0.1848) time: 2.9200 data: 0.0092 max mem: 33300 Epoch: [6] [1550/4276] eta: 2:12:11 lr: 4.278137810053308e-05 loss: 0.1832 (0.1848) time: 2.9312 data: 0.0090 max mem: 33300 Epoch: [6] [1560/4276] eta: 2:11:43 lr: 4.277870115058828e-05 loss: 0.1706 (0.1847) time: 2.9330 data: 0.0080 max mem: 33300 Epoch: [6] [1570/4276] eta: 2:11:14 lr: 4.2776024182030624e-05 loss: 0.1710 (0.1848) time: 2.9360 data: 0.0078 max mem: 33300 Epoch: [6] [1580/4276] eta: 2:10:45 lr: 4.277334719485866e-05 loss: 0.1780 (0.1847) time: 2.9379 data: 0.0078 max mem: 33300 Epoch: [6] [1590/4276] eta: 2:10:16 lr: 4.277067018907098e-05 loss: 0.1724 (0.1847) time: 2.9163 data: 0.0081 max mem: 33300 Epoch: [6] [1600/4276] eta: 2:09:47 lr: 4.276799316466615e-05 loss: 0.1724 (0.1846) time: 2.8913 data: 0.0089 max mem: 33300 Epoch: [6] [1610/4276] eta: 2:09:17 lr: 4.276531612164276e-05 loss: 0.1669 (0.1845) time: 2.8825 data: 0.0092 max mem: 33300 Epoch: [6] [1620/4276] eta: 2:08:47 lr: 4.2762639059999365e-05 loss: 0.1669 (0.1844) time: 2.8817 data: 0.0094 max mem: 33300 Epoch: [6] [1630/4276] eta: 2:08:18 lr: 4.275996197973456e-05 loss: 0.1736 (0.1845) time: 2.8955 data: 0.0094 max mem: 33300 Epoch: [6] [1640/4276] eta: 2:07:50 lr: 4.27572848808469e-05 loss: 0.1790 (0.1845) time: 2.9214 
data: 0.0091 max mem: 33300 Epoch: [6] [1650/4276] eta: 2:07:21 lr: 4.2754607763334984e-05 loss: 0.1790 (0.1846) time: 2.9344 data: 0.0089 max mem: 33300 Epoch: [6] [1660/4276] eta: 2:06:52 lr: 4.275193062719737e-05 loss: 0.1850 (0.1845) time: 2.9233 data: 0.0089 max mem: 33300 Epoch: [6] [1670/4276] eta: 2:06:23 lr: 4.274925347243264e-05 loss: 0.1771 (0.1845) time: 2.9119 data: 0.0089 max mem: 33300 Epoch: [6] [1680/4276] eta: 2:05:54 lr: 4.274657629903937e-05 loss: 0.1801 (0.1846) time: 2.9053 data: 0.0087 max mem: 33300 Epoch: [6] [1690/4276] eta: 2:05:24 lr: 4.2743899107016114e-05 loss: 0.1890 (0.1846) time: 2.9065 data: 0.0090 max mem: 33300 Epoch: [6] [1700/4276] eta: 2:04:57 lr: 4.2741221896361463e-05 loss: 0.1852 (0.1847) time: 2.9738 data: 0.0090 max mem: 33300 Epoch: [6] [1710/4276] eta: 2:04:31 lr: 4.2738544667074e-05 loss: 0.1923 (0.1848) time: 3.0595 data: 0.0086 max mem: 33300 Epoch: [6] [1720/4276] eta: 2:04:06 lr: 4.2735867419152284e-05 loss: 0.1923 (0.1849) time: 3.1329 data: 0.0084 max mem: 33300 Epoch: [6] [1730/4276] eta: 2:03:42 lr: 4.273319015259489e-05 loss: 0.1868 (0.1849) time: 3.2398 data: 0.0085 max mem: 33300 Epoch: [6] [1740/4276] eta: 2:03:16 lr: 4.273051286740038e-05 loss: 0.1835 (0.1849) time: 3.2229 data: 0.0082 max mem: 33300 Epoch: [6] [1750/4276] eta: 2:02:50 lr: 4.272783556356735e-05 loss: 0.1835 (0.1849) time: 3.1361 data: 0.0077 max mem: 33300 Epoch: [6] [1760/4276] eta: 2:02:27 lr: 4.272515824109437e-05 loss: 0.1825 (0.1849) time: 3.2269 data: 0.0079 max mem: 33300 Epoch: [6] [1770/4276] eta: 2:02:01 lr: 4.2722480899979996e-05 loss: 0.1825 (0.1849) time: 3.2568 data: 0.0078 max mem: 33300 Epoch: [6] [1780/4276] eta: 2:01:36 lr: 4.27198035402228e-05 loss: 0.1856 (0.1850) time: 3.1798 data: 0.0083 max mem: 33300 Epoch: [6] [1790/4276] eta: 2:01:12 lr: 4.271712616182137e-05 loss: 0.1787 (0.1850) time: 3.2359 data: 0.0090 max mem: 33300 Epoch: [6] [1800/4276] eta: 2:00:46 lr: 4.271444876477427e-05 loss: 0.1787 (0.1850) time: 
3.2338 data: 0.0089 max mem: 33300 Epoch: [6] [1810/4276] eta: 2:00:20 lr: 4.271177134908007e-05 loss: 0.1933 (0.1851) time: 3.1683 data: 0.0090 max mem: 33300 Epoch: [6] [1820/4276] eta: 1:59:54 lr: 4.270909391473734e-05 loss: 0.1848 (0.1850) time: 3.1896 data: 0.0092 max mem: 33300 Epoch: [6] [1830/4276] eta: 1:59:28 lr: 4.2706416461744655e-05 loss: 0.1690 (0.1850) time: 3.1838 data: 0.0093 max mem: 33300 Epoch: [6] [1840/4276] eta: 1:59:03 lr: 4.2703738990100586e-05 loss: 0.1653 (0.1850) time: 3.1948 data: 0.0090 max mem: 33300 Epoch: [6] [1850/4276] eta: 1:58:38 lr: 4.270106149980371e-05 loss: 0.1814 (0.1850) time: 3.2422 data: 0.0086 max mem: 33300 Epoch: [6] [1860/4276] eta: 1:58:11 lr: 4.269838399085257e-05 loss: 0.1820 (0.1850) time: 3.2024 data: 0.0087 max mem: 33300 Epoch: [6] [1870/4276] eta: 1:57:45 lr: 4.269570646324577e-05 loss: 0.1772 (0.1851) time: 3.1741 data: 0.0093 max mem: 33300 Epoch: [6] [1880/4276] eta: 1:57:20 lr: 4.2693028916981866e-05 loss: 0.1829 (0.1850) time: 3.2272 data: 0.0093 max mem: 33300 Epoch: [6] [1890/4276] eta: 1:56:54 lr: 4.269035135205943e-05 loss: 0.1756 (0.1850) time: 3.2370 data: 0.0095 max mem: 33300 Epoch: [6] [1900/4276] eta: 1:56:28 lr: 4.2687673768477025e-05 loss: 0.1734 (0.1850) time: 3.1937 data: 0.0089 max mem: 33300 Epoch: [6] [1910/4276] eta: 1:56:02 lr: 4.268499616623323e-05 loss: 0.1755 (0.1850) time: 3.2208 data: 0.0084 max mem: 33300 Epoch: [6] [1920/4276] eta: 1:55:37 lr: 4.268231854532661e-05 loss: 0.1697 (0.1849) time: 3.2733 data: 0.0092 max mem: 33300 Epoch: [6] [1930/4276] eta: 1:55:10 lr: 4.267964090575572e-05 loss: 0.1831 (0.1849) time: 3.2291 data: 0.0090 max mem: 33300 Epoch: [6] [1940/4276] eta: 1:54:44 lr: 4.267696324751915e-05 loss: 0.1849 (0.1850) time: 3.1939 data: 0.0085 max mem: 33300 Epoch: [6] [1950/4276] eta: 1:54:19 lr: 4.2674285570615466e-05 loss: 0.1784 (0.1850) time: 3.2495 data: 0.0089 max mem: 33300 Epoch: [6] [1960/4276] eta: 1:53:52 lr: 4.267160787504323e-05 loss: 0.1635 (0.1849) 
time: 3.2483 data: 0.0090 max mem: 33300 Epoch: [6] [1970/4276] eta: 1:53:26 lr: 4.266893016080101e-05 loss: 0.1614 (0.1848) time: 3.2056 data: 0.0087 max mem: 33300 Epoch: [6] [1980/4276] eta: 1:53:00 lr: 4.266625242788737e-05 loss: 0.1667 (0.1847) time: 3.2358 data: 0.0088 max mem: 33300 Epoch: [6] [1990/4276] eta: 1:52:34 lr: 4.2663574676300885e-05 loss: 0.1756 (0.1847) time: 3.2685 data: 0.0088 max mem: 33300 Epoch: [6] [2000/4276] eta: 1:52:07 lr: 4.266089690604013e-05 loss: 0.1875 (0.1848) time: 3.2282 data: 0.0085 max mem: 33300 Epoch: [6] [2010/4276] eta: 1:51:40 lr: 4.2658219117103656e-05 loss: 0.1802 (0.1847) time: 3.2046 data: 0.0082 max mem: 33300 Epoch: [6] [2020/4276] eta: 1:51:14 lr: 4.2655541309490034e-05 loss: 0.1792 (0.1847) time: 3.2264 data: 0.0089 max mem: 33300 Epoch: [6] [2030/4276] eta: 1:50:47 lr: 4.265286348319784e-05 loss: 0.1792 (0.1847) time: 3.2141 data: 0.0087 max mem: 33300 Epoch: [6] [2040/4276] eta: 1:50:20 lr: 4.265018563822563e-05 loss: 0.1726 (0.1846) time: 3.2091 data: 0.0083 max mem: 33300 Epoch: [6] [2050/4276] eta: 1:49:53 lr: 4.2647507774571976e-05 loss: 0.1756 (0.1846) time: 3.2187 data: 0.0090 max mem: 33300 Epoch: [6] [2060/4276] eta: 1:49:26 lr: 4.2644829892235445e-05 loss: 0.1803 (0.1846) time: 3.1897 data: 0.0088 max mem: 33300 Epoch: [6] [2070/4276] eta: 1:48:59 lr: 4.2642151991214607e-05 loss: 0.1720 (0.1845) time: 3.1921 data: 0.0085 max mem: 33300 Epoch: [6] [2080/4276] eta: 1:48:32 lr: 4.263947407150802e-05 loss: 0.1730 (0.1846) time: 3.2193 data: 0.0088 max mem: 33300 Epoch: [6] [2090/4276] eta: 1:48:05 lr: 4.263679613311425e-05 loss: 0.1874 (0.1846) time: 3.2061 data: 0.0088 max mem: 33300 Epoch: [6] [2100/4276] eta: 1:47:38 lr: 4.263411817603186e-05 loss: 0.1874 (0.1846) time: 3.2315 data: 0.0085 max mem: 33300 Epoch: [6] [2110/4276] eta: 1:47:11 lr: 4.263144020025943e-05 loss: 0.1720 (0.1846) time: 3.2422 data: 0.0083 max mem: 33300 Epoch: [6] [2120/4276] eta: 1:46:43 lr: 4.262876220579551e-05 loss: 0.1643 
(0.1844) time: 3.1888 data: 0.0082 max mem: 33300 Epoch: [6] [2130/4276] eta: 1:46:17 lr: 4.262608419263867e-05 loss: 0.1591 (0.1844) time: 3.2085 data: 0.0081 max mem: 33300 Epoch: [6] [2140/4276] eta: 1:45:49 lr: 4.262340616078747e-05 loss: 0.1812 (0.1844) time: 3.2288 data: 0.0085 max mem: 33300 Epoch: [6] [2150/4276] eta: 1:45:21 lr: 4.2620728110240484e-05 loss: 0.1799 (0.1843) time: 3.1529 data: 0.0084 max mem: 33300 Epoch: [6] [2160/4276] eta: 1:44:52 lr: 4.2618050040996274e-05 loss: 0.1699 (0.1843) time: 3.1156 data: 0.0078 max mem: 33300 Epoch: [6] [2170/4276] eta: 1:44:25 lr: 4.26153719530534e-05 loss: 0.1819 (0.1843) time: 3.1434 data: 0.0082 max mem: 33300 Epoch: [6] [2180/4276] eta: 1:43:56 lr: 4.261269384641042e-05 loss: 0.1829 (0.1844) time: 3.1543 data: 0.0083 max mem: 33300 Epoch: [6] [2190/4276] eta: 1:43:29 lr: 4.261001572106591e-05 loss: 0.1819 (0.1844) time: 3.1627 data: 0.0084 max mem: 33300 Epoch: [6] [2200/4276] eta: 1:43:01 lr: 4.260733757701843e-05 loss: 0.1857 (0.1844) time: 3.1946 data: 0.0085 max mem: 33300 Epoch: [6] [2210/4276] eta: 1:42:34 lr: 4.260465941426653e-05 loss: 0.1929 (0.1844) time: 3.2328 data: 0.0084 max mem: 33300 Epoch: [6] [2220/4276] eta: 1:42:07 lr: 4.26019812328088e-05 loss: 0.1934 (0.1844) time: 3.2483 data: 0.0084 max mem: 33300 Epoch: [6] [2230/4276] eta: 1:41:38 lr: 4.259930303264378e-05 loss: 0.1737 (0.1844) time: 3.2009 data: 0.0085 max mem: 33300 Epoch: [6] [2240/4276] eta: 1:41:11 lr: 4.2596624813770035e-05 loss: 0.1605 (0.1843) time: 3.1981 data: 0.0083 max mem: 33300 Epoch: [6] [2250/4276] eta: 1:40:43 lr: 4.259394657618613e-05 loss: 0.1605 (0.1842) time: 3.2050 data: 0.0083 max mem: 33300 Epoch: [6] [2260/4276] eta: 1:40:14 lr: 4.259126831989063e-05 loss: 0.1726 (0.1842) time: 3.1106 data: 0.0085 max mem: 33300 Epoch: [6] [2270/4276] eta: 1:39:45 lr: 4.25885900448821e-05 loss: 0.1792 (0.1843) time: 3.0885 data: 0.0085 max mem: 33300 Epoch: [6] [2280/4276] eta: 1:39:17 lr: 4.2585911751159096e-05 loss: 
0.1786 (0.1843) time: 3.1618 data: 0.0086 max mem: 33300 Epoch: [6] [2290/4276] eta: 1:38:48 lr: 4.2583233438720186e-05 loss: 0.1728 (0.1842) time: 3.1104 data: 0.0081 max mem: 33300 Epoch: [6] [2300/4276] eta: 1:38:18 lr: 4.258055510756392e-05 loss: 0.1661 (0.1841) time: 3.0514 data: 0.0078 max mem: 33300 Epoch: [6] [2310/4276] eta: 1:37:50 lr: 4.257787675768886e-05 loss: 0.1708 (0.1840) time: 3.0967 data: 0.0082 max mem: 33300 Epoch: [6] [2320/4276] eta: 1:37:21 lr: 4.2575198389093576e-05 loss: 0.1712 (0.1840) time: 3.0965 data: 0.0080 max mem: 33300 Epoch: [6] [2330/4276] eta: 1:36:51 lr: 4.257252000177662e-05 loss: 0.1712 (0.1840) time: 3.0628 data: 0.0079 max mem: 33300 Epoch: [6] [2340/4276] eta: 1:36:22 lr: 4.256984159573656e-05 loss: 0.1735 (0.1839) time: 3.0331 data: 0.0078 max mem: 33300 Epoch: [6] [2350/4276] eta: 1:35:52 lr: 4.256716317097196e-05 loss: 0.1644 (0.1839) time: 3.0249 data: 0.0080 max mem: 33300 Epoch: [6] [2360/4276] eta: 1:35:23 lr: 4.256448472748136e-05 loss: 0.1625 (0.1838) time: 3.0307 data: 0.0080 max mem: 33300 Epoch: [6] [2370/4276] eta: 1:34:53 lr: 4.2561806265263336e-05 loss: 0.1740 (0.1838) time: 3.0052 data: 0.0076 max mem: 33300 Epoch: [6] [2380/4276] eta: 1:34:23 lr: 4.255912778431646e-05 loss: 0.1703 (0.1838) time: 2.9898 data: 0.0074 max mem: 33300 Epoch: [6] [2390/4276] eta: 1:33:54 lr: 4.2556449284639254e-05 loss: 0.1600 (0.1837) time: 3.0213 data: 0.0070 max mem: 33300 Epoch: [6] [2400/4276] eta: 1:33:24 lr: 4.2553770766230304e-05 loss: 0.1613 (0.1838) time: 3.0096 data: 0.0075 max mem: 33300 Epoch: [6] [2410/4276] eta: 1:32:54 lr: 4.255109222908816e-05 loss: 0.1792 (0.1838) time: 2.9738 data: 0.0084 max mem: 33300 Epoch: [6] [2420/4276] eta: 1:32:24 lr: 4.2548413673211395e-05 loss: 0.1656 (0.1837) time: 2.9859 data: 0.0086 max mem: 33300 Epoch: [6] [2430/4276] eta: 1:31:54 lr: 4.2545735098598556e-05 loss: 0.1666 (0.1838) time: 2.9778 data: 0.0082 max mem: 33300 Epoch: [6] [2440/4276] eta: 1:31:23 lr: 
4.2543056505248196e-05 loss: 0.1860 (0.1837) time: 2.9433 data: 0.0083 max mem: 33300 Epoch: [6] [2450/4276] eta: 1:30:53 lr: 4.254037789315888e-05 loss: 0.1669 (0.1837) time: 2.9343 data: 0.0086 max mem: 33300 Epoch: [6] [2460/4276] eta: 1:30:23 lr: 4.253769926232917e-05 loss: 0.1773 (0.1837) time: 2.9370 data: 0.0084 max mem: 33300 Epoch: [6] [2470/4276] eta: 1:29:53 lr: 4.253502061275761e-05 loss: 0.1733 (0.1837) time: 2.9310 data: 0.0080 max mem: 33300 Epoch: [6] [2480/4276] eta: 1:29:22 lr: 4.253234194444277e-05 loss: 0.1740 (0.1837) time: 2.9317 data: 0.0080 max mem: 33300 Epoch: [6] [2490/4276] eta: 1:28:52 lr: 4.25296632573832e-05 loss: 0.1740 (0.1837) time: 2.9340 data: 0.0079 max mem: 33300 Epoch: [6] [2500/4276] eta: 1:28:22 lr: 4.2526984551577465e-05 loss: 0.1688 (0.1837) time: 2.9159 data: 0.0083 max mem: 33300 Epoch: [6] [2510/4276] eta: 1:27:51 lr: 4.252430582702412e-05 loss: 0.1735 (0.1836) time: 2.9024 data: 0.0087 max mem: 33300 Epoch: [6] [2520/4276] eta: 1:27:21 lr: 4.2521627083721716e-05 loss: 0.1572 (0.1835) time: 2.9023 data: 0.0094 max mem: 33300 Epoch: [6] [2530/4276] eta: 1:26:51 lr: 4.251894832166881e-05 loss: 0.1570 (0.1834) time: 2.9094 data: 0.0102 max mem: 33300 Epoch: [6] [2540/4276] eta: 1:26:20 lr: 4.2516269540863964e-05 loss: 0.1589 (0.1833) time: 2.9141 data: 0.0103 max mem: 33300 Epoch: [6] [2550/4276] eta: 1:25:50 lr: 4.2513590741305724e-05 loss: 0.1589 (0.1833) time: 2.9078 data: 0.0099 max mem: 33300 Epoch: [6] [2560/4276] eta: 1:25:19 lr: 4.251091192299265e-05 loss: 0.1583 (0.1832) time: 2.8941 data: 0.0090 max mem: 33300 Epoch: [6] [2570/4276] eta: 1:24:49 lr: 4.2508233085923304e-05 loss: 0.1583 (0.1832) time: 2.8812 data: 0.0082 max mem: 33300 Epoch: [6] [2580/4276] eta: 1:24:18 lr: 4.2505554230096245e-05 loss: 0.1776 (0.1832) time: 2.8801 data: 0.0079 max mem: 33300 Epoch: [6] [2590/4276] eta: 1:23:48 lr: 4.250287535551001e-05 loss: 0.1695 (0.1831) time: 2.9034 data: 0.0077 max mem: 33300 Epoch: [6] [2600/4276] eta: 
1:23:18 lr: 4.2500196462163164e-05 loss: 0.1777 (0.1832) time: 2.9180 data: 0.0080 max mem: 33300 Epoch: [6] [2610/4276] eta: 1:22:47 lr: 4.249751755005427e-05 loss: 0.1786 (0.1831) time: 2.8977 data: 0.0088 max mem: 33300 Epoch: [6] [2620/4276] eta: 1:22:17 lr: 4.249483861918186e-05 loss: 0.1789 (0.1832) time: 2.8879 data: 0.0087 max mem: 33300 Epoch: [6] [2630/4276] eta: 1:21:47 lr: 4.2492159669544505e-05 loss: 0.1751 (0.1831) time: 2.9113 data: 0.0087 max mem: 33300 Epoch: [6] [2640/4276] eta: 1:21:17 lr: 4.248948070114076e-05 loss: 0.1528 (0.1831) time: 2.9330 data: 0.0087 max mem: 33300 Epoch: [6] [2650/4276] eta: 1:20:47 lr: 4.248680171396918e-05 loss: 0.1771 (0.1831) time: 2.9358 data: 0.0081 max mem: 33300 Epoch: [6] [2660/4276] eta: 1:20:16 lr: 4.2484122708028304e-05 loss: 0.1843 (0.1831) time: 2.9142 data: 0.0087 max mem: 33300 Epoch: [6] [2670/4276] eta: 1:19:46 lr: 4.24814436833167e-05 loss: 0.1748 (0.1831) time: 2.8868 data: 0.0089 max mem: 33300 Epoch: [6] [2680/4276] eta: 1:19:15 lr: 4.247876463983292e-05 loss: 0.1705 (0.1831) time: 2.8837 data: 0.0081 max mem: 33300 Epoch: [6] [2690/4276] eta: 1:18:45 lr: 4.247608557757551e-05 loss: 0.1759 (0.1831) time: 2.8885 data: 0.0087 max mem: 33300 Epoch: [6] [2700/4276] eta: 1:18:15 lr: 4.247340649654302e-05 loss: 0.1633 (0.1830) time: 2.8892 data: 0.0093 max mem: 33300 Epoch: [6] [2710/4276] eta: 1:17:44 lr: 4.247072739673402e-05 loss: 0.1633 (0.1830) time: 2.8853 data: 0.0083 max mem: 33300 Epoch: [6] [2720/4276] eta: 1:17:14 lr: 4.246804827814704e-05 loss: 0.1718 (0.1829) time: 2.8848 data: 0.0077 max mem: 33300 Epoch: [6] [2730/4276] eta: 1:16:44 lr: 4.246536914078064e-05 loss: 0.1671 (0.1829) time: 2.8858 data: 0.0083 max mem: 33300 Epoch: [6] [2740/4276] eta: 1:16:14 lr: 4.246268998463339e-05 loss: 0.1922 (0.1830) time: 2.8870 data: 0.0087 max mem: 33300 Epoch: [6] [2750/4276] eta: 1:15:43 lr: 4.2460010809703816e-05 loss: 0.1886 (0.1830) time: 2.8937 data: 0.0086 max mem: 33300 Epoch: [6] [2760/4276] 
eta: 1:15:13 lr: 4.245733161599048e-05 loss: 0.1748 (0.1830) time: 2.8904 data: 0.0082 max mem: 33300 Epoch: [6] [2770/4276] eta: 1:14:43 lr: 4.245465240349194e-05 loss: 0.1690 (0.1830) time: 2.8958 data: 0.0078 max mem: 33300 Epoch: [6] [2780/4276] eta: 1:14:13 lr: 4.245197317220673e-05 loss: 0.1744 (0.1830) time: 2.9222 data: 0.0083 max mem: 33300 Epoch: [6] [2790/4276] eta: 1:13:43 lr: 4.2449293922133424e-05 loss: 0.1845 (0.1830) time: 2.9319 data: 0.0085 max mem: 33300 Epoch: [6] [2800/4276] eta: 1:13:13 lr: 4.244661465327054e-05 loss: 0.1773 (0.1830) time: 2.9329 data: 0.0081 max mem: 33300 Epoch: [6] [2810/4276] eta: 1:12:43 lr: 4.244393536561667e-05 loss: 0.1626 (0.1829) time: 2.9335 data: 0.0078 max mem: 33300 Epoch: [6] [2820/4276] eta: 1:12:13 lr: 4.2441256059170336e-05 loss: 0.1604 (0.1828) time: 2.9342 data: 0.0079 max mem: 33300 Epoch: [6] [2830/4276] eta: 1:11:43 lr: 4.243857673393009e-05 loss: 0.1646 (0.1828) time: 2.9359 data: 0.0083 max mem: 33300 Epoch: [6] [2840/4276] eta: 1:11:13 lr: 4.2435897389894487e-05 loss: 0.1646 (0.1828) time: 2.9353 data: 0.0079 max mem: 33300 Epoch: [6] [2850/4276] eta: 1:10:43 lr: 4.2433218027062074e-05 loss: 0.1908 (0.1829) time: 2.9348 data: 0.0074 max mem: 33300 Epoch: [6] [2860/4276] eta: 1:10:13 lr: 4.243053864543141e-05 loss: 0.1837 (0.1828) time: 2.9345 data: 0.0073 max mem: 33300 Epoch: [6] [2870/4276] eta: 1:09:43 lr: 4.242785924500103e-05 loss: 0.1793 (0.1828) time: 2.9347 data: 0.0077 max mem: 33300 Epoch: [6] [2880/4276] eta: 1:09:13 lr: 4.242517982576949e-05 loss: 0.1802 (0.1828) time: 2.9354 data: 0.0080 max mem: 33300 Epoch: [6] [2890/4276] eta: 1:08:43 lr: 4.2422500387735345e-05 loss: 0.1860 (0.1829) time: 2.9353 data: 0.0078 max mem: 33300 Epoch: [6] [2900/4276] eta: 1:08:13 lr: 4.241982093089712e-05 loss: 0.1656 (0.1828) time: 2.9283 data: 0.0079 max mem: 33300 Epoch: [6] [2910/4276] eta: 1:07:43 lr: 4.241714145525339e-05 loss: 0.1582 (0.1828) time: 2.9276 data: 0.0080 max mem: 33300 Epoch: [6] 
[2920/4276] eta: 1:07:13 lr: 4.24144619608027e-05 loss: 0.1780 (0.1828) time: 2.9368 data: 0.0080 max mem: 33300 Epoch: [6] [2930/4276] eta: 1:06:43 lr: 4.241178244754358e-05 loss: 0.1585 (0.1827) time: 2.9368 data: 0.0078 max mem: 33300 Epoch: [6] [2940/4276] eta: 1:06:13 lr: 4.240910291547459e-05 loss: 0.1560 (0.1827) time: 2.9302 data: 0.0082 max mem: 33300 Epoch: [6] [2950/4276] eta: 1:05:43 lr: 4.240642336459427e-05 loss: 0.1651 (0.1827) time: 2.9189 data: 0.0089 max mem: 33300 Epoch: [6] [2960/4276] eta: 1:05:14 lr: 4.240374379490118e-05 loss: 0.1766 (0.1826) time: 2.9253 data: 0.0089 max mem: 33300 Epoch: [6] [2970/4276] eta: 1:04:44 lr: 4.2401064206393867e-05 loss: 0.1745 (0.1827) time: 2.9388 data: 0.0087 max mem: 33300 Epoch: [6] [2980/4276] eta: 1:04:14 lr: 4.239838459907086e-05 loss: 0.1745 (0.1827) time: 2.9393 data: 0.0087 max mem: 33300 Epoch: [6] [2990/4276] eta: 1:03:44 lr: 4.2395704972930716e-05 loss: 0.1689 (0.1826) time: 2.9173 data: 0.0088 max mem: 33300 Epoch: [6] [3000/4276] eta: 1:03:14 lr: 4.239302532797199e-05 loss: 0.1587 (0.1825) time: 2.9161 data: 0.0087 max mem: 33300 Epoch: [6] [3010/4276] eta: 1:02:44 lr: 4.239034566419321e-05 loss: 0.1603 (0.1825) time: 2.9360 data: 0.0080 max mem: 33300 Epoch: [6] [3020/4276] eta: 1:02:14 lr: 4.238766598159294e-05 loss: 0.1646 (0.1825) time: 2.9318 data: 0.0078 max mem: 33300 Epoch: [6] [3030/4276] eta: 1:01:44 lr: 4.238498628016971e-05 loss: 0.1688 (0.1825) time: 2.9201 data: 0.0082 max mem: 33300 Epoch: [6] [3040/4276] eta: 1:01:14 lr: 4.2382306559922076e-05 loss: 0.1752 (0.1826) time: 2.9213 data: 0.0082 max mem: 33300 Epoch: [6] [3050/4276] eta: 1:00:44 lr: 4.237962682084858e-05 loss: 0.1790 (0.1825) time: 2.9358 data: 0.0079 max mem: 33300 Epoch: [6] [3060/4276] eta: 1:00:14 lr: 4.237694706294776e-05 loss: 0.1533 (0.1825) time: 2.9365 data: 0.0082 max mem: 33300 Epoch: [6] [3070/4276] eta: 0:59:44 lr: 4.237426728621818e-05 loss: 0.1632 (0.1825) time: 2.9396 data: 0.0082 max mem: 33300 Epoch: 
[6] [3080/4276] eta: 0:59:15 lr: 4.2371587490658374e-05 loss: 0.1816 (0.1825) time: 2.9404 data: 0.0079 max mem: 33300 Epoch: [6] [3090/4276] eta: 0:58:45 lr: 4.2368907676266876e-05 loss: 0.1603 (0.1824) time: 2.9412 data: 0.0079 max mem: 33300 Epoch: [6] [3100/4276] eta: 0:58:15 lr: 4.236622784304224e-05 loss: 0.1667 (0.1824) time: 2.9405 data: 0.0081 max mem: 33300 Epoch: [6] [3110/4276] eta: 0:57:45 lr: 4.236354799098301e-05 loss: 0.1645 (0.1823) time: 2.9389 data: 0.0083 max mem: 33300 Epoch: [6] [3120/4276] eta: 0:57:15 lr: 4.2360868120087725e-05 loss: 0.1539 (0.1822) time: 2.9395 data: 0.0081 max mem: 33300 Epoch: [6] [3130/4276] eta: 0:56:45 lr: 4.2358188230354944e-05 loss: 0.1539 (0.1822) time: 2.9343 data: 0.0079 max mem: 33300 Epoch: [6] [3140/4276] eta: 0:56:16 lr: 4.2355508321783195e-05 loss: 0.1685 (0.1822) time: 2.9333 data: 0.0081 max mem: 33300 Epoch: [6] [3150/4276] eta: 0:55:46 lr: 4.235282839437102e-05 loss: 0.1893 (0.1822) time: 2.9353 data: 0.0084 max mem: 33300 Epoch: [6] [3160/4276] eta: 0:55:16 lr: 4.235014844811697e-05 loss: 0.1872 (0.1822) time: 2.9351 data: 0.0083 max mem: 33300 Epoch: [6] [3170/4276] eta: 0:54:46 lr: 4.234746848301959e-05 loss: 0.1712 (0.1822) time: 2.9349 data: 0.0084 max mem: 33300 Epoch: [6] [3180/4276] eta: 0:54:16 lr: 4.234478849907741e-05 loss: 0.1709 (0.1822) time: 2.9349 data: 0.0085 max mem: 33300 Epoch: [6] [3190/4276] eta: 0:53:46 lr: 4.234210849628898e-05 loss: 0.1709 (0.1822) time: 2.9347 data: 0.0084 max mem: 33300 Epoch: [6] [3200/4276] eta: 0:53:16 lr: 4.233942847465284e-05 loss: 0.1757 (0.1822) time: 2.9356 data: 0.0081 max mem: 33300 Epoch: [6] [3210/4276] eta: 0:52:47 lr: 4.233674843416754e-05 loss: 0.1735 (0.1822) time: 2.9343 data: 0.0079 max mem: 33300 Epoch: [6] [3220/4276] eta: 0:52:17 lr: 4.2334068374831606e-05 loss: 0.1901 (0.1822) time: 2.9337 data: 0.0081 max mem: 33300 Epoch: [6] [3230/4276] eta: 0:51:47 lr: 4.2331388296643596e-05 loss: 0.1756 (0.1822) time: 2.9343 data: 0.0085 max mem: 33300 
Epoch: [6] [3240/4276] eta: 0:51:17 lr: 4.232870819960205e-05 loss: 0.1953 (0.1823) time: 2.9375 data: 0.0089 max mem: 33300 Epoch: [6] [3250/4276] eta: 0:50:47 lr: 4.232602808370549e-05 loss: 0.1924 (0.1823) time: 2.9460 data: 0.0088 max mem: 33300 Epoch: [6] [3260/4276] eta: 0:50:17 lr: 4.232334794895247e-05 loss: 0.1853 (0.1823) time: 2.9297 data: 0.0084 max mem: 33300 Epoch: [6] [3270/4276] eta: 0:49:48 lr: 4.232066779534153e-05 loss: 0.1853 (0.1822) time: 2.9122 data: 0.0092 max mem: 33300 Epoch: [6] [3280/4276] eta: 0:49:18 lr: 4.2317987622871215e-05 loss: 0.1820 (0.1823) time: 2.9281 data: 0.0096 max mem: 33300 Epoch: [6] [3290/4276] eta: 0:48:48 lr: 4.2315307431540055e-05 loss: 0.1817 (0.1823) time: 2.9396 data: 0.0091 max mem: 33300 Epoch: [6] [3300/4276] eta: 0:48:18 lr: 4.23126272213466e-05 loss: 0.1817 (0.1823) time: 2.9425 data: 0.0090 max mem: 33300 Epoch: [6] [3310/4276] eta: 0:47:48 lr: 4.2309946992289386e-05 loss: 0.1889 (0.1824) time: 2.9448 data: 0.0088 max mem: 33300 Epoch: [6] [3320/4276] eta: 0:47:19 lr: 4.2307266744366944e-05 loss: 0.1890 (0.1824) time: 2.9436 data: 0.0083 max mem: 33300 Epoch: [6] [3330/4276] eta: 0:46:49 lr: 4.2304586477577826e-05 loss: 0.1731 (0.1824) time: 2.9388 data: 0.0077 max mem: 33300 Epoch: [6] [3340/4276] eta: 0:46:19 lr: 4.230190619192057e-05 loss: 0.1731 (0.1824) time: 2.9393 data: 0.0079 max mem: 33300 Epoch: [6] [3350/4276] eta: 0:45:49 lr: 4.22992258873937e-05 loss: 0.1652 (0.1823) time: 2.9392 data: 0.0083 max mem: 33300 Epoch: [6] [3360/4276] eta: 0:45:20 lr: 4.229654556399578e-05 loss: 0.1724 (0.1823) time: 2.9397 data: 0.0083 max mem: 33300 Epoch: [6] [3370/4276] eta: 0:44:50 lr: 4.229386522172532e-05 loss: 0.1821 (0.1824) time: 2.9423 data: 0.0081 max mem: 33300 Epoch: [6] [3380/4276] eta: 0:44:20 lr: 4.2291184860580875e-05 loss: 0.1894 (0.1824) time: 2.9513 data: 0.0077 max mem: 33300 Epoch: [6] [3390/4276] eta: 0:43:50 lr: 4.228850448056098e-05 loss: 0.1894 (0.1825) time: 2.9508 data: 0.0079 max mem: 
33300 Epoch: [6] [3400/4276] eta: 0:43:21 lr: 4.2285824081664174e-05 loss: 0.2064 (0.1825) time: 2.9383 data: 0.0081 max mem: 33300 Epoch: [6] [3410/4276] eta: 0:42:51 lr: 4.228314366388899e-05 loss: 0.1869 (0.1825) time: 2.9380 data: 0.0082 max mem: 33300 Epoch: [6] [3420/4276] eta: 0:42:21 lr: 4.2280463227233964e-05 loss: 0.1834 (0.1826) time: 2.9364 data: 0.0082 max mem: 33300 Epoch: [6] [3430/4276] eta: 0:41:51 lr: 4.227778277169764e-05 loss: 0.1807 (0.1826) time: 2.9336 data: 0.0078 max mem: 33300 Epoch: [6] [3440/4276] eta: 0:41:21 lr: 4.227510229727856e-05 loss: 0.1717 (0.1825) time: 2.9320 data: 0.0075 max mem: 33300 Epoch: [6] [3450/4276] eta: 0:40:52 lr: 4.227242180397525e-05 loss: 0.1702 (0.1825) time: 2.9327 data: 0.0078 max mem: 33300 Epoch: [6] [3460/4276] eta: 0:40:22 lr: 4.226974129178624e-05 loss: 0.1806 (0.1825) time: 2.9326 data: 0.0078 max mem: 33300 Epoch: [6] [3470/4276] eta: 0:39:52 lr: 4.226706076071009e-05 loss: 0.1657 (0.1825) time: 2.9321 data: 0.0076 max mem: 33300 Epoch: [6] [3480/4276] eta: 0:39:22 lr: 4.2264380210745314e-05 loss: 0.1828 (0.1825) time: 2.9322 data: 0.0076 max mem: 33300 Epoch: [6] [3490/4276] eta: 0:38:53 lr: 4.226169964189045e-05 loss: 0.1916 (0.1825) time: 2.9315 data: 0.0076 max mem: 33300 Epoch: [6] [3500/4276] eta: 0:38:23 lr: 4.225901905414404e-05 loss: 0.1880 (0.1825) time: 2.9319 data: 0.0073 max mem: 33300 Epoch: [6] [3510/4276] eta: 0:37:53 lr: 4.225633844750463e-05 loss: 0.1768 (0.1825) time: 2.9460 data: 0.0071 max mem: 33300 Epoch: [6] [3520/4276] eta: 0:37:23 lr: 4.2253657821970734e-05 loss: 0.1851 (0.1825) time: 2.9449 data: 0.0071 max mem: 33300 Epoch: [6] [3530/4276] eta: 0:36:54 lr: 4.2250977177540894e-05 loss: 0.1851 (0.1825) time: 2.9313 data: 0.0073 max mem: 33300 Epoch: [6] [3540/4276] eta: 0:36:24 lr: 4.224829651421366e-05 loss: 0.1837 (0.1825) time: 2.9319 data: 0.0074 max mem: 33300 Epoch: [6] [3550/4276] eta: 0:35:54 lr: 4.224561583198754e-05 loss: 0.1743 (0.1824) time: 2.9337 data: 0.0074 max 
mem: 33300 Epoch: [6] [3560/4276] eta: 0:35:24 lr: 4.2242935130861085e-05 loss: 0.1701 (0.1825) time: 2.9331 data: 0.0073 max mem: 33300 Epoch: [6] [3570/4276] eta: 0:34:55 lr: 4.224025441083282e-05 loss: 0.1919 (0.1825) time: 2.9225 data: 0.0071 max mem: 33300 Epoch: [6] [3580/4276] eta: 0:34:25 lr: 4.223757367190129e-05 loss: 0.1770 (0.1825) time: 2.9248 data: 0.0071 max mem: 33300 Epoch: [6] [3590/4276] eta: 0:33:55 lr: 4.223489291406503e-05 loss: 0.1613 (0.1825) time: 2.9338 data: 0.0071 max mem: 33300 Epoch: [6] [3600/4276] eta: 0:33:25 lr: 4.2232212137322565e-05 loss: 0.1833 (0.1825) time: 2.9329 data: 0.0071 max mem: 33300 Epoch: [6] [3610/4276] eta: 0:32:56 lr: 4.222953134167242e-05 loss: 0.1900 (0.1825) time: 2.9351 data: 0.0072 max mem: 33300 Epoch: [6] [3620/4276] eta: 0:32:26 lr: 4.222685052711315e-05 loss: 0.1906 (0.1825) time: 2.9345 data: 0.0072 max mem: 33300 Epoch: [6] [3630/4276] eta: 0:31:56 lr: 4.2224169693643266e-05 loss: 0.1922 (0.1825) time: 2.9382 data: 0.0074 max mem: 33300 Epoch: [6] [3640/4276] eta: 0:31:26 lr: 4.2221488841261315e-05 loss: 0.1772 (0.1825) time: 2.9388 data: 0.0074 max mem: 33300 Epoch: [6] [3650/4276] eta: 0:30:57 lr: 4.2218807969965816e-05 loss: 0.1562 (0.1825) time: 2.9376 data: 0.0072 max mem: 33300 Epoch: [6] [3660/4276] eta: 0:30:27 lr: 4.221612707975532e-05 loss: 0.1549 (0.1824) time: 2.9378 data: 0.0074 max mem: 33300 Epoch: [6] [3670/4276] eta: 0:29:57 lr: 4.221344617062835e-05 loss: 0.1650 (0.1824) time: 2.9391 data: 0.0077 max mem: 33300 Epoch: [6] [3680/4276] eta: 0:29:28 lr: 4.221076524258343e-05 loss: 0.1919 (0.1824) time: 2.9402 data: 0.0077 max mem: 33300 Epoch: [6] [3690/4276] eta: 0:28:58 lr: 4.2208084295619096e-05 loss: 0.1830 (0.1824) time: 2.9371 data: 0.0074 max mem: 33300 Epoch: [6] [3700/4276] eta: 0:28:28 lr: 4.220540332973388e-05 loss: 0.1805 (0.1824) time: 2.9515 data: 0.0073 max mem: 33300 Epoch: [6] [3710/4276] eta: 0:27:58 lr: 4.220272234492632e-05 loss: 0.1740 (0.1824) time: 2.9494 data: 
0.0072 max mem: 33300 Epoch: [6] [3720/4276] eta: 0:27:29 lr: 4.2200041341194935e-05 loss: 0.1601 (0.1824) time: 2.9325 data: 0.0072 max mem: 33300 Epoch: [6] [3730/4276] eta: 0:26:59 lr: 4.219736031853827e-05 loss: 0.1772 (0.1824) time: 2.9382 data: 0.0072 max mem: 33300 Epoch: [6] [3740/4276] eta: 0:26:29 lr: 4.219467927695484e-05 loss: 0.1772 (0.1824) time: 2.9386 data: 0.0072 max mem: 33300 Epoch: [6] [3750/4276] eta: 0:26:00 lr: 4.219199821644318e-05 loss: 0.1716 (0.1824) time: 2.9295 data: 0.0072 max mem: 33300 Epoch: [6] [3760/4276] eta: 0:25:30 lr: 4.218931713700182e-05 loss: 0.1716 (0.1824) time: 2.9283 data: 0.0073 max mem: 33300 Epoch: [6] [3770/4276] eta: 0:25:00 lr: 4.2186636038629295e-05 loss: 0.1817 (0.1824) time: 2.9293 data: 0.0072 max mem: 33300 Epoch: [6] [3780/4276] eta: 0:24:31 lr: 4.218395492132413e-05 loss: 0.1737 (0.1824) time: 2.9291 data: 0.0073 max mem: 33300 Epoch: [6] [3790/4276] eta: 0:24:01 lr: 4.2181273785084856e-05 loss: 0.1737 (0.1824) time: 2.9296 data: 0.0077 max mem: 33300 Epoch: [6] [3800/4276] eta: 0:23:31 lr: 4.2178592629910005e-05 loss: 0.1944 (0.1825) time: 2.9293 data: 0.0077 max mem: 33300 Epoch: [6] [3810/4276] eta: 0:23:01 lr: 4.217591145579809e-05 loss: 0.1782 (0.1825) time: 2.9283 data: 0.0075 max mem: 33300 Epoch: [6] [3820/4276] eta: 0:22:32 lr: 4.217323026274766e-05 loss: 0.1561 (0.1824) time: 2.9213 data: 0.0076 max mem: 33300 Epoch: [6] [3830/4276] eta: 0:22:02 lr: 4.2170549050757236e-05 loss: 0.1637 (0.1824) time: 2.9066 data: 0.0078 max mem: 33300 Epoch: [6] [3840/4276] eta: 0:21:32 lr: 4.2167867819825344e-05 loss: 0.1694 (0.1824) time: 2.9064 data: 0.0083 max mem: 33300 Epoch: [6] [3850/4276] eta: 0:21:03 lr: 4.216518656995051e-05 loss: 0.1702 (0.1823) time: 2.9206 data: 0.0081 max mem: 33300 Epoch: [6] [3860/4276] eta: 0:20:33 lr: 4.216250530113127e-05 loss: 0.1709 (0.1823) time: 2.9182 data: 0.0073 max mem: 33300 Epoch: [6] [3870/4276] eta: 0:20:03 lr: 4.215982401336614e-05 loss: 0.1741 (0.1823) time: 2.9084 
data: 0.0073 max mem: 33300 Epoch: [6] [3880/4276] eta: 0:19:33 lr: 4.215714270665366e-05 loss: 0.1741 (0.1823) time: 2.9099 data: 0.0075 max mem: 33300 Epoch: [6] [3890/4276] eta: 0:19:04 lr: 4.215446138099234e-05 loss: 0.1660 (0.1823) time: 2.9106 data: 0.0073 max mem: 33300 Epoch: [6] [3900/4276] eta: 0:18:34 lr: 4.2151780036380726e-05 loss: 0.1698 (0.1823) time: 2.9153 data: 0.0071 max mem: 33300 Epoch: [6] [3910/4276] eta: 0:18:04 lr: 4.214909867281734e-05 loss: 0.1680 (0.1822) time: 2.9258 data: 0.0071 max mem: 33300 Epoch: [6] [3920/4276] eta: 0:17:35 lr: 4.2146417290300695e-05 loss: 0.1645 (0.1822) time: 2.9338 data: 0.0072 max mem: 33300 Epoch: [6] [3930/4276] eta: 0:17:05 lr: 4.214373588882934e-05 loss: 0.1730 (0.1823) time: 2.9283 data: 0.0072 max mem: 33300 Epoch: [6] [3940/4276] eta: 0:16:35 lr: 4.2141054468401776e-05 loss: 0.1730 (0.1823) time: 2.9184 data: 0.0072 max mem: 33300 Epoch: [6] [3950/4276] eta: 0:16:06 lr: 4.213837302901654e-05 loss: 0.1718 (0.1822) time: 2.9198 data: 0.0075 max mem: 33300 Epoch: [6] [3960/4276] eta: 0:15:36 lr: 4.2135691570672164e-05 loss: 0.1782 (0.1822) time: 2.9201 data: 0.0074 max mem: 33300 Epoch: [6] [3970/4276] eta: 0:15:06 lr: 4.2133010093367166e-05 loss: 0.1824 (0.1823) time: 2.9230 data: 0.0075 max mem: 33300 Epoch: [6] [3980/4276] eta: 0:14:37 lr: 4.2130328597100076e-05 loss: 0.1799 (0.1823) time: 2.9220 data: 0.0077 max mem: 33300 Epoch: [6] [3990/4276] eta: 0:14:07 lr: 4.212764708186941e-05 loss: 0.1799 (0.1823) time: 2.9199 data: 0.0075 max mem: 33300 Epoch: [6] [4000/4276] eta: 0:13:37 lr: 4.21249655476737e-05 loss: 0.1698 (0.1823) time: 2.9208 data: 0.0073 max mem: 33300 Epoch: [6] [4010/4276] eta: 0:13:08 lr: 4.2122283994511475e-05 loss: 0.1662 (0.1823) time: 2.9246 data: 0.0072 max mem: 33300 Epoch: [6] [4020/4276] eta: 0:12:38 lr: 4.211960242238124e-05 loss: 0.1777 (0.1823) time: 2.9200 data: 0.0074 max mem: 33300 Epoch: [6] [4030/4276] eta: 0:12:08 lr: 4.211692083128154e-05 loss: 0.1829 (0.1823) time: 
2.9180 data: 0.0076 max mem: 33300 Epoch: [6] [4040/4276] eta: 0:11:39 lr: 4.211423922121089e-05 loss: 0.1829 (0.1824) time: 2.9089 data: 0.0081 max mem: 33300 Epoch: [6] [4050/4276] eta: 0:11:09 lr: 4.211155759216781e-05 loss: 0.1743 (0.1824) time: 2.8775 data: 0.0086 max mem: 33300 Epoch: [6] [4060/4276] eta: 0:10:39 lr: 4.2108875944150836e-05 loss: 0.1705 (0.1824) time: 2.8628 data: 0.0082 max mem: 33300 Epoch: [6] [4070/4276] eta: 0:10:10 lr: 4.210619427715848e-05 loss: 0.1798 (0.1824) time: 2.8615 data: 0.0081 max mem: 33300 Epoch: [6] [4080/4276] eta: 0:09:40 lr: 4.210351259118926e-05 loss: 0.1798 (0.1824) time: 2.8626 data: 0.0083 max mem: 33300 Epoch: [6] [4090/4276] eta: 0:09:10 lr: 4.210083088624172e-05 loss: 0.1862 (0.1824) time: 2.8611 data: 0.0079 max mem: 33300 Epoch: [6] [4100/4276] eta: 0:08:41 lr: 4.209814916231436e-05 loss: 0.1862 (0.1824) time: 2.8616 data: 0.0083 max mem: 33300 Epoch: [6] [4110/4276] eta: 0:08:11 lr: 4.2095467419405714e-05 loss: 0.1843 (0.1824) time: 2.8654 data: 0.0089 max mem: 33300 Epoch: [6] [4120/4276] eta: 0:07:41 lr: 4.20927856575143e-05 loss: 0.1843 (0.1824) time: 2.8654 data: 0.0085 max mem: 33300 Epoch: [6] [4130/4276] eta: 0:07:12 lr: 4.209010387663864e-05 loss: 0.1763 (0.1824) time: 2.8982 data: 0.0087 max mem: 33300 Epoch: [6] [4140/4276] eta: 0:06:42 lr: 4.208742207677726e-05 loss: 0.1719 (0.1824) time: 2.9274 data: 0.0091 max mem: 33300 Epoch: [6] [4150/4276] eta: 0:06:13 lr: 4.208474025792868e-05 loss: 0.1743 (0.1824) time: 2.9200 data: 0.0089 max mem: 33300 Epoch: [6] [4160/4276] eta: 0:05:43 lr: 4.208205842009141e-05 loss: 0.1739 (0.1824) time: 2.9280 data: 0.0087 max mem: 33300 Epoch: [6] [4170/4276] eta: 0:05:13 lr: 4.207937656326399e-05 loss: 0.1813 (0.1824) time: 2.9329 data: 0.0091 max mem: 33300 Epoch: [6] [4180/4276] eta: 0:04:44 lr: 4.207669468744493e-05 loss: 0.1791 (0.1824) time: 2.9184 data: 0.0100 max mem: 33300 Epoch: [6] [4190/4276] eta: 0:04:14 lr: 4.207401279263275e-05 loss: 0.1654 (0.1824) 
time: 2.8993 data: 0.0097 max mem: 33300
Epoch: [6] [4200/4276] eta: 0:03:44 lr: 4.2071330878825956e-05 loss: 0.1823 (0.1824) time: 2.8912 data: 0.0086 max mem: 33300
Epoch: [6] [4210/4276] eta: 0:03:15 lr: 4.206864894602311e-05 loss: 0.2089 (0.1825) time: 2.8992 data: 0.0087 max mem: 33300
Epoch: [6] [4220/4276] eta: 0:02:45 lr: 4.206596699422269e-05 loss: 0.2110 (0.1826) time: 2.9321 data: 0.0092 max mem: 33300
Epoch: [6] [4230/4276] eta: 0:02:16 lr: 4.206328502342323e-05 loss: 0.2134 (0.1826) time: 2.9516 data: 0.0091 max mem: 33300
Epoch: [6] [4240/4276] eta: 0:01:46 lr: 4.206060303362326e-05 loss: 0.2100 (0.1827) time: 2.9635 data: 0.0085 max mem: 33300
Epoch: [6] [4250/4276] eta: 0:01:16 lr: 4.205792102482129e-05 loss: 0.1969 (0.1827) time: 3.0049 data: 0.0085 max mem: 33300
Epoch: [6] [4260/4276] eta: 0:00:47 lr: 4.2055238997015825e-05 loss: 0.1969 (0.1828) time: 3.0490 data: 0.0084 max mem: 33300
Epoch: [6] [4270/4276] eta: 0:00:17 lr: 4.205255695020541e-05 loss: 0.1910 (0.1828) time: 3.0689 data: 0.0075 max mem: 33300
Epoch: [6] Total time: 3:31:00
Test: [ 0/21770] eta: 13:24:05 time: 2.2161 data: 2.1719 max mem: 33300
Test: [ 100/21770] eta: 0:21:52 time: 0.0393 data: 0.0011 max mem: 33300
Test: [ 200/21770] eta: 0:17:52 time: 0.0381 data: 0.0011 max mem: 33300
Test: [ 300/21770] eta: 0:16:24 time: 0.0381 data: 0.0011 max mem: 33300
Test: [ 400/21770] eta: 0:15:38 time: 0.0382 data: 0.0011 max mem: 33300
Test: [ 500/21770] eta: 0:15:09 time: 0.0383 data: 0.0011 max mem: 33300
Test: [ 600/21770] eta: 0:14:49 time: 0.0383 data: 0.0011 max mem: 33300
Test: [ 700/21770] eta: 0:14:34 time: 0.0385 data: 0.0011 max mem: 33300
Test: [ 800/21770] eta: 0:14:22 time: 0.0385 data: 0.0011 max mem: 33300
Test: [ 900/21770] eta: 0:14:11 time: 0.0381 data: 0.0011 max mem: 33300
Test: [ 1000/21770] eta: 0:14:02 time: 0.0382 data: 0.0011 max mem: 33300
Test: [ 1100/21770] eta: 0:13:53 time: 0.0382 data: 0.0011 max mem: 33300
Test: [ 1200/21770] eta: 0:13:46 time: 0.0384 
data: 0.0011 max mem: 33300 Test: [ 1300/21770] eta: 0:13:40 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 1400/21770] eta: 0:13:34 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 1500/21770] eta: 0:13:29 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 1600/21770] eta: 0:13:24 time: 0.0395 data: 0.0011 max mem: 33300 Test: [ 1700/21770] eta: 0:13:19 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 1800/21770] eta: 0:13:13 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 1900/21770] eta: 0:13:08 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 2000/21770] eta: 0:13:03 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 2100/21770] eta: 0:12:58 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 2200/21770] eta: 0:12:53 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 2300/21770] eta: 0:12:48 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 2400/21770] eta: 0:12:43 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 2500/21770] eta: 0:12:39 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 2600/21770] eta: 0:12:34 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 2700/21770] eta: 0:12:31 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 2800/21770] eta: 0:12:27 time: 0.0402 data: 0.0012 max mem: 33300 Test: [ 2900/21770] eta: 0:12:24 time: 0.0395 data: 0.0011 max mem: 33300 Test: [ 3000/21770] eta: 0:12:20 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 3100/21770] eta: 0:12:15 time: 0.0386 data: 0.0012 max mem: 33300 Test: [ 3200/21770] eta: 0:12:11 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 3300/21770] eta: 0:12:07 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 3400/21770] eta: 0:12:03 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 3500/21770] eta: 0:11:59 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 3600/21770] eta: 0:11:54 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 3700/21770] eta: 0:11:50 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 3800/21770] eta: 0:11:46 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 3900/21770] eta: 0:11:41 time: 0.0384 
data: 0.0011 max mem: 33300 Test: [ 4000/21770] eta: 0:11:37 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 4100/21770] eta: 0:11:33 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 4200/21770] eta: 0:11:29 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 4300/21770] eta: 0:11:24 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 4400/21770] eta: 0:11:20 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 4500/21770] eta: 0:11:16 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 4600/21770] eta: 0:11:12 time: 0.0401 data: 0.0011 max mem: 33300 Test: [ 4700/21770] eta: 0:11:09 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 4800/21770] eta: 0:11:04 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 4900/21770] eta: 0:11:00 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 5000/21770] eta: 0:10:56 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 5100/21770] eta: 0:10:52 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 5200/21770] eta: 0:10:48 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 5300/21770] eta: 0:10:44 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 5400/21770] eta: 0:10:40 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 5500/21770] eta: 0:10:36 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 5600/21770] eta: 0:10:32 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 5700/21770] eta: 0:10:28 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 5800/21770] eta: 0:10:23 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 5900/21770] eta: 0:10:19 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 6000/21770] eta: 0:10:15 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 6100/21770] eta: 0:10:11 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 6200/21770] eta: 0:10:07 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 6300/21770] eta: 0:10:03 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 6400/21770] eta: 0:09:59 time: 0.0397 data: 0.0011 max mem: 33300 Test: [ 6500/21770] eta: 0:09:56 time: 0.0398 data: 0.0013 max mem: 33300 Test: [ 6600/21770] eta: 0:09:52 time: 0.0387 
data: 0.0011 max mem: 33300 Test: [ 6700/21770] eta: 0:09:48 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 6800/21770] eta: 0:09:44 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 6900/21770] eta: 0:09:40 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 7000/21770] eta: 0:09:36 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 7100/21770] eta: 0:09:32 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 7200/21770] eta: 0:09:28 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 7300/21770] eta: 0:09:24 time: 0.0390 data: 0.0012 max mem: 33300 Test: [ 7400/21770] eta: 0:09:20 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 7500/21770] eta: 0:09:16 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 7600/21770] eta: 0:09:12 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 7700/21770] eta: 0:09:08 time: 0.0393 data: 0.0012 max mem: 33300 Test: [ 7800/21770] eta: 0:09:04 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 7900/21770] eta: 0:09:01 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 8000/21770] eta: 0:08:57 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 8100/21770] eta: 0:08:53 time: 0.0387 data: 0.0012 max mem: 33300 Test: [ 8200/21770] eta: 0:08:49 time: 0.0388 data: 0.0012 max mem: 33300 Test: [ 8300/21770] eta: 0:08:45 time: 0.0388 data: 0.0012 max mem: 33300 Test: [ 8400/21770] eta: 0:08:41 time: 0.0388 data: 0.0012 max mem: 33300 Test: [ 8500/21770] eta: 0:08:37 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 8600/21770] eta: 0:08:33 time: 0.0390 data: 0.0012 max mem: 33300 Test: [ 8700/21770] eta: 0:08:29 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 8800/21770] eta: 0:08:25 time: 0.0386 data: 0.0012 max mem: 33300 Test: [ 8900/21770] eta: 0:08:21 time: 0.0387 data: 0.0012 max mem: 33300 Test: [ 9000/21770] eta: 0:08:17 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 9100/21770] eta: 0:08:13 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 9200/21770] eta: 0:08:10 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 9300/21770] eta: 0:08:06 time: 0.0385 
data: 0.0011 max mem: 33300 Test: [ 9400/21770] eta: 0:08:02 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 9500/21770] eta: 0:07:58 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 9600/21770] eta: 0:07:54 time: 0.0386 data: 0.0012 max mem: 33300 Test: [ 9700/21770] eta: 0:07:50 time: 0.0387 data: 0.0012 max mem: 33300 Test: [ 9800/21770] eta: 0:07:46 time: 0.0390 data: 0.0012 max mem: 33300 Test: [ 9900/21770] eta: 0:07:42 time: 0.0387 data: 0.0011 max mem: 33300 Test: [10000/21770] eta: 0:07:38 time: 0.0394 data: 0.0012 max mem: 33300 Test: [10100/21770] eta: 0:07:34 time: 0.0386 data: 0.0011 max mem: 33300 Test: [10200/21770] eta: 0:07:30 time: 0.0388 data: 0.0011 max mem: 33300 Test: [10300/21770] eta: 0:07:26 time: 0.0391 data: 0.0012 max mem: 33300 Test: [10400/21770] eta: 0:07:22 time: 0.0389 data: 0.0012 max mem: 33300 Test: [10500/21770] eta: 0:07:18 time: 0.0391 data: 0.0011 max mem: 33300 Test: [10600/21770] eta: 0:07:15 time: 0.0400 data: 0.0012 max mem: 33300 Test: [10700/21770] eta: 0:07:11 time: 0.0387 data: 0.0012 max mem: 33300 Test: [10800/21770] eta: 0:07:07 time: 0.0384 data: 0.0011 max mem: 33300 Test: [10900/21770] eta: 0:07:03 time: 0.0385 data: 0.0011 max mem: 33300 Test: [11000/21770] eta: 0:06:59 time: 0.0389 data: 0.0011 max mem: 33300 Test: [11100/21770] eta: 0:06:55 time: 0.0391 data: 0.0011 max mem: 33300 Test: [11200/21770] eta: 0:06:51 time: 0.0389 data: 0.0011 max mem: 33300 Test: [11300/21770] eta: 0:06:47 time: 0.0386 data: 0.0011 max mem: 33300 Test: [11400/21770] eta: 0:06:43 time: 0.0387 data: 0.0011 max mem: 33300 Test: [11500/21770] eta: 0:06:39 time: 0.0388 data: 0.0012 max mem: 33300 Test: [11600/21770] eta: 0:06:36 time: 0.0388 data: 0.0012 max mem: 33300 Test: [11700/21770] eta: 0:06:32 time: 0.0385 data: 0.0011 max mem: 33300 Test: [11800/21770] eta: 0:06:28 time: 0.0388 data: 0.0012 max mem: 33300 Test: [11900/21770] eta: 0:06:24 time: 0.0388 data: 0.0012 max mem: 33300 Test: [12000/21770] eta: 0:06:20 time: 0.0391 
data: 0.0011 max mem: 33300 Test: [12100/21770] eta: 0:06:16 time: 0.0386 data: 0.0011 max mem: 33300 Test: [12200/21770] eta: 0:06:12 time: 0.0387 data: 0.0011 max mem: 33300 Test: [12300/21770] eta: 0:06:08 time: 0.0385 data: 0.0011 max mem: 33300 Test: [12400/21770] eta: 0:06:04 time: 0.0386 data: 0.0011 max mem: 33300 Test: [12500/21770] eta: 0:06:00 time: 0.0383 data: 0.0011 max mem: 33300 Test: [12600/21770] eta: 0:05:56 time: 0.0386 data: 0.0012 max mem: 33300 Test: [12700/21770] eta: 0:05:52 time: 0.0387 data: 0.0011 max mem: 33300 Test: [12800/21770] eta: 0:05:49 time: 0.0389 data: 0.0011 max mem: 33300 Test: [12900/21770] eta: 0:05:45 time: 0.0387 data: 0.0011 max mem: 33300 Test: [13000/21770] eta: 0:05:41 time: 0.0390 data: 0.0011 max mem: 33300 Test: [13100/21770] eta: 0:05:37 time: 0.0387 data: 0.0011 max mem: 33300 Test: [13200/21770] eta: 0:05:33 time: 0.0389 data: 0.0011 max mem: 33300 Test: [13300/21770] eta: 0:05:29 time: 0.0389 data: 0.0011 max mem: 33300 Test: [13400/21770] eta: 0:05:25 time: 0.0389 data: 0.0011 max mem: 33300 Test: [13500/21770] eta: 0:05:21 time: 0.0387 data: 0.0011 max mem: 33300 Test: [13600/21770] eta: 0:05:17 time: 0.0387 data: 0.0011 max mem: 33300 Test: [13700/21770] eta: 0:05:13 time: 0.0386 data: 0.0011 max mem: 33300 Test: [13800/21770] eta: 0:05:10 time: 0.0387 data: 0.0011 max mem: 33300 Test: [13900/21770] eta: 0:05:06 time: 0.0392 data: 0.0011 max mem: 33300 Test: [14000/21770] eta: 0:05:02 time: 0.0394 data: 0.0012 max mem: 33300 Test: [14100/21770] eta: 0:04:58 time: 0.0385 data: 0.0011 max mem: 33300 Test: [14200/21770] eta: 0:04:54 time: 0.0385 data: 0.0011 max mem: 33300 Test: [14300/21770] eta: 0:04:50 time: 0.0385 data: 0.0011 max mem: 33300 Test: [14400/21770] eta: 0:04:46 time: 0.0385 data: 0.0011 max mem: 33300 Test: [14500/21770] eta: 0:04:42 time: 0.0384 data: 0.0011 max mem: 33300 Test: [14600/21770] eta: 0:04:38 time: 0.0390 data: 0.0010 max mem: 33300 Test: [14700/21770] eta: 0:04:35 time: 0.0402 
data: 0.0010 max mem: 33300 Test: [14800/21770] eta: 0:04:31 time: 0.0400 data: 0.0011 max mem: 33300 Test: [14900/21770] eta: 0:04:27 time: 0.0398 data: 0.0011 max mem: 33300 Test: [15000/21770] eta: 0:04:23 time: 0.0400 data: 0.0012 max mem: 33300 Test: [15100/21770] eta: 0:04:19 time: 0.0391 data: 0.0011 max mem: 33300 Test: [15200/21770] eta: 0:04:15 time: 0.0395 data: 0.0012 max mem: 33300 Test: [15300/21770] eta: 0:04:11 time: 0.0394 data: 0.0012 max mem: 33300 Test: [15400/21770] eta: 0:04:08 time: 0.0398 data: 0.0012 max mem: 33300 Test: [15500/21770] eta: 0:04:04 time: 0.0399 data: 0.0012 max mem: 33300 Test: [15600/21770] eta: 0:04:00 time: 0.0399 data: 0.0012 max mem: 33300 Test: [15700/21770] eta: 0:03:56 time: 0.0401 data: 0.0012 max mem: 33300 Test: [15800/21770] eta: 0:03:52 time: 0.0399 data: 0.0012 max mem: 33300 Test: [15900/21770] eta: 0:03:48 time: 0.0393 data: 0.0012 max mem: 33300 Test: [16000/21770] eta: 0:03:44 time: 0.0387 data: 0.0012 max mem: 33300 Test: [16100/21770] eta: 0:03:40 time: 0.0386 data: 0.0012 max mem: 33300 Test: [16200/21770] eta: 0:03:37 time: 0.0385 data: 0.0011 max mem: 33300 Test: [16300/21770] eta: 0:03:33 time: 0.0389 data: 0.0011 max mem: 33300 Test: [16400/21770] eta: 0:03:29 time: 0.0388 data: 0.0012 max mem: 33300 Test: [16500/21770] eta: 0:03:25 time: 0.0388 data: 0.0011 max mem: 33300 Test: [16600/21770] eta: 0:03:21 time: 0.0389 data: 0.0012 max mem: 33300 Test: [16700/21770] eta: 0:03:17 time: 0.0386 data: 0.0012 max mem: 33300 Test: [16800/21770] eta: 0:03:13 time: 0.0387 data: 0.0012 max mem: 33300 Test: [16900/21770] eta: 0:03:09 time: 0.0389 data: 0.0012 max mem: 33300 Test: [17000/21770] eta: 0:03:05 time: 0.0390 data: 0.0012 max mem: 33300 Test: [17100/21770] eta: 0:03:01 time: 0.0387 data: 0.0012 max mem: 33300 Test: [17200/21770] eta: 0:02:58 time: 0.0391 data: 0.0011 max mem: 33300 Test: [17300/21770] eta: 0:02:54 time: 0.0389 data: 0.0011 max mem: 33300 Test: [17400/21770] eta: 0:02:50 time: 0.0388 
data: 0.0011 max mem: 33300 Test: [17500/21770] eta: 0:02:46 time: 0.0388 data: 0.0011 max mem: 33300 Test: [17600/21770] eta: 0:02:42 time: 0.0385 data: 0.0011 max mem: 33300 Test: [17700/21770] eta: 0:02:38 time: 0.0384 data: 0.0012 max mem: 33300 Test: [17800/21770] eta: 0:02:34 time: 0.0385 data: 0.0011 max mem: 33300 Test: [17900/21770] eta: 0:02:30 time: 0.0383 data: 0.0011 max mem: 33300 Test: [18000/21770] eta: 0:02:26 time: 0.0382 data: 0.0010 max mem: 33300 Test: [18100/21770] eta: 0:02:22 time: 0.0390 data: 0.0011 max mem: 33300 Test: [18200/21770] eta: 0:02:19 time: 0.0399 data: 0.0012 max mem: 33300 Test: [18300/21770] eta: 0:02:15 time: 0.0386 data: 0.0012 max mem: 33300 Test: [18400/21770] eta: 0:02:11 time: 0.0387 data: 0.0012 max mem: 33300 Test: [18500/21770] eta: 0:02:07 time: 0.0389 data: 0.0012 max mem: 33300 Test: [18600/21770] eta: 0:02:03 time: 0.0388 data: 0.0012 max mem: 33300 Test: [18700/21770] eta: 0:01:59 time: 0.0387 data: 0.0011 max mem: 33300 Test: [18800/21770] eta: 0:01:55 time: 0.0383 data: 0.0011 max mem: 33300 Test: [18900/21770] eta: 0:01:51 time: 0.0384 data: 0.0011 max mem: 33300 Test: [19000/21770] eta: 0:01:47 time: 0.0386 data: 0.0011 max mem: 33300 Test: [19100/21770] eta: 0:01:43 time: 0.0388 data: 0.0011 max mem: 33300 Test: [19200/21770] eta: 0:01:40 time: 0.0384 data: 0.0011 max mem: 33300 Test: [19300/21770] eta: 0:01:36 time: 0.0386 data: 0.0011 max mem: 33300 Test: [19400/21770] eta: 0:01:32 time: 0.0386 data: 0.0011 max mem: 33300 Test: [19500/21770] eta: 0:01:28 time: 0.0388 data: 0.0011 max mem: 33300 Test: [19600/21770] eta: 0:01:24 time: 0.0386 data: 0.0011 max mem: 33300 Test: [19700/21770] eta: 0:01:20 time: 0.0386 data: 0.0011 max mem: 33300 Test: [19800/21770] eta: 0:01:16 time: 0.0386 data: 0.0011 max mem: 33300 Test: [19900/21770] eta: 0:01:12 time: 0.0392 data: 0.0012 max mem: 33300 Test: [20000/21770] eta: 0:01:08 time: 0.0403 data: 0.0012 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0401 
data: 0.0011 max mem: 33300
Test: [20200/21770] eta: 0:01:01 time: 0.0401 data: 0.0012 max mem: 33300
Test: [20300/21770] eta: 0:00:57 time: 0.0397 data: 0.0011 max mem: 33300
Test: [20400/21770] eta: 0:00:53 time: 0.0394 data: 0.0011 max mem: 33300
Test: [20500/21770] eta: 0:00:49 time: 0.0387 data: 0.0011 max mem: 33300
Test: [20600/21770] eta: 0:00:45 time: 0.0390 data: 0.0011 max mem: 33300
Test: [20700/21770] eta: 0:00:41 time: 0.0391 data: 0.0011 max mem: 33300
Test: [20800/21770] eta: 0:00:37 time: 0.0388 data: 0.0011 max mem: 33300
Test: [20900/21770] eta: 0:00:33 time: 0.0399 data: 0.0012 max mem: 33300
Test: [21000/21770] eta: 0:00:29 time: 0.0399 data: 0.0012 max mem: 33300
Test: [21100/21770] eta: 0:00:26 time: 0.0395 data: 0.0011 max mem: 33300
Test: [21200/21770] eta: 0:00:22 time: 0.0390 data: 0.0011 max mem: 33300
Test: [21300/21770] eta: 0:00:18 time: 0.0393 data: 0.0012 max mem: 33300
Test: [21400/21770] eta: 0:00:14 time: 0.0390 data: 0.0012 max mem: 33300
Test: [21500/21770] eta: 0:00:10 time: 0.0393 data: 0.0011 max mem: 33300
Test: [21600/21770] eta: 0:00:06 time: 0.0392 data: 0.0012 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0398 data: 0.0011 max mem: 33300
Test: Total time: 0:14:08
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [7] [ 0/4276] eta: 6:39:41 lr: 4.2050947712996166e-05 loss: 0.1366 (0.1366) time: 5.6084 data: 2.2408 max mem: 33300
Epoch: [7] [ 10/4276] eta: 3:54:24 lr: 4.2048265635774724e-05 loss: 0.1860 (0.1817) time: 3.2970 data: 0.2112 max mem: 33300
Epoch: [7] [ 20/4276] eta: 3:45:18 lr: 4.204558353954447e-05 loss: 0.1812 (0.1836) time: 3.0548 data: 0.0078 max mem: 33300
Epoch: [7] [ 30/4276] eta: 3:40:50 lr: 4.2042901424303916e-05 loss: 0.1666 (0.1839) time: 3.0236 data: 0.0064 max mem: 33300
Epoch: [7] [ 40/4276] eta: 3:39:45 lr: 4.204021929005158e-05 loss: 0.1686 (0.1811) time: 3.0459 data: 0.0068 max mem: 33300 Epoch: [7] [ 50/4276] eta: 3:38:13 lr: 4.2037537136785995e-05 loss: 0.1728 (0.1785) time: 3.0636 data: 0.0076 max mem: 33300 Epoch: [7] [ 60/4276] eta: 3:36:45 lr: 4.203485496450566e-05 loss: 0.1647 (0.1780) time: 3.0275 data: 0.0072 max mem: 33300 Epoch: [7] [ 70/4276] eta: 3:35:53 lr: 4.20321727732091e-05 loss: 0.1629 (0.1770) time: 3.0326 data: 0.0068 max mem: 33300 Epoch: [7] [ 80/4276] eta: 3:35:19 lr: 4.2029490562894824e-05 loss: 0.1690 (0.1773) time: 3.0610 data: 0.0069 max mem: 33300 Epoch: [7] [ 90/4276] eta: 3:34:22 lr: 4.202680833356136e-05 loss: 0.1576 (0.1752) time: 3.0482 data: 0.0072 max mem: 33300 Epoch: [7] [ 100/4276] eta: 3:34:00 lr: 4.202412608520722e-05 loss: 0.1542 (0.1764) time: 3.0584 data: 0.0073 max mem: 33300 Epoch: [7] [ 110/4276] eta: 3:33:07 lr: 4.2021443817830905e-05 loss: 0.1619 (0.1779) time: 3.0545 data: 0.0071 max mem: 33300 Epoch: [7] [ 120/4276] eta: 3:32:44 lr: 4.2018761531430956e-05 loss: 0.1754 (0.1783) time: 3.0529 data: 0.0071 max mem: 33300 Epoch: [7] [ 130/4276] eta: 3:32:10 lr: 4.2016079226005875e-05 loss: 0.1804 (0.1795) time: 3.0761 data: 0.0075 max mem: 33300 Epoch: [7] [ 140/4276] eta: 3:31:30 lr: 4.2013396901554184e-05 loss: 0.1780 (0.1790) time: 3.0501 data: 0.0066 max mem: 33300 Epoch: [7] [ 150/4276] eta: 3:30:59 lr: 4.201071455807439e-05 loss: 0.1737 (0.1784) time: 3.0526 data: 0.0069 max mem: 33300 Epoch: [7] [ 160/4276] eta: 3:30:20 lr: 4.2008032195565016e-05 loss: 0.1755 (0.1784) time: 3.0524 data: 0.0070 max mem: 33300 Epoch: [7] [ 170/4276] eta: 3:30:00 lr: 4.200534981402458e-05 loss: 0.1836 (0.1793) time: 3.0735 data: 0.0066 max mem: 33300 Epoch: [7] [ 180/4276] eta: 3:29:19 lr: 4.2002667413451574e-05 loss: 0.1836 (0.1799) time: 3.0663 data: 0.0068 max mem: 33300 Epoch: [7] [ 190/4276] eta: 3:28:43 lr: 4.199998499384454e-05 loss: 0.1869 (0.1804) time: 3.0321 data: 0.0071 max mem: 
33300 Epoch: [7] [ 200/4276] eta: 3:28:04 lr: 4.199730255520197e-05 loss: 0.1952 (0.1818) time: 3.0331 data: 0.0072 max mem: 33300 Epoch: [7] [ 210/4276] eta: 3:27:39 lr: 4.19946200975224e-05 loss: 0.1893 (0.1817) time: 3.0586 data: 0.0077 max mem: 33300 Epoch: [7] [ 220/4276] eta: 3:27:06 lr: 4.1991937620804326e-05 loss: 0.1831 (0.1818) time: 3.0714 data: 0.0087 max mem: 33300 Epoch: [7] [ 230/4276] eta: 3:26:32 lr: 4.1989255125046266e-05 loss: 0.1718 (0.1810) time: 3.0470 data: 0.0079 max mem: 33300 Epoch: [7] [ 240/4276] eta: 3:25:59 lr: 4.198657261024675e-05 loss: 0.1690 (0.1815) time: 3.0458 data: 0.0072 max mem: 33300 Epoch: [7] [ 250/4276] eta: 3:25:25 lr: 4.198389007640426e-05 loss: 0.1815 (0.1820) time: 3.0463 data: 0.0072 max mem: 33300 Epoch: [7] [ 260/4276] eta: 3:24:57 lr: 4.198120752351733e-05 loss: 0.1818 (0.1822) time: 3.0623 data: 0.0076 max mem: 33300 Epoch: [7] [ 270/4276] eta: 3:24:19 lr: 4.197852495158447e-05 loss: 0.1866 (0.1826) time: 3.0434 data: 0.0072 max mem: 33300 Epoch: [7] [ 280/4276] eta: 3:23:52 lr: 4.1975842360604186e-05 loss: 0.1798 (0.1826) time: 3.0469 data: 0.0070 max mem: 33300 Epoch: [7] [ 290/4276] eta: 3:23:13 lr: 4.1973159750575e-05 loss: 0.1724 (0.1823) time: 3.0425 data: 0.0067 max mem: 33300 Epoch: [7] [ 300/4276] eta: 3:22:38 lr: 4.197047712149541e-05 loss: 0.1724 (0.1822) time: 3.0122 data: 0.0063 max mem: 33300 Epoch: [7] [ 310/4276] eta: 3:22:03 lr: 4.196779447336394e-05 loss: 0.1855 (0.1822) time: 3.0267 data: 0.0064 max mem: 33300 Epoch: [7] [ 320/4276] eta: 3:21:36 lr: 4.196511180617911e-05 loss: 0.1866 (0.1826) time: 3.0540 data: 0.0066 max mem: 33300 Epoch: [7] [ 330/4276] eta: 3:21:07 lr: 4.19624291199394e-05 loss: 0.1847 (0.1827) time: 3.0785 data: 0.0075 max mem: 33300 Epoch: [7] [ 340/4276] eta: 3:20:40 lr: 4.1959746414643346e-05 loss: 0.1772 (0.1824) time: 3.0833 data: 0.0074 max mem: 33300 Epoch: [7] [ 350/4276] eta: 3:20:16 lr: 4.195706369028946e-05 loss: 0.1670 (0.1823) time: 3.1044 data: 0.0076 max mem: 
33300 Epoch: [7] [ 360/4276] eta: 3:19:43 lr: 4.1954380946876236e-05 loss: 0.1847 (0.1827) time: 3.0789 data: 0.0074 max mem: 33300 Epoch: [7] [ 370/4276] eta: 3:19:18 lr: 4.1951698184402205e-05 loss: 0.1745 (0.1820) time: 3.0754 data: 0.0073 max mem: 33300 Epoch: [7] [ 380/4276] eta: 3:18:44 lr: 4.194901540286586e-05 loss: 0.1691 (0.1821) time: 3.0673 data: 0.0074 max mem: 33300 Epoch: [7] [ 390/4276] eta: 3:18:17 lr: 4.194633260226572e-05 loss: 0.1803 (0.1821) time: 3.0644 data: 0.0071 max mem: 33300 Epoch: [7] [ 400/4276] eta: 3:17:45 lr: 4.1943649782600286e-05 loss: 0.1798 (0.1821) time: 3.0715 data: 0.0069 max mem: 33300 Epoch: [7] [ 410/4276] eta: 3:17:17 lr: 4.1940966943868075e-05 loss: 0.1691 (0.1818) time: 3.0632 data: 0.0071 max mem: 33300 Epoch: [7] [ 420/4276] eta: 3:16:45 lr: 4.1938284086067595e-05 loss: 0.1723 (0.1818) time: 3.0718 data: 0.0076 max mem: 33300 Epoch: [7] [ 430/4276] eta: 3:16:11 lr: 4.193560120919735e-05 loss: 0.1847 (0.1820) time: 3.0353 data: 0.0067 max mem: 33300 Epoch: [7] [ 440/4276] eta: 3:15:41 lr: 4.1932918313255866e-05 loss: 0.1762 (0.1819) time: 3.0465 data: 0.0067 max mem: 33300 Epoch: [7] [ 450/4276] eta: 3:15:10 lr: 4.193023539824164e-05 loss: 0.1834 (0.1821) time: 3.0624 data: 0.0068 max mem: 33300 Epoch: [7] [ 460/4276] eta: 3:14:42 lr: 4.192755246415317e-05 loss: 0.1855 (0.1818) time: 3.0725 data: 0.0071 max mem: 33300 Epoch: [7] [ 470/4276] eta: 3:14:09 lr: 4.1924869510988974e-05 loss: 0.1686 (0.1816) time: 3.0633 data: 0.0070 max mem: 33300 Epoch: [7] [ 480/4276] eta: 3:13:44 lr: 4.192218653874757e-05 loss: 0.1660 (0.1812) time: 3.0788 data: 0.0074 max mem: 33300 Epoch: [7] [ 490/4276] eta: 3:13:13 lr: 4.1919503547427445e-05 loss: 0.1630 (0.1809) time: 3.0948 data: 0.0077 max mem: 33300 Epoch: [7] [ 500/4276] eta: 3:12:45 lr: 4.191682053702712e-05 loss: 0.1686 (0.1807) time: 3.0809 data: 0.0073 max mem: 33300 Epoch: [7] [ 510/4276] eta: 3:12:17 lr: 4.1914137507545104e-05 loss: 0.1686 (0.1806) time: 3.1015 data: 0.0075 
max mem: 33300 Epoch: [7] [ 520/4276] eta: 3:11:46 lr: 4.19114544589799e-05 loss: 0.1677 (0.1807) time: 3.0792 data: 0.0070 max mem: 33300 Epoch: [7] [ 530/4276] eta: 3:11:19 lr: 4.190877139133001e-05 loss: 0.1714 (0.1807) time: 3.0868 data: 0.0067 max mem: 33300 Epoch: [7] [ 540/4276] eta: 3:10:47 lr: 4.1906088304593945e-05 loss: 0.1664 (0.1805) time: 3.0749 data: 0.0063 max mem: 33300 Epoch: [7] [ 550/4276] eta: 3:10:20 lr: 4.1903405198770216e-05 loss: 0.1664 (0.1804) time: 3.0762 data: 0.0071 max mem: 33300 Epoch: [7] [ 560/4276] eta: 3:09:50 lr: 4.1900722073857326e-05 loss: 0.1794 (0.1807) time: 3.0981 data: 0.0075 max mem: 33300 Epoch: [7] [ 570/4276] eta: 3:09:19 lr: 4.189803892985377e-05 loss: 0.1762 (0.1807) time: 3.0735 data: 0.0074 max mem: 33300 Epoch: [7] [ 580/4276] eta: 3:08:53 lr: 4.189535576675807e-05 loss: 0.1680 (0.1804) time: 3.0998 data: 0.0080 max mem: 33300 Epoch: [7] [ 590/4276] eta: 3:08:22 lr: 4.189267258456872e-05 loss: 0.1636 (0.1801) time: 3.0938 data: 0.0076 max mem: 33300 Epoch: [7] [ 600/4276] eta: 3:07:54 lr: 4.188998938328424e-05 loss: 0.1639 (0.1801) time: 3.0889 data: 0.0076 max mem: 33300 Epoch: [7] [ 610/4276] eta: 3:07:24 lr: 4.1887306162903116e-05 loss: 0.1634 (0.1798) time: 3.1003 data: 0.0078 max mem: 33300 Epoch: [7] [ 620/4276] eta: 3:06:56 lr: 4.188462292342387e-05 loss: 0.1634 (0.1799) time: 3.0930 data: 0.0079 max mem: 33300 Epoch: [7] [ 630/4276] eta: 3:06:28 lr: 4.1881939664845e-05 loss: 0.1739 (0.1799) time: 3.1077 data: 0.0079 max mem: 33300 Epoch: [7] [ 640/4276] eta: 3:05:55 lr: 4.1879256387165e-05 loss: 0.1702 (0.1798) time: 3.0747 data: 0.0079 max mem: 33300 Epoch: [7] [ 650/4276] eta: 3:05:28 lr: 4.1876573090382384e-05 loss: 0.1694 (0.1798) time: 3.0866 data: 0.0083 max mem: 33300 Epoch: [7] [ 660/4276] eta: 3:04:57 lr: 4.1873889774495664e-05 loss: 0.1727 (0.1798) time: 3.0946 data: 0.0078 max mem: 33300 Epoch: [7] [ 670/4276] eta: 3:04:29 lr: 4.187120643950333e-05 loss: 0.1718 (0.1798) time: 3.0837 data: 
0.0077 max mem: 33300 Epoch: [7] [ 680/4276] eta: 3:03:59 lr: 4.1868523085403896e-05 loss: 0.1666 (0.1796) time: 3.1038 data: 0.0075 max mem: 33300 Epoch: [7] [ 690/4276] eta: 3:03:29 lr: 4.1865839712195854e-05 loss: 0.1754 (0.1795) time: 3.0856 data: 0.0070 max mem: 33300 Epoch: [7] [ 700/4276] eta: 3:03:01 lr: 4.186315631987771e-05 loss: 0.1639 (0.1793) time: 3.0988 data: 0.0072 max mem: 33300 Epoch: [7] [ 710/4276] eta: 3:02:30 lr: 4.186047290844798e-05 loss: 0.1708 (0.1794) time: 3.0919 data: 0.0067 max mem: 33300 Epoch: [7] [ 720/4276] eta: 3:02:00 lr: 4.185778947790515e-05 loss: 0.1709 (0.1791) time: 3.0781 data: 0.0068 max mem: 33300 Epoch: [7] [ 730/4276] eta: 3:01:28 lr: 4.1855106028247724e-05 loss: 0.1566 (0.1791) time: 3.0728 data: 0.0075 max mem: 33300 Epoch: [7] [ 740/4276] eta: 3:00:56 lr: 4.185242255947422e-05 loss: 0.1662 (0.1791) time: 3.0473 data: 0.0073 max mem: 33300 Epoch: [7] [ 750/4276] eta: 3:00:28 lr: 4.184973907158312e-05 loss: 0.1718 (0.1792) time: 3.0848 data: 0.0074 max mem: 33300 Epoch: [7] [ 760/4276] eta: 2:59:56 lr: 4.184705556457294e-05 loss: 0.1600 (0.1790) time: 3.0859 data: 0.0073 max mem: 33300 Epoch: [7] [ 770/4276] eta: 2:59:28 lr: 4.1844372038442176e-05 loss: 0.1614 (0.1790) time: 3.0836 data: 0.0075 max mem: 33300 Epoch: [7] [ 780/4276] eta: 2:58:57 lr: 4.184168849318933e-05 loss: 0.1626 (0.1790) time: 3.0906 data: 0.0078 max mem: 33300 Epoch: [7] [ 790/4276] eta: 2:58:26 lr: 4.18390049288129e-05 loss: 0.1719 (0.1790) time: 3.0673 data: 0.0073 max mem: 33300 Epoch: [7] [ 800/4276] eta: 2:57:57 lr: 4.1836321345311394e-05 loss: 0.1682 (0.1789) time: 3.0850 data: 0.0074 max mem: 33300 Epoch: [7] [ 810/4276] eta: 2:57:24 lr: 4.18336377426833e-05 loss: 0.1620 (0.1789) time: 3.0659 data: 0.0071 max mem: 33300 Epoch: [7] [ 820/4276] eta: 2:56:56 lr: 4.183095412092713e-05 loss: 0.1628 (0.1787) time: 3.0848 data: 0.0073 max mem: 33300 Epoch: [7] [ 830/4276] eta: 2:56:24 lr: 4.182827048004138e-05 loss: 0.1628 (0.1787) time: 3.0872 
data: 0.0072 max mem: 33300 Epoch: [7] [ 840/4276] eta: 2:55:52 lr: 4.182558682002455e-05 loss: 0.1722 (0.1788) time: 3.0436 data: 0.0074 max mem: 33300 Epoch: [7] [ 850/4276] eta: 2:55:22 lr: 4.182290314087514e-05 loss: 0.1749 (0.1788) time: 3.0586 data: 0.0076 max mem: 33300 Epoch: [7] [ 860/4276] eta: 2:54:54 lr: 4.182021944259166e-05 loss: 0.1690 (0.1788) time: 3.1043 data: 0.0073 max mem: 33300 Epoch: [7] [ 870/4276] eta: 2:54:25 lr: 4.181753572517259e-05 loss: 0.1764 (0.1789) time: 3.1368 data: 0.0079 max mem: 33300 Epoch: [7] [ 880/4276] eta: 2:53:56 lr: 4.181485198861643e-05 loss: 0.1786 (0.1790) time: 3.1171 data: 0.0077 max mem: 33300 Epoch: [7] [ 890/4276] eta: 2:53:30 lr: 4.181216823292169e-05 loss: 0.1867 (0.1793) time: 3.1488 data: 0.0083 max mem: 33300 Epoch: [7] [ 900/4276] eta: 2:53:01 lr: 4.180948445808687e-05 loss: 0.1997 (0.1794) time: 3.1584 data: 0.0088 max mem: 33300 Epoch: [7] [ 910/4276] eta: 2:52:32 lr: 4.180680066411047e-05 loss: 0.1853 (0.1794) time: 3.1267 data: 0.0080 max mem: 33300 Epoch: [7] [ 920/4276] eta: 2:52:05 lr: 4.1804116850990975e-05 loss: 0.1835 (0.1796) time: 3.1542 data: 0.0080 max mem: 33300 Epoch: [7] [ 930/4276] eta: 2:51:35 lr: 4.180143301872689e-05 loss: 0.1826 (0.1796) time: 3.1420 data: 0.0083 max mem: 33300 Epoch: [7] [ 940/4276] eta: 2:51:08 lr: 4.179874916731671e-05 loss: 0.1693 (0.1794) time: 3.1463 data: 0.0083 max mem: 33300 Epoch: [7] [ 950/4276] eta: 2:50:39 lr: 4.1796065296758944e-05 loss: 0.1641 (0.1794) time: 3.1551 data: 0.0085 max mem: 33300 Epoch: [7] [ 960/4276] eta: 2:50:10 lr: 4.179338140705207e-05 loss: 0.1721 (0.1795) time: 3.1270 data: 0.0085 max mem: 33300 Epoch: [7] [ 970/4276] eta: 2:49:44 lr: 4.179069749819459e-05 loss: 0.1695 (0.1795) time: 3.1657 data: 0.0084 max mem: 33300 Epoch: [7] [ 980/4276] eta: 2:49:13 lr: 4.1788013570185015e-05 loss: 0.1690 (0.1796) time: 3.1487 data: 0.0084 max mem: 33300 Epoch: [7] [ 990/4276] eta: 2:48:46 lr: 4.178532962302184e-05 loss: 0.1722 (0.1796) time: 
3.1362 data: 0.0083 max mem: 33300 Epoch: [7] [1000/4276] eta: 2:48:18 lr: 4.178264565670353e-05 loss: 0.1671 (0.1795) time: 3.1785 data: 0.0083 max mem: 33300 Epoch: [7] [1010/4276] eta: 2:47:46 lr: 4.1779961671228626e-05 loss: 0.1688 (0.1795) time: 3.1057 data: 0.0086 max mem: 33300 Epoch: [7] [1020/4276] eta: 2:47:17 lr: 4.177727766659559e-05 loss: 0.1755 (0.1794) time: 3.0903 data: 0.0081 max mem: 33300 Epoch: [7] [1030/4276] eta: 2:46:49 lr: 4.177459364280294e-05 loss: 0.1787 (0.1796) time: 3.1593 data: 0.0077 max mem: 33300 Epoch: [7] [1040/4276] eta: 2:46:20 lr: 4.177190959984916e-05 loss: 0.1724 (0.1796) time: 3.1566 data: 0.0090 max mem: 33300 Epoch: [7] [1050/4276] eta: 2:45:51 lr: 4.1769225537732736e-05 loss: 0.1659 (0.1797) time: 3.1422 data: 0.0094 max mem: 33300 Epoch: [7] [1060/4276] eta: 2:45:23 lr: 4.1766541456452184e-05 loss: 0.1850 (0.1797) time: 3.1640 data: 0.0093 max mem: 33300 Epoch: [7] [1070/4276] eta: 2:44:55 lr: 4.176385735600599e-05 loss: 0.1714 (0.1797) time: 3.1879 data: 0.0090 max mem: 33300 Epoch: [7] [1080/4276] eta: 2:44:25 lr: 4.176117323639264e-05 loss: 0.1714 (0.1795) time: 3.1528 data: 0.0087 max mem: 33300 Epoch: [7] [1090/4276] eta: 2:43:57 lr: 4.175848909761064e-05 loss: 0.1745 (0.1795) time: 3.1422 data: 0.0091 max mem: 33300 Epoch: [7] [1100/4276] eta: 2:43:29 lr: 4.175580493965848e-05 loss: 0.1729 (0.1796) time: 3.1820 data: 0.0088 max mem: 33300 Epoch: [7] [1110/4276] eta: 2:42:58 lr: 4.1753120762534655e-05 loss: 0.1761 (0.1796) time: 3.1405 data: 0.0084 max mem: 33300 Epoch: [7] [1120/4276] eta: 2:42:31 lr: 4.175043656623765e-05 loss: 0.1761 (0.1797) time: 3.1509 data: 0.0087 max mem: 33300 Epoch: [7] [1130/4276] eta: 2:42:02 lr: 4.174775235076597e-05 loss: 0.1701 (0.1795) time: 3.1899 data: 0.0086 max mem: 33300 Epoch: [7] [1140/4276] eta: 2:41:30 lr: 4.17450681161181e-05 loss: 0.1590 (0.1793) time: 3.1067 data: 0.0081 max mem: 33300 Epoch: [7] [1150/4276] eta: 2:41:03 lr: 4.174238386229254e-05 loss: 0.1636 (0.1793) 
time: 3.1469 data: 0.0089 max mem: 33300 Epoch: [7] [1160/4276] eta: 2:40:33 lr: 4.1739699589287774e-05 loss: 0.1665 (0.1793) time: 3.1862 data: 0.0097 max mem: 33300 Epoch: [7] [1170/4276] eta: 2:40:03 lr: 4.173701529710231e-05 loss: 0.1814 (0.1793) time: 3.1173 data: 0.0090 max mem: 33300 Epoch: [7] [1180/4276] eta: 2:39:36 lr: 4.173433098573462e-05 loss: 0.1809 (0.1792) time: 3.1825 data: 0.0088 max mem: 33300 Epoch: [7] [1190/4276] eta: 2:39:05 lr: 4.173164665518321e-05 loss: 0.1597 (0.1791) time: 3.1740 data: 0.0091 max mem: 33300 Epoch: [7] [1200/4276] eta: 2:38:36 lr: 4.172896230544656e-05 loss: 0.1701 (0.1791) time: 3.1278 data: 0.0090 max mem: 33300 Epoch: [7] [1210/4276] eta: 2:38:08 lr: 4.1726277936523175e-05 loss: 0.1572 (0.1791) time: 3.1953 data: 0.0087 max mem: 33300 Epoch: [7] [1220/4276] eta: 2:37:37 lr: 4.172359354841155e-05 loss: 0.1644 (0.1790) time: 3.1480 data: 0.0080 max mem: 33300 Epoch: [7] [1230/4276] eta: 2:37:08 lr: 4.172090914111016e-05 loss: 0.1734 (0.1791) time: 3.1253 data: 0.0084 max mem: 33300 Epoch: [7] [1240/4276] eta: 2:36:40 lr: 4.17182247146175e-05 loss: 0.1770 (0.1790) time: 3.1977 data: 0.0091 max mem: 33300 Epoch: [7] [1250/4276] eta: 2:36:09 lr: 4.171554026893206e-05 loss: 0.1732 (0.1790) time: 3.1614 data: 0.0085 max mem: 33300 Epoch: [7] [1260/4276] eta: 2:35:40 lr: 4.171285580405234e-05 loss: 0.1581 (0.1788) time: 3.1389 data: 0.0082 max mem: 33300 Epoch: [7] [1270/4276] eta: 2:35:11 lr: 4.1710171319976826e-05 loss: 0.1694 (0.1788) time: 3.1662 data: 0.0083 max mem: 33300 Epoch: [7] [1280/4276] eta: 2:34:41 lr: 4.1707486816704e-05 loss: 0.1761 (0.1789) time: 3.1637 data: 0.0085 max mem: 33300 Epoch: [7] [1290/4276] eta: 2:34:11 lr: 4.170480229423236e-05 loss: 0.1794 (0.1790) time: 3.1491 data: 0.0087 max mem: 33300 Epoch: [7] [1300/4276] eta: 2:33:41 lr: 4.1702117752560396e-05 loss: 0.1789 (0.1789) time: 3.1456 data: 0.0082 max mem: 33300 Epoch: [7] [1310/4276] eta: 2:33:11 lr: 4.1699433191686584e-05 loss: 0.1569 
(0.1788) time: 3.1333 data: 0.0078 max mem: 33300 Epoch: [7] [1320/4276] eta: 2:32:38 lr: 4.169674861160944e-05 loss: 0.1718 (0.1788) time: 3.0671 data: 0.0076 max mem: 33300 Epoch: [7] [1330/4276] eta: 2:32:09 lr: 4.1694064012327425e-05 loss: 0.1726 (0.1788) time: 3.1108 data: 0.0084 max mem: 33300 Epoch: [7] [1340/4276] eta: 2:31:40 lr: 4.169137939383905e-05 loss: 0.1604 (0.1786) time: 3.1772 data: 0.0090 max mem: 33300 Epoch: [7] [1350/4276] eta: 2:31:08 lr: 4.1688694756142785e-05 loss: 0.1667 (0.1786) time: 3.1140 data: 0.0092 max mem: 33300 Epoch: [7] [1360/4276] eta: 2:30:40 lr: 4.168601009923712e-05 loss: 0.1760 (0.1787) time: 3.1654 data: 0.0094 max mem: 33300 Epoch: [7] [1370/4276] eta: 2:30:10 lr: 4.168332542312055e-05 loss: 0.1594 (0.1786) time: 3.1821 data: 0.0086 max mem: 33300 Epoch: [7] [1380/4276] eta: 2:29:38 lr: 4.168064072779158e-05 loss: 0.1773 (0.1787) time: 3.0942 data: 0.0080 max mem: 33300 Epoch: [7] [1390/4276] eta: 2:29:08 lr: 4.167795601324866e-05 loss: 0.1870 (0.1788) time: 3.1014 data: 0.0081 max mem: 33300 Epoch: [7] [1400/4276] eta: 2:28:36 lr: 4.1675271279490296e-05 loss: 0.1956 (0.1788) time: 3.0871 data: 0.0084 max mem: 33300 Epoch: [7] [1410/4276] eta: 2:28:04 lr: 4.167258652651498e-05 loss: 0.1765 (0.1789) time: 3.0570 data: 0.0083 max mem: 33300 Epoch: [7] [1420/4276] eta: 2:27:32 lr: 4.16699017543212e-05 loss: 0.1791 (0.1790) time: 3.0667 data: 0.0081 max mem: 33300 Epoch: [7] [1430/4276] eta: 2:27:00 lr: 4.166721696290743e-05 loss: 0.1641 (0.1790) time: 3.0455 data: 0.0086 max mem: 33300 Epoch: [7] [1440/4276] eta: 2:26:27 lr: 4.1664532152272165e-05 loss: 0.1747 (0.1791) time: 3.0235 data: 0.0093 max mem: 33300 Epoch: [7] [1450/4276] eta: 2:25:56 lr: 4.166184732241389e-05 loss: 0.1714 (0.1790) time: 3.0702 data: 0.0087 max mem: 33300 Epoch: [7] [1460/4276] eta: 2:25:24 lr: 4.165916247333109e-05 loss: 0.1707 (0.1790) time: 3.0588 data: 0.0081 max mem: 33300 Epoch: [7] [1470/4276] eta: 2:24:52 lr: 4.165647760502225e-05 loss: 
0.1735 (0.1790) time: 3.0323 data: 0.0080 max mem: 33300 Epoch: [7] [1480/4276] eta: 2:24:21 lr: 4.1653792717485854e-05 loss: 0.1719 (0.1789) time: 3.0832 data: 0.0074 max mem: 33300 Epoch: [7] [1490/4276] eta: 2:23:49 lr: 4.16511078107204e-05 loss: 0.1623 (0.1789) time: 3.0681 data: 0.0078 max mem: 33300 Epoch: [7] [1500/4276] eta: 2:23:17 lr: 4.164842288472435e-05 loss: 0.1712 (0.1788) time: 3.0359 data: 0.0086 max mem: 33300 Epoch: [7] [1510/4276] eta: 2:22:46 lr: 4.1645737939496203e-05 loss: 0.1675 (0.1788) time: 3.0785 data: 0.0092 max mem: 33300 Epoch: [7] [1520/4276] eta: 2:22:14 lr: 4.164305297503444e-05 loss: 0.1691 (0.1789) time: 3.0778 data: 0.0089 max mem: 33300 Epoch: [7] [1530/4276] eta: 2:21:42 lr: 4.164036799133756e-05 loss: 0.1760 (0.1788) time: 3.0375 data: 0.0085 max mem: 33300 Epoch: [7] [1540/4276] eta: 2:21:11 lr: 4.163768298840402e-05 loss: 0.1640 (0.1789) time: 3.0611 data: 0.0089 max mem: 33300 Epoch: [7] [1550/4276] eta: 2:20:39 lr: 4.1634997966232325e-05 loss: 0.1707 (0.1789) time: 3.0501 data: 0.0087 max mem: 33300 Epoch: [7] [1560/4276] eta: 2:20:06 lr: 4.1632312924820954e-05 loss: 0.1742 (0.1789) time: 3.0037 data: 0.0081 max mem: 33300 Epoch: [7] [1570/4276] eta: 2:19:35 lr: 4.162962786416838e-05 loss: 0.1678 (0.1789) time: 3.0437 data: 0.0080 max mem: 33300 Epoch: [7] [1580/4276] eta: 2:19:03 lr: 4.16269427842731e-05 loss: 0.1575 (0.1788) time: 3.0643 data: 0.0088 max mem: 33300 Epoch: [7] [1590/4276] eta: 2:18:30 lr: 4.162425768513359e-05 loss: 0.1661 (0.1788) time: 3.0172 data: 0.0088 max mem: 33300 Epoch: [7] [1600/4276] eta: 2:17:59 lr: 4.162157256674833e-05 loss: 0.1755 (0.1787) time: 3.0396 data: 0.0082 max mem: 33300 Epoch: [7] [1610/4276] eta: 2:17:27 lr: 4.161888742911581e-05 loss: 0.1454 (0.1786) time: 3.0519 data: 0.0083 max mem: 33300 Epoch: [7] [1620/4276] eta: 2:16:54 lr: 4.161620227223451e-05 loss: 0.1482 (0.1785) time: 2.9860 data: 0.0084 max mem: 33300 Epoch: [7] [1630/4276] eta: 2:16:21 lr: 4.1613517096102905e-05 
loss: 0.1749 (0.1786) time: 2.9486 data: 0.0079 max mem: 33300 Epoch: [7] [1640/4276] eta: 2:15:47 lr: 4.161083190071949e-05 loss: 0.1825 (0.1786) time: 2.9510 data: 0.0074 max mem: 33300 Epoch: [7] [1650/4276] eta: 2:15:15 lr: 4.160814668608273e-05 loss: 0.1643 (0.1785) time: 2.9600 data: 0.0077 max mem: 33300 Epoch: [7] [1660/4276] eta: 2:14:42 lr: 4.160546145219113e-05 loss: 0.1756 (0.1785) time: 2.9641 data: 0.0076 max mem: 33300 Epoch: [7] [1670/4276] eta: 2:14:09 lr: 4.160277619904314e-05 loss: 0.1838 (0.1785) time: 2.9632 data: 0.0073 max mem: 33300 Epoch: [7] [1680/4276] eta: 2:13:36 lr: 4.160009092663727e-05 loss: 0.1838 (0.1786) time: 2.9607 data: 0.0072 max mem: 33300 Epoch: [7] [1690/4276] eta: 2:13:03 lr: 4.1597405634971984e-05 loss: 0.1818 (0.1786) time: 2.9564 data: 0.0073 max mem: 33300 Epoch: [7] [1700/4276] eta: 2:12:30 lr: 4.159472032404576e-05 loss: 0.1904 (0.1787) time: 2.9567 data: 0.0075 max mem: 33300 Epoch: [7] [1710/4276] eta: 2:11:57 lr: 4.1592034993857093e-05 loss: 0.1944 (0.1788) time: 2.9610 data: 0.0075 max mem: 33300 Epoch: [7] [1720/4276] eta: 2:11:25 lr: 4.1589349644404444e-05 loss: 0.1890 (0.1789) time: 2.9598 data: 0.0072 max mem: 33300 Epoch: [7] [1730/4276] eta: 2:10:52 lr: 4.158666427568631e-05 loss: 0.1795 (0.1789) time: 2.9586 data: 0.0072 max mem: 33300 Epoch: [7] [1740/4276] eta: 2:10:19 lr: 4.158397888770116e-05 loss: 0.1824 (0.1790) time: 2.9636 data: 0.0074 max mem: 33300 Epoch: [7] [1750/4276] eta: 2:09:47 lr: 4.1581293480447484e-05 loss: 0.1841 (0.1790) time: 2.9655 data: 0.0075 max mem: 33300 Epoch: [7] [1760/4276] eta: 2:09:14 lr: 4.1578608053923754e-05 loss: 0.1794 (0.1789) time: 2.9538 data: 0.0082 max mem: 33300 Epoch: [7] [1770/4276] eta: 2:08:41 lr: 4.1575922608128446e-05 loss: 0.1779 (0.1790) time: 2.9249 data: 0.0088 max mem: 33300 Epoch: [7] [1780/4276] eta: 2:08:08 lr: 4.157323714306004e-05 loss: 0.1831 (0.1790) time: 2.9116 data: 0.0086 max mem: 33300 Epoch: [7] [1790/4276] eta: 2:07:35 lr: 
4.157055165871702e-05 loss: 0.1657 (0.1790) time: 2.9427 data: 0.0083 max mem: 33300 Epoch: [7] [1800/4276] eta: 2:07:03 lr: 4.1567866155097864e-05 loss: 0.1677 (0.1790) time: 2.9564 data: 0.0086 max mem: 33300 Epoch: [7] [1810/4276] eta: 2:06:30 lr: 4.1565180632201036e-05 loss: 0.1913 (0.1791) time: 2.9550 data: 0.0090 max mem: 33300 Epoch: [7] [1820/4276] eta: 2:05:58 lr: 4.156249509002503e-05 loss: 0.1771 (0.1790) time: 2.9626 data: 0.0083 max mem: 33300 Epoch: [7] [1830/4276] eta: 2:05:26 lr: 4.155980952856831e-05 loss: 0.1564 (0.1790) time: 2.9722 data: 0.0080 max mem: 33300 Epoch: [7] [1840/4276] eta: 2:04:54 lr: 4.1557123947829365e-05 loss: 0.1595 (0.1790) time: 2.9749 data: 0.0085 max mem: 33300 Epoch: [7] [1850/4276] eta: 2:04:21 lr: 4.155443834780667e-05 loss: 0.1606 (0.1790) time: 2.9620 data: 0.0085 max mem: 33300 Epoch: [7] [1860/4276] eta: 2:03:49 lr: 4.1551752728498695e-05 loss: 0.1760 (0.1790) time: 2.9589 data: 0.0081 max mem: 33300 Epoch: [7] [1870/4276] eta: 2:03:17 lr: 4.154906708990392e-05 loss: 0.1760 (0.1792) time: 2.9620 data: 0.0080 max mem: 33300 Epoch: [7] [1880/4276] eta: 2:02:45 lr: 4.1546381432020834e-05 loss: 0.1704 (0.1791) time: 2.9639 data: 0.0077 max mem: 33300 Epoch: [7] [1890/4276] eta: 2:02:13 lr: 4.1543695754847885e-05 loss: 0.1630 (0.1791) time: 2.9634 data: 0.0075 max mem: 33300 Epoch: [7] [1900/4276] eta: 2:01:41 lr: 4.154101005838357e-05 loss: 0.1531 (0.1790) time: 2.9651 data: 0.0077 max mem: 33300 Epoch: [7] [1910/4276] eta: 2:01:09 lr: 4.153832434262635e-05 loss: 0.1637 (0.1790) time: 2.9748 data: 0.0082 max mem: 33300 Epoch: [7] [1920/4276] eta: 2:00:37 lr: 4.1535638607574734e-05 loss: 0.1722 (0.1789) time: 2.9740 data: 0.0080 max mem: 33300 Epoch: [7] [1930/4276] eta: 2:00:05 lr: 4.1532952853227156e-05 loss: 0.1661 (0.1790) time: 2.9693 data: 0.0076 max mem: 33300 Epoch: [7] [1940/4276] eta: 1:59:33 lr: 4.153026707958211e-05 loss: 0.1936 (0.1790) time: 2.9721 data: 0.0075 max mem: 33300 Epoch: [7] [1950/4276] eta: 
1:59:01 lr: 4.152758128663807e-05 loss: 0.1715 (0.1790) time: 2.9665 data: 0.0077 max mem: 33300 Epoch: [7] [1960/4276] eta: 1:58:28 lr: 4.1524895474393513e-05 loss: 0.1605 (0.1790) time: 2.9473 data: 0.0087 max mem: 33300 Epoch: [7] [1970/4276] eta: 1:57:57 lr: 4.15222096428469e-05 loss: 0.1644 (0.1789) time: 2.9491 data: 0.0088 max mem: 33300 Epoch: [7] [1980/4276] eta: 1:57:25 lr: 4.1519523791996715e-05 loss: 0.1560 (0.1788) time: 2.9642 data: 0.0080 max mem: 33300 Epoch: [7] [1990/4276] eta: 1:56:53 lr: 4.151683792184144e-05 loss: 0.1758 (0.1789) time: 2.9613 data: 0.0077 max mem: 33300 Epoch: [7] [2000/4276] eta: 1:56:21 lr: 4.151415203237954e-05 loss: 0.1912 (0.1789) time: 2.9610 data: 0.0075 max mem: 33300 Epoch: [7] [2010/4276] eta: 1:55:49 lr: 4.1511466123609474e-05 loss: 0.1808 (0.1789) time: 2.9638 data: 0.0075 max mem: 33300 Epoch: [7] [2020/4276] eta: 1:55:17 lr: 4.1508780195529746e-05 loss: 0.1781 (0.1789) time: 2.9583 data: 0.0076 max mem: 33300 Epoch: [7] [2030/4276] eta: 1:54:45 lr: 4.15060942481388e-05 loss: 0.1723 (0.1788) time: 2.9259 data: 0.0076 max mem: 33300 Epoch: [7] [2040/4276] eta: 1:54:13 lr: 4.150340828143512e-05 loss: 0.1586 (0.1787) time: 2.9306 data: 0.0074 max mem: 33300 Epoch: [7] [2050/4276] eta: 1:53:41 lr: 4.150072229541718e-05 loss: 0.1742 (0.1788) time: 2.9600 data: 0.0077 max mem: 33300 Epoch: [7] [2060/4276] eta: 1:53:09 lr: 4.149803629008346e-05 loss: 0.1773 (0.1788) time: 2.9522 data: 0.0082 max mem: 33300 Epoch: [7] [2070/4276] eta: 1:52:37 lr: 4.1495350265432417e-05 loss: 0.1714 (0.1787) time: 2.9272 data: 0.0080 max mem: 33300 Epoch: [7] [2080/4276] eta: 1:52:05 lr: 4.149266422146253e-05 loss: 0.1759 (0.1787) time: 2.9335 data: 0.0076 max mem: 33300 Epoch: [7] [2090/4276] eta: 1:51:33 lr: 4.148997815817226e-05 loss: 0.1759 (0.1787) time: 2.9622 data: 0.0073 max mem: 33300 Epoch: [7] [2100/4276] eta: 1:51:02 lr: 4.1487292075560106e-05 loss: 0.1738 (0.1787) time: 2.9739 data: 0.0082 max mem: 33300 Epoch: [7] [2110/4276] 
eta: 1:50:30 lr: 4.14846059736245e-05 loss: 0.1577 (0.1786) time: 2.9699 data: 0.0082 max mem: 33300 Epoch: [7] [2120/4276] eta: 1:49:59 lr: 4.148191985236394e-05 loss: 0.1466 (0.1785) time: 2.9577 data: 0.0076 max mem: 33300 Epoch: [7] [2130/4276] eta: 1:49:26 lr: 4.1479233711776894e-05 loss: 0.1479 (0.1784) time: 2.9310 data: 0.0079 max mem: 33300 Epoch: [7] [2140/4276] eta: 1:48:55 lr: 4.147654755186182e-05 loss: 0.1750 (0.1784) time: 2.9245 data: 0.0085 max mem: 33300 Epoch: [7] [2150/4276] eta: 1:48:23 lr: 4.147386137261721e-05 loss: 0.1747 (0.1784) time: 2.9524 data: 0.0088 max mem: 33300 Epoch: [7] [2160/4276] eta: 1:47:51 lr: 4.14711751740415e-05 loss: 0.1739 (0.1783) time: 2.9612 data: 0.0081 max mem: 33300 Epoch: [7] [2170/4276] eta: 1:47:20 lr: 4.1468488956133195e-05 loss: 0.1772 (0.1784) time: 2.9610 data: 0.0080 max mem: 33300 Epoch: [7] [2180/4276] eta: 1:46:48 lr: 4.1465802718890744e-05 loss: 0.1846 (0.1784) time: 2.9623 data: 0.0083 max mem: 33300 Epoch: [7] [2190/4276] eta: 1:46:17 lr: 4.1463116462312626e-05 loss: 0.1740 (0.1784) time: 2.9585 data: 0.0080 max mem: 33300 Epoch: [7] [2200/4276] eta: 1:45:45 lr: 4.1460430186397294e-05 loss: 0.1733 (0.1785) time: 2.9563 data: 0.0083 max mem: 33300 Epoch: [7] [2210/4276] eta: 1:45:14 lr: 4.145774389114323e-05 loss: 0.1778 (0.1785) time: 2.9578 data: 0.0080 max mem: 33300 Epoch: [7] [2220/4276] eta: 1:44:42 lr: 4.1455057576548904e-05 loss: 0.1778 (0.1785) time: 2.9300 data: 0.0075 max mem: 33300 Epoch: [7] [2230/4276] eta: 1:44:10 lr: 4.145237124261278e-05 loss: 0.1736 (0.1784) time: 2.9302 data: 0.0081 max mem: 33300 Epoch: [7] [2240/4276] eta: 1:43:39 lr: 4.1449684889333326e-05 loss: 0.1609 (0.1783) time: 2.9643 data: 0.0082 max mem: 33300 Epoch: [7] [2250/4276] eta: 1:43:08 lr: 4.144699851670901e-05 loss: 0.1589 (0.1783) time: 2.9694 data: 0.0083 max mem: 33300 Epoch: [7] [2260/4276] eta: 1:42:36 lr: 4.1444312124738305e-05 loss: 0.1743 (0.1783) time: 2.9641 data: 0.0085 max mem: 33300 Epoch: [7] 
[2270/4276] eta: 1:42:05 lr: 4.144162571341966e-05 loss: 0.1743 (0.1783) time: 2.9625 data: 0.0087 max mem: 33300 Epoch: [7] [2280/4276] eta: 1:41:34 lr: 4.1438939282751565e-05 loss: 0.1638 (0.1783) time: 2.9629 data: 0.0087 max mem: 33300 Epoch: [7] [2290/4276] eta: 1:41:03 lr: 4.143625283273247e-05 loss: 0.1719 (0.1783) time: 2.9682 data: 0.0083 max mem: 33300 Epoch: [7] [2300/4276] eta: 1:40:31 lr: 4.1433566363360854e-05 loss: 0.1644 (0.1782) time: 2.9753 data: 0.0083 max mem: 33300 Epoch: [7] [2310/4276] eta: 1:40:00 lr: 4.1430879874635175e-05 loss: 0.1596 (0.1782) time: 2.9700 data: 0.0085 max mem: 33300 Epoch: [7] [2320/4276] eta: 1:39:29 lr: 4.14281933665539e-05 loss: 0.1676 (0.1781) time: 2.9633 data: 0.0081 max mem: 33300 Epoch: [7] [2330/4276] eta: 1:38:58 lr: 4.142550683911551e-05 loss: 0.1706 (0.1782) time: 2.9620 data: 0.0076 max mem: 33300 Epoch: [7] [2340/4276] eta: 1:38:26 lr: 4.142282029231844e-05 loss: 0.1706 (0.1782) time: 2.9665 data: 0.0076 max mem: 33300 Epoch: [7] [2350/4276] eta: 1:37:55 lr: 4.142013372616118e-05 loss: 0.1703 (0.1781) time: 2.9659 data: 0.0075 max mem: 33300 Epoch: [7] [2360/4276] eta: 1:37:24 lr: 4.141744714064219e-05 loss: 0.1695 (0.1781) time: 2.9589 data: 0.0077 max mem: 33300 Epoch: [7] [2370/4276] eta: 1:36:53 lr: 4.141476053575992e-05 loss: 0.1699 (0.1781) time: 2.9662 data: 0.0082 max mem: 33300 Epoch: [7] [2380/4276] eta: 1:36:22 lr: 4.1412073911512865e-05 loss: 0.1788 (0.1781) time: 2.9698 data: 0.0080 max mem: 33300 Epoch: [7] [2390/4276] eta: 1:35:50 lr: 4.140938726789946e-05 loss: 0.1700 (0.1780) time: 2.9685 data: 0.0075 max mem: 33300 Epoch: [7] [2400/4276] eta: 1:35:19 lr: 4.140670060491819e-05 loss: 0.1730 (0.1781) time: 2.9693 data: 0.0075 max mem: 33300 Epoch: [7] [2410/4276] eta: 1:34:48 lr: 4.140401392256751e-05 loss: 0.1625 (0.1780) time: 2.9636 data: 0.0078 max mem: 33300 Epoch: [7] [2420/4276] eta: 1:34:17 lr: 4.1401327220845884e-05 loss: 0.1517 (0.1779) time: 2.9656 data: 0.0078 max mem: 33300 Epoch: 
[7] [2430/4276] eta: 1:33:46 lr: 4.139864049975177e-05 loss: 0.1577 (0.1780) time: 2.9651 data: 0.0074 max mem: 33300 Epoch: [7] [2440/4276] eta: 1:33:15 lr: 4.139595375928364e-05 loss: 0.1681 (0.1780) time: 2.9656 data: 0.0075 max mem: 33300 Epoch: [7] [2450/4276] eta: 1:32:44 lr: 4.1393266999439954e-05 loss: 0.1679 (0.1779) time: 2.9708 data: 0.0077 max mem: 33300 Epoch: [7] [2460/4276] eta: 1:32:13 lr: 4.139058022021918e-05 loss: 0.1727 (0.1779) time: 2.9675 data: 0.0076 max mem: 33300 Epoch: [7] [2470/4276] eta: 1:31:42 lr: 4.1387893421619775e-05 loss: 0.1727 (0.1779) time: 2.9735 data: 0.0074 max mem: 33300 Epoch: [7] [2480/4276] eta: 1:31:11 lr: 4.13852066036402e-05 loss: 0.1711 (0.1779) time: 2.9670 data: 0.0076 max mem: 33300 Epoch: [7] [2490/4276] eta: 1:30:40 lr: 4.1382519766278924e-05 loss: 0.1756 (0.1779) time: 2.9620 data: 0.0079 max mem: 33300 Epoch: [7] [2500/4276] eta: 1:30:09 lr: 4.13798329095344e-05 loss: 0.1756 (0.1779) time: 2.9709 data: 0.0077 max mem: 33300 Epoch: [7] [2510/4276] eta: 1:29:38 lr: 4.137714603340509e-05 loss: 0.1760 (0.1779) time: 2.9671 data: 0.0077 max mem: 33300 Epoch: [7] [2520/4276] eta: 1:29:07 lr: 4.1374459137889465e-05 loss: 0.1610 (0.1778) time: 2.9622 data: 0.0077 max mem: 33300 Epoch: [7] [2530/4276] eta: 1:28:36 lr: 4.137177222298598e-05 loss: 0.1376 (0.1777) time: 2.9683 data: 0.0075 max mem: 33300 Epoch: [7] [2540/4276] eta: 1:28:05 lr: 4.1369085288693097e-05 loss: 0.1541 (0.1776) time: 2.9672 data: 0.0076 max mem: 33300 Epoch: [7] [2550/4276] eta: 1:27:34 lr: 4.136639833500928e-05 loss: 0.1565 (0.1776) time: 2.9613 data: 0.0075 max mem: 33300 Epoch: [7] [2560/4276] eta: 1:27:03 lr: 4.1363711361932986e-05 loss: 0.1477 (0.1775) time: 2.9698 data: 0.0075 max mem: 33300 Epoch: [7] [2570/4276] eta: 1:26:31 lr: 4.136102436946268e-05 loss: 0.1348 (0.1775) time: 2.9509 data: 0.0077 max mem: 33300 Epoch: [7] [2580/4276] eta: 1:26:01 lr: 4.135833735759681e-05 loss: 0.1589 (0.1775) time: 2.9484 data: 0.0083 max mem: 33300 
Epoch: [7] [2590/4276] eta: 1:25:30 lr: 4.135565032633384e-05 loss: 0.1593 (0.1774) time: 2.9660 data: 0.0083 max mem: 33300 Epoch: [7] [2600/4276] eta: 1:24:59 lr: 4.1352963275672233e-05 loss: 0.1642 (0.1774) time: 2.9691 data: 0.0079 max mem: 33300 Epoch: [7] [2610/4276] eta: 1:24:28 lr: 4.1350276205610455e-05 loss: 0.1656 (0.1774) time: 2.9695 data: 0.0083 max mem: 33300 Epoch: [7] [2620/4276] eta: 1:23:57 lr: 4.134758911614696e-05 loss: 0.1747 (0.1774) time: 2.9744 data: 0.0081 max mem: 33300 Epoch: [7] [2630/4276] eta: 1:23:26 lr: 4.1344902007280204e-05 loss: 0.1636 (0.1774) time: 2.9877 data: 0.0075 max mem: 33300 Epoch: [7] [2640/4276] eta: 1:22:55 lr: 4.134221487900865e-05 loss: 0.1506 (0.1773) time: 2.9742 data: 0.0074 max mem: 33300 Epoch: [7] [2650/4276] eta: 1:22:24 lr: 4.1339527731330754e-05 loss: 0.1557 (0.1773) time: 2.9587 data: 0.0082 max mem: 33300 Epoch: [7] [2660/4276] eta: 1:21:54 lr: 4.133684056424496e-05 loss: 0.1826 (0.1774) time: 2.9736 data: 0.0083 max mem: 33300 Epoch: [7] [2670/4276] eta: 1:21:23 lr: 4.133415337774976e-05 loss: 0.1764 (0.1774) time: 2.9838 data: 0.0077 max mem: 33300 Epoch: [7] [2680/4276] eta: 1:20:52 lr: 4.133146617184358e-05 loss: 0.1690 (0.1774) time: 2.9826 data: 0.0077 max mem: 33300 Epoch: [7] [2690/4276] eta: 1:20:21 lr: 4.132877894652489e-05 loss: 0.1669 (0.1773) time: 2.9799 data: 0.0076 max mem: 33300 Epoch: [7] [2700/4276] eta: 1:19:51 lr: 4.132609170179215e-05 loss: 0.1571 (0.1773) time: 2.9676 data: 0.0080 max mem: 33300 Epoch: [7] [2710/4276] eta: 1:19:20 lr: 4.132340443764381e-05 loss: 0.1642 (0.1772) time: 2.9520 data: 0.0087 max mem: 33300 Epoch: [7] [2720/4276] eta: 1:18:49 lr: 4.132071715407834e-05 loss: 0.1652 (0.1772) time: 2.9543 data: 0.0089 max mem: 33300 Epoch: [7] [2730/4276] eta: 1:18:18 lr: 4.131802985109418e-05 loss: 0.1649 (0.1772) time: 2.9672 data: 0.0082 max mem: 33300 Epoch: [7] [2740/4276] eta: 1:17:47 lr: 4.131534252868979e-05 loss: 0.1729 (0.1772) time: 2.9777 data: 0.0077 max mem: 
33300 Epoch: [7] [2750/4276] eta: 1:17:16 lr: 4.131265518686364e-05 loss: 0.1729 (0.1773) time: 2.9744 data: 0.0079 max mem: 33300 Epoch: [7] [2760/4276] eta: 1:16:46 lr: 4.130996782561416e-05 loss: 0.1635 (0.1773) time: 2.9657 data: 0.0082 max mem: 33300 Epoch: [7] [2770/4276] eta: 1:16:15 lr: 4.1307280444939836e-05 loss: 0.1523 (0.1772) time: 2.9740 data: 0.0080 max mem: 33300 Epoch: [7] [2780/4276] eta: 1:15:44 lr: 4.13045930448391e-05 loss: 0.1563 (0.1772) time: 2.9585 data: 0.0083 max mem: 33300 Epoch: [7] [2790/4276] eta: 1:15:13 lr: 4.130190562531042e-05 loss: 0.1774 (0.1773) time: 2.9610 data: 0.0082 max mem: 33300 Epoch: [7] [2800/4276] eta: 1:14:43 lr: 4.129921818635224e-05 loss: 0.1728 (0.1772) time: 2.9796 data: 0.0075 max mem: 33300 Epoch: [7] [2810/4276] eta: 1:14:12 lr: 4.129653072796303e-05 loss: 0.1519 (0.1771) time: 2.9867 data: 0.0074 max mem: 33300 Epoch: [7] [2820/4276] eta: 1:13:41 lr: 4.129384325014122e-05 loss: 0.1474 (0.1770) time: 2.9820 data: 0.0076 max mem: 33300 Epoch: [7] [2830/4276] eta: 1:13:11 lr: 4.1291155752885295e-05 loss: 0.1475 (0.1770) time: 2.9724 data: 0.0074 max mem: 33300 Epoch: [7] [2840/4276] eta: 1:12:40 lr: 4.128846823619369e-05 loss: 0.1581 (0.1769) time: 2.9793 data: 0.0073 max mem: 33300 Epoch: [7] [2850/4276] eta: 1:12:09 lr: 4.1285780700064866e-05 loss: 0.1844 (0.1771) time: 2.9855 data: 0.0078 max mem: 33300 Epoch: [7] [2860/4276] eta: 1:11:39 lr: 4.128309314449727e-05 loss: 0.1813 (0.1770) time: 2.9839 data: 0.0077 max mem: 33300 Epoch: [7] [2870/4276] eta: 1:11:08 lr: 4.1280405569489354e-05 loss: 0.1646 (0.1770) time: 2.9661 data: 0.0073 max mem: 33300 Epoch: [7] [2880/4276] eta: 1:10:37 lr: 4.127771797503959e-05 loss: 0.1785 (0.1770) time: 2.9723 data: 0.0073 max mem: 33300 Epoch: [7] [2890/4276] eta: 1:10:07 lr: 4.12750303611464e-05 loss: 0.1788 (0.1771) time: 2.9784 data: 0.0072 max mem: 33300 Epoch: [7] [2900/4276] eta: 1:09:36 lr: 4.127234272780826e-05 loss: 0.1633 (0.1769) time: 2.9873 data: 0.0070 max 
mem: 33300 Epoch: [7] [2910/4276] eta: 1:09:06 lr: 4.126965507502361e-05 loss: 0.1597 (0.1770) time: 3.0219 data: 0.0071 max mem: 33300 Epoch: [7] [2920/4276] eta: 1:08:36 lr: 4.126696740279091e-05 loss: 0.1608 (0.1770) time: 3.0407 data: 0.0075 max mem: 33300 Epoch: [7] [2930/4276] eta: 1:08:05 lr: 4.1264279711108615e-05 loss: 0.1539 (0.1769) time: 3.0260 data: 0.0078 max mem: 33300 Epoch: [7] [2940/4276] eta: 1:07:35 lr: 4.126159199997517e-05 loss: 0.1673 (0.1769) time: 3.0139 data: 0.0076 max mem: 33300 Epoch: [7] [2950/4276] eta: 1:07:04 lr: 4.125890426938903e-05 loss: 0.1611 (0.1769) time: 3.0260 data: 0.0075 max mem: 33300 Epoch: [7] [2960/4276] eta: 1:06:34 lr: 4.125621651934863e-05 loss: 0.1698 (0.1769) time: 3.0248 data: 0.0078 max mem: 33300 Epoch: [7] [2970/4276] eta: 1:06:04 lr: 4.125352874985245e-05 loss: 0.1698 (0.1769) time: 3.0225 data: 0.0082 max mem: 33300 Epoch: [7] [2980/4276] eta: 1:05:33 lr: 4.125084096089891e-05 loss: 0.1869 (0.1769) time: 3.0219 data: 0.0077 max mem: 33300 Epoch: [7] [2990/4276] eta: 1:05:03 lr: 4.124815315248649e-05 loss: 0.1673 (0.1769) time: 3.0175 data: 0.0077 max mem: 33300 Epoch: [7] [3000/4276] eta: 1:04:32 lr: 4.124546532461363e-05 loss: 0.1613 (0.1768) time: 3.0223 data: 0.0082 max mem: 33300 Epoch: [7] [3010/4276] eta: 1:04:02 lr: 4.1242777477278754e-05 loss: 0.1586 (0.1768) time: 3.0208 data: 0.0084 max mem: 33300 Epoch: [7] [3020/4276] eta: 1:03:31 lr: 4.124008961048035e-05 loss: 0.1550 (0.1768) time: 3.0050 data: 0.0085 max mem: 33300 Epoch: [7] [3030/4276] eta: 1:03:01 lr: 4.123740172421685e-05 loss: 0.1538 (0.1768) time: 3.0088 data: 0.0084 max mem: 33300 Epoch: [7] [3040/4276] eta: 1:02:30 lr: 4.12347138184867e-05 loss: 0.1735 (0.1768) time: 3.0175 data: 0.0087 max mem: 33300 Epoch: [7] [3050/4276] eta: 1:02:00 lr: 4.1232025893288355e-05 loss: 0.1715 (0.1768) time: 3.0230 data: 0.0085 max mem: 33300 Epoch: [7] [3060/4276] eta: 1:01:30 lr: 4.122933794862026e-05 loss: 0.1501 (0.1767) time: 3.0238 data: 0.0077 
max mem: 33300 Epoch: [7] [3070/4276] eta: 1:00:59 lr: 4.122664998448087e-05 loss: 0.1527 (0.1767) time: 3.0116 data: 0.0078 max mem: 33300 Epoch: [7] [3080/4276] eta: 1:00:29 lr: 4.1223962000868635e-05 loss: 0.1566 (0.1767) time: 3.0271 data: 0.0084 max mem: 33300 Epoch: [7] [3090/4276] eta: 0:59:59 lr: 4.122127399778199e-05 loss: 0.1483 (0.1766) time: 3.0296 data: 0.0084 max mem: 33300 Epoch: [7] [3100/4276] eta: 0:59:28 lr: 4.121858597521939e-05 loss: 0.1638 (0.1766) time: 3.0062 data: 0.0080 max mem: 33300 Epoch: [7] [3110/4276] eta: 0:58:58 lr: 4.1215897933179286e-05 loss: 0.1633 (0.1765) time: 3.0050 data: 0.0082 max mem: 33300 Epoch: [7] [3120/4276] eta: 0:58:27 lr: 4.121320987166012e-05 loss: 0.1494 (0.1764) time: 3.0116 data: 0.0078 max mem: 33300 Epoch: [7] [3130/4276] eta: 0:57:57 lr: 4.121052179066034e-05 loss: 0.1576 (0.1764) time: 3.0116 data: 0.0072 max mem: 33300 Epoch: [7] [3140/4276] eta: 0:57:26 lr: 4.1207833690178396e-05 loss: 0.1637 (0.1764) time: 2.9855 data: 0.0071 max mem: 33300 Epoch: [7] [3150/4276] eta: 0:56:56 lr: 4.1205145570212734e-05 loss: 0.1770 (0.1764) time: 2.9764 data: 0.0068 max mem: 33300 Epoch: [7] [3160/4276] eta: 0:56:25 lr: 4.1202457430761803e-05 loss: 0.1712 (0.1764) time: 2.9973 data: 0.0072 max mem: 33300 Epoch: [7] [3170/4276] eta: 0:55:55 lr: 4.119976927182404e-05 loss: 0.1712 (0.1765) time: 2.9959 data: 0.0080 max mem: 33300 Epoch: [7] [3180/4276] eta: 0:55:24 lr: 4.11970810933979e-05 loss: 0.1748 (0.1765) time: 3.0185 data: 0.0079 max mem: 33300 Epoch: [7] [3190/4276] eta: 0:54:54 lr: 4.1194392895481834e-05 loss: 0.1663 (0.1765) time: 3.0415 data: 0.0072 max mem: 33300 Epoch: [7] [3200/4276] eta: 0:54:24 lr: 4.119170467807427e-05 loss: 0.1661 (0.1765) time: 3.0253 data: 0.0068 max mem: 33300 Epoch: [7] [3210/4276] eta: 0:53:53 lr: 4.118901644117365e-05 loss: 0.1669 (0.1765) time: 3.0057 data: 0.0072 max mem: 33300 Epoch: [7] [3220/4276] eta: 0:53:23 lr: 4.118632818477845e-05 loss: 0.1828 (0.1765) time: 3.0115 data: 
0.0077 max mem: 33300 Epoch: [7] [3230/4276] eta: 0:52:53 lr: 4.118363990888709e-05 loss: 0.1636 (0.1765) time: 3.0411 data: 0.0075 max mem: 33300 Epoch: [7] [3240/4276] eta: 0:52:22 lr: 4.1180951613498026e-05 loss: 0.1636 (0.1765) time: 3.0574 data: 0.0074 max mem: 33300 Epoch: [7] [3250/4276] eta: 0:51:52 lr: 4.117826329860969e-05 loss: 0.1779 (0.1765) time: 3.0699 data: 0.0075 max mem: 33300 Epoch: [7] [3260/4276] eta: 0:51:22 lr: 4.117557496422054e-05 loss: 0.1852 (0.1765) time: 3.0520 data: 0.0076 max mem: 33300 Epoch: [7] [3270/4276] eta: 0:50:51 lr: 4.117288661032901e-05 loss: 0.1765 (0.1765) time: 3.0248 data: 0.0079 max mem: 33300 Epoch: [7] [3280/4276] eta: 0:50:21 lr: 4.117019823693354e-05 loss: 0.1765 (0.1766) time: 3.0466 data: 0.0083 max mem: 33300 Epoch: [7] [3290/4276] eta: 0:49:51 lr: 4.116750984403259e-05 loss: 0.1857 (0.1766) time: 3.0732 data: 0.0085 max mem: 33300 Epoch: [7] [3300/4276] eta: 0:49:21 lr: 4.116482143162459e-05 loss: 0.1815 (0.1767) time: 3.0462 data: 0.0082 max mem: 33300 Epoch: [7] [3310/4276] eta: 0:48:50 lr: 4.1162132999707984e-05 loss: 0.1824 (0.1767) time: 3.0053 data: 0.0074 max mem: 33300 Epoch: [7] [3320/4276] eta: 0:48:20 lr: 4.115944454828122e-05 loss: 0.1839 (0.1767) time: 3.0008 data: 0.0073 max mem: 33300 Epoch: [7] [3330/4276] eta: 0:47:50 lr: 4.115675607734273e-05 loss: 0.1580 (0.1767) time: 3.0464 data: 0.0081 max mem: 33300 Epoch: [7] [3340/4276] eta: 0:47:19 lr: 4.115406758689098e-05 loss: 0.1683 (0.1767) time: 3.0812 data: 0.0083 max mem: 33300 Epoch: [7] [3350/4276] eta: 0:46:49 lr: 4.1151379076924384e-05 loss: 0.1603 (0.1766) time: 3.0828 data: 0.0080 max mem: 33300 Epoch: [7] [3360/4276] eta: 0:46:19 lr: 4.114869054744139e-05 loss: 0.1516 (0.1766) time: 3.1106 data: 0.0073 max mem: 33300 Epoch: [7] [3370/4276] eta: 0:45:49 lr: 4.114600199844045e-05 loss: 0.1878 (0.1767) time: 3.0596 data: 0.0062 max mem: 33300 Epoch: [7] [3380/4276] eta: 0:45:18 lr: 4.114331342992e-05 loss: 0.1769 (0.1767) time: 2.9846 data: 
0.0059 max mem: 33300 Epoch: [7] [3390/4276] eta: 0:44:48 lr: 4.114062484187849e-05 loss: 0.1789 (0.1767) time: 2.9824 data: 0.0063 max mem: 33300 Epoch: [7] [3400/4276] eta: 0:44:17 lr: 4.113793623431434e-05 loss: 0.1887 (0.1767) time: 2.9884 data: 0.0064 max mem: 33300 Epoch: [7] [3410/4276] eta: 0:43:47 lr: 4.1135247607226e-05 loss: 0.1858 (0.1768) time: 2.9808 data: 0.0066 max mem: 33300 Epoch: [7] [3420/4276] eta: 0:43:16 lr: 4.1132558960611925e-05 loss: 0.1847 (0.1768) time: 2.9692 data: 0.0069 max mem: 33300 Epoch: [7] [3430/4276] eta: 0:42:46 lr: 4.1129870294470535e-05 loss: 0.1825 (0.1768) time: 3.0349 data: 0.0073 max mem: 33300 Epoch: [7] [3440/4276] eta: 0:42:16 lr: 4.112718160880027e-05 loss: 0.1731 (0.1768) time: 3.0453 data: 0.0073 max mem: 33300 Epoch: [7] [3450/4276] eta: 0:41:45 lr: 4.112449290359959e-05 loss: 0.1647 (0.1768) time: 3.0081 data: 0.0069 max mem: 33300 Epoch: [7] [3460/4276] eta: 0:41:15 lr: 4.112180417886691e-05 loss: 0.1820 (0.1769) time: 2.9925 data: 0.0069 max mem: 33300 Epoch: [7] [3470/4276] eta: 0:40:44 lr: 4.111911543460069e-05 loss: 0.1644 (0.1768) time: 2.9886 data: 0.0070 max mem: 33300 Epoch: [7] [3480/4276] eta: 0:40:14 lr: 4.1116426670799356e-05 loss: 0.1681 (0.1768) time: 3.0467 data: 0.0076 max mem: 33300 Epoch: [7] [3490/4276] eta: 0:39:44 lr: 4.111373788746135e-05 loss: 0.1834 (0.1768) time: 3.0502 data: 0.0078 max mem: 33300 Epoch: [7] [3500/4276] eta: 0:39:13 lr: 4.111104908458511e-05 loss: 0.1695 (0.1768) time: 3.0348 data: 0.0072 max mem: 33300 Epoch: [7] [3510/4276] eta: 0:38:43 lr: 4.1108360262169076e-05 loss: 0.1624 (0.1768) time: 3.0575 data: 0.0070 max mem: 33300 Epoch: [7] [3520/4276] eta: 0:38:13 lr: 4.1105671420211686e-05 loss: 0.1775 (0.1768) time: 3.0580 data: 0.0067 max mem: 33300 Epoch: [7] [3530/4276] eta: 0:37:43 lr: 4.110298255871137e-05 loss: 0.1757 (0.1768) time: 3.0518 data: 0.0060 max mem: 33300 Epoch: [7] [3540/4276] eta: 0:37:12 lr: 4.110029367766657e-05 loss: 0.1757 (0.1769) time: 3.0499 
data: 0.0060 max mem: 33300 Epoch: [7] [3550/4276] eta: 0:36:42 lr: 4.109760477707573e-05 loss: 0.1765 (0.1768) time: 2.9984 data: 0.0064 max mem: 33300 Epoch: [7] [3560/4276] eta: 0:36:11 lr: 4.109491585693728e-05 loss: 0.1690 (0.1768) time: 2.9697 data: 0.0067 max mem: 33300 Epoch: [7] [3570/4276] eta: 0:35:41 lr: 4.1092226917249665e-05 loss: 0.1761 (0.1769) time: 3.0113 data: 0.0075 max mem: 33300 Epoch: [7] [3580/4276] eta: 0:35:11 lr: 4.108953795801131e-05 loss: 0.1592 (0.1768) time: 3.0036 data: 0.0075 max mem: 33300 Epoch: [7] [3590/4276] eta: 0:34:40 lr: 4.108684897922065e-05 loss: 0.1506 (0.1768) time: 2.9881 data: 0.0067 max mem: 33300 Epoch: [7] [3600/4276] eta: 0:34:10 lr: 4.108415998087613e-05 loss: 0.1764 (0.1768) time: 3.0180 data: 0.0072 max mem: 33300 Epoch: [7] [3610/4276] eta: 0:33:39 lr: 4.108147096297619e-05 loss: 0.1740 (0.1768) time: 3.0004 data: 0.0074 max mem: 33300 Epoch: [7] [3620/4276] eta: 0:33:09 lr: 4.107878192551925e-05 loss: 0.1724 (0.1768) time: 2.9906 data: 0.0075 max mem: 33300 Epoch: [7] [3630/4276] eta: 0:32:39 lr: 4.107609286850376e-05 loss: 0.1724 (0.1768) time: 3.0217 data: 0.0083 max mem: 33300 Epoch: [7] [3640/4276] eta: 0:32:08 lr: 4.107340379192814e-05 loss: 0.1665 (0.1768) time: 3.0434 data: 0.0084 max mem: 33300 Epoch: [7] [3650/4276] eta: 0:31:38 lr: 4.107071469579084e-05 loss: 0.1574 (0.1767) time: 3.0508 data: 0.0079 max mem: 33300 Epoch: [7] [3660/4276] eta: 0:31:08 lr: 4.1068025580090286e-05 loss: 0.1574 (0.1767) time: 3.0498 data: 0.0080 max mem: 33300 Epoch: [7] [3670/4276] eta: 0:30:37 lr: 4.1065336444824915e-05 loss: 0.1710 (0.1767) time: 3.0291 data: 0.0080 max mem: 33300 Epoch: [7] [3680/4276] eta: 0:30:07 lr: 4.106264728999316e-05 loss: 0.1888 (0.1767) time: 3.0393 data: 0.0078 max mem: 33300 Epoch: [7] [3690/4276] eta: 0:29:37 lr: 4.1059958115593456e-05 loss: 0.1862 (0.1768) time: 3.0636 data: 0.0077 max mem: 33300 Epoch: [7] [3700/4276] eta: 0:29:07 lr: 4.105726892162424e-05 loss: 0.1805 (0.1767) time: 
3.0411 data: 0.0079 max mem: 33300 Epoch: [7] [3710/4276] eta: 0:28:36 lr: 4.1054579708083935e-05 loss: 0.1635 (0.1767) time: 3.0482 data: 0.0081 max mem: 33300 Epoch: [7] [3720/4276] eta: 0:28:06 lr: 4.1051890474970986e-05 loss: 0.1541 (0.1767) time: 3.0532 data: 0.0079 max mem: 33300 Epoch: [7] [3730/4276] eta: 0:27:36 lr: 4.1049201222283817e-05 loss: 0.1667 (0.1767) time: 3.0319 data: 0.0075 max mem: 33300 Epoch: [7] [3740/4276] eta: 0:27:05 lr: 4.104651195002086e-05 loss: 0.1775 (0.1767) time: 3.0661 data: 0.0076 max mem: 33300 Epoch: [7] [3750/4276] eta: 0:26:35 lr: 4.104382265818056e-05 loss: 0.1601 (0.1767) time: 3.0541 data: 0.0078 max mem: 33300 Epoch: [7] [3760/4276] eta: 0:26:05 lr: 4.104113334676133e-05 loss: 0.1601 (0.1767) time: 2.9819 data: 0.0086 max mem: 33300 Epoch: [7] [3770/4276] eta: 0:25:34 lr: 4.103844401576162e-05 loss: 0.1803 (0.1767) time: 3.0107 data: 0.0087 max mem: 33300 Epoch: [7] [3780/4276] eta: 0:25:04 lr: 4.103575466517985e-05 loss: 0.1820 (0.1767) time: 3.0324 data: 0.0084 max mem: 33300 Epoch: [7] [3790/4276] eta: 0:24:34 lr: 4.103306529501446e-05 loss: 0.1683 (0.1767) time: 3.0291 data: 0.0086 max mem: 33300 Epoch: [7] [3800/4276] eta: 0:24:03 lr: 4.103037590526388e-05 loss: 0.1734 (0.1767) time: 3.0710 data: 0.0083 max mem: 33300 Epoch: [7] [3810/4276] eta: 0:23:33 lr: 4.102768649592653e-05 loss: 0.1734 (0.1767) time: 3.0419 data: 0.0085 max mem: 33300 Epoch: [7] [3820/4276] eta: 0:23:03 lr: 4.1024997067000855e-05 loss: 0.1523 (0.1766) time: 3.0234 data: 0.0087 max mem: 33300 Epoch: [7] [3830/4276] eta: 0:22:32 lr: 4.102230761848527e-05 loss: 0.1563 (0.1766) time: 3.0780 data: 0.0088 max mem: 33300 Epoch: [7] [3840/4276] eta: 0:22:02 lr: 4.101961815037822e-05 loss: 0.1659 (0.1766) time: 3.0963 data: 0.0088 max mem: 33300 Epoch: [7] [3850/4276] eta: 0:21:32 lr: 4.101692866267814e-05 loss: 0.1521 (0.1765) time: 3.0658 data: 0.0081 max mem: 33300 Epoch: [7] [3860/4276] eta: 0:21:01 lr: 4.1014239155383435e-05 loss: 0.1613 (0.1765) 
time: 3.0520 data: 0.0073 max mem: 33300 Epoch: [7] [3870/4276] eta: 0:20:31 lr: 4.101154962849255e-05 loss: 0.1822 (0.1765) time: 3.0291 data: 0.0074 max mem: 33300 Epoch: [7] [3880/4276] eta: 0:20:01 lr: 4.100886008200392e-05 loss: 0.1729 (0.1765) time: 3.0239 data: 0.0081 max mem: 33300 Epoch: [7] [3890/4276] eta: 0:19:30 lr: 4.100617051591596e-05 loss: 0.1613 (0.1765) time: 3.0455 data: 0.0082 max mem: 33300 Epoch: [7] [3900/4276] eta: 0:19:00 lr: 4.1003480930227114e-05 loss: 0.1688 (0.1765) time: 3.0037 data: 0.0083 max mem: 33300 Epoch: [7] [3910/4276] eta: 0:18:30 lr: 4.100079132493579e-05 loss: 0.1608 (0.1764) time: 2.9806 data: 0.0078 max mem: 33300 Epoch: [7] [3920/4276] eta: 0:17:59 lr: 4.0998101700040434e-05 loss: 0.1487 (0.1764) time: 2.9924 data: 0.0078 max mem: 33300 Epoch: [7] [3930/4276] eta: 0:17:29 lr: 4.099541205553948e-05 loss: 0.1727 (0.1764) time: 2.9628 data: 0.0075 max mem: 33300 Epoch: [7] [3940/4276] eta: 0:16:59 lr: 4.099272239143133e-05 loss: 0.1727 (0.1765) time: 2.9734 data: 0.0068 max mem: 33300 Epoch: [7] [3950/4276] eta: 0:16:28 lr: 4.099003270771443e-05 loss: 0.1654 (0.1764) time: 3.0316 data: 0.0076 max mem: 33300 Epoch: [7] [3960/4276] eta: 0:15:58 lr: 4.098734300438721e-05 loss: 0.1654 (0.1764) time: 3.0018 data: 0.0078 max mem: 33300 Epoch: [7] [3970/4276] eta: 0:15:27 lr: 4.098465328144808e-05 loss: 0.1802 (0.1764) time: 2.9690 data: 0.0075 max mem: 33300 Epoch: [7] [3980/4276] eta: 0:14:57 lr: 4.098196353889549e-05 loss: 0.1684 (0.1764) time: 3.0058 data: 0.0080 max mem: 33300 Epoch: [7] [3990/4276] eta: 0:14:27 lr: 4.0979273776727846e-05 loss: 0.1661 (0.1764) time: 2.9894 data: 0.0085 max mem: 33300 Epoch: [7] [4000/4276] eta: 0:13:56 lr: 4.097658399494358e-05 loss: 0.1565 (0.1764) time: 2.9709 data: 0.0083 max mem: 33300 Epoch: [7] [4010/4276] eta: 0:13:26 lr: 4.097389419354113e-05 loss: 0.1597 (0.1764) time: 2.9930 data: 0.0081 max mem: 33300 Epoch: [7] [4020/4276] eta: 0:12:56 lr: 4.0971204372518906e-05 loss: 0.1602 
(0.1764) time: 3.0294 data: 0.0082 max mem: 33300 Epoch: [7] [4030/4276] eta: 0:12:25 lr: 4.096851453187534e-05 loss: 0.1785 (0.1764) time: 3.0404 data: 0.0086 max mem: 33300 Epoch: [7] [4040/4276] eta: 0:11:55 lr: 4.096582467160887e-05 loss: 0.1808 (0.1765) time: 3.0327 data: 0.0087 max mem: 33300 Epoch: [7] [4050/4276] eta: 0:11:25 lr: 4.09631347917179e-05 loss: 0.1689 (0.1764) time: 3.0544 data: 0.0087 max mem: 33300 Epoch: [7] [4060/4276] eta: 0:10:55 lr: 4.096044489220086e-05 loss: 0.1651 (0.1765) time: 3.0639 data: 0.0085 max mem: 33300 Epoch: [7] [4070/4276] eta: 0:10:24 lr: 4.0957754973056187e-05 loss: 0.1724 (0.1765) time: 3.0583 data: 0.0084 max mem: 33300 Epoch: [7] [4080/4276] eta: 0:09:54 lr: 4.095506503428229e-05 loss: 0.1721 (0.1765) time: 3.0463 data: 0.0082 max mem: 33300 Epoch: [7] [4090/4276] eta: 0:09:24 lr: 4.095237507587761e-05 loss: 0.1734 (0.1765) time: 3.0225 data: 0.0074 max mem: 33300 Epoch: [7] [4100/4276] eta: 0:08:53 lr: 4.0949685097840556e-05 loss: 0.1895 (0.1765) time: 3.0088 data: 0.0077 max mem: 33300 Epoch: [7] [4110/4276] eta: 0:08:23 lr: 4.094699510016955e-05 loss: 0.1756 (0.1765) time: 3.0331 data: 0.0082 max mem: 33300 Epoch: [7] [4120/4276] eta: 0:07:53 lr: 4.094430508286304e-05 loss: 0.1756 (0.1765) time: 3.0453 data: 0.0087 max mem: 33300 Epoch: [7] [4130/4276] eta: 0:07:22 lr: 4.094161504591942e-05 loss: 0.1604 (0.1765) time: 3.0297 data: 0.0087 max mem: 33300 Epoch: [7] [4140/4276] eta: 0:06:52 lr: 4.0938924989337126e-05 loss: 0.1585 (0.1765) time: 3.0360 data: 0.0078 max mem: 33300 Epoch: [7] [4150/4276] eta: 0:06:22 lr: 4.093623491311458e-05 loss: 0.1675 (0.1765) time: 3.0601 data: 0.0080 max mem: 33300 Epoch: [7] [4160/4276] eta: 0:05:51 lr: 4.093354481725021e-05 loss: 0.1713 (0.1765) time: 3.0714 data: 0.0084 max mem: 33300 Epoch: [7] [4170/4276] eta: 0:05:21 lr: 4.093085470174243e-05 loss: 0.1818 (0.1766) time: 3.0524 data: 0.0085 max mem: 33300 Epoch: [7] [4180/4276] eta: 0:04:51 lr: 4.092816456658966e-05 loss: 
0.1818 (0.1765) time: 3.0299 data: 0.0084 max mem: 33300 Epoch: [7] [4190/4276] eta: 0:04:20 lr: 4.0925474411790334e-05 loss: 0.1610 (0.1765) time: 3.0101 data: 0.0081 max mem: 33300 Epoch: [7] [4200/4276] eta: 0:03:50 lr: 4.092278423734286e-05 loss: 0.1939 (0.1766) time: 3.0370 data: 0.0086 max mem: 33300 Epoch: [7] [4210/4276] eta: 0:03:20 lr: 4.0920094043245665e-05 loss: 0.2026 (0.1766) time: 3.0527 data: 0.0087 max mem: 33300 Epoch: [7] [4220/4276] eta: 0:02:49 lr: 4.0917403829497174e-05 loss: 0.1941 (0.1767) time: 3.0399 data: 0.0085 max mem: 33300 Epoch: [7] [4230/4276] eta: 0:02:19 lr: 4.09147135960958e-05 loss: 0.1891 (0.1767) time: 3.0330 data: 0.0082 max mem: 33300 Epoch: [7] [4240/4276] eta: 0:01:49 lr: 4.091202334303998e-05 loss: 0.1881 (0.1767) time: 3.0263 data: 0.0083 max mem: 33300 Epoch: [7] [4250/4276] eta: 0:01:18 lr: 4.090933307032812e-05 loss: 0.1836 (0.1768) time: 3.0625 data: 0.0078 max mem: 33300 Epoch: [7] [4260/4276] eta: 0:00:48 lr: 4.090664277795863e-05 loss: 0.1848 (0.1768) time: 3.0738 data: 0.0084 max mem: 33300 Epoch: [7] [4270/4276] eta: 0:00:18 lr: 4.0903952465929956e-05 loss: 0.1813 (0.1768) time: 3.0573 data: 0.0083 max mem: 33300 Epoch: [7] Total time: 3:36:09 Test: [ 0/21770] eta: 8:20:02 time: 1.3782 data: 1.3340 max mem: 33300 Test: [ 100/21770] eta: 0:18:36 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 200/21770] eta: 0:16:15 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 300/21770] eta: 0:15:27 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 400/21770] eta: 0:15:03 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 500/21770] eta: 0:14:48 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 600/21770] eta: 0:14:34 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 700/21770] eta: 0:14:23 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 800/21770] eta: 0:14:13 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 900/21770] eta: 0:14:04 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 1000/21770] eta: 0:13:57 time: 0.0389 data: 
0.0011 max mem: 33300 Test: [ 1100/21770] eta: 0:13:50 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 1200/21770] eta: 0:13:44 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 1300/21770] eta: 0:13:39 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 1400/21770] eta: 0:13:34 time: 0.0397 data: 0.0011 max mem: 33300 Test: [ 1500/21770] eta: 0:13:30 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 1600/21770] eta: 0:13:25 time: 0.0404 data: 0.0011 max mem: 33300 Test: [ 1700/21770] eta: 0:13:21 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 1800/21770] eta: 0:13:16 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 1900/21770] eta: 0:13:11 time: 0.0394 data: 0.0012 max mem: 33300 Test: [ 2000/21770] eta: 0:13:05 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 2100/21770] eta: 0:13:00 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 2200/21770] eta: 0:12:56 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 2300/21770] eta: 0:12:51 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 2400/21770] eta: 0:12:47 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 2500/21770] eta: 0:12:43 time: 0.0395 data: 0.0012 max mem: 33300 Test: [ 2600/21770] eta: 0:12:39 time: 0.0395 data: 0.0012 max mem: 33300 Test: [ 2700/21770] eta: 0:12:34 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 2800/21770] eta: 0:12:30 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 2900/21770] eta: 0:12:26 time: 0.0395 data: 0.0012 max mem: 33300 Test: [ 3000/21770] eta: 0:12:22 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 3100/21770] eta: 0:12:17 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 3200/21770] eta: 0:12:13 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 3300/21770] eta: 0:12:09 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 3400/21770] eta: 0:12:05 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 3500/21770] eta: 0:12:00 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 3600/21770] eta: 0:11:56 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 3700/21770] eta: 0:11:52 time: 0.0387 data: 
0.0011 max mem: 33300 Test: [ 3800/21770] eta: 0:11:47 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 3900/21770] eta: 0:11:43 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 4000/21770] eta: 0:11:39 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 4100/21770] eta: 0:11:34 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 4200/21770] eta: 0:11:30 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 4300/21770] eta: 0:11:26 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 4400/21770] eta: 0:11:22 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 4500/21770] eta: 0:11:18 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 4600/21770] eta: 0:11:14 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 4700/21770] eta: 0:11:11 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 4800/21770] eta: 0:11:07 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 4900/21770] eta: 0:11:03 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 5000/21770] eta: 0:10:59 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 5100/21770] eta: 0:10:55 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 5200/21770] eta: 0:10:51 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 5300/21770] eta: 0:10:47 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 5400/21770] eta: 0:10:43 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 5500/21770] eta: 0:10:39 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 5600/21770] eta: 0:10:35 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 5700/21770] eta: 0:10:31 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 5800/21770] eta: 0:10:27 time: 0.0396 data: 0.0012 max mem: 33300 Test: [ 5900/21770] eta: 0:10:23 time: 0.0390 data: 0.0012 max mem: 33300 Test: [ 6000/21770] eta: 0:10:19 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 6100/21770] eta: 0:10:16 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 6200/21770] eta: 0:10:12 time: 0.0390 data: 0.0012 max mem: 33300 Test: [ 6300/21770] eta: 0:10:08 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 6400/21770] eta: 0:10:04 time: 0.0393 data: 
0.0011 max mem: 33300 Test: [ 6500/21770] eta: 0:10:00 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 6600/21770] eta: 0:09:56 time: 0.0393 data: 0.0012 max mem: 33300 Test: [ 6700/21770] eta: 0:09:52 time: 0.0394 data: 0.0012 max mem: 33300 Test: [ 6800/21770] eta: 0:09:48 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 6900/21770] eta: 0:09:44 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 7000/21770] eta: 0:09:40 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 7100/21770] eta: 0:09:36 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 7200/21770] eta: 0:09:32 time: 0.0397 data: 0.0012 max mem: 33300 Test: [ 7300/21770] eta: 0:09:29 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 7400/21770] eta: 0:09:25 time: 0.0401 data: 0.0012 max mem: 33300 Test: [ 7500/21770] eta: 0:09:21 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 7600/21770] eta: 0:09:17 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 7700/21770] eta: 0:09:13 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 7800/21770] eta: 0:09:09 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 7900/21770] eta: 0:09:05 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 8000/21770] eta: 0:09:01 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 8100/21770] eta: 0:08:57 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 8200/21770] eta: 0:08:54 time: 0.0403 data: 0.0011 max mem: 33300 Test: [ 8300/21770] eta: 0:08:50 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 8400/21770] eta: 0:08:46 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 8500/21770] eta: 0:08:42 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 8600/21770] eta: 0:08:38 time: 0.0397 data: 0.0011 max mem: 33300 Test: [ 8700/21770] eta: 0:08:34 time: 0.0390 data: 0.0012 max mem: 33300 Test: [ 8800/21770] eta: 0:08:30 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 8900/21770] eta: 0:08:27 time: 0.0401 data: 0.0011 max mem: 33300 Test: [ 9000/21770] eta: 0:08:23 time: 0.0397 data: 0.0012 max mem: 33300 Test: [ 9100/21770] eta: 0:08:19 time: 0.0407 data: 
0.0012 max mem: 33300 Test: [ 9200/21770] eta: 0:08:15 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 9300/21770] eta: 0:08:11 time: 0.0401 data: 0.0012 max mem: 33300 Test: [ 9400/21770] eta: 0:08:07 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 9500/21770] eta: 0:08:03 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 9600/21770] eta: 0:08:00 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 9700/21770] eta: 0:07:56 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 9800/21770] eta: 0:07:52 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 9900/21770] eta: 0:07:48 time: 0.0400 data: 0.0012 max mem: 33300 Test: [10000/21770] eta: 0:07:44 time: 0.0399 data: 0.0012 max mem: 33300 Test: [10100/21770] eta: 0:07:40 time: 0.0403 data: 0.0013 max mem: 33300 Test: [10200/21770] eta: 0:07:36 time: 0.0399 data: 0.0012 max mem: 33300 Test: [10300/21770] eta: 0:07:32 time: 0.0399 data: 0.0011 max mem: 33300 Test: [10400/21770] eta: 0:07:28 time: 0.0398 data: 0.0012 max mem: 33300 Test: [10500/21770] eta: 0:07:24 time: 0.0400 data: 0.0012 max mem: 33300 Test: [10600/21770] eta: 0:07:21 time: 0.0400 data: 0.0012 max mem: 33300 Test: [10700/21770] eta: 0:07:17 time: 0.0399 data: 0.0011 max mem: 33300 Test: [10800/21770] eta: 0:07:13 time: 0.0398 data: 0.0011 max mem: 33300 Test: [10900/21770] eta: 0:07:09 time: 0.0394 data: 0.0012 max mem: 33300 Test: [11000/21770] eta: 0:07:05 time: 0.0393 data: 0.0011 max mem: 33300 Test: [11100/21770] eta: 0:07:01 time: 0.0400 data: 0.0012 max mem: 33300 Test: [11200/21770] eta: 0:06:57 time: 0.0398 data: 0.0012 max mem: 33300 Test: [11300/21770] eta: 0:06:53 time: 0.0397 data: 0.0012 max mem: 33300 Test: [11400/21770] eta: 0:06:49 time: 0.0402 data: 0.0013 max mem: 33300 Test: [11500/21770] eta: 0:06:45 time: 0.0400 data: 0.0012 max mem: 33300 Test: [11600/21770] eta: 0:06:41 time: 0.0399 data: 0.0012 max mem: 33300 Test: [11700/21770] eta: 0:06:37 time: 0.0399 data: 0.0012 max mem: 33300 Test: [11800/21770] eta: 0:06:33 time: 0.0399 data: 
0.0012 max mem: 33300 Test: [11900/21770] eta: 0:06:30 time: 0.0401 data: 0.0011 max mem: 33300 Test: [12000/21770] eta: 0:06:26 time: 0.0400 data: 0.0011 max mem: 33300 Test: [12100/21770] eta: 0:06:22 time: 0.0398 data: 0.0011 max mem: 33300 Test: [12200/21770] eta: 0:06:18 time: 0.0395 data: 0.0011 max mem: 33300 Test: [12300/21770] eta: 0:06:14 time: 0.0397 data: 0.0012 max mem: 33300 Test: [12400/21770] eta: 0:06:10 time: 0.0400 data: 0.0012 max mem: 33300 Test: [12500/21770] eta: 0:06:06 time: 0.0398 data: 0.0011 max mem: 33300 Test: [12600/21770] eta: 0:06:02 time: 0.0400 data: 0.0012 max mem: 33300 Test: [12700/21770] eta: 0:05:58 time: 0.0395 data: 0.0012 max mem: 33300 Test: [12800/21770] eta: 0:05:54 time: 0.0401 data: 0.0012 max mem: 33300 Test: [12900/21770] eta: 0:05:50 time: 0.0399 data: 0.0012 max mem: 33300 Test: [13000/21770] eta: 0:05:46 time: 0.0399 data: 0.0012 max mem: 33300 Test: [13100/21770] eta: 0:05:42 time: 0.0399 data: 0.0012 max mem: 33300 Test: [13200/21770] eta: 0:05:39 time: 0.0401 data: 0.0012 max mem: 33300 Test: [13300/21770] eta: 0:05:35 time: 0.0400 data: 0.0012 max mem: 33300 Test: [13400/21770] eta: 0:05:31 time: 0.0400 data: 0.0012 max mem: 33300 Test: [13500/21770] eta: 0:05:27 time: 0.0401 data: 0.0012 max mem: 33300 Test: [13600/21770] eta: 0:05:23 time: 0.0402 data: 0.0012 max mem: 33300 Test: [13700/21770] eta: 0:05:19 time: 0.0400 data: 0.0011 max mem: 33300 Test: [13800/21770] eta: 0:05:15 time: 0.0400 data: 0.0011 max mem: 33300 Test: [13900/21770] eta: 0:05:11 time: 0.0400 data: 0.0012 max mem: 33300 Test: [14000/21770] eta: 0:05:07 time: 0.0400 data: 0.0011 max mem: 33300 Test: [14100/21770] eta: 0:05:03 time: 0.0399 data: 0.0011 max mem: 33300 Test: [14200/21770] eta: 0:04:59 time: 0.0399 data: 0.0011 max mem: 33300 Test: [14300/21770] eta: 0:04:55 time: 0.0399 data: 0.0012 max mem: 33300 Test: [14400/21770] eta: 0:04:51 time: 0.0391 data: 0.0012 max mem: 33300 Test: [14500/21770] eta: 0:04:47 time: 0.0391 data: 
0.0012 max mem: 33300 Test: [14600/21770] eta: 0:04:43 time: 0.0393 data: 0.0012 max mem: 33300 Test: [14700/21770] eta: 0:04:39 time: 0.0402 data: 0.0012 max mem: 33300 Test: [14800/21770] eta: 0:04:35 time: 0.0399 data: 0.0011 max mem: 33300 Test: [14900/21770] eta: 0:04:32 time: 0.0402 data: 0.0011 max mem: 33300 Test: [15000/21770] eta: 0:04:28 time: 0.0402 data: 0.0011 max mem: 33300 Test: [15100/21770] eta: 0:04:24 time: 0.0399 data: 0.0011 max mem: 33300 Test: [15200/21770] eta: 0:04:20 time: 0.0401 data: 0.0011 max mem: 33300 Test: [15300/21770] eta: 0:04:16 time: 0.0397 data: 0.0011 max mem: 33300 Test: [15400/21770] eta: 0:04:12 time: 0.0399 data: 0.0012 max mem: 33300 Test: [15500/21770] eta: 0:04:08 time: 0.0401 data: 0.0012 max mem: 33300 Test: [15600/21770] eta: 0:04:04 time: 0.0401 data: 0.0012 max mem: 33300 Test: [15700/21770] eta: 0:04:00 time: 0.0400 data: 0.0011 max mem: 33300 Test: [15800/21770] eta: 0:03:56 time: 0.0408 data: 0.0012 max mem: 33300 Test: [15900/21770] eta: 0:03:52 time: 0.0394 data: 0.0012 max mem: 33300 Test: [16000/21770] eta: 0:03:48 time: 0.0393 data: 0.0012 max mem: 33300 Test: [16100/21770] eta: 0:03:44 time: 0.0392 data: 0.0012 max mem: 33300 Test: [16200/21770] eta: 0:03:40 time: 0.0392 data: 0.0012 max mem: 33300 Test: [16300/21770] eta: 0:03:36 time: 0.0401 data: 0.0012 max mem: 33300 Test: [16400/21770] eta: 0:03:32 time: 0.0401 data: 0.0012 max mem: 33300 Test: [16500/21770] eta: 0:03:28 time: 0.0401 data: 0.0012 max mem: 33300 Test: [16600/21770] eta: 0:03:24 time: 0.0401 data: 0.0012 max mem: 33300 Test: [16700/21770] eta: 0:03:20 time: 0.0402 data: 0.0011 max mem: 33300 Test: [16800/21770] eta: 0:03:16 time: 0.0399 data: 0.0012 max mem: 33300 Test: [16900/21770] eta: 0:03:13 time: 0.0398 data: 0.0011 max mem: 33300 Test: [17000/21770] eta: 0:03:09 time: 0.0401 data: 0.0011 max mem: 33300 Test: [17100/21770] eta: 0:03:05 time: 0.0399 data: 0.0011 max mem: 33300 Test: [17200/21770] eta: 0:03:01 time: 0.0400 data: 
0.0012 max mem: 33300 Test: [17300/21770] eta: 0:02:57 time: 0.0402 data: 0.0012 max mem: 33300 Test: [17400/21770] eta: 0:02:53 time: 0.0401 data: 0.0012 max mem: 33300 Test: [17500/21770] eta: 0:02:49 time: 0.0394 data: 0.0012 max mem: 33300 Test: [17600/21770] eta: 0:02:45 time: 0.0391 data: 0.0012 max mem: 33300 Test: [17700/21770] eta: 0:02:41 time: 0.0395 data: 0.0012 max mem: 33300 Test: [17800/21770] eta: 0:02:37 time: 0.0392 data: 0.0012 max mem: 33300 Test: [17900/21770] eta: 0:02:33 time: 0.0391 data: 0.0012 max mem: 33300 Test: [18000/21770] eta: 0:02:29 time: 0.0391 data: 0.0012 max mem: 33300 Test: [18100/21770] eta: 0:02:25 time: 0.0393 data: 0.0012 max mem: 33300 Test: [18200/21770] eta: 0:02:21 time: 0.0394 data: 0.0012 max mem: 33300 Test: [18300/21770] eta: 0:02:17 time: 0.0392 data: 0.0012 max mem: 33300 Test: [18400/21770] eta: 0:02:13 time: 0.0391 data: 0.0012 max mem: 33300 Test: [18500/21770] eta: 0:02:09 time: 0.0391 data: 0.0012 max mem: 33300 Test: [18600/21770] eta: 0:02:05 time: 0.0390 data: 0.0012 max mem: 33300 Test: [18700/21770] eta: 0:02:01 time: 0.0392 data: 0.0011 max mem: 33300 Test: [18800/21770] eta: 0:01:57 time: 0.0401 data: 0.0011 max mem: 33300 Test: [18900/21770] eta: 0:01:53 time: 0.0399 data: 0.0012 max mem: 33300 Test: [19000/21770] eta: 0:01:49 time: 0.0392 data: 0.0012 max mem: 33300 Test: [19100/21770] eta: 0:01:45 time: 0.0392 data: 0.0012 max mem: 33300 Test: [19200/21770] eta: 0:01:41 time: 0.0392 data: 0.0012 max mem: 33300 Test: [19300/21770] eta: 0:01:37 time: 0.0392 data: 0.0012 max mem: 33300 Test: [19400/21770] eta: 0:01:33 time: 0.0392 data: 0.0012 max mem: 33300 Test: [19500/21770] eta: 0:01:29 time: 0.0392 data: 0.0012 max mem: 33300 Test: [19600/21770] eta: 0:01:25 time: 0.0399 data: 0.0012 max mem: 33300 Test: [19700/21770] eta: 0:01:22 time: 0.0400 data: 0.0012 max mem: 33300 Test: [19800/21770] eta: 0:01:18 time: 0.0400 data: 0.0012 max mem: 33300 Test: [19900/21770] eta: 0:01:14 time: 0.0400 data: 
0.0012 max mem: 33300 Test: [20000/21770] eta: 0:01:10 time: 0.0406 data: 0.0012 max mem: 33300 Test: [20100/21770] eta: 0:01:06 time: 0.0407 data: 0.0012 max mem: 33300 Test: [20200/21770] eta: 0:01:02 time: 0.0405 data: 0.0012 max mem: 33300 Test: [20300/21770] eta: 0:00:58 time: 0.0402 data: 0.0012 max mem: 33300 Test: [20400/21770] eta: 0:00:54 time: 0.0395 data: 0.0012 max mem: 33300 Test: [20500/21770] eta: 0:00:50 time: 0.0404 data: 0.0012 max mem: 33300 Test: [20600/21770] eta: 0:00:46 time: 0.0398 data: 0.0012 max mem: 33300 Test: [20700/21770] eta: 0:00:42 time: 0.0397 data: 0.0012 max mem: 33300 Test: [20800/21770] eta: 0:00:38 time: 0.0401 data: 0.0012 max mem: 33300 Test: [20900/21770] eta: 0:00:34 time: 0.0401 data: 0.0012 max mem: 33300 Test: [21000/21770] eta: 0:00:30 time: 0.0402 data: 0.0012 max mem: 33300 Test: [21100/21770] eta: 0:00:26 time: 0.0396 data: 0.0012 max mem: 33300 Test: [21200/21770] eta: 0:00:22 time: 0.0393 data: 0.0012 max mem: 33300 Test: [21300/21770] eta: 0:00:18 time: 0.0391 data: 0.0012 max mem: 33300 Test: [21400/21770] eta: 0:00:14 time: 0.0395 data: 0.0012 max mem: 33300 Test: [21500/21770] eta: 0:00:10 time: 0.0392 data: 0.0012 max mem: 33300 Test: [21600/21770] eta: 0:00:06 time: 0.0399 data: 0.0011 max mem: 33300 Test: [21700/21770] eta: 0:00:02 time: 0.0399 data: 0.0011 max mem: 33300 Test: Total time: 0:14:23 Final results: Mean IoU is 0.00 precision@0.5 = 0.00 precision@0.6 = 0.00 precision@0.7 = 0.00 precision@0.8 = 0.00 precision@0.9 = 0.00 overall IoU = 0.00 mean IoU = 0.00 Mean accuracy for one-to-zero sample is 100.00 Average object IoU 0.0 Overall IoU 0.0 Epoch: [8] [ 0/4276] eta: 6:43:59 lr: 4.090233826927567e-05 loss: 0.1393 (0.1393) time: 5.6687 data: 2.3195 max mem: 33300 Epoch: [8] [ 10/4276] eta: 3:58:27 lr: 4.089964792578899e-05 loss: 0.1784 (0.1802) time: 3.3538 data: 0.2178 max mem: 33300 Epoch: [8] [ 20/4276] eta: 3:51:38 lr: 4.0896957562639e-05 loss: 0.1643 (0.1753) time: 3.1454 data: 0.0075 max 
mem: 33300 Epoch: [8] [ 30/4276] eta: 3:48:20 lr: 4.089426717982412e-05 loss: 0.1575 (0.1772) time: 3.1569 data: 0.0081 max mem: 33300 Epoch: [8] [ 40/4276] eta: 3:46:19 lr: 4.089157677734278e-05 loss: 0.1679 (0.1761) time: 3.1430 data: 0.0089 max mem: 33300 Epoch: [8] [ 50/4276] eta: 3:44:50 lr: 4.0888886355193376e-05 loss: 0.1679 (0.1747) time: 3.1387 data: 0.0089 max mem: 33300 Epoch: [8] [ 60/4276] eta: 3:43:36 lr: 4.0886195913374344e-05 loss: 0.1649 (0.1740) time: 3.1345 data: 0.0085 max mem: 33300 Epoch: [8] [ 70/4276] eta: 3:42:45 lr: 4.08835054518841e-05 loss: 0.1672 (0.1730) time: 3.1407 data: 0.0089 max mem: 33300 Epoch: [8] [ 80/4276] eta: 3:41:33 lr: 4.088081497072106e-05 loss: 0.1701 (0.1731) time: 3.1251 data: 0.0093 max mem: 33300 Epoch: [8] [ 90/4276] eta: 3:40:40 lr: 4.0878124469883636e-05 loss: 0.1607 (0.1715) time: 3.1108 data: 0.0097 max mem: 33300 Epoch: [8] [ 100/4276] eta: 3:39:57 lr: 4.087543394937026e-05 loss: 0.1607 (0.1731) time: 3.1278 data: 0.0100 max mem: 33300 Epoch: [8] [ 110/4276] eta: 3:39:03 lr: 4.0872743409179326e-05 loss: 0.1744 (0.1734) time: 3.1179 data: 0.0092 max mem: 33300 Epoch: [8] [ 120/4276] eta: 3:38:07 lr: 4.087005284930927e-05 loss: 0.1630 (0.1731) time: 3.0930 data: 0.0083 max mem: 33300 Epoch: [8] [ 130/4276] eta: 3:37:26 lr: 4.086736226975851e-05 loss: 0.1755 (0.1735) time: 3.1010 data: 0.0080 max mem: 33300 Epoch: [8] [ 140/4276] eta: 3:36:40 lr: 4.086467167052545e-05 loss: 0.1701 (0.1731) time: 3.1075 data: 0.0084 max mem: 33300 Epoch: [8] [ 150/4276] eta: 3:35:52 lr: 4.086198105160851e-05 loss: 0.1620 (0.1729) time: 3.0900 data: 0.0092 max mem: 33300 Epoch: [8] [ 160/4276] eta: 3:35:09 lr: 4.08592904130061e-05 loss: 0.1622 (0.1727) time: 3.0882 data: 0.0088 max mem: 33300 Epoch: [8] [ 170/4276] eta: 3:34:27 lr: 4.085659975471666e-05 loss: 0.1720 (0.1730) time: 3.0940 data: 0.0082 max mem: 33300 Epoch: [8] [ 180/4276] eta: 3:33:53 lr: 4.085390907673858e-05 loss: 0.1758 (0.1732) time: 3.1072 data: 0.0094 max mem: 
33300 Epoch: [8] [ 190/4276] eta: 3:33:17 lr: 4.085121837907028e-05 loss: 0.1758 (0.1733) time: 3.1169 data: 0.0100 max mem: 33300 Epoch: [8] [ 200/4276] eta: 3:32:44 lr: 4.0848527661710176e-05 loss: 0.1801 (0.1747) time: 3.1188 data: 0.0090 max mem: 33300 Epoch: [8] [ 210/4276] eta: 3:32:05 lr: 4.0845836924656686e-05 loss: 0.1883 (0.1751) time: 3.1059 data: 0.0082 max mem: 33300 Epoch: [8] [ 220/4276] eta: 3:31:33 lr: 4.084314616790822e-05 loss: 0.1882 (0.1752) time: 3.1066 data: 0.0088 max mem: 33300 Epoch: [8] [ 230/4276] eta: 3:30:56 lr: 4.084045539146321e-05 loss: 0.1683 (0.1747) time: 3.1138 data: 0.0094 max mem: 33300 Epoch: [8] [ 240/4276] eta: 3:30:25 lr: 4.0837764595320036e-05 loss: 0.1803 (0.1750) time: 3.1154 data: 0.0091 max mem: 33300 Epoch: [8] [ 250/4276] eta: 3:29:50 lr: 4.083507377947715e-05 loss: 0.1898 (0.1763) time: 3.1167 data: 0.0082 max mem: 33300 Epoch: [8] [ 260/4276] eta: 3:29:15 lr: 4.083238294393294e-05 loss: 0.1854 (0.1762) time: 3.1037 data: 0.0082 max mem: 33300 Epoch: [8] [ 270/4276] eta: 3:28:43 lr: 4.082969208868582e-05 loss: 0.1708 (0.1765) time: 3.1125 data: 0.0086 max mem: 33300 Epoch: [8] [ 280/4276] eta: 3:28:15 lr: 4.0827001213734205e-05 loss: 0.1758 (0.1766) time: 3.1346 data: 0.0088 max mem: 33300 Epoch: [8] [ 290/4276] eta: 3:27:46 lr: 4.0824310319076514e-05 loss: 0.1671 (0.1763) time: 3.1449 data: 0.0083 max mem: 33300 Epoch: [8] [ 300/4276] eta: 3:27:15 lr: 4.082161940471116e-05 loss: 0.1605 (0.1761) time: 3.1371 data: 0.0087 max mem: 33300 Epoch: [8] [ 310/4276] eta: 3:26:42 lr: 4.081892847063655e-05 loss: 0.1640 (0.1758) time: 3.1211 data: 0.0087 max mem: 33300 Epoch: [8] [ 320/4276] eta: 3:26:12 lr: 4.0816237516851094e-05 loss: 0.1692 (0.1765) time: 3.1254 data: 0.0076 max mem: 33300 Epoch: [8] [ 330/4276] eta: 3:25:43 lr: 4.0813546543353215e-05 loss: 0.1746 (0.1765) time: 3.1436 data: 0.0079 max mem: 33300 Epoch: [8] [ 340/4276] eta: 3:25:14 lr: 4.081085555014131e-05 loss: 0.1638 (0.1760) time: 3.1463 data: 0.0082 
max mem: 33300 Epoch: [8] [ 350/4276] eta: 3:24:40 lr: 4.08081645372138e-05 loss: 0.1519 (0.1755) time: 3.1279 data: 0.0080 max mem: 33300 Epoch: [8] [ 360/4276] eta: 3:24:08 lr: 4.080547350456909e-05 loss: 0.1754 (0.1761) time: 3.1119 data: 0.0080 max mem: 33300 Epoch: [8] [ 370/4276] eta: 3:23:37 lr: 4.0802782452205594e-05 loss: 0.1551 (0.1753) time: 3.1256 data: 0.0084 max mem: 33300 Epoch: [8] [ 380/4276] eta: 3:23:02 lr: 4.080009138012173e-05 loss: 0.1520 (0.1755) time: 3.1147 data: 0.0086 max mem: 33300 Epoch: [8] [ 390/4276] eta: 3:22:27 lr: 4.0797400288315883e-05 loss: 0.1744 (0.1754) time: 3.0874 data: 0.0083 max mem: 33300 Epoch: [8] [ 400/4276] eta: 3:21:56 lr: 4.079470917678649e-05 loss: 0.1733 (0.1753) time: 3.1079 data: 0.0081 max mem: 33300 Epoch: [8] [ 410/4276] eta: 3:21:25 lr: 4.079201804553195e-05 loss: 0.1714 (0.1750) time: 3.1310 data: 0.0079 max mem: 33300 Epoch: [8] [ 420/4276] eta: 3:20:53 lr: 4.078932689455068e-05 loss: 0.1679 (0.1750) time: 3.1213 data: 0.0072 max mem: 33300 Epoch: [8] [ 430/4276] eta: 3:20:20 lr: 4.078663572384108e-05 loss: 0.1728 (0.1752) time: 3.1108 data: 0.0068 max mem: 33300 Epoch: [8] [ 440/4276] eta: 3:19:49 lr: 4.078394453340156e-05 loss: 0.1737 (0.1752) time: 3.1176 data: 0.0069 max mem: 33300 Epoch: [8] [ 450/4276] eta: 3:19:14 lr: 4.078125332323053e-05 loss: 0.1656 (0.1752) time: 3.1079 data: 0.0072 max mem: 33300 Epoch: [8] [ 460/4276] eta: 3:18:44 lr: 4.0778562093326394e-05 loss: 0.1574 (0.1746) time: 3.1138 data: 0.0077 max mem: 33300 Epoch: [8] [ 470/4276] eta: 3:18:15 lr: 4.077587084368757e-05 loss: 0.1541 (0.1743) time: 3.1419 data: 0.0080 max mem: 33300 Epoch: [8] [ 480/4276] eta: 3:17:40 lr: 4.077317957431246e-05 loss: 0.1565 (0.1742) time: 3.1120 data: 0.0075 max mem: 33300 Epoch: [8] [ 490/4276] eta: 3:17:07 lr: 4.077048828519948e-05 loss: 0.1521 (0.1740) time: 3.0926 data: 0.0075 max mem: 33300 Epoch: [8] [ 500/4276] eta: 3:16:34 lr: 4.076779697634702e-05 loss: 0.1619 (0.1740) time: 3.1027 data: 
0.0076 max mem: 33300 Epoch: [8] [ 510/4276] eta: 3:15:58 lr: 4.076510564775351e-05 loss: 0.1617 (0.1737) time: 3.0763 data: 0.0071 max mem: 33300 Epoch: [8] [ 520/4276] eta: 3:15:28 lr: 4.076241429941734e-05 loss: 0.1553 (0.1737) time: 3.0975 data: 0.0077 max mem: 33300 Epoch: [8] [ 530/4276] eta: 3:14:56 lr: 4.075972293133692e-05 loss: 0.1731 (0.1741) time: 3.1297 data: 0.0078 max mem: 33300 Epoch: [8] [ 540/4276] eta: 3:14:26 lr: 4.075703154351065e-05 loss: 0.1728 (0.1738) time: 3.1238 data: 0.0072 max mem: 33300 Epoch: [8] [ 550/4276] eta: 3:13:54 lr: 4.075434013593696e-05 loss: 0.1770 (0.1742) time: 3.1227 data: 0.0076 max mem: 33300 Epoch: [8] [ 560/4276] eta: 3:13:24 lr: 4.0751648708614234e-05 loss: 0.1875 (0.1744) time: 3.1298 data: 0.0079 max mem: 33300 Epoch: [8] [ 570/4276] eta: 3:12:47 lr: 4.0748957261540885e-05 loss: 0.1773 (0.1745) time: 3.0889 data: 0.0074 max mem: 33300 Epoch: [8] [ 580/4276] eta: 3:12:10 lr: 4.0746265794715324e-05 loss: 0.1669 (0.1745) time: 3.0318 data: 0.0071 max mem: 33300 Epoch: [8] [ 590/4276] eta: 3:11:30 lr: 4.0743574308135946e-05 loss: 0.1574 (0.1743) time: 3.0082 data: 0.0070 max mem: 33300 Epoch: [8] [ 600/4276] eta: 3:10:52 lr: 4.0740882801801156e-05 loss: 0.1599 (0.1742) time: 2.9886 data: 0.0068 max mem: 33300 Epoch: [8] [ 610/4276] eta: 3:10:12 lr: 4.0738191275709364e-05 loss: 0.1666 (0.1740) time: 2.9855 data: 0.0068 max mem: 33300 Epoch: [8] [ 620/4276] eta: 3:09:34 lr: 4.073549972985898e-05 loss: 0.1594 (0.1739) time: 2.9877 data: 0.0069 max mem: 33300 Epoch: [8] [ 630/4276] eta: 3:08:58 lr: 4.0732808164248395e-05 loss: 0.1605 (0.1739) time: 3.0102 data: 0.0071 max mem: 33300 Epoch: [8] [ 640/4276] eta: 3:08:25 lr: 4.0730116578876027e-05 loss: 0.1730 (0.1739) time: 3.0478 data: 0.0072 max mem: 33300 Epoch: [8] [ 650/4276] eta: 3:07:51 lr: 4.072742497374027e-05 loss: 0.1730 (0.1740) time: 3.0670 data: 0.0074 max mem: 33300 Epoch: [8] [ 660/4276] eta: 3:07:21 lr: 4.072473334883953e-05 loss: 0.1780 (0.1741) time: 
3.0969 data: 0.0074 max mem: 33300 Epoch: [8] [ 670/4276] eta: 3:06:48 lr: 4.072204170417222e-05 loss: 0.1647 (0.1741) time: 3.1020 data: 0.0073 max mem: 33300 Epoch: [8] [ 680/4276] eta: 3:06:16 lr: 4.0719350039736726e-05 loss: 0.1677 (0.1740) time: 3.0800 data: 0.0072 max mem: 33300 Epoch: [8] [ 690/4276] eta: 3:05:44 lr: 4.0716658355531456e-05 loss: 0.1700 (0.1740) time: 3.0892 data: 0.0075 max mem: 33300 Epoch: [8] [ 700/4276] eta: 3:05:12 lr: 4.0713966651554816e-05 loss: 0.1731 (0.1740) time: 3.0935 data: 0.0077 max mem: 33300 Epoch: [8] [ 710/4276] eta: 3:04:40 lr: 4.071127492780522e-05 loss: 0.1816 (0.1741) time: 3.0893 data: 0.0075 max mem: 33300 Epoch: [8] [ 720/4276] eta: 3:04:09 lr: 4.070858318428105e-05 loss: 0.1661 (0.1739) time: 3.0909 data: 0.0075 max mem: 33300 Epoch: [8] [ 730/4276] eta: 3:03:36 lr: 4.0705891420980716e-05 loss: 0.1587 (0.1739) time: 3.0858 data: 0.0078 max mem: 33300 Epoch: [8] [ 740/4276] eta: 3:03:03 lr: 4.070319963790263e-05 loss: 0.1650 (0.1739) time: 3.0757 data: 0.0076 max mem: 33300 Epoch: [8] [ 750/4276] eta: 3:02:34 lr: 4.070050783504517e-05 loss: 0.1670 (0.1738) time: 3.1059 data: 0.0076 max mem: 33300 Epoch: [8] [ 760/4276] eta: 3:02:04 lr: 4.069781601240676e-05 loss: 0.1588 (0.1737) time: 3.1344 data: 0.0076 max mem: 33300 Epoch: [8] [ 770/4276] eta: 3:01:34 lr: 4.069512416998578e-05 loss: 0.1598 (0.1737) time: 3.1348 data: 0.0071 max mem: 33300 Epoch: [8] [ 780/4276] eta: 3:01:04 lr: 4.0692432307780654e-05 loss: 0.1715 (0.1737) time: 3.1363 data: 0.0074 max mem: 33300 Epoch: [8] [ 790/4276] eta: 3:00:33 lr: 4.068974042578977e-05 loss: 0.1717 (0.1738) time: 3.1198 data: 0.0074 max mem: 33300 Epoch: [8] [ 800/4276] eta: 3:00:02 lr: 4.0687048524011515e-05 loss: 0.1717 (0.1738) time: 3.1105 data: 0.0075 max mem: 33300 Epoch: [8] [ 810/4276] eta: 2:59:33 lr: 4.068435660244431e-05 loss: 0.1610 (0.1738) time: 3.1283 data: 0.0077 max mem: 33300 Epoch: [8] [ 820/4276] eta: 2:59:02 lr: 4.068166466108656e-05 loss: 0.1469 (0.1735) 
time: 3.1332 data: 0.0073 max mem: 33300 Epoch: [8] [ 830/4276] eta: 2:58:32 lr: 4.0678972699936634e-05 loss: 0.1540 (0.1735) time: 3.1282 data: 0.0070 max mem: 33300 Epoch: [8] [ 840/4276] eta: 2:57:58 lr: 4.067628071899296e-05 loss: 0.1568 (0.1735) time: 3.0817 data: 0.0071 max mem: 33300 Epoch: [8] [ 850/4276] eta: 2:58:17 lr: 4.067358871825392e-05 loss: 0.1558 (0.1735) time: 3.6959 data: 0.0070 max mem: 33300 Epoch: [8] [ 860/4276] eta: 2:57:45 lr: 4.067089669771792e-05 loss: 0.1667 (0.1735) time: 3.7245 data: 0.0071 max mem: 33300 Epoch: [8] [ 870/4276] eta: 2:57:11 lr: 4.066820465738335e-05 loss: 0.1667 (0.1735) time: 3.0801 data: 0.0071 max mem: 33300 Epoch: [8] [ 880/4276] eta: 2:56:38 lr: 4.066551259724862e-05 loss: 0.1642 (0.1736) time: 3.0615 data: 0.0075 max mem: 33300 Epoch: [8] [ 890/4276] eta: 2:56:05 lr: 4.066282051731213e-05 loss: 0.1743 (0.1736) time: 3.0714 data: 0.0077 max mem: 33300 Epoch: [8] [ 900/4276] eta: 2:55:34 lr: 4.0660128417572265e-05 loss: 0.1789 (0.1735) time: 3.0982 data: 0.0073 max mem: 33300 Epoch: [8] [ 910/4276] eta: 2:55:02 lr: 4.065743629802743e-05 loss: 0.1775 (0.1736) time: 3.1045 data: 0.0072 max mem: 33300 Epoch: [8] [ 920/4276] eta: 2:54:31 lr: 4.065474415867601e-05 loss: 0.1716 (0.1736) time: 3.1121 data: 0.0077 max mem: 33300 Epoch: [8] [ 930/4276] eta: 2:53:59 lr: 4.0652051999516424e-05 loss: 0.1698 (0.1736) time: 3.1198 data: 0.0080 max mem: 33300 Epoch: [8] [ 940/4276] eta: 2:53:28 lr: 4.064935982054706e-05 loss: 0.1668 (0.1735) time: 3.1176 data: 0.0076 max mem: 33300 Epoch: [8] [ 950/4276] eta: 2:52:57 lr: 4.06466676217663e-05 loss: 0.1704 (0.1735) time: 3.1255 data: 0.0081 max mem: 33300 Epoch: [8] [ 960/4276] eta: 2:52:24 lr: 4.064397540317255e-05 loss: 0.1778 (0.1737) time: 3.1036 data: 0.0078 max mem: 33300 Epoch: [8] [ 970/4276] eta: 2:51:50 lr: 4.064128316476422e-05 loss: 0.1688 (0.1736) time: 3.0519 data: 0.0073 max mem: 33300 Epoch: [8] [ 980/4276] eta: 2:51:17 lr: 4.063859090653969e-05 loss: 0.1681 
(0.1736) time: 3.0372 data: 0.0078 max mem: 33300 Epoch: [8] [ 990/4276] eta: 2:50:43 lr: 4.0635898628497345e-05 loss: 0.1760 (0.1736) time: 3.0437 data: 0.0079 max mem: 33300 Epoch: [8] [1000/4276] eta: 2:50:08 lr: 4.06332063306356e-05 loss: 0.1583 (0.1735) time: 3.0176 data: 0.0082 max mem: 33300 Epoch: [8] [1010/4276] eta: 2:49:33 lr: 4.063051401295285e-05 loss: 0.1585 (0.1735) time: 3.0038 data: 0.0086 max mem: 33300 Epoch: [8] [1020/4276] eta: 2:48:58 lr: 4.0627821675447486e-05 loss: 0.1661 (0.1734) time: 3.0063 data: 0.0082 max mem: 33300 Epoch: [8] [1030/4276] eta: 2:48:24 lr: 4.062512931811789e-05 loss: 0.1661 (0.1733) time: 3.0075 data: 0.0079 max mem: 33300 Epoch: [8] [1040/4276] eta: 2:47:51 lr: 4.0622436940962475e-05 loss: 0.1521 (0.1732) time: 3.0356 data: 0.0081 max mem: 33300 Epoch: [8] [1050/4276] eta: 2:47:18 lr: 4.061974454397963e-05 loss: 0.1634 (0.1734) time: 3.0469 data: 0.0081 max mem: 33300 Epoch: [8] [1060/4276] eta: 2:46:45 lr: 4.061705212716773e-05 loss: 0.1639 (0.1734) time: 3.0439 data: 0.0084 max mem: 33300 Epoch: [8] [1070/4276] eta: 2:46:12 lr: 4.061435969052519e-05 loss: 0.1639 (0.1734) time: 3.0461 data: 0.0084 max mem: 33300 Epoch: [8] [1080/4276] eta: 2:45:38 lr: 4.0611667234050395e-05 loss: 0.1754 (0.1733) time: 3.0392 data: 0.0082 max mem: 33300 Epoch: [8] [1090/4276] eta: 2:45:05 lr: 4.060897475774174e-05 loss: 0.1774 (0.1734) time: 3.0324 data: 0.0078 max mem: 33300 Epoch: [8] [1100/4276] eta: 2:44:31 lr: 4.0606282261597623e-05 loss: 0.1579 (0.1734) time: 3.0257 data: 0.0080 max mem: 33300 Epoch: [8] [1110/4276] eta: 2:43:59 lr: 4.060358974561643e-05 loss: 0.1632 (0.1734) time: 3.0380 data: 0.0084 max mem: 33300 Epoch: [8] [1120/4276] eta: 2:43:24 lr: 4.060089720979655e-05 loss: 0.1742 (0.1734) time: 3.0267 data: 0.0078 max mem: 33300 Epoch: [8] [1130/4276] eta: 2:42:51 lr: 4.059820465413638e-05 loss: 0.1580 (0.1733) time: 3.0128 data: 0.0076 max mem: 33300 Epoch: [8] [1140/4276] eta: 2:42:18 lr: 4.0595512078634304e-05 loss: 
0.1520 (0.1733) time: 3.0275 data: 0.0082 max mem: 33300 Epoch: [8] [1150/4276] eta: 2:41:45 lr: 4.059281948328873e-05 loss: 0.1663 (0.1732) time: 3.0299 data: 0.0079 max mem: 33300 Epoch: [8] [1160/4276] eta: 2:41:12 lr: 4.0590126868098034e-05 loss: 0.1715 (0.1733) time: 3.0382 data: 0.0079 max mem: 33300 Epoch: [8] [1170/4276] eta: 2:40:40 lr: 4.058743423306062e-05 loss: 0.1847 (0.1736) time: 3.0522 data: 0.0082 max mem: 33300 Epoch: [8] [1180/4276] eta: 2:40:07 lr: 4.058474157817486e-05 loss: 0.1868 (0.1736) time: 3.0485 data: 0.0078 max mem: 33300 Epoch: [8] [1190/4276] eta: 2:39:36 lr: 4.058204890343916e-05 loss: 0.1715 (0.1737) time: 3.0623 data: 0.0081 max mem: 33300 Epoch: [8] [1200/4276] eta: 2:39:04 lr: 4.057935620885191e-05 loss: 0.1715 (0.1738) time: 3.0749 data: 0.0077 max mem: 33300 Epoch: [8] [1210/4276] eta: 2:38:31 lr: 4.057666349441149e-05 loss: 0.1809 (0.1738) time: 3.0540 data: 0.0073 max mem: 33300 Epoch: [8] [1220/4276] eta: 2:37:59 lr: 4.05739707601163e-05 loss: 0.1734 (0.1738) time: 3.0551 data: 0.0075 max mem: 33300 Epoch: [8] [1230/4276] eta: 2:37:27 lr: 4.057127800596472e-05 loss: 0.1766 (0.1739) time: 3.0522 data: 0.0077 max mem: 33300 Epoch: [8] [1240/4276] eta: 2:36:55 lr: 4.056858523195514e-05 loss: 0.1770 (0.1740) time: 3.0604 data: 0.0078 max mem: 33300 Epoch: [8] [1250/4276] eta: 2:36:23 lr: 4.0565892438085964e-05 loss: 0.1748 (0.1741) time: 3.0692 data: 0.0080 max mem: 33300 Epoch: [8] [1260/4276] eta: 2:35:50 lr: 4.056319962435557e-05 loss: 0.1583 (0.1740) time: 3.0394 data: 0.0085 max mem: 33300 Epoch: [8] [1270/4276] eta: 2:35:18 lr: 4.0560506790762335e-05 loss: 0.1605 (0.1740) time: 3.0331 data: 0.0083 max mem: 33300 Epoch: [8] [1280/4276] eta: 2:34:45 lr: 4.055781393730467e-05 loss: 0.1814 (0.1741) time: 3.0401 data: 0.0080 max mem: 33300 Epoch: [8] [1290/4276] eta: 2:34:13 lr: 4.055512106398095e-05 loss: 0.1814 (0.1742) time: 3.0374 data: 0.0080 max mem: 33300 Epoch: [8] [1300/4276] eta: 2:33:42 lr: 4.0552428170789555e-05 
loss: 0.1623 (0.1741) time: 3.0611 data: 0.0084 max mem: 33300 Epoch: [8] [1310/4276] eta: 2:33:10 lr: 4.0549735257728895e-05 loss: 0.1583 (0.1740) time: 3.0767 data: 0.0084 max mem: 33300 Epoch: [8] [1320/4276] eta: 2:32:38 lr: 4.0547042324797336e-05 loss: 0.1718 (0.1741) time: 3.0698 data: 0.0080 max mem: 33300 Epoch: [8] [1330/4276] eta: 2:32:07 lr: 4.054434937199328e-05 loss: 0.1680 (0.1740) time: 3.0684 data: 0.0073 max mem: 33300 Epoch: [8] [1340/4276] eta: 2:31:35 lr: 4.05416563993151e-05 loss: 0.1531 (0.1740) time: 3.0667 data: 0.0080 max mem: 33300 Epoch: [8] [1350/4276] eta: 2:31:04 lr: 4.053896340676119e-05 loss: 0.1615 (0.1739) time: 3.0732 data: 0.0085 max mem: 33300 Epoch: [8] [1360/4276] eta: 2:30:31 lr: 4.053627039432994e-05 loss: 0.1619 (0.1740) time: 3.0438 data: 0.0088 max mem: 33300 Epoch: [8] [1370/4276] eta: 2:29:58 lr: 4.053357736201973e-05 loss: 0.1585 (0.1738) time: 3.0136 data: 0.0091 max mem: 33300 Epoch: [8] [1380/4276] eta: 2:29:26 lr: 4.053088430982894e-05 loss: 0.1659 (0.1739) time: 3.0273 data: 0.0089 max mem: 33300 Epoch: [8] [1390/4276] eta: 2:28:53 lr: 4.0528191237755966e-05 loss: 0.1791 (0.1740) time: 3.0315 data: 0.0087 max mem: 33300 Epoch: [8] [1400/4276] eta: 2:28:22 lr: 4.0525498145799196e-05 loss: 0.1791 (0.1740) time: 3.0502 data: 0.0088 max mem: 33300 Epoch: [8] [1410/4276] eta: 2:27:50 lr: 4.0522805033957014e-05 loss: 0.1645 (0.1740) time: 3.0587 data: 0.0089 max mem: 33300 Epoch: [8] [1420/4276] eta: 2:27:18 lr: 4.0520111902227784e-05 loss: 0.1635 (0.1740) time: 3.0459 data: 0.0084 max mem: 33300 Epoch: [8] [1430/4276] eta: 2:26:46 lr: 4.051741875060992e-05 loss: 0.1606 (0.1740) time: 3.0492 data: 0.0079 max mem: 33300 Epoch: [8] [1440/4276] eta: 2:26:15 lr: 4.0514725579101785e-05 loss: 0.1660 (0.1740) time: 3.0608 data: 0.0079 max mem: 33300 Epoch: [8] [1450/4276] eta: 2:25:43 lr: 4.0512032387701775e-05 loss: 0.1752 (0.1740) time: 3.0672 data: 0.0082 max mem: 33300 Epoch: [8] [1460/4276] eta: 2:25:12 lr: 
4.050933917640826e-05 loss: 0.1752 (0.1740) time: 3.0673 data: 0.0082 max mem: 33300 Epoch: [8] [1470/4276] eta: 2:24:40 lr: 4.050664594521964e-05 loss: 0.1714 (0.1740) time: 3.0542 data: 0.0080 max mem: 33300 Epoch: [8] [1480/4276] eta: 2:24:08 lr: 4.05039526941343e-05 loss: 0.1675 (0.1739) time: 3.0499 data: 0.0082 max mem: 33300 Epoch: [8] [1490/4276] eta: 2:23:36 lr: 4.05012594231506e-05 loss: 0.1512 (0.1739) time: 3.0466 data: 0.0085 max mem: 33300 Epoch: [8] [1500/4276] eta: 2:23:05 lr: 4.049856613226694e-05 loss: 0.1681 (0.1739) time: 3.0404 data: 0.0082 max mem: 33300 Epoch: [8] [1510/4276] eta: 2:22:33 lr: 4.04958728214817e-05 loss: 0.1606 (0.1739) time: 3.0576 data: 0.0078 max mem: 33300 Epoch: [8] [1520/4276] eta: 2:22:02 lr: 4.049317949079327e-05 loss: 0.1512 (0.1738) time: 3.0589 data: 0.0075 max mem: 33300 Epoch: [8] [1530/4276] eta: 2:21:29 lr: 4.049048614020001e-05 loss: 0.1633 (0.1737) time: 3.0347 data: 0.0072 max mem: 33300 Epoch: [8] [1540/4276] eta: 2:20:57 lr: 4.048779276970032e-05 loss: 0.1667 (0.1738) time: 3.0155 data: 0.0074 max mem: 33300 Epoch: [8] [1550/4276] eta: 2:20:25 lr: 4.0485099379292565e-05 loss: 0.1744 (0.1738) time: 3.0091 data: 0.0077 max mem: 33300 Epoch: [8] [1560/4276] eta: 2:19:52 lr: 4.0482405968975155e-05 loss: 0.1592 (0.1738) time: 3.0113 data: 0.0081 max mem: 33300 Epoch: [8] [1570/4276] eta: 2:19:20 lr: 4.0479712538746454e-05 loss: 0.1592 (0.1737) time: 3.0107 data: 0.0082 max mem: 33300 Epoch: [8] [1580/4276] eta: 2:18:48 lr: 4.0477019088604834e-05 loss: 0.1588 (0.1736) time: 3.0201 data: 0.0082 max mem: 33300 Epoch: [8] [1590/4276] eta: 2:18:17 lr: 4.0474325618548684e-05 loss: 0.1637 (0.1736) time: 3.0472 data: 0.0081 max mem: 33300 Epoch: [8] [1600/4276] eta: 2:17:45 lr: 4.0471632128576384e-05 loss: 0.1644 (0.1735) time: 3.0488 data: 0.0078 max mem: 33300 Epoch: [8] [1610/4276] eta: 2:17:13 lr: 4.046893861868632e-05 loss: 0.1547 (0.1733) time: 3.0305 data: 0.0076 max mem: 33300 Epoch: [8] [1620/4276] eta: 2:16:42 
lr: 4.046624508887686e-05 loss: 0.1544 (0.1732) time: 3.0356 data: 0.0077 max mem: 33300 Epoch: [8] [1630/4276] eta: 2:16:10 lr: 4.0463551539146394e-05 loss: 0.1658 (0.1733) time: 3.0395 data: 0.0082 max mem: 33300 Epoch: [8] [1640/4276] eta: 2:15:38 lr: 4.04608579694933e-05 loss: 0.1707 (0.1733) time: 3.0260 data: 0.0084 max mem: 33300 Epoch: [8] [1650/4276] eta: 2:15:06 lr: 4.045816437991594e-05 loss: 0.1622 (0.1732) time: 3.0281 data: 0.0076 max mem: 33300 Epoch: [8] [1660/4276] eta: 2:14:35 lr: 4.0455470770412716e-05 loss: 0.1622 (0.1733) time: 3.0410 data: 0.0077 max mem: 33300 Epoch: [8] [1670/4276] eta: 2:14:04 lr: 4.0452777140982004e-05 loss: 0.1658 (0.1733) time: 3.0743 data: 0.0086 max mem: 33300 Epoch: [8] [1680/4276] eta: 2:13:32 lr: 4.045008349162216e-05 loss: 0.1681 (0.1734) time: 3.0602 data: 0.0085 max mem: 33300 Epoch: [8] [1690/4276] eta: 2:13:00 lr: 4.044738982233158e-05 loss: 0.1681 (0.1734) time: 3.0216 data: 0.0081 max mem: 33300 Epoch: [8] [1700/4276] eta: 2:12:29 lr: 4.044469613310864e-05 loss: 0.1683 (0.1735) time: 3.0482 data: 0.0081 max mem: 33300 Epoch: [8] [1710/4276] eta: 2:11:58 lr: 4.044200242395172e-05 loss: 0.1766 (0.1735) time: 3.0723 data: 0.0082 max mem: 33300 Epoch: [8] [1720/4276] eta: 2:11:27 lr: 4.043930869485919e-05 loss: 0.1766 (0.1735) time: 3.0800 data: 0.0075 max mem: 33300 Epoch: [8] [1730/4276] eta: 2:10:56 lr: 4.043661494582943e-05 loss: 0.1747 (0.1735) time: 3.0617 data: 0.0077 max mem: 33300 Epoch: [8] [1740/4276] eta: 2:10:24 lr: 4.0433921176860815e-05 loss: 0.1730 (0.1736) time: 3.0372 data: 0.0081 max mem: 33300 Epoch: [8] [1750/4276] eta: 2:09:53 lr: 4.043122738795173e-05 loss: 0.1721 (0.1735) time: 3.0408 data: 0.0074 max mem: 33300 Epoch: [8] [1760/4276] eta: 2:09:22 lr: 4.042853357910054e-05 loss: 0.1556 (0.1734) time: 3.0547 data: 0.0076 max mem: 33300 Epoch: [8] [1770/4276] eta: 2:08:51 lr: 4.042583975030563e-05 loss: 0.1604 (0.1734) time: 3.0771 data: 0.0087 max mem: 33300 Epoch: [8] [1780/4276] eta: 
2:08:20 lr: 4.042314590156536e-05 loss: 0.1781 (0.1734) time: 3.0808 data: 0.0088 max mem: 33300 Epoch: [8] [1790/4276] eta: 2:07:48 lr: 4.0420452032878125e-05 loss: 0.1753 (0.1734) time: 3.0511 data: 0.0079 max mem: 33300 Epoch: [8] [1800/4276] eta: 2:07:16 lr: 4.041775814424229e-05 loss: 0.1703 (0.1734) time: 3.0186 data: 0.0071 max mem: 33300 Epoch: [8] [1810/4276] eta: 2:06:45 lr: 4.041506423565623e-05 loss: 0.1711 (0.1734) time: 3.0228 data: 0.0071 max mem: 33300 Epoch: [8] [1820/4276] eta: 2:06:13 lr: 4.041237030711833e-05 loss: 0.1665 (0.1734) time: 3.0455 data: 0.0078 max mem: 33300 Epoch: [8] [1830/4276] eta: 2:05:42 lr: 4.040967635862695e-05 loss: 0.1640 (0.1734) time: 3.0668 data: 0.0079 max mem: 33300 Epoch: [8] [1840/4276] eta: 2:05:11 lr: 4.040698239018047e-05 loss: 0.1585 (0.1733) time: 3.0386 data: 0.0076 max mem: 33300 Epoch: [8] [1850/4276] eta: 2:04:39 lr: 4.0404288401777254e-05 loss: 0.1655 (0.1734) time: 3.0036 data: 0.0076 max mem: 33300 Epoch: [8] [1860/4276] eta: 2:04:07 lr: 4.0401594393415694e-05 loss: 0.1685 (0.1733) time: 3.0197 data: 0.0084 max mem: 33300 Epoch: [8] [1870/4276] eta: 2:03:35 lr: 4.0398900365094155e-05 loss: 0.1685 (0.1734) time: 3.0161 data: 0.0084 max mem: 33300 Epoch: [8] [1880/4276] eta: 2:03:04 lr: 4.039620631681101e-05 loss: 0.1761 (0.1734) time: 3.0364 data: 0.0074 max mem: 33300 Epoch: [8] [1890/4276] eta: 2:02:33 lr: 4.039351224856463e-05 loss: 0.1720 (0.1734) time: 3.0484 data: 0.0079 max mem: 33300 Epoch: [8] [1900/4276] eta: 2:02:01 lr: 4.0390818160353395e-05 loss: 0.1555 (0.1733) time: 3.0269 data: 0.0081 max mem: 33300 Epoch: [8] [1910/4276] eta: 2:01:30 lr: 4.0388124052175664e-05 loss: 0.1682 (0.1733) time: 3.0487 data: 0.0077 max mem: 33300 Epoch: [8] [1920/4276] eta: 2:00:59 lr: 4.038542992402982e-05 loss: 0.1730 (0.1733) time: 3.0645 data: 0.0077 max mem: 33300 Epoch: [8] [1930/4276] eta: 2:00:28 lr: 4.038273577591423e-05 loss: 0.1730 (0.1732) time: 3.0522 data: 0.0076 max mem: 33300 Epoch: [8] 
[1940/4276] eta: 1:59:57 lr: 4.0380041607827274e-05 loss: 0.1725 (0.1733) time: 3.0578 data: 0.0082 max mem: 33300 Epoch: [8] [1950/4276] eta: 1:59:26 lr: 4.0377347419767315e-05 loss: 0.1715 (0.1732) time: 3.0714 data: 0.0085 max mem: 33300 Epoch: [8] [1960/4276] eta: 1:58:55 lr: 4.037465321173273e-05 loss: 0.1592 (0.1732) time: 3.0737 data: 0.0079 max mem: 33300 Epoch: [8] [1970/4276] eta: 1:58:24 lr: 4.037195898372188e-05 loss: 0.1527 (0.1731) time: 3.0664 data: 0.0076 max mem: 33300 Epoch: [8] [1980/4276] eta: 1:57:54 lr: 4.0369264735733145e-05 loss: 0.1527 (0.1730) time: 3.0730 data: 0.0077 max mem: 33300 Epoch: [8] [1990/4276] eta: 1:57:23 lr: 4.0366570467764884e-05 loss: 0.1647 (0.1730) time: 3.0882 data: 0.0076 max mem: 33300 Epoch: [8] [2000/4276] eta: 1:56:51 lr: 4.036387617981548e-05 loss: 0.1795 (0.1730) time: 3.0603 data: 0.0078 max mem: 33300 Epoch: [8] [2010/4276] eta: 1:56:21 lr: 4.03611818718833e-05 loss: 0.1696 (0.1729) time: 3.0571 data: 0.0084 max mem: 33300 Epoch: [8] [2020/4276] eta: 1:55:50 lr: 4.035848754396671e-05 loss: 0.1731 (0.1730) time: 3.0823 data: 0.0085 max mem: 33300 Epoch: [8] [2030/4276] eta: 1:55:19 lr: 4.0355793196064077e-05 loss: 0.1655 (0.1729) time: 3.0863 data: 0.0085 max mem: 33300 Epoch: [8] [2040/4276] eta: 1:54:48 lr: 4.035309882817378e-05 loss: 0.1619 (0.1729) time: 3.0895 data: 0.0083 max mem: 33300 Epoch: [8] [2050/4276] eta: 1:54:16 lr: 4.0350404440294174e-05 loss: 0.1717 (0.1730) time: 3.0317 data: 0.0073 max mem: 33300 Epoch: [8] [2060/4276] eta: 1:53:45 lr: 4.034771003242365e-05 loss: 0.1658 (0.1729) time: 3.0006 data: 0.0066 max mem: 33300 Epoch: [8] [2070/4276] eta: 1:53:14 lr: 4.034501560456055e-05 loss: 0.1595 (0.1729) time: 3.0385 data: 0.0066 max mem: 33300 Epoch: [8] [2080/4276] eta: 1:52:43 lr: 4.0342321156703256e-05 loss: 0.1672 (0.1729) time: 3.0469 data: 0.0071 max mem: 33300 Epoch: [8] [2090/4276] eta: 1:52:12 lr: 4.0339626688850134e-05 loss: 0.1801 (0.1729) time: 3.0429 data: 0.0074 max mem: 33300 
Epoch: [8] [2100/4276] eta: 1:51:40 lr: 4.0336932200999554e-05 loss: 0.1747 (0.1729) time: 3.0110 data: 0.0070 max mem: 33300 Epoch: [8] [2110/4276] eta: 1:51:08 lr: 4.033423769314988e-05 loss: 0.1577 (0.1729) time: 2.9815 data: 0.0073 max mem: 33300 Epoch: [8] [2120/4276] eta: 1:50:36 lr: 4.033154316529948e-05 loss: 0.1422 (0.1727) time: 2.9826 data: 0.0079 max mem: 33300 Epoch: [8] [2130/4276] eta: 1:50:04 lr: 4.032884861744672e-05 loss: 0.1328 (0.1727) time: 2.9821 data: 0.0081 max mem: 33300 Epoch: [8] [2140/4276] eta: 1:49:33 lr: 4.0326154049589965e-05 loss: 0.1676 (0.1727) time: 3.0067 data: 0.0082 max mem: 33300 Epoch: [8] [2150/4276] eta: 1:49:02 lr: 4.032345946172759e-05 loss: 0.1734 (0.1727) time: 3.0195 data: 0.0084 max mem: 33300 Epoch: [8] [2160/4276] eta: 1:48:30 lr: 4.032076485385794e-05 loss: 0.1588 (0.1727) time: 2.9905 data: 0.0083 max mem: 33300 Epoch: [8] [2170/4276] eta: 1:47:58 lr: 4.0318070225979406e-05 loss: 0.1648 (0.1727) time: 2.9635 data: 0.0085 max mem: 33300 Epoch: [8] [2180/4276] eta: 1:47:26 lr: 4.031537557809034e-05 loss: 0.1791 (0.1728) time: 2.9355 data: 0.0089 max mem: 33300 Epoch: [8] [2190/4276] eta: 1:46:54 lr: 4.031268091018911e-05 loss: 0.1724 (0.1728) time: 2.9525 data: 0.0089 max mem: 33300 Epoch: [8] [2200/4276] eta: 1:46:22 lr: 4.0309986222274083e-05 loss: 0.1724 (0.1728) time: 2.9740 data: 0.0090 max mem: 33300 Epoch: [8] [2210/4276] eta: 1:45:51 lr: 4.030729151434362e-05 loss: 0.1745 (0.1728) time: 2.9675 data: 0.0090 max mem: 33300 Epoch: [8] [2220/4276] eta: 1:45:19 lr: 4.0304596786396084e-05 loss: 0.1763 (0.1729) time: 2.9814 data: 0.0090 max mem: 33300 Epoch: [8] [2230/4276] eta: 1:44:47 lr: 4.030190203842985e-05 loss: 0.1651 (0.1728) time: 2.9745 data: 0.0088 max mem: 33300 Epoch: [8] [2240/4276] eta: 1:44:15 lr: 4.029920727044327e-05 loss: 0.1486 (0.1726) time: 2.9526 data: 0.0081 max mem: 33300 Epoch: [8] [2250/4276] eta: 1:43:44 lr: 4.029651248243471e-05 loss: 0.1470 (0.1726) time: 2.9583 data: 0.0079 max mem: 
33300 Epoch: [8] [2260/4276] eta: 1:43:12 lr: 4.029381767440255e-05 loss: 0.1598 (0.1726) time: 2.9469 data: 0.0078 max mem: 33300 Epoch: [8] [2270/4276] eta: 1:42:40 lr: 4.029112284634512e-05 loss: 0.1652 (0.1726) time: 2.9319 data: 0.0076 max mem: 33300 Epoch: [8] [2280/4276] eta: 1:42:08 lr: 4.0288427998260806e-05 loss: 0.1678 (0.1726) time: 2.9329 data: 0.0079 max mem: 33300 Epoch: [8] [2290/4276] eta: 1:41:36 lr: 4.028573313014798e-05 loss: 0.1692 (0.1726) time: 2.9438 data: 0.0085 max mem: 33300 Epoch: [8] [2300/4276] eta: 1:41:04 lr: 4.0283038242004975e-05 loss: 0.1577 (0.1725) time: 2.9482 data: 0.0085 max mem: 33300 Epoch: [8] [2310/4276] eta: 1:40:32 lr: 4.028034333383018e-05 loss: 0.1650 (0.1725) time: 2.9281 data: 0.0078 max mem: 33300 Epoch: [8] [2320/4276] eta: 1:40:01 lr: 4.027764840562194e-05 loss: 0.1680 (0.1725) time: 2.9240 data: 0.0078 max mem: 33300 Epoch: [8] [2330/4276] eta: 1:39:29 lr: 4.027495345737862e-05 loss: 0.1627 (0.1725) time: 2.9390 data: 0.0082 max mem: 33300 Epoch: [8] [2340/4276] eta: 1:38:57 lr: 4.02722584890986e-05 loss: 0.1622 (0.1725) time: 2.9645 data: 0.0086 max mem: 33300 Epoch: [8] [2350/4276] eta: 1:38:26 lr: 4.0269563500780214e-05 loss: 0.1609 (0.1724) time: 2.9662 data: 0.0086 max mem: 33300 Epoch: [8] [2360/4276] eta: 1:37:54 lr: 4.0266868492421836e-05 loss: 0.1638 (0.1724) time: 2.9613 data: 0.0084 max mem: 33300 Epoch: [8] [2370/4276] eta: 1:37:23 lr: 4.0264173464021826e-05 loss: 0.1665 (0.1724) time: 2.9492 data: 0.0087 max mem: 33300 Epoch: [8] [2380/4276] eta: 1:36:51 lr: 4.026147841557854e-05 loss: 0.1713 (0.1724) time: 2.9387 data: 0.0084 max mem: 33300 Epoch: [8] [2390/4276] eta: 1:36:20 lr: 4.0258783347090346e-05 loss: 0.1579 (0.1723) time: 2.9669 data: 0.0088 max mem: 33300 Epoch: [8] [2400/4276] eta: 1:35:49 lr: 4.0256088258555604e-05 loss: 0.1533 (0.1724) time: 2.9954 data: 0.0098 max mem: 33300 Epoch: [8] [2410/4276] eta: 1:35:17 lr: 4.025339314997267e-05 loss: 0.1546 (0.1723) time: 2.9906 data: 0.0086 
max mem: 33300 Epoch: [8] [2420/4276] eta: 1:34:46 lr: 4.0250698021339896e-05 loss: 0.1520 (0.1722) time: 2.9816 data: 0.0081 max mem: 33300 Epoch: [8] [2430/4276] eta: 1:34:15 lr: 4.0248002872655643e-05 loss: 0.1596 (0.1723) time: 2.9848 data: 0.0085 max mem: 33300 Epoch: [8] [2440/4276] eta: 1:33:44 lr: 4.0245307703918286e-05 loss: 0.1619 (0.1722) time: 3.0025 data: 0.0077 max mem: 33300 Epoch: [8] [2450/4276] eta: 1:33:13 lr: 4.024261251512617e-05 loss: 0.1585 (0.1723) time: 3.0130 data: 0.0080 max mem: 33300 Epoch: [8] [2460/4276] eta: 1:32:42 lr: 4.0239917306277655e-05 loss: 0.1693 (0.1722) time: 2.9907 data: 0.0084 max mem: 33300 Epoch: [8] [2470/4276] eta: 1:32:10 lr: 4.0237222077371095e-05 loss: 0.1684 (0.1723) time: 2.9801 data: 0.0084 max mem: 33300 Epoch: [8] [2480/4276] eta: 1:31:39 lr: 4.0234526828404857e-05 loss: 0.1701 (0.1722) time: 2.9855 data: 0.0082 max mem: 33300 Epoch: [8] [2490/4276] eta: 1:31:08 lr: 4.02318315593773e-05 loss: 0.1681 (0.1722) time: 2.9993 data: 0.0077 max mem: 33300 Epoch: [8] [2500/4276] eta: 1:30:37 lr: 4.022913627028677e-05 loss: 0.1654 (0.1722) time: 2.9942 data: 0.0080 max mem: 33300 Epoch: [8] [2510/4276] eta: 1:30:06 lr: 4.022644096113163e-05 loss: 0.1673 (0.1722) time: 2.9816 data: 0.0085 max mem: 33300 Epoch: [8] [2520/4276] eta: 1:29:35 lr: 4.0223745631910244e-05 loss: 0.1579 (0.1721) time: 2.9868 data: 0.0080 max mem: 33300 Epoch: [8] [2530/4276] eta: 1:29:03 lr: 4.0221050282620956e-05 loss: 0.1366 (0.1720) time: 2.9702 data: 0.0079 max mem: 33300 Epoch: [8] [2540/4276] eta: 1:28:32 lr: 4.021835491326212e-05 loss: 0.1420 (0.1719) time: 2.9766 data: 0.0080 max mem: 33300 Epoch: [8] [2550/4276] eta: 1:28:01 lr: 4.021565952383211e-05 loss: 0.1483 (0.1718) time: 3.0016 data: 0.0080 max mem: 33300 Epoch: [8] [2560/4276] eta: 1:27:30 lr: 4.0212964114329266e-05 loss: 0.1412 (0.1718) time: 2.9783 data: 0.0083 max mem: 33300 Epoch: [8] [2570/4276] eta: 1:26:59 lr: 4.0210268684751954e-05 loss: 0.1391 (0.1717) time: 2.9471 
data: 0.0094 max mem: 33300 Epoch: [8] [2580/4276] eta: 1:26:27 lr: 4.0207573235098525e-05 loss: 0.1580 (0.1717) time: 2.9399 data: 0.0102 max mem: 33300 Epoch: [8] [2590/4276] eta: 1:25:56 lr: 4.0204877765367325e-05 loss: 0.1552 (0.1716) time: 2.9436 data: 0.0096 max mem: 33300 Epoch: [8] [2600/4276] eta: 1:25:25 lr: 4.0202182275556735e-05 loss: 0.1654 (0.1717) time: 2.9726 data: 0.0087 max mem: 33300 Epoch: [8] [2610/4276] eta: 1:24:54 lr: 4.0199486765665074e-05 loss: 0.1733 (0.1716) time: 2.9791 data: 0.0077 max mem: 33300 Epoch: [8] [2620/4276] eta: 1:24:23 lr: 4.0196791235690715e-05 loss: 0.1645 (0.1716) time: 2.9711 data: 0.0073 max mem: 33300 Epoch: [8] [2630/4276] eta: 1:23:51 lr: 4.0194095685632013e-05 loss: 0.1590 (0.1716) time: 2.9703 data: 0.0078 max mem: 33300 Epoch: [8] [2640/4276] eta: 1:23:21 lr: 4.019140011548733e-05 loss: 0.1490 (0.1715) time: 2.9853 data: 0.0079 max mem: 33300 Epoch: [8] [2650/4276] eta: 1:22:50 lr: 4.0188704525255e-05 loss: 0.1716 (0.1715) time: 2.9934 data: 0.0081 max mem: 33300 Epoch: [8] [2660/4276] eta: 1:22:19 lr: 4.0186008914933385e-05 loss: 0.1692 (0.1715) time: 2.9775 data: 0.0081 max mem: 33300 Epoch: [8] [2670/4276] eta: 1:21:47 lr: 4.018331328452085e-05 loss: 0.1684 (0.1715) time: 2.9780 data: 0.0075 max mem: 33300 Epoch: [8] [2680/4276] eta: 1:21:16 lr: 4.018061763401573e-05 loss: 0.1730 (0.1715) time: 2.9813 data: 0.0078 max mem: 33300 Epoch: [8] [2690/4276] eta: 1:20:46 lr: 4.0177921963416376e-05 loss: 0.1693 (0.1715) time: 3.0038 data: 0.0080 max mem: 33300 Epoch: [8] [2700/4276] eta: 1:20:15 lr: 4.017522627272115e-05 loss: 0.1567 (0.1714) time: 3.0013 data: 0.0080 max mem: 33300 Epoch: [8] [2710/4276] eta: 1:19:44 lr: 4.0172530561928404e-05 loss: 0.1547 (0.1714) time: 2.9708 data: 0.0080 max mem: 33300 Epoch: [8] [2720/4276] eta: 1:19:13 lr: 4.016983483103649e-05 loss: 0.1482 (0.1713) time: 2.9643 data: 0.0079 max mem: 33300 Epoch: [8] [2730/4276] eta: 1:18:41 lr: 4.016713908004375e-05 loss: 0.1493 (0.1713) time: 
2.9574 data: 0.0085 max mem: 33300 Epoch: [8] [2740/4276] eta: 1:18:10 lr: 4.0164443308948554e-05 loss: 0.1733 (0.1713) time: 2.9626 data: 0.0088 max mem: 33300 Epoch: [8] [2750/4276] eta: 1:17:39 lr: 4.0161747517749235e-05 loss: 0.1747 (0.1714) time: 2.9735 data: 0.0091 max mem: 33300 Epoch: [8] [2760/4276] eta: 1:17:08 lr: 4.015905170644415e-05 loss: 0.1578 (0.1713) time: 2.9458 data: 0.0093 max mem: 33300 Epoch: [8] [2770/4276] eta: 1:16:37 lr: 4.0156355875031646e-05 loss: 0.1492 (0.1713) time: 2.9419 data: 0.0087 max mem: 33300 Epoch: [8] [2780/4276] eta: 1:16:06 lr: 4.0153660023510076e-05 loss: 0.1578 (0.1713) time: 2.9521 data: 0.0086 max mem: 33300 Epoch: [8] [2790/4276] eta: 1:15:35 lr: 4.015096415187779e-05 loss: 0.1721 (0.1713) time: 2.9781 data: 0.0085 max mem: 33300 Epoch: [8] [2800/4276] eta: 1:15:04 lr: 4.0148268260133145e-05 loss: 0.1670 (0.1713) time: 2.9994 data: 0.0082 max mem: 33300 Epoch: [8] [2810/4276] eta: 1:14:34 lr: 4.014557234827448e-05 loss: 0.1468 (0.1712) time: 2.9884 data: 0.0075 max mem: 33300 Epoch: [8] [2820/4276] eta: 1:14:03 lr: 4.014287641630014e-05 loss: 0.1462 (0.1711) time: 2.9850 data: 0.0075 max mem: 33300 Epoch: [8] [2830/4276] eta: 1:13:32 lr: 4.0140180464208496e-05 loss: 0.1510 (0.1710) time: 2.9821 data: 0.0075 max mem: 33300 Epoch: [8] [2840/4276] eta: 1:13:01 lr: 4.013748449199787e-05 loss: 0.1697 (0.1711) time: 3.0042 data: 0.0071 max mem: 33300 Epoch: [8] [2850/4276] eta: 1:12:30 lr: 4.013478849966662e-05 loss: 0.1818 (0.1711) time: 3.0185 data: 0.0074 max mem: 33300 Epoch: [8] [2860/4276] eta: 1:11:59 lr: 4.0132092487213093e-05 loss: 0.1725 (0.1711) time: 2.9731 data: 0.0077 max mem: 33300 Epoch: [8] [2870/4276] eta: 1:11:29 lr: 4.012939645463565e-05 loss: 0.1581 (0.1711) time: 2.9698 data: 0.0082 max mem: 33300 Epoch: [8] [2880/4276] eta: 1:10:58 lr: 4.012670040193263e-05 loss: 0.1636 (0.1711) time: 2.9937 data: 0.0084 max mem: 33300 Epoch: [8] [2890/4276] eta: 1:10:27 lr: 4.0124004329102375e-05 loss: 0.1686 
(0.1712) time: 2.9992 data: 0.0079 max mem: 33300 Epoch: [8] [2900/4276] eta: 1:09:56 lr: 4.012130823614323e-05 loss: 0.1570 (0.1711) time: 2.9892 data: 0.0084 max mem: 33300 Epoch: [8] [2910/4276] eta: 1:09:25 lr: 4.011861212305355e-05 loss: 0.1505 (0.1711) time: 2.9635 data: 0.0082 max mem: 33300 Epoch: [8] [2920/4276] eta: 1:08:54 lr: 4.0115915989831685e-05 loss: 0.1590 (0.1711) time: 2.9628 data: 0.0073 max mem: 33300 Epoch: [8] [2930/4276] eta: 1:08:24 lr: 4.011321983647597e-05 loss: 0.1590 (0.1710) time: 2.9627 data: 0.0073 max mem: 33300 Epoch: [8] [2940/4276] eta: 1:07:53 lr: 4.011052366298475e-05 loss: 0.1522 (0.1710) time: 2.9645 data: 0.0074 max mem: 33300 Epoch: [8] [2950/4276] eta: 1:07:22 lr: 4.010782746935638e-05 loss: 0.1660 (0.1710) time: 2.9488 data: 0.0075 max mem: 33300 Epoch: [8] [2960/4276] eta: 1:06:51 lr: 4.010513125558922e-05 loss: 0.1711 (0.1710) time: 2.9201 data: 0.0083 max mem: 33300 Epoch: [8] [2970/4276] eta: 1:06:20 lr: 4.010243502168158e-05 loss: 0.1690 (0.1710) time: 2.9132 data: 0.0086 max mem: 33300 Epoch: [8] [2980/4276] eta: 1:05:48 lr: 4.0099738767631824e-05 loss: 0.1751 (0.1710) time: 2.9176 data: 0.0084 max mem: 33300 Epoch: [8] [2990/4276] eta: 1:05:17 lr: 4.0097042493438306e-05 loss: 0.1538 (0.1709) time: 2.9118 data: 0.0078 max mem: 33300 Epoch: [8] [3000/4276] eta: 1:04:46 lr: 4.009434619909935e-05 loss: 0.1500 (0.1709) time: 2.9172 data: 0.0080 max mem: 33300 Epoch: [8] [3010/4276] eta: 1:04:15 lr: 4.00916498846133e-05 loss: 0.1601 (0.1709) time: 2.9175 data: 0.0086 max mem: 33300 Epoch: [8] [3020/4276] eta: 1:03:44 lr: 4.0088953549978527e-05 loss: 0.1531 (0.1708) time: 2.9065 data: 0.0081 max mem: 33300 Epoch: [8] [3030/4276] eta: 1:03:13 lr: 4.0086257195193354e-05 loss: 0.1544 (0.1708) time: 2.9148 data: 0.0074 max mem: 33300 Epoch: [8] [3040/4276] eta: 1:02:42 lr: 4.008356082025612e-05 loss: 0.1642 (0.1709) time: 2.9143 data: 0.0072 max mem: 33300 Epoch: [8] [3050/4276] eta: 1:02:11 lr: 4.0080864425165185e-05 loss: 
0.1598 (0.1708) time: 2.9014 data: 0.0070 max mem: 33300 Epoch: [8] [3060/4276] eta: 1:01:40 lr: 4.007816800991888e-05 loss: 0.1387 (0.1708) time: 2.8963 data: 0.0073 max mem: 33300 Epoch: [8] [3070/4276] eta: 1:01:09 lr: 4.0075471574515546e-05 loss: 0.1508 (0.1707) time: 2.9038 data: 0.0076 max mem: 33300 Epoch: [8] [3080/4276] eta: 1:00:38 lr: 4.007277511895354e-05 loss: 0.1636 (0.1707) time: 2.9110 data: 0.0074 max mem: 33300 Epoch: [8] [3090/4276] eta: 1:00:08 lr: 4.007007864323118e-05 loss: 0.1514 (0.1706) time: 2.9490 data: 0.0082 max mem: 33300 Epoch: [8] [3100/4276] eta: 0:59:37 lr: 4.0067382147346835e-05 loss: 0.1578 (0.1706) time: 3.0190 data: 0.0086 max mem: 33300 Epoch: [8] [3110/4276] eta: 0:59:07 lr: 4.006468563129883e-05 loss: 0.1531 (0.1706) time: 3.0429 data: 0.0079 max mem: 33300 Epoch: [8] [3120/4276] eta: 0:58:37 lr: 4.00619890950855e-05 loss: 0.1475 (0.1706) time: 3.0487 data: 0.0073 max mem: 33300 Epoch: [8] [3130/4276] eta: 0:58:06 lr: 4.0059292538705205e-05 loss: 0.1539 (0.1705) time: 3.0476 data: 0.0075 max mem: 33300 Epoch: [8] [3140/4276] eta: 0:57:36 lr: 4.0056595962156277e-05 loss: 0.1644 (0.1705) time: 3.0435 data: 0.0079 max mem: 33300 Epoch: [8] [3150/4276] eta: 0:57:05 lr: 4.0053899365437054e-05 loss: 0.1706 (0.1705) time: 3.0505 data: 0.0074 max mem: 33300 Epoch: [8] [3160/4276] eta: 0:56:35 lr: 4.005120274854587e-05 loss: 0.1706 (0.1705) time: 3.0554 data: 0.0071 max mem: 33300 Epoch: [8] [3170/4276] eta: 0:56:05 lr: 4.004850611148109e-05 loss: 0.1649 (0.1706) time: 3.0539 data: 0.0070 max mem: 33300 Epoch: [8] [3180/4276] eta: 0:55:34 lr: 4.004580945424102e-05 loss: 0.1649 (0.1705) time: 3.0451 data: 0.0070 max mem: 33300 Epoch: [8] [3190/4276] eta: 0:55:04 lr: 4.0043112776824035e-05 loss: 0.1609 (0.1705) time: 3.0331 data: 0.0073 max mem: 33300 Epoch: [8] [3200/4276] eta: 0:54:33 lr: 4.004041607922844e-05 loss: 0.1734 (0.1705) time: 3.0376 data: 0.0075 max mem: 33300 Epoch: [8] [3210/4276] eta: 0:54:03 lr: 4.003771936145259e-05 
loss: 0.1805 (0.1706) time: 3.0425 data: 0.0076 max mem: 33300 Epoch: [8] [3220/4276] eta: 0:53:32 lr: 4.003502262349484e-05 loss: 0.1837 (0.1706) time: 3.0407 data: 0.0074 max mem: 33300 Epoch: [8] [3230/4276] eta: 0:53:02 lr: 4.00323258653535e-05 loss: 0.1656 (0.1706) time: 3.0484 data: 0.0072 max mem: 33300 Epoch: [8] [3240/4276] eta: 0:52:32 lr: 4.0029629087026916e-05 loss: 0.1715 (0.1706) time: 3.0424 data: 0.0072 max mem: 33300 Epoch: [8] [3250/4276] eta: 0:52:01 lr: 4.002693228851344e-05 loss: 0.1715 (0.1706) time: 3.0309 data: 0.0076 max mem: 33300 Epoch: [8] [3260/4276] eta: 0:51:31 lr: 4.002423546981139e-05 loss: 0.1666 (0.1706) time: 3.0137 data: 0.0085 max mem: 33300 Epoch: [8] [3270/4276] eta: 0:51:00 lr: 4.002153863091912e-05 loss: 0.1749 (0.1706) time: 2.9920 data: 0.0089 max mem: 33300 Epoch: [8] [3280/4276] eta: 0:50:29 lr: 4.001884177183495e-05 loss: 0.1833 (0.1707) time: 2.9825 data: 0.0082 max mem: 33300 Epoch: [8] [3290/4276] eta: 0:49:59 lr: 4.001614489255724e-05 loss: 0.1670 (0.1707) time: 2.9936 data: 0.0079 max mem: 33300 Epoch: [8] [3300/4276] eta: 0:49:28 lr: 4.0013447993084305e-05 loss: 0.1650 (0.1707) time: 3.0241 data: 0.0080 max mem: 33300 Epoch: [8] [3310/4276] eta: 0:48:58 lr: 4.00107510734145e-05 loss: 0.1780 (0.1707) time: 3.0427 data: 0.0084 max mem: 33300 Epoch: [8] [3320/4276] eta: 0:48:28 lr: 4.000805413354614e-05 loss: 0.1829 (0.1707) time: 3.0495 data: 0.0083 max mem: 33300 Epoch: [8] [3330/4276] eta: 0:47:57 lr: 4.0005357173477576e-05 loss: 0.1523 (0.1707) time: 3.0447 data: 0.0078 max mem: 33300 Epoch: [8] [3340/4276] eta: 0:47:27 lr: 4.000266019320715e-05 loss: 0.1635 (0.1707) time: 3.0439 data: 0.0078 max mem: 33300 Epoch: [8] [3350/4276] eta: 0:46:56 lr: 3.999996319273317e-05 loss: 0.1601 (0.1707) time: 3.0557 data: 0.0077 max mem: 33300 Epoch: [8] [3360/4276] eta: 0:46:26 lr: 3.999726617205399e-05 loss: 0.1601 (0.1707) time: 3.0513 data: 0.0075 max mem: 33300 Epoch: [8] [3370/4276] eta: 0:45:56 lr: 
3.999456913116796e-05 loss: 0.1704 (0.1707) time: 3.0447 data: 0.0080 max mem: 33300 Epoch: [8] [3380/4276] eta: 0:45:25 lr: 3.9991872070073385e-05 loss: 0.1664 (0.1707) time: 3.0461 data: 0.0080 max mem: 33300 Epoch: [8] [3390/4276] eta: 0:44:55 lr: 3.9989174988768606e-05 loss: 0.1706 (0.1707) time: 3.0118 data: 0.0079 max mem: 33300 Epoch: [8] [3400/4276] eta: 0:44:24 lr: 3.998647788725197e-05 loss: 0.1814 (0.1708) time: 2.9845 data: 0.0079 max mem: 33300 Epoch: [8] [3410/4276] eta: 0:43:54 lr: 3.9983780765521795e-05 loss: 0.1814 (0.1708) time: 3.0164 data: 0.0078 max mem: 33300 Epoch: [8] [3420/4276] eta: 0:43:23 lr: 3.998108362357643e-05 loss: 0.1698 (0.1708) time: 3.0257 data: 0.0079 max mem: 33300 Epoch: [8] [3430/4276] eta: 0:42:53 lr: 3.99783864614142e-05 loss: 0.1706 (0.1709) time: 3.0333 data: 0.0083 max mem: 33300 Epoch: [8] [3440/4276] eta: 0:42:22 lr: 3.997568927903344e-05 loss: 0.1617 (0.1708) time: 3.0599 data: 0.0088 max mem: 33300 Epoch: [8] [3450/4276] eta: 0:41:52 lr: 3.997299207643249e-05 loss: 0.1685 (0.1708) time: 3.0627 data: 0.0081 max mem: 33300 Epoch: [8] [3460/4276] eta: 0:41:22 lr: 3.997029485360966e-05 loss: 0.1759 (0.1708) time: 3.0531 data: 0.0079 max mem: 33300 Epoch: [8] [3470/4276] eta: 0:40:51 lr: 3.9967597610563295e-05 loss: 0.1455 (0.1708) time: 3.0523 data: 0.0078 max mem: 33300 Epoch: [8] [3480/4276] eta: 0:40:21 lr: 3.9964900347291736e-05 loss: 0.1644 (0.1708) time: 3.0461 data: 0.0078 max mem: 33300 Epoch: [8] [3490/4276] eta: 0:39:50 lr: 3.99622030637933e-05 loss: 0.1716 (0.1708) time: 3.0147 data: 0.0087 max mem: 33300 Epoch: [8] [3500/4276] eta: 0:39:20 lr: 3.9959505760066336e-05 loss: 0.1725 (0.1708) time: 2.9928 data: 0.0086 max mem: 33300 Epoch: [8] [3510/4276] eta: 0:38:49 lr: 3.995680843610915e-05 loss: 0.1537 (0.1708) time: 3.0187 data: 0.0085 max mem: 33300 Epoch: [8] [3520/4276] eta: 0:38:19 lr: 3.995411109192009e-05 loss: 0.1615 (0.1708) time: 3.0504 data: 0.0087 max mem: 33300 Epoch: [8] [3530/4276] eta: 0:37:48 
lr: 3.99514137274975e-05 loss: 0.1742 (0.1708) time: 3.0465 data: 0.0078 max mem: 33300 Epoch: [8] [3540/4276] eta: 0:37:18 lr: 3.994871634283968e-05 loss: 0.1753 (0.1708) time: 3.0338 data: 0.0081 max mem: 33300 Epoch: [8] [3550/4276] eta: 0:36:48 lr: 3.9946018937944976e-05 loss: 0.1683 (0.1708) time: 3.0350 data: 0.0089 max mem: 33300 Epoch: [8] [3560/4276] eta: 0:36:17 lr: 3.994332151281171e-05 loss: 0.1683 (0.1708) time: 3.0514 data: 0.0085 max mem: 33300 Epoch: [8] [3570/4276] eta: 0:35:47 lr: 3.994062406743822e-05 loss: 0.1851 (0.1708) time: 3.0423 data: 0.0080 max mem: 33300 Epoch: [8] [3580/4276] eta: 0:35:16 lr: 3.993792660182284e-05 loss: 0.1607 (0.1708) time: 3.0036 data: 0.0080 max mem: 33300 Epoch: [8] [3590/4276] eta: 0:34:46 lr: 3.993522911596388e-05 loss: 0.1459 (0.1708) time: 2.9916 data: 0.0081 max mem: 33300 Epoch: [8] [3600/4276] eta: 0:34:15 lr: 3.993253160985969e-05 loss: 0.1563 (0.1708) time: 3.0223 data: 0.0083 max mem: 33300 Epoch: [8] [3610/4276] eta: 0:33:45 lr: 3.992983408350859e-05 loss: 0.1727 (0.1708) time: 3.0485 data: 0.0089 max mem: 33300 Epoch: [8] [3620/4276] eta: 0:33:15 lr: 3.99271365369089e-05 loss: 0.1663 (0.1707) time: 3.0565 data: 0.0089 max mem: 33300 Epoch: [8] [3630/4276] eta: 0:32:44 lr: 3.992443897005896e-05 loss: 0.1602 (0.1707) time: 3.0584 data: 0.0081 max mem: 33300 Epoch: [8] [3640/4276] eta: 0:32:14 lr: 3.992174138295709e-05 loss: 0.1596 (0.1707) time: 3.0579 data: 0.0073 max mem: 33300 Epoch: [8] [3650/4276] eta: 0:31:43 lr: 3.991904377560162e-05 loss: 0.1595 (0.1707) time: 3.0496 data: 0.0072 max mem: 33300 Epoch: [8] [3660/4276] eta: 0:31:13 lr: 3.991634614799088e-05 loss: 0.1456 (0.1706) time: 3.0271 data: 0.0077 max mem: 33300 Epoch: [8] [3670/4276] eta: 0:30:43 lr: 3.9913648500123185e-05 loss: 0.1634 (0.1706) time: 3.0361 data: 0.0084 max mem: 33300 Epoch: [8] [3680/4276] eta: 0:30:12 lr: 3.991095083199689e-05 loss: 0.1891 (0.1707) time: 3.0543 data: 0.0081 max mem: 33300 Epoch: [8] [3690/4276] eta: 0:29:42 
lr: 3.990825314361028e-05 loss: 0.1703 (0.1706) time: 3.0812 data: 0.0079 max mem: 33300 Epoch: [8] [3700/4276] eta: 0:29:12 lr: 3.990555543496171e-05 loss: 0.1631 (0.1706) time: 3.1429 data: 0.0080 max mem: 33300 Epoch: [8] [3710/4276] eta: 0:28:42 lr: 3.99028577060495e-05 loss: 0.1555 (0.1706) time: 3.2170 data: 0.0084 max mem: 33300 Epoch: [8] [3720/4276] eta: 0:28:11 lr: 3.990015995687198e-05 loss: 0.1462 (0.1705) time: 3.2460 data: 0.0084 max mem: 33300 Epoch: [8] [3730/4276] eta: 0:27:41 lr: 3.989746218742747e-05 loss: 0.1687 (0.1706) time: 3.2341 data: 0.0082 max mem: 33300 Epoch: [8] [3740/4276] eta: 0:27:11 lr: 3.989476439771429e-05 loss: 0.1723 (0.1706) time: 3.2294 data: 0.0085 max mem: 33300 Epoch: [8] [3750/4276] eta: 0:26:41 lr: 3.989206658773077e-05 loss: 0.1704 (0.1706) time: 3.2379 data: 0.0086 max mem: 33300 Epoch: [8] [3760/4276] eta: 0:26:11 lr: 3.9889368757475236e-05 loss: 0.1678 (0.1706) time: 3.2478 data: 0.0087 max mem: 33300 Epoch: [8] [3770/4276] eta: 0:25:41 lr: 3.988667090694601e-05 loss: 0.1574 (0.1706) time: 3.2263 data: 0.0087 max mem: 33300 Epoch: [8] [3780/4276] eta: 0:25:10 lr: 3.988397303614141e-05 loss: 0.1569 (0.1705) time: 3.2236 data: 0.0090 max mem: 33300 Epoch: [8] [3790/4276] eta: 0:24:40 lr: 3.988127514505978e-05 loss: 0.1569 (0.1705) time: 3.2343 data: 0.0093 max mem: 33300 Epoch: [8] [3800/4276] eta: 0:24:10 lr: 3.987857723369942e-05 loss: 0.1701 (0.1705) time: 3.2328 data: 0.0091 max mem: 33300 Epoch: [8] [3810/4276] eta: 0:23:40 lr: 3.987587930205867e-05 loss: 0.1686 (0.1705) time: 3.2372 data: 0.0085 max mem: 33300 Epoch: [8] [3820/4276] eta: 0:23:09 lr: 3.987318135013584e-05 loss: 0.1524 (0.1705) time: 3.2585 data: 0.0084 max mem: 33300 Epoch: [8] [3830/4276] eta: 0:22:39 lr: 3.987048337792926e-05 loss: 0.1524 (0.1705) time: 3.2454 data: 0.0084 max mem: 33300 Epoch: [8] [3840/4276] eta: 0:22:09 lr: 3.986778538543725e-05 loss: 0.1551 (0.1705) time: 3.2318 data: 0.0081 max mem: 33300 Epoch: [8] [3850/4276] eta: 0:21:39 
lr: 3.986508737265813e-05 loss: 0.1576 (0.1704) time: 3.2478 data: 0.0078 max mem: 33300 Epoch: [8] [3860/4276] eta: 0:21:08 lr: 3.986238933959023e-05 loss: 0.1614 (0.1704) time: 3.2400 data: 0.0082 max mem: 33300 Epoch: [8] [3870/4276] eta: 0:20:38 lr: 3.985969128623186e-05 loss: 0.1728 (0.1704) time: 3.2373 data: 0.0085 max mem: 33300 Epoch: [8] [3880/4276] eta: 0:20:08 lr: 3.985699321258136e-05 loss: 0.1701 (0.1704) time: 3.2346 data: 0.0085 max mem: 33300 Epoch: [8] [3890/4276] eta: 0:19:37 lr: 3.985429511863703e-05 loss: 0.1604 (0.1704) time: 3.2362 data: 0.0086 max mem: 33300 Epoch: [8] [3900/4276] eta: 0:19:07 lr: 3.98515970043972e-05 loss: 0.1690 (0.1704) time: 3.2121 data: 0.0083 max mem: 33300 Epoch: [8] [3910/4276] eta: 0:18:37 lr: 3.98488988698602e-05 loss: 0.1601 (0.1704) time: 3.1801 data: 0.0083 max mem: 33300 Epoch: [8] [3920/4276] eta: 0:18:06 lr: 3.984620071502433e-05 loss: 0.1525 (0.1704) time: 3.1790 data: 0.0087 max mem: 33300 Epoch: [8] [3930/4276] eta: 0:17:36 lr: 3.984350253988792e-05 loss: 0.1617 (0.1704) time: 3.1963 data: 0.0090 max mem: 33300 Epoch: [8] [3940/4276] eta: 0:17:05 lr: 3.984080434444929e-05 loss: 0.1615 (0.1704) time: 3.1939 data: 0.0094 max mem: 33300 Epoch: [8] [3950/4276] eta: 0:16:35 lr: 3.983810612870676e-05 loss: 0.1562 (0.1703) time: 3.2030 data: 0.0093 max mem: 33300 Epoch: [8] [3960/4276] eta: 0:16:05 lr: 3.983540789265867e-05 loss: 0.1598 (0.1703) time: 3.2303 data: 0.0085 max mem: 33300 Epoch: [8] [3970/4276] eta: 0:15:34 lr: 3.983270963630329e-05 loss: 0.1676 (0.1704) time: 3.2170 data: 0.0081 max mem: 33300 Epoch: [8] [3980/4276] eta: 0:15:04 lr: 3.983001135963898e-05 loss: 0.1556 (0.1703) time: 3.2042 data: 0.0084 max mem: 33300 Epoch: [8] [3990/4276] eta: 0:14:33 lr: 3.9827313062664046e-05 loss: 0.1499 (0.1703) time: 3.2278 data: 0.0087 max mem: 33300 Epoch: [8] [4000/4276] eta: 0:14:03 lr: 3.98246147453768e-05 loss: 0.1499 (0.1703) time: 3.2399 data: 0.0086 max mem: 33300 Epoch: [8] [4010/4276] eta: 0:13:32 
lr: 3.982191640777557e-05 loss: 0.1442 (0.1703) time: 3.2168 data: 0.0084 max mem: 33300 Epoch: [8] [4020/4276] eta: 0:13:02 lr: 3.981921804985866e-05 loss: 0.1498 (0.1703) time: 3.1894 data: 0.0081 max mem: 33300 Epoch: [8] [4030/4276] eta: 0:12:32 lr: 3.9816519671624404e-05 loss: 0.1620 (0.1703) time: 3.2358 data: 0.0079 max mem: 33300 Epoch: [8] [4040/4276] eta: 0:12:01 lr: 3.981382127307111e-05 loss: 0.1697 (0.1704) time: 3.2966 data: 0.0082 max mem: 33300 Epoch: [8] [4050/4276] eta: 0:11:31 lr: 3.98111228541971e-05 loss: 0.1634 (0.1703) time: 3.2931 data: 0.0083 max mem: 33300 Epoch: [8] [4060/4276] eta: 0:11:00 lr: 3.980842441500068e-05 loss: 0.1637 (0.1703) time: 3.2924 data: 0.0085 max mem: 33300 Epoch: [8] [4070/4276] eta: 0:10:30 lr: 3.980572595548018e-05 loss: 0.1665 (0.1703) time: 3.3369 data: 0.0087 max mem: 33300 Epoch: [8] [4080/4276] eta: 0:09:59 lr: 3.9803027475633895e-05 loss: 0.1647 (0.1704) time: 3.3752 data: 0.0086 max mem: 33300 Epoch: [8] [4090/4276] eta: 0:09:29 lr: 3.9800328975460166e-05 loss: 0.1821 (0.1704) time: 3.3874 data: 0.0084 max mem: 33300 Epoch: [8] [4100/4276] eta: 0:08:58 lr: 3.979763045495729e-05 loss: 0.1821 (0.1704) time: 3.3636 data: 0.0087 max mem: 33300 Epoch: [8] [4110/4276] eta: 0:08:28 lr: 3.9794931914123593e-05 loss: 0.1744 (0.1704) time: 3.3576 data: 0.0089 max mem: 33300 Epoch: [8] [4120/4276] eta: 0:07:57 lr: 3.9792233352957394e-05 loss: 0.1744 (0.1704) time: 3.3289 data: 0.0085 max mem: 33300 Epoch: [8] [4130/4276] eta: 0:07:27 lr: 3.978953477145699e-05 loss: 0.1626 (0.1704) time: 3.2753 data: 0.0091 max mem: 33300 Epoch: [8] [4140/4276] eta: 0:06:56 lr: 3.9786836169620714e-05 loss: 0.1581 (0.1704) time: 3.3293 data: 0.0094 max mem: 33300 Epoch: [8] [4150/4276] eta: 0:06:26 lr: 3.978413754744687e-05 loss: 0.1581 (0.1704) time: 3.3876 data: 0.0095 max mem: 33300 Epoch: [8] [4160/4276] eta: 0:05:55 lr: 3.978143890493377e-05 loss: 0.1614 (0.1704) time: 3.3640 data: 0.0098 max mem: 33300 Epoch: [8] [4170/4276] eta: 
0:05:25 lr: 3.9778740242079735e-05 loss: 0.1651 (0.1704) time: 3.3503 data: 0.0102 max mem: 33300 Epoch: [8] [4180/4276] eta: 0:04:54 lr: 3.9776041558883064e-05 loss: 0.1640 (0.1704) time: 3.3508 data: 0.0099 max mem: 33300 Epoch: [8] [4190/4276] eta: 0:04:23 lr: 3.9773342855342085e-05 loss: 0.1569 (0.1704) time: 3.2961 data: 0.0088 max mem: 33300 Epoch: [8] [4200/4276] eta: 0:03:53 lr: 3.977064413145512e-05 loss: 0.1698 (0.1704) time: 3.2837 data: 0.0095 max mem: 33300 Epoch: [8] [4210/4276] eta: 0:03:22 lr: 3.9767945387220453e-05 loss: 0.1890 (0.1705) time: 3.2974 data: 0.0100 max mem: 33300 Epoch: [8] [4220/4276] eta: 0:02:51 lr: 3.976524662263642e-05 loss: 0.1962 (0.1705) time: 3.3285 data: 0.0095 max mem: 33300 Epoch: [8] [4230/4276] eta: 0:02:21 lr: 3.976254783770132e-05 loss: 0.1962 (0.1706) time: 3.3056 data: 0.0095 max mem: 33300 Epoch: [8] [4240/4276] eta: 0:01:50 lr: 3.975984903241347e-05 loss: 0.1905 (0.1706) time: 3.3024 data: 0.0098 max mem: 33300 Epoch: [8] [4250/4276] eta: 0:01:19 lr: 3.975715020677118e-05 loss: 0.1782 (0.1707) time: 3.3262 data: 0.0097 max mem: 33300 Epoch: [8] [4260/4276] eta: 0:00:49 lr: 3.975445136077276e-05 loss: 0.1760 (0.1707) time: 3.3215 data: 0.0099 max mem: 33300 Epoch: [8] [4270/4276] eta: 0:00:18 lr: 3.9751752494416536e-05 loss: 0.1781 (0.1707) time: 3.2737 data: 0.0089 max mem: 33300 Epoch: [8] Total time: 3:38:58 Test: [ 0/21770] eta: 11:12:23 time: 1.8532 data: 1.8129 max mem: 33300 Test: [ 100/21770] eta: 0:20:54 time: 0.0398 data: 0.0013 max mem: 33300 Test: [ 200/21770] eta: 0:17:37 time: 0.0402 data: 0.0013 max mem: 33300 Test: [ 300/21770] eta: 0:16:30 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 400/21770] eta: 0:15:56 time: 0.0408 data: 0.0013 max mem: 33300 Test: [ 500/21770] eta: 0:15:32 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 600/21770] eta: 0:15:14 time: 0.0401 data: 0.0012 max mem: 33300 Test: [ 700/21770] eta: 0:14:58 time: 0.0393 data: 0.0012 max mem: 33300 Test: [ 800/21770] eta: 0:14:46 
time: 0.0399 data: 0.0013 max mem: 33300 Test: [ 900/21770] eta: 0:14:36 time: 0.0393 data: 0.0013 max mem: 33300 Test: [ 1000/21770] eta: 0:14:27 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 1100/21770] eta: 0:14:19 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 1200/21770] eta: 0:14:13 time: 0.0407 data: 0.0012 max mem: 33300 Test: [ 1300/21770] eta: 0:14:07 time: 0.0405 data: 0.0012 max mem: 33300 Test: [ 1400/21770] eta: 0:14:02 time: 0.0406 data: 0.0013 max mem: 33300 Test: [ 1500/21770] eta: 0:13:57 time: 0.0408 data: 0.0012 max mem: 33300 Test: [ 1600/21770] eta: 0:13:52 time: 0.0405 data: 0.0013 max mem: 33300 Test: [ 1700/21770] eta: 0:13:47 time: 0.0410 data: 0.0012 max mem: 33300 Test: [ 1800/21770] eta: 0:13:42 time: 0.0406 data: 0.0013 max mem: 33300 Test: [ 1900/21770] eta: 0:13:37 time: 0.0403 data: 0.0012 max mem: 33300 Test: [ 2000/21770] eta: 0:13:33 time: 0.0416 data: 0.0015 max mem: 33300 Test: [ 2100/21770] eta: 0:13:27 time: 0.0392 data: 0.0013 max mem: 33300 Test: [ 2200/21770] eta: 0:13:22 time: 0.0397 data: 0.0012 max mem: 33300 Test: [ 2300/21770] eta: 0:13:17 time: 0.0402 data: 0.0012 max mem: 33300 Test: [ 2400/21770] eta: 0:13:12 time: 0.0401 data: 0.0012 max mem: 33300 Test: [ 2500/21770] eta: 0:13:08 time: 0.0407 data: 0.0012 max mem: 33300 Test: [ 2600/21770] eta: 0:13:04 time: 0.0401 data: 0.0012 max mem: 33300 Test: [ 2700/21770] eta: 0:12:59 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 2800/21770] eta: 0:12:54 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 2900/21770] eta: 0:12:50 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 3000/21770] eta: 0:12:45 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 3100/21770] eta: 0:12:41 time: 0.0405 data: 0.0012 max mem: 33300 Test: [ 3200/21770] eta: 0:12:37 time: 0.0405 data: 0.0013 max mem: 33300 Test: [ 3300/21770] eta: 0:12:32 time: 0.0399 data: 0.0013 max mem: 33300 Test: [ 3400/21770] eta: 0:12:28 time: 0.0402 data: 0.0013 max mem: 33300 Test: [ 3500/21770] eta: 0:12:23 
time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 3600/21770] eta: 0:12:19 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 3700/21770] eta: 0:12:15 time: 0.0402 data: 0.0013 max mem: 33300 Test: [ 3800/21770] eta: 0:12:10 time: 0.0399 data: 0.0013 max mem: 33300 Test: [ 3900/21770] eta: 0:12:06 time: 0.0401 data: 0.0012 max mem: 33300 Test: [ 4000/21770] eta: 0:12:02 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 4100/21770] eta: 0:11:57 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 4200/21770] eta: 0:11:53 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 4300/21770] eta: 0:11:49 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 4400/21770] eta: 0:11:44 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 4500/21770] eta: 0:11:40 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 4600/21770] eta: 0:11:36 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 4700/21770] eta: 0:11:32 time: 0.0407 data: 0.0012 max mem: 33300 Test: [ 4800/21770] eta: 0:11:28 time: 0.0404 data: 0.0013 max mem: 33300 Test: [ 4900/21770] eta: 0:11:23 time: 0.0407 data: 0.0012 max mem: 33300 Test: [ 5000/21770] eta: 0:11:19 time: 0.0408 data: 0.0013 max mem: 33300 Test: [ 5100/21770] eta: 0:11:15 time: 0.0409 data: 0.0013 max mem: 33300 Test: [ 5200/21770] eta: 0:11:11 time: 0.0406 data: 0.0012 max mem: 33300 Test: [ 5300/21770] eta: 0:11:07 time: 0.0403 data: 0.0012 max mem: 33300 Test: [ 5400/21770] eta: 0:11:03 time: 0.0405 data: 0.0012 max mem: 33300 Test: [ 5500/21770] eta: 0:10:59 time: 0.0408 data: 0.0013 max mem: 33300 Test: [ 5600/21770] eta: 0:10:55 time: 0.0407 data: 0.0013 max mem: 33300 Test: [ 5700/21770] eta: 0:10:51 time: 0.0402 data: 0.0012 max mem: 33300 Test: [ 5800/21770] eta: 0:10:47 time: 0.0398 data: 0.0013 max mem: 33300 Test: [ 5900/21770] eta: 0:10:43 time: 0.0395 data: 0.0012 max mem: 33300 Test: [ 6000/21770] eta: 0:10:38 time: 0.0393 data: 0.0013 max mem: 33300 Test: [ 6100/21770] eta: 0:10:34 time: 0.0395 data: 0.0012 max mem: 33300 Test: [ 6200/21770] eta: 0:10:30 
time: 0.0401 data: 0.0012 max mem: 33300 Test: [ 6300/21770] eta: 0:10:26 time: 0.0402 data: 0.0013 max mem: 33300 Test: [ 6400/21770] eta: 0:10:21 time: 0.0393 data: 0.0013 max mem: 33300 Test: [ 6500/21770] eta: 0:10:17 time: 0.0393 data: 0.0013 max mem: 33300 Test: [ 6600/21770] eta: 0:10:13 time: 0.0406 data: 0.0014 max mem: 33300 Test: [ 6700/21770] eta: 0:10:09 time: 0.0401 data: 0.0013 max mem: 33300 Test: [ 6800/21770] eta: 0:10:05 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 6900/21770] eta: 0:10:01 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 7000/21770] eta: 0:09:56 time: 0.0401 data: 0.0013 max mem: 33300 Test: [ 7100/21770] eta: 0:09:52 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 7200/21770] eta: 0:09:48 time: 0.0401 data: 0.0013 max mem: 33300 Test: [ 7300/21770] eta: 0:09:44 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 7400/21770] eta: 0:09:40 time: 0.0406 data: 0.0012 max mem: 33300 Test: [ 7500/21770] eta: 0:09:36 time: 0.0410 data: 0.0012 max mem: 33300 Test: [ 7600/21770] eta: 0:09:32 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 7700/21770] eta: 0:09:28 time: 0.0397 data: 0.0012 max mem: 33300 Test: [ 7800/21770] eta: 0:09:24 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 7900/21770] eta: 0:09:20 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 8000/21770] eta: 0:09:15 time: 0.0397 data: 0.0012 max mem: 33300 Test: [ 8100/21770] eta: 0:09:11 time: 0.0401 data: 0.0013 max mem: 33300 Test: [ 8200/21770] eta: 0:09:07 time: 0.0398 data: 0.0013 max mem: 33300 Test: [ 8300/21770] eta: 0:09:03 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 8400/21770] eta: 0:08:59 time: 0.0401 data: 0.0013 max mem: 33300 Test: [ 8500/21770] eta: 0:08:55 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 8600/21770] eta: 0:08:51 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 8700/21770] eta: 0:08:47 time: 0.0399 data: 0.0013 max mem: 33300 Test: [ 8800/21770] eta: 0:08:43 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 8900/21770] eta: 0:08:39 
time: 0.0402 data: 0.0013 max mem: 33300 Test: [ 9000/21770] eta: 0:08:34 time: 0.0401 data: 0.0013 max mem: 33300 Test: [ 9100/21770] eta: 0:08:30 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 9200/21770] eta: 0:08:26 time: 0.0398 data: 0.0013 max mem: 33300 Test: [ 9300/21770] eta: 0:08:22 time: 0.0401 data: 0.0013 max mem: 33300 Test: [ 9400/21770] eta: 0:08:18 time: 0.0393 data: 0.0013 max mem: 33300 Test: [ 9500/21770] eta: 0:08:14 time: 0.0393 data: 0.0012 max mem: 33300 Test: [ 9600/21770] eta: 0:08:10 time: 0.0394 data: 0.0012 max mem: 33300 Test: [ 9700/21770] eta: 0:08:06 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 9800/21770] eta: 0:08:02 time: 0.0400 data: 0.0013 max mem: 33300 Test: [ 9900/21770] eta: 0:07:58 time: 0.0400 data: 0.0012 max mem: 33300 Test: [10000/21770] eta: 0:07:54 time: 0.0400 data: 0.0012 max mem: 33300 Test: [10100/21770] eta: 0:07:50 time: 0.0401 data: 0.0013 max mem: 33300 Test: [10200/21770] eta: 0:07:46 time: 0.0405 data: 0.0013 max mem: 33300 Test: [10300/21770] eta: 0:07:42 time: 0.0403 data: 0.0013 max mem: 33300 Test: [10400/21770] eta: 0:07:37 time: 0.0403 data: 0.0013 max mem: 33300 Test: [10500/21770] eta: 0:07:33 time: 0.0405 data: 0.0012 max mem: 33300 Test: [10600/21770] eta: 0:07:29 time: 0.0408 data: 0.0012 max mem: 33300 Test: [10700/21770] eta: 0:07:25 time: 0.0400 data: 0.0013 max mem: 33300 Test: [10800/21770] eta: 0:07:21 time: 0.0399 data: 0.0013 max mem: 33300 Test: [10900/21770] eta: 0:07:17 time: 0.0412 data: 0.0012 max mem: 33300 Test: [11000/21770] eta: 0:07:13 time: 0.0394 data: 0.0012 max mem: 33300 Test: [11100/21770] eta: 0:07:09 time: 0.0401 data: 0.0012 max mem: 33300 Test: [11200/21770] eta: 0:07:05 time: 0.0403 data: 0.0013 max mem: 33300 Test: [11300/21770] eta: 0:07:01 time: 0.0399 data: 0.0013 max mem: 33300 Test: [11400/21770] eta: 0:06:57 time: 0.0400 data: 0.0013 max mem: 33300 Test: [11500/21770] eta: 0:06:53 time: 0.0400 data: 0.0012 max mem: 33300 Test: [11600/21770] eta: 0:06:49 
time: 0.0399 data: 0.0013 max mem: 33300 Test: [11700/21770] eta: 0:06:45 time: 0.0400 data: 0.0013 max mem: 33300 Test: [11800/21770] eta: 0:06:41 time: 0.0398 data: 0.0012 max mem: 33300 Test: [11900/21770] eta: 0:06:37 time: 0.0401 data: 0.0013 max mem: 33300 Test: [12000/21770] eta: 0:06:33 time: 0.0399 data: 0.0013 max mem: 33300 Test: [12100/21770] eta: 0:06:29 time: 0.0398 data: 0.0012 max mem: 33300 Test: [12200/21770] eta: 0:06:25 time: 0.0400 data: 0.0012 max mem: 33300 Test: [12300/21770] eta: 0:06:21 time: 0.0408 data: 0.0013 max mem: 33300 Test: [12400/21770] eta: 0:06:17 time: 0.0404 data: 0.0012 max mem: 33300 Test: [12500/21770] eta: 0:06:13 time: 0.0413 data: 0.0013 max mem: 33300 Test: [12600/21770] eta: 0:06:09 time: 0.0398 data: 0.0012 max mem: 33300 Test: [12700/21770] eta: 0:06:05 time: 0.0407 data: 0.0012 max mem: 33300 Test: [12800/21770] eta: 0:06:01 time: 0.0400 data: 0.0013 max mem: 33300 Test: [12900/21770] eta: 0:05:57 time: 0.0400 data: 0.0013 max mem: 33300 Test: [13000/21770] eta: 0:05:53 time: 0.0393 data: 0.0013 max mem: 33300 Test: [13100/21770] eta: 0:05:49 time: 0.0400 data: 0.0013 max mem: 33300 Test: [13200/21770] eta: 0:05:45 time: 0.0400 data: 0.0012 max mem: 33300 Test: [13300/21770] eta: 0:05:40 time: 0.0400 data: 0.0012 max mem: 33300 Test: [13400/21770] eta: 0:05:36 time: 0.0400 data: 0.0012 max mem: 33300 Test: [13500/21770] eta: 0:05:32 time: 0.0393 data: 0.0013 max mem: 33300 Test: [13600/21770] eta: 0:05:28 time: 0.0400 data: 0.0013 max mem: 33300 Test: [13700/21770] eta: 0:05:24 time: 0.0399 data: 0.0013 max mem: 33300 Test: [13800/21770] eta: 0:05:20 time: 0.0400 data: 0.0013 max mem: 33300 Test: [13900/21770] eta: 0:05:16 time: 0.0399 data: 0.0013 max mem: 33300 Test: [14000/21770] eta: 0:05:12 time: 0.0399 data: 0.0013 max mem: 33300 Test: [14100/21770] eta: 0:05:08 time: 0.0400 data: 0.0013 max mem: 33300 Test: [14200/21770] eta: 0:05:04 time: 0.0399 data: 0.0013 max mem: 33300 Test: [14300/21770] eta: 0:05:00 
time: 0.0397 data: 0.0013 max mem: 33300 Test: [14400/21770] eta: 0:04:56 time: 0.0401 data: 0.0013 max mem: 33300 Test: [14500/21770] eta: 0:04:52 time: 0.0400 data: 0.0013 max mem: 33300 Test: [14600/21770] eta: 0:04:48 time: 0.0399 data: 0.0013 max mem: 33300 Test: [14700/21770] eta: 0:04:44 time: 0.0400 data: 0.0012 max mem: 33300 Test: [14800/21770] eta: 0:04:40 time: 0.0399 data: 0.0012 max mem: 33300 Test: [14900/21770] eta: 0:04:36 time: 0.0402 data: 0.0012 max mem: 33300 Test: [15000/21770] eta: 0:04:32 time: 0.0400 data: 0.0013 max mem: 33300 Test: [15100/21770] eta: 0:04:28 time: 0.0403 data: 0.0013 max mem: 33300 Test: [15200/21770] eta: 0:04:24 time: 0.0399 data: 0.0013 max mem: 33300 Test: [15300/21770] eta: 0:04:20 time: 0.0401 data: 0.0012 max mem: 33300 Test: [15400/21770] eta: 0:04:16 time: 0.0400 data: 0.0013 max mem: 33300 Test: [15500/21770] eta: 0:04:12 time: 0.0401 data: 0.0013 max mem: 33300 Test: [15600/21770] eta: 0:04:08 time: 0.0400 data: 0.0013 max mem: 33300 Test: [15700/21770] eta: 0:04:04 time: 0.0401 data: 0.0013 max mem: 33300 Test: [15800/21770] eta: 0:04:00 time: 0.0406 data: 0.0013 max mem: 33300 Test: [15900/21770] eta: 0:03:56 time: 0.0401 data: 0.0013 max mem: 33300 Test: [16000/21770] eta: 0:03:52 time: 0.0401 data: 0.0013 max mem: 33300 Test: [16100/21770] eta: 0:03:47 time: 0.0405 data: 0.0013 max mem: 33300 Test: [16200/21770] eta: 0:03:43 time: 0.0400 data: 0.0013 max mem: 33300 Test: [16300/21770] eta: 0:03:39 time: 0.0400 data: 0.0012 max mem: 33300 Test: [16400/21770] eta: 0:03:35 time: 0.0400 data: 0.0012 max mem: 33300 Test: [16500/21770] eta: 0:03:31 time: 0.0396 data: 0.0013 max mem: 33300 Test: [16600/21770] eta: 0:03:27 time: 0.0400 data: 0.0012 max mem: 33300 Test: [16700/21770] eta: 0:03:23 time: 0.0398 data: 0.0012 max mem: 33300 Test: [16800/21770] eta: 0:03:19 time: 0.0398 data: 0.0013 max mem: 33300 Test: [16900/21770] eta: 0:03:15 time: 0.0399 data: 0.0012 max mem: 33300 Test: [17000/21770] eta: 0:03:11 
time: 0.0401 data: 0.0012 max mem: 33300 Test: [17100/21770] eta: 0:03:07 time: 0.0398 data: 0.0012 max mem: 33300 Test: [17200/21770] eta: 0:03:03 time: 0.0402 data: 0.0012 max mem: 33300 Test: [17300/21770] eta: 0:02:59 time: 0.0399 data: 0.0012 max mem: 33300 Test: [17400/21770] eta: 0:02:55 time: 0.0400 data: 0.0012 max mem: 33300 Test: [17500/21770] eta: 0:02:51 time: 0.0392 data: 0.0013 max mem: 33300 Test: [17600/21770] eta: 0:02:47 time: 0.0394 data: 0.0012 max mem: 33300 Test: [17700/21770] eta: 0:02:43 time: 0.0400 data: 0.0013 max mem: 33300 Test: [17800/21770] eta: 0:02:39 time: 0.0399 data: 0.0013 max mem: 33300 Test: [17900/21770] eta: 0:02:35 time: 0.0399 data: 0.0013 max mem: 33300 Test: [18000/21770] eta: 0:02:31 time: 0.0399 data: 0.0013 max mem: 33300 Test: [18100/21770] eta: 0:02:27 time: 0.0394 data: 0.0013 max mem: 33300 Test: [18200/21770] eta: 0:02:23 time: 0.0401 data: 0.0013 max mem: 33300 Test: [18300/21770] eta: 0:02:19 time: 0.0399 data: 0.0013 max mem: 33300 Test: [18400/21770] eta: 0:02:15 time: 0.0400 data: 0.0012 max mem: 33300 Test: [18500/21770] eta: 0:02:11 time: 0.0399 data: 0.0013 max mem: 33300 Test: [18600/21770] eta: 0:02:07 time: 0.0401 data: 0.0012 max mem: 33300 Test: [18700/21770] eta: 0:02:03 time: 0.0399 data: 0.0012 max mem: 33300 Test: [18800/21770] eta: 0:01:59 time: 0.0401 data: 0.0013 max mem: 33300 Test: [18900/21770] eta: 0:01:55 time: 0.0398 data: 0.0013 max mem: 33300 Test: [19000/21770] eta: 0:01:51 time: 0.0392 data: 0.0013 max mem: 33300 Test: [19100/21770] eta: 0:01:47 time: 0.0395 data: 0.0013 max mem: 33300 Test: [19200/21770] eta: 0:01:43 time: 0.0398 data: 0.0013 max mem: 33300 Test: [19300/21770] eta: 0:01:39 time: 0.0391 data: 0.0012 max mem: 33300 Test: [19400/21770] eta: 0:01:35 time: 0.0392 data: 0.0013 max mem: 33300 Test: [19500/21770] eta: 0:01:31 time: 0.0395 data: 0.0013 max mem: 33300 Test: [19600/21770] eta: 0:01:27 time: 0.0392 data: 0.0012 max mem: 33300 Test: [19700/21770] eta: 0:01:23 
time: 0.0393 data: 0.0013 max mem: 33300 Test: [19800/21770] eta: 0:01:19 time: 0.0391 data: 0.0012 max mem: 33300 Test: [19900/21770] eta: 0:01:15 time: 0.0400 data: 0.0012 max mem: 33300 Test: [20000/21770] eta: 0:01:11 time: 0.0399 data: 0.0012 max mem: 33300 Test: [20100/21770] eta: 0:01:06 time: 0.0401 data: 0.0012 max mem: 33300 Test: [20200/21770] eta: 0:01:02 time: 0.0399 data: 0.0012 max mem: 33300 Test: [20300/21770] eta: 0:00:58 time: 0.0402 data: 0.0013 max mem: 33300 Test: [20400/21770] eta: 0:00:54 time: 0.0399 data: 0.0013 max mem: 33300 Test: [20500/21770] eta: 0:00:50 time: 0.0393 data: 0.0013 max mem: 33300 Test: [20600/21770] eta: 0:00:46 time: 0.0393 data: 0.0013 max mem: 33300 Test: [20700/21770] eta: 0:00:42 time: 0.0392 data: 0.0013 max mem: 33300 Test: [20800/21770] eta: 0:00:38 time: 0.0397 data: 0.0013 max mem: 33300 Test: [20900/21770] eta: 0:00:34 time: 0.0400 data: 0.0013 max mem: 33300 Test: [21000/21770] eta: 0:00:30 time: 0.0397 data: 0.0013 max mem: 33300 Test: [21100/21770] eta: 0:00:26 time: 0.0399 data: 0.0013 max mem: 33300 Test: [21200/21770] eta: 0:00:22 time: 0.0400 data: 0.0013 max mem: 33300 Test: [21300/21770] eta: 0:00:18 time: 0.0399 data: 0.0013 max mem: 33300 Test: [21400/21770] eta: 0:00:14 time: 0.0400 data: 0.0013 max mem: 33300 Test: [21500/21770] eta: 0:00:10 time: 0.0404 data: 0.0011 max mem: 33300 Test: [21600/21770] eta: 0:00:06 time: 0.0406 data: 0.0012 max mem: 33300 Test: [21700/21770] eta: 0:00:02 time: 0.0409 data: 0.0012 max mem: 33300 Test: Total time: 0:14:33 Final results: Mean IoU is 0.00 precision@0.5 = 0.00 precision@0.6 = 0.00 precision@0.7 = 0.00 precision@0.8 = 0.00 precision@0.9 = 0.00 overall IoU = 0.00 mean IoU = 0.00 Mean accuracy for one-to-zero sample is 100.00 Average object IoU 0.0 Overall IoU 0.0 Epoch: [9] [ 0/4276] eta: 6:25:28 lr: 3.9750133164830336e-05 loss: 0.1393 (0.1393) time: 5.4089 data: 2.0295 max mem: 33300 Epoch: [9] [ 10/4276] eta: 4:05:27 lr: 3.9747434265898074e-05 loss: 
0.1721 (0.1722) time: 3.4523 data: 0.1924 max mem: 33300 Epoch: [9] [ 20/4276] eta: 3:59:59 lr: 3.9744735346603614e-05 loss: 0.1561 (0.1701) time: 3.2821 data: 0.0083 max mem: 33300 Epoch: [9] [ 30/4276] eta: 3:55:31 lr: 3.974203640694524e-05 loss: 0.1546 (0.1716) time: 3.2601 data: 0.0084 max mem: 33300 Epoch: [9] [ 40/4276] eta: 3:54:22 lr: 3.973933744692129e-05 loss: 0.1569 (0.1689) time: 3.2532 data: 0.0091 max mem: 33300 Epoch: [9] [ 50/4276] eta: 3:52:41 lr: 3.973663846653005e-05 loss: 0.1569 (0.1684) time: 3.2657 data: 0.0088 max mem: 33300 Epoch: [9] [ 60/4276] eta: 3:51:13 lr: 3.973393946576984e-05 loss: 0.1510 (0.1675) time: 3.2310 data: 0.0083 max mem: 33300 Epoch: [9] [ 70/4276] eta: 3:50:32 lr: 3.973124044463898e-05 loss: 0.1464 (0.1662) time: 3.2503 data: 0.0087 max mem: 33300 Epoch: [9] [ 80/4276] eta: 3:49:34 lr: 3.972854140313575e-05 loss: 0.1550 (0.1674) time: 3.2586 data: 0.0089 max mem: 33300 Epoch: [9] [ 90/4276] eta: 3:48:18 lr: 3.972584234125848e-05 loss: 0.1506 (0.1655) time: 3.2153 data: 0.0087 max mem: 33300 Epoch: [9] [ 100/4276] eta: 3:48:05 lr: 3.972314325900547e-05 loss: 0.1506 (0.1671) time: 3.2545 data: 0.0083 max mem: 33300 Epoch: [9] [ 110/4276] eta: 3:47:10 lr: 3.972044415637503e-05 loss: 0.1755 (0.1673) time: 3.2679 data: 0.0080 max mem: 33300 Epoch: [9] [ 120/4276] eta: 3:46:47 lr: 3.971774503336547e-05 loss: 0.1542 (0.1664) time: 3.2587 data: 0.0084 max mem: 33300 Epoch: [9] [ 130/4276] eta: 3:46:14 lr: 3.97150458899751e-05 loss: 0.1572 (0.1674) time: 3.2873 data: 0.0089 max mem: 33300 Epoch: [9] [ 140/4276] eta: 3:45:38 lr: 3.9712346726202215e-05 loss: 0.1584 (0.1669) time: 3.2681 data: 0.0091 max mem: 33300 Epoch: [9] [ 150/4276] eta: 3:45:05 lr: 3.970964754204513e-05 loss: 0.1529 (0.1663) time: 3.2673 data: 0.0089 max mem: 33300 Epoch: [9] [ 160/4276] eta: 3:44:11 lr: 3.970694833750216e-05 loss: 0.1579 (0.1656) time: 3.2312 data: 0.0086 max mem: 33300 Epoch: [9] [ 170/4276] eta: 3:43:33 lr: 3.970424911257159e-05 loss: 0.1617 
(0.1659) time: 3.2189 data: 0.0087 max mem: 33300 Epoch: [9] [ 180/4276] eta: 3:42:46 lr: 3.970154986725174e-05 loss: 0.1632 (0.1669) time: 3.2253 data: 0.0087 max mem: 33300 Epoch: [9] [ 190/4276] eta: 3:42:18 lr: 3.9698850601540915e-05 loss: 0.1788 (0.1675) time: 3.2429 data: 0.0086 max mem: 33300 Epoch: [9] [ 200/4276] eta: 3:42:16 lr: 3.969615131543742e-05 loss: 0.1773 (0.1684) time: 3.3501 data: 0.0085 max mem: 33300 Epoch: [9] [ 210/4276] eta: 3:41:59 lr: 3.9693452008939556e-05 loss: 0.1773 (0.1687) time: 3.3856 data: 0.0090 max mem: 33300 Epoch: [9] [ 220/4276] eta: 3:41:52 lr: 3.969075268204563e-05 loss: 0.1784 (0.1692) time: 3.3860 data: 0.0096 max mem: 33300 Epoch: [9] [ 230/4276] eta: 3:41:47 lr: 3.968805333475396e-05 loss: 0.1711 (0.1686) time: 3.4287 data: 0.0093 max mem: 33300 Epoch: [9] [ 240/4276] eta: 3:41:26 lr: 3.9685353967062824e-05 loss: 0.1647 (0.1689) time: 3.3982 data: 0.0091 max mem: 33300 Epoch: [9] [ 250/4276] eta: 3:41:14 lr: 3.9682654578970546e-05 loss: 0.1739 (0.1702) time: 3.3900 data: 0.0093 max mem: 33300 Epoch: [9] [ 260/4276] eta: 3:40:49 lr: 3.967995517047542e-05 loss: 0.1770 (0.1703) time: 3.3879 data: 0.0092 max mem: 33300 Epoch: [9] [ 270/4276] eta: 3:40:27 lr: 3.967725574157576e-05 loss: 0.1717 (0.1703) time: 3.3613 data: 0.0090 max mem: 33300 Epoch: [9] [ 280/4276] eta: 3:40:11 lr: 3.967455629226986e-05 loss: 0.1598 (0.1700) time: 3.3963 data: 0.0096 max mem: 33300 Epoch: [9] [ 290/4276] eta: 3:39:39 lr: 3.967185682255603e-05 loss: 0.1598 (0.1694) time: 3.3673 data: 0.0100 max mem: 33300 Epoch: [9] [ 300/4276] eta: 3:39:09 lr: 3.966915733243255e-05 loss: 0.1523 (0.1692) time: 3.3220 data: 0.0094 max mem: 33300 Epoch: [9] [ 310/4276] eta: 3:38:54 lr: 3.966645782189776e-05 loss: 0.1519 (0.1688) time: 3.3903 data: 0.0089 max mem: 33300 Epoch: [9] [ 320/4276] eta: 3:38:20 lr: 3.966375829094994e-05 loss: 0.1633 (0.1693) time: 3.3769 data: 0.0086 max mem: 33300 Epoch: [9] [ 330/4276] eta: 3:37:59 lr: 3.9661058739587384e-05 loss: 
0.1713 (0.1693) time: 3.3610 data: 0.0085 max mem: 33300 Epoch: [9] [ 340/4276] eta: 3:37:35 lr: 3.9658359167808415e-05 loss: 0.1615 (0.1689) time: 3.4014 data: 0.0083 max mem: 33300 Epoch: [9] [ 350/4276] eta: 3:37:06 lr: 3.9655659575611324e-05 loss: 0.1435 (0.1687) time: 3.3733 data: 0.0083 max mem: 33300 Epoch: [9] [ 360/4276] eta: 3:36:40 lr: 3.965295996299441e-05 loss: 0.1891 (0.1694) time: 3.3725 data: 0.0086 max mem: 33300 Epoch: [9] [ 370/4276] eta: 3:36:12 lr: 3.965026032995597e-05 loss: 0.1612 (0.1687) time: 3.3778 data: 0.0085 max mem: 33300 Epoch: [9] [ 380/4276] eta: 3:35:43 lr: 3.964756067649432e-05 loss: 0.1467 (0.1687) time: 3.3643 data: 0.0087 max mem: 33300 Epoch: [9] [ 390/4276] eta: 3:35:12 lr: 3.9644861002607754e-05 loss: 0.1520 (0.1686) time: 3.3494 data: 0.0089 max mem: 33300 Epoch: [9] [ 400/4276] eta: 3:34:43 lr: 3.964216130829456e-05 loss: 0.1560 (0.1687) time: 3.3574 data: 0.0093 max mem: 33300 Epoch: [9] [ 410/4276] eta: 3:34:09 lr: 3.9639461593553044e-05 loss: 0.1613 (0.1685) time: 3.3429 data: 0.0093 max mem: 33300 Epoch: [9] [ 420/4276] eta: 3:33:39 lr: 3.9636761858381515e-05 loss: 0.1551 (0.1683) time: 3.3377 data: 0.0091 max mem: 33300 Epoch: [9] [ 430/4276] eta: 3:32:59 lr: 3.963406210277826e-05 loss: 0.1587 (0.1683) time: 3.3057 data: 0.0092 max mem: 33300 Epoch: [9] [ 440/4276] eta: 3:32:25 lr: 3.9631362326741595e-05 loss: 0.1587 (0.1682) time: 3.2812 data: 0.0093 max mem: 33300 Epoch: [9] [ 450/4276] eta: 3:32:00 lr: 3.9628662530269796e-05 loss: 0.1668 (0.1684) time: 3.3678 data: 0.0091 max mem: 33300 Epoch: [9] [ 460/4276] eta: 3:31:19 lr: 3.962596271336118e-05 loss: 0.1580 (0.1679) time: 3.3206 data: 0.0091 max mem: 33300 Epoch: [9] [ 470/4276] eta: 3:30:58 lr: 3.962326287601403e-05 loss: 0.1444 (0.1678) time: 3.3510 data: 0.0092 max mem: 33300 Epoch: [9] [ 480/4276] eta: 3:30:20 lr: 3.9620563018226655e-05 loss: 0.1421 (0.1674) time: 3.3739 data: 0.0092 max mem: 33300 Epoch: [9] [ 490/4276] eta: 3:29:48 lr: 
3.9617863139997346e-05 loss: 0.1364 (0.1669) time: 3.3056 data: 0.0093 max mem: 33300 Epoch: [9] [ 500/4276] eta: 3:29:15 lr: 3.961516324132441e-05 loss: 0.1364 (0.1667) time: 3.3364 data: 0.0097 max mem: 33300 Epoch: [9] [ 510/4276] eta: 3:28:41 lr: 3.961246332220614e-05 loss: 0.1510 (0.1664) time: 3.3189 data: 0.0099 max mem: 33300 Epoch: [9] [ 520/4276] eta: 3:28:11 lr: 3.960976338264082e-05 loss: 0.1540 (0.1665) time: 3.3406 data: 0.0097 max mem: 33300 Epoch: [9] [ 530/4276] eta: 3:27:33 lr: 3.960706342262676e-05 loss: 0.1700 (0.1664) time: 3.3164 data: 0.0095 max mem: 33300 Epoch: [9] [ 540/4276] eta: 3:27:04 lr: 3.960436344216226e-05 loss: 0.1513 (0.1661) time: 3.3181 data: 0.0094 max mem: 33300 Epoch: [9] [ 550/4276] eta: 3:26:29 lr: 3.96016634412456e-05 loss: 0.1581 (0.1662) time: 3.3399 data: 0.0092 max mem: 33300 Epoch: [9] [ 560/4276] eta: 3:25:58 lr: 3.959896341987509e-05 loss: 0.1679 (0.1662) time: 3.3347 data: 0.0087 max mem: 33300 Epoch: [9] [ 570/4276] eta: 3:25:25 lr: 3.959626337804902e-05 loss: 0.1677 (0.1662) time: 3.3419 data: 0.0088 max mem: 33300 Epoch: [9] [ 580/4276] eta: 3:24:50 lr: 3.959356331576568e-05 loss: 0.1596 (0.1661) time: 3.3133 data: 0.0093 max mem: 33300 Epoch: [9] [ 590/4276] eta: 3:24:20 lr: 3.9590863233023385e-05 loss: 0.1453 (0.1658) time: 3.3404 data: 0.0095 max mem: 33300 Epoch: [9] [ 600/4276] eta: 3:23:42 lr: 3.9588163129820397e-05 loss: 0.1473 (0.1658) time: 3.3135 data: 0.0095 max mem: 33300 Epoch: [9] [ 610/4276] eta: 3:23:14 lr: 3.9585463006155044e-05 loss: 0.1626 (0.1657) time: 3.3301 data: 0.0093 max mem: 33300 Epoch: [9] [ 620/4276] eta: 3:22:37 lr: 3.95827628620256e-05 loss: 0.1541 (0.1656) time: 3.3359 data: 0.0087 max mem: 33300 Epoch: [9] [ 630/4276] eta: 3:22:01 lr: 3.9580062697430354e-05 loss: 0.1541 (0.1658) time: 3.2681 data: 0.0085 max mem: 33300 Epoch: [9] [ 640/4276] eta: 3:21:24 lr: 3.9577362512367615e-05 loss: 0.1475 (0.1656) time: 3.2674 data: 0.0087 max mem: 33300 Epoch: [9] [ 650/4276] eta: 3:20:37 
lr: 3.957466230683567e-05 loss: 0.1488 (0.1656) time: 3.1746 data: 0.0091 max mem: 33300 Epoch: [9] [ 660/4276] eta: 3:19:53 lr: 3.957196208083281e-05 loss: 0.1627 (0.1657) time: 3.0960 data: 0.0087 max mem: 33300 Epoch: [9] [ 670/4276] eta: 3:19:08 lr: 3.956926183435734e-05 loss: 0.1688 (0.1657) time: 3.1104 data: 0.0081 max mem: 33300 Epoch: [9] [ 680/4276] eta: 3:18:24 lr: 3.9566561567407525e-05 loss: 0.1648 (0.1655) time: 3.1105 data: 0.0081 max mem: 33300 Epoch: [9] [ 690/4276] eta: 3:17:41 lr: 3.956386127998169e-05 loss: 0.1648 (0.1656) time: 3.1054 data: 0.0082 max mem: 33300 Epoch: [9] [ 700/4276] eta: 3:16:58 lr: 3.9561160972078104e-05 loss: 0.1581 (0.1656) time: 3.1084 data: 0.0085 max mem: 33300 Epoch: [9] [ 710/4276] eta: 3:16:15 lr: 3.9558460643695066e-05 loss: 0.1522 (0.1656) time: 3.1096 data: 0.0087 max mem: 33300 Epoch: [9] [ 720/4276] eta: 3:15:32 lr: 3.955576029483087e-05 loss: 0.1522 (0.1655) time: 3.1046 data: 0.0087 max mem: 33300 Epoch: [9] [ 730/4276] eta: 3:14:50 lr: 3.95530599254838e-05 loss: 0.1501 (0.1655) time: 3.1066 data: 0.0084 max mem: 33300 Epoch: [9] [ 740/4276] eta: 3:14:07 lr: 3.955035953565216e-05 loss: 0.1599 (0.1655) time: 3.1076 data: 0.0081 max mem: 33300 Epoch: [9] [ 750/4276] eta: 3:13:27 lr: 3.954765912533423e-05 loss: 0.1567 (0.1656) time: 3.1146 data: 0.0080 max mem: 33300 Epoch: [9] [ 760/4276] eta: 3:12:45 lr: 3.954495869452829e-05 loss: 0.1505 (0.1654) time: 3.1136 data: 0.0083 max mem: 33300 Epoch: [9] [ 770/4276] eta: 3:12:04 lr: 3.9542258243232654e-05 loss: 0.1543 (0.1655) time: 3.1049 data: 0.0084 max mem: 33300 Epoch: [9] [ 780/4276] eta: 3:11:22 lr: 3.9539557771445596e-05 loss: 0.1646 (0.1655) time: 3.1026 data: 0.0087 max mem: 33300 Epoch: [9] [ 790/4276] eta: 3:10:41 lr: 3.953685727916541e-05 loss: 0.1714 (0.1656) time: 3.0948 data: 0.0088 max mem: 33300 Epoch: [9] [ 800/4276] eta: 3:10:00 lr: 3.953415676639038e-05 loss: 0.1664 (0.1655) time: 3.0952 data: 0.0086 max mem: 33300 Epoch: [9] [ 810/4276] eta: 
3:09:19 lr: 3.95314562331188e-05 loss: 0.1524 (0.1655) time: 3.0914 data: 0.0092 max mem: 33300 Epoch: [9] [ 820/4276] eta: 3:08:39 lr: 3.952875567934897e-05 loss: 0.1453 (0.1655) time: 3.0943 data: 0.0089 max mem: 33300 Epoch: [9] [ 830/4276] eta: 3:07:59 lr: 3.952605510507915e-05 loss: 0.1552 (0.1657) time: 3.1069 data: 0.0080 max mem: 33300 Epoch: [9] [ 840/4276] eta: 3:07:20 lr: 3.952335451030765e-05 loss: 0.1613 (0.1659) time: 3.1087 data: 0.0078 max mem: 33300 Epoch: [9] [ 850/4276] eta: 3:06:41 lr: 3.952065389503276e-05 loss: 0.1613 (0.1659) time: 3.1143 data: 0.0082 max mem: 33300 Epoch: [9] [ 860/4276] eta: 3:06:03 lr: 3.951795325925275e-05 loss: 0.1669 (0.1660) time: 3.1228 data: 0.0086 max mem: 33300 Epoch: [9] [ 870/4276] eta: 3:05:24 lr: 3.9515252602965926e-05 loss: 0.1629 (0.1661) time: 3.1142 data: 0.0081 max mem: 33300 Epoch: [9] [ 880/4276] eta: 3:04:45 lr: 3.951255192617056e-05 loss: 0.1615 (0.1662) time: 3.1045 data: 0.0077 max mem: 33300 Epoch: [9] [ 890/4276] eta: 3:04:07 lr: 3.950985122886494e-05 loss: 0.1733 (0.1663) time: 3.1133 data: 0.0081 max mem: 33300 Epoch: [9] [ 900/4276] eta: 3:03:28 lr: 3.950715051104737e-05 loss: 0.1658 (0.1663) time: 3.1121 data: 0.0083 max mem: 33300 Epoch: [9] [ 910/4276] eta: 3:02:50 lr: 3.950444977271612e-05 loss: 0.1625 (0.1663) time: 3.1097 data: 0.0080 max mem: 33300 Epoch: [9] [ 920/4276] eta: 3:02:12 lr: 3.9501749013869476e-05 loss: 0.1625 (0.1664) time: 3.1089 data: 0.0078 max mem: 33300 Epoch: [9] [ 930/4276] eta: 3:01:34 lr: 3.949904823450573e-05 loss: 0.1669 (0.1663) time: 3.1108 data: 0.0078 max mem: 33300 Epoch: [9] [ 940/4276] eta: 3:00:56 lr: 3.949634743462316e-05 loss: 0.1633 (0.1661) time: 3.1133 data: 0.0079 max mem: 33300 Epoch: [9] [ 950/4276] eta: 3:00:19 lr: 3.9493646614220056e-05 loss: 0.1565 (0.1662) time: 3.1162 data: 0.0079 max mem: 33300 Epoch: [9] [ 960/4276] eta: 2:59:41 lr: 3.94909457732947e-05 loss: 0.1744 (0.1664) time: 3.1045 data: 0.0082 max mem: 33300 Epoch: [9] [ 970/4276] 
eta: 2:59:03 lr: 3.948824491184539e-05 loss: 0.1745 (0.1664) time: 3.0818 data: 0.0083 max mem: 33300 Epoch: [9] [ 980/4276] eta: 2:58:26 lr: 3.9485544029870394e-05 loss: 0.1648 (0.1665) time: 3.0969 data: 0.0078 max mem: 33300 Epoch: [9] [ 990/4276] eta: 2:57:48 lr: 3.9482843127368e-05 loss: 0.1648 (0.1664) time: 3.1118 data: 0.0079 max mem: 33300 Epoch: [9] [1000/4276] eta: 2:57:12 lr: 3.948014220433649e-05 loss: 0.1618 (0.1665) time: 3.1155 data: 0.0076 max mem: 33300 Epoch: [9] [1010/4276] eta: 2:56:36 lr: 3.947744126077415e-05 loss: 0.1585 (0.1665) time: 3.1243 data: 0.0072 max mem: 33300 Epoch: [9] [1020/4276] eta: 2:55:59 lr: 3.9474740296679266e-05 loss: 0.1530 (0.1664) time: 3.1135 data: 0.0074 max mem: 33300 Epoch: [9] [1030/4276] eta: 2:55:22 lr: 3.947203931205012e-05 loss: 0.1578 (0.1665) time: 3.1125 data: 0.0076 max mem: 33300 Epoch: [9] [1040/4276] eta: 2:54:45 lr: 3.9469338306884986e-05 loss: 0.1506 (0.1664) time: 3.1081 data: 0.0081 max mem: 33300 Epoch: [9] [1050/4276] eta: 2:54:09 lr: 3.946663728118216e-05 loss: 0.1587 (0.1666) time: 3.1019 data: 0.0079 max mem: 33300 Epoch: [9] [1060/4276] eta: 2:53:32 lr: 3.946393623493992e-05 loss: 0.1633 (0.1666) time: 3.1052 data: 0.0073 max mem: 33300 Epoch: [9] [1070/4276] eta: 2:52:56 lr: 3.9461235168156535e-05 loss: 0.1670 (0.1666) time: 3.1054 data: 0.0073 max mem: 33300 Epoch: [9] [1080/4276] eta: 2:52:20 lr: 3.945853408083031e-05 loss: 0.1684 (0.1666) time: 3.1085 data: 0.0074 max mem: 33300 Epoch: [9] [1090/4276] eta: 2:51:44 lr: 3.94558329729595e-05 loss: 0.1684 (0.1665) time: 3.0998 data: 0.0080 max mem: 33300 Epoch: [9] [1100/4276] eta: 2:51:07 lr: 3.945313184454241e-05 loss: 0.1610 (0.1666) time: 3.0881 data: 0.0091 max mem: 33300 Epoch: [9] [1110/4276] eta: 2:50:31 lr: 3.94504306955773e-05 loss: 0.1611 (0.1666) time: 3.0952 data: 0.0087 max mem: 33300 Epoch: [9] [1120/4276] eta: 2:49:55 lr: 3.944772952606247e-05 loss: 0.1690 (0.1666) time: 3.1098 data: 0.0082 max mem: 33300 Epoch: [9] [1130/4276] 
eta: 2:49:20 lr: 3.944502833599619e-05 loss: 0.1588 (0.1663) time: 3.1118 data: 0.0085 max mem: 33300 Epoch: [9] [1140/4276] eta: 2:48:44 lr: 3.9442327125376727e-05 loss: 0.1486 (0.1662) time: 3.1070 data: 0.0088 max mem: 33300 Epoch: [9] [1150/4276] eta: 2:48:08 lr: 3.943962589420238e-05 loss: 0.1554 (0.1661) time: 3.1024 data: 0.0086 max mem: 33300 Epoch: [9] [1160/4276] eta: 2:47:32 lr: 3.943692464247143e-05 loss: 0.1658 (0.1663) time: 3.1017 data: 0.0081 max mem: 33300 Epoch: [9] [1170/4276] eta: 2:46:57 lr: 3.943422337018214e-05 loss: 0.1875 (0.1665) time: 3.0950 data: 0.0083 max mem: 33300 Epoch: [9] [1180/4276] eta: 2:46:21 lr: 3.94315220773328e-05 loss: 0.1840 (0.1665) time: 3.0871 data: 0.0082 max mem: 33300 Epoch: [9] [1190/4276] eta: 2:45:46 lr: 3.942882076392168e-05 loss: 0.1570 (0.1665) time: 3.1006 data: 0.0079 max mem: 33300 Epoch: [9] [1200/4276] eta: 2:45:10 lr: 3.942611942994707e-05 loss: 0.1570 (0.1664) time: 3.1061 data: 0.0075 max mem: 33300 Epoch: [9] [1210/4276] eta: 2:44:35 lr: 3.9423418075407246e-05 loss: 0.1600 (0.1665) time: 3.1089 data: 0.0074 max mem: 33300 Epoch: [9] [1220/4276] eta: 2:44:00 lr: 3.942071670030047e-05 loss: 0.1680 (0.1665) time: 3.1023 data: 0.0074 max mem: 33300 Epoch: [9] [1230/4276] eta: 2:43:23 lr: 3.941801530462505e-05 loss: 0.1680 (0.1666) time: 3.0672 data: 0.0082 max mem: 33300 Epoch: [9] [1240/4276] eta: 2:42:47 lr: 3.941531388837922e-05 loss: 0.1581 (0.1667) time: 3.0419 data: 0.0083 max mem: 33300 Epoch: [9] [1250/4276] eta: 2:42:10 lr: 3.941261245156129e-05 loss: 0.1633 (0.1667) time: 3.0343 data: 0.0078 max mem: 33300 Epoch: [9] [1260/4276] eta: 2:41:34 lr: 3.940991099416953e-05 loss: 0.1506 (0.1666) time: 3.0428 data: 0.0079 max mem: 33300 Epoch: [9] [1270/4276] eta: 2:40:58 lr: 3.940720951620221e-05 loss: 0.1506 (0.1665) time: 3.0435 data: 0.0079 max mem: 33300 Epoch: [9] [1280/4276] eta: 2:40:23 lr: 3.940450801765761e-05 loss: 0.1751 (0.1667) time: 3.0704 data: 0.0085 max mem: 33300 Epoch: [9] 
[1290/4276] eta: 2:39:48 lr: 3.9401806498534e-05 loss: 0.1751 (0.1668) time: 3.1055 data: 0.0088 max mem: 33300 Epoch: [9] [1300/4276] eta: 2:39:14 lr: 3.939910495882967e-05 loss: 0.1543 (0.1667) time: 3.1080 data: 0.0084 max mem: 33300 Epoch: [9] [1310/4276] eta: 2:38:40 lr: 3.939640339854289e-05 loss: 0.1384 (0.1666) time: 3.1097 data: 0.0080 max mem: 33300 Epoch: [9] [1320/4276] eta: 2:38:05 lr: 3.9393701817671924e-05 loss: 0.1643 (0.1667) time: 3.1185 data: 0.0083 max mem: 33300 Epoch: [9] [1330/4276] eta: 2:37:31 lr: 3.939100021621505e-05 loss: 0.1616 (0.1666) time: 3.1137 data: 0.0085 max mem: 33300 Epoch: [9] [1340/4276] eta: 2:36:56 lr: 3.9388298594170544e-05 loss: 0.1495 (0.1666) time: 3.0917 data: 0.0087 max mem: 33300 Epoch: [9] [1350/4276] eta: 2:36:22 lr: 3.938559695153669e-05 loss: 0.1683 (0.1666) time: 3.0890 data: 0.0086 max mem: 33300 Epoch: [9] [1360/4276] eta: 2:35:47 lr: 3.938289528831176e-05 loss: 0.1693 (0.1666) time: 3.0952 data: 0.0078 max mem: 33300 Epoch: [9] [1370/4276] eta: 2:35:13 lr: 3.938019360449401e-05 loss: 0.1586 (0.1665) time: 3.1002 data: 0.0078 max mem: 33300 Epoch: [9] [1380/4276] eta: 2:34:39 lr: 3.937749190008173e-05 loss: 0.1684 (0.1667) time: 3.1028 data: 0.0078 max mem: 33300 Epoch: [9] [1390/4276] eta: 2:34:05 lr: 3.9374790175073186e-05 loss: 0.1763 (0.1667) time: 3.1100 data: 0.0082 max mem: 33300 Epoch: [9] [1400/4276] eta: 2:33:31 lr: 3.9372088429466645e-05 loss: 0.1684 (0.1668) time: 3.1123 data: 0.0083 max mem: 33300 Epoch: [9] [1410/4276] eta: 2:32:57 lr: 3.936938666326039e-05 loss: 0.1636 (0.1668) time: 3.1071 data: 0.0078 max mem: 33300 Epoch: [9] [1420/4276] eta: 2:32:23 lr: 3.93666848764527e-05 loss: 0.1613 (0.1669) time: 3.1083 data: 0.0076 max mem: 33300 Epoch: [9] [1430/4276] eta: 2:31:49 lr: 3.936398306904183e-05 loss: 0.1613 (0.1669) time: 3.0899 data: 0.0076 max mem: 33300 Epoch: [9] [1440/4276] eta: 2:31:14 lr: 3.936128124102606e-05 loss: 0.1601 (0.1669) time: 3.0793 data: 0.0075 max mem: 33300 Epoch: 
[9] [1450/4276] eta: 2:30:41 lr: 3.9358579392403655e-05 loss: 0.1631 (0.1669) time: 3.0967 data: 0.0075 max mem: 33300 Epoch: [9] [1460/4276] eta: 2:30:07 lr: 3.93558775231729e-05 loss: 0.1631 (0.1670) time: 3.1108 data: 0.0073 max mem: 33300 Epoch: [9] [1470/4276] eta: 2:29:33 lr: 3.935317563333205e-05 loss: 0.1663 (0.1670) time: 3.0963 data: 0.0081 max mem: 33300 Epoch: [9] [1480/4276] eta: 2:28:58 lr: 3.9350473722879386e-05 loss: 0.1613 (0.1670) time: 3.0647 data: 0.0089 max mem: 33300 Epoch: [9] [1490/4276] eta: 2:28:23 lr: 3.9347771791813165e-05 loss: 0.1500 (0.1670) time: 3.0502 data: 0.0082 max mem: 33300 Epoch: [9] [1500/4276] eta: 2:27:48 lr: 3.9345069840131674e-05 loss: 0.1514 (0.1670) time: 3.0464 data: 0.0079 max mem: 33300 Epoch: [9] [1510/4276] eta: 2:27:14 lr: 3.934236786783317e-05 loss: 0.1514 (0.1670) time: 3.0621 data: 0.0083 max mem: 33300 Epoch: [9] [1520/4276] eta: 2:26:41 lr: 3.933966587491593e-05 loss: 0.1481 (0.1670) time: 3.0887 data: 0.0082 max mem: 33300 Epoch: [9] [1530/4276] eta: 2:26:07 lr: 3.933696386137822e-05 loss: 0.1464 (0.1669) time: 3.1041 data: 0.0083 max mem: 33300 Epoch: [9] [1540/4276] eta: 2:25:33 lr: 3.933426182721831e-05 loss: 0.1663 (0.1670) time: 3.0828 data: 0.0084 max mem: 33300 Epoch: [9] [1550/4276] eta: 2:24:59 lr: 3.9331559772434474e-05 loss: 0.1711 (0.1670) time: 3.0579 data: 0.0081 max mem: 33300 Epoch: [9] [1560/4276] eta: 2:24:25 lr: 3.932885769702497e-05 loss: 0.1615 (0.1670) time: 3.0727 data: 0.0086 max mem: 33300 Epoch: [9] [1570/4276] eta: 2:23:52 lr: 3.9326155600988063e-05 loss: 0.1607 (0.1670) time: 3.0999 data: 0.0088 max mem: 33300 Epoch: [9] [1580/4276] eta: 2:23:18 lr: 3.932345348432204e-05 loss: 0.1645 (0.1669) time: 3.1105 data: 0.0080 max mem: 33300 Epoch: [9] [1590/4276] eta: 2:22:45 lr: 3.932075134702515e-05 loss: 0.1645 (0.1669) time: 3.1112 data: 0.0076 max mem: 33300 Epoch: [9] [1600/4276] eta: 2:22:12 lr: 3.931804918909566e-05 loss: 0.1559 (0.1669) time: 3.1146 data: 0.0077 max mem: 33300 
Epoch: [9] [1610/4276] eta: 2:21:39 lr: 3.931534701053185e-05 loss: 0.1430 (0.1668) time: 3.1143 data: 0.0075 max mem: 33300 Epoch: [9] [1620/4276] eta: 2:21:06 lr: 3.9312644811331986e-05 loss: 0.1411 (0.1667) time: 3.1100 data: 0.0073 max mem: 33300 Epoch: [9] [1630/4276] eta: 2:20:32 lr: 3.930994259149432e-05 loss: 0.1635 (0.1668) time: 3.1060 data: 0.0076 max mem: 33300 Epoch: [9] [1640/4276] eta: 2:19:59 lr: 3.9307240351017126e-05 loss: 0.1616 (0.1668) time: 3.1110 data: 0.0080 max mem: 33300 Epoch: [9] [1650/4276] eta: 2:19:27 lr: 3.930453808989867e-05 loss: 0.1561 (0.1668) time: 3.1188 data: 0.0080 max mem: 33300 Epoch: [9] [1660/4276] eta: 2:18:54 lr: 3.9301835808137214e-05 loss: 0.1569 (0.1668) time: 3.1232 data: 0.0075 max mem: 33300 Epoch: [9] [1670/4276] eta: 2:18:21 lr: 3.929913350573104e-05 loss: 0.1634 (0.1667) time: 3.1156 data: 0.0074 max mem: 33300 Epoch: [9] [1680/4276] eta: 2:17:48 lr: 3.9296431182678386e-05 loss: 0.1644 (0.1668) time: 3.1120 data: 0.0073 max mem: 33300 Epoch: [9] [1690/4276] eta: 2:17:15 lr: 3.929372883897754e-05 loss: 0.1644 (0.1668) time: 3.1130 data: 0.0075 max mem: 33300 Epoch: [9] [1700/4276] eta: 2:16:42 lr: 3.929102647462676e-05 loss: 0.1765 (0.1669) time: 3.1068 data: 0.0078 max mem: 33300 Epoch: [9] [1710/4276] eta: 2:16:09 lr: 3.928832408962429e-05 loss: 0.1917 (0.1670) time: 3.1072 data: 0.0076 max mem: 33300 Epoch: [9] [1720/4276] eta: 2:15:36 lr: 3.9285621683968425e-05 loss: 0.1917 (0.1671) time: 3.1102 data: 0.0075 max mem: 33300 Epoch: [9] [1730/4276] eta: 2:15:03 lr: 3.9282919257657416e-05 loss: 0.1769 (0.1671) time: 3.1082 data: 0.0075 max mem: 33300 Epoch: [9] [1740/4276] eta: 2:14:30 lr: 3.9280216810689516e-05 loss: 0.1701 (0.1672) time: 3.1089 data: 0.0078 max mem: 33300 Epoch: [9] [1750/4276] eta: 2:13:57 lr: 3.9277514343063e-05 loss: 0.1714 (0.1672) time: 3.1093 data: 0.0080 max mem: 33300 Epoch: [9] [1760/4276] eta: 2:13:24 lr: 3.9274811854776125e-05 loss: 0.1553 (0.1671) time: 3.1097 data: 0.0077 max mem: 
33300 Epoch: [9] [1770/4276] eta: 2:12:51 lr: 3.927210934582716e-05 loss: 0.1589 (0.1671) time: 3.1045 data: 0.0075 max mem: 33300 Epoch: [9] [1780/4276] eta: 2:12:18 lr: 3.926940681621436e-05 loss: 0.1655 (0.1671) time: 3.1002 data: 0.0074 max mem: 33300 Epoch: [9] [1790/4276] eta: 2:11:45 lr: 3.926670426593599e-05 loss: 0.1520 (0.1671) time: 3.1045 data: 0.0076 max mem: 33300 Epoch: [9] [1800/4276] eta: 2:11:13 lr: 3.926400169499031e-05 loss: 0.1601 (0.1671) time: 3.1032 data: 0.0079 max mem: 33300 Epoch: [9] [1810/4276] eta: 2:10:40 lr: 3.926129910337559e-05 loss: 0.1746 (0.1672) time: 3.1039 data: 0.0078 max mem: 33300 Epoch: [9] [1820/4276] eta: 2:10:07 lr: 3.925859649109007e-05 loss: 0.1745 (0.1671) time: 3.1071 data: 0.0077 max mem: 33300 Epoch: [9] [1830/4276] eta: 2:09:34 lr: 3.925589385813204e-05 loss: 0.1609 (0.1671) time: 3.1052 data: 0.0077 max mem: 33300 Epoch: [9] [1840/4276] eta: 2:09:01 lr: 3.9253191204499735e-05 loss: 0.1588 (0.1670) time: 3.1078 data: 0.0074 max mem: 33300 Epoch: [9] [1850/4276] eta: 2:08:28 lr: 3.925048853019143e-05 loss: 0.1657 (0.1671) time: 3.0879 data: 0.0073 max mem: 33300 Epoch: [9] [1860/4276] eta: 2:07:56 lr: 3.924778583520538e-05 loss: 0.1707 (0.1671) time: 3.1050 data: 0.0074 max mem: 33300 Epoch: [9] [1870/4276] eta: 2:07:24 lr: 3.924508311953984e-05 loss: 0.1724 (0.1673) time: 3.1490 data: 0.0074 max mem: 33300 Epoch: [9] [1880/4276] eta: 2:06:51 lr: 3.924238038319307e-05 loss: 0.1739 (0.1673) time: 3.1236 data: 0.0076 max mem: 33300 Epoch: [9] [1890/4276] eta: 2:06:19 lr: 3.923967762616334e-05 loss: 0.1602 (0.1673) time: 3.1282 data: 0.0080 max mem: 33300 Epoch: [9] [1900/4276] eta: 2:05:47 lr: 3.923697484844891e-05 loss: 0.1514 (0.1672) time: 3.1438 data: 0.0078 max mem: 33300 Epoch: [9] [1910/4276] eta: 2:05:14 lr: 3.9234272050048014e-05 loss: 0.1530 (0.1672) time: 3.1141 data: 0.0079 max mem: 33300 Epoch: [9] [1920/4276] eta: 2:04:42 lr: 3.9231569230958936e-05 loss: 0.1608 (0.1672) time: 3.1219 data: 0.0084 max 
mem: 33300 Epoch: [9] [1930/4276] eta: 2:04:09 lr: 3.922886639117993e-05 loss: 0.1608 (0.1672) time: 3.1023 data: 0.0088 max mem: 33300 Epoch: [9] [1940/4276] eta: 2:03:37 lr: 3.9226163530709245e-05 loss: 0.1604 (0.1672) time: 3.1142 data: 0.0086 max mem: 33300 Epoch: [9] [1950/4276] eta: 2:03:04 lr: 3.922346064954513e-05 loss: 0.1698 (0.1672) time: 3.1380 data: 0.0081 max mem: 33300 Epoch: [9] [1960/4276] eta: 2:02:32 lr: 3.922075774768587e-05 loss: 0.1466 (0.1671) time: 3.1261 data: 0.0082 max mem: 33300 Epoch: [9] [1970/4276] eta: 2:02:00 lr: 3.921805482512969e-05 loss: 0.1463 (0.1671) time: 3.1456 data: 0.0083 max mem: 33300 Epoch: [9] [1980/4276] eta: 2:01:28 lr: 3.921535188187487e-05 loss: 0.1527 (0.1670) time: 3.1251 data: 0.0080 max mem: 33300 Epoch: [9] [1990/4276] eta: 2:00:56 lr: 3.921264891791967e-05 loss: 0.1558 (0.1670) time: 3.1292 data: 0.0080 max mem: 33300 Epoch: [9] [2000/4276] eta: 2:00:23 lr: 3.920994593326231e-05 loss: 0.1771 (0.1671) time: 3.1393 data: 0.0083 max mem: 33300 Epoch: [9] [2010/4276] eta: 1:59:51 lr: 3.9207242927901095e-05 loss: 0.1608 (0.1670) time: 3.1208 data: 0.0081 max mem: 33300 Epoch: [9] [2020/4276] eta: 1:59:19 lr: 3.9204539901834245e-05 loss: 0.1623 (0.1671) time: 3.1422 data: 0.0079 max mem: 33300 Epoch: [9] [2030/4276] eta: 1:58:46 lr: 3.920183685506002e-05 loss: 0.1654 (0.1670) time: 3.1150 data: 0.0080 max mem: 33300 Epoch: [9] [2040/4276] eta: 1:58:14 lr: 3.919913378757669e-05 loss: 0.1540 (0.1670) time: 3.0897 data: 0.0082 max mem: 33300 Epoch: [9] [2050/4276] eta: 1:57:41 lr: 3.919643069938249e-05 loss: 0.1594 (0.1670) time: 3.1110 data: 0.0086 max mem: 33300 Epoch: [9] [2060/4276] eta: 1:57:09 lr: 3.91937275904757e-05 loss: 0.1612 (0.1670) time: 3.1210 data: 0.0091 max mem: 33300 Epoch: [9] [2070/4276] eta: 1:56:37 lr: 3.919102446085455e-05 loss: 0.1588 (0.1669) time: 3.1319 data: 0.0091 max mem: 33300 Epoch: [9] [2080/4276] eta: 1:56:04 lr: 3.91883213105173e-05 loss: 0.1644 (0.1670) time: 3.1005 data: 0.0092 
max mem: 33300 Epoch: [9] [2090/4276] eta: 1:55:32 lr: 3.9185618139462215e-05 loss: 0.1539 (0.1669) time: 3.1158 data: 0.0091 max mem: 33300 Epoch: [9] [2100/4276] eta: 1:55:00 lr: 3.918291494768753e-05 loss: 0.1602 (0.1670) time: 3.1400 data: 0.0089 max mem: 33300 Epoch: [9] [2110/4276] eta: 1:54:28 lr: 3.918021173519151e-05 loss: 0.1586 (0.1669) time: 3.1211 data: 0.0087 max mem: 33300 Epoch: [9] [2120/4276] eta: 1:53:56 lr: 3.91775085019724e-05 loss: 0.1364 (0.1667) time: 3.1465 data: 0.0088 max mem: 33300 Epoch: [9] [2130/4276] eta: 1:53:24 lr: 3.917480524802846e-05 loss: 0.1346 (0.1666) time: 3.1215 data: 0.0087 max mem: 33300 Epoch: [9] [2140/4276] eta: 1:52:52 lr: 3.917210197335795e-05 loss: 0.1585 (0.1667) time: 3.1338 data: 0.0086 max mem: 33300 Epoch: [9] [2150/4276] eta: 1:52:20 lr: 3.91693986779591e-05 loss: 0.1692 (0.1667) time: 3.1520 data: 0.0090 max mem: 33300 Epoch: [9] [2160/4276] eta: 1:51:48 lr: 3.916669536183018e-05 loss: 0.1623 (0.1667) time: 3.1194 data: 0.0093 max mem: 33300 Epoch: [9] [2170/4276] eta: 1:51:16 lr: 3.9163992024969425e-05 loss: 0.1653 (0.1668) time: 3.1395 data: 0.0095 max mem: 33300 Epoch: [9] [2180/4276] eta: 1:50:43 lr: 3.91612886673751e-05 loss: 0.1702 (0.1668) time: 3.1248 data: 0.0095 max mem: 33300 Epoch: [9] [2190/4276] eta: 1:50:12 lr: 3.915858528904544e-05 loss: 0.1702 (0.1668) time: 3.1293 data: 0.0087 max mem: 33300 Epoch: [9] [2200/4276] eta: 1:49:39 lr: 3.9155881889978714e-05 loss: 0.1560 (0.1668) time: 3.1365 data: 0.0079 max mem: 33300 Epoch: [9] [2210/4276] eta: 1:49:07 lr: 3.9153178470173166e-05 loss: 0.1575 (0.1668) time: 3.1124 data: 0.0077 max mem: 33300 Epoch: [9] [2220/4276] eta: 1:48:35 lr: 3.915047502962704e-05 loss: 0.1730 (0.1668) time: 3.1406 data: 0.0081 max mem: 33300 Epoch: [9] [2230/4276] eta: 1:48:03 lr: 3.9147771568338595e-05 loss: 0.1586 (0.1668) time: 3.1198 data: 0.0082 max mem: 33300 Epoch: [9] [2240/4276] eta: 1:47:31 lr: 3.914506808630607e-05 loss: 0.1496 (0.1667) time: 3.1214 data: 
0.0078 max mem: 33300 Epoch: [9] [2250/4276] eta: 1:46:59 lr: 3.9142364583527714e-05 loss: 0.1442 (0.1666) time: 3.1391 data: 0.0079 max mem: 33300 Epoch: [9] [2260/4276] eta: 1:46:27 lr: 3.913966106000178e-05 loss: 0.1559 (0.1666) time: 3.1063 data: 0.0089 max mem: 33300 Epoch: [9] [2270/4276] eta: 1:45:55 lr: 3.9136957515726524e-05 loss: 0.1537 (0.1666) time: 3.1320 data: 0.0096 max mem: 33300 Epoch: [9] [2280/4276] eta: 1:45:23 lr: 3.913425395070018e-05 loss: 0.1539 (0.1667) time: 3.1241 data: 0.0086 max mem: 33300 Epoch: [9] [2290/4276] eta: 1:44:51 lr: 3.913155036492101e-05 loss: 0.1550 (0.1666) time: 3.1326 data: 0.0079 max mem: 33300 Epoch: [9] [2300/4276] eta: 1:44:19 lr: 3.9128846758387246e-05 loss: 0.1522 (0.1666) time: 3.1504 data: 0.0082 max mem: 33300 Epoch: [9] [2310/4276] eta: 1:43:47 lr: 3.912614313109714e-05 loss: 0.1575 (0.1665) time: 3.1243 data: 0.0082 max mem: 33300 Epoch: [9] [2320/4276] eta: 1:43:15 lr: 3.912343948304896e-05 loss: 0.1575 (0.1665) time: 3.1424 data: 0.0082 max mem: 33300 Epoch: [9] [2330/4276] eta: 1:42:43 lr: 3.9120735814240916e-05 loss: 0.1580 (0.1665) time: 3.1265 data: 0.0083 max mem: 33300 Epoch: [9] [2340/4276] eta: 1:42:11 lr: 3.911803212467128e-05 loss: 0.1543 (0.1665) time: 3.1055 data: 0.0085 max mem: 33300 Epoch: [9] [2350/4276] eta: 1:41:38 lr: 3.9115328414338294e-05 loss: 0.1512 (0.1665) time: 3.0958 data: 0.0089 max mem: 33300 Epoch: [9] [2360/4276] eta: 1:41:06 lr: 3.9112624683240195e-05 loss: 0.1587 (0.1664) time: 3.0717 data: 0.0093 max mem: 33300 Epoch: [9] [2370/4276] eta: 1:40:34 lr: 3.910992093137524e-05 loss: 0.1648 (0.1664) time: 3.0911 data: 0.0094 max mem: 33300 Epoch: [9] [2380/4276] eta: 1:40:02 lr: 3.910721715874167e-05 loss: 0.1611 (0.1664) time: 3.1097 data: 0.0091 max mem: 33300 Epoch: [9] [2390/4276] eta: 1:39:30 lr: 3.9104513365337725e-05 loss: 0.1485 (0.1663) time: 3.1108 data: 0.0084 max mem: 33300 Epoch: [9] [2400/4276] eta: 1:38:57 lr: 3.910180955116166e-05 loss: 0.1567 (0.1664) time: 
3.1036 data: 0.0084 max mem: 33300 Epoch: [9] [2410/4276] eta: 1:38:25 lr: 3.909910571621171e-05 loss: 0.1447 (0.1664) time: 3.1042 data: 0.0091 max mem: 33300 Epoch: [9] [2420/4276] eta: 1:37:53 lr: 3.909640186048612e-05 loss: 0.1440 (0.1663) time: 3.1036 data: 0.0090 max mem: 33300 Epoch: [9] [2430/4276] eta: 1:37:21 lr: 3.909369798398314e-05 loss: 0.1771 (0.1664) time: 3.0997 data: 0.0087 max mem: 33300 Epoch: [9] [2440/4276] eta: 1:36:49 lr: 3.9090994086701e-05 loss: 0.1771 (0.1664) time: 3.0927 data: 0.0087 max mem: 33300 Epoch: [9] [2450/4276] eta: 1:36:17 lr: 3.908829016863797e-05 loss: 0.1606 (0.1664) time: 3.0970 data: 0.0087 max mem: 33300 Epoch: [9] [2460/4276] eta: 1:35:45 lr: 3.9085586229792265e-05 loss: 0.1633 (0.1664) time: 3.1184 data: 0.0083 max mem: 33300 Epoch: [9] [2470/4276] eta: 1:35:13 lr: 3.908288227016214e-05 loss: 0.1633 (0.1665) time: 3.1163 data: 0.0079 max mem: 33300 Epoch: [9] [2480/4276] eta: 1:34:41 lr: 3.908017828974584e-05 loss: 0.1666 (0.1665) time: 3.1188 data: 0.0078 max mem: 33300 Epoch: [9] [2490/4276] eta: 1:34:09 lr: 3.9077474288541606e-05 loss: 0.1572 (0.1664) time: 3.1189 data: 0.0080 max mem: 33300 Epoch: [9] [2500/4276] eta: 1:33:37 lr: 3.907477026654767e-05 loss: 0.1555 (0.1665) time: 3.1126 data: 0.0080 max mem: 33300 Epoch: [9] [2510/4276] eta: 1:33:05 lr: 3.907206622376228e-05 loss: 0.1615 (0.1665) time: 3.1082 data: 0.0081 max mem: 33300 Epoch: [9] [2520/4276] eta: 1:32:33 lr: 3.9069362160183687e-05 loss: 0.1467 (0.1664) time: 3.1082 data: 0.0079 max mem: 33300 Epoch: [9] [2530/4276] eta: 1:32:01 lr: 3.906665807581012e-05 loss: 0.1333 (0.1662) time: 3.1203 data: 0.0078 max mem: 33300 Epoch: [9] [2540/4276] eta: 1:31:29 lr: 3.9063953970639824e-05 loss: 0.1414 (0.1662) time: 3.1142 data: 0.0085 max mem: 33300 Epoch: [9] [2550/4276] eta: 1:30:57 lr: 3.9061249844671045e-05 loss: 0.1475 (0.1662) time: 3.1181 data: 0.0089 max mem: 33300 Epoch: [9] [2560/4276] eta: 1:30:25 lr: 3.9058545697902004e-05 loss: 0.1378 (0.1661) 
time: 3.1245 data: 0.0086 max mem: 33300 Epoch: [9] [2570/4276] eta: 1:29:53 lr: 3.905584153033096e-05 loss: 0.1340 (0.1660) time: 3.1172 data: 0.0085 max mem: 33300 Epoch: [9] [2580/4276] eta: 1:29:21 lr: 3.905313734195615e-05 loss: 0.1475 (0.1660) time: 3.1094 data: 0.0090 max mem: 33300 Epoch: [9] [2590/4276] eta: 1:28:49 lr: 3.90504331327758e-05 loss: 0.1684 (0.1660) time: 3.1019 data: 0.0091 max mem: 33300 Epoch: [9] [2600/4276] eta: 1:28:17 lr: 3.904772890278817e-05 loss: 0.1590 (0.1660) time: 3.1072 data: 0.0085 max mem: 33300 Epoch: [9] [2610/4276] eta: 1:27:46 lr: 3.904502465199148e-05 loss: 0.1472 (0.1659) time: 3.1188 data: 0.0084 max mem: 33300 Epoch: [9] [2620/4276] eta: 1:27:14 lr: 3.9042320380383976e-05 loss: 0.1458 (0.1659) time: 3.1215 data: 0.0083 max mem: 33300 Epoch: [9] [2630/4276] eta: 1:26:42 lr: 3.903961608796391e-05 loss: 0.1476 (0.1658) time: 3.1154 data: 0.0081 max mem: 33300 Epoch: [9] [2640/4276] eta: 1:26:10 lr: 3.903691177472949e-05 loss: 0.1476 (0.1658) time: 3.1209 data: 0.0081 max mem: 33300 Epoch: [9] [2650/4276] eta: 1:25:38 lr: 3.903420744067898e-05 loss: 0.1607 (0.1658) time: 3.1147 data: 0.0080 max mem: 33300 Epoch: [9] [2660/4276] eta: 1:25:06 lr: 3.90315030858106e-05 loss: 0.1620 (0.1658) time: 3.1143 data: 0.0081 max mem: 33300 Epoch: [9] [2670/4276] eta: 1:24:34 lr: 3.9028798710122597e-05 loss: 0.1600 (0.1658) time: 3.1163 data: 0.0083 max mem: 33300 Epoch: [9] [2680/4276] eta: 1:24:02 lr: 3.90260943136132e-05 loss: 0.1622 (0.1658) time: 3.1090 data: 0.0083 max mem: 33300 Epoch: [9] [2690/4276] eta: 1:23:31 lr: 3.902338989628066e-05 loss: 0.1610 (0.1658) time: 3.1156 data: 0.0084 max mem: 33300 Epoch: [9] [2700/4276] eta: 1:22:59 lr: 3.90206854581232e-05 loss: 0.1492 (0.1657) time: 3.1182 data: 0.0089 max mem: 33300 Epoch: [9] [2710/4276] eta: 1:22:27 lr: 3.901798099913906e-05 loss: 0.1483 (0.1657) time: 3.1122 data: 0.0085 max mem: 33300 Epoch: [9] [2720/4276] eta: 1:21:55 lr: 3.901527651932648e-05 loss: 0.1621 (0.1657) 
time: 3.1076 data: 0.0082 max mem: 33300 Epoch: [9] [2730/4276] eta: 1:21:23 lr: 3.901257201868368e-05 loss: 0.1642 (0.1657) time: 3.1112 data: 0.0084 max mem: 33300 Epoch: [9] [2740/4276] eta: 1:20:51 lr: 3.9009867497208906e-05 loss: 0.1651 (0.1657) time: 3.1152 data: 0.0088 max mem: 33300 Epoch: [9] [2750/4276] eta: 1:20:19 lr: 3.900716295490041e-05 loss: 0.1673 (0.1657) time: 3.1076 data: 0.0089 max mem: 33300 Epoch: [9] [2760/4276] eta: 1:19:47 lr: 3.9004458391756385e-05 loss: 0.1567 (0.1657) time: 3.0870 data: 0.0082 max mem: 33300 Epoch: [9] [2770/4276] eta: 1:19:15 lr: 3.9001753807775106e-05 loss: 0.1513 (0.1657) time: 3.0746 data: 0.0084 max mem: 33300 Epoch: [9] [2780/4276] eta: 1:18:43 lr: 3.8999049202954784e-05 loss: 0.1477 (0.1656) time: 3.0933 data: 0.0097 max mem: 33300 Epoch: [9] [2790/4276] eta: 1:18:12 lr: 3.8996344577293664e-05 loss: 0.1682 (0.1657) time: 3.1177 data: 0.0101 max mem: 33300 Epoch: [9] [2800/4276] eta: 1:17:40 lr: 3.899363993078996e-05 loss: 0.1730 (0.1657) time: 3.1152 data: 0.0100 max mem: 33300 Epoch: [9] [2810/4276] eta: 1:17:08 lr: 3.899093526344193e-05 loss: 0.1412 (0.1656) time: 3.0989 data: 0.0100 max mem: 33300 Epoch: [9] [2820/4276] eta: 1:16:36 lr: 3.89882305752478e-05 loss: 0.1432 (0.1656) time: 3.0849 data: 0.0096 max mem: 33300 Epoch: [9] [2830/4276] eta: 1:16:04 lr: 3.89855258662058e-05 loss: 0.1513 (0.1655) time: 3.0738 data: 0.0093 max mem: 33300 Epoch: [9] [2840/4276] eta: 1:15:32 lr: 3.8982821136314143e-05 loss: 0.1614 (0.1655) time: 3.0777 data: 0.0085 max mem: 33300 Epoch: [9] [2850/4276] eta: 1:15:00 lr: 3.898011638557109e-05 loss: 0.1749 (0.1656) time: 3.0775 data: 0.0082 max mem: 33300 Epoch: [9] [2860/4276] eta: 1:14:28 lr: 3.897741161397487e-05 loss: 0.1683 (0.1656) time: 3.0618 data: 0.0091 max mem: 33300 Epoch: [9] [2870/4276] eta: 1:13:56 lr: 3.897470682152369e-05 loss: 0.1590 (0.1656) time: 3.0776 data: 0.0090 max mem: 33300 Epoch: [9] [2880/4276] eta: 1:13:24 lr: 3.8972002008215804e-05 loss: 0.1590 
(0.1656) time: 3.0987 data: 0.0081 max mem: 33300 Epoch: [9] [2890/4276] eta: 1:12:52 lr: 3.896929717404943e-05 loss: 0.1520 (0.1656) time: 3.0936 data: 0.0081 max mem: 33300 Epoch: [9] [2900/4276] eta: 1:12:21 lr: 3.89665923190228e-05 loss: 0.1426 (0.1655) time: 3.0899 data: 0.0081 max mem: 33300 Epoch: [9] [2910/4276] eta: 1:11:49 lr: 3.896388744313417e-05 loss: 0.1495 (0.1655) time: 3.0734 data: 0.0082 max mem: 33300 Epoch: [9] [2920/4276] eta: 1:11:16 lr: 3.896118254638173e-05 loss: 0.1495 (0.1655) time: 3.0527 data: 0.0080 max mem: 33300 Epoch: [9] [2930/4276] eta: 1:10:45 lr: 3.895847762876373e-05 loss: 0.1474 (0.1655) time: 3.0728 data: 0.0081 max mem: 33300 Epoch: [9] [2940/4276] eta: 1:10:13 lr: 3.8955772690278395e-05 loss: 0.1484 (0.1654) time: 3.0995 data: 0.0080 max mem: 33300 Epoch: [9] [2950/4276] eta: 1:09:41 lr: 3.895306773092396e-05 loss: 0.1568 (0.1654) time: 3.0979 data: 0.0075 max mem: 33300 Epoch: [9] [2960/4276] eta: 1:09:09 lr: 3.895036275069865e-05 loss: 0.1552 (0.1653) time: 3.0986 data: 0.0079 max mem: 33300 Epoch: [9] [2970/4276] eta: 1:08:38 lr: 3.8947657749600694e-05 loss: 0.1612 (0.1654) time: 3.1039 data: 0.0077 max mem: 33300 Epoch: [9] [2980/4276] eta: 1:08:06 lr: 3.8944952727628315e-05 loss: 0.1660 (0.1654) time: 3.1101 data: 0.0079 max mem: 33300 Epoch: [9] [2990/4276] eta: 1:07:34 lr: 3.894224768477975e-05 loss: 0.1523 (0.1654) time: 3.1076 data: 0.0087 max mem: 33300 Epoch: [9] [3000/4276] eta: 1:07:02 lr: 3.8939542621053214e-05 loss: 0.1492 (0.1653) time: 3.0909 data: 0.0084 max mem: 33300 Epoch: [9] [3010/4276] eta: 1:06:31 lr: 3.893683753644695e-05 loss: 0.1529 (0.1653) time: 3.0869 data: 0.0079 max mem: 33300 Epoch: [9] [3020/4276] eta: 1:05:59 lr: 3.8934132430959174e-05 loss: 0.1591 (0.1653) time: 3.0767 data: 0.0082 max mem: 33300 Epoch: [9] [3030/4276] eta: 1:05:27 lr: 3.893142730458812e-05 loss: 0.1492 (0.1653) time: 3.0585 data: 0.0086 max mem: 33300 Epoch: [9] [3040/4276] eta: 1:04:55 lr: 3.8928722157332005e-05 loss: 
0.1672 (0.1654) time: 3.0425 data: 0.0083 max mem: 33300 Epoch: [9] [3050/4276] eta: 1:04:23 lr: 3.892601698918906e-05 loss: 0.1672 (0.1654) time: 3.0617 data: 0.0082 max mem: 33300 Epoch: [9] [3060/4276] eta: 1:03:51 lr: 3.892331180015752e-05 loss: 0.1414 (0.1653) time: 3.0961 data: 0.0079 max mem: 33300 Epoch: [9] [3070/4276] eta: 1:03:19 lr: 3.8920606590235595e-05 loss: 0.1496 (0.1653) time: 3.0983 data: 0.0075 max mem: 33300 Epoch: [9] [3080/4276] eta: 1:02:48 lr: 3.8917901359421516e-05 loss: 0.1489 (0.1653) time: 3.0979 data: 0.0077 max mem: 33300 Epoch: [9] [3090/4276] eta: 1:02:16 lr: 3.891519610771352e-05 loss: 0.1489 (0.1652) time: 3.0961 data: 0.0077 max mem: 33300 Epoch: [9] [3100/4276] eta: 1:01:44 lr: 3.891249083510981e-05 loss: 0.1540 (0.1653) time: 3.1033 data: 0.0081 max mem: 33300 Epoch: [9] [3110/4276] eta: 1:01:13 lr: 3.8909785541608625e-05 loss: 0.1536 (0.1652) time: 3.1069 data: 0.0082 max mem: 33300 Epoch: [9] [3120/4276] eta: 1:00:41 lr: 3.890708022720819e-05 loss: 0.1448 (0.1652) time: 3.1039 data: 0.0079 max mem: 33300 Epoch: [9] [3130/4276] eta: 1:00:09 lr: 3.8904374891906715e-05 loss: 0.1444 (0.1651) time: 3.0851 data: 0.0091 max mem: 33300 Epoch: [9] [3140/4276] eta: 0:59:37 lr: 3.8901669535702446e-05 loss: 0.1491 (0.1651) time: 3.0680 data: 0.0097 max mem: 33300 Epoch: [9] [3150/4276] eta: 0:59:06 lr: 3.889896415859358e-05 loss: 0.1718 (0.1651) time: 3.0824 data: 0.0085 max mem: 33300 Epoch: [9] [3160/4276] eta: 0:58:34 lr: 3.889625876057836e-05 loss: 0.1629 (0.1651) time: 3.0937 data: 0.0078 max mem: 33300 Epoch: [9] [3170/4276] eta: 0:58:02 lr: 3.889355334165501e-05 loss: 0.1602 (0.1652) time: 3.1054 data: 0.0078 max mem: 33300 Epoch: [9] [3180/4276] eta: 0:57:31 lr: 3.8890847901821734e-05 loss: 0.1565 (0.1652) time: 3.1027 data: 0.0082 max mem: 33300 Epoch: [9] [3190/4276] eta: 0:56:59 lr: 3.888814244107677e-05 loss: 0.1536 (0.1652) time: 3.1005 data: 0.0082 max mem: 33300 Epoch: [9] [3200/4276] eta: 0:56:27 lr: 3.888543695941833e-05 
loss: 0.1623 (0.1652) time: 3.1072 data: 0.0083 max mem: 33300 Epoch: [9] [3210/4276] eta: 0:55:56 lr: 3.888273145684464e-05 loss: 0.1691 (0.1652) time: 3.1104 data: 0.0083 max mem: 33300 Epoch: [9] [3220/4276] eta: 0:55:24 lr: 3.888002593335393e-05 loss: 0.1704 (0.1652) time: 3.1213 data: 0.0087 max mem: 33300 Epoch: [9] [3230/4276] eta: 0:54:53 lr: 3.887732038894441e-05 loss: 0.1529 (0.1652) time: 3.1235 data: 0.0091 max mem: 33300 Epoch: [9] [3240/4276] eta: 0:54:21 lr: 3.88746148236143e-05 loss: 0.1644 (0.1652) time: 3.1160 data: 0.0089 max mem: 33300 Epoch: [9] [3250/4276] eta: 0:53:50 lr: 3.887190923736183e-05 loss: 0.1773 (0.1652) time: 3.1184 data: 0.0089 max mem: 33300 Epoch: [9] [3260/4276] eta: 0:53:18 lr: 3.8869203630185207e-05 loss: 0.1670 (0.1652) time: 3.1210 data: 0.0089 max mem: 33300 Epoch: [9] [3270/4276] eta: 0:52:46 lr: 3.886649800208266e-05 loss: 0.1712 (0.1653) time: 3.0686 data: 0.0083 max mem: 33300 Epoch: [9] [3280/4276] eta: 0:52:14 lr: 3.88637923530524e-05 loss: 0.1539 (0.1653) time: 3.0174 data: 0.0086 max mem: 33300 Epoch: [9] [3290/4276] eta: 0:51:42 lr: 3.886108668309266e-05 loss: 0.1620 (0.1653) time: 3.0096 data: 0.0093 max mem: 33300 Epoch: [9] [3300/4276] eta: 0:51:11 lr: 3.8858380992201654e-05 loss: 0.1620 (0.1653) time: 3.0248 data: 0.0089 max mem: 33300 Epoch: [9] [3310/4276] eta: 0:50:39 lr: 3.885567528037759e-05 loss: 0.1694 (0.1654) time: 3.0459 data: 0.0086 max mem: 33300 Epoch: [9] [3320/4276] eta: 0:50:07 lr: 3.8852969547618704e-05 loss: 0.1674 (0.1654) time: 3.0531 data: 0.0083 max mem: 33300 Epoch: [9] [3330/4276] eta: 0:49:35 lr: 3.88502637939232e-05 loss: 0.1603 (0.1654) time: 3.0481 data: 0.0081 max mem: 33300 Epoch: [9] [3340/4276] eta: 0:49:04 lr: 3.88475580192893e-05 loss: 0.1648 (0.1654) time: 3.0412 data: 0.0078 max mem: 33300 Epoch: [9] [3350/4276] eta: 0:48:32 lr: 3.884485222371522e-05 loss: 0.1602 (0.1654) time: 3.0392 data: 0.0076 max mem: 33300 Epoch: [9] [3360/4276] eta: 0:48:00 lr: 3.8842146407199175e-05 
loss: 0.1468 (0.1654) time: 3.0387 data: 0.0076 max mem: 33300 Epoch: [9] [3370/4276] eta: 0:47:28 lr: 3.88394405697394e-05 loss: 0.1515 (0.1654) time: 3.0149 data: 0.0081 max mem: 33300 Epoch: [9] [3380/4276] eta: 0:46:56 lr: 3.883673471133408e-05 loss: 0.1515 (0.1654) time: 2.9919 data: 0.0091 max mem: 33300 Epoch: [9] [3390/4276] eta: 0:46:25 lr: 3.883402883198146e-05 loss: 0.1594 (0.1654) time: 3.0170 data: 0.0091 max mem: 33300 Epoch: [9] [3400/4276] eta: 0:45:53 lr: 3.883132293167975e-05 loss: 0.1737 (0.1654) time: 3.0354 data: 0.0086 max mem: 33300 Epoch: [9] [3410/4276] eta: 0:45:21 lr: 3.8828617010427155e-05 loss: 0.1638 (0.1655) time: 3.0340 data: 0.0085 max mem: 33300 Epoch: [9] [3420/4276] eta: 0:44:50 lr: 3.882591106822189e-05 loss: 0.1611 (0.1655) time: 3.0339 data: 0.0078 max mem: 33300 Epoch: [9] [3430/4276] eta: 0:44:18 lr: 3.882320510506219e-05 loss: 0.1643 (0.1655) time: 3.0587 data: 0.0074 max mem: 33300 Epoch: [9] [3440/4276] eta: 0:43:46 lr: 3.8820499120946245e-05 loss: 0.1587 (0.1654) time: 3.0678 data: 0.0079 max mem: 33300 Epoch: [9] [3450/4276] eta: 0:43:15 lr: 3.881779311587229e-05 loss: 0.1461 (0.1655) time: 3.0648 data: 0.0081 max mem: 33300 Epoch: [9] [3460/4276] eta: 0:42:43 lr: 3.881508708983853e-05 loss: 0.1727 (0.1655) time: 3.0776 data: 0.0080 max mem: 33300 Epoch: [9] [3470/4276] eta: 0:42:12 lr: 3.881238104284317e-05 loss: 0.1593 (0.1655) time: 3.0821 data: 0.0084 max mem: 33300 Epoch: [9] [3480/4276] eta: 0:41:40 lr: 3.8809674974884444e-05 loss: 0.1567 (0.1654) time: 3.0667 data: 0.0086 max mem: 33300 Epoch: [9] [3490/4276] eta: 0:41:08 lr: 3.880696888596055e-05 loss: 0.1635 (0.1655) time: 3.0633 data: 0.0081 max mem: 33300 Epoch: [9] [3500/4276] eta: 0:40:37 lr: 3.88042627760697e-05 loss: 0.1685 (0.1655) time: 3.0808 data: 0.0078 max mem: 33300 Epoch: [9] [3510/4276] eta: 0:40:05 lr: 3.880155664521012e-05 loss: 0.1661 (0.1654) time: 3.0822 data: 0.0076 max mem: 33300 Epoch: [9] [3520/4276] eta: 0:39:34 lr: 
3.879885049338001e-05 loss: 0.1529 (0.1654) time: 3.0801 data: 0.0075 max mem: 33300 Epoch: [9] [3530/4276] eta: 0:39:02 lr: 3.8796144320577596e-05 loss: 0.1546 (0.1655) time: 3.0767 data: 0.0077 max mem: 33300 Epoch: [9] [3540/4276] eta: 0:38:31 lr: 3.879343812680108e-05 loss: 0.1556 (0.1655) time: 3.0557 data: 0.0075 max mem: 33300 Epoch: [9] [3550/4276] eta: 0:37:59 lr: 3.879073191204867e-05 loss: 0.1513 (0.1654) time: 3.0057 data: 0.0072 max mem: 33300 Epoch: [9] [3560/4276] eta: 0:37:53 lr: 3.878802567631859e-05 loss: 0.1513 (0.1654) time: 9.3442 data: 6.3258 max mem: 33300 Epoch: [9] [3570/4276] eta: 0:37:21 lr: 3.878531941960903e-05 loss: 0.1747 (0.1655) time: 9.3972 data: 6.3268 max mem: 33300 Epoch: [9] [3580/4276] eta: 0:36:49 lr: 3.878261314191823e-05 loss: 0.1476 (0.1654) time: 3.0762 data: 0.0090 max mem: 33300 Epoch: [9] [3590/4276] eta: 0:36:17 lr: 3.877990684324438e-05 loss: 0.1409 (0.1654) time: 3.0726 data: 0.0088 max mem: 33300 Epoch: [9] [3600/4276] eta: 0:35:45 lr: 3.87772005235857e-05 loss: 0.1567 (0.1654) time: 3.0731 data: 0.0088 max mem: 33300 Epoch: [9] [3610/4276] eta: 0:35:13 lr: 3.8774494182940383e-05 loss: 0.1649 (0.1654) time: 3.0756 data: 0.0090 max mem: 33300 Epoch: [9] [3620/4276] eta: 0:34:41 lr: 3.877178782130666e-05 loss: 0.1644 (0.1654) time: 3.0768 data: 0.0085 max mem: 33300 Epoch: [9] [3630/4276] eta: 0:34:09 lr: 3.876908143868273e-05 loss: 0.1591 (0.1654) time: 3.0740 data: 0.0077 max mem: 33300 Epoch: [9] [3640/4276] eta: 0:33:37 lr: 3.876637503506681e-05 loss: 0.1498 (0.1653) time: 3.0714 data: 0.0077 max mem: 33300 Epoch: [9] [3650/4276] eta: 0:33:06 lr: 3.8763668610457094e-05 loss: 0.1404 (0.1653) time: 3.0729 data: 0.0078 max mem: 33300 Epoch: [9] [3660/4276] eta: 0:32:34 lr: 3.87609621648518e-05 loss: 0.1404 (0.1653) time: 3.0728 data: 0.0079 max mem: 33300 Epoch: [9] [3670/4276] eta: 0:32:02 lr: 3.875825569824914e-05 loss: 0.1490 (0.1652) time: 3.0736 data: 0.0077 max mem: 33300 Epoch: [9] [3680/4276] eta: 0:31:30 
lr: 3.8755549210647315e-05 loss: 0.1697 (0.1653) time: 3.0766 data: 0.0076 max mem: 33300 Epoch: [9] [3690/4276] eta: 0:30:58 lr: 3.8752842702044536e-05 loss: 0.1698 (0.1653) time: 3.0761 data: 0.0078 max mem: 33300 Epoch: [9] [3700/4276] eta: 0:30:26 lr: 3.875013617243901e-05 loss: 0.1578 (0.1652) time: 3.0768 data: 0.0078 max mem: 33300 Epoch: [9] [3710/4276] eta: 0:29:54 lr: 3.874742962182894e-05 loss: 0.1540 (0.1652) time: 3.0799 data: 0.0076 max mem: 33300 Epoch: [9] [3720/4276] eta: 0:29:22 lr: 3.874472305021253e-05 loss: 0.1340 (0.1652) time: 3.0798 data: 0.0077 max mem: 33300 Epoch: [9] [3730/4276] eta: 0:28:51 lr: 3.8742016457588e-05 loss: 0.1518 (0.1652) time: 3.0800 data: 0.0078 max mem: 33300 Epoch: [9] [3740/4276] eta: 0:28:19 lr: 3.873930984395355e-05 loss: 0.1607 (0.1651) time: 3.0799 data: 0.0079 max mem: 33300 Epoch: [9] [3750/4276] eta: 0:27:47 lr: 3.873660320930738e-05 loss: 0.1525 (0.1651) time: 3.0756 data: 0.0077 max mem: 33300 Epoch: [9] [3760/4276] eta: 0:27:15 lr: 3.8733896553647705e-05 loss: 0.1525 (0.1651) time: 3.0730 data: 0.0077 max mem: 33300 Epoch: [9] [3770/4276] eta: 0:26:43 lr: 3.873118987697272e-05 loss: 0.1623 (0.1651) time: 3.0705 data: 0.0079 max mem: 33300 Epoch: [9] [3780/4276] eta: 0:26:11 lr: 3.8728483179280636e-05 loss: 0.1636 (0.1651) time: 3.0695 data: 0.0079 max mem: 33300 Epoch: [9] [3790/4276] eta: 0:25:40 lr: 3.872577646056966e-05 loss: 0.1592 (0.1651) time: 3.0739 data: 0.0077 max mem: 33300 Epoch: [9] [3800/4276] eta: 0:25:08 lr: 3.872306972083799e-05 loss: 0.1632 (0.1651) time: 3.0779 data: 0.0077 max mem: 33300 Epoch: [9] [3810/4276] eta: 0:24:36 lr: 3.872036296008383e-05 loss: 0.1591 (0.1651) time: 3.0764 data: 0.0076 max mem: 33300 Epoch: [9] [3820/4276] eta: 0:24:04 lr: 3.871765617830539e-05 loss: 0.1359 (0.1650) time: 3.0761 data: 0.0077 max mem: 33300 Epoch: [9] [3830/4276] eta: 0:23:32 lr: 3.871494937550088e-05 loss: 0.1375 (0.1650) time: 3.0739 data: 0.0076 max mem: 33300 Epoch: [9] [3840/4276] eta: 
0:23:01 lr: 3.8712242551668486e-05 loss: 0.1395 (0.1650) time: 3.0636 data: 0.0079 max mem: 33300 Epoch: [9] [3850/4276] eta: 0:22:29 lr: 3.870953570680642e-05 loss: 0.1302 (0.1649) time: 3.0510 data: 0.0080 max mem: 33300 Epoch: [9] [3860/4276] eta: 0:21:57 lr: 3.870682884091289e-05 loss: 0.1561 (0.1649) time: 3.0411 data: 0.0078 max mem: 33300 Epoch: [9] [3870/4276] eta: 0:21:25 lr: 3.870412195398608e-05 loss: 0.1608 (0.1649) time: 3.0370 data: 0.0076 max mem: 33300 Epoch: [9] [3880/4276] eta: 0:20:53 lr: 3.8701415046024216e-05 loss: 0.1630 (0.1649) time: 3.0356 data: 0.0075 max mem: 33300 Epoch: [9] [3890/4276] eta: 0:20:22 lr: 3.869870811702548e-05 loss: 0.1499 (0.1649) time: 3.0312 data: 0.0075 max mem: 33300 Epoch: [9] [3900/4276] eta: 0:19:50 lr: 3.869600116698808e-05 loss: 0.1536 (0.1649) time: 3.0282 data: 0.0075 max mem: 33300 Epoch: [9] [3910/4276] eta: 0:19:18 lr: 3.8693294195910225e-05 loss: 0.1393 (0.1648) time: 3.0283 data: 0.0072 max mem: 33300 Epoch: [9] [3920/4276] eta: 0:18:46 lr: 3.8690587203790106e-05 loss: 0.1473 (0.1648) time: 3.0317 data: 0.0072 max mem: 33300 Epoch: [9] [3930/4276] eta: 0:18:14 lr: 3.868788019062593e-05 loss: 0.1589 (0.1648) time: 3.0279 data: 0.0075 max mem: 33300 Epoch: [9] [3940/4276] eta: 0:17:43 lr: 3.8685173156415896e-05 loss: 0.1509 (0.1648) time: 3.0045 data: 0.0077 max mem: 33300 Epoch: [9] [3950/4276] eta: 0:17:11 lr: 3.86824661011582e-05 loss: 0.1455 (0.1648) time: 2.9814 data: 0.0080 max mem: 33300 Epoch: [9] [3960/4276] eta: 0:16:39 lr: 3.867975902485104e-05 loss: 0.1588 (0.1648) time: 2.9802 data: 0.0079 max mem: 33300 Epoch: [9] [3970/4276] eta: 0:16:07 lr: 3.867705192749262e-05 loss: 0.1767 (0.1648) time: 2.9795 data: 0.0075 max mem: 33300 Epoch: [9] [3980/4276] eta: 0:15:36 lr: 3.8674344809081145e-05 loss: 0.1510 (0.1648) time: 2.9759 data: 0.0075 max mem: 33300 Epoch: [9] [3990/4276] eta: 0:15:04 lr: 3.86716376696148e-05 loss: 0.1375 (0.1647) time: 2.9740 data: 0.0076 max mem: 33300 Epoch: [9] [4000/4276] 
eta: 0:14:32 lr: 3.86689305090918e-05 loss: 0.1423 (0.1647) time: 2.9757 data: 0.0076 max mem: 33300 Epoch: [9] [4010/4276] eta: 0:14:00 lr: 3.8666223327510326e-05 loss: 0.1525 (0.1648) time: 2.9790 data: 0.0075 max mem: 33300 Epoch: [9] [4020/4276] eta: 0:13:29 lr: 3.866351612486859e-05 loss: 0.1560 (0.1648) time: 2.9780 data: 0.0076 max mem: 33300 Epoch: [9] [4030/4276] eta: 0:12:57 lr: 3.866080890116478e-05 loss: 0.1560 (0.1648) time: 2.9720 data: 0.0080 max mem: 33300 Epoch: [9] [4040/4276] eta: 0:12:25 lr: 3.86581016563971e-05 loss: 0.1702 (0.1648) time: 2.9714 data: 0.0084 max mem: 33300 Epoch: [9] [4050/4276] eta: 0:11:53 lr: 3.865539439056374e-05 loss: 0.1817 (0.1648) time: 2.9590 data: 0.0087 max mem: 33300 Epoch: [9] [4060/4276] eta: 0:11:22 lr: 3.86526871036629e-05 loss: 0.1739 (0.1649) time: 2.9551 data: 0.0088 max mem: 33300 Epoch: [9] [4070/4276] eta: 0:10:50 lr: 3.864997979569279e-05 loss: 0.1687 (0.1649) time: 2.9551 data: 0.0087 max mem: 33300 Epoch: [9] [4080/4276] eta: 0:10:18 lr: 3.864727246665159e-05 loss: 0.1622 (0.1649) time: 2.9420 data: 0.0080 max mem: 33300 Epoch: [9] [4090/4276] eta: 0:09:47 lr: 3.864456511653749e-05 loss: 0.1675 (0.1649) time: 2.9407 data: 0.0077 max mem: 33300 Epoch: [9] [4100/4276] eta: 0:09:15 lr: 3.8641857745348704e-05 loss: 0.1675 (0.1649) time: 2.9562 data: 0.0082 max mem: 33300 Epoch: [9] [4110/4276] eta: 0:08:43 lr: 3.863915035308341e-05 loss: 0.1637 (0.1649) time: 2.9657 data: 0.0082 max mem: 33300 Epoch: [9] [4120/4276] eta: 0:08:12 lr: 3.863644293973982e-05 loss: 0.1626 (0.1650) time: 2.9610 data: 0.0075 max mem: 33300 Epoch: [9] [4130/4276] eta: 0:07:40 lr: 3.863373550531612e-05 loss: 0.1501 (0.1649) time: 2.9649 data: 0.0076 max mem: 33300 Epoch: [9] [4140/4276] eta: 0:07:09 lr: 3.86310280498105e-05 loss: 0.1447 (0.1649) time: 2.9683 data: 0.0078 max mem: 33300 Epoch: [9] [4150/4276] eta: 0:06:37 lr: 3.8628320573221164e-05 loss: 0.1525 (0.1649) time: 2.9672 data: 0.0075 max mem: 33300 Epoch: [9] [4160/4276] 
eta: 0:06:05 lr: 3.862561307554629e-05 loss: 0.1621 (0.1649) time: 2.9637 data: 0.0076 max mem: 33300
Epoch: [9] [4170/4276] eta: 0:05:34 lr: 3.862290555678409e-05 loss: 0.1890 (0.1650) time: 2.9617 data: 0.0075 max mem: 33300
Epoch: [9] [4180/4276] eta: 0:05:02 lr: 3.862019801693276e-05 loss: 0.1711 (0.1649) time: 2.9647 data: 0.0072 max mem: 33300
Epoch: [9] [4190/4276] eta: 0:04:31 lr: 3.861749045599046e-05 loss: 0.1481 (0.1649) time: 2.9660 data: 0.0074 max mem: 33300
Epoch: [9] [4200/4276] eta: 0:03:59 lr: 3.861478287395542e-05 loss: 0.1603 (0.1650) time: 2.9676 data: 0.0076 max mem: 33300
Epoch: [9] [4210/4276] eta: 0:03:27 lr: 3.8612075270825816e-05 loss: 0.1804 (0.1650) time: 2.9656 data: 0.0079 max mem: 33300
Epoch: [9] [4220/4276] eta: 0:02:56 lr: 3.860936764659984e-05 loss: 0.1854 (0.1650) time: 2.9624 data: 0.0079 max mem: 33300
Epoch: [9] [4230/4276] eta: 0:02:24 lr: 3.860666000127569e-05 loss: 0.1812 (0.1651) time: 2.9629 data: 0.0076 max mem: 33300
Epoch: [9] [4240/4276] eta: 0:01:53 lr: 3.860395233485154e-05 loss: 0.1812 (0.1651) time: 2.9594 data: 0.0076 max mem: 33300
Epoch: [9] [4250/4276] eta: 0:01:21 lr: 3.86012446473256e-05 loss: 0.1714 (0.1651) time: 2.9627 data: 0.0077 max mem: 33300
Epoch: [9] [4260/4276] eta: 0:00:50 lr: 3.859853693869606e-05 loss: 0.1695 (0.1651) time: 2.9656 data: 0.0081 max mem: 33300
Epoch: [9] [4270/4276] eta: 0:00:18 lr: 3.85958292089611e-05 loss: 0.1673 (0.1652) time: 2.9570 data: 0.0077 max mem: 33300
Epoch: [9] Total time: 3:44:23
Test: [ 0/21770] eta: 11:38:37 time: 1.9255 data: 1.8818 max mem: 33300
Test: [ 100/21770] eta: 0:20:40 time: 0.0386 data: 0.0009 max mem: 33300
Test: [ 200/21770] eta: 0:17:15 time: 0.0389 data: 0.0008 max mem: 33300
Test: [ 300/21770] eta: 0:16:04 time: 0.0385 data: 0.0009 max mem: 33300
Test: [ 400/21770] eta: 0:15:26 time: 0.0388 data: 0.0009 max mem: 33300
Test: [ 500/21770] eta: 0:15:02 time: 0.0388 data: 0.0009 max mem: 33300
Test: [ 600/21770] eta: 0:14:45 time: 0.0390 data:
0.0008 max mem: 33300 Test: [ 700/21770] eta: 0:14:31 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 800/21770] eta: 0:14:18 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 900/21770] eta: 0:14:08 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 1000/21770] eta: 0:13:59 time: 0.0384 data: 0.0008 max mem: 33300 Test: [ 1100/21770] eta: 0:13:51 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 1200/21770] eta: 0:13:44 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 1300/21770] eta: 0:13:37 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 1400/21770] eta: 0:13:30 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 1500/21770] eta: 0:13:24 time: 0.0385 data: 0.0008 max mem: 33300 Test: [ 1600/21770] eta: 0:13:20 time: 0.0396 data: 0.0008 max mem: 33300 Test: [ 1700/21770] eta: 0:13:16 time: 0.0397 data: 0.0008 max mem: 33300 Test: [ 1800/21770] eta: 0:13:12 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 1900/21770] eta: 0:13:08 time: 0.0396 data: 0.0008 max mem: 33300 Test: [ 2000/21770] eta: 0:13:04 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 2100/21770] eta: 0:13:00 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 2200/21770] eta: 0:12:56 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 2300/21770] eta: 0:12:52 time: 0.0396 data: 0.0008 max mem: 33300 Test: [ 2400/21770] eta: 0:12:48 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 2500/21770] eta: 0:12:45 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 2600/21770] eta: 0:12:41 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 2700/21770] eta: 0:12:37 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 2800/21770] eta: 0:12:33 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 2900/21770] eta: 0:12:30 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 3000/21770] eta: 0:12:26 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 3100/21770] eta: 0:12:22 time: 0.0401 data: 0.0009 max mem: 33300 Test: [ 3200/21770] eta: 0:12:18 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 3300/21770] eta: 0:12:14 time: 0.0402 data: 
0.0008 max mem: 33300 Test: [ 3400/21770] eta: 0:12:10 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3500/21770] eta: 0:12:06 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 3600/21770] eta: 0:12:03 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 3700/21770] eta: 0:11:59 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3800/21770] eta: 0:11:55 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 3900/21770] eta: 0:11:51 time: 0.0402 data: 0.0009 max mem: 33300 Test: [ 4000/21770] eta: 0:11:47 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 4100/21770] eta: 0:11:43 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 4200/21770] eta: 0:11:39 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 4300/21770] eta: 0:11:35 time: 0.0402 data: 0.0009 max mem: 33300 Test: [ 4400/21770] eta: 0:11:31 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 4500/21770] eta: 0:11:27 time: 0.0401 data: 0.0009 max mem: 33300 Test: [ 4600/21770] eta: 0:11:23 time: 0.0397 data: 0.0009 max mem: 33300 Test: [ 4700/21770] eta: 0:11:20 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 4800/21770] eta: 0:11:16 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 4900/21770] eta: 0:11:12 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 5000/21770] eta: 0:11:08 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5100/21770] eta: 0:11:04 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5200/21770] eta: 0:11:00 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5300/21770] eta: 0:10:56 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 5400/21770] eta: 0:10:52 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 5500/21770] eta: 0:10:48 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5600/21770] eta: 0:10:44 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 5700/21770] eta: 0:10:40 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5800/21770] eta: 0:10:36 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 5900/21770] eta: 0:10:32 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 6000/21770] eta: 0:10:28 time: 0.0395 data: 
0.0008 max mem: 33300 Test: [ 6100/21770] eta: 0:10:24 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 6200/21770] eta: 0:10:20 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 6300/21770] eta: 0:10:16 time: 0.0405 data: 0.0008 max mem: 33300 Test: [ 6400/21770] eta: 0:10:12 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 6500/21770] eta: 0:10:08 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 6600/21770] eta: 0:10:04 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 6700/21770] eta: 0:10:00 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 6800/21770] eta: 0:09:56 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 6900/21770] eta: 0:09:52 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 7000/21770] eta: 0:09:49 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 7100/21770] eta: 0:09:45 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 7200/21770] eta: 0:09:41 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 7300/21770] eta: 0:09:37 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 7400/21770] eta: 0:09:33 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 7500/21770] eta: 0:09:28 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 7600/21770] eta: 0:09:24 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 7700/21770] eta: 0:09:20 time: 0.0405 data: 0.0008 max mem: 33300 Test: [ 7800/21770] eta: 0:09:16 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 7900/21770] eta: 0:09:12 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 8000/21770] eta: 0:09:08 time: 0.0404 data: 0.0008 max mem: 33300 Test: [ 8100/21770] eta: 0:09:04 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 8200/21770] eta: 0:09:00 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 8300/21770] eta: 0:08:56 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 8400/21770] eta: 0:08:52 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 8500/21770] eta: 0:08:48 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 8600/21770] eta: 0:08:44 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 8700/21770] eta: 0:08:40 time: 0.0400 data: 
0.0008 max mem: 33300 Test: [ 8800/21770] eta: 0:08:36 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 8900/21770] eta: 0:08:32 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 9000/21770] eta: 0:08:28 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 9100/21770] eta: 0:08:24 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 9200/21770] eta: 0:08:20 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 9300/21770] eta: 0:08:17 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 9400/21770] eta: 0:08:13 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 9500/21770] eta: 0:08:09 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 9600/21770] eta: 0:08:05 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 9700/21770] eta: 0:08:01 time: 0.0405 data: 0.0008 max mem: 33300 Test: [ 9800/21770] eta: 0:07:57 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 9900/21770] eta: 0:07:53 time: 0.0403 data: 0.0008 max mem: 33300 Test: [10000/21770] eta: 0:07:49 time: 0.0402 data: 0.0008 max mem: 33300 Test: [10100/21770] eta: 0:07:45 time: 0.0400 data: 0.0008 max mem: 33300 Test: [10200/21770] eta: 0:07:41 time: 0.0400 data: 0.0008 max mem: 33300 Test: [10300/21770] eta: 0:07:37 time: 0.0406 data: 0.0008 max mem: 33300 Test: [10400/21770] eta: 0:07:33 time: 0.0400 data: 0.0008 max mem: 33300 Test: [10500/21770] eta: 0:07:29 time: 0.0403 data: 0.0008 max mem: 33300 Test: [10600/21770] eta: 0:07:25 time: 0.0403 data: 0.0008 max mem: 33300 Test: [10700/21770] eta: 0:07:21 time: 0.0379 data: 0.0009 max mem: 33300 Test: [10800/21770] eta: 0:07:17 time: 0.0380 data: 0.0008 max mem: 33300 Test: [10900/21770] eta: 0:07:13 time: 0.0380 data: 0.0008 max mem: 33300 Test: [11000/21770] eta: 0:07:09 time: 0.0380 data: 0.0009 max mem: 33300 Test: [11100/21770] eta: 0:07:04 time: 0.0380 data: 0.0009 max mem: 33300 Test: [11200/21770] eta: 0:07:00 time: 0.0380 data: 0.0008 max mem: 33300 Test: [11300/21770] eta: 0:06:56 time: 0.0380 data: 0.0008 max mem: 33300 Test: [11400/21770] eta: 0:06:52 time: 0.0380 data: 
0.0008 max mem: 33300 Test: [11500/21770] eta: 0:06:48 time: 0.0380 data: 0.0009 max mem: 33300 Test: [11600/21770] eta: 0:06:44 time: 0.0380 data: 0.0009 max mem: 33300 Test: [11700/21770] eta: 0:06:40 time: 0.0380 data: 0.0008 max mem: 33300 Test: [11800/21770] eta: 0:06:35 time: 0.0380 data: 0.0008 max mem: 33300 Test: [11900/21770] eta: 0:06:31 time: 0.0380 data: 0.0008 max mem: 33300 Test: [12000/21770] eta: 0:06:27 time: 0.0392 data: 0.0008 max mem: 33300 Test: [12100/21770] eta: 0:06:23 time: 0.0387 data: 0.0008 max mem: 33300 Test: [12200/21770] eta: 0:06:19 time: 0.0388 data: 0.0008 max mem: 33300 Test: [12300/21770] eta: 0:06:15 time: 0.0386 data: 0.0008 max mem: 33300 Test: [12400/21770] eta: 0:06:11 time: 0.0380 data: 0.0009 max mem: 33300 Test: [12500/21770] eta: 0:06:07 time: 0.0380 data: 0.0009 max mem: 33300 Test: [12600/21770] eta: 0:06:03 time: 0.0381 data: 0.0009 max mem: 33300 Test: [12700/21770] eta: 0:05:59 time: 0.0388 data: 0.0010 max mem: 33300 Test: [12800/21770] eta: 0:05:55 time: 0.0384 data: 0.0009 max mem: 33300 Test: [12900/21770] eta: 0:05:51 time: 0.0386 data: 0.0009 max mem: 33300 Test: [13000/21770] eta: 0:05:47 time: 0.0382 data: 0.0009 max mem: 33300 Test: [13100/21770] eta: 0:05:43 time: 0.0387 data: 0.0009 max mem: 33300 Test: [13200/21770] eta: 0:05:39 time: 0.0381 data: 0.0009 max mem: 33300 Test: [13300/21770] eta: 0:05:35 time: 0.0385 data: 0.0009 max mem: 33300 Test: [13400/21770] eta: 0:05:31 time: 0.0383 data: 0.0009 max mem: 33300 Test: [13500/21770] eta: 0:05:27 time: 0.0386 data: 0.0009 max mem: 33300 Test: [13600/21770] eta: 0:05:23 time: 0.0384 data: 0.0009 max mem: 33300 Test: [13700/21770] eta: 0:05:19 time: 0.0385 data: 0.0009 max mem: 33300 Test: [13800/21770] eta: 0:05:15 time: 0.0385 data: 0.0009 max mem: 33300 Test: [13900/21770] eta: 0:05:10 time: 0.0385 data: 0.0009 max mem: 33300 Test: [14000/21770] eta: 0:05:06 time: 0.0381 data: 0.0009 max mem: 33300 Test: [14100/21770] eta: 0:05:02 time: 0.0380 data: 
0.0009 max mem: 33300 Test: [14200/21770] eta: 0:04:58 time: 0.0380 data: 0.0009 max mem: 33300 Test: [14300/21770] eta: 0:04:54 time: 0.0380 data: 0.0009 max mem: 33300 Test: [14400/21770] eta: 0:04:50 time: 0.0380 data: 0.0009 max mem: 33300 Test: [14500/21770] eta: 0:04:46 time: 0.0380 data: 0.0009 max mem: 33300 Test: [14600/21770] eta: 0:04:42 time: 0.0380 data: 0.0009 max mem: 33300 Test: [14700/21770] eta: 0:04:38 time: 0.0380 data: 0.0009 max mem: 33300 Test: [14800/21770] eta: 0:04:34 time: 0.0380 data: 0.0009 max mem: 33300 Test: [14900/21770] eta: 0:04:30 time: 0.0380 data: 0.0009 max mem: 33300 Test: [15000/21770] eta: 0:04:26 time: 0.0380 data: 0.0009 max mem: 33300 Test: [15100/21770] eta: 0:04:22 time: 0.0380 data: 0.0009 max mem: 33300 Test: [15200/21770] eta: 0:04:18 time: 0.0380 data: 0.0009 max mem: 33300 Test: [15300/21770] eta: 0:04:14 time: 0.0384 data: 0.0008 max mem: 33300 Test: [15400/21770] eta: 0:04:10 time: 0.0382 data: 0.0009 max mem: 33300 Test: [15500/21770] eta: 0:04:06 time: 0.0384 data: 0.0009 max mem: 33300 Test: [15600/21770] eta: 0:04:02 time: 0.0380 data: 0.0009 max mem: 33300 Test: [15700/21770] eta: 0:03:58 time: 0.0380 data: 0.0009 max mem: 33300 Test: [15800/21770] eta: 0:03:54 time: 0.0380 data: 0.0009 max mem: 33300 Test: [15900/21770] eta: 0:03:50 time: 0.0381 data: 0.0009 max mem: 33300 Test: [16000/21770] eta: 0:03:46 time: 0.0380 data: 0.0009 max mem: 33300 Test: [16100/21770] eta: 0:03:42 time: 0.0380 data: 0.0009 max mem: 33300 Test: [16200/21770] eta: 0:03:38 time: 0.0380 data: 0.0008 max mem: 33300 Test: [16300/21770] eta: 0:03:34 time: 0.0383 data: 0.0008 max mem: 33300 Test: [16400/21770] eta: 0:03:31 time: 0.0387 data: 0.0008 max mem: 33300 Test: [16500/21770] eta: 0:03:27 time: 0.0385 data: 0.0008 max mem: 33300 Test: [16600/21770] eta: 0:03:23 time: 0.0388 data: 0.0008 max mem: 33300 Test: [16700/21770] eta: 0:03:19 time: 0.0382 data: 0.0008 max mem: 33300 Test: [16800/21770] eta: 0:03:15 time: 0.0384 data: 
0.0008 max mem: 33300 Test: [16900/21770] eta: 0:03:11 time: 0.0383 data: 0.0008 max mem: 33300 Test: [17000/21770] eta: 0:03:07 time: 0.0383 data: 0.0008 max mem: 33300 Test: [17100/21770] eta: 0:03:03 time: 0.0382 data: 0.0008 max mem: 33300 Test: [17200/21770] eta: 0:02:59 time: 0.0380 data: 0.0008 max mem: 33300 Test: [17300/21770] eta: 0:02:55 time: 0.0380 data: 0.0009 max mem: 33300 Test: [17400/21770] eta: 0:02:51 time: 0.0380 data: 0.0008 max mem: 33300 Test: [17500/21770] eta: 0:02:47 time: 0.0380 data: 0.0008 max mem: 33300 Test: [17600/21770] eta: 0:02:43 time: 0.0380 data: 0.0008 max mem: 33300 Test: [17700/21770] eta: 0:02:39 time: 0.0380 data: 0.0008 max mem: 33300 Test: [17800/21770] eta: 0:02:35 time: 0.0381 data: 0.0008 max mem: 33300 Test: [17900/21770] eta: 0:02:31 time: 0.0383 data: 0.0009 max mem: 33300 Test: [18000/21770] eta: 0:02:27 time: 0.0390 data: 0.0011 max mem: 33300 Test: [18100/21770] eta: 0:02:23 time: 0.0380 data: 0.0009 max mem: 33300 Test: [18200/21770] eta: 0:02:19 time: 0.0383 data: 0.0010 max mem: 33300 Test: [18300/21770] eta: 0:02:16 time: 0.0396 data: 0.0011 max mem: 33300 Test: [18400/21770] eta: 0:02:12 time: 0.0387 data: 0.0010 max mem: 33300 Test: [18500/21770] eta: 0:02:08 time: 0.0391 data: 0.0011 max mem: 33300 Test: [18600/21770] eta: 0:02:04 time: 0.0391 data: 0.0010 max mem: 33300 Test: [18700/21770] eta: 0:02:00 time: 0.0394 data: 0.0011 max mem: 33300 Test: [18800/21770] eta: 0:01:56 time: 0.0384 data: 0.0010 max mem: 33300 Test: [18900/21770] eta: 0:01:52 time: 0.0387 data: 0.0011 max mem: 33300 Test: [19000/21770] eta: 0:01:48 time: 0.0386 data: 0.0010 max mem: 33300 Test: [19100/21770] eta: 0:01:44 time: 0.0386 data: 0.0010 max mem: 33300 Test: [19200/21770] eta: 0:01:40 time: 0.0381 data: 0.0009 max mem: 33300 Test: [19300/21770] eta: 0:01:36 time: 0.0387 data: 0.0010 max mem: 33300 Test: [19400/21770] eta: 0:01:32 time: 0.0388 data: 0.0010 max mem: 33300 Test: [19500/21770] eta: 0:01:28 time: 0.0385 data: 
0.0009 max mem: 33300
Test: [19600/21770] eta: 0:01:25 time: 0.0386 data: 0.0010 max mem: 33300
Test: [19700/21770] eta: 0:01:21 time: 0.0388 data: 0.0011 max mem: 33300
Test: [19800/21770] eta: 0:01:17 time: 0.0382 data: 0.0009 max mem: 33300
Test: [19900/21770] eta: 0:01:13 time: 0.0383 data: 0.0009 max mem: 33300
Test: [20000/21770] eta: 0:01:09 time: 0.0383 data: 0.0009 max mem: 33300
Test: [20100/21770] eta: 0:01:05 time: 0.0381 data: 0.0009 max mem: 33300
Test: [20200/21770] eta: 0:01:01 time: 0.0382 data: 0.0009 max mem: 33300
Test: [20300/21770] eta: 0:00:57 time: 0.0383 data: 0.0009 max mem: 33300
Test: [20400/21770] eta: 0:00:53 time: 0.0382 data: 0.0009 max mem: 33300
Test: [20500/21770] eta: 0:00:49 time: 0.0382 data: 0.0009 max mem: 33300
Test: [20600/21770] eta: 0:00:45 time: 0.0382 data: 0.0009 max mem: 33300
Test: [20700/21770] eta: 0:00:41 time: 0.0381 data: 0.0009 max mem: 33300
Test: [20800/21770] eta: 0:00:37 time: 0.0383 data: 0.0010 max mem: 33300
Test: [20900/21770] eta: 0:00:34 time: 0.0383 data: 0.0010 max mem: 33300
Test: [21000/21770] eta: 0:00:30 time: 0.0383 data: 0.0009 max mem: 33300
Test: [21100/21770] eta: 0:00:26 time: 0.0381 data: 0.0009 max mem: 33300
Test: [21200/21770] eta: 0:00:22 time: 0.0382 data: 0.0009 max mem: 33300
Test: [21300/21770] eta: 0:00:18 time: 0.0383 data: 0.0010 max mem: 33300
Test: [21400/21770] eta: 0:00:14 time: 0.0382 data: 0.0009 max mem: 33300
Test: [21500/21770] eta: 0:00:10 time: 0.0381 data: 0.0010 max mem: 33300
Test: [21600/21770] eta: 0:00:06 time: 0.0382 data: 0.0010 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0391 data: 0.0010 max mem: 33300
Test: Total time: 0:14:11
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [10] [ 0/4276] eta: 6:46:32 lr: 3.859420456098877e-05
loss: 0.1311 (0.1311) time: 5.7045 data: 2.5797 max mem: 33300 Epoch: [10] [ 10/4276] eta: 3:54:16 lr: 3.8591496797481385e-05 loss: 0.1743 (0.1724) time: 3.2950 data: 0.2403 max mem: 33300 Epoch: [10] [ 20/4276] eta: 3:54:56 lr: 3.858878901286388e-05 loss: 0.1674 (0.1721) time: 3.1926 data: 0.0065 max mem: 33300 Epoch: [10] [ 30/4276] eta: 3:56:38 lr: 3.858608120713444e-05 loss: 0.1578 (0.1713) time: 3.3709 data: 0.0070 max mem: 33300 Epoch: [10] [ 40/4276] eta: 3:52:57 lr: 3.858337338029126e-05 loss: 0.1548 (0.1687) time: 3.2863 data: 0.0080 max mem: 33300 Epoch: [10] [ 50/4276] eta: 3:49:32 lr: 3.858066553233252e-05 loss: 0.1535 (0.1676) time: 3.1271 data: 0.0076 max mem: 33300 Epoch: [10] [ 60/4276] eta: 3:46:33 lr: 3.857795766325643e-05 loss: 0.1583 (0.1672) time: 3.0699 data: 0.0068 max mem: 33300 Epoch: [10] [ 70/4276] eta: 3:44:11 lr: 3.857524977306115e-05 loss: 0.1525 (0.1654) time: 3.0437 data: 0.0076 max mem: 33300 Epoch: [10] [ 80/4276] eta: 3:42:09 lr: 3.8572541861744896e-05 loss: 0.1556 (0.1649) time: 3.0317 data: 0.0078 max mem: 33300 Epoch: [10] [ 90/4276] eta: 3:40:38 lr: 3.856983392930583e-05 loss: 0.1435 (0.1633) time: 3.0353 data: 0.0074 max mem: 33300 Epoch: [10] [ 100/4276] eta: 3:39:30 lr: 3.856712597574216e-05 loss: 0.1435 (0.1644) time: 3.0608 data: 0.0071 max mem: 33300 Epoch: [10] [ 110/4276] eta: 3:38:38 lr: 3.856441800105206e-05 loss: 0.1540 (0.1643) time: 3.0874 data: 0.0074 max mem: 33300 Epoch: [10] [ 120/4276] eta: 3:37:40 lr: 3.8561710005233716e-05 loss: 0.1511 (0.1632) time: 3.0855 data: 0.0074 max mem: 33300 Epoch: [10] [ 130/4276] eta: 3:36:46 lr: 3.855900198828532e-05 loss: 0.1543 (0.1641) time: 3.0712 data: 0.0072 max mem: 33300 Epoch: [10] [ 140/4276] eta: 3:36:07 lr: 3.855629395020506e-05 loss: 0.1543 (0.1634) time: 3.0922 data: 0.0074 max mem: 33300 Epoch: [10] [ 150/4276] eta: 3:35:10 lr: 3.855358589099112e-05 loss: 0.1578 (0.1629) time: 3.0773 data: 0.0077 max mem: 33300 Epoch: [10] [ 160/4276] eta: 3:33:57 lr: 
3.855087781064169e-05 loss: 0.1578 (0.1626) time: 3.0034 data: 0.0076 max mem: 33300 Epoch: [10] [ 170/4276] eta: 3:32:32 lr: 3.854816970915494e-05 loss: 0.1586 (0.1626) time: 2.9288 data: 0.0070 max mem: 33300 Epoch: [10] [ 180/4276] eta: 8:09:34 lr: 3.854546158652908e-05 loss: 0.1666 (0.1635) time: 39.7943 data: 36.8305 max mem: 33300 Epoch: [10] [ 190/4276] eta: 7:53:41 lr: 3.854275344276227e-05 loss: 0.1621 (0.1638) time: 39.8749 data: 36.8307 max mem: 33300 Epoch: [10] [ 200/4276] eta: 7:39:18 lr: 3.854004527785271e-05 loss: 0.1509 (0.1644) time: 3.0489 data: 0.0078 max mem: 33300 Epoch: [10] [ 210/4276] eta: 7:26:14 lr: 3.8537337091798574e-05 loss: 0.1656 (0.1650) time: 3.0431 data: 0.0080 max mem: 33300 Epoch: [10] [ 220/4276] eta: 7:14:29 lr: 3.853462888459806e-05 loss: 0.1656 (0.1651) time: 3.0717 data: 0.0076 max mem: 33300 Epoch: [10] [ 230/4276] eta: 7:03:39 lr: 3.853192065624934e-05 loss: 0.1620 (0.1646) time: 3.0922 data: 0.0074 max mem: 33300 Epoch: [10] [ 240/4276] eta: 6:53:36 lr: 3.852921240675059e-05 loss: 0.1569 (0.1649) time: 3.0709 data: 0.0069 max mem: 33300 Epoch: [10] [ 250/4276] eta: 6:44:22 lr: 3.8526504136100016e-05 loss: 0.1642 (0.1656) time: 3.0690 data: 0.0070 max mem: 33300 Epoch: [10] [ 260/4276] eta: 6:35:46 lr: 3.852379584429578e-05 loss: 0.1664 (0.1657) time: 3.0694 data: 0.0070 max mem: 33300 Epoch: [10] [ 270/4276] eta: 6:27:46 lr: 3.8521087531336074e-05 loss: 0.1586 (0.1657) time: 3.0644 data: 0.0069 max mem: 33300 Epoch: [10] [ 280/4276] eta: 6:20:16 lr: 3.851837919721908e-05 loss: 0.1630 (0.1654) time: 3.0605 data: 0.0073 max mem: 33300 Epoch: [10] [ 290/4276] eta: 6:13:15 lr: 3.8515670841942976e-05 loss: 0.1560 (0.1650) time: 3.0523 data: 0.0074 max mem: 33300 Epoch: [10] [ 300/4276] eta: 6:06:40 lr: 3.851296246550594e-05 loss: 0.1432 (0.1646) time: 3.0515 data: 0.0076 max mem: 33300 Epoch: [10] [ 310/4276] eta: 6:00:27 lr: 3.851025406790617e-05 loss: 0.1407 (0.1639) time: 3.0485 data: 0.0078 max mem: 33300 Epoch: [10] [ 
320/4276] eta: 5:54:36 lr: 3.8507545649141826e-05 loss: 0.1638 (0.1645) time: 3.0485 data: 0.0074 max mem: 33300 Epoch: [10] [ 330/4276] eta: 5:49:07 lr: 3.8504837209211106e-05 loss: 0.1665 (0.1646) time: 3.0582 data: 0.0070 max mem: 33300 Epoch: [10] [ 340/4276] eta: 5:43:58 lr: 3.850212874811217e-05 loss: 0.1570 (0.1645) time: 3.0778 data: 0.0073 max mem: 33300 Epoch: [10] [ 350/4276] eta: 5:39:10 lr: 3.8499420265843216e-05 loss: 0.1539 (0.1643) time: 3.1142 data: 0.0078 max mem: 33300 Epoch: [10] [ 360/4276] eta: 5:34:27 lr: 3.849671176240241e-05 loss: 0.1682 (0.1653) time: 3.0979 data: 0.0076 max mem: 33300 Epoch: [10] [ 370/4276] eta: 5:30:00 lr: 3.849400323778795e-05 loss: 0.1626 (0.1647) time: 3.0626 data: 0.0075 max mem: 33300 Epoch: [10] [ 380/4276] eta: 5:25:42 lr: 3.8491294691998005e-05 loss: 0.1474 (0.1648) time: 3.0587 data: 0.0083 max mem: 33300 Epoch: [10] [ 390/4276] eta: 5:21:38 lr: 3.848858612503075e-05 loss: 0.1493 (0.1648) time: 3.0561 data: 0.0086 max mem: 33300 Epoch: [10] [ 400/4276] eta: 5:17:48 lr: 3.848587753688436e-05 loss: 0.1628 (0.1648) time: 3.0802 data: 0.0083 max mem: 33300 Epoch: [10] [ 410/4276] eta: 5:14:03 lr: 3.848316892755703e-05 loss: 0.1577 (0.1644) time: 3.0765 data: 0.0081 max mem: 33300 Epoch: [10] [ 420/4276] eta: 5:10:32 lr: 3.848046029704693e-05 loss: 0.1512 (0.1645) time: 3.0799 data: 0.0081 max mem: 33300 Epoch: [10] [ 430/4276] eta: 5:07:04 lr: 3.847775164535222e-05 loss: 0.1691 (0.1647) time: 3.0732 data: 0.0080 max mem: 33300 Epoch: [10] [ 440/4276] eta: 5:03:46 lr: 3.84750429724711e-05 loss: 0.1691 (0.1647) time: 3.0523 data: 0.0075 max mem: 33300 Epoch: [10] [ 450/4276] eta: 5:00:38 lr: 3.8472334278401746e-05 loss: 0.1596 (0.1647) time: 3.0797 data: 0.0079 max mem: 33300 Epoch: [10] [ 460/4276] eta: 4:57:36 lr: 3.8469625563142317e-05 loss: 0.1452 (0.1643) time: 3.0889 data: 0.0084 max mem: 33300 Epoch: [10] [ 470/4276] eta: 4:54:36 lr: 3.846691682669101e-05 loss: 0.1361 (0.1638) time: 3.0554 data: 0.0084 max 
mem: 33300 Epoch: [10] [ 480/4276] eta: 4:51:40 lr: 3.8464208069045984e-05 loss: 0.1343 (0.1633) time: 3.0143 data: 0.0082 max mem: 33300 Epoch: [10] [ 490/4276] eta: 4:48:53 lr: 3.8461499290205436e-05 loss: 0.1353 (0.1629) time: 3.0232 data: 0.0076 max mem: 33300 Epoch: [10] [ 500/4276] eta: 4:46:13 lr: 3.8458790490167515e-05 loss: 0.1353 (0.1626) time: 3.0537 data: 0.0078 max mem: 33300 Epoch: [10] [ 510/4276] eta: 4:43:39 lr: 3.845608166893041e-05 loss: 0.1512 (0.1625) time: 3.0715 data: 0.0081 max mem: 33300 Epoch: [10] [ 520/4276] eta: 4:41:12 lr: 3.84533728264923e-05 loss: 0.1559 (0.1625) time: 3.0947 data: 0.0080 max mem: 33300 Epoch: [10] [ 530/4276] eta: 4:38:48 lr: 3.845066396285135e-05 loss: 0.1583 (0.1624) time: 3.0949 data: 0.0077 max mem: 33300 Epoch: [10] [ 540/4276] eta: 4:36:30 lr: 3.844795507800575e-05 loss: 0.1505 (0.1622) time: 3.1019 data: 0.0080 max mem: 33300 Epoch: [10] [ 550/4276] eta: 4:34:15 lr: 3.8445246171953646e-05 loss: 0.1440 (0.1621) time: 3.1046 data: 0.0082 max mem: 33300 Epoch: [10] [ 560/4276] eta: 4:32:00 lr: 3.844253724469325e-05 loss: 0.1557 (0.1622) time: 3.0740 data: 0.0079 max mem: 33300 Epoch: [10] [ 570/4276] eta: 4:29:50 lr: 3.843982829622269e-05 loss: 0.1659 (0.1622) time: 3.0586 data: 0.0080 max mem: 33300 Epoch: [10] [ 580/4276] eta: 4:27:43 lr: 3.843711932654018e-05 loss: 0.1622 (0.1622) time: 3.0618 data: 0.0082 max mem: 33300 Epoch: [10] [ 590/4276] eta: 4:25:39 lr: 3.8434410335643866e-05 loss: 0.1485 (0.1619) time: 3.0589 data: 0.0080 max mem: 33300 Epoch: [10] [ 600/4276] eta: 4:23:41 lr: 3.8431701323531935e-05 loss: 0.1487 (0.1617) time: 3.0777 data: 0.0079 max mem: 33300 Epoch: [10] [ 610/4276] eta: 4:21:47 lr: 3.8428992290202547e-05 loss: 0.1498 (0.1615) time: 3.1097 data: 0.0080 max mem: 33300 Epoch: [10] [ 620/4276] eta: 4:19:55 lr: 3.842628323565389e-05 loss: 0.1496 (0.1616) time: 3.1175 data: 0.0082 max mem: 33300 Epoch: [10] [ 630/4276] eta: 4:18:05 lr: 3.842357415988413e-05 loss: 0.1596 (0.1618) time: 
3.1090 data: 0.0083 max mem: 33300 Epoch: [10] [ 640/4276] eta: 4:16:21 lr: 3.842086506289143e-05 loss: 0.1593 (0.1616) time: 3.1237 data: 0.0087 max mem: 33300 Epoch: [10] [ 650/4276] eta: 4:14:33 lr: 3.841815594467397e-05 loss: 0.1451 (0.1615) time: 3.1015 data: 0.0085 max mem: 33300 Epoch: [10] [ 660/4276] eta: 4:12:48 lr: 3.841544680522991e-05 loss: 0.1649 (0.1616) time: 3.0578 data: 0.0082 max mem: 33300 Epoch: [10] [ 670/4276] eta: 4:11:05 lr: 3.841273764455743e-05 loss: 0.1649 (0.1616) time: 3.0624 data: 0.0083 max mem: 33300 Epoch: [10] [ 680/4276] eta: 4:09:24 lr: 3.84100284626547e-05 loss: 0.1530 (0.1616) time: 3.0623 data: 0.0083 max mem: 33300 Epoch: [10] [ 690/4276] eta: 4:07:45 lr: 3.840731925951988e-05 loss: 0.1651 (0.1615) time: 3.0540 data: 0.0079 max mem: 33300 Epoch: [10] [ 700/4276] eta: 4:06:05 lr: 3.8404610035151165e-05 loss: 0.1570 (0.1615) time: 3.0230 data: 0.0082 max mem: 33300 Epoch: [10] [ 710/4276] eta: 4:04:28 lr: 3.8401900789546694e-05 loss: 0.1634 (0.1615) time: 3.0109 data: 0.0086 max mem: 33300 Epoch: [10] [ 720/4276] eta: 4:02:53 lr: 3.839919152270465e-05 loss: 0.1624 (0.1614) time: 3.0213 data: 0.0081 max mem: 33300 Epoch: [10] [ 730/4276] eta: 4:01:20 lr: 3.8396482234623195e-05 loss: 0.1525 (0.1614) time: 3.0275 data: 0.0076 max mem: 33300 Epoch: [10] [ 740/4276] eta: 3:59:54 lr: 3.83937729253005e-05 loss: 0.1503 (0.1613) time: 3.0852 data: 0.0077 max mem: 33300 Epoch: [10] [ 750/4276] eta: 3:58:27 lr: 3.8391063594734736e-05 loss: 0.1503 (0.1613) time: 3.1087 data: 0.0084 max mem: 33300 Epoch: [10] [ 760/4276] eta: 3:57:00 lr: 3.8388354242924076e-05 loss: 0.1587 (0.1613) time: 3.0693 data: 0.0087 max mem: 33300 Epoch: [10] [ 770/4276] eta: 3:55:35 lr: 3.838564486986667e-05 loss: 0.1522 (0.1613) time: 3.0510 data: 0.0085 max mem: 33300 Epoch: [10] [ 780/4276] eta: 3:54:09 lr: 3.838293547556071e-05 loss: 0.1526 (0.1613) time: 3.0382 data: 0.0084 max mem: 33300 Epoch: [10] [ 790/4276] eta: 3:52:44 lr: 3.8380226060004336e-05 loss: 
0.1610 (0.1614) time: 3.0167 data: 0.0082 max mem: 33300 Epoch: [10] [ 800/4276] eta: 3:51:23 lr: 3.837751662319574e-05 loss: 0.1596 (0.1615) time: 3.0293 data: 0.0084 max mem: 33300 Epoch: [10] [ 810/4276] eta: 3:50:05 lr: 3.837480716513307e-05 loss: 0.1507 (0.1615) time: 3.0835 data: 0.0087 max mem: 33300 Epoch: [10] [ 820/4276] eta: 3:48:48 lr: 3.837209768581449e-05 loss: 0.1501 (0.1613) time: 3.1006 data: 0.0088 max mem: 33300 Epoch: [10] [ 830/4276] eta: 3:47:32 lr: 3.836938818523818e-05 loss: 0.1465 (0.1614) time: 3.0961 data: 0.0084 max mem: 33300 Epoch: [10] [ 840/4276] eta: 3:46:17 lr: 3.836667866340229e-05 loss: 0.1600 (0.1614) time: 3.0975 data: 0.0081 max mem: 33300 Epoch: [10] [ 850/4276] eta: 3:45:00 lr: 3.8363969120305e-05 loss: 0.1422 (0.1613) time: 3.0583 data: 0.0084 max mem: 33300 Epoch: [10] [ 860/4276] eta: 3:43:45 lr: 3.836125955594446e-05 loss: 0.1465 (0.1613) time: 3.0352 data: 0.0082 max mem: 33300 Epoch: [10] [ 870/4276] eta: 3:42:32 lr: 3.835854997031885e-05 loss: 0.1541 (0.1613) time: 3.0475 data: 0.0084 max mem: 33300 Epoch: [10] [ 880/4276] eta: 3:41:20 lr: 3.835584036342633e-05 loss: 0.1550 (0.1614) time: 3.0634 data: 0.0084 max mem: 33300 Epoch: [10] [ 890/4276] eta: 3:40:09 lr: 3.8353130735265045e-05 loss: 0.1795 (0.1616) time: 3.0711 data: 0.0083 max mem: 33300 Epoch: [10] [ 900/4276] eta: 3:38:58 lr: 3.8350421085833185e-05 loss: 0.1660 (0.1616) time: 3.0656 data: 0.0086 max mem: 33300 Epoch: [10] [ 910/4276] eta: 3:37:50 lr: 3.834771141512889e-05 loss: 0.1602 (0.1617) time: 3.0805 data: 0.0091 max mem: 33300 Epoch: [10] [ 920/4276] eta: 3:36:42 lr: 3.834500172315034e-05 loss: 0.1610 (0.1618) time: 3.0856 data: 0.0092 max mem: 33300 Epoch: [10] [ 930/4276] eta: 3:35:36 lr: 3.8342292009895695e-05 loss: 0.1610 (0.1619) time: 3.1017 data: 0.0086 max mem: 33300 Epoch: [10] [ 940/4276] eta: 3:34:30 lr: 3.833958227536311e-05 loss: 0.1576 (0.1617) time: 3.1033 data: 0.0084 max mem: 33300 Epoch: [10] [ 950/4276] eta: 3:33:24 lr: 
3.833687251955076e-05 loss: 0.1547 (0.1617) time: 3.0771 data: 0.0087 max mem: 33300 Epoch: [10] [ 960/4276] eta: 3:32:19 lr: 3.833416274245679e-05 loss: 0.1632 (0.1619) time: 3.0832 data: 0.0085 max mem: 33300 Epoch: [10] [ 970/4276] eta: 3:31:14 lr: 3.833145294407936e-05 loss: 0.1677 (0.1618) time: 3.0786 data: 0.0079 max mem: 33300 Epoch: [10] [ 980/4276] eta: 3:30:09 lr: 3.832874312441665e-05 loss: 0.1628 (0.1619) time: 3.0586 data: 0.0079 max mem: 33300 Epoch: [10] [ 990/4276] eta: 3:29:05 lr: 3.832603328346681e-05 loss: 0.1605 (0.1619) time: 3.0510 data: 0.0081 max mem: 33300 Epoch: [10] [1000/4276] eta: 3:28:04 lr: 3.832332342122799e-05 loss: 0.1530 (0.1619) time: 3.0706 data: 0.0086 max mem: 33300 Epoch: [10] [1010/4276] eta: 3:27:02 lr: 3.8320613537698374e-05 loss: 0.1637 (0.1619) time: 3.0920 data: 0.0087 max mem: 33300 Epoch: [10] [1020/4276] eta: 3:26:02 lr: 3.8317903632876106e-05 loss: 0.1523 (0.1618) time: 3.1072 data: 0.0084 max mem: 33300 Epoch: [10] [1030/4276] eta: 3:25:04 lr: 3.831519370675935e-05 loss: 0.1503 (0.1618) time: 3.1414 data: 0.0090 max mem: 33300 Epoch: [10] [1040/4276] eta: 3:24:04 lr: 3.8312483759346256e-05 loss: 0.1546 (0.1619) time: 3.1169 data: 0.0090 max mem: 33300 Epoch: [10] [1050/4276] eta: 3:23:04 lr: 3.8309773790634996e-05 loss: 0.1550 (0.1619) time: 3.0685 data: 0.0082 max mem: 33300 Epoch: [10] [1060/4276] eta: 3:22:05 lr: 3.8307063800623724e-05 loss: 0.1580 (0.1619) time: 3.0586 data: 0.0084 max mem: 33300 Epoch: [10] [1070/4276] eta: 3:21:05 lr: 3.830435378931059e-05 loss: 0.1686 (0.1620) time: 3.0519 data: 0.0088 max mem: 33300 Epoch: [10] [1080/4276] eta: 3:20:07 lr: 3.830164375669376e-05 loss: 0.1650 (0.1619) time: 3.0504 data: 0.0091 max mem: 33300 Epoch: [10] [1090/4276] eta: 3:19:09 lr: 3.82989337027714e-05 loss: 0.1628 (0.1619) time: 3.0600 data: 0.0085 max mem: 33300 Epoch: [10] [1100/4276] eta: 3:18:13 lr: 3.829622362754165e-05 loss: 0.1528 (0.1619) time: 3.0888 data: 0.0081 max mem: 33300 Epoch: [10] 
[1110/4276] eta: 3:17:17 lr: 3.829351353100268e-05 loss: 0.1528 (0.1620) time: 3.0964 data: 0.0086 max mem: 33300 Epoch: [10] [1120/4276] eta: 3:16:22 lr: 3.8290803413152644e-05 loss: 0.1550 (0.1620) time: 3.0968 data: 0.0083 max mem: 33300 Epoch: [10] [1130/4276] eta: 3:15:27 lr: 3.828809327398969e-05 loss: 0.1403 (0.1618) time: 3.1045 data: 0.0081 max mem: 33300 Epoch: [10] [1140/4276] eta: 3:14:31 lr: 3.828538311351198e-05 loss: 0.1403 (0.1616) time: 3.0834 data: 0.0086 max mem: 33300 Epoch: [10] [1150/4276] eta: 3:13:36 lr: 3.828267293171768e-05 loss: 0.1426 (0.1615) time: 3.0623 data: 0.0084 max mem: 33300 Epoch: [10] [1160/4276] eta: 3:12:42 lr: 3.827996272860493e-05 loss: 0.1617 (0.1615) time: 3.0789 data: 0.0080 max mem: 33300 Epoch: [10] [1170/4276] eta: 3:11:48 lr: 3.827725250417189e-05 loss: 0.1704 (0.1616) time: 3.0782 data: 0.0086 max mem: 33300 Epoch: [10] [1180/4276] eta: 3:10:53 lr: 3.8274542258416724e-05 loss: 0.1679 (0.1616) time: 3.0347 data: 0.0085 max mem: 33300 Epoch: [10] [1190/4276] eta: 3:09:58 lr: 3.827183199133757e-05 loss: 0.1486 (0.1615) time: 3.0210 data: 0.0081 max mem: 33300 Epoch: [10] [1200/4276] eta: 3:09:06 lr: 3.826912170293259e-05 loss: 0.1473 (0.1615) time: 3.0671 data: 0.0082 max mem: 33300 Epoch: [10] [1210/4276] eta: 3:08:14 lr: 3.826641139319994e-05 loss: 0.1387 (0.1614) time: 3.1029 data: 0.0079 max mem: 33300 Epoch: [10] [1220/4276] eta: 3:07:24 lr: 3.826370106213778e-05 loss: 0.1573 (0.1614) time: 3.1203 data: 0.0078 max mem: 33300 Epoch: [10] [1230/4276] eta: 3:06:33 lr: 3.8260990709744246e-05 loss: 0.1640 (0.1615) time: 3.1312 data: 0.0080 max mem: 33300 Epoch: [10] [1240/4276] eta: 3:05:42 lr: 3.8258280336017514e-05 loss: 0.1617 (0.1615) time: 3.1044 data: 0.0080 max mem: 33300 Epoch: [10] [1250/4276] eta: 3:04:51 lr: 3.8255569940955717e-05 loss: 0.1594 (0.1615) time: 3.0758 data: 0.0078 max mem: 33300 Epoch: [10] [1260/4276] eta: 3:04:00 lr: 3.8252859524557024e-05 loss: 0.1495 (0.1614) time: 3.0712 data: 0.0085 max 
mem: 33300 Epoch: [10] [1270/4276] eta: 3:03:09 lr: 3.8250149086819565e-05 loss: 0.1400 (0.1613) time: 3.0714 data: 0.0087 max mem: 33300 Epoch: [10] [1280/4276] eta: 3:02:19 lr: 3.824743862774151e-05 loss: 0.1617 (0.1613) time: 3.0692 data: 0.0085 max mem: 33300 Epoch: [10] [1290/4276] eta: 3:01:29 lr: 3.8244728147321005e-05 loss: 0.1663 (0.1614) time: 3.0711 data: 0.0085 max mem: 33300 Epoch: [10] [1300/4276] eta: 3:00:41 lr: 3.82420176455562e-05 loss: 0.1446 (0.1613) time: 3.0957 data: 0.0088 max mem: 33300 Epoch: [10] [1310/4276] eta: 2:59:52 lr: 3.8239307122445244e-05 loss: 0.1439 (0.1612) time: 3.1083 data: 0.0088 max mem: 33300 Epoch: [10] [1320/4276] eta: 2:59:04 lr: 3.82365965779863e-05 loss: 0.1663 (0.1613) time: 3.1070 data: 0.0085 max mem: 33300 Epoch: [10] [1330/4276] eta: 2:58:15 lr: 3.823388601217751e-05 loss: 0.1554 (0.1612) time: 3.0853 data: 0.0085 max mem: 33300 Epoch: [10] [1340/4276] eta: 2:57:26 lr: 3.823117542501702e-05 loss: 0.1468 (0.1611) time: 3.0518 data: 0.0087 max mem: 33300 Epoch: [10] [1350/4276] eta: 2:56:37 lr: 3.822846481650298e-05 loss: 0.1507 (0.1612) time: 3.0603 data: 0.0089 max mem: 33300 Epoch: [10] [1360/4276] eta: 2:55:49 lr: 3.822575418663355e-05 loss: 0.1690 (0.1612) time: 3.0705 data: 0.0086 max mem: 33300 Epoch: [10] [1370/4276] eta: 2:55:02 lr: 3.822304353540687e-05 loss: 0.1539 (0.1611) time: 3.0743 data: 0.0080 max mem: 33300 Epoch: [10] [1380/4276] eta: 2:54:15 lr: 3.822033286282109e-05 loss: 0.1547 (0.1612) time: 3.0906 data: 0.0079 max mem: 33300 Epoch: [10] [1390/4276] eta: 2:53:28 lr: 3.821762216887436e-05 loss: 0.1706 (0.1613) time: 3.0952 data: 0.0084 max mem: 33300 Epoch: [10] [1400/4276] eta: 2:52:42 lr: 3.821491145356483e-05 loss: 0.1732 (0.1614) time: 3.0969 data: 0.0083 max mem: 33300 Epoch: [10] [1410/4276] eta: 2:51:56 lr: 3.821220071689064e-05 loss: 0.1586 (0.1614) time: 3.1226 data: 0.0082 max mem: 33300 Epoch: [10] [1420/4276] eta: 2:51:11 lr: 3.820948995884995e-05 loss: 0.1423 (0.1615) time: 3.1297 
data: 0.0085 max mem: 33300 Epoch: [10] [1430/4276] eta: 2:50:24 lr: 3.820677917944089e-05 loss: 0.1423 (0.1614) time: 3.0930 data: 0.0087 max mem: 33300 Epoch: [10] [1440/4276] eta: 2:49:38 lr: 3.820406837866162e-05 loss: 0.1549 (0.1614) time: 3.0817 data: 0.0085 max mem: 33300 Epoch: [10] [1450/4276] eta: 2:48:52 lr: 3.820135755651029e-05 loss: 0.1669 (0.1614) time: 3.0829 data: 0.0083 max mem: 33300 Epoch: [10] [1460/4276] eta: 2:48:07 lr: 3.819864671298504e-05 loss: 0.1551 (0.1614) time: 3.0684 data: 0.0085 max mem: 33300 Epoch: [10] [1470/4276] eta: 2:47:21 lr: 3.8195935848084014e-05 loss: 0.1610 (0.1614) time: 3.0620 data: 0.0093 max mem: 33300 Epoch: [10] [1480/4276] eta: 2:46:35 lr: 3.819322496180536e-05 loss: 0.1474 (0.1613) time: 3.0587 data: 0.0089 max mem: 33300 Epoch: [10] [1490/4276] eta: 2:45:50 lr: 3.819051405414722e-05 loss: 0.1397 (0.1613) time: 3.0758 data: 0.0080 max mem: 33300 Epoch: [10] [1500/4276] eta: 2:45:06 lr: 3.818780312510775e-05 loss: 0.1575 (0.1613) time: 3.0968 data: 0.0082 max mem: 33300 Epoch: [10] [1510/4276] eta: 2:44:23 lr: 3.818509217468509e-05 loss: 0.1639 (0.1613) time: 3.1395 data: 0.0080 max mem: 33300 Epoch: [10] [1520/4276] eta: 2:43:39 lr: 3.818238120287737e-05 loss: 0.1567 (0.1613) time: 3.1409 data: 0.0077 max mem: 33300 Epoch: [10] [1530/4276] eta: 2:42:54 lr: 3.8179670209682756e-05 loss: 0.1559 (0.1612) time: 3.0691 data: 0.0077 max mem: 33300 Epoch: [10] [1540/4276] eta: 2:42:09 lr: 3.8176959195099374e-05 loss: 0.1686 (0.1613) time: 3.0213 data: 0.0080 max mem: 33300 Epoch: [10] [1550/4276] eta: 2:41:24 lr: 3.8174248159125394e-05 loss: 0.1706 (0.1614) time: 3.0152 data: 0.0084 max mem: 33300 Epoch: [10] [1560/4276] eta: 2:40:39 lr: 3.8171537101758926e-05 loss: 0.1554 (0.1613) time: 3.0252 data: 0.0084 max mem: 33300 Epoch: [10] [1570/4276] eta: 2:39:54 lr: 3.8168826022998134e-05 loss: 0.1554 (0.1613) time: 3.0296 data: 0.0085 max mem: 33300 Epoch: [10] [1580/4276] eta: 2:39:12 lr: 3.816611492284115e-05 loss: 0.1546 
(0.1613) time: 3.0669 data: 0.0083 max mem: 33300 Epoch: [10] [1590/4276] eta: 2:38:29 lr: 3.8163403801286124e-05 loss: 0.1491 (0.1612) time: 3.1046 data: 0.0080 max mem: 33300 Epoch: [10] [1600/4276] eta: 2:37:46 lr: 3.816069265833119e-05 loss: 0.1529 (0.1612) time: 3.1094 data: 0.0079 max mem: 33300 Epoch: [10] [1610/4276] eta: 2:37:04 lr: 3.81579814939745e-05 loss: 0.1442 (0.1611) time: 3.1189 data: 0.0081 max mem: 33300 Epoch: [10] [1620/4276] eta: 2:36:21 lr: 3.815527030821419e-05 loss: 0.1419 (0.1611) time: 3.1044 data: 0.0086 max mem: 33300 Epoch: [10] [1630/4276] eta: 2:35:38 lr: 3.8152559101048414e-05 loss: 0.1508 (0.1611) time: 3.0784 data: 0.0087 max mem: 33300 Epoch: [10] [1640/4276] eta: 2:34:56 lr: 3.814984787247528e-05 loss: 0.1499 (0.1610) time: 3.0675 data: 0.0082 max mem: 33300 Epoch: [10] [1650/4276] eta: 2:34:14 lr: 3.814713662249297e-05 loss: 0.1428 (0.1610) time: 3.0821 data: 0.0082 max mem: 33300 Epoch: [10] [1660/4276] eta: 2:33:31 lr: 3.814442535109959e-05 loss: 0.1561 (0.1611) time: 3.0856 data: 0.0081 max mem: 33300 Epoch: [10] [1670/4276] eta: 2:32:49 lr: 3.8141714058293295e-05 loss: 0.1561 (0.1611) time: 3.0772 data: 0.0080 max mem: 33300 Epoch: [10] [1680/4276] eta: 2:32:07 lr: 3.8139002744072224e-05 loss: 0.1582 (0.1611) time: 3.0782 data: 0.0081 max mem: 33300 Epoch: [10] [1690/4276] eta: 2:31:26 lr: 3.813629140843451e-05 loss: 0.1561 (0.1611) time: 3.0916 data: 0.0080 max mem: 33300 Epoch: [10] [1700/4276] eta: 2:30:45 lr: 3.81335800513783e-05 loss: 0.1552 (0.1612) time: 3.1181 data: 0.0082 max mem: 33300 Epoch: [10] [1710/4276] eta: 2:30:04 lr: 3.813086867290174e-05 loss: 0.1775 (0.1613) time: 3.1314 data: 0.0084 max mem: 33300 Epoch: [10] [1720/4276] eta: 2:29:23 lr: 3.812815727300295e-05 loss: 0.1775 (0.1614) time: 3.1291 data: 0.0081 max mem: 33300 Epoch: [10] [1730/4276] eta: 2:28:42 lr: 3.812544585168008e-05 loss: 0.1643 (0.1614) time: 3.1169 data: 0.0078 max mem: 33300 Epoch: [10] [1740/4276] eta: 2:28:01 lr: 
3.812273440893126e-05 loss: 0.1552 (0.1614) time: 3.1098 data: 0.0081 max mem: 33300 Epoch: [10] [1750/4276] eta: 2:27:19 lr: 3.8120022944754625e-05 loss: 0.1516 (0.1613) time: 3.0626 data: 0.0080 max mem: 33300 Epoch: [10] [1760/4276] eta: 2:26:38 lr: 3.811731145914832e-05 loss: 0.1453 (0.1612) time: 3.0342 data: 0.0076 max mem: 33300 Epoch: [10] [1770/4276] eta: 2:25:57 lr: 3.811459995211049e-05 loss: 0.1458 (0.1613) time: 3.0606 data: 0.0076 max mem: 33300 Epoch: [10] [1780/4276] eta: 2:25:16 lr: 3.811188842363926e-05 loss: 0.1556 (0.1614) time: 3.0668 data: 0.0080 max mem: 33300 Epoch: [10] [1790/4276] eta: 2:24:36 lr: 3.810917687373276e-05 loss: 0.1584 (0.1614) time: 3.0865 data: 0.0085 max mem: 33300 Epoch: [10] [1800/4276] eta: 2:23:56 lr: 3.8106465302389145e-05 loss: 0.1584 (0.1614) time: 3.1354 data: 0.0084 max mem: 33300 Epoch: [10] [1810/4276] eta: 2:23:17 lr: 3.810375370960653e-05 loss: 0.1702 (0.1615) time: 3.1571 data: 0.0082 max mem: 33300 Epoch: [10] [1820/4276] eta: 2:22:37 lr: 3.810104209538307e-05 loss: 0.1618 (0.1614) time: 3.1204 data: 0.0079 max mem: 33300 Epoch: [10] [1830/4276] eta: 2:21:56 lr: 3.8098330459716876e-05 loss: 0.1440 (0.1614) time: 3.0816 data: 0.0079 max mem: 33300 Epoch: [10] [1840/4276] eta: 2:21:16 lr: 3.80956188026061e-05 loss: 0.1440 (0.1613) time: 3.0785 data: 0.0083 max mem: 33300 Epoch: [10] [1850/4276] eta: 2:20:36 lr: 3.8092907124048885e-05 loss: 0.1593 (0.1614) time: 3.0754 data: 0.0082 max mem: 33300 Epoch: [10] [1860/4276] eta: 2:19:56 lr: 3.8090195424043345e-05 loss: 0.1593 (0.1614) time: 3.0741 data: 0.0087 max mem: 33300 Epoch: [10] [1870/4276] eta: 2:19:16 lr: 3.808748370258762e-05 loss: 0.1477 (0.1615) time: 3.0722 data: 0.0089 max mem: 33300 Epoch: [10] [1880/4276] eta: 2:18:37 lr: 3.808477195967985e-05 loss: 0.1608 (0.1615) time: 3.0958 data: 0.0088 max mem: 33300 Epoch: [10] [1890/4276] eta: 2:17:58 lr: 3.808206019531816e-05 loss: 0.1601 (0.1615) time: 3.1348 data: 0.0086 max mem: 33300 Epoch: [10] 
[1900/4276] eta: 2:17:19 lr: 3.807934840950068e-05 loss: 0.1419 (0.1615) time: 3.1465 data: 0.0085 max mem: 33300 Epoch: [10] [1910/4276] eta: 2:16:40 lr: 3.807663660222556e-05 loss: 0.1482 (0.1615) time: 3.1078 data: 0.0084 max mem: 33300 Epoch: [10] [1920/4276] eta: 2:16:00 lr: 3.807392477349091e-05 loss: 0.1482 (0.1614) time: 3.0605 data: 0.0078 max mem: 33300 Epoch: [10] [1930/4276] eta: 2:15:21 lr: 3.8071212923294876e-05 loss: 0.1425 (0.1614) time: 3.0583 data: 0.0079 max mem: 33300 Epoch: [10] [1940/4276] eta: 2:14:41 lr: 3.806850105163559e-05 loss: 0.1570 (0.1614) time: 3.0647 data: 0.0084 max mem: 33300 Epoch: [10] [1950/4276] eta: 2:14:02 lr: 3.806578915851117e-05 loss: 0.1570 (0.1614) time: 3.0837 data: 0.0084 max mem: 33300 Epoch: [10] [1960/4276] eta: 2:13:24 lr: 3.806307724391977e-05 loss: 0.1438 (0.1614) time: 3.1000 data: 0.0081 max mem: 33300 Epoch: [10] [1970/4276] eta: 2:12:44 lr: 3.80603653078595e-05 loss: 0.1364 (0.1612) time: 3.0788 data: 0.0081 max mem: 33300 Epoch: [10] [1980/4276] eta: 2:12:06 lr: 3.8057653350328495e-05 loss: 0.1386 (0.1612) time: 3.0859 data: 0.0081 max mem: 33300 Epoch: [10] [1990/4276] eta: 2:11:28 lr: 3.8054941371324884e-05 loss: 0.1500 (0.1612) time: 3.1446 data: 0.0083 max mem: 33300 Epoch: [10] [2000/4276] eta: 2:10:50 lr: 3.805222937084681e-05 loss: 0.1576 (0.1612) time: 3.1502 data: 0.0082 max mem: 33300 Epoch: [10] [2010/4276] eta: 2:10:11 lr: 3.8049517348892385e-05 loss: 0.1576 (0.1612) time: 3.1028 data: 0.0079 max mem: 33300 Epoch: [10] [2020/4276] eta: 2:09:33 lr: 3.804680530545975e-05 loss: 0.1583 (0.1612) time: 3.0750 data: 0.0078 max mem: 33300 Epoch: [10] [2030/4276] eta: 2:08:54 lr: 3.804409324054702e-05 loss: 0.1567 (0.1611) time: 3.0773 data: 0.0083 max mem: 33300 Epoch: [10] [2040/4276] eta: 2:08:15 lr: 3.8041381154152345e-05 loss: 0.1472 (0.1611) time: 3.0663 data: 0.0087 max mem: 33300 Epoch: [10] [2050/4276] eta: 2:07:37 lr: 3.803866904627383e-05 loss: 0.1561 (0.1611) time: 3.0438 data: 0.0081 max 
mem: 33300 Epoch: [10] [2060/4276] eta: 2:06:58 lr: 3.803595691690962e-05 loss: 0.1550 (0.1611) time: 3.0556 data: 0.0085 max mem: 33300 Epoch: [10] [2070/4276] eta: 2:06:20 lr: 3.803324476605783e-05 loss: 0.1465 (0.1610) time: 3.0540 data: 0.0087 max mem: 33300 Epoch: [10] [2080/4276] eta: 2:05:42 lr: 3.803053259371659e-05 loss: 0.1566 (0.1611) time: 3.0637 data: 0.0084 max mem: 33300 Epoch: [10] [2090/4276] eta: 2:05:04 lr: 3.802782039988405e-05 loss: 0.1647 (0.1611) time: 3.1221 data: 0.0083 max mem: 33300 Epoch: [10] [2100/4276] eta: 2:04:27 lr: 3.80251081845583e-05 loss: 0.1635 (0.1611) time: 3.1554 data: 0.0081 max mem: 33300 Epoch: [10] [2110/4276] eta: 2:03:49 lr: 3.802239594773749e-05 loss: 0.1521 (0.1610) time: 3.1098 data: 0.0083 max mem: 33300 Epoch: [10] [2120/4276] eta: 2:03:11 lr: 3.801968368941973e-05 loss: 0.1287 (0.1609) time: 3.0638 data: 0.0084 max mem: 33300 Epoch: [10] [2130/4276] eta: 2:02:33 lr: 3.8016971409603164e-05 loss: 0.1336 (0.1608) time: 3.0653 data: 0.0078 max mem: 33300 Epoch: [10] [2140/4276] eta: 2:01:55 lr: 3.80142591082859e-05 loss: 0.1488 (0.1607) time: 3.0655 data: 0.0072 max mem: 33300 Epoch: [10] [2150/4276] eta: 2:01:17 lr: 3.801154678546607e-05 loss: 0.1535 (0.1607) time: 3.0652 data: 0.0074 max mem: 33300 Epoch: [10] [2160/4276] eta: 2:00:40 lr: 3.800883444114181e-05 loss: 0.1520 (0.1607) time: 3.0688 data: 0.0075 max mem: 33300 Epoch: [10] [2170/4276] eta: 2:00:03 lr: 3.8006122075311226e-05 loss: 0.1697 (0.1608) time: 3.0973 data: 0.0074 max mem: 33300 Epoch: [10] [2180/4276] eta: 1:59:26 lr: 3.800340968797245e-05 loss: 0.1697 (0.1608) time: 3.1451 data: 0.0077 max mem: 33300 Epoch: [10] [2190/4276] eta: 1:58:49 lr: 3.800069727912361e-05 loss: 0.1561 (0.1608) time: 3.1593 data: 0.0080 max mem: 33300 Epoch: [10] [2200/4276] eta: 1:58:11 lr: 3.7997984848762826e-05 loss: 0.1584 (0.1608) time: 3.0941 data: 0.0086 max mem: 33300 Epoch: [10] [2210/4276] eta: 1:57:33 lr: 3.799527239688821e-05 loss: 0.1601 (0.1608) time: 3.0248 
data: 0.0090 max mem: 33300 Epoch: [10] [2220/4276] eta: 1:56:56 lr: 3.79925599234979e-05 loss: 0.1709 (0.1609) time: 3.0105 data: 0.0085 max mem: 33300 Epoch: [10] [2230/4276] eta: 1:56:18 lr: 3.798984742859001e-05 loss: 0.1534 (0.1609) time: 3.0393 data: 0.0087 max mem: 33300 Epoch: [10] [2240/4276] eta: 1:55:41 lr: 3.798713491216267e-05 loss: 0.1466 (0.1608) time: 3.0780 data: 0.0087 max mem: 33300 Epoch: [10] [2250/4276] eta: 1:55:05 lr: 3.798442237421401e-05 loss: 0.1466 (0.1608) time: 3.1039 data: 0.0080 max mem: 33300 Epoch: [10] [2260/4276] eta: 1:54:27 lr: 3.798170981474213e-05 loss: 0.1555 (0.1608) time: 3.0963 data: 0.0077 max mem: 33300 Epoch: [10] [2270/4276] eta: 1:53:51 lr: 3.797899723374516e-05 loss: 0.1485 (0.1608) time: 3.0777 data: 0.0076 max mem: 33300 Epoch: [10] [2280/4276] eta: 1:53:14 lr: 3.797628463122122e-05 loss: 0.1475 (0.1608) time: 3.1184 data: 0.0082 max mem: 33300 Epoch: [10] [2290/4276] eta: 1:52:38 lr: 3.797357200716843e-05 loss: 0.1559 (0.1607) time: 3.1473 data: 0.0083 max mem: 33300 Epoch: [10] [2300/4276] eta: 1:52:01 lr: 3.7970859361584915e-05 loss: 0.1559 (0.1607) time: 3.1012 data: 0.0079 max mem: 33300 Epoch: [10] [2310/4276] eta: 1:51:24 lr: 3.7968146694468794e-05 loss: 0.1517 (0.1607) time: 3.0365 data: 0.0083 max mem: 33300 Epoch: [10] [2320/4276] eta: 1:50:47 lr: 3.796543400581818e-05 loss: 0.1604 (0.1607) time: 3.0619 data: 0.0087 max mem: 33300 Epoch: [10] [2330/4276] eta: 1:50:11 lr: 3.79627212956312e-05 loss: 0.1604 (0.1607) time: 3.1018 data: 0.0082 max mem: 33300 Epoch: [10] [2340/4276] eta: 1:49:34 lr: 3.796000856390598e-05 loss: 0.1622 (0.1607) time: 3.0805 data: 0.0084 max mem: 33300 Epoch: [10] [2350/4276] eta: 1:48:57 lr: 3.795729581064063e-05 loss: 0.1512 (0.1607) time: 3.0712 data: 0.0084 max mem: 33300 Epoch: [10] [2360/4276] eta: 1:48:21 lr: 3.795458303583325e-05 loss: 0.1471 (0.1607) time: 3.0678 data: 0.0078 max mem: 33300 Epoch: [10] [2370/4276] eta: 1:47:44 lr: 3.795187023948199e-05 loss: 0.1681 
(0.1607) time: 3.0745 data: 0.0077 max mem: 33300 Epoch: [10] [2380/4276] eta: 1:47:09 lr: 3.794915742158495e-05 loss: 0.1589 (0.1607) time: 3.1304 data: 0.0079 max mem: 33300 Epoch: [10] [2390/4276] eta: 1:46:33 lr: 3.7946444582140246e-05 loss: 0.1425 (0.1606) time: 3.1644 data: 0.0077 max mem: 33300 Epoch: [10] [2400/4276] eta: 1:45:57 lr: 3.794373172114601e-05 loss: 0.1543 (0.1606) time: 3.1195 data: 0.0077 max mem: 33300 Epoch: [10] [2410/4276] eta: 1:45:20 lr: 3.7941018838600345e-05 loss: 0.1459 (0.1606) time: 3.0840 data: 0.0078 max mem: 33300 Epoch: [10] [2420/4276] eta: 1:44:44 lr: 3.793830593450138e-05 loss: 0.1420 (0.1605) time: 3.0722 data: 0.0082 max mem: 33300 Epoch: [10] [2430/4276] eta: 1:44:08 lr: 3.793559300884721e-05 loss: 0.1470 (0.1606) time: 3.0647 data: 0.0082 max mem: 33300 Epoch: [10] [2440/4276] eta: 1:43:32 lr: 3.793288006163597e-05 loss: 0.1530 (0.1605) time: 3.0692 data: 0.0080 max mem: 33300 Epoch: [10] [2450/4276] eta: 1:42:55 lr: 3.7930167092865766e-05 loss: 0.1557 (0.1605) time: 3.0697 data: 0.0082 max mem: 33300 Epoch: [10] [2460/4276] eta: 1:42:19 lr: 3.792745410253472e-05 loss: 0.1655 (0.1606) time: 3.0755 data: 0.0081 max mem: 33300 Epoch: [10] [2470/4276] eta: 1:41:44 lr: 3.792474109064094e-05 loss: 0.1658 (0.1606) time: 3.0970 data: 0.0082 max mem: 33300 Epoch: [10] [2480/4276] eta: 1:41:08 lr: 3.7922028057182554e-05 loss: 0.1588 (0.1606) time: 3.1521 data: 0.0082 max mem: 33300 Epoch: [10] [2490/4276] eta: 1:40:33 lr: 3.791931500215766e-05 loss: 0.1462 (0.1605) time: 3.1833 data: 0.0081 max mem: 33300 Epoch: [10] [2500/4276] eta: 1:39:58 lr: 3.7916601925564385e-05 loss: 0.1584 (0.1606) time: 3.1806 data: 0.0080 max mem: 33300 Epoch: [10] [2510/4276] eta: 1:39:22 lr: 3.791388882740084e-05 loss: 0.1584 (0.1606) time: 3.1483 data: 0.0079 max mem: 33300 Epoch: [10] [2520/4276] eta: 1:38:46 lr: 3.791117570766512e-05 loss: 0.1444 (0.1605) time: 3.0883 data: 0.0079 max mem: 33300 Epoch: [10] [2530/4276] eta: 1:38:11 lr: 
3.790846256635537e-05 loss: 0.1286 (0.1604) time: 3.0711 data: 0.0082 max mem: 33300 Epoch: [10] [2540/4276] eta: 1:37:35 lr: 3.7905749403469676e-05 loss: 0.1297 (0.1603) time: 3.0898 data: 0.0084 max mem: 33300 Epoch: [10] [2550/4276] eta: 1:36:59 lr: 3.790303621900616e-05 loss: 0.1369 (0.1603) time: 3.0980 data: 0.0087 max mem: 33300 Epoch: [10] [2560/4276] eta: 1:36:24 lr: 3.790032301296294e-05 loss: 0.1297 (0.1602) time: 3.0746 data: 0.0082 max mem: 33300 Epoch: [10] [2570/4276] eta: 1:35:48 lr: 3.789760978533812e-05 loss: 0.1309 (0.1602) time: 3.0637 data: 0.0077 max mem: 33300 Epoch: [10] [2580/4276] eta: 1:35:12 lr: 3.789489653612982e-05 loss: 0.1407 (0.1601) time: 3.0665 data: 0.0082 max mem: 33300 Epoch: [10] [2590/4276] eta: 1:34:37 lr: 3.789218326533614e-05 loss: 0.1431 (0.1600) time: 3.0889 data: 0.0086 max mem: 33300 Epoch: [10] [2600/4276] eta: 1:34:02 lr: 3.7889469972955195e-05 loss: 0.1440 (0.1601) time: 3.1381 data: 0.0088 max mem: 33300 Epoch: [10] [2610/4276] eta: 1:33:27 lr: 3.78867566589851e-05 loss: 0.1532 (0.1600) time: 3.1636 data: 0.0087 max mem: 33300 Epoch: [10] [2620/4276] eta: 1:32:52 lr: 3.7884043323423955e-05 loss: 0.1532 (0.1600) time: 3.1300 data: 0.0088 max mem: 33300 Epoch: [10] [2630/4276] eta: 1:32:16 lr: 3.788132996626988e-05 loss: 0.1378 (0.1600) time: 3.0856 data: 0.0091 max mem: 33300 Epoch: [10] [2640/4276] eta: 1:31:41 lr: 3.787861658752099e-05 loss: 0.1333 (0.1600) time: 3.0695 data: 0.0092 max mem: 33300 Epoch: [10] [2650/4276] eta: 1:31:05 lr: 3.787590318717538e-05 loss: 0.1502 (0.1600) time: 3.0701 data: 0.0092 max mem: 33300 Epoch: [10] [2660/4276] eta: 1:30:30 lr: 3.7873189765231166e-05 loss: 0.1527 (0.1600) time: 3.0717 data: 0.0096 max mem: 33300 Epoch: [10] [2670/4276] eta: 1:29:54 lr: 3.7870476321686454e-05 loss: 0.1625 (0.1600) time: 3.0704 data: 0.0098 max mem: 33300 Epoch: [10] [2680/4276] eta: 1:29:19 lr: 3.786776285653936e-05 loss: 0.1528 (0.1600) time: 3.0802 data: 0.0094 max mem: 33300 Epoch: [10] 
[2690/4276] eta: 1:28:44 lr: 3.786504936978797e-05 loss: 0.1528 (0.1600) time: 3.1133 data: 0.0094 max mem: 33300 Epoch: [10] [2700/4276] eta: 1:28:10 lr: 3.786233586143042e-05 loss: 0.1480 (0.1599) time: 3.1508 data: 0.0100 max mem: 33300 Epoch: [10] [2710/4276] eta: 1:27:35 lr: 3.785962233146481e-05 loss: 0.1474 (0.1599) time: 3.1593 data: 0.0100 max mem: 33300 Epoch: [10] [2720/4276] eta: 1:27:00 lr: 3.785690877988923e-05 loss: 0.1400 (0.1598) time: 3.1145 data: 0.0093 max mem: 33300 Epoch: [10] [2730/4276] eta: 1:26:25 lr: 3.7854195206701817e-05 loss: 0.1554 (0.1599) time: 3.0731 data: 0.0087 max mem: 33300 Epoch: [10] [2740/4276] eta: 1:25:49 lr: 3.785148161190065e-05 loss: 0.1663 (0.1599) time: 3.0676 data: 0.0086 max mem: 33300 Epoch: [10] [2750/4276] eta: 1:25:14 lr: 3.784876799548384e-05 loss: 0.1677 (0.1599) time: 3.0721 data: 0.0086 max mem: 33300 Epoch: [10] [2760/4276] eta: 1:24:40 lr: 3.7846054357449494e-05 loss: 0.1551 (0.1599) time: 3.0940 data: 0.0086 max mem: 33300 Epoch: [10] [2770/4276] eta: 1:24:05 lr: 3.7843340697795723e-05 loss: 0.1551 (0.1599) time: 3.0913 data: 0.0086 max mem: 33300 Epoch: [10] [2780/4276] eta: 1:23:30 lr: 3.784062701652064e-05 loss: 0.1577 (0.1599) time: 3.0762 data: 0.0089 max mem: 33300 Epoch: [10] [2790/4276] eta: 1:22:55 lr: 3.783791331362234e-05 loss: 0.1577 (0.1599) time: 3.0917 data: 0.0090 max mem: 33300 Epoch: [10] [2800/4276] eta: 1:22:20 lr: 3.7835199589098916e-05 loss: 0.1513 (0.1599) time: 3.1267 data: 0.0088 max mem: 33300 Epoch: [10] [2810/4276] eta: 1:21:46 lr: 3.783248584294849e-05 loss: 0.1299 (0.1597) time: 3.1561 data: 0.0091 max mem: 33300 Epoch: [10] [2820/4276] eta: 1:21:11 lr: 3.7829772075169165e-05 loss: 0.1318 (0.1596) time: 3.1292 data: 0.0094 max mem: 33300 Epoch: [10] [2830/4276] eta: 1:20:36 lr: 3.7827058285759034e-05 loss: 0.1347 (0.1596) time: 3.0979 data: 0.0091 max mem: 33300 Epoch: [10] [2840/4276] eta: 1:20:02 lr: 3.7824344474716203e-05 loss: 0.1619 (0.1596) time: 3.0990 data: 0.0086 max 
mem: 33300 Epoch: [10] [2850/4276] eta: 1:19:27 lr: 3.782163064203878e-05 loss: 0.1807 (0.1597) time: 3.0835 data: 0.0084 max mem: 33300 Epoch: [10] [2860/4276] eta: 1:18:52 lr: 3.7818916787724874e-05 loss: 0.1781 (0.1597) time: 3.0688 data: 0.0084 max mem: 33300 Epoch: [10] [2870/4276] eta: 1:18:17 lr: 3.7816202911772574e-05 loss: 0.1600 (0.1597) time: 3.0684 data: 0.0086 max mem: 33300 Epoch: [10] [2880/4276] eta: 1:17:42 lr: 3.781348901417998e-05 loss: 0.1635 (0.1597) time: 3.0629 data: 0.0086 max mem: 33300 Epoch: [10] [2890/4276] eta: 1:17:08 lr: 3.781077509494521e-05 loss: 0.1604 (0.1597) time: 3.0644 data: 0.0084 max mem: 33300 Epoch: [10] [2900/4276] eta: 1:16:34 lr: 3.780806115406635e-05 loss: 0.1383 (0.1596) time: 3.1216 data: 0.0087 max mem: 33300 Epoch: [10] [2910/4276] eta: 1:15:59 lr: 3.780534719154151e-05 loss: 0.1466 (0.1596) time: 3.1655 data: 0.0085 max mem: 33300 Epoch: [10] [2920/4276] eta: 1:15:25 lr: 3.780263320736878e-05 loss: 0.1540 (0.1596) time: 3.0956 data: 0.0086 max mem: 33300 Epoch: [10] [2930/4276] eta: 1:14:50 lr: 3.7799919201546274e-05 loss: 0.1446 (0.1596) time: 3.0242 data: 0.0088 max mem: 33300 Epoch: [10] [2940/4276] eta: 1:14:15 lr: 3.7797205174072086e-05 loss: 0.1343 (0.1595) time: 3.0422 data: 0.0090 max mem: 33300 Epoch: [10] [2950/4276] eta: 1:13:41 lr: 3.779449112494432e-05 loss: 0.1433 (0.1595) time: 3.0720 data: 0.0098 max mem: 33300 Epoch: [10] [2960/4276] eta: 1:13:06 lr: 3.7791777054161065e-05 loss: 0.1361 (0.1594) time: 3.0678 data: 0.0098 max mem: 33300 Epoch: [10] [2970/4276] eta: 1:12:31 lr: 3.778906296172043e-05 loss: 0.1370 (0.1595) time: 3.0413 data: 0.0090 max mem: 33300 Epoch: [10] [2980/4276] eta: 1:11:57 lr: 3.7786348847620516e-05 loss: 0.1504 (0.1594) time: 3.0386 data: 0.0085 max mem: 33300 Epoch: [10] [2990/4276] eta: 1:11:22 lr: 3.7783634711859406e-05 loss: 0.1504 (0.1594) time: 3.0648 data: 0.0087 max mem: 33300 Epoch: [10] [3000/4276] eta: 1:10:48 lr: 3.778092055443521e-05 loss: 0.1497 (0.1594) time: 
3.1018 data: 0.0091 max mem: 33300 Epoch: [10] [3010/4276] eta: 1:10:14 lr: 3.777820637534603e-05 loss: 0.1497 (0.1593) time: 3.1499 data: 0.0093 max mem: 33300 Epoch: [10] [3020/4276] eta: 1:09:40 lr: 3.777549217458995e-05 loss: 0.1556 (0.1593) time: 3.1399 data: 0.0090 max mem: 33300 Epoch: [10] [3030/4276] eta: 1:09:06 lr: 3.777277795216509e-05 loss: 0.1556 (0.1594) time: 3.0902 data: 0.0087 max mem: 33300 Epoch: [10] [3040/4276] eta: 1:08:31 lr: 3.777006370806952e-05 loss: 0.1605 (0.1594) time: 3.0717 data: 0.0088 max mem: 33300 Epoch: [10] [3050/4276] eta: 1:07:57 lr: 3.776734944230135e-05 loss: 0.1615 (0.1594) time: 3.0862 data: 0.0086 max mem: 33300 Epoch: [10] [3060/4276] eta: 1:07:23 lr: 3.776463515485867e-05 loss: 0.1305 (0.1593) time: 3.0981 data: 0.0085 max mem: 33300 Epoch: [10] [3070/4276] eta: 1:06:49 lr: 3.776192084573959e-05 loss: 0.1337 (0.1593) time: 3.0856 data: 0.0085 max mem: 33300 Epoch: [10] [3080/4276] eta: 1:06:15 lr: 3.7759206514942194e-05 loss: 0.1549 (0.1593) time: 3.0736 data: 0.0083 max mem: 33300 Epoch: [10] [3090/4276] eta: 1:05:40 lr: 3.775649216246457e-05 loss: 0.1397 (0.1592) time: 3.0738 data: 0.0085 max mem: 33300 Epoch: [10] [3100/4276] eta: 1:05:06 lr: 3.7753777788304834e-05 loss: 0.1502 (0.1592) time: 3.0831 data: 0.0088 max mem: 33300 Epoch: [10] [3110/4276] eta: 1:04:32 lr: 3.7751063392461065e-05 loss: 0.1419 (0.1591) time: 3.1207 data: 0.0087 max mem: 33300 Epoch: [10] [3120/4276] eta: 1:03:59 lr: 3.774834897493137e-05 loss: 0.1359 (0.1591) time: 3.1423 data: 0.0082 max mem: 33300 Epoch: [10] [3130/4276] eta: 1:03:24 lr: 3.774563453571383e-05 loss: 0.1566 (0.1591) time: 3.0837 data: 0.0081 max mem: 33300 Epoch: [10] [3140/4276] eta: 1:02:50 lr: 3.774292007480654e-05 loss: 0.1590 (0.1592) time: 3.0195 data: 0.0083 max mem: 33300 Epoch: [10] [3150/4276] eta: 1:02:16 lr: 3.774020559220759e-05 loss: 0.1605 (0.1592) time: 3.0316 data: 0.0084 max mem: 33300 Epoch: [10] [3160/4276] eta: 1:01:42 lr: 3.7737491087915086e-05 loss: 
0.1564 (0.1592) time: 3.0591 data: 0.0086 max mem: 33300 Epoch: [10] [3170/4276] eta: 1:01:08 lr: 3.773477656192712e-05 loss: 0.1576 (0.1592) time: 3.0580 data: 0.0085 max mem: 33300 Epoch: [10] [3180/4276] eta: 1:00:34 lr: 3.773206201424178e-05 loss: 0.1556 (0.1592) time: 3.0494 data: 0.0082 max mem: 33300 Epoch: [10] [3190/4276] eta: 0:59:59 lr: 3.772934744485716e-05 loss: 0.1411 (0.1591) time: 3.0280 data: 0.0085 max mem: 33300 Epoch: [10] [3200/4276] eta: 0:59:26 lr: 3.7726632853771345e-05 loss: 0.1711 (0.1592) time: 3.0656 data: 0.0087 max mem: 33300 Epoch: [10] [3210/4276] eta: 0:58:52 lr: 3.772391824098243e-05 loss: 0.1634 (0.1592) time: 3.1400 data: 0.0088 max mem: 33300 Epoch: [10] [3220/4276] eta: 0:58:18 lr: 3.77212036064885e-05 loss: 0.1634 (0.1592) time: 3.1348 data: 0.0084 max mem: 33300 Epoch: [10] [3230/4276] eta: 0:57:44 lr: 3.771848895028766e-05 loss: 0.1563 (0.1592) time: 3.0841 data: 0.0083 max mem: 33300 Epoch: [10] [3240/4276] eta: 0:57:10 lr: 3.7715774272378e-05 loss: 0.1652 (0.1592) time: 3.0621 data: 0.0089 max mem: 33300 Epoch: [10] [3250/4276] eta: 0:56:36 lr: 3.771305957275759e-05 loss: 0.1648 (0.1592) time: 3.0601 data: 0.0087 max mem: 33300 Epoch: [10] [3260/4276] eta: 0:56:02 lr: 3.7710344851424546e-05 loss: 0.1582 (0.1592) time: 3.0604 data: 0.0087 max mem: 33300 Epoch: [10] [3270/4276] eta: 0:55:29 lr: 3.770763010837694e-05 loss: 0.1615 (0.1592) time: 3.0814 data: 0.0091 max mem: 33300 Epoch: [10] [3280/4276] eta: 0:54:55 lr: 3.770491534361287e-05 loss: 0.1517 (0.1592) time: 3.0877 data: 0.0093 max mem: 33300 Epoch: [10] [3290/4276] eta: 0:54:21 lr: 3.770220055713041e-05 loss: 0.1517 (0.1592) time: 3.0688 data: 0.0092 max mem: 33300 Epoch: [10] [3300/4276] eta: 0:53:47 lr: 3.7699485748927666e-05 loss: 0.1565 (0.1592) time: 3.0654 data: 0.0093 max mem: 33300 Epoch: [10] [3310/4276] eta: 0:53:14 lr: 3.769677091900272e-05 loss: 0.1637 (0.1592) time: 3.1093 data: 0.0098 max mem: 33300 Epoch: [10] [3320/4276] eta: 0:52:40 lr: 
3.769405606735366e-05 loss: 0.1654 (0.1593) time: 3.1179 data: 0.0090 max mem: 33300 Epoch: [10] [3330/4276] eta: 0:52:06 lr: 3.769134119397858e-05 loss: 0.1545 (0.1593) time: 3.0483 data: 0.0082 max mem: 33300 Epoch: [10] [3340/4276] eta: 0:51:32 lr: 3.768862629887555e-05 loss: 0.1545 (0.1593) time: 3.0263 data: 0.0082 max mem: 33300 Epoch: [10] [3350/4276] eta: 0:50:59 lr: 3.7685911382042686e-05 loss: 0.1525 (0.1592) time: 3.0656 data: 0.0088 max mem: 33300 Epoch: [10] [3360/4276] eta: 0:50:25 lr: 3.768319644347804e-05 loss: 0.1598 (0.1592) time: 3.0809 data: 0.0093 max mem: 33300 Epoch: [10] [3370/4276] eta: 0:49:51 lr: 3.768048148317972e-05 loss: 0.1682 (0.1593) time: 3.0691 data: 0.0091 max mem: 33300 Epoch: [10] [3380/4276] eta: 0:49:18 lr: 3.76777665011458e-05 loss: 0.1619 (0.1593) time: 3.0607 data: 0.0093 max mem: 33300 Epoch: [10] [3390/4276] eta: 0:48:44 lr: 3.767505149737438e-05 loss: 0.1774 (0.1594) time: 3.0597 data: 0.0097 max mem: 33300 Epoch: [10] [3400/4276] eta: 0:48:10 lr: 3.767233647186354e-05 loss: 0.1716 (0.1594) time: 3.0646 data: 0.0100 max mem: 33300 Epoch: [10] [3410/4276] eta: 0:47:37 lr: 3.766962142461136e-05 loss: 0.1643 (0.1595) time: 3.1077 data: 0.0091 max mem: 33300 Epoch: [10] [3420/4276] eta: 0:47:04 lr: 3.7666906355615934e-05 loss: 0.1691 (0.1595) time: 3.1673 data: 0.0086 max mem: 33300 Epoch: [10] [3430/4276] eta: 0:46:30 lr: 3.7664191264875334e-05 loss: 0.1688 (0.1596) time: 3.1386 data: 0.0088 max mem: 33300 Epoch: [10] [3440/4276] eta: 0:45:57 lr: 3.766147615238765e-05 loss: 0.1548 (0.1595) time: 3.0765 data: 0.0086 max mem: 33300 Epoch: [10] [3450/4276] eta: 0:45:23 lr: 3.7658761018150973e-05 loss: 0.1601 (0.1596) time: 3.0611 data: 0.0086 max mem: 33300 Epoch: [10] [3460/4276] eta: 0:44:50 lr: 3.765604586216337e-05 loss: 0.1717 (0.1596) time: 3.0610 data: 0.0088 max mem: 33300 Epoch: [10] [3470/4276] eta: 0:44:16 lr: 3.765333068442294e-05 loss: 0.1445 (0.1596) time: 3.0596 data: 0.0091 max mem: 33300 Epoch: [10] 
[3480/4276] eta: 0:43:43 lr: 3.765061548492776e-05 loss: 0.1525 (0.1597) time: 3.0647 data: 0.0091 max mem: 33300 Epoch: [10] [3490/4276] eta: 0:43:09 lr: 3.764790026367591e-05 loss: 0.1630 (0.1597) time: 3.0923 data: 0.0089 max mem: 33300 Epoch: [10] [3500/4276] eta: 0:42:36 lr: 3.764518502066548e-05 loss: 0.1584 (0.1596) time: 3.0977 data: 0.0091 max mem: 33300 Epoch: [10] [3510/4276] eta: 0:42:03 lr: 3.7642469755894546e-05 loss: 0.1449 (0.1596) time: 3.1114 data: 0.0096 max mem: 33300 Epoch: [10] [3520/4276] eta: 0:41:29 lr: 3.763975446936118e-05 loss: 0.1564 (0.1596) time: 3.1574 data: 0.0095 max mem: 33300 Epoch: [10] [3530/4276] eta: 0:40:56 lr: 3.763703916106348e-05 loss: 0.1564 (0.1596) time: 3.1439 data: 0.0091 max mem: 33300 Epoch: [10] [3540/4276] eta: 0:40:23 lr: 3.763432383099952e-05 loss: 0.1538 (0.1596) time: 3.0909 data: 0.0088 max mem: 33300 Epoch: [10] [3550/4276] eta: 0:39:49 lr: 3.763160847916738e-05 loss: 0.1541 (0.1596) time: 3.0674 data: 0.0088 max mem: 33300 Epoch: [10] [3560/4276] eta: 0:39:16 lr: 3.7628893105565136e-05 loss: 0.1639 (0.1597) time: 3.0862 data: 0.0088 max mem: 33300 Epoch: [10] [3570/4276] eta: 0:38:43 lr: 3.762617771019088e-05 loss: 0.1745 (0.1597) time: 3.0938 data: 0.0084 max mem: 33300 Epoch: [10] [3580/4276] eta: 0:38:09 lr: 3.762346229304268e-05 loss: 0.1502 (0.1597) time: 3.0882 data: 0.0086 max mem: 33300 Epoch: [10] [3590/4276] eta: 0:37:36 lr: 3.762074685411862e-05 loss: 0.1470 (0.1597) time: 3.0795 data: 0.0091 max mem: 33300 Epoch: [10] [3600/4276] eta: 0:37:03 lr: 3.761803139341678e-05 loss: 0.1611 (0.1597) time: 3.0635 data: 0.0089 max mem: 33300 Epoch: [10] [3610/4276] eta: 0:36:29 lr: 3.7615315910935234e-05 loss: 0.1595 (0.1597) time: 3.0593 data: 0.0089 max mem: 33300 Epoch: [10] [3620/4276] eta: 0:35:56 lr: 3.7612600406672064e-05 loss: 0.1488 (0.1597) time: 3.0770 data: 0.0085 max mem: 33300 Epoch: [10] [3630/4276] eta: 0:35:23 lr: 3.760988488062534e-05 loss: 0.1488 (0.1597) time: 3.0976 data: 0.0075 max 
mem: 33300 Epoch: [10] [3640/4276] eta: 0:34:50 lr: 3.760716933279317e-05 loss: 0.1540 (0.1596) time: 3.0765 data: 0.0074 max mem: 33300 Epoch: [10] [3650/4276] eta: 0:34:16 lr: 3.760445376317359e-05 loss: 0.1540 (0.1596) time: 3.0749 data: 0.0080 max mem: 33300 Epoch: [10] [3660/4276] eta: 0:33:43 lr: 3.76017381717647e-05 loss: 0.1564 (0.1596) time: 3.0788 data: 0.0084 max mem: 33300 Epoch: [10] [3670/4276] eta: 0:33:10 lr: 3.7599022558564565e-05 loss: 0.1632 (0.1596) time: 3.0533 data: 0.0081 max mem: 33300 Epoch: [10] [3680/4276] eta: 0:32:37 lr: 3.759630692357128e-05 loss: 0.1710 (0.1597) time: 3.0439 data: 0.0080 max mem: 33300 Epoch: [10] [3690/4276] eta: 0:32:04 lr: 3.7593591266782894e-05 loss: 0.1676 (0.1597) time: 3.0423 data: 0.0080 max mem: 33300 Epoch: [10] [3700/4276] eta: 0:31:30 lr: 3.759087558819751e-05 loss: 0.1636 (0.1597) time: 3.0426 data: 0.0082 max mem: 33300 Epoch: [10] [3710/4276] eta: 0:30:57 lr: 3.758815988781319e-05 loss: 0.1441 (0.1597) time: 3.0574 data: 0.0085 max mem: 33300 Epoch: [10] [3720/4276] eta: 0:30:24 lr: 3.758544416562801e-05 loss: 0.1412 (0.1597) time: 3.1188 data: 0.0084 max mem: 33300 Epoch: [10] [3730/4276] eta: 0:29:51 lr: 3.7582728421640037e-05 loss: 0.1496 (0.1597) time: 3.1475 data: 0.0088 max mem: 33300 Epoch: [10] [3740/4276] eta: 0:29:18 lr: 3.758001265584736e-05 loss: 0.1576 (0.1597) time: 3.0925 data: 0.0085 max mem: 33300 Epoch: [10] [3750/4276] eta: 0:28:45 lr: 3.757729686824805e-05 loss: 0.1524 (0.1597) time: 3.0545 data: 0.0083 max mem: 33300 Epoch: [10] [3760/4276] eta: 0:28:12 lr: 3.7574581058840166e-05 loss: 0.1521 (0.1597) time: 3.0518 data: 0.0086 max mem: 33300 Epoch: [10] [3770/4276] eta: 0:27:39 lr: 3.7571865227621805e-05 loss: 0.1521 (0.1597) time: 3.0501 data: 0.0084 max mem: 33300 Epoch: [10] [3780/4276] eta: 0:27:06 lr: 3.7569149374591024e-05 loss: 0.1647 (0.1597) time: 3.0506 data: 0.0083 max mem: 33300 Epoch: [10] [3790/4276] eta: 0:26:33 lr: 3.75664334997459e-05 loss: 0.1495 (0.1597) time: 
3.0635 data: 0.0085 max mem: 33300 Epoch: [10] [3800/4276] eta: 0:25:59 lr: 3.7563717603084505e-05 loss: 0.1685 (0.1597) time: 3.0646 data: 0.0081 max mem: 33300 Epoch: [10] [3810/4276] eta: 0:25:26 lr: 3.7561001684604904e-05 loss: 0.1442 (0.1597) time: 3.0663 data: 0.0075 max mem: 33300 Epoch: [10] [3820/4276] eta: 0:24:54 lr: 3.7558285744305186e-05 loss: 0.1387 (0.1596) time: 3.1107 data: 0.0076 max mem: 33300 Epoch: [10] [3830/4276] eta: 0:24:21 lr: 3.7555569782183405e-05 loss: 0.1486 (0.1597) time: 3.1366 data: 0.0079 max mem: 33300 Epoch: [10] [3840/4276] eta: 0:23:48 lr: 3.7552853798237645e-05 loss: 0.1486 (0.1596) time: 3.0998 data: 0.0078 max mem: 33300 Epoch: [10] [3850/4276] eta: 0:23:15 lr: 3.7550137792465966e-05 loss: 0.1331 (0.1595) time: 3.0653 data: 0.0076 max mem: 33300 Epoch: [10] [3860/4276] eta: 0:22:42 lr: 3.7547421764866445e-05 loss: 0.1386 (0.1595) time: 3.0767 data: 0.0079 max mem: 33300 Epoch: [10] [3870/4276] eta: 0:22:09 lr: 3.754470571543715e-05 loss: 0.1601 (0.1595) time: 3.0733 data: 0.0081 max mem: 33300 Epoch: [10] [3880/4276] eta: 0:21:36 lr: 3.754198964417616e-05 loss: 0.1571 (0.1595) time: 3.0294 data: 0.0083 max mem: 33300 Epoch: [10] [3890/4276] eta: 0:21:03 lr: 3.753927355108154e-05 loss: 0.1531 (0.1595) time: 3.0014 data: 0.0084 max mem: 33300 Epoch: [10] [3900/4276] eta: 0:20:30 lr: 3.7536557436151344e-05 loss: 0.1531 (0.1595) time: 3.0136 data: 0.0084 max mem: 33300 Epoch: [10] [3910/4276] eta: 0:19:57 lr: 3.7533841299383655e-05 loss: 0.1575 (0.1595) time: 3.0503 data: 0.0080 max mem: 33300 Epoch: [10] [3920/4276] eta: 0:19:24 lr: 3.7531125140776536e-05 loss: 0.1434 (0.1595) time: 3.1048 data: 0.0080 max mem: 33300 Epoch: [10] [3930/4276] eta: 0:18:51 lr: 3.752840896032806e-05 loss: 0.1415 (0.1595) time: 3.1562 data: 0.0083 max mem: 33300 Epoch: [10] [3940/4276] eta: 0:18:18 lr: 3.75256927580363e-05 loss: 0.1490 (0.1595) time: 3.1306 data: 0.0079 max mem: 33300 Epoch: [10] [3950/4276] eta: 0:17:45 lr: 3.752297653389931e-05 
loss: 0.1462 (0.1594) time: 3.0670 data: 0.0072 max mem: 33300 Epoch: [10] [3960/4276] eta: 0:17:13 lr: 3.752026028791516e-05 loss: 0.1473 (0.1595) time: 3.0515 data: 0.0077 max mem: 33300 Epoch: [10] [3970/4276] eta: 0:16:40 lr: 3.751754402008193e-05 loss: 0.1550 (0.1595) time: 3.0658 data: 0.0086 max mem: 33300 Epoch: [10] [3980/4276] eta: 0:16:07 lr: 3.751482773039768e-05 loss: 0.1534 (0.1595) time: 3.0654 data: 0.0082 max mem: 33300 Epoch: [10] [3990/4276] eta: 0:15:34 lr: 3.751211141886046e-05 loss: 0.1409 (0.1594) time: 3.0719 data: 0.0078 max mem: 33300 Epoch: [10] [4000/4276] eta: 0:15:01 lr: 3.750939508546836e-05 loss: 0.1431 (0.1594) time: 3.0873 data: 0.0077 max mem: 33300 Epoch: [10] [4010/4276] eta: 0:14:28 lr: 3.750667873021943e-05 loss: 0.1522 (0.1594) time: 3.0781 data: 0.0076 max mem: 33300 Epoch: [10] [4020/4276] eta: 0:13:56 lr: 3.750396235311174e-05 loss: 0.1556 (0.1595) time: 3.0921 data: 0.0073 max mem: 33300 Epoch: [10] [4030/4276] eta: 0:13:23 lr: 3.750124595414336e-05 loss: 0.1595 (0.1595) time: 3.1327 data: 0.0074 max mem: 33300 Epoch: [10] [4040/4276] eta: 0:12:50 lr: 3.749852953331235e-05 loss: 0.1604 (0.1595) time: 3.1254 data: 0.0074 max mem: 33300 Epoch: [10] [4050/4276] eta: 0:12:17 lr: 3.7495813090616774e-05 loss: 0.1462 (0.1595) time: 3.0811 data: 0.0068 max mem: 33300 Epoch: [10] [4060/4276] eta: 0:11:45 lr: 3.7493096626054693e-05 loss: 0.1511 (0.1595) time: 3.0663 data: 0.0069 max mem: 33300 Epoch: [10] [4070/4276] eta: 0:11:12 lr: 3.749038013962418e-05 loss: 0.1594 (0.1595) time: 3.0775 data: 0.0074 max mem: 33300 Epoch: [10] [4080/4276] eta: 0:10:39 lr: 3.748766363132329e-05 loss: 0.1585 (0.1595) time: 3.0704 data: 0.0074 max mem: 33300 Epoch: [10] [4090/4276] eta: 0:10:06 lr: 3.7484947101150085e-05 loss: 0.1582 (0.1595) time: 3.0594 data: 0.0074 max mem: 33300 Epoch: [10] [4100/4276] eta: 0:09:34 lr: 3.7482230549102634e-05 loss: 0.1606 (0.1595) time: 3.0619 data: 0.0076 max mem: 33300 Epoch: [10] [4110/4276] eta: 0:09:01 lr: 
3.7479513975179e-05 loss: 0.1593 (0.1595) time: 3.0632 data: 0.0078 max mem: 33300 Epoch: [10] [4120/4276] eta: 0:08:28 lr: 3.747679737937723e-05 loss: 0.1586 (0.1595) time: 3.0784 data: 0.0076 max mem: 33300 Epoch: [10] [4130/4276] eta: 0:07:56 lr: 3.747408076169541e-05 loss: 0.1511 (0.1595) time: 3.1421 data: 0.0076 max mem: 33300 Epoch: [10] [4140/4276] eta: 0:07:23 lr: 3.747136412213158e-05 loss: 0.1483 (0.1595) time: 3.1724 data: 0.0079 max mem: 33300 Epoch: [10] [4150/4276] eta: 0:06:50 lr: 3.746864746068381e-05 loss: 0.1630 (0.1595) time: 3.1047 data: 0.0074 max mem: 33300 Epoch: [10] [4160/4276] eta: 0:06:18 lr: 3.746593077735016e-05 loss: 0.1575 (0.1595) time: 3.0515 data: 0.0073 max mem: 33300 Epoch: [10] [4170/4276] eta: 0:05:45 lr: 3.7463214072128686e-05 loss: 0.1775 (0.1596) time: 3.0508 data: 0.0077 max mem: 33300 Epoch: [10] [4180/4276] eta: 0:05:12 lr: 3.746049734501746e-05 loss: 0.1688 (0.1595) time: 3.0533 data: 0.0075 max mem: 33300 Epoch: [10] [4190/4276] eta: 0:04:40 lr: 3.7457780596014534e-05 loss: 0.1516 (0.1596) time: 3.0586 data: 0.0072 max mem: 33300 Epoch: [10] [4200/4276] eta: 0:04:07 lr: 3.745506382511797e-05 loss: 0.1516 (0.1596) time: 3.0758 data: 0.0072 max mem: 33300 Epoch: [10] [4210/4276] eta: 0:03:35 lr: 3.745234703232582e-05 loss: 0.1629 (0.1596) time: 3.0814 data: 0.0073 max mem: 33300 Epoch: [10] [4220/4276] eta: 0:03:02 lr: 3.744963021763615e-05 loss: 0.1754 (0.1597) time: 3.0781 data: 0.0073 max mem: 33300 Epoch: [10] [4230/4276] eta: 0:02:29 lr: 3.744691338104701e-05 loss: 0.1769 (0.1597) time: 3.1130 data: 0.0073 max mem: 33300 Epoch: [10] [4240/4276] eta: 0:01:57 lr: 3.744419652255647e-05 loss: 0.1700 (0.1598) time: 3.1437 data: 0.0073 max mem: 33300 Epoch: [10] [4250/4276] eta: 0:01:24 lr: 3.744147964216257e-05 loss: 0.1700 (0.1598) time: 3.0989 data: 0.0070 max mem: 33300 Epoch: [10] [4260/4276] eta: 0:00:52 lr: 3.743876273986339e-05 loss: 0.1702 (0.1598) time: 3.0559 data: 0.0069 max mem: 33300 Epoch: [10] [4270/4276] 
eta: 0:00:19 lr: 3.743604581565698e-05 loss: 0.1648 (0.1598) time: 3.0651 data: 0.0071 max mem: 33300 Epoch: [10] Total time: 3:52:01 Test: [ 0/21770] eta: 10:24:55 time: 1.7224 data: 1.6796 max mem: 33300 Test: [ 100/21770] eta: 0:19:54 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 200/21770] eta: 0:16:48 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 300/21770] eta: 0:15:43 time: 0.0380 data: 0.0010 max mem: 33300 Test: [ 400/21770] eta: 0:15:08 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 500/21770] eta: 0:14:46 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 600/21770] eta: 0:14:30 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 700/21770] eta: 0:14:18 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 800/21770] eta: 0:14:09 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 900/21770] eta: 0:14:01 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 1000/21770] eta: 0:13:54 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 1100/21770] eta: 0:13:47 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 1200/21770] eta: 0:13:41 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 1300/21770] eta: 0:13:34 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 1400/21770] eta: 0:13:28 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 1500/21770] eta: 0:13:22 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 1600/21770] eta: 0:13:17 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 1700/21770] eta: 0:13:11 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 1800/21770] eta: 0:13:07 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 1900/21770] eta: 0:13:03 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 2000/21770] eta: 0:12:58 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 2100/21770] eta: 0:12:53 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 2200/21770] eta: 0:12:48 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 2300/21770] eta: 0:12:44 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 2400/21770] eta: 0:12:41 time: 0.0399 data: 0.0010 max mem: 33300 Test: [ 2500/21770] eta: 
0:12:37 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 2600/21770] eta: 0:12:33 time: 0.0394 data: 0.0010 max mem: 33300 Test: [ 2700/21770] eta: 0:12:29 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 2800/21770] eta: 0:12:25 time: 0.0381 data: 0.0011 max mem: 33300 Test: [ 2900/21770] eta: 0:12:20 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 3000/21770] eta: 0:12:16 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 3100/21770] eta: 0:12:12 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 3200/21770] eta: 0:12:07 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 3300/21770] eta: 0:12:03 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 3400/21770] eta: 0:11:59 time: 0.0397 data: 0.0011 max mem: 33300 Test: [ 3500/21770] eta: 0:11:55 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 3600/21770] eta: 0:11:51 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 3700/21770] eta: 0:11:48 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 3800/21770] eta: 0:11:44 time: 0.0403 data: 0.0012 max mem: 33300 Test: [ 3900/21770] eta: 0:11:41 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 4000/21770] eta: 0:11:37 time: 0.0403 data: 0.0012 max mem: 33300 Test: [ 4100/21770] eta: 0:11:33 time: 0.0397 data: 0.0012 max mem: 33300 Test: [ 4200/21770] eta: 0:11:30 time: 0.0402 data: 0.0012 max mem: 33300 Test: [ 4300/21770] eta: 0:11:26 time: 0.0396 data: 0.0012 max mem: 33300 Test: [ 4400/21770] eta: 0:11:23 time: 0.0402 data: 0.0012 max mem: 33300 Test: [ 4500/21770] eta: 0:11:19 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 4600/21770] eta: 0:11:15 time: 0.0404 data: 0.0012 max mem: 33300 Test: [ 4700/21770] eta: 0:11:12 time: 0.0401 data: 0.0012 max mem: 33300 Test: [ 4800/21770] eta: 0:11:08 time: 0.0402 data: 0.0012 max mem: 33300 Test: [ 4900/21770] eta: 0:11:04 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 5000/21770] eta: 0:11:01 time: 0.0403 data: 0.0012 max mem: 33300 Test: [ 5100/21770] eta: 0:10:57 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 5200/21770] eta: 
0:10:53 time: 0.0403 data: 0.0012 max mem: 33300 Test: [ 5300/21770] eta: 0:10:49 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 5400/21770] eta: 0:10:46 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 5500/21770] eta: 0:10:42 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 5600/21770] eta: 0:10:38 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 5700/21770] eta: 0:10:34 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 5800/21770] eta: 0:10:30 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 5900/21770] eta: 0:10:26 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 6000/21770] eta: 0:10:22 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 6100/21770] eta: 0:10:18 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 6200/21770] eta: 0:10:14 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 6300/21770] eta: 0:10:10 time: 0.0397 data: 0.0011 max mem: 33300 Test: [ 6400/21770] eta: 0:10:06 time: 0.0402 data: 0.0011 max mem: 33300 Test: [ 6500/21770] eta: 0:10:02 time: 0.0398 data: 0.0010 max mem: 33300 Test: [ 6600/21770] eta: 0:09:58 time: 0.0402 data: 0.0011 max mem: 33300 Test: [ 6700/21770] eta: 0:09:55 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 6800/21770] eta: 0:09:51 time: 0.0403 data: 0.0012 max mem: 33300 Test: [ 6900/21770] eta: 0:09:47 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 7000/21770] eta: 0:09:43 time: 0.0401 data: 0.0011 max mem: 33300 Test: [ 7100/21770] eta: 0:09:39 time: 0.0394 data: 0.0010 max mem: 33300 Test: [ 7200/21770] eta: 0:09:35 time: 0.0399 data: 0.0010 max mem: 33300 Test: [ 7300/21770] eta: 0:09:31 time: 0.0393 data: 0.0010 max mem: 33300 Test: [ 7400/21770] eta: 0:09:27 time: 0.0401 data: 0.0011 max mem: 33300 Test: [ 7500/21770] eta: 0:09:24 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 7600/21770] eta: 0:09:20 time: 0.0403 data: 0.0011 max mem: 33300 Test: [ 7700/21770] eta: 0:09:16 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 7800/21770] eta: 0:09:12 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 7900/21770] eta: 
0:09:08 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 8000/21770] eta: 0:09:04 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 8100/21770] eta: 0:09:00 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 8200/21770] eta: 0:08:56 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 8300/21770] eta: 0:08:52 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 8400/21770] eta: 0:08:48 time: 0.0393 data: 0.0010 max mem: 33300 Test: [ 8500/21770] eta: 0:08:44 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 8600/21770] eta: 0:08:40 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 8700/21770] eta: 0:08:36 time: 0.0396 data: 0.0010 max mem: 33300 Test: [ 8800/21770] eta: 0:08:32 time: 0.0395 data: 0.0011 max mem: 33300 Test: [ 8900/21770] eta: 0:08:28 time: 0.0401 data: 0.0011 max mem: 33300 Test: [ 9000/21770] eta: 0:08:24 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 9100/21770] eta: 0:08:20 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 9200/21770] eta: 0:08:16 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 9300/21770] eta: 0:08:12 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 9400/21770] eta: 0:08:08 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 9500/21770] eta: 0:08:04 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 9600/21770] eta: 0:08:00 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 9700/21770] eta: 0:07:56 time: 0.0396 data: 0.0010 max mem: 33300 Test: [ 9800/21770] eta: 0:07:52 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 9900/21770] eta: 0:07:48 time: 0.0397 data: 0.0012 max mem: 33300 Test: [10000/21770] eta: 0:07:44 time: 0.0399 data: 0.0011 max mem: 33300 Test: [10100/21770] eta: 0:07:41 time: 0.0401 data: 0.0012 max mem: 33300 Test: [10200/21770] eta: 0:07:37 time: 0.0400 data: 0.0012 max mem: 33300 Test: [10300/21770] eta: 0:07:33 time: 0.0397 data: 0.0011 max mem: 33300 Test: [10400/21770] eta: 0:07:29 time: 0.0398 data: 0.0011 max mem: 33300 Test: [10500/21770] eta: 0:07:25 time: 0.0391 data: 0.0010 max mem: 33300 Test: [10600/21770] eta: 
0:07:21 time: 0.0393 data: 0.0011 max mem: 33300 Test: [10700/21770] eta: 0:07:17 time: 0.0394 data: 0.0011 max mem: 33300 Test: [10800/21770] eta: 0:07:13 time: 0.0395 data: 0.0011 max mem: 33300 Test: [10900/21770] eta: 0:07:09 time: 0.0391 data: 0.0011 max mem: 33300 Test: [11000/21770] eta: 0:07:05 time: 0.0388 data: 0.0011 max mem: 33300 Test: [11100/21770] eta: 0:07:01 time: 0.0390 data: 0.0011 max mem: 33300 Test: [11200/21770] eta: 0:06:57 time: 0.0388 data: 0.0011 max mem: 33300 Test: [11300/21770] eta: 0:06:53 time: 0.0389 data: 0.0011 max mem: 33300 Test: [11400/21770] eta: 0:06:49 time: 0.0386 data: 0.0011 max mem: 33300 Test: [11500/21770] eta: 0:06:45 time: 0.0388 data: 0.0011 max mem: 33300 Test: [11600/21770] eta: 0:06:41 time: 0.0389 data: 0.0011 max mem: 33300 Test: [11700/21770] eta: 0:06:37 time: 0.0394 data: 0.0012 max mem: 33300 Test: [11800/21770] eta: 0:06:33 time: 0.0391 data: 0.0012 max mem: 33300 Test: [11900/21770] eta: 0:06:29 time: 0.0393 data: 0.0012 max mem: 33300 Test: [12000/21770] eta: 0:06:25 time: 0.0393 data: 0.0012 max mem: 33300 Test: [12100/21770] eta: 0:06:21 time: 0.0396 data: 0.0013 max mem: 33300 Test: [12200/21770] eta: 0:06:17 time: 0.0397 data: 0.0012 max mem: 33300 Test: [12300/21770] eta: 0:06:13 time: 0.0393 data: 0.0012 max mem: 33300 Test: [12400/21770] eta: 0:06:09 time: 0.0398 data: 0.0012 max mem: 33300 Test: [12500/21770] eta: 0:06:05 time: 0.0394 data: 0.0012 max mem: 33300 Test: [12600/21770] eta: 0:06:01 time: 0.0399 data: 0.0012 max mem: 33300 Test: [12700/21770] eta: 0:05:58 time: 0.0400 data: 0.0012 max mem: 33300 Test: [12800/21770] eta: 0:05:54 time: 0.0400 data: 0.0011 max mem: 33300 Test: [12900/21770] eta: 0:05:50 time: 0.0401 data: 0.0012 max mem: 33300 Test: [13000/21770] eta: 0:05:46 time: 0.0399 data: 0.0012 max mem: 33300 Test: [13100/21770] eta: 0:05:42 time: 0.0399 data: 0.0012 max mem: 33300 Test: [13200/21770] eta: 0:05:38 time: 0.0397 data: 0.0012 max mem: 33300 Test: [13300/21770] eta: 
0:05:34 time: 0.0398 data: 0.0011 max mem: 33300 Test: [13400/21770] eta: 0:05:30 time: 0.0392 data: 0.0010 max mem: 33300 Test: [13500/21770] eta: 0:05:26 time: 0.0396 data: 0.0011 max mem: 33300 Test: [13600/21770] eta: 0:05:22 time: 0.0401 data: 0.0011 max mem: 33300 Test: [13700/21770] eta: 0:05:18 time: 0.0401 data: 0.0011 max mem: 33300 Test: [13800/21770] eta: 0:05:14 time: 0.0400 data: 0.0011 max mem: 33300 Test: [13900/21770] eta: 0:05:10 time: 0.0394 data: 0.0011 max mem: 33300 Test: [14000/21770] eta: 0:05:06 time: 0.0399 data: 0.0010 max mem: 33300 Test: [14100/21770] eta: 0:05:03 time: 0.0399 data: 0.0010 max mem: 33300 Test: [14200/21770] eta: 0:04:59 time: 0.0389 data: 0.0010 max mem: 33300 Test: [14300/21770] eta: 0:04:55 time: 0.0386 data: 0.0010 max mem: 33300 Test: [14400/21770] eta: 0:04:51 time: 0.0389 data: 0.0011 max mem: 33300 Test: [14500/21770] eta: 0:04:47 time: 0.0386 data: 0.0010 max mem: 33300 Test: [14600/21770] eta: 0:04:43 time: 0.0388 data: 0.0011 max mem: 33300 Test: [14700/21770] eta: 0:04:39 time: 0.0390 data: 0.0011 max mem: 33300 Test: [14800/21770] eta: 0:04:35 time: 0.0391 data: 0.0011 max mem: 33300 Test: [14900/21770] eta: 0:04:31 time: 0.0389 data: 0.0011 max mem: 33300 Test: [15000/21770] eta: 0:04:27 time: 0.0390 data: 0.0011 max mem: 33300 Test: [15100/21770] eta: 0:04:23 time: 0.0389 data: 0.0011 max mem: 33300 Test: [15200/21770] eta: 0:04:19 time: 0.0391 data: 0.0012 max mem: 33300 Test: [15300/21770] eta: 0:04:15 time: 0.0390 data: 0.0012 max mem: 33300 Test: [15400/21770] eta: 0:04:11 time: 0.0394 data: 0.0012 max mem: 33300 Test: [15500/21770] eta: 0:04:07 time: 0.0395 data: 0.0012 max mem: 33300 Test: [15600/21770] eta: 0:04:03 time: 0.0391 data: 0.0011 max mem: 33300 Test: [15700/21770] eta: 0:03:59 time: 0.0392 data: 0.0012 max mem: 33300 Test: [15800/21770] eta: 0:03:55 time: 0.0393 data: 0.0012 max mem: 33300 Test: [15900/21770] eta: 0:03:51 time: 0.0390 data: 0.0011 max mem: 33300 Test: [16000/21770] eta: 
0:03:47 time: 0.0390 data: 0.0012 max mem: 33300 Test: [16100/21770] eta: 0:03:43 time: 0.0392 data: 0.0011 max mem: 33300 Test: [16200/21770] eta: 0:03:39 time: 0.0388 data: 0.0011 max mem: 33300 Test: [16300/21770] eta: 0:03:35 time: 0.0388 data: 0.0011 max mem: 33300 Test: [16400/21770] eta: 0:03:31 time: 0.0388 data: 0.0010 max mem: 33300 Test: [16500/21770] eta: 0:03:27 time: 0.0385 data: 0.0010 max mem: 33300 Test: [16600/21770] eta: 0:03:23 time: 0.0386 data: 0.0010 max mem: 33300 Test: [16700/21770] eta: 0:03:19 time: 0.0385 data: 0.0010 max mem: 33300 Test: [16800/21770] eta: 0:03:15 time: 0.0386 data: 0.0010 max mem: 33300 Test: [16900/21770] eta: 0:03:11 time: 0.0385 data: 0.0010 max mem: 33300 Test: [17000/21770] eta: 0:03:07 time: 0.0389 data: 0.0011 max mem: 33300 Test: [17100/21770] eta: 0:03:04 time: 0.0386 data: 0.0011 max mem: 33300 Test: [17200/21770] eta: 0:03:00 time: 0.0386 data: 0.0010 max mem: 33300 Test: [17300/21770] eta: 0:02:56 time: 0.0387 data: 0.0011 max mem: 33300 Test: [17400/21770] eta: 0:02:52 time: 0.0388 data: 0.0010 max mem: 33300 Test: [17500/21770] eta: 0:02:48 time: 0.0388 data: 0.0010 max mem: 33300 Test: [17600/21770] eta: 0:02:44 time: 0.0393 data: 0.0011 max mem: 33300 Test: [17700/21770] eta: 0:02:40 time: 0.0390 data: 0.0011 max mem: 33300 Test: [17800/21770] eta: 0:02:36 time: 0.0395 data: 0.0011 max mem: 33300 Test: [17900/21770] eta: 0:02:32 time: 0.0391 data: 0.0011 max mem: 33300 Test: [18000/21770] eta: 0:02:28 time: 0.0393 data: 0.0011 max mem: 33300 Test: [18100/21770] eta: 0:02:24 time: 0.0390 data: 0.0011 max mem: 33300 Test: [18200/21770] eta: 0:02:20 time: 0.0392 data: 0.0011 max mem: 33300 Test: [18300/21770] eta: 0:02:16 time: 0.0392 data: 0.0011 max mem: 33300 Test: [18400/21770] eta: 0:02:12 time: 0.0392 data: 0.0011 max mem: 33300 Test: [18500/21770] eta: 0:02:08 time: 0.0390 data: 0.0011 max mem: 33300 Test: [18600/21770] eta: 0:02:04 time: 0.0394 data: 0.0011 max mem: 33300 Test: [18700/21770] eta: 
0:02:00 time: 0.0390 data: 0.0011 max mem: 33300 Test: [18800/21770] eta: 0:01:56 time: 0.0395 data: 0.0012 max mem: 33300 Test: [18900/21770] eta: 0:01:53 time: 0.0392 data: 0.0011 max mem: 33300 Test: [19000/21770] eta: 0:01:49 time: 0.0399 data: 0.0011 max mem: 33300 Test: [19100/21770] eta: 0:01:45 time: 0.0391 data: 0.0011 max mem: 33300 Test: [19200/21770] eta: 0:01:41 time: 0.0392 data: 0.0011 max mem: 33300 Test: [19300/21770] eta: 0:01:37 time: 0.0390 data: 0.0011 max mem: 33300 Test: [19400/21770] eta: 0:01:33 time: 0.0393 data: 0.0011 max mem: 33300 Test: [19500/21770] eta: 0:01:29 time: 0.0391 data: 0.0011 max mem: 33300 Test: [19600/21770] eta: 0:01:25 time: 0.0392 data: 0.0011 max mem: 33300 Test: [19700/21770] eta: 0:01:21 time: 0.0393 data: 0.0012 max mem: 33300 Test: [19800/21770] eta: 0:01:17 time: 0.0392 data: 0.0012 max mem: 33300 Test: [19900/21770] eta: 0:01:13 time: 0.0390 data: 0.0012 max mem: 33300 Test: [20000/21770] eta: 0:01:09 time: 0.0393 data: 0.0012 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0391 data: 0.0012 max mem: 33300 Test: [20200/21770] eta: 0:01:01 time: 0.0392 data: 0.0012 max mem: 33300 Test: [20300/21770] eta: 0:00:57 time: 0.0390 data: 0.0012 max mem: 33300 Test: [20400/21770] eta: 0:00:53 time: 0.0387 data: 0.0011 max mem: 33300 Test: [20500/21770] eta: 0:00:49 time: 0.0389 data: 0.0012 max mem: 33300 Test: [20600/21770] eta: 0:00:46 time: 0.0391 data: 0.0011 max mem: 33300 Test: [20700/21770] eta: 0:00:42 time: 0.0390 data: 0.0011 max mem: 33300 Test: [20800/21770] eta: 0:00:38 time: 0.0392 data: 0.0012 max mem: 33300 Test: [20900/21770] eta: 0:00:34 time: 0.0398 data: 0.0012 max mem: 33300 Test: [21000/21770] eta: 0:00:30 time: 0.0406 data: 0.0012 max mem: 33300 Test: [21100/21770] eta: 0:00:26 time: 0.0403 data: 0.0012 max mem: 33300 Test: [21200/21770] eta: 0:00:22 time: 0.0402 data: 0.0012 max mem: 33300 Test: [21300/21770] eta: 0:00:18 time: 0.0400 data: 0.0012 max mem: 33300 Test: [21400/21770] eta: 
0:00:14 time: 0.0402 data: 0.0012 max mem: 33300
Test: [21500/21770] eta: 0:00:10 time: 0.0398 data: 0.0011 max mem: 33300
Test: [21600/21770] eta: 0:00:06 time: 0.0394 data: 0.0012 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0396 data: 0.0012 max mem: 33300
Test: Total time: 0:14:17
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [11] [ 0/4276] eta: 6:16:57 lr: 3.7434415650616845e-05 loss: 0.1300 (0.1300) time: 5.2894 data: 2.1308 max mem: 33300
Epoch: [11] [ 10/4276] eta: 3:52:01 lr: 3.743169869135481e-05 loss: 0.1694 (0.1669) time: 3.2634 data: 0.2023 max mem: 33300
Epoch: [11] [ 20/4276] eta: 3:44:39 lr: 3.742898171018049e-05 loss: 0.1600 (0.1645) time: 3.0610 data: 0.0089 max mem: 33300
Epoch: [11] [ 30/4276] eta: 3:41:36 lr: 3.742626470709194e-05 loss: 0.1509 (0.1631) time: 3.0590 data: 0.0082 max mem: 33300
Epoch: [11] [ 40/4276] eta: 3:38:57 lr: 3.742354768208721e-05 loss: 0.1496 (0.1623) time: 3.0323 data: 0.0084 max mem: 33300
Epoch: [11] [ 50/4276] eta: 3:36:56 lr: 3.7420830635164374e-05 loss: 0.1589 (0.1606) time: 3.0004 data: 0.0086 max mem: 33300
Epoch: [11] [ 60/4276] eta: 3:35:38 lr: 3.7418113566321454e-05 loss: 0.1502 (0.1599) time: 3.0026 data: 0.0081 max mem: 33300
Epoch: [11] [ 70/4276] eta: 3:34:52 lr: 3.741539647555653e-05 loss: 0.1441 (0.1581) time: 3.0270 data: 0.0075 max mem: 33300
Epoch: [11] [ 80/4276] eta: 3:34:08 lr: 3.7412679362867645e-05 loss: 0.1444 (0.1591) time: 3.0412 data: 0.0072 max mem: 33300
Epoch: [11] [ 90/4276] eta: 3:33:49 lr: 3.740996222825287e-05 loss: 0.1380 (0.1581) time: 3.0637 data: 0.0080 max mem: 33300
Epoch: [11] [ 100/4276] eta: 3:33:14 lr: 3.740724507171023e-05 loss: 0.1374 (0.1591) time: 3.0704 data: 0.0087 max mem: 33300
Epoch: [11] [ 110/4276] eta: 3:32:29 lr: 3.7404527893237806e-05 
loss: 0.1492 (0.1592) time: 3.0402 data: 0.0083 max mem: 33300 Epoch: [11] [ 120/4276] eta: 3:31:47 lr: 3.7401810692833634e-05 loss: 0.1516 (0.1592) time: 3.0268 data: 0.0085 max mem: 33300 Epoch: [11] [ 130/4276] eta: 3:31:20 lr: 3.7399093470495774e-05 loss: 0.1638 (0.1594) time: 3.0486 data: 0.0083 max mem: 33300 Epoch: [11] [ 140/4276] eta: 3:30:52 lr: 3.739637622622227e-05 loss: 0.1430 (0.1583) time: 3.0674 data: 0.0083 max mem: 33300 Epoch: [11] [ 150/4276] eta: 3:30:18 lr: 3.7393658960011184e-05 loss: 0.1549 (0.1593) time: 3.0562 data: 0.0089 max mem: 33300 Epoch: [11] [ 160/4276] eta: 3:29:41 lr: 3.739094167186056e-05 loss: 0.1556 (0.1589) time: 3.0399 data: 0.0085 max mem: 33300 Epoch: [11] [ 170/4276] eta: 3:29:13 lr: 3.738822436176845e-05 loss: 0.1518 (0.1586) time: 3.0503 data: 0.0083 max mem: 33300 Epoch: [11] [ 180/4276] eta: 3:28:49 lr: 3.738550702973291e-05 loss: 0.1582 (0.1592) time: 3.0776 data: 0.0086 max mem: 33300 Epoch: [11] [ 190/4276] eta: 3:28:33 lr: 3.7382789675752e-05 loss: 0.1767 (0.1597) time: 3.1066 data: 0.0082 max mem: 33300 Epoch: [11] [ 200/4276] eta: 3:28:17 lr: 3.7380072299823746e-05 loss: 0.1563 (0.1599) time: 3.1308 data: 0.0078 max mem: 33300 Epoch: [11] [ 210/4276] eta: 3:27:45 lr: 3.737735490194622e-05 loss: 0.1589 (0.1602) time: 3.0981 data: 0.0077 max mem: 33300 Epoch: [11] [ 220/4276] eta: 3:27:12 lr: 3.737463748211745e-05 loss: 0.1654 (0.1604) time: 3.0571 data: 0.0080 max mem: 33300 Epoch: [11] [ 230/4276] eta: 3:26:31 lr: 3.737192004033551e-05 loss: 0.1456 (0.1598) time: 3.0291 data: 0.0088 max mem: 33300 Epoch: [11] [ 240/4276] eta: 3:25:46 lr: 3.736920257659843e-05 loss: 0.1513 (0.1600) time: 2.9919 data: 0.0081 max mem: 33300 Epoch: [11] [ 250/4276] eta: 3:25:05 lr: 3.7366485090904255e-05 loss: 0.1720 (0.1611) time: 2.9858 data: 0.0074 max mem: 33300 Epoch: [11] [ 260/4276] eta: 3:24:28 lr: 3.736376758325105e-05 loss: 0.1561 (0.1611) time: 3.0041 data: 0.0076 max mem: 33300 Epoch: [11] [ 270/4276] eta: 3:24:03 lr: 
3.736105005363687e-05 loss: 0.1503 (0.1612) time: 3.0524 data: 0.0089 max mem: 33300 Epoch: [11] [ 280/4276] eta: 3:23:41 lr: 3.735833250205974e-05 loss: 0.1489 (0.1612) time: 3.1016 data: 0.0094 max mem: 33300 Epoch: [11] [ 290/4276] eta: 3:23:11 lr: 3.735561492851771e-05 loss: 0.1450 (0.1609) time: 3.0917 data: 0.0079 max mem: 33300 Epoch: [11] [ 300/4276] eta: 3:22:47 lr: 3.735289733300883e-05 loss: 0.1436 (0.1606) time: 3.0884 data: 0.0079 max mem: 33300 Epoch: [11] [ 310/4276] eta: 3:22:16 lr: 3.7350179715531166e-05 loss: 0.1539 (0.1603) time: 3.0807 data: 0.0087 max mem: 33300 Epoch: [11] [ 320/4276] eta: 3:21:43 lr: 3.734746207608273e-05 loss: 0.1593 (0.1610) time: 3.0494 data: 0.0084 max mem: 33300 Epoch: [11] [ 330/4276] eta: 3:21:12 lr: 3.73447444146616e-05 loss: 0.1716 (0.1612) time: 3.0472 data: 0.0080 max mem: 33300 Epoch: [11] [ 340/4276] eta: 3:20:38 lr: 3.7342026731265804e-05 loss: 0.1501 (0.1611) time: 3.0417 data: 0.0079 max mem: 33300 Epoch: [11] [ 350/4276] eta: 3:20:02 lr: 3.7339309025893385e-05 loss: 0.1474 (0.1609) time: 3.0218 data: 0.0079 max mem: 33300 Epoch: [11] [ 360/4276] eta: 3:19:28 lr: 3.7336591298542407e-05 loss: 0.1562 (0.1618) time: 3.0191 data: 0.0080 max mem: 33300 Epoch: [11] [ 370/4276] eta: 3:18:58 lr: 3.733387354921089e-05 loss: 0.1606 (0.1613) time: 3.0416 data: 0.0085 max mem: 33300 Epoch: [11] [ 380/4276] eta: 3:18:28 lr: 3.7331155777896895e-05 loss: 0.1421 (0.1613) time: 3.0626 data: 0.0087 max mem: 33300 Epoch: [11] [ 390/4276] eta: 3:18:03 lr: 3.732843798459846e-05 loss: 0.1492 (0.1611) time: 3.0904 data: 0.0087 max mem: 33300 Epoch: [11] [ 400/4276] eta: 3:17:38 lr: 3.732572016931362e-05 loss: 0.1571 (0.1609) time: 3.1129 data: 0.0086 max mem: 33300 Epoch: [11] [ 410/4276] eta: 3:17:11 lr: 3.732300233204044e-05 loss: 0.1498 (0.1605) time: 3.1054 data: 0.0084 max mem: 33300 Epoch: [11] [ 420/4276] eta: 3:16:40 lr: 3.7320284472776954e-05 loss: 0.1498 (0.1604) time: 3.0748 data: 0.0082 max mem: 33300 Epoch: [11] [ 
430/4276] eta: 3:16:08 lr: 3.731756659152119e-05 loss: 0.1583 (0.1604) time: 3.0516 data: 0.0080 max mem: 33300 Epoch: [11] [ 440/4276] eta: 3:15:36 lr: 3.7314848688271206e-05 loss: 0.1539 (0.1602) time: 3.0442 data: 0.0078 max mem: 33300 Epoch: [11] [ 450/4276] eta: 3:15:04 lr: 3.7312130763025046e-05 loss: 0.1502 (0.1602) time: 3.0407 data: 0.0079 max mem: 33300 Epoch: [11] [ 460/4276] eta: 3:14:31 lr: 3.730941281578074e-05 loss: 0.1502 (0.1602) time: 3.0390 data: 0.0080 max mem: 33300 Epoch: [11] [ 470/4276] eta: 3:14:00 lr: 3.730669484653633e-05 loss: 0.1406 (0.1599) time: 3.0435 data: 0.0083 max mem: 33300 Epoch: [11] [ 480/4276] eta: 3:13:34 lr: 3.730397685528987e-05 loss: 0.1348 (0.1595) time: 3.0830 data: 0.0087 max mem: 33300 Epoch: [11] [ 490/4276] eta: 3:13:07 lr: 3.730125884203939e-05 loss: 0.1314 (0.1591) time: 3.1124 data: 0.0088 max mem: 33300 Epoch: [11] [ 500/4276] eta: 3:12:41 lr: 3.7298540806782936e-05 loss: 0.1275 (0.1588) time: 3.1122 data: 0.0082 max mem: 33300 Epoch: [11] [ 510/4276] eta: 3:12:15 lr: 3.729582274951854e-05 loss: 0.1284 (0.1583) time: 3.1202 data: 0.0079 max mem: 33300 Epoch: [11] [ 520/4276] eta: 3:11:49 lr: 3.729310467024425e-05 loss: 0.1322 (0.1583) time: 3.1268 data: 0.0082 max mem: 33300 Epoch: [11] [ 530/4276] eta: 3:11:18 lr: 3.7290386568958095e-05 loss: 0.1619 (0.1584) time: 3.0938 data: 0.0079 max mem: 33300 Epoch: [11] [ 540/4276] eta: 3:10:45 lr: 3.728766844565812e-05 loss: 0.1497 (0.1581) time: 3.0496 data: 0.0075 max mem: 33300 Epoch: [11] [ 550/4276] eta: 3:10:15 lr: 3.7284950300342375e-05 loss: 0.1463 (0.1581) time: 3.0557 data: 0.0077 max mem: 33300 Epoch: [11] [ 560/4276] eta: 3:09:44 lr: 3.728223213300888e-05 loss: 0.1485 (0.1580) time: 3.0585 data: 0.0079 max mem: 33300 Epoch: [11] [ 570/4276] eta: 3:09:11 lr: 3.727951394365568e-05 loss: 0.1485 (0.1580) time: 3.0430 data: 0.0079 max mem: 33300 Epoch: [11] [ 580/4276] eta: 3:08:38 lr: 3.727679573228082e-05 loss: 0.1456 (0.1579) time: 3.0268 data: 0.0079 max 
mem: 33300 Epoch: [11] [ 590/4276] eta: 3:08:04 lr: 3.7274077498882335e-05 loss: 0.1362 (0.1575) time: 3.0113 data: 0.0076 max mem: 33300 Epoch: [11] [ 600/4276] eta: 3:07:31 lr: 3.7271359243458246e-05 loss: 0.1415 (0.1575) time: 3.0169 data: 0.0075 max mem: 33300 Epoch: [11] [ 610/4276] eta: 3:07:01 lr: 3.7268640966006606e-05 loss: 0.1486 (0.1573) time: 3.0464 data: 0.0083 max mem: 33300 Epoch: [11] [ 620/4276] eta: 3:06:36 lr: 3.726592266652545e-05 loss: 0.1460 (0.1573) time: 3.1184 data: 0.0088 max mem: 33300 Epoch: [11] [ 630/4276] eta: 3:06:04 lr: 3.7263204345012806e-05 loss: 0.1461 (0.1575) time: 3.1003 data: 0.0081 max mem: 33300 Epoch: [11] [ 640/4276] eta: 3:05:30 lr: 3.726048600146672e-05 loss: 0.1470 (0.1574) time: 3.0205 data: 0.0079 max mem: 33300 Epoch: [11] [ 650/4276] eta: 3:04:59 lr: 3.7257767635885215e-05 loss: 0.1528 (0.1575) time: 3.0257 data: 0.0086 max mem: 33300 Epoch: [11] [ 660/4276] eta: 3:04:28 lr: 3.7255049248266335e-05 loss: 0.1625 (0.1576) time: 3.0469 data: 0.0090 max mem: 33300 Epoch: [11] [ 670/4276] eta: 3:03:54 lr: 3.725233083860811e-05 loss: 0.1564 (0.1575) time: 3.0282 data: 0.0088 max mem: 33300 Epoch: [11] [ 680/4276] eta: 3:03:21 lr: 3.7249612406908586e-05 loss: 0.1522 (0.1576) time: 3.0071 data: 0.0083 max mem: 33300 Epoch: [11] [ 690/4276] eta: 3:02:53 lr: 3.724689395316578e-05 loss: 0.1567 (0.1576) time: 3.0663 data: 0.0083 max mem: 33300 Epoch: [11] [ 700/4276] eta: 3:02:22 lr: 3.724417547737773e-05 loss: 0.1567 (0.1575) time: 3.0844 data: 0.0086 max mem: 33300 Epoch: [11] [ 710/4276] eta: 3:01:52 lr: 3.7241456979542475e-05 loss: 0.1625 (0.1575) time: 3.0580 data: 0.0085 max mem: 33300 Epoch: [11] [ 720/4276] eta: 3:01:24 lr: 3.7238738459658045e-05 loss: 0.1508 (0.1573) time: 3.0891 data: 0.0084 max mem: 33300 Epoch: [11] [ 730/4276] eta: 3:00:52 lr: 3.723601991772248e-05 loss: 0.1441 (0.1574) time: 3.0745 data: 0.0082 max mem: 33300 Epoch: [11] [ 740/4276] eta: 3:00:21 lr: 3.7233301353733804e-05 loss: 0.1606 (0.1573) 
time: 3.0425 data: 0.0086 max mem: 33300 Epoch: [11] [ 750/4276] eta: 2:59:50 lr: 3.723058276769005e-05 loss: 0.1606 (0.1573) time: 3.0519 data: 0.0089 max mem: 33300 Epoch: [11] [ 760/4276] eta: 2:59:20 lr: 3.7227864159589245e-05 loss: 0.1429 (0.1573) time: 3.0709 data: 0.0088 max mem: 33300 Epoch: [11] [ 770/4276] eta: 2:58:50 lr: 3.722514552942942e-05 loss: 0.1396 (0.1574) time: 3.0695 data: 0.0085 max mem: 33300 Epoch: [11] [ 780/4276] eta: 2:58:20 lr: 3.722242687720862e-05 loss: 0.1596 (0.1574) time: 3.0660 data: 0.0091 max mem: 33300 Epoch: [11] [ 790/4276] eta: 2:57:50 lr: 3.721970820292487e-05 loss: 0.1555 (0.1574) time: 3.0730 data: 0.0097 max mem: 33300 Epoch: [11] [ 800/4276] eta: 2:57:19 lr: 3.721698950657619e-05 loss: 0.1449 (0.1573) time: 3.0650 data: 0.0089 max mem: 33300 Epoch: [11] [ 810/4276] eta: 2:56:50 lr: 3.7214270788160625e-05 loss: 0.1456 (0.1574) time: 3.0812 data: 0.0083 max mem: 33300 Epoch: [11] [ 820/4276] eta: 2:56:19 lr: 3.721155204767619e-05 loss: 0.1456 (0.1571) time: 3.0758 data: 0.0085 max mem: 33300 Epoch: [11] [ 830/4276] eta: 2:55:48 lr: 3.720883328512093e-05 loss: 0.1363 (0.1571) time: 3.0451 data: 0.0087 max mem: 33300 Epoch: [11] [ 840/4276] eta: 2:55:17 lr: 3.7206114500492864e-05 loss: 0.1393 (0.1571) time: 3.0514 data: 0.0086 max mem: 33300 Epoch: [11] [ 850/4276] eta: 2:54:47 lr: 3.720339569379001e-05 loss: 0.1371 (0.1569) time: 3.0696 data: 0.0084 max mem: 33300 Epoch: [11] [ 860/4276] eta: 2:54:16 lr: 3.7200676865010414e-05 loss: 0.1506 (0.1569) time: 3.0634 data: 0.0083 max mem: 33300 Epoch: [11] [ 870/4276] eta: 2:53:45 lr: 3.7197958014152095e-05 loss: 0.1491 (0.1569) time: 3.0457 data: 0.0092 max mem: 33300 Epoch: [11] [ 880/4276] eta: 2:53:14 lr: 3.719523914121309e-05 loss: 0.1454 (0.1569) time: 3.0514 data: 0.0091 max mem: 33300 Epoch: [11] [ 890/4276] eta: 2:52:44 lr: 3.719252024619142e-05 loss: 0.1523 (0.1572) time: 3.0735 data: 0.0083 max mem: 33300 Epoch: [11] [ 900/4276] eta: 2:52:14 lr: 3.718980132908511e-05 
loss: 0.1716 (0.1572) time: 3.0732 data: 0.0083 max mem: 33300 Epoch: [11] [ 910/4276] eta: 2:51:46 lr: 3.718708238989219e-05 loss: 0.1647 (0.1574) time: 3.0967 data: 0.0086 max mem: 33300 Epoch: [11] [ 920/4276] eta: 2:51:16 lr: 3.718436342861068e-05 loss: 0.1647 (0.1575) time: 3.1112 data: 0.0083 max mem: 33300 Epoch: [11] [ 930/4276] eta: 2:50:45 lr: 3.718164444523861e-05 loss: 0.1686 (0.1576) time: 3.0652 data: 0.0080 max mem: 33300 Epoch: [11] [ 940/4276] eta: 2:50:13 lr: 3.717892543977401e-05 loss: 0.1541 (0.1575) time: 3.0430 data: 0.0082 max mem: 33300 Epoch: [11] [ 950/4276] eta: 2:49:42 lr: 3.71762064122149e-05 loss: 0.1508 (0.1576) time: 3.0426 data: 0.0087 max mem: 33300 Epoch: [11] [ 960/4276] eta: 2:49:11 lr: 3.7173487362559304e-05 loss: 0.1589 (0.1577) time: 3.0415 data: 0.0091 max mem: 33300 Epoch: [11] [ 970/4276] eta: 2:48:40 lr: 3.7170768290805255e-05 loss: 0.1619 (0.1577) time: 3.0399 data: 0.0092 max mem: 33300 Epoch: [11] [ 980/4276] eta: 2:48:10 lr: 3.7168049196950774e-05 loss: 0.1609 (0.1578) time: 3.0649 data: 0.0092 max mem: 33300 Epoch: [11] [ 990/4276] eta: 2:47:40 lr: 3.716533008099387e-05 loss: 0.1609 (0.1578) time: 3.0942 data: 0.0090 max mem: 33300 Epoch: [11] [1000/4276] eta: 2:47:11 lr: 3.7162610942932584e-05 loss: 0.1579 (0.1578) time: 3.0937 data: 0.0085 max mem: 33300 Epoch: [11] [1010/4276] eta: 2:46:41 lr: 3.715989178276493e-05 loss: 0.1524 (0.1578) time: 3.0965 data: 0.0081 max mem: 33300 Epoch: [11] [1020/4276] eta: 2:46:11 lr: 3.715717260048894e-05 loss: 0.1502 (0.1577) time: 3.0785 data: 0.0080 max mem: 33300 Epoch: [11] [1030/4276] eta: 2:45:39 lr: 3.715445339610263e-05 loss: 0.1446 (0.1577) time: 3.0479 data: 0.0082 max mem: 33300 Epoch: [11] [1040/4276] eta: 2:45:08 lr: 3.715173416960402e-05 loss: 0.1512 (0.1577) time: 3.0484 data: 0.0082 max mem: 33300 Epoch: [11] [1050/4276] eta: 2:44:38 lr: 3.7149014920991146e-05 loss: 0.1560 (0.1579) time: 3.0540 data: 0.0079 max mem: 33300 Epoch: [11] [1060/4276] eta: 2:44:07 lr: 
3.714629565026201e-05 loss: 0.1659 (0.1580) time: 3.0663 data: 0.0078 max mem: 33300 Epoch: [11] [1070/4276] eta: 2:43:36 lr: 3.714357635741464e-05 loss: 0.1634 (0.1580) time: 3.0549 data: 0.0080 max mem: 33300 Epoch: [11] [1080/4276] eta: 2:43:04 lr: 3.714085704244706e-05 loss: 0.1563 (0.1579) time: 3.0234 data: 0.0082 max mem: 33300 Epoch: [11] [1090/4276] eta: 2:42:31 lr: 3.713813770535729e-05 loss: 0.1478 (0.1579) time: 3.0016 data: 0.0078 max mem: 33300 Epoch: [11] [1100/4276] eta: 2:42:01 lr: 3.7135418346143354e-05 loss: 0.1398 (0.1579) time: 3.0364 data: 0.0076 max mem: 33300 Epoch: [11] [1110/4276] eta: 2:41:32 lr: 3.713269896480326e-05 loss: 0.1571 (0.1579) time: 3.0886 data: 0.0083 max mem: 33300 Epoch: [11] [1120/4276] eta: 2:41:01 lr: 3.712997956133505e-05 loss: 0.1571 (0.1579) time: 3.0719 data: 0.0084 max mem: 33300 Epoch: [11] [1130/4276] eta: 2:40:30 lr: 3.7127260135736715e-05 loss: 0.1491 (0.1577) time: 3.0546 data: 0.0081 max mem: 33300 Epoch: [11] [1140/4276] eta: 2:39:58 lr: 3.7124540688006295e-05 loss: 0.1323 (0.1576) time: 3.0330 data: 0.0078 max mem: 33300 Epoch: [11] [1150/4276] eta: 2:39:26 lr: 3.712182121814179e-05 loss: 0.1388 (0.1575) time: 3.0195 data: 0.0079 max mem: 33300 Epoch: [11] [1160/4276] eta: 2:38:56 lr: 3.711910172614123e-05 loss: 0.1489 (0.1575) time: 3.0430 data: 0.0083 max mem: 33300 Epoch: [11] [1170/4276] eta: 2:38:25 lr: 3.7116382212002636e-05 loss: 0.1597 (0.1576) time: 3.0613 data: 0.0081 max mem: 33300 Epoch: [11] [1180/4276] eta: 2:37:55 lr: 3.711366267572402e-05 loss: 0.1565 (0.1575) time: 3.0703 data: 0.0077 max mem: 33300 Epoch: [11] [1190/4276] eta: 2:37:25 lr: 3.71109431173034e-05 loss: 0.1416 (0.1574) time: 3.0704 data: 0.0075 max mem: 33300 Epoch: [11] [1200/4276] eta: 2:36:57 lr: 3.710822353673879e-05 loss: 0.1359 (0.1573) time: 3.1136 data: 0.0078 max mem: 33300 Epoch: [11] [1210/4276] eta: 2:36:26 lr: 3.7105503934028226e-05 loss: 0.1359 (0.1572) time: 3.1162 data: 0.0081 max mem: 33300 Epoch: [11] 
[1220/4276] eta: 2:35:55 lr: 3.7102784309169685e-05 loss: 0.1485 (0.1572) time: 3.0549 data: 0.0073 max mem: 33300 Epoch: [11] [1230/4276] eta: 2:35:24 lr: 3.710006466216122e-05 loss: 0.1578 (0.1574) time: 3.0429 data: 0.0070 max mem: 33300 Epoch: [11] [1240/4276] eta: 2:34:53 lr: 3.709734499300082e-05 loss: 0.1549 (0.1574) time: 3.0434 data: 0.0077 max mem: 33300 Epoch: [11] [1250/4276] eta: 2:34:22 lr: 3.7094625301686516e-05 loss: 0.1544 (0.1574) time: 3.0371 data: 0.0079 max mem: 33300 Epoch: [11] [1260/4276] eta: 2:33:50 lr: 3.709190558821632e-05 loss: 0.1490 (0.1573) time: 3.0369 data: 0.0078 max mem: 33300 Epoch: [11] [1270/4276] eta: 2:33:21 lr: 3.7089185852588246e-05 loss: 0.1423 (0.1572) time: 3.0776 data: 0.0079 max mem: 33300 Epoch: [11] [1280/4276] eta: 2:32:51 lr: 3.7086466094800305e-05 loss: 0.1504 (0.1572) time: 3.0986 data: 0.0080 max mem: 33300 Epoch: [11] [1290/4276] eta: 2:32:20 lr: 3.708374631485051e-05 loss: 0.1534 (0.1573) time: 3.0647 data: 0.0077 max mem: 33300 Epoch: [11] [1300/4276] eta: 2:31:50 lr: 3.708102651273689e-05 loss: 0.1462 (0.1573) time: 3.0706 data: 0.0076 max mem: 33300 Epoch: [11] [1310/4276] eta: 2:31:18 lr: 3.707830668845743e-05 loss: 0.1366 (0.1572) time: 3.0498 data: 0.0078 max mem: 33300 Epoch: [11] [1320/4276] eta: 2:30:47 lr: 3.707558684201016e-05 loss: 0.1454 (0.1572) time: 3.0259 data: 0.0084 max mem: 33300 Epoch: [11] [1330/4276] eta: 2:30:17 lr: 3.7072866973393094e-05 loss: 0.1571 (0.1572) time: 3.0542 data: 0.0093 max mem: 33300 Epoch: [11] [1340/4276] eta: 2:29:47 lr: 3.7070147082604237e-05 loss: 0.1459 (0.1572) time: 3.0750 data: 0.0087 max mem: 33300 Epoch: [11] [1350/4276] eta: 2:29:16 lr: 3.7067427169641615e-05 loss: 0.1556 (0.1572) time: 3.0680 data: 0.0085 max mem: 33300 Epoch: [11] [1360/4276] eta: 2:28:45 lr: 3.706470723450322e-05 loss: 0.1642 (0.1573) time: 3.0505 data: 0.0088 max mem: 33300 Epoch: [11] [1370/4276] eta: 2:28:15 lr: 3.7061987277187076e-05 loss: 0.1540 (0.1572) time: 3.0634 data: 0.0090 
max mem: 33300 Epoch: [11] [1380/4276] eta: 2:27:44 lr: 3.705926729769119e-05 loss: 0.1612 (0.1573) time: 3.0611 data: 0.0092 max mem: 33300 Epoch: [11] [1390/4276] eta: 2:27:14 lr: 3.705654729601356e-05 loss: 0.1703 (0.1574) time: 3.0640 data: 0.0089 max mem: 33300 Epoch: [11] [1400/4276] eta: 2:26:44 lr: 3.7053827272152216e-05 loss: 0.1703 (0.1574) time: 3.0928 data: 0.0096 max mem: 33300 Epoch: [11] [1410/4276] eta: 2:26:14 lr: 3.705110722610515e-05 loss: 0.1433 (0.1574) time: 3.0930 data: 0.0097 max mem: 33300 Epoch: [11] [1420/4276] eta: 2:25:43 lr: 3.704838715787039e-05 loss: 0.1451 (0.1574) time: 3.0659 data: 0.0088 max mem: 33300 Epoch: [11] [1430/4276] eta: 2:25:12 lr: 3.704566706744594e-05 loss: 0.1451 (0.1574) time: 3.0428 data: 0.0085 max mem: 33300 Epoch: [11] [1440/4276] eta: 2:24:41 lr: 3.704294695482979e-05 loss: 0.1528 (0.1574) time: 3.0367 data: 0.0086 max mem: 33300 Epoch: [11] [1450/4276] eta: 2:24:10 lr: 3.704022682001997e-05 loss: 0.1585 (0.1575) time: 3.0371 data: 0.0088 max mem: 33300 Epoch: [11] [1460/4276] eta: 2:23:39 lr: 3.703750666301448e-05 loss: 0.1564 (0.1574) time: 3.0470 data: 0.0090 max mem: 33300 Epoch: [11] [1470/4276] eta: 2:23:09 lr: 3.703478648381133e-05 loss: 0.1564 (0.1575) time: 3.0685 data: 0.0090 max mem: 33300 Epoch: [11] [1480/4276] eta: 2:22:39 lr: 3.7032066282408524e-05 loss: 0.1319 (0.1574) time: 3.0859 data: 0.0086 max mem: 33300 Epoch: [11] [1490/4276] eta: 2:22:09 lr: 3.7029346058804063e-05 loss: 0.1334 (0.1574) time: 3.0880 data: 0.0085 max mem: 33300 Epoch: [11] [1500/4276] eta: 2:21:38 lr: 3.7026625812995976e-05 loss: 0.1514 (0.1573) time: 3.0825 data: 0.0091 max mem: 33300 Epoch: [11] [1510/4276] eta: 2:21:07 lr: 3.702390554498224e-05 loss: 0.1474 (0.1574) time: 3.0592 data: 0.0089 max mem: 33300 Epoch: [11] [1520/4276] eta: 2:20:36 lr: 3.702118525476088e-05 loss: 0.1474 (0.1574) time: 3.0360 data: 0.0086 max mem: 33300 Epoch: [11] [1530/4276] eta: 2:20:05 lr: 3.70184649423299e-05 loss: 0.1500 (0.1573) time: 
3.0330 data: 0.0088 max mem: 33300 Epoch: [11] [1540/4276] eta: 2:19:34 lr: 3.70157446076873e-05 loss: 0.1544 (0.1574) time: 3.0329 data: 0.0087 max mem: 33300 Epoch: [11] [1550/4276] eta: 2:19:04 lr: 3.701302425083108e-05 loss: 0.1554 (0.1573) time: 3.0542 data: 0.0084 max mem: 33300 Epoch: [11] [1560/4276] eta: 2:18:33 lr: 3.701030387175925e-05 loss: 0.1497 (0.1573) time: 3.0784 data: 0.0084 max mem: 33300 Epoch: [11] [1570/4276] eta: 2:18:02 lr: 3.700758347046982e-05 loss: 0.1569 (0.1574) time: 3.0645 data: 0.0084 max mem: 33300 Epoch: [11] [1580/4276] eta: 2:17:32 lr: 3.700486304696079e-05 loss: 0.1405 (0.1573) time: 3.0527 data: 0.0084 max mem: 33300 Epoch: [11] [1590/4276] eta: 2:17:02 lr: 3.7002142601230163e-05 loss: 0.1405 (0.1573) time: 3.0772 data: 0.0085 max mem: 33300 Epoch: [11] [1600/4276] eta: 2:16:31 lr: 3.699942213327595e-05 loss: 0.1577 (0.1572) time: 3.0791 data: 0.0087 max mem: 33300 Epoch: [11] [1610/4276] eta: 2:16:00 lr: 3.699670164309613e-05 loss: 0.1408 (0.1571) time: 3.0463 data: 0.0091 max mem: 33300 Epoch: [11] [1620/4276] eta: 2:15:30 lr: 3.699398113068873e-05 loss: 0.1364 (0.1570) time: 3.0525 data: 0.0090 max mem: 33300 Epoch: [11] [1630/4276] eta: 2:14:59 lr: 3.6991260596051735e-05 loss: 0.1391 (0.1571) time: 3.0642 data: 0.0085 max mem: 33300 Epoch: [11] [1640/4276] eta: 2:14:28 lr: 3.698854003918316e-05 loss: 0.1408 (0.1570) time: 3.0469 data: 0.0084 max mem: 33300 Epoch: [11] [1650/4276] eta: 2:13:57 lr: 3.6985819460081e-05 loss: 0.1412 (0.1570) time: 3.0473 data: 0.0085 max mem: 33300 Epoch: [11] [1660/4276] eta: 2:13:27 lr: 3.698309885874326e-05 loss: 0.1504 (0.1570) time: 3.0699 data: 0.0085 max mem: 33300 Epoch: [11] [1670/4276] eta: 2:12:56 lr: 3.698037823516794e-05 loss: 0.1502 (0.1570) time: 3.0636 data: 0.0081 max mem: 33300 Epoch: [11] [1680/4276] eta: 2:12:26 lr: 3.6977657589353035e-05 loss: 0.1563 (0.1570) time: 3.0604 data: 0.0082 max mem: 33300 Epoch: [11] [1690/4276] eta: 2:11:56 lr: 3.697493692129655e-05 loss: 
0.1567 (0.1570) time: 3.0919 data: 0.0084 max mem: 33300 Epoch: [11] [1700/4276] eta: 2:11:26 lr: 3.6972216230996485e-05 loss: 0.1566 (0.1571) time: 3.0914 data: 0.0089 max mem: 33300 Epoch: [11] [1710/4276] eta: 2:10:54 lr: 3.6969495518450834e-05 loss: 0.1694 (0.1571) time: 3.0551 data: 0.0093 max mem: 33300 Epoch: [11] [1720/4276] eta: 2:10:24 lr: 3.69667747836576e-05 loss: 0.1694 (0.1572) time: 3.0364 data: 0.0094 max mem: 33300 Epoch: [11] [1730/4276] eta: 2:09:53 lr: 3.696405402661478e-05 loss: 0.1552 (0.1572) time: 3.0377 data: 0.0098 max mem: 33300 Epoch: [11] [1740/4276] eta: 2:09:22 lr: 3.6961333247320385e-05 loss: 0.1649 (0.1573) time: 3.0369 data: 0.0093 max mem: 33300 Epoch: [11] [1750/4276] eta: 2:08:51 lr: 3.6958612445772395e-05 loss: 0.1530 (0.1572) time: 3.0487 data: 0.0091 max mem: 33300 Epoch: [11] [1760/4276] eta: 2:08:21 lr: 3.695589162196882e-05 loss: 0.1394 (0.1571) time: 3.0761 data: 0.0094 max mem: 33300 Epoch: [11] [1770/4276] eta: 2:07:50 lr: 3.695317077590764e-05 loss: 0.1629 (0.1572) time: 3.0771 data: 0.0093 max mem: 33300 Epoch: [11] [1780/4276] eta: 2:07:20 lr: 3.695044990758687e-05 loss: 0.1589 (0.1572) time: 3.0798 data: 0.0096 max mem: 33300 Epoch: [11] [1790/4276] eta: 2:06:50 lr: 3.69477290170045e-05 loss: 0.1497 (0.1571) time: 3.0959 data: 0.0093 max mem: 33300 Epoch: [11] [1800/4276] eta: 2:06:19 lr: 3.694500810415853e-05 loss: 0.1514 (0.1571) time: 3.0698 data: 0.0088 max mem: 33300 Epoch: [11] [1810/4276] eta: 2:05:48 lr: 3.694228716904696e-05 loss: 0.1604 (0.1572) time: 3.0431 data: 0.0088 max mem: 33300 Epoch: [11] [1820/4276] eta: 2:05:17 lr: 3.693956621166776e-05 loss: 0.1535 (0.1572) time: 3.0385 data: 0.0094 max mem: 33300 Epoch: [11] [1830/4276] eta: 2:04:47 lr: 3.6936845232018965e-05 loss: 0.1508 (0.1571) time: 3.0508 data: 0.0095 max mem: 33300 Epoch: [11] [1840/4276] eta: 2:04:16 lr: 3.6934124230098535e-05 loss: 0.1462 (0.1571) time: 3.0581 data: 0.0088 max mem: 33300 Epoch: [11] [1850/4276] eta: 2:03:45 lr: 
3.693140320590447e-05 loss: 0.1529 (0.1572) time: 3.0518 data: 0.0088 max mem: 33300 Epoch: [11] [1860/4276] eta: 2:03:14 lr: 3.692868215943479e-05 loss: 0.1523 (0.1571) time: 3.0243 data: 0.0084 max mem: 33300 Epoch: [11] [1870/4276] eta: 2:02:43 lr: 3.692596109068746e-05 loss: 0.1516 (0.1573) time: 3.0141 data: 0.0075 max mem: 33300 Epoch: [11] [1880/4276] eta: 2:02:13 lr: 3.692323999966049e-05 loss: 0.1631 (0.1573) time: 3.0628 data: 0.0081 max mem: 33300 Epoch: [11] [1890/4276] eta: 2:01:42 lr: 3.6920518886351874e-05 loss: 0.1515 (0.1573) time: 3.0822 data: 0.0088 max mem: 33300 Epoch: [11] [1900/4276] eta: 2:01:11 lr: 3.691779775075959e-05 loss: 0.1416 (0.1572) time: 3.0520 data: 0.0088 max mem: 33300 Epoch: [11] [1910/4276] eta: 2:00:40 lr: 3.691507659288164e-05 loss: 0.1499 (0.1573) time: 3.0163 data: 0.0078 max mem: 33300 Epoch: [11] [1920/4276] eta: 2:00:09 lr: 3.6912355412716024e-05 loss: 0.1597 (0.1572) time: 3.0104 data: 0.0081 max mem: 33300 Epoch: [11] [1930/4276] eta: 1:59:37 lr: 3.690963421026071e-05 loss: 0.1597 (0.1572) time: 3.0064 data: 0.0085 max mem: 33300 Epoch: [11] [1940/4276] eta: 1:59:06 lr: 3.690691298551372e-05 loss: 0.1559 (0.1572) time: 2.9950 data: 0.0074 max mem: 33300 Epoch: [11] [1950/4276] eta: 1:58:36 lr: 3.6904191738473024e-05 loss: 0.1559 (0.1572) time: 3.0382 data: 0.0075 max mem: 33300 Epoch: [11] [1960/4276] eta: 1:58:05 lr: 3.6901470469136615e-05 loss: 0.1489 (0.1572) time: 3.0451 data: 0.0080 max mem: 33300 Epoch: [11] [1970/4276] eta: 1:57:34 lr: 3.68987491775025e-05 loss: 0.1274 (0.1571) time: 3.0257 data: 0.0081 max mem: 33300 Epoch: [11] [1980/4276] eta: 1:57:04 lr: 3.6896027863568644e-05 loss: 0.1283 (0.1570) time: 3.0932 data: 0.0093 max mem: 33300 Epoch: [11] [1990/4276] eta: 1:56:34 lr: 3.689330652733306e-05 loss: 0.1445 (0.1570) time: 3.1132 data: 0.0100 max mem: 33300 Epoch: [11] [2000/4276] eta: 1:56:03 lr: 3.689058516879371e-05 loss: 0.1488 (0.1570) time: 3.0625 data: 0.0095 max mem: 33300 Epoch: [11] 
[2010/4276] eta: 1:55:32 lr: 3.6887863787948606e-05 loss: 0.1502 (0.1569) time: 3.0435 data: 0.0094 max mem: 33300 Epoch: [11] [2020/4276] eta: 1:55:01 lr: 3.6885142384795737e-05 loss: 0.1530 (0.1569) time: 3.0380 data: 0.0093 max mem: 33300 Epoch: [11] [2030/4276] eta: 1:54:31 lr: 3.6882420959333076e-05 loss: 0.1453 (0.1569) time: 3.0405 data: 0.0094 max mem: 33300 Epoch: [11] [2040/4276] eta: 1:54:00 lr: 3.6879699511558626e-05 loss: 0.1402 (0.1568) time: 3.0619 data: 0.0095 max mem: 33300 Epoch: [11] [2050/4276] eta: 1:53:30 lr: 3.6876978041470366e-05 loss: 0.1586 (0.1568) time: 3.0999 data: 0.0094 max mem: 33300 Epoch: [11] [2060/4276] eta: 1:53:00 lr: 3.6874256549066286e-05 loss: 0.1586 (0.1568) time: 3.0819 data: 0.0092 max mem: 33300 Epoch: [11] [2070/4276] eta: 1:52:29 lr: 3.687153503434438e-05 loss: 0.1448 (0.1567) time: 3.0578 data: 0.0090 max mem: 33300 Epoch: [11] [2080/4276] eta: 1:51:59 lr: 3.6868813497302615e-05 loss: 0.1457 (0.1568) time: 3.0861 data: 0.0092 max mem: 33300 Epoch: [11] [2090/4276] eta: 1:51:29 lr: 3.6866091937938994e-05 loss: 0.1615 (0.1568) time: 3.0899 data: 0.0093 max mem: 33300 Epoch: [11] [2100/4276] eta: 1:50:58 lr: 3.6863370356251495e-05 loss: 0.1549 (0.1568) time: 3.0617 data: 0.0089 max mem: 33300 Epoch: [11] [2110/4276] eta: 1:50:27 lr: 3.686064875223811e-05 loss: 0.1413 (0.1567) time: 3.0519 data: 0.0086 max mem: 33300 Epoch: [11] [2120/4276] eta: 1:49:57 lr: 3.685792712589682e-05 loss: 0.1170 (0.1565) time: 3.0728 data: 0.0093 max mem: 33300 Epoch: [11] [2130/4276] eta: 1:49:26 lr: 3.685520547722562e-05 loss: 0.1284 (0.1564) time: 3.0644 data: 0.0094 max mem: 33300 Epoch: [11] [2140/4276] eta: 1:48:56 lr: 3.685248380622248e-05 loss: 0.1515 (0.1564) time: 3.0652 data: 0.0093 max mem: 33300 Epoch: [11] [2150/4276] eta: 1:48:25 lr: 3.6849762112885385e-05 loss: 0.1579 (0.1564) time: 3.0791 data: 0.0095 max mem: 33300 Epoch: [11] [2160/4276] eta: 1:47:55 lr: 3.684704039721233e-05 loss: 0.1440 (0.1565) time: 3.0658 data: 0.0092 
max mem: 33300 Epoch: [11] [2170/4276] eta: 1:47:24 lr: 3.684431865920128e-05 loss: 0.1547 (0.1565) time: 3.0673 data: 0.0089 max mem: 33300 Epoch: [11] [2180/4276] eta: 1:46:54 lr: 3.684159689885025e-05 loss: 0.1597 (0.1565) time: 3.0889 data: 0.0088 max mem: 33300 Epoch: [11] [2190/4276] eta: 1:46:23 lr: 3.683887511615718e-05 loss: 0.1487 (0.1565) time: 3.0781 data: 0.0088 max mem: 33300 Epoch: [11] [2200/4276] eta: 1:45:53 lr: 3.683615331112009e-05 loss: 0.1530 (0.1565) time: 3.0469 data: 0.0083 max mem: 33300 Epoch: [11] [2210/4276] eta: 1:45:22 lr: 3.683343148373695e-05 loss: 0.1600 (0.1565) time: 3.0384 data: 0.0089 max mem: 33300 Epoch: [11] [2220/4276] eta: 1:44:51 lr: 3.683070963400574e-05 loss: 0.1600 (0.1565) time: 3.0355 data: 0.0097 max mem: 33300 Epoch: [11] [2230/4276] eta: 1:44:20 lr: 3.682798776192443e-05 loss: 0.1509 (0.1565) time: 3.0426 data: 0.0094 max mem: 33300 Epoch: [11] [2240/4276] eta: 1:43:50 lr: 3.6825265867491014e-05 loss: 0.1305 (0.1564) time: 3.0595 data: 0.0091 max mem: 33300 Epoch: [11] [2250/4276] eta: 1:43:19 lr: 3.682254395070348e-05 loss: 0.1366 (0.1563) time: 3.0364 data: 0.0087 max mem: 33300 Epoch: [11] [2260/4276] eta: 1:42:48 lr: 3.6819822011559794e-05 loss: 0.1481 (0.1564) time: 3.0236 data: 0.0080 max mem: 33300 Epoch: [11] [2270/4276] eta: 1:42:17 lr: 3.6817100050057936e-05 loss: 0.1479 (0.1564) time: 3.0642 data: 0.0085 max mem: 33300 Epoch: [11] [2280/4276] eta: 1:41:47 lr: 3.68143780661959e-05 loss: 0.1530 (0.1564) time: 3.0798 data: 0.0092 max mem: 33300 Epoch: [11] [2290/4276] eta: 1:41:16 lr: 3.6811656059971645e-05 loss: 0.1581 (0.1564) time: 3.0519 data: 0.0090 max mem: 33300 Epoch: [11] [2300/4276] eta: 1:40:45 lr: 3.6808934031383166e-05 loss: 0.1511 (0.1563) time: 3.0390 data: 0.0086 max mem: 33300 Epoch: [11] [2310/4276] eta: 1:40:15 lr: 3.680621198042844e-05 loss: 0.1406 (0.1563) time: 3.0396 data: 0.0085 max mem: 33300 Epoch: [11] [2320/4276] eta: 1:39:44 lr: 3.6803489907105436e-05 loss: 0.1612 (0.1564) 
time: 3.0342 data: 0.0087 max mem: 33300 Epoch: [11] [2330/4276] eta: 1:39:14 lr: 3.680076781141214e-05 loss: 0.1655 (0.1564) time: 3.0708 data: 0.0085 max mem: 33300 Epoch: [11] [2340/4276] eta: 1:38:43 lr: 3.679804569334653e-05 loss: 0.1634 (0.1564) time: 3.0945 data: 0.0081 max mem: 33300 Epoch: [11] [2350/4276] eta: 1:38:12 lr: 3.6795323552906577e-05 loss: 0.1497 (0.1564) time: 3.0644 data: 0.0083 max mem: 33300 Epoch: [11] [2360/4276] eta: 1:37:42 lr: 3.679260139009027e-05 loss: 0.1492 (0.1564) time: 3.0461 data: 0.0082 max mem: 33300 Epoch: [11] [2370/4276] eta: 1:37:11 lr: 3.678987920489556e-05 loss: 0.1549 (0.1564) time: 3.0585 data: 0.0082 max mem: 33300 Epoch: [11] [2380/4276] eta: 1:36:41 lr: 3.678715699732046e-05 loss: 0.1510 (0.1564) time: 3.0763 data: 0.0090 max mem: 33300 Epoch: [11] [2390/4276] eta: 1:36:10 lr: 3.678443476736292e-05 loss: 0.1467 (0.1563) time: 3.0630 data: 0.0089 max mem: 33300 Epoch: [11] [2400/4276] eta: 1:35:40 lr: 3.678171251502091e-05 loss: 0.1556 (0.1563) time: 3.0589 data: 0.0086 max mem: 33300 Epoch: [11] [2410/4276] eta: 1:35:09 lr: 3.6778990240292426e-05 loss: 0.1635 (0.1563) time: 3.0532 data: 0.0083 max mem: 33300 Epoch: [11] [2420/4276] eta: 1:34:38 lr: 3.677626794317543e-05 loss: 0.1375 (0.1563) time: 3.0324 data: 0.0085 max mem: 33300 Epoch: [11] [2430/4276] eta: 1:34:08 lr: 3.6773545623667905e-05 loss: 0.1537 (0.1564) time: 3.0534 data: 0.0086 max mem: 33300 Epoch: [11] [2440/4276] eta: 1:33:37 lr: 3.677082328176782e-05 loss: 0.1582 (0.1563) time: 3.0719 data: 0.0085 max mem: 33300 Epoch: [11] [2450/4276] eta: 1:33:06 lr: 3.676810091747315e-05 loss: 0.1485 (0.1563) time: 3.0617 data: 0.0081 max mem: 33300 Epoch: [11] [2460/4276] eta: 1:32:36 lr: 3.676537853078186e-05 loss: 0.1491 (0.1563) time: 3.0775 data: 0.0079 max mem: 33300 Epoch: [11] [2470/4276] eta: 1:32:06 lr: 3.676265612169193e-05 loss: 0.1613 (0.1564) time: 3.1188 data: 0.0083 max mem: 33300 Epoch: [11] [2480/4276] eta: 1:31:35 lr: 3.675993369020133e-05 
loss: 0.1618 (0.1564) time: 3.1019 data: 0.0083 max mem: 33300 Epoch: [11] [2490/4276] eta: 1:31:05 lr: 3.675721123630804e-05 loss: 0.1547 (0.1564) time: 3.0504 data: 0.0083 max mem: 33300 Epoch: [11] [2500/4276] eta: 1:30:34 lr: 3.675448876001003e-05 loss: 0.1479 (0.1564) time: 3.0303 data: 0.0085 max mem: 33300 Epoch: [11] [2510/4276] eta: 1:30:03 lr: 3.675176626130527e-05 loss: 0.1533 (0.1564) time: 3.0366 data: 0.0084 max mem: 33300 Epoch: [11] [2520/4276] eta: 1:29:33 lr: 3.674904374019172e-05 loss: 0.1428 (0.1563) time: 3.0491 data: 0.0086 max mem: 33300 Epoch: [11] [2530/4276] eta: 1:29:02 lr: 3.674632119666737e-05 loss: 0.1225 (0.1562) time: 3.0688 data: 0.0086 max mem: 33300 Epoch: [11] [2540/4276] eta: 1:28:32 lr: 3.674359863073018e-05 loss: 0.1254 (0.1561) time: 3.0852 data: 0.0083 max mem: 33300 Epoch: [11] [2550/4276] eta: 1:28:01 lr: 3.674087604237812e-05 loss: 0.1373 (0.1561) time: 3.0707 data: 0.0084 max mem: 33300 Epoch: [11] [2560/4276] eta: 1:27:31 lr: 3.673815343160916e-05 loss: 0.1334 (0.1560) time: 3.0745 data: 0.0083 max mem: 33300 Epoch: [11] [2570/4276] eta: 1:27:01 lr: 3.6735430798421275e-05 loss: 0.1242 (0.1559) time: 3.1127 data: 0.0084 max mem: 33300 Epoch: [11] [2580/4276] eta: 1:26:30 lr: 3.673270814281242e-05 loss: 0.1397 (0.1559) time: 3.1026 data: 0.0088 max mem: 33300 Epoch: [11] [2590/4276] eta: 1:25:59 lr: 3.672998546478059e-05 loss: 0.1462 (0.1559) time: 3.0360 data: 0.0087 max mem: 33300 Epoch: [11] [2600/4276] eta: 1:25:28 lr: 3.672726276432373e-05 loss: 0.1462 (0.1558) time: 3.0193 data: 0.0081 max mem: 33300 Epoch: [11] [2610/4276] eta: 1:24:58 lr: 3.672454004143982e-05 loss: 0.1314 (0.1558) time: 3.0433 data: 0.0081 max mem: 33300 Epoch: [11] [2620/4276] eta: 1:24:27 lr: 3.6721817296126816e-05 loss: 0.1489 (0.1558) time: 3.0572 data: 0.0086 max mem: 33300 Epoch: [11] [2630/4276] eta: 1:23:57 lr: 3.671909452838269e-05 loss: 0.1504 (0.1557) time: 3.0766 data: 0.0088 max mem: 33300 Epoch: [11] [2640/4276] eta: 1:23:26 lr: 
3.6716371738205416e-05 loss: 0.1377 (0.1557) time: 3.0785 data: 0.0085 max mem: 33300 Epoch: [11] [2650/4276] eta: 1:22:56 lr: 3.671364892559296e-05 loss: 0.1419 (0.1557) time: 3.0757 data: 0.0083 max mem: 33300 Epoch: [11] [2660/4276] eta: 1:22:25 lr: 3.671092609054329e-05 loss: 0.1572 (0.1558) time: 3.0792 data: 0.0079 max mem: 33300 Epoch: [11] [2670/4276] eta: 1:21:55 lr: 3.6708203233054356e-05 loss: 0.1572 (0.1557) time: 3.0799 data: 0.0078 max mem: 33300 Epoch: [11] [2680/4276] eta: 1:21:24 lr: 3.670548035312414e-05 loss: 0.1544 (0.1558) time: 3.0607 data: 0.0079 max mem: 33300 Epoch: [11] [2690/4276] eta: 1:20:53 lr: 3.670275745075061e-05 loss: 0.1516 (0.1557) time: 3.0474 data: 0.0085 max mem: 33300 Epoch: [11] [2700/4276] eta: 1:20:23 lr: 3.670003452593171e-05 loss: 0.1428 (0.1557) time: 3.0648 data: 0.0088 max mem: 33300 Epoch: [11] [2710/4276] eta: 1:19:52 lr: 3.6697311578665413e-05 loss: 0.1388 (0.1557) time: 3.0647 data: 0.0084 max mem: 33300 Epoch: [11] [2720/4276] eta: 1:19:22 lr: 3.6694588608949706e-05 loss: 0.1421 (0.1556) time: 3.0686 data: 0.0080 max mem: 33300 Epoch: [11] [2730/4276] eta: 1:18:51 lr: 3.6691865616782515e-05 loss: 0.1575 (0.1557) time: 3.0643 data: 0.0079 max mem: 33300 Epoch: [11] [2740/4276] eta: 1:18:20 lr: 3.6689142602161844e-05 loss: 0.1593 (0.1557) time: 3.0312 data: 0.0080 max mem: 33300 Epoch: [11] [2750/4276] eta: 1:17:49 lr: 3.6686419565085624e-05 loss: 0.1575 (0.1557) time: 3.0409 data: 0.0079 max mem: 33300 Epoch: [11] [2760/4276] eta: 1:17:19 lr: 3.668369650555183e-05 loss: 0.1479 (0.1557) time: 3.0746 data: 0.0085 max mem: 33300 Epoch: [11] [2770/4276] eta: 1:16:49 lr: 3.668097342355842e-05 loss: 0.1410 (0.1557) time: 3.1036 data: 0.0091 max mem: 33300 Epoch: [11] [2780/4276] eta: 1:16:18 lr: 3.667825031910337e-05 loss: 0.1418 (0.1557) time: 3.0889 data: 0.0091 max mem: 33300 Epoch: [11] [2790/4276] eta: 1:15:47 lr: 3.6675527192184625e-05 loss: 0.1498 (0.1557) time: 3.0556 data: 0.0089 max mem: 33300 Epoch: [11] 
[2800/4276] eta: 1:15:17 lr: 3.6672804042800144e-05 loss: 0.1484 (0.1556) time: 3.0541 data: 0.0090 max mem: 33300 Epoch: [11] [2810/4276] eta: 1:14:46 lr: 3.667008087094791e-05 loss: 0.1231 (0.1555) time: 3.0600 data: 0.0092 max mem: 33300 Epoch: [11] [2820/4276] eta: 1:14:16 lr: 3.6667357676625866e-05 loss: 0.1258 (0.1554) time: 3.0795 data: 0.0092 max mem: 33300 Epoch: [11] [2830/4276] eta: 1:13:45 lr: 3.666463445983198e-05 loss: 0.1354 (0.1554) time: 3.0792 data: 0.0091 max mem: 33300 Epoch: [11] [2840/4276] eta: 1:13:15 lr: 3.6661911220564214e-05 loss: 0.1483 (0.1554) time: 3.0792 data: 0.0088 max mem: 33300 Epoch: [11] [2850/4276] eta: 1:12:44 lr: 3.665918795882052e-05 loss: 0.1663 (0.1555) time: 3.0938 data: 0.0085 max mem: 33300 Epoch: [11] [2860/4276] eta: 1:12:14 lr: 3.665646467459885e-05 loss: 0.1598 (0.1554) time: 3.0621 data: 0.0083 max mem: 33300 Epoch: [11] [2870/4276] eta: 1:11:43 lr: 3.665374136789718e-05 loss: 0.1414 (0.1554) time: 3.0169 data: 0.0077 max mem: 33300 Epoch: [11] [2880/4276] eta: 1:11:12 lr: 3.6651018038713466e-05 loss: 0.1436 (0.1554) time: 2.9952 data: 0.0072 max mem: 33300 Epoch: [11] [2890/4276] eta: 1:10:41 lr: 3.664829468704565e-05 loss: 0.1529 (0.1554) time: 2.9877 data: 0.0076 max mem: 33300 Epoch: [11] [2900/4276] eta: 1:10:10 lr: 3.664557131289172e-05 loss: 0.1475 (0.1553) time: 3.0298 data: 0.0084 max mem: 33300 Epoch: [11] [2910/4276] eta: 1:09:40 lr: 3.66428479162496e-05 loss: 0.1442 (0.1553) time: 3.0749 data: 0.0093 max mem: 33300 Epoch: [11] [2920/4276] eta: 1:09:09 lr: 3.6640124497117275e-05 loss: 0.1446 (0.1553) time: 3.0818 data: 0.0094 max mem: 33300 Epoch: [11] [2930/4276] eta: 1:08:39 lr: 3.6637401055492685e-05 loss: 0.1343 (0.1552) time: 3.0870 data: 0.0087 max mem: 33300 Epoch: [11] [2940/4276] eta: 1:08:08 lr: 3.6634677591373786e-05 loss: 0.1257 (0.1552) time: 3.0944 data: 0.0085 max mem: 33300 Epoch: [11] [2950/4276] eta: 1:07:38 lr: 3.663195410475854e-05 loss: 0.1329 (0.1551) time: 3.0785 data: 0.0080 max 
mem: 33300 Epoch: [11] [2960/4276] eta: 1:07:07 lr: 3.6629230595644906e-05 loss: 0.1453 (0.1551) time: 3.0920 data: 0.0084 max mem: 33300 Epoch: [11] [2970/4276] eta: 1:06:37 lr: 3.6626507064030824e-05 loss: 0.1455 (0.1552) time: 3.1303 data: 0.0092 max mem: 33300 Epoch: [11] [2980/4276] eta: 1:06:07 lr: 3.6623783509914275e-05 loss: 0.1512 (0.1552) time: 3.1154 data: 0.0086 max mem: 33300 Epoch: [11] [2990/4276] eta: 1:05:36 lr: 3.662105993329319e-05 loss: 0.1501 (0.1551) time: 3.0829 data: 0.0085 max mem: 33300 Epoch: [11] [3000/4276] eta: 1:05:06 lr: 3.661833633416554e-05 loss: 0.1349 (0.1551) time: 3.0738 data: 0.0090 max mem: 33300 Epoch: [11] [3010/4276] eta: 1:04:35 lr: 3.661561271252927e-05 loss: 0.1401 (0.1551) time: 3.0661 data: 0.0086 max mem: 33300 Epoch: [11] [3020/4276] eta: 1:04:04 lr: 3.6612889068382325e-05 loss: 0.1479 (0.1550) time: 3.0727 data: 0.0082 max mem: 33300 Epoch: [11] [3030/4276] eta: 1:03:34 lr: 3.6610165401722677e-05 loss: 0.1450 (0.1550) time: 3.0832 data: 0.0086 max mem: 33300 Epoch: [11] [3040/4276] eta: 1:03:03 lr: 3.660744171254827e-05 loss: 0.1595 (0.1551) time: 3.0863 data: 0.0089 max mem: 33300 Epoch: [11] [3050/4276] eta: 1:02:33 lr: 3.6604718000857056e-05 loss: 0.1592 (0.1551) time: 3.0823 data: 0.0088 max mem: 33300 Epoch: [11] [3060/4276] eta: 1:02:02 lr: 3.6601994266646985e-05 loss: 0.1340 (0.1550) time: 3.0880 data: 0.0088 max mem: 33300 Epoch: [11] [3070/4276] eta: 1:01:32 lr: 3.6599270509916015e-05 loss: 0.1475 (0.1550) time: 3.0839 data: 0.0083 max mem: 33300 Epoch: [11] [3080/4276] eta: 1:01:01 lr: 3.659654673066209e-05 loss: 0.1393 (0.1549) time: 3.0488 data: 0.0079 max mem: 33300 Epoch: [11] [3090/4276] eta: 1:00:30 lr: 3.659382292888317e-05 loss: 0.1325 (0.1549) time: 3.0457 data: 0.0082 max mem: 33300 Epoch: [11] [3100/4276] eta: 1:00:00 lr: 3.65910991045772e-05 loss: 0.1353 (0.1549) time: 3.0744 data: 0.0087 max mem: 33300 Epoch: [11] [3110/4276] eta: 0:59:29 lr: 3.658837525774213e-05 loss: 0.1353 (0.1548) time: 
3.0783 data: 0.0089 max mem: 33300 Epoch: [11] [3120/4276] eta: 0:58:59 lr: 3.658565138837591e-05 loss: 0.1326 (0.1548) time: 3.0668 data: 0.0090 max mem: 33300 Epoch: [11] [3130/4276] eta: 0:58:28 lr: 3.6582927496476496e-05 loss: 0.1385 (0.1548) time: 3.0791 data: 0.0088 max mem: 33300 Epoch: [11] [3140/4276] eta: 0:57:58 lr: 3.6580203582041835e-05 loss: 0.1463 (0.1548) time: 3.0740 data: 0.0087 max mem: 33300 Epoch: [11] [3150/4276] eta: 0:57:27 lr: 3.657747964506987e-05 loss: 0.1569 (0.1548) time: 3.0692 data: 0.0089 max mem: 33300 Epoch: [11] [3160/4276] eta: 0:56:56 lr: 3.657475568555856e-05 loss: 0.1538 (0.1548) time: 3.0765 data: 0.0088 max mem: 33300 Epoch: [11] [3170/4276] eta: 0:56:26 lr: 3.6572031703505836e-05 loss: 0.1455 (0.1548) time: 3.0936 data: 0.0086 max mem: 33300 Epoch: [11] [3180/4276] eta: 0:55:56 lr: 3.656930769890966e-05 loss: 0.1450 (0.1548) time: 3.1021 data: 0.0088 max mem: 33300 Epoch: [11] [3190/4276] eta: 0:55:25 lr: 3.6566583671767976e-05 loss: 0.1438 (0.1548) time: 3.0774 data: 0.0089 max mem: 33300 Epoch: [11] [3200/4276] eta: 0:54:54 lr: 3.6563859622078735e-05 loss: 0.1446 (0.1548) time: 3.0591 data: 0.0086 max mem: 33300 Epoch: [11] [3210/4276] eta: 0:54:24 lr: 3.656113554983988e-05 loss: 0.1540 (0.1549) time: 3.0556 data: 0.0083 max mem: 33300 Epoch: [11] [3220/4276] eta: 0:53:53 lr: 3.655841145504936e-05 loss: 0.1545 (0.1549) time: 3.0680 data: 0.0085 max mem: 33300 Epoch: [11] [3230/4276] eta: 0:53:22 lr: 3.655568733770512e-05 loss: 0.1535 (0.1548) time: 3.0734 data: 0.0084 max mem: 33300 Epoch: [11] [3240/4276] eta: 0:52:52 lr: 3.6552963197805096e-05 loss: 0.1553 (0.1549) time: 3.0648 data: 0.0084 max mem: 33300 Epoch: [11] [3250/4276] eta: 0:52:21 lr: 3.655023903534724e-05 loss: 0.1543 (0.1549) time: 3.0698 data: 0.0091 max mem: 33300 Epoch: [11] [3260/4276] eta: 0:51:51 lr: 3.654751485032951e-05 loss: 0.1514 (0.1549) time: 3.0897 data: 0.0098 max mem: 33300 Epoch: [11] [3270/4276] eta: 0:51:20 lr: 3.654479064274984e-05 loss: 
0.1502 (0.1549) time: 3.0885 data: 0.0100 max mem: 33300 Epoch: [11] [3280/4276] eta: 0:50:50 lr: 3.654206641260617e-05 loss: 0.1462 (0.1548) time: 3.0857 data: 0.0099 max mem: 33300 Epoch: [11] [3290/4276] eta: 0:50:19 lr: 3.653934215989645e-05 loss: 0.1459 (0.1549) time: 3.0865 data: 0.0096 max mem: 33300 Epoch: [11] [3300/4276] eta: 0:49:49 lr: 3.653661788461862e-05 loss: 0.1507 (0.1548) time: 3.0826 data: 0.0096 max mem: 33300 Epoch: [11] [3310/4276] eta: 0:49:18 lr: 3.6533893586770636e-05 loss: 0.1531 (0.1549) time: 3.0753 data: 0.0099 max mem: 33300 Epoch: [11] [3320/4276] eta: 0:48:47 lr: 3.653116926635042e-05 loss: 0.1599 (0.1549) time: 3.0742 data: 0.0098 max mem: 33300 Epoch: [11] [3330/4276] eta: 0:48:17 lr: 3.652844492335592e-05 loss: 0.1474 (0.1549) time: 3.0904 data: 0.0098 max mem: 33300 Epoch: [11] [3340/4276] eta: 0:47:46 lr: 3.652572055778509e-05 loss: 0.1586 (0.1550) time: 3.0736 data: 0.0097 max mem: 33300 Epoch: [11] [3350/4276] eta: 0:47:16 lr: 3.652299616963586e-05 loss: 0.1549 (0.1549) time: 3.0677 data: 0.0094 max mem: 33300 Epoch: [11] [3360/4276] eta: 0:46:45 lr: 3.652027175890619e-05 loss: 0.1398 (0.1549) time: 3.0943 data: 0.0096 max mem: 33300 Epoch: [11] [3370/4276] eta: 0:46:15 lr: 3.6517547325593996e-05 loss: 0.1521 (0.1550) time: 3.1076 data: 0.0095 max mem: 33300 Epoch: [11] [3380/4276] eta: 0:45:44 lr: 3.651482286969724e-05 loss: 0.1516 (0.1549) time: 3.1044 data: 0.0092 max mem: 33300 Epoch: [11] [3390/4276] eta: 0:45:13 lr: 3.651209839121385e-05 loss: 0.1496 (0.1550) time: 3.0703 data: 0.0094 max mem: 33300 Epoch: [11] [3400/4276] eta: 0:44:43 lr: 3.650937389014176e-05 loss: 0.1613 (0.1550) time: 3.0484 data: 0.0096 max mem: 33300 Epoch: [11] [3410/4276] eta: 0:44:12 lr: 3.650664936647893e-05 loss: 0.1613 (0.1550) time: 3.0570 data: 0.0093 max mem: 33300 Epoch: [11] [3420/4276] eta: 0:43:42 lr: 3.650392482022328e-05 loss: 0.1623 (0.1550) time: 3.0766 data: 0.0092 max mem: 33300 Epoch: [11] [3430/4276] eta: 0:43:11 lr: 
3.650120025137276e-05 loss: 0.1623 (0.1551) time: 3.0949 data: 0.0094 max mem: 33300 Epoch: [11] [3440/4276] eta: 0:42:40 lr: 3.649847565992531e-05 loss: 0.1410 (0.1550) time: 3.0893 data: 0.0092 max mem: 33300 Epoch: [11] [3450/4276] eta: 0:42:10 lr: 3.649575104587886e-05 loss: 0.1483 (0.1551) time: 3.0857 data: 0.0089 max mem: 33300 Epoch: [11] [3460/4276] eta: 0:41:39 lr: 3.649302640923135e-05 loss: 0.1696 (0.1551) time: 3.0762 data: 0.0088 max mem: 33300 Epoch: [11] [3470/4276] eta: 0:41:09 lr: 3.649030174998072e-05 loss: 0.1354 (0.1551) time: 3.0683 data: 0.0087 max mem: 33300 Epoch: [11] [3480/4276] eta: 0:40:38 lr: 3.648757706812491e-05 loss: 0.1502 (0.1551) time: 3.0845 data: 0.0093 max mem: 33300 Epoch: [11] [3490/4276] eta: 0:40:07 lr: 3.6484852363661855e-05 loss: 0.1566 (0.1551) time: 3.0782 data: 0.0093 max mem: 33300 Epoch: [11] [3500/4276] eta: 0:39:37 lr: 3.648212763658948e-05 loss: 0.1546 (0.1551) time: 3.0726 data: 0.0087 max mem: 33300 Epoch: [11] [3510/4276] eta: 0:39:06 lr: 3.6479402886905736e-05 loss: 0.1461 (0.1550) time: 3.0588 data: 0.0084 max mem: 33300 Epoch: [11] [3520/4276] eta: 0:38:35 lr: 3.647667811460857e-05 loss: 0.1407 (0.1550) time: 3.0575 data: 0.0082 max mem: 33300 Epoch: [11] [3530/4276] eta: 0:38:05 lr: 3.647395331969588e-05 loss: 0.1426 (0.1550) time: 3.0750 data: 0.0085 max mem: 33300 Epoch: [11] [3540/4276] eta: 0:37:34 lr: 3.647122850216563e-05 loss: 0.1363 (0.1550) time: 3.0613 data: 0.0086 max mem: 33300 Epoch: [11] [3550/4276] eta: 0:37:04 lr: 3.6468503662015744e-05 loss: 0.1361 (0.1550) time: 3.0748 data: 0.0085 max mem: 33300 Epoch: [11] [3560/4276] eta: 0:36:33 lr: 3.646577879924417e-05 loss: 0.1394 (0.1550) time: 3.0873 data: 0.0087 max mem: 33300 Epoch: [11] [3570/4276] eta: 0:36:02 lr: 3.6463053913848815e-05 loss: 0.1667 (0.1551) time: 3.0940 data: 0.0090 max mem: 33300 Epoch: [11] [3580/4276] eta: 0:35:32 lr: 3.6460329005827634e-05 loss: 0.1376 (0.1550) time: 3.0992 data: 0.0090 max mem: 33300 Epoch: [11] 
[3590/4276] eta: 0:35:01 lr: 3.645760407517856e-05 loss: 0.1334 (0.1550) time: 3.0761 data: 0.0087 max mem: 33300 Epoch: [11] [3600/4276] eta: 0:34:31 lr: 3.6454879121899525e-05 loss: 0.1448 (0.1550) time: 3.0665 data: 0.0085 max mem: 33300 Epoch: [11] [3610/4276] eta: 0:34:00 lr: 3.645215414598845e-05 loss: 0.1448 (0.1550) time: 3.0591 data: 0.0086 max mem: 33300 Epoch: [11] [3620/4276] eta: 0:33:29 lr: 3.644942914744328e-05 loss: 0.1380 (0.1549) time: 3.0573 data: 0.0091 max mem: 33300 Epoch: [11] [3630/4276] eta: 0:32:59 lr: 3.644670412626194e-05 loss: 0.1442 (0.1549) time: 3.0710 data: 0.0087 max mem: 33300 Epoch: [11] [3640/4276] eta: 0:32:28 lr: 3.644397908244236e-05 loss: 0.1418 (0.1549) time: 3.0814 data: 0.0088 max mem: 33300 Epoch: [11] [3650/4276] eta: 0:31:58 lr: 3.6441254015982475e-05 loss: 0.1249 (0.1548) time: 3.0913 data: 0.0093 max mem: 33300 Epoch: [11] [3660/4276] eta: 0:31:27 lr: 3.643852892688021e-05 loss: 0.1274 (0.1548) time: 3.0827 data: 0.0092 max mem: 33300 Epoch: [11] [3670/4276] eta: 0:30:56 lr: 3.643580381513351e-05 loss: 0.1485 (0.1548) time: 3.0795 data: 0.0087 max mem: 33300 Epoch: [11] [3680/4276] eta: 0:30:26 lr: 3.643307868074029e-05 loss: 0.1679 (0.1548) time: 3.0850 data: 0.0087 max mem: 33300 Epoch: [11] [3690/4276] eta: 0:29:55 lr: 3.643035352369849e-05 loss: 0.1480 (0.1548) time: 3.0685 data: 0.0086 max mem: 33300 Epoch: [11] [3700/4276] eta: 0:29:24 lr: 3.6427628344006026e-05 loss: 0.1531 (0.1548) time: 3.0741 data: 0.0085 max mem: 33300 Epoch: [11] [3710/4276] eta: 0:28:54 lr: 3.642490314166084e-05 loss: 0.1494 (0.1548) time: 3.0840 data: 0.0086 max mem: 33300 Epoch: [11] [3720/4276] eta: 0:28:23 lr: 3.642217791666085e-05 loss: 0.1392 (0.1548) time: 3.0821 data: 0.0087 max mem: 33300 Epoch: [11] [3730/4276] eta: 0:27:53 lr: 3.6419452669004e-05 loss: 0.1461 (0.1548) time: 3.0757 data: 0.0086 max mem: 33300 Epoch: [11] [3740/4276] eta: 0:27:22 lr: 3.64167273986882e-05 loss: 0.1514 (0.1548) time: 3.0687 data: 0.0085 max mem: 
33300 Epoch: [11] [3750/4276] eta: 0:26:51 lr: 3.6414002105711385e-05 loss: 0.1521 (0.1548) time: 3.0821 data: 0.0084 max mem: 33300 Epoch: [11] [3760/4276] eta: 0:26:21 lr: 3.641127679007148e-05 loss: 0.1415 (0.1548) time: 3.0686 data: 0.0090 max mem: 33300 Epoch: [11] [3770/4276] eta: 0:25:50 lr: 3.640855145176642e-05 loss: 0.1488 (0.1548) time: 3.0796 data: 0.0092 max mem: 33300 Epoch: [11] [3780/4276] eta: 0:25:19 lr: 3.640582609079412e-05 loss: 0.1508 (0.1548) time: 3.0963 data: 0.0088 max mem: 33300 Epoch: [11] [3790/4276] eta: 0:24:49 lr: 3.6403100707152514e-05 loss: 0.1483 (0.1548) time: 3.0575 data: 0.0086 max mem: 33300 Epoch: [11] [3800/4276] eta: 0:24:18 lr: 3.640037530083952e-05 loss: 0.1544 (0.1548) time: 3.0084 data: 0.0082 max mem: 33300 Epoch: [11] [3810/4276] eta: 0:23:47 lr: 3.639764987185307e-05 loss: 0.1444 (0.1548) time: 3.0037 data: 0.0077 max mem: 33300 Epoch: [11] [3820/4276] eta: 0:23:17 lr: 3.6394924420191095e-05 loss: 0.1216 (0.1547) time: 3.0544 data: 0.0084 max mem: 33300 Epoch: [11] [3830/4276] eta: 0:22:46 lr: 3.6392198945851505e-05 loss: 0.1287 (0.1547) time: 3.0787 data: 0.0093 max mem: 33300 Epoch: [11] [3840/4276] eta: 0:22:15 lr: 3.6389473448832236e-05 loss: 0.1413 (0.1546) time: 3.0872 data: 0.0093 max mem: 33300 Epoch: [11] [3850/4276] eta: 0:21:45 lr: 3.6386747929131206e-05 loss: 0.1331 (0.1546) time: 3.1003 data: 0.0087 max mem: 33300 Epoch: [11] [3860/4276] eta: 0:21:14 lr: 3.6384022386746336e-05 loss: 0.1343 (0.1545) time: 3.0799 data: 0.0083 max mem: 33300 Epoch: [11] [3870/4276] eta: 0:20:44 lr: 3.6381296821675545e-05 loss: 0.1409 (0.1545) time: 3.0769 data: 0.0085 max mem: 33300 Epoch: [11] [3880/4276] eta: 0:20:13 lr: 3.637857123391677e-05 loss: 0.1377 (0.1545) time: 3.0882 data: 0.0084 max mem: 33300 Epoch: [11] [3890/4276] eta: 0:19:42 lr: 3.637584562346793e-05 loss: 0.1454 (0.1545) time: 3.0692 data: 0.0080 max mem: 33300 Epoch: [11] [3900/4276] eta: 0:19:12 lr: 3.637311999032694e-05 loss: 0.1437 (0.1545) time: 
3.0729 data: 0.0082 max mem: 33300 Epoch: [11] [3910/4276] eta: 0:18:41 lr: 3.637039433449172e-05 loss: 0.1332 (0.1544) time: 3.0895 data: 0.0086 max mem: 33300 Epoch: [11] [3920/4276] eta: 0:18:10 lr: 3.636766865596021e-05 loss: 0.1296 (0.1544) time: 3.0789 data: 0.0086 max mem: 33300 Epoch: [11] [3930/4276] eta: 0:17:40 lr: 3.636494295473031e-05 loss: 0.1453 (0.1544) time: 3.0643 data: 0.0082 max mem: 33300 Epoch: [11] [3940/4276] eta: 0:17:09 lr: 3.6362217230799946e-05 loss: 0.1441 (0.1544) time: 3.0704 data: 0.0079 max mem: 33300 Epoch: [11] [3950/4276] eta: 0:16:39 lr: 3.635949148416704e-05 loss: 0.1396 (0.1543) time: 3.0894 data: 0.0078 max mem: 33300 Epoch: [11] [3960/4276] eta: 0:16:08 lr: 3.635676571482951e-05 loss: 0.1435 (0.1543) time: 3.0845 data: 0.0077 max mem: 33300 Epoch: [11] [3970/4276] eta: 0:15:37 lr: 3.635403992278528e-05 loss: 0.1621 (0.1544) time: 3.0938 data: 0.0076 max mem: 33300 Epoch: [11] [3980/4276] eta: 0:15:07 lr: 3.6351314108032266e-05 loss: 0.1493 (0.1544) time: 3.0977 data: 0.0076 max mem: 33300 Epoch: [11] [3990/4276] eta: 0:14:36 lr: 3.634858827056839e-05 loss: 0.1461 (0.1544) time: 3.0639 data: 0.0078 max mem: 33300 Epoch: [11] [4000/4276] eta: 0:14:05 lr: 3.634586241039157e-05 loss: 0.1437 (0.1543) time: 3.0551 data: 0.0082 max mem: 33300 Epoch: [11] [4010/4276] eta: 0:13:35 lr: 3.6343136527499715e-05 loss: 0.1437 (0.1544) time: 3.0830 data: 0.0082 max mem: 33300 Epoch: [11] [4020/4276] eta: 0:13:04 lr: 3.634041062189075e-05 loss: 0.1434 (0.1544) time: 3.0865 data: 0.0077 max mem: 33300 Epoch: [11] [4030/4276] eta: 0:12:33 lr: 3.63376846935626e-05 loss: 0.1410 (0.1544) time: 3.0822 data: 0.0075 max mem: 33300 Epoch: [11] [4040/4276] eta: 0:12:03 lr: 3.633495874251316e-05 loss: 0.1453 (0.1544) time: 3.0986 data: 0.0078 max mem: 33300 Epoch: [11] [4050/4276] eta: 0:11:32 lr: 3.633223276874037e-05 loss: 0.1524 (0.1544) time: 3.0941 data: 0.0078 max mem: 33300 Epoch: [11] [4060/4276] eta: 0:11:02 lr: 3.632950677224214e-05 loss: 
0.1510 (0.1545) time: 3.0797 data: 0.0077 max mem: 33300 Epoch: [11] [4070/4276] eta: 0:10:31 lr: 3.6326780753016374e-05 loss: 0.1619 (0.1545) time: 3.0820 data: 0.0079 max mem: 33300 Epoch: [11] [4080/4276] eta: 0:10:00 lr: 3.6324054711061e-05 loss: 0.1634 (0.1545) time: 3.0803 data: 0.0081 max mem: 33300 Epoch: [11] [4090/4276] eta: 0:09:30 lr: 3.632132864637392e-05 loss: 0.1637 (0.1545) time: 3.0621 data: 0.0081 max mem: 33300 Epoch: [11] [4100/4276] eta: 0:08:59 lr: 3.631860255895307e-05 loss: 0.1560 (0.1545) time: 3.0825 data: 0.0081 max mem: 33300 Epoch: [11] [4110/4276] eta: 0:08:28 lr: 3.6315876448796345e-05 loss: 0.1560 (0.1546) time: 3.1001 data: 0.0080 max mem: 33300 Epoch: [11] [4120/4276] eta: 0:07:58 lr: 3.631315031590166e-05 loss: 0.1609 (0.1546) time: 3.0784 data: 0.0082 max mem: 33300 Epoch: [11] [4130/4276] eta: 0:07:27 lr: 3.6310424160266944e-05 loss: 0.1478 (0.1545) time: 3.0522 data: 0.0088 max mem: 33300 Epoch: [11] [4140/4276] eta: 0:06:56 lr: 3.63076979818901e-05 loss: 0.1478 (0.1545) time: 3.0686 data: 0.0085 max mem: 33300 Epoch: [11] [4150/4276] eta: 0:06:26 lr: 3.6304971780769044e-05 loss: 0.1523 (0.1546) time: 3.0965 data: 0.0088 max mem: 33300 Epoch: [11] [4160/4276] eta: 0:05:55 lr: 3.630224555690168e-05 loss: 0.1506 (0.1546) time: 3.0879 data: 0.0093 max mem: 33300 Epoch: [11] [4170/4276] eta: 0:05:24 lr: 3.629951931028593e-05 loss: 0.1816 (0.1547) time: 3.1049 data: 0.0086 max mem: 33300 Epoch: [11] [4180/4276] eta: 0:04:54 lr: 3.62967930409197e-05 loss: 0.1619 (0.1547) time: 3.1000 data: 0.0087 max mem: 33300 Epoch: [11] [4190/4276] eta: 0:04:23 lr: 3.6294066748800906e-05 loss: 0.1511 (0.1547) time: 3.0652 data: 0.0090 max mem: 33300 Epoch: [11] [4200/4276] eta: 0:03:52 lr: 3.629134043392745e-05 loss: 0.1542 (0.1547) time: 3.0683 data: 0.0092 max mem: 33300 Epoch: [11] [4210/4276] eta: 0:03:22 lr: 3.628861409629726e-05 loss: 0.1747 (0.1548) time: 3.0587 data: 0.0087 max mem: 33300 Epoch: [11] [4220/4276] eta: 0:02:51 lr: 
3.628588773590823e-05 loss: 0.1747 (0.1548) time: 3.0429 data: 0.0083 max mem: 33300 Epoch: [11] [4230/4276] eta: 0:02:21 lr: 3.6283161352758274e-05 loss: 0.1743 (0.1549) time: 3.0749 data: 0.0091 max mem: 33300 Epoch: [11] [4240/4276] eta: 0:01:50 lr: 3.6280434946845316e-05 loss: 0.1648 (0.1549) time: 3.1051 data: 0.0100 max mem: 33300 Epoch: [11] [4250/4276] eta: 0:01:19 lr: 3.6277708518167244e-05 loss: 0.1603 (0.1550) time: 3.0967 data: 0.0095 max mem: 33300 Epoch: [11] [4260/4276] eta: 0:00:49 lr: 3.627498206672197e-05 loss: 0.1675 (0.1550) time: 3.0788 data: 0.0085 max mem: 33300 Epoch: [11] [4270/4276] eta: 0:00:18 lr: 3.627225559250741e-05 loss: 0.1644 (0.1550) time: 3.0797 data: 0.0078 max mem: 33300 Epoch: [11] Total time: 3:38:30 Test: [ 0/21770] eta: 12:51:10 time: 2.1254 data: 2.0860 max mem: 33300 Test: [ 100/21770] eta: 0:21:14 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 200/21770] eta: 0:17:31 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 300/21770] eta: 0:16:13 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 400/21770] eta: 0:15:33 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 500/21770] eta: 0:15:07 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 600/21770] eta: 0:14:49 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 700/21770] eta: 0:14:35 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 800/21770] eta: 0:14:23 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 900/21770] eta: 0:14:12 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 1000/21770] eta: 0:14:03 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 1100/21770] eta: 0:13:55 time: 0.0387 data: 0.0010 max mem: 33300 Test: [ 1200/21770] eta: 0:13:48 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 1300/21770] eta: 0:13:42 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 1400/21770] eta: 0:13:35 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 1500/21770] eta: 0:13:30 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 1600/21770] eta: 0:13:25 time: 0.0392 data: 0.0012 max mem: 33300 
Test: [ 1700/21770] eta: 0:13:20 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 1800/21770] eta: 0:13:15 time: 0.0388 data: 0.0012 max mem: 33300 Test: [ 1900/21770] eta: 0:13:10 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 2000/21770] eta: 0:13:05 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 2100/21770] eta: 0:13:00 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 2200/21770] eta: 0:12:56 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 2300/21770] eta: 0:12:51 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 2400/21770] eta: 0:12:47 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 2500/21770] eta: 0:12:43 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 2600/21770] eta: 0:12:39 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 2700/21770] eta: 0:12:35 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 2800/21770] eta: 0:12:30 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 2900/21770] eta: 0:12:26 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 3000/21770] eta: 0:12:22 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 3100/21770] eta: 0:12:17 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 3200/21770] eta: 0:12:13 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 3300/21770] eta: 0:12:09 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 3400/21770] eta: 0:12:05 time: 0.0396 data: 0.0010 max mem: 33300 Test: [ 3500/21770] eta: 0:12:01 time: 0.0395 data: 0.0010 max mem: 33300 Test: [ 3600/21770] eta: 0:11:57 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 3700/21770] eta: 0:11:53 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 3800/21770] eta: 0:11:49 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 3900/21770] eta: 0:11:45 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 4000/21770] eta: 0:11:40 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 4100/21770] eta: 0:11:36 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 4200/21770] eta: 0:11:32 time: 0.0395 data: 0.0012 max mem: 33300 Test: [ 4300/21770] eta: 0:11:28 time: 0.0392 data: 0.0011 max mem: 33300 
Test: [ 4400/21770] eta: 0:11:24 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 4500/21770] eta: 0:11:21 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 4600/21770] eta: 0:11:17 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 4700/21770] eta: 0:11:13 time: 0.0398 data: 0.0012 max mem: 33300 Test: [ 4800/21770] eta: 0:11:09 time: 0.0393 data: 0.0012 max mem: 33300 Test: [ 4900/21770] eta: 0:11:05 time: 0.0395 data: 0.0011 max mem: 33300 Test: [ 5000/21770] eta: 0:11:01 time: 0.0399 data: 0.0012 max mem: 33300 Test: [ 5100/21770] eta: 0:10:57 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 5200/21770] eta: 0:10:53 time: 0.0390 data: 0.0012 max mem: 33300 Test: [ 5300/21770] eta: 0:10:50 time: 0.0394 data: 0.0012 max mem: 33300 Test: [ 5400/21770] eta: 0:10:45 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 5500/21770] eta: 0:10:41 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 5600/21770] eta: 0:10:37 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 5700/21770] eta: 0:10:33 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 5800/21770] eta: 0:10:29 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 5900/21770] eta: 0:10:25 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 6000/21770] eta: 0:10:21 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 6100/21770] eta: 0:10:17 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 6200/21770] eta: 0:10:13 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 6300/21770] eta: 0:10:09 time: 0.0385 data: 0.0012 max mem: 33300 Test: [ 6400/21770] eta: 0:10:04 time: 0.0386 data: 0.0012 max mem: 33300 Test: [ 6500/21770] eta: 0:10:01 time: 0.0394 data: 0.0012 max mem: 33300 Test: [ 6600/21770] eta: 0:09:57 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 6700/21770] eta: 0:09:53 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 6800/21770] eta: 0:09:49 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 6900/21770] eta: 0:09:45 time: 0.0394 data: 0.0012 max mem: 33300 Test: [ 7000/21770] eta: 0:09:41 time: 0.0394 data: 0.0012 max mem: 33300 
Test: [ 7100/21770] eta: 0:09:37 time: 0.0391 data: 0.0012 max mem: 33300 Test: [ 7200/21770] eta: 0:09:33 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 7300/21770] eta: 0:09:29 time: 0.0397 data: 0.0010 max mem: 33300 Test: [ 7400/21770] eta: 0:09:25 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 7500/21770] eta: 0:09:21 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 7600/21770] eta: 0:09:17 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 7700/21770] eta: 0:09:13 time: 0.0394 data: 0.0012 max mem: 33300 Test: [ 7800/21770] eta: 0:09:09 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 7900/21770] eta: 0:09:05 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 8000/21770] eta: 0:09:01 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 8100/21770] eta: 0:08:58 time: 0.0393 data: 0.0012 max mem: 33300 Test: [ 8200/21770] eta: 0:08:54 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 8300/21770] eta: 0:08:50 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 8400/21770] eta: 0:08:46 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 8500/21770] eta: 0:08:42 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 8600/21770] eta: 0:08:38 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 8700/21770] eta: 0:08:34 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 8800/21770] eta: 0:08:30 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 8900/21770] eta: 0:08:26 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 9000/21770] eta: 0:08:22 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 9100/21770] eta: 0:08:18 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 9200/21770] eta: 0:08:14 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 9300/21770] eta: 0:08:10 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 9400/21770] eta: 0:08:06 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 9500/21770] eta: 0:08:02 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 9600/21770] eta: 0:07:58 time: 0.0390 data: 0.0012 max mem: 33300 Test: [ 9700/21770] eta: 0:07:54 time: 0.0389 data: 0.0011 max mem: 33300 
Test: [ 9800/21770] eta: 0:07:50 time: 0.0389 data: 0.0012 max mem: 33300 Test: [ 9900/21770] eta: 0:07:46 time: 0.0390 data: 0.0012 max mem: 33300 Test: [10000/21770] eta: 0:07:42 time: 0.0391 data: 0.0012 max mem: 33300 Test: [10100/21770] eta: 0:07:38 time: 0.0390 data: 0.0011 max mem: 33300 Test: [10200/21770] eta: 0:07:34 time: 0.0391 data: 0.0012 max mem: 33300 Test: [10300/21770] eta: 0:07:30 time: 0.0393 data: 0.0011 max mem: 33300 Test: [10400/21770] eta: 0:07:26 time: 0.0391 data: 0.0011 max mem: 33300 Test: [10500/21770] eta: 0:07:22 time: 0.0392 data: 0.0011 max mem: 33300 Test: [10600/21770] eta: 0:07:18 time: 0.0398 data: 0.0011 max mem: 33300 Test: [10700/21770] eta: 0:07:14 time: 0.0394 data: 0.0011 max mem: 33300 Test: [10800/21770] eta: 0:07:10 time: 0.0395 data: 0.0010 max mem: 33300 Test: [10900/21770] eta: 0:07:06 time: 0.0393 data: 0.0011 max mem: 33300 Test: [11000/21770] eta: 0:07:02 time: 0.0391 data: 0.0011 max mem: 33300 Test: [11100/21770] eta: 0:06:59 time: 0.0387 data: 0.0011 max mem: 33300 Test: [11200/21770] eta: 0:06:55 time: 0.0388 data: 0.0012 max mem: 33300 Test: [11300/21770] eta: 0:06:51 time: 0.0389 data: 0.0011 max mem: 33300 Test: [11400/21770] eta: 0:06:47 time: 0.0387 data: 0.0011 max mem: 33300 Test: [11500/21770] eta: 0:06:43 time: 0.0389 data: 0.0011 max mem: 33300 Test: [11600/21770] eta: 0:06:39 time: 0.0387 data: 0.0011 max mem: 33300 Test: [11700/21770] eta: 0:06:35 time: 0.0388 data: 0.0011 max mem: 33300 Test: [11800/21770] eta: 0:06:31 time: 0.0385 data: 0.0011 max mem: 33300 Test: [11900/21770] eta: 0:06:27 time: 0.0390 data: 0.0012 max mem: 33300 Test: [12000/21770] eta: 0:06:23 time: 0.0390 data: 0.0011 max mem: 33300 Test: [12100/21770] eta: 0:06:19 time: 0.0391 data: 0.0011 max mem: 33300 Test: [12200/21770] eta: 0:06:15 time: 0.0391 data: 0.0011 max mem: 33300 Test: [12300/21770] eta: 0:06:11 time: 0.0388 data: 0.0010 max mem: 33300 Test: [12400/21770] eta: 0:06:07 time: 0.0388 data: 0.0012 max mem: 33300 
Test: [12500/21770] eta: 0:06:03 time: 0.0392 data: 0.0011 max mem: 33300 Test: [12600/21770] eta: 0:05:59 time: 0.0392 data: 0.0012 max mem: 33300 Test: [12700/21770] eta: 0:05:55 time: 0.0395 data: 0.0012 max mem: 33300 Test: [12800/21770] eta: 0:05:51 time: 0.0396 data: 0.0012 max mem: 33300 Test: [12900/21770] eta: 0:05:48 time: 0.0397 data: 0.0012 max mem: 33300 Test: [13000/21770] eta: 0:05:44 time: 0.0395 data: 0.0012 max mem: 33300 Test: [13100/21770] eta: 0:05:40 time: 0.0396 data: 0.0011 max mem: 33300 Test: [13200/21770] eta: 0:05:36 time: 0.0399 data: 0.0011 max mem: 33300 Test: [13300/21770] eta: 0:05:32 time: 0.0396 data: 0.0011 max mem: 33300 Test: [13400/21770] eta: 0:05:28 time: 0.0396 data: 0.0011 max mem: 33300 Test: [13500/21770] eta: 0:05:24 time: 0.0401 data: 0.0012 max mem: 33300 Test: [13600/21770] eta: 0:05:20 time: 0.0390 data: 0.0010 max mem: 33300 Test: [13700/21770] eta: 0:05:16 time: 0.0395 data: 0.0011 max mem: 33300 Test: [13800/21770] eta: 0:05:12 time: 0.0392 data: 0.0011 max mem: 33300 Test: [13900/21770] eta: 0:05:08 time: 0.0396 data: 0.0011 max mem: 33300 Test: [14000/21770] eta: 0:05:05 time: 0.0396 data: 0.0011 max mem: 33300 Test: [14100/21770] eta: 0:05:01 time: 0.0401 data: 0.0012 max mem: 33300 Test: [14200/21770] eta: 0:04:57 time: 0.0395 data: 0.0011 max mem: 33300 Test: [14300/21770] eta: 0:04:53 time: 0.0402 data: 0.0011 max mem: 33300 Test: [14400/21770] eta: 0:04:49 time: 0.0391 data: 0.0012 max mem: 33300 Test: [14500/21770] eta: 0:04:45 time: 0.0396 data: 0.0012 max mem: 33300 Test: [14600/21770] eta: 0:04:41 time: 0.0390 data: 0.0012 max mem: 33300 Test: [14700/21770] eta: 0:04:37 time: 0.0401 data: 0.0012 max mem: 33300 Test: [14800/21770] eta: 0:04:33 time: 0.0395 data: 0.0012 max mem: 33300 Test: [14900/21770] eta: 0:04:29 time: 0.0395 data: 0.0012 max mem: 33300 Test: [15000/21770] eta: 0:04:25 time: 0.0394 data: 0.0012 max mem: 33300 Test: [15100/21770] eta: 0:04:22 time: 0.0391 data: 0.0011 max mem: 33300 
Test: [15200/21770] eta: 0:04:18 time: 0.0392 data: 0.0012 max mem: 33300 Test: [15300/21770] eta: 0:04:14 time: 0.0391 data: 0.0011 max mem: 33300 Test: [15400/21770] eta: 0:04:10 time: 0.0388 data: 0.0010 max mem: 33300 Test: [15500/21770] eta: 0:04:06 time: 0.0393 data: 0.0011 max mem: 33300 Test: [15600/21770] eta: 0:04:02 time: 0.0397 data: 0.0011 max mem: 33300 Test: [15700/21770] eta: 0:03:58 time: 0.0392 data: 0.0011 max mem: 33300 Test: [15800/21770] eta: 0:03:54 time: 0.0393 data: 0.0012 max mem: 33300 Test: [15900/21770] eta: 0:03:50 time: 0.0399 data: 0.0012 max mem: 33300 Test: [16000/21770] eta: 0:03:46 time: 0.0395 data: 0.0012 max mem: 33300 Test: [16100/21770] eta: 0:03:42 time: 0.0397 data: 0.0011 max mem: 33300 Test: [16200/21770] eta: 0:03:38 time: 0.0388 data: 0.0011 max mem: 33300 Test: [16300/21770] eta: 0:03:34 time: 0.0388 data: 0.0010 max mem: 33300 Test: [16400/21770] eta: 0:03:30 time: 0.0386 data: 0.0010 max mem: 33300 Test: [16500/21770] eta: 0:03:27 time: 0.0393 data: 0.0011 max mem: 33300 Test: [16600/21770] eta: 0:03:23 time: 0.0388 data: 0.0010 max mem: 33300 Test: [16700/21770] eta: 0:03:19 time: 0.0389 data: 0.0011 max mem: 33300 Test: [16800/21770] eta: 0:03:15 time: 0.0392 data: 0.0011 max mem: 33300 Test: [16900/21770] eta: 0:03:11 time: 0.0389 data: 0.0011 max mem: 33300 Test: [17000/21770] eta: 0:03:07 time: 0.0389 data: 0.0011 max mem: 33300 Test: [17100/21770] eta: 0:03:03 time: 0.0388 data: 0.0011 max mem: 33300 Test: [17200/21770] eta: 0:02:59 time: 0.0391 data: 0.0011 max mem: 33300 Test: [17300/21770] eta: 0:02:55 time: 0.0391 data: 0.0011 max mem: 33300 Test: [17400/21770] eta: 0:02:51 time: 0.0389 data: 0.0010 max mem: 33300 Test: [17500/21770] eta: 0:02:47 time: 0.0393 data: 0.0011 max mem: 33300 Test: [17600/21770] eta: 0:02:43 time: 0.0391 data: 0.0011 max mem: 33300 Test: [17700/21770] eta: 0:02:39 time: 0.0393 data: 0.0011 max mem: 33300 Test: [17800/21770] eta: 0:02:35 time: 0.0392 data: 0.0012 max mem: 33300 
Test: [17900/21770] eta: 0:02:31 time: 0.0391 data: 0.0012 max mem: 33300 Test: [18000/21770] eta: 0:02:28 time: 0.0390 data: 0.0011 max mem: 33300 Test: [18100/21770] eta: 0:02:24 time: 0.0388 data: 0.0011 max mem: 33300 Test: [18200/21770] eta: 0:02:20 time: 0.0390 data: 0.0011 max mem: 33300 Test: [18300/21770] eta: 0:02:16 time: 0.0390 data: 0.0011 max mem: 33300 Test: [18400/21770] eta: 0:02:12 time: 0.0382 data: 0.0010 max mem: 33300 Test: [18500/21770] eta: 0:02:08 time: 0.0389 data: 0.0011 max mem: 33300 Test: [18600/21770] eta: 0:02:04 time: 0.0388 data: 0.0012 max mem: 33300 Test: [18700/21770] eta: 0:02:00 time: 0.0389 data: 0.0012 max mem: 33300 Test: [18800/21770] eta: 0:01:56 time: 0.0392 data: 0.0012 max mem: 33300 Test: [18900/21770] eta: 0:01:52 time: 0.0387 data: 0.0011 max mem: 33300 Test: [19000/21770] eta: 0:01:48 time: 0.0388 data: 0.0011 max mem: 33300 Test: [19100/21770] eta: 0:01:44 time: 0.0389 data: 0.0011 max mem: 33300 Test: [19200/21770] eta: 0:01:40 time: 0.0384 data: 0.0011 max mem: 33300 Test: [19300/21770] eta: 0:01:36 time: 0.0389 data: 0.0011 max mem: 33300 Test: [19400/21770] eta: 0:01:32 time: 0.0387 data: 0.0011 max mem: 33300 Test: [19500/21770] eta: 0:01:29 time: 0.0385 data: 0.0010 max mem: 33300 Test: [19600/21770] eta: 0:01:25 time: 0.0394 data: 0.0012 max mem: 33300 Test: [19700/21770] eta: 0:01:21 time: 0.0389 data: 0.0010 max mem: 33300 Test: [19800/21770] eta: 0:01:17 time: 0.0390 data: 0.0011 max mem: 33300 Test: [19900/21770] eta: 0:01:13 time: 0.0391 data: 0.0011 max mem: 33300 Test: [20000/21770] eta: 0:01:09 time: 0.0389 data: 0.0010 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0391 data: 0.0011 max mem: 33300 Test: [20200/21770] eta: 0:01:01 time: 0.0391 data: 0.0011 max mem: 33300 Test: [20300/21770] eta: 0:00:57 time: 0.0394 data: 0.0012 max mem: 33300 Test: [20400/21770] eta: 0:00:53 time: 0.0393 data: 0.0012 max mem: 33300 Test: [20500/21770] eta: 0:00:49 time: 0.0395 data: 0.0011 max mem: 33300 
Test: [20600/21770] eta: 0:00:45 time: 0.0391 data: 0.0011 max mem: 33300 Test: [20700/21770] eta: 0:00:41 time: 0.0393 data: 0.0012 max mem: 33300 Test: [20800/21770] eta: 0:00:38 time: 0.0395 data: 0.0012 max mem: 33300 Test: [20900/21770] eta: 0:00:34 time: 0.0390 data: 0.0011 max mem: 33300 Test: [21000/21770] eta: 0:00:30 time: 0.0395 data: 0.0011 max mem: 33300 Test: [21100/21770] eta: 0:00:26 time: 0.0392 data: 0.0011 max mem: 33300 Test: [21200/21770] eta: 0:00:22 time: 0.0395 data: 0.0011 max mem: 33300 Test: [21300/21770] eta: 0:00:18 time: 0.0393 data: 0.0011 max mem: 33300 Test: [21400/21770] eta: 0:00:14 time: 0.0396 data: 0.0011 max mem: 33300 Test: [21500/21770] eta: 0:00:10 time: 0.0394 data: 0.0010 max mem: 33300 Test: [21600/21770] eta: 0:00:06 time: 0.0400 data: 0.0010 max mem: 33300 Test: [21700/21770] eta: 0:00:02 time: 0.0394 data: 0.0011 max mem: 33300 Test: Total time: 0:14:14 Final results: Mean IoU is 0.28 precision@0.5 = 0.00 precision@0.6 = 0.00 precision@0.7 = 0.00 precision@0.8 = 0.00 precision@0.9 = 0.00 overall IoU = 0.27 mean IoU = 0.28 Mean accuracy for one-to-zero sample is 0.00 Average object IoU 0.002750891716068778 Overall IoU 0.2721877694129944 Better epoch: 11 Epoch: [12] [ 0/4276] eta: 7:14:35 lr: 3.6270619697048556e-05 loss: 0.1167 (0.1167) time: 6.0981 data: 2.8400 max mem: 33300 Epoch: [12] [ 10/4276] eta: 4:00:56 lr: 3.626789318639879e-05 loss: 0.1789 (0.1635) time: 3.3887 data: 0.2635 max mem: 33300 Epoch: [12] [ 20/4276] eta: 3:50:11 lr: 3.626516665297429e-05 loss: 0.1627 (0.1620) time: 3.1025 data: 0.0060 max mem: 33300 Epoch: [12] [ 30/4276] eta: 3:46:06 lr: 3.626244009677297e-05 loss: 0.1597 (0.1617) time: 3.0887 data: 0.0070 max mem: 33300 Epoch: [12] [ 40/4276] eta: 3:43:46 lr: 3.625971351779274e-05 loss: 0.1582 (0.1603) time: 3.0902 data: 0.0076 max mem: 33300 Epoch: [12] [ 50/4276] eta: 3:41:56 lr: 3.62569869160315e-05 loss: 0.1550 (0.1573) time: 3.0825 data: 0.0073 max mem: 33300 Epoch: [12] [ 60/4276] eta: 
3:40:20 lr: 3.625426029148717e-05 loss: 0.1532 (0.1571) time: 3.0663 data: 0.0073 max mem: 33300 Epoch: [12] [ 70/4276] eta: 3:39:07 lr: 3.625153364415764e-05 loss: 0.1532 (0.1558) time: 3.0623 data: 0.0074 max mem: 33300 Epoch: [12] [ 80/4276] eta: 3:38:20 lr: 3.6248806974040815e-05 loss: 0.1509 (0.1558) time: 3.0808 data: 0.0074 max mem: 33300 Epoch: [12] [ 90/4276] eta: 3:37:24 lr: 3.624608028113461e-05 loss: 0.1427 (0.1538) time: 3.0812 data: 0.0070 max mem: 33300 Epoch: [12] [ 100/4276] eta: 3:36:40 lr: 3.624335356543692e-05 loss: 0.1443 (0.1552) time: 3.0762 data: 0.0071 max mem: 33300 Epoch: [12] [ 110/4276] eta: 3:36:01 lr: 3.6240626826945653e-05 loss: 0.1634 (0.1561) time: 3.0885 data: 0.0072 max mem: 33300 Epoch: [12] [ 120/4276] eta: 3:35:14 lr: 3.623790006565872e-05 loss: 0.1442 (0.1553) time: 3.0794 data: 0.0069 max mem: 33300 Epoch: [12] [ 130/4276] eta: 3:34:27 lr: 3.623517328157402e-05 loss: 0.1453 (0.1549) time: 3.0612 data: 0.0069 max mem: 33300 Epoch: [12] [ 140/4276] eta: 3:33:50 lr: 3.623244647468945e-05 loss: 0.1399 (0.1539) time: 3.0697 data: 0.0071 max mem: 33300 Epoch: [12] [ 150/4276] eta: 3:33:09 lr: 3.6229719645002926e-05 loss: 0.1360 (0.1536) time: 3.0751 data: 0.0071 max mem: 33300 Epoch: [12] [ 160/4276] eta: 3:32:33 lr: 3.622699279251234e-05 loss: 0.1404 (0.1534) time: 3.0732 data: 0.0070 max mem: 33300 Epoch: [12] [ 170/4276] eta: 3:31:57 lr: 3.6224265917215586e-05 loss: 0.1422 (0.1527) time: 3.0786 data: 0.0070 max mem: 33300 Epoch: [12] [ 180/4276] eta: 3:31:19 lr: 3.622153901911058e-05 loss: 0.1494 (0.1530) time: 3.0722 data: 0.0070 max mem: 33300 Epoch: [12] [ 190/4276] eta: 3:30:40 lr: 3.6218812098195214e-05 loss: 0.1494 (0.1531) time: 3.0625 data: 0.0071 max mem: 33300 Epoch: [12] [ 200/4276] eta: 3:30:00 lr: 3.62160851544674e-05 loss: 0.1445 (0.1542) time: 3.0531 data: 0.0072 max mem: 33300 Epoch: [12] [ 210/4276] eta: 3:29:31 lr: 3.621335818792503e-05 loss: 0.1477 (0.1547) time: 3.0753 data: 0.0076 max mem: 33300 Epoch: [12] 
[ 220/4276] eta: 3:28:56 lr: 3.621063119856602e-05 loss: 0.1537 (0.1549) time: 3.0849 data: 0.0081 max mem: 33300 Epoch: [12] [ 230/4276] eta: 3:28:29 lr: 3.620790418638824e-05 loss: 0.1480 (0.1543) time: 3.0899 data: 0.0077 max mem: 33300 Epoch: [12] [ 240/4276] eta: 3:27:57 lr: 3.6205177151389617e-05 loss: 0.1448 (0.1543) time: 3.0986 data: 0.0072 max mem: 33300 Epoch: [12] [ 250/4276] eta: 3:27:22 lr: 3.620245009356803e-05 loss: 0.1595 (0.1552) time: 3.0775 data: 0.0072 max mem: 33300 Epoch: [12] [ 260/4276] eta: 3:26:47 lr: 3.619972301292139e-05 loss: 0.1623 (0.1552) time: 3.0655 data: 0.0076 max mem: 33300 Epoch: [12] [ 270/4276] eta: 3:26:14 lr: 3.619699590944759e-05 loss: 0.1420 (0.1552) time: 3.0694 data: 0.0077 max mem: 33300 Epoch: [12] [ 280/4276] eta: 3:25:44 lr: 3.619426878314453e-05 loss: 0.1420 (0.1549) time: 3.0870 data: 0.0076 max mem: 33300 Epoch: [12] [ 290/4276] eta: 3:25:09 lr: 3.61915416340101e-05 loss: 0.1432 (0.1544) time: 3.0757 data: 0.0075 max mem: 33300 Epoch: [12] [ 300/4276] eta: 3:24:40 lr: 3.618881446204223e-05 loss: 0.1423 (0.1542) time: 3.0798 data: 0.0076 max mem: 33300 Epoch: [12] [ 310/4276] eta: 3:24:10 lr: 3.618608726723877e-05 loss: 0.1386 (0.1539) time: 3.1000 data: 0.0075 max mem: 33300 Epoch: [12] [ 320/4276] eta: 3:23:35 lr: 3.6183360049597656e-05 loss: 0.1409 (0.1541) time: 3.0767 data: 0.0071 max mem: 33300 Epoch: [12] [ 330/4276] eta: 3:23:02 lr: 3.6180632809116755e-05 loss: 0.1516 (0.1543) time: 3.0630 data: 0.0071 max mem: 33300 Epoch: [12] [ 340/4276] eta: 3:22:33 lr: 3.617790554579397e-05 loss: 0.1464 (0.1538) time: 3.0859 data: 0.0077 max mem: 33300 Epoch: [12] [ 350/4276] eta: 3:22:01 lr: 3.6175178259627205e-05 loss: 0.1311 (0.1537) time: 3.0887 data: 0.0076 max mem: 33300 Epoch: [12] [ 360/4276] eta: 3:21:28 lr: 3.617245095061435e-05 loss: 0.1711 (0.1546) time: 3.0719 data: 0.0072 max mem: 33300 Epoch: [12] [ 370/4276] eta: 3:21:00 lr: 3.616972361875331e-05 loss: 0.1637 (0.1542) time: 3.0933 data: 0.0075 max 
mem: 33300 Epoch: [12] [ 380/4276] eta: 3:20:30 lr: 3.6166996264041966e-05 loss: 0.1304 (0.1541) time: 3.1038 data: 0.0073 max mem: 33300 Epoch: [12] [ 390/4276] eta: 3:19:58 lr: 3.616426888647821e-05 loss: 0.1338 (0.1541) time: 3.0874 data: 0.0070 max mem: 33300 Epoch: [12] [ 400/4276] eta: 3:19:28 lr: 3.616154148605994e-05 loss: 0.1448 (0.1539) time: 3.0902 data: 0.0072 max mem: 33300 Epoch: [12] [ 410/4276] eta: 3:18:59 lr: 3.615881406278506e-05 loss: 0.1448 (0.1535) time: 3.0987 data: 0.0073 max mem: 33300 Epoch: [12] [ 420/4276] eta: 3:18:24 lr: 3.6156086616651444e-05 loss: 0.1467 (0.1535) time: 3.0738 data: 0.0076 max mem: 33300 Epoch: [12] [ 430/4276] eta: 3:17:55 lr: 3.615335914765699e-05 loss: 0.1595 (0.1536) time: 3.0750 data: 0.0075 max mem: 33300 Epoch: [12] [ 440/4276] eta: 3:17:25 lr: 3.61506316557996e-05 loss: 0.1514 (0.1536) time: 3.1037 data: 0.0074 max mem: 33300 Epoch: [12] [ 450/4276] eta: 3:16:52 lr: 3.6147904141077163e-05 loss: 0.1548 (0.1536) time: 3.0839 data: 0.0075 max mem: 33300 Epoch: [12] [ 460/4276] eta: 3:16:21 lr: 3.6145176603487565e-05 loss: 0.1406 (0.1533) time: 3.0743 data: 0.0074 max mem: 33300 Epoch: [12] [ 470/4276] eta: 3:15:52 lr: 3.61424490430287e-05 loss: 0.1279 (0.1529) time: 3.1004 data: 0.0075 max mem: 33300 Epoch: [12] [ 480/4276] eta: 3:15:22 lr: 3.613972145969845e-05 loss: 0.1276 (0.1527) time: 3.1064 data: 0.0076 max mem: 33300 Epoch: [12] [ 490/4276] eta: 3:14:51 lr: 3.6136993853494715e-05 loss: 0.1258 (0.1522) time: 3.0870 data: 0.0074 max mem: 33300 Epoch: [12] [ 500/4276] eta: 3:14:23 lr: 3.613426622441539e-05 loss: 0.1266 (0.1519) time: 3.1031 data: 0.0073 max mem: 33300 Epoch: [12] [ 510/4276] eta: 3:13:53 lr: 3.613153857245834e-05 loss: 0.1322 (0.1517) time: 3.1144 data: 0.0074 max mem: 33300 Epoch: [12] [ 520/4276] eta: 3:13:20 lr: 3.612881089762148e-05 loss: 0.1345 (0.1516) time: 3.0853 data: 0.0073 max mem: 33300 Epoch: [12] [ 530/4276] eta: 3:12:50 lr: 3.61260831999027e-05 loss: 0.1345 (0.1515) time: 
3.0843 data: 0.0073 max mem: 33300 Epoch: [12] [ 540/4276] eta: 3:12:19 lr: 3.6123355479299865e-05 loss: 0.1411 (0.1512) time: 3.0906 data: 0.0072 max mem: 33300 Epoch: [12] [ 550/4276] eta: 3:11:46 lr: 3.612062773581089e-05 loss: 0.1391 (0.1512) time: 3.0737 data: 0.0072 max mem: 33300 Epoch: [12] [ 560/4276] eta: 3:11:16 lr: 3.611789996943363e-05 loss: 0.1485 (0.1513) time: 3.0785 data: 0.0073 max mem: 33300 Epoch: [12] [ 570/4276] eta: 3:10:48 lr: 3.6115172180165994e-05 loss: 0.1552 (0.1514) time: 3.1141 data: 0.0073 max mem: 33300 Epoch: [12] [ 580/4276] eta: 3:10:17 lr: 3.611244436800587e-05 loss: 0.1567 (0.1515) time: 3.1141 data: 0.0072 max mem: 33300 Epoch: [12] [ 590/4276] eta: 3:09:44 lr: 3.6109716532951146e-05 loss: 0.1333 (0.1511) time: 3.0743 data: 0.0074 max mem: 33300 Epoch: [12] [ 600/4276] eta: 3:09:15 lr: 3.6106988674999694e-05 loss: 0.1355 (0.1511) time: 3.0830 data: 0.0076 max mem: 33300 Epoch: [12] [ 610/4276] eta: 3:08:45 lr: 3.610426079414941e-05 loss: 0.1508 (0.1510) time: 3.1100 data: 0.0074 max mem: 33300 Epoch: [12] [ 620/4276] eta: 3:08:14 lr: 3.610153289039818e-05 loss: 0.1404 (0.1510) time: 3.1007 data: 0.0073 max mem: 33300 Epoch: [12] [ 630/4276] eta: 3:07:43 lr: 3.609880496374389e-05 loss: 0.1500 (0.1512) time: 3.0840 data: 0.0078 max mem: 33300 Epoch: [12] [ 640/4276] eta: 3:07:12 lr: 3.609607701418441e-05 loss: 0.1533 (0.1511) time: 3.0841 data: 0.0080 max mem: 33300 Epoch: [12] [ 650/4276] eta: 3:06:40 lr: 3.609334904171764e-05 loss: 0.1518 (0.1512) time: 3.0793 data: 0.0074 max mem: 33300 Epoch: [12] [ 660/4276] eta: 3:06:09 lr: 3.609062104634146e-05 loss: 0.1501 (0.1513) time: 3.0807 data: 0.0072 max mem: 33300 Epoch: [12] [ 670/4276] eta: 3:05:38 lr: 3.608789302805375e-05 loss: 0.1460 (0.1513) time: 3.0919 data: 0.0078 max mem: 33300 Epoch: [12] [ 680/4276] eta: 3:05:06 lr: 3.6085164986852404e-05 loss: 0.1353 (0.1511) time: 3.0770 data: 0.0082 max mem: 33300 Epoch: [12] [ 690/4276] eta: 3:04:33 lr: 3.608243692273529e-05 loss: 
0.1380 (0.1511) time: 3.0580 data: 0.0079 max mem: 33300 Epoch: [12] [ 700/4276] eta: 3:04:04 lr: 3.60797088357003e-05 loss: 0.1469 (0.1510) time: 3.0857 data: 0.0080 max mem: 33300 Epoch: [12] [ 710/4276] eta: 3:03:33 lr: 3.607698072574531e-05 loss: 0.1452 (0.1510) time: 3.0999 data: 0.0084 max mem: 33300 Epoch: [12] [ 720/4276] eta: 3:03:02 lr: 3.60742525928682e-05 loss: 0.1349 (0.1508) time: 3.0805 data: 0.0085 max mem: 33300 Epoch: [12] [ 730/4276] eta: 3:02:31 lr: 3.607152443706686e-05 loss: 0.1253 (0.1508) time: 3.0868 data: 0.0082 max mem: 33300 Epoch: [12] [ 740/4276] eta: 3:01:59 lr: 3.6068796258339165e-05 loss: 0.1316 (0.1507) time: 3.0831 data: 0.0083 max mem: 33300 Epoch: [12] [ 750/4276] eta: 3:01:25 lr: 3.6066068056683e-05 loss: 0.1386 (0.1508) time: 3.0471 data: 0.0081 max mem: 33300 Epoch: [12] [ 760/4276] eta: 3:00:54 lr: 3.6063339832096244e-05 loss: 0.1368 (0.1507) time: 3.0488 data: 0.0081 max mem: 33300 Epoch: [12] [ 770/4276] eta: 3:00:25 lr: 3.606061158457677e-05 loss: 0.1358 (0.1508) time: 3.0997 data: 0.0080 max mem: 33300 Epoch: [12] [ 780/4276] eta: 2:59:53 lr: 3.605788331412247e-05 loss: 0.1502 (0.1508) time: 3.0970 data: 0.0077 max mem: 33300 Epoch: [12] [ 790/4276] eta: 2:59:21 lr: 3.605515502073121e-05 loss: 0.1495 (0.1507) time: 3.0662 data: 0.0077 max mem: 33300 Epoch: [12] [ 800/4276] eta: 2:58:51 lr: 3.605242670440088e-05 loss: 0.1426 (0.1508) time: 3.0827 data: 0.0079 max mem: 33300 Epoch: [12] [ 810/4276] eta: 2:58:20 lr: 3.604969836512934e-05 loss: 0.1461 (0.1509) time: 3.0907 data: 0.0078 max mem: 33300 Epoch: [12] [ 820/4276] eta: 2:57:48 lr: 3.6046970002914485e-05 loss: 0.1407 (0.1507) time: 3.0710 data: 0.0080 max mem: 33300 Epoch: [12] [ 830/4276] eta: 2:57:17 lr: 3.6044241617754195e-05 loss: 0.1402 (0.1508) time: 3.0756 data: 0.0080 max mem: 33300 Epoch: [12] [ 840/4276] eta: 2:56:46 lr: 3.604151320964633e-05 loss: 0.1436 (0.1509) time: 3.0906 data: 0.0075 max mem: 33300 Epoch: [12] [ 850/4276] eta: 2:56:16 lr: 
3.603878477858878e-05 loss: 0.1431 (0.1508) time: 3.0966 data: 0.0076 max mem: 33300 Epoch: [12] [ 860/4276] eta: 2:55:45 lr: 3.603605632457943e-05 loss: 0.1447 (0.1507) time: 3.0936 data: 0.0077 max mem: 33300 Epoch: [12] [ 870/4276] eta: 2:55:14 lr: 3.603332784761613e-05 loss: 0.1447 (0.1509) time: 3.0846 data: 0.0078 max mem: 33300 Epoch: [12] [ 880/4276] eta: 2:54:43 lr: 3.6030599347696777e-05 loss: 0.1459 (0.1510) time: 3.0779 data: 0.0082 max mem: 33300 Epoch: [12] [ 890/4276] eta: 2:54:12 lr: 3.602787082481923e-05 loss: 0.1514 (0.1510) time: 3.0865 data: 0.0079 max mem: 33300 Epoch: [12] [ 900/4276] eta: 2:53:42 lr: 3.6025142278981384e-05 loss: 0.1500 (0.1509) time: 3.0979 data: 0.0076 max mem: 33300 Epoch: [12] [ 910/4276] eta: 2:53:12 lr: 3.60224137101811e-05 loss: 0.1521 (0.1511) time: 3.1050 data: 0.0082 max mem: 33300 Epoch: [12] [ 920/4276] eta: 2:52:42 lr: 3.601968511841625e-05 loss: 0.1607 (0.1513) time: 3.1087 data: 0.0087 max mem: 33300 Epoch: [12] [ 930/4276] eta: 2:52:10 lr: 3.601695650368472e-05 loss: 0.1607 (0.1513) time: 3.0882 data: 0.0085 max mem: 33300 Epoch: [12] [ 940/4276] eta: 2:51:40 lr: 3.6014227865984366e-05 loss: 0.1521 (0.1512) time: 3.0894 data: 0.0080 max mem: 33300 Epoch: [12] [ 950/4276] eta: 2:51:08 lr: 3.601149920531308e-05 loss: 0.1427 (0.1513) time: 3.0862 data: 0.0079 max mem: 33300 Epoch: [12] [ 960/4276] eta: 2:50:37 lr: 3.600877052166872e-05 loss: 0.1503 (0.1514) time: 3.0632 data: 0.0080 max mem: 33300 Epoch: [12] [ 970/4276] eta: 2:50:07 lr: 3.6006041815049165e-05 loss: 0.1540 (0.1514) time: 3.0889 data: 0.0078 max mem: 33300 Epoch: [12] [ 980/4276] eta: 2:49:36 lr: 3.600331308545228e-05 loss: 0.1555 (0.1516) time: 3.1039 data: 0.0076 max mem: 33300 Epoch: [12] [ 990/4276] eta: 2:49:04 lr: 3.600058433287595e-05 loss: 0.1574 (0.1516) time: 3.0776 data: 0.0073 max mem: 33300 Epoch: [12] [1000/4276] eta: 2:48:33 lr: 3.599785555731804e-05 loss: 0.1417 (0.1515) time: 3.0668 data: 0.0072 max mem: 33300 Epoch: [12] 
[1010/4276] eta: 2:48:02 lr: 3.599512675877641e-05 loss: 0.1367 (0.1516) time: 3.0746 data: 0.0072 max mem: 33300 Epoch: [12] [1020/4276] eta: 2:47:29 lr: 3.599239793724895e-05 loss: 0.1372 (0.1515) time: 3.0566 data: 0.0076 max mem: 33300 Epoch: [12] [1030/4276] eta: 2:46:58 lr: 3.5989669092733506e-05 loss: 0.1425 (0.1516) time: 3.0624 data: 0.0082 max mem: 33300 Epoch: [12] [1040/4276] eta: 2:46:28 lr: 3.5986940225227974e-05 loss: 0.1558 (0.1516) time: 3.1045 data: 0.0084 max mem: 33300 Epoch: [12] [1050/4276] eta: 2:45:57 lr: 3.59842113347302e-05 loss: 0.1465 (0.1517) time: 3.0925 data: 0.0081 max mem: 33300 Epoch: [12] [1060/4276] eta: 2:45:26 lr: 3.598148242123807e-05 loss: 0.1465 (0.1517) time: 3.0682 data: 0.0075 max mem: 33300 Epoch: [12] [1070/4276] eta: 2:44:55 lr: 3.597875348474945e-05 loss: 0.1592 (0.1519) time: 3.0864 data: 0.0073 max mem: 33300 Epoch: [12] [1080/4276] eta: 2:44:24 lr: 3.5976024525262195e-05 loss: 0.1587 (0.1518) time: 3.0853 data: 0.0072 max mem: 33300 Epoch: [12] [1090/4276] eta: 2:43:52 lr: 3.597329554277419e-05 loss: 0.1504 (0.1519) time: 3.0680 data: 0.0072 max mem: 33300 Epoch: [12] [1100/4276] eta: 2:43:21 lr: 3.597056653728328e-05 loss: 0.1437 (0.1518) time: 3.0704 data: 0.0076 max mem: 33300 Epoch: [12] [1110/4276] eta: 2:42:50 lr: 3.5967837508787354e-05 loss: 0.1437 (0.1519) time: 3.0781 data: 0.0081 max mem: 33300 Epoch: [12] [1120/4276] eta: 2:42:19 lr: 3.596510845728427e-05 loss: 0.1563 (0.1520) time: 3.0857 data: 0.0082 max mem: 33300 Epoch: [12] [1130/4276] eta: 2:41:49 lr: 3.5962379382771894e-05 loss: 0.1478 (0.1519) time: 3.0916 data: 0.0079 max mem: 33300 Epoch: [12] [1140/4276] eta: 2:41:18 lr: 3.595965028524809e-05 loss: 0.1375 (0.1518) time: 3.0846 data: 0.0078 max mem: 33300 Epoch: [12] [1150/4276] eta: 2:40:46 lr: 3.595692116471073e-05 loss: 0.1395 (0.1517) time: 3.0703 data: 0.0082 max mem: 33300 Epoch: [12] [1160/4276] eta: 2:40:16 lr: 3.595419202115768e-05 loss: 0.1448 (0.1518) time: 3.0806 data: 0.0079 max 
mem: 33300 Epoch: [12] [1170/4276] eta: 2:39:45 lr: 3.59514628545868e-05 loss: 0.1589 (0.1519) time: 3.1039 data: 0.0075 max mem: 33300 Epoch: [12] [1180/4276] eta: 2:39:15 lr: 3.5948733664995945e-05 loss: 0.1488 (0.1519) time: 3.1003 data: 0.0078 max mem: 33300 Epoch: [12] [1190/4276] eta: 2:38:43 lr: 3.594600445238299e-05 loss: 0.1396 (0.1517) time: 3.0773 data: 0.0078 max mem: 33300 Epoch: [12] [1200/4276] eta: 2:38:12 lr: 3.59432752167458e-05 loss: 0.1327 (0.1517) time: 3.0686 data: 0.0082 max mem: 33300 Epoch: [12] [1210/4276] eta: 2:37:41 lr: 3.5940545958082235e-05 loss: 0.1327 (0.1516) time: 3.0685 data: 0.0087 max mem: 33300 Epoch: [12] [1220/4276] eta: 2:37:10 lr: 3.5937816676390154e-05 loss: 0.1388 (0.1516) time: 3.0887 data: 0.0081 max mem: 33300 Epoch: [12] [1230/4276] eta: 2:36:39 lr: 3.5935087371667424e-05 loss: 0.1428 (0.1517) time: 3.0870 data: 0.0080 max mem: 33300 Epoch: [12] [1240/4276] eta: 2:36:08 lr: 3.593235804391191e-05 loss: 0.1546 (0.1518) time: 3.0785 data: 0.0083 max mem: 33300 Epoch: [12] [1250/4276] eta: 2:35:37 lr: 3.592962869312147e-05 loss: 0.1456 (0.1518) time: 3.0876 data: 0.0079 max mem: 33300 Epoch: [12] [1260/4276] eta: 2:35:06 lr: 3.592689931929397e-05 loss: 0.1318 (0.1516) time: 3.0762 data: 0.0079 max mem: 33300 Epoch: [12] [1270/4276] eta: 2:34:34 lr: 3.5924169922427254e-05 loss: 0.1327 (0.1515) time: 3.0417 data: 0.0084 max mem: 33300 Epoch: [12] [1280/4276] eta: 2:34:02 lr: 3.59214405025192e-05 loss: 0.1409 (0.1515) time: 3.0297 data: 0.0087 max mem: 33300 Epoch: [12] [1290/4276] eta: 2:33:31 lr: 3.591871105956766e-05 loss: 0.1474 (0.1516) time: 3.0639 data: 0.0086 max mem: 33300 Epoch: [12] [1300/4276] eta: 2:33:00 lr: 3.59159815935705e-05 loss: 0.1329 (0.1515) time: 3.0889 data: 0.0081 max mem: 33300 Epoch: [12] [1310/4276] eta: 2:32:30 lr: 3.591325210452558e-05 loss: 0.1296 (0.1516) time: 3.1079 data: 0.0084 max mem: 33300 Epoch: [12] [1320/4276] eta: 2:31:59 lr: 3.5910522592430755e-05 loss: 0.1462 (0.1517) time: 
3.1012 data: 0.0082 max mem: 33300 Epoch: [12] [1330/4276] eta: 2:31:28 lr: 3.590779305728388e-05 loss: 0.1489 (0.1516) time: 3.0731 data: 0.0078 max mem: 33300 Epoch: [12] [1340/4276] eta: 2:30:57 lr: 3.590506349908281e-05 loss: 0.1411 (0.1516) time: 3.0724 data: 0.0080 max mem: 33300 Epoch: [12] [1350/4276] eta: 2:30:26 lr: 3.590233391782542e-05 loss: 0.1476 (0.1516) time: 3.0889 data: 0.0080 max mem: 33300 Epoch: [12] [1360/4276] eta: 2:29:55 lr: 3.5899604313509556e-05 loss: 0.1551 (0.1516) time: 3.0836 data: 0.0079 max mem: 33300 Epoch: [12] [1370/4276] eta: 2:29:24 lr: 3.5896874686133077e-05 loss: 0.1478 (0.1516) time: 3.0721 data: 0.0077 max mem: 33300 Epoch: [12] [1380/4276] eta: 2:28:54 lr: 3.589414503569384e-05 loss: 0.1495 (0.1517) time: 3.0902 data: 0.0078 max mem: 33300 Epoch: [12] [1390/4276] eta: 2:28:24 lr: 3.5891415362189704e-05 loss: 0.1618 (0.1518) time: 3.1183 data: 0.0082 max mem: 33300 Epoch: [12] [1400/4276] eta: 2:27:53 lr: 3.5888685665618514e-05 loss: 0.1578 (0.1517) time: 3.1099 data: 0.0084 max mem: 33300 Epoch: [12] [1410/4276] eta: 2:27:22 lr: 3.588595594597814e-05 loss: 0.1509 (0.1517) time: 3.0922 data: 0.0084 max mem: 33300 Epoch: [12] [1420/4276] eta: 2:26:51 lr: 3.588322620326643e-05 loss: 0.1411 (0.1517) time: 3.0845 data: 0.0085 max mem: 33300 Epoch: [12] [1430/4276] eta: 2:26:20 lr: 3.5880496437481234e-05 loss: 0.1403 (0.1517) time: 3.0722 data: 0.0087 max mem: 33300 Epoch: [12] [1440/4276] eta: 2:25:50 lr: 3.587776664862041e-05 loss: 0.1437 (0.1517) time: 3.0937 data: 0.0088 max mem: 33300 Epoch: [12] [1450/4276] eta: 2:25:19 lr: 3.587503683668182e-05 loss: 0.1502 (0.1517) time: 3.0961 data: 0.0085 max mem: 33300 Epoch: [12] [1460/4276] eta: 2:24:48 lr: 3.587230700166331e-05 loss: 0.1395 (0.1516) time: 3.0896 data: 0.0084 max mem: 33300 Epoch: [12] [1470/4276] eta: 2:24:17 lr: 3.586957714356274e-05 loss: 0.1347 (0.1515) time: 3.0886 data: 0.0085 max mem: 33300 Epoch: [12] [1480/4276] eta: 2:23:47 lr: 3.586684726237796e-05 loss: 
0.1333 (0.1514) time: 3.0966 data: 0.0082 max mem: 33300 Epoch: [12] [1490/4276] eta: 2:23:16 lr: 3.586411735810681e-05 loss: 0.1239 (0.1513) time: 3.1033 data: 0.0078 max mem: 33300 Epoch: [12] [1500/4276] eta: 2:22:46 lr: 3.586138743074715e-05 loss: 0.1422 (0.1513) time: 3.0943 data: 0.0080 max mem: 33300 Epoch: [12] [1510/4276] eta: 2:22:15 lr: 3.5858657480296845e-05 loss: 0.1461 (0.1513) time: 3.0840 data: 0.0080 max mem: 33300 Epoch: [12] [1520/4276] eta: 2:21:44 lr: 3.5855927506753734e-05 loss: 0.1434 (0.1512) time: 3.0731 data: 0.0085 max mem: 33300 Epoch: [12] [1530/4276] eta: 2:21:12 lr: 3.585319751011567e-05 loss: 0.1434 (0.1511) time: 3.0713 data: 0.0089 max mem: 33300 Epoch: [12] [1540/4276] eta: 2:20:42 lr: 3.5850467490380504e-05 loss: 0.1433 (0.1511) time: 3.0890 data: 0.0086 max mem: 33300 Epoch: [12] [1550/4276] eta: 2:20:11 lr: 3.584773744754609e-05 loss: 0.1478 (0.1512) time: 3.0999 data: 0.0087 max mem: 33300 Epoch: [12] [1560/4276] eta: 2:19:41 lr: 3.584500738161027e-05 loss: 0.1451 (0.1511) time: 3.0928 data: 0.0086 max mem: 33300 Epoch: [12] [1570/4276] eta: 2:19:10 lr: 3.584227729257089e-05 loss: 0.1463 (0.1512) time: 3.1043 data: 0.0081 max mem: 33300 Epoch: [12] [1580/4276] eta: 2:18:40 lr: 3.5839547180425814e-05 loss: 0.1471 (0.1511) time: 3.1166 data: 0.0080 max mem: 33300 Epoch: [12] [1590/4276] eta: 2:18:10 lr: 3.5836817045172886e-05 loss: 0.1482 (0.1511) time: 3.1350 data: 0.0084 max mem: 33300 Epoch: [12] [1600/4276] eta: 2:17:39 lr: 3.5834086886809945e-05 loss: 0.1524 (0.1511) time: 3.1192 data: 0.0083 max mem: 33300 Epoch: [12] [1610/4276] eta: 2:17:08 lr: 3.583135670533485e-05 loss: 0.1384 (0.1510) time: 3.0837 data: 0.0086 max mem: 33300 Epoch: [12] [1620/4276] eta: 2:16:37 lr: 3.5828626500745444e-05 loss: 0.1373 (0.1509) time: 3.0679 data: 0.0091 max mem: 33300 Epoch: [12] [1630/4276] eta: 2:16:07 lr: 3.582589627303958e-05 loss: 0.1373 (0.1509) time: 3.0895 data: 0.0091 max mem: 33300 Epoch: [12] [1640/4276] eta: 2:15:36 lr: 
3.582316602221509e-05 loss: 0.1372 (0.1508) time: 3.1060 data: 0.0088 max mem: 33300 Epoch: [12] [1650/4276] eta: 2:15:05 lr: 3.582043574826983e-05 loss: 0.1357 (0.1507) time: 3.0955 data: 0.0088 max mem: 33300 Epoch: [12] [1660/4276] eta: 2:14:35 lr: 3.581770545120165e-05 loss: 0.1357 (0.1507) time: 3.1124 data: 0.0089 max mem: 33300 Epoch: [12] [1670/4276] eta: 2:14:04 lr: 3.581497513100839e-05 loss: 0.1442 (0.1507) time: 3.1219 data: 0.0087 max mem: 33300 Epoch: [12] [1680/4276] eta: 2:13:34 lr: 3.58122447876879e-05 loss: 0.1537 (0.1508) time: 3.1158 data: 0.0086 max mem: 33300 Epoch: [12] [1690/4276] eta: 2:13:03 lr: 3.5809514421238024e-05 loss: 0.1486 (0.1507) time: 3.1072 data: 0.0086 max mem: 33300 Epoch: [12] [1700/4276] eta: 2:12:33 lr: 3.5806784031656596e-05 loss: 0.1445 (0.1508) time: 3.0954 data: 0.0084 max mem: 33300 Epoch: [12] [1710/4276] eta: 2:12:02 lr: 3.580405361894147e-05 loss: 0.1668 (0.1509) time: 3.0976 data: 0.0089 max mem: 33300 Epoch: [12] [1720/4276] eta: 2:11:32 lr: 3.58013231830905e-05 loss: 0.1668 (0.1510) time: 3.1184 data: 0.0091 max mem: 33300 Epoch: [12] [1730/4276] eta: 2:11:01 lr: 3.57985927241015e-05 loss: 0.1648 (0.1510) time: 3.1235 data: 0.0082 max mem: 33300 Epoch: [12] [1740/4276] eta: 2:10:30 lr: 3.579586224197234e-05 loss: 0.1536 (0.1511) time: 3.1035 data: 0.0080 max mem: 33300 Epoch: [12] [1750/4276] eta: 2:09:59 lr: 3.5793131736700854e-05 loss: 0.1500 (0.1510) time: 3.0891 data: 0.0084 max mem: 33300 Epoch: [12] [1760/4276] eta: 2:09:29 lr: 3.579040120828488e-05 loss: 0.1288 (0.1509) time: 3.0979 data: 0.0085 max mem: 33300 Epoch: [12] [1770/4276] eta: 2:08:58 lr: 3.578767065672226e-05 loss: 0.1357 (0.1509) time: 3.1002 data: 0.0079 max mem: 33300 Epoch: [12] [1780/4276] eta: 2:08:27 lr: 3.578494008201085e-05 loss: 0.1464 (0.1509) time: 3.0923 data: 0.0078 max mem: 33300 Epoch: [12] [1790/4276] eta: 2:07:57 lr: 3.578220948414848e-05 loss: 0.1430 (0.1508) time: 3.1045 data: 0.0080 max mem: 33300 Epoch: [12] [1800/4276] 
eta: 2:07:26 lr: 3.577947886313298e-05 loss: 0.1387 (0.1509) time: 3.1000 data: 0.0080 max mem: 33300 Epoch: [12] [1810/4276] eta: 2:06:55 lr: 3.57767482189622e-05 loss: 0.1661 (0.1510) time: 3.1072 data: 0.0078 max mem: 33300 Epoch: [12] [1820/4276] eta: 2:06:25 lr: 3.577401755163399e-05 loss: 0.1563 (0.1509) time: 3.1102 data: 0.0083 max mem: 33300 Epoch: [12] [1830/4276] eta: 2:05:54 lr: 3.577128686114617e-05 loss: 0.1481 (0.1509) time: 3.0922 data: 0.0088 max mem: 33300 Epoch: [12] [1840/4276] eta: 2:05:23 lr: 3.576855614749659e-05 loss: 0.1481 (0.1508) time: 3.0935 data: 0.0082 max mem: 33300 Epoch: [12] [1850/4276] eta: 2:04:52 lr: 3.576582541068311e-05 loss: 0.1510 (0.1509) time: 3.1028 data: 0.0078 max mem: 33300 Epoch: [12] [1860/4276] eta: 2:04:22 lr: 3.5763094650703524e-05 loss: 0.1479 (0.1509) time: 3.1265 data: 0.0079 max mem: 33300 Epoch: [12] [1870/4276] eta: 2:03:51 lr: 3.5760363867555705e-05 loss: 0.1466 (0.1510) time: 3.1140 data: 0.0086 max mem: 33300 Epoch: [12] [1880/4276] eta: 2:03:20 lr: 3.5757633061237475e-05 loss: 0.1561 (0.1510) time: 3.0879 data: 0.0091 max mem: 33300 Epoch: [12] [1890/4276] eta: 2:02:50 lr: 3.575490223174667e-05 loss: 0.1494 (0.1511) time: 3.0919 data: 0.0087 max mem: 33300 Epoch: [12] [1900/4276] eta: 2:02:19 lr: 3.575217137908113e-05 loss: 0.1459 (0.1511) time: 3.0998 data: 0.0087 max mem: 33300 Epoch: [12] [1910/4276] eta: 2:01:48 lr: 3.5749440503238696e-05 loss: 0.1552 (0.1512) time: 3.1149 data: 0.0092 max mem: 33300 Epoch: [12] [1920/4276] eta: 2:01:18 lr: 3.57467096042172e-05 loss: 0.1552 (0.1511) time: 3.1103 data: 0.0095 max mem: 33300 Epoch: [12] [1930/4276] eta: 2:00:47 lr: 3.5743978682014485e-05 loss: 0.1408 (0.1511) time: 3.1034 data: 0.0097 max mem: 33300 Epoch: [12] [1940/4276] eta: 2:00:16 lr: 3.574124773662838e-05 loss: 0.1413 (0.1511) time: 3.1049 data: 0.0093 max mem: 33300 Epoch: [12] [1950/4276] eta: 1:59:46 lr: 3.5738516768056714e-05 loss: 0.1458 (0.1511) time: 3.1116 data: 0.0090 max mem: 33300 
Epoch: [12] [1960/4276] eta: 1:59:15 lr: 3.5735785776297324e-05 loss: 0.1422 (0.1511) time: 3.1186 data: 0.0085 max mem: 33300 Epoch: [12] [1970/4276] eta: 1:58:44 lr: 3.5733054761348056e-05 loss: 0.1329 (0.1510) time: 3.1012 data: 0.0078 max mem: 33300 Epoch: [12] [1980/4276] eta: 1:58:13 lr: 3.5730323723206734e-05 loss: 0.1456 (0.1509) time: 3.0874 data: 0.0078 max mem: 33300 Epoch: [12] [1990/4276] eta: 1:57:43 lr: 3.5727592661871184e-05 loss: 0.1352 (0.1508) time: 3.1030 data: 0.0081 max mem: 33300 Epoch: [12] [2000/4276] eta: 1:57:12 lr: 3.572486157733926e-05 loss: 0.1441 (0.1509) time: 3.1189 data: 0.0081 max mem: 33300 Epoch: [12] [2010/4276] eta: 1:56:41 lr: 3.572213046960877e-05 loss: 0.1531 (0.1509) time: 3.1056 data: 0.0076 max mem: 33300 Epoch: [12] [2020/4276] eta: 1:56:10 lr: 3.571939933867757e-05 loss: 0.1613 (0.1510) time: 3.0884 data: 0.0076 max mem: 33300 Epoch: [12] [2030/4276] eta: 1:55:40 lr: 3.571666818454348e-05 loss: 0.1505 (0.1509) time: 3.0894 data: 0.0080 max mem: 33300 Epoch: [12] [2040/4276] eta: 1:55:09 lr: 3.5713937007204325e-05 loss: 0.1340 (0.1509) time: 3.1021 data: 0.0080 max mem: 33300 Epoch: [12] [2050/4276] eta: 1:54:38 lr: 3.571120580665794e-05 loss: 0.1439 (0.1509) time: 3.1184 data: 0.0078 max mem: 33300 Epoch: [12] [2060/4276] eta: 1:54:08 lr: 3.570847458290216e-05 loss: 0.1417 (0.1508) time: 3.1138 data: 0.0079 max mem: 33300 Epoch: [12] [2070/4276] eta: 1:53:37 lr: 3.5705743335934816e-05 loss: 0.1348 (0.1508) time: 3.0976 data: 0.0079 max mem: 33300 Epoch: [12] [2080/4276] eta: 1:53:06 lr: 3.5703012065753744e-05 loss: 0.1365 (0.1508) time: 3.0911 data: 0.0085 max mem: 33300 Epoch: [12] [2090/4276] eta: 1:52:35 lr: 3.5700280772356754e-05 loss: 0.1492 (0.1508) time: 3.1025 data: 0.0084 max mem: 33300 Epoch: [12] [2100/4276] eta: 1:52:04 lr: 3.5697549455741695e-05 loss: 0.1492 (0.1508) time: 3.1009 data: 0.0077 max mem: 33300 Epoch: [12] [2110/4276] eta: 1:51:33 lr: 3.569481811590638e-05 loss: 0.1389 (0.1507) time: 3.0970 
data: 0.0077 max mem: 33300 Epoch: [12] [2120/4276] eta: 1:51:02 lr: 3.569208675284865e-05 loss: 0.1280 (0.1506) time: 3.0934 data: 0.0081 max mem: 33300 Epoch: [12] [2130/4276] eta: 1:50:32 lr: 3.5689355366566316e-05 loss: 0.1282 (0.1506) time: 3.1049 data: 0.0084 max mem: 33300 Epoch: [12] [2140/4276] eta: 1:50:01 lr: 3.568662395705722e-05 loss: 0.1404 (0.1505) time: 3.1224 data: 0.0083 max mem: 33300 Epoch: [12] [2150/4276] eta: 1:49:30 lr: 3.56838925243192e-05 loss: 0.1404 (0.1505) time: 3.1052 data: 0.0079 max mem: 33300 Epoch: [12] [2160/4276] eta: 1:48:59 lr: 3.5681161068350056e-05 loss: 0.1368 (0.1505) time: 3.0832 data: 0.0076 max mem: 33300 Epoch: [12] [2170/4276] eta: 1:48:28 lr: 3.5678429589147634e-05 loss: 0.1368 (0.1505) time: 3.0653 data: 0.0084 max mem: 33300 Epoch: [12] [2180/4276] eta: 1:47:57 lr: 3.5675698086709756e-05 loss: 0.1468 (0.1506) time: 3.0625 data: 0.0089 max mem: 33300 Epoch: [12] [2190/4276] eta: 1:47:26 lr: 3.5672966561034233e-05 loss: 0.1633 (0.1506) time: 3.0580 data: 0.0083 max mem: 33300 Epoch: [12] [2200/4276] eta: 1:46:54 lr: 3.5670235012118905e-05 loss: 0.1631 (0.1506) time: 3.0474 data: 0.0080 max mem: 33300 Epoch: [12] [2210/4276] eta: 1:46:23 lr: 3.5667503439961596e-05 loss: 0.1533 (0.1506) time: 3.0360 data: 0.0078 max mem: 33300 Epoch: [12] [2220/4276] eta: 1:45:52 lr: 3.566477184456013e-05 loss: 0.1599 (0.1507) time: 3.0389 data: 0.0078 max mem: 33300 Epoch: [12] [2230/4276] eta: 1:45:21 lr: 3.566204022591233e-05 loss: 0.1489 (0.1507) time: 3.0579 data: 0.0080 max mem: 33300 Epoch: [12] [2240/4276] eta: 1:44:49 lr: 3.565930858401602e-05 loss: 0.1313 (0.1506) time: 3.0673 data: 0.0078 max mem: 33300 Epoch: [12] [2250/4276] eta: 1:44:18 lr: 3.565657691886902e-05 loss: 0.1303 (0.1505) time: 3.0677 data: 0.0078 max mem: 33300 Epoch: [12] [2260/4276] eta: 1:43:48 lr: 3.565384523046916e-05 loss: 0.1394 (0.1507) time: 3.0917 data: 0.0081 max mem: 33300 Epoch: [12] [2270/4276] eta: 1:43:17 lr: 3.565111351881425e-05 loss: 0.1394 
(0.1507) time: 3.1133 data: 0.0085 max mem: 33300 Epoch: [12] [2280/4276] eta: 1:42:46 lr: 3.5648381783902115e-05 loss: 0.1377 (0.1507) time: 3.0917 data: 0.0084 max mem: 33300 Epoch: [12] [2290/4276] eta: 1:42:15 lr: 3.564565002573059e-05 loss: 0.1482 (0.1507) time: 3.0532 data: 0.0084 max mem: 33300 Epoch: [12] [2300/4276] eta: 1:41:43 lr: 3.564291824429749e-05 loss: 0.1482 (0.1506) time: 3.0314 data: 0.0085 max mem: 33300 Epoch: [12] [2310/4276] eta: 1:41:12 lr: 3.564018643960064e-05 loss: 0.1427 (0.1507) time: 3.0413 data: 0.0084 max mem: 33300 Epoch: [12] [2320/4276] eta: 1:40:41 lr: 3.563745461163784e-05 loss: 0.1427 (0.1507) time: 3.0811 data: 0.0083 max mem: 33300 Epoch: [12] [2330/4276] eta: 1:40:11 lr: 3.563472276040694e-05 loss: 0.1480 (0.1507) time: 3.1275 data: 0.0081 max mem: 33300 Epoch: [12] [2340/4276] eta: 1:39:40 lr: 3.5631990885905735e-05 loss: 0.1480 (0.1507) time: 3.1076 data: 0.0082 max mem: 33300 Epoch: [12] [2350/4276] eta: 1:39:08 lr: 3.562925898813206e-05 loss: 0.1430 (0.1507) time: 3.0514 data: 0.0086 max mem: 33300 Epoch: [12] [2360/4276] eta: 1:38:38 lr: 3.562652706708372e-05 loss: 0.1413 (0.1507) time: 3.0613 data: 0.0086 max mem: 33300 Epoch: [12] [2370/4276] eta: 1:38:07 lr: 3.562379512275854e-05 loss: 0.1438 (0.1507) time: 3.1045 data: 0.0080 max mem: 33300 Epoch: [12] [2380/4276] eta: 1:37:36 lr: 3.562106315515434e-05 loss: 0.1495 (0.1507) time: 3.0981 data: 0.0078 max mem: 33300 Epoch: [12] [2390/4276] eta: 1:37:05 lr: 3.561833116426895e-05 loss: 0.1384 (0.1506) time: 3.0767 data: 0.0082 max mem: 33300 Epoch: [12] [2400/4276] eta: 1:36:34 lr: 3.561559915010017e-05 loss: 0.1461 (0.1506) time: 3.0916 data: 0.0089 max mem: 33300 Epoch: [12] [2410/4276] eta: 1:36:03 lr: 3.561286711264582e-05 loss: 0.1410 (0.1506) time: 3.0951 data: 0.0089 max mem: 33300 Epoch: [12] [2420/4276] eta: 1:35:32 lr: 3.561013505190372e-05 loss: 0.1364 (0.1505) time: 3.0780 data: 0.0087 max mem: 33300 Epoch: [12] [2430/4276] eta: 1:35:01 lr: 
3.560740296787168e-05 loss: 0.1552 (0.1506) time: 3.0546 data: 0.0083 max mem: 33300 Epoch: [12] [2440/4276] eta: 1:34:30 lr: 3.560467086054753e-05 loss: 0.1477 (0.1506) time: 3.0502 data: 0.0084 max mem: 33300 Epoch: [12] [2450/4276] eta: 1:33:59 lr: 3.560193872992907e-05 loss: 0.1435 (0.1506) time: 3.0723 data: 0.0085 max mem: 33300 Epoch: [12] [2460/4276] eta: 1:33:28 lr: 3.559920657601412e-05 loss: 0.1435 (0.1506) time: 3.0968 data: 0.0082 max mem: 33300 Epoch: [12] [2470/4276] eta: 1:32:58 lr: 3.5596474398800506e-05 loss: 0.1574 (0.1507) time: 3.1115 data: 0.0082 max mem: 33300 Epoch: [12] [2480/4276] eta: 1:32:27 lr: 3.559374219828603e-05 loss: 0.1574 (0.1507) time: 3.0977 data: 0.0081 max mem: 33300 Epoch: [12] [2490/4276] eta: 1:31:56 lr: 3.55910099744685e-05 loss: 0.1428 (0.1507) time: 3.0878 data: 0.0077 max mem: 33300 Epoch: [12] [2500/4276] eta: 1:31:25 lr: 3.5588277727345745e-05 loss: 0.1511 (0.1507) time: 3.0958 data: 0.0076 max mem: 33300 Epoch: [12] [2510/4276] eta: 1:30:54 lr: 3.558554545691556e-05 loss: 0.1500 (0.1507) time: 3.1168 data: 0.0079 max mem: 33300 Epoch: [12] [2520/4276] eta: 1:30:24 lr: 3.5582813163175784e-05 loss: 0.1389 (0.1506) time: 3.1216 data: 0.0080 max mem: 33300 Epoch: [12] [2530/4276] eta: 1:29:53 lr: 3.55800808461242e-05 loss: 0.1171 (0.1505) time: 3.1051 data: 0.0078 max mem: 33300 Epoch: [12] [2540/4276] eta: 1:29:22 lr: 3.557734850575865e-05 loss: 0.1230 (0.1505) time: 3.0948 data: 0.0079 max mem: 33300 Epoch: [12] [2550/4276] eta: 1:28:51 lr: 3.557461614207692e-05 loss: 0.1435 (0.1504) time: 3.0916 data: 0.0085 max mem: 33300 Epoch: [12] [2560/4276] eta: 1:28:20 lr: 3.557188375507684e-05 loss: 0.1303 (0.1503) time: 3.0759 data: 0.0095 max mem: 33300 Epoch: [12] [2570/4276] eta: 1:27:49 lr: 3.5569151344756203e-05 loss: 0.1208 (0.1503) time: 3.0475 data: 0.0099 max mem: 33300 Epoch: [12] [2580/4276] eta: 1:27:18 lr: 3.5566418911112834e-05 loss: 0.1311 (0.1503) time: 3.0574 data: 0.0086 max mem: 33300 Epoch: [12] 
[2590/4276] eta: 1:26:47 lr: 3.556368645414453e-05 loss: 0.1316 (0.1502) time: 3.0876 data: 0.0084 max mem: 33300 Epoch: [12] [2600/4276] eta: 1:26:16 lr: 3.5560953973849106e-05 loss: 0.1400 (0.1502) time: 3.1153 data: 0.0094 max mem: 33300 Epoch: [12] [2610/4276] eta: 1:25:45 lr: 3.555822147022438e-05 loss: 0.1362 (0.1501) time: 3.0997 data: 0.0090 max mem: 33300 Epoch: [12] [2620/4276] eta: 1:25:14 lr: 3.5555488943268164e-05 loss: 0.1415 (0.1502) time: 3.0750 data: 0.0082 max mem: 33300 Epoch: [12] [2630/4276] eta: 1:24:43 lr: 3.5552756392978235e-05 loss: 0.1391 (0.1501) time: 3.0707 data: 0.0081 max mem: 33300 Epoch: [12] [2640/4276] eta: 1:24:12 lr: 3.555002381935244e-05 loss: 0.1275 (0.1501) time: 3.0636 data: 0.0085 max mem: 33300 Epoch: [12] [2650/4276] eta: 1:23:41 lr: 3.554729122238856e-05 loss: 0.1364 (0.1501) time: 3.0789 data: 0.0089 max mem: 33300 Epoch: [12] [2660/4276] eta: 1:23:10 lr: 3.554455860208441e-05 loss: 0.1502 (0.1501) time: 3.0685 data: 0.0087 max mem: 33300 Epoch: [12] [2670/4276] eta: 1:22:40 lr: 3.5541825958437805e-05 loss: 0.1434 (0.1501) time: 3.0855 data: 0.0084 max mem: 33300 Epoch: [12] [2680/4276] eta: 1:22:09 lr: 3.5539093291446535e-05 loss: 0.1472 (0.1501) time: 3.1012 data: 0.0080 max mem: 33300 Epoch: [12] [2690/4276] eta: 1:21:38 lr: 3.553636060110842e-05 loss: 0.1480 (0.1501) time: 3.0941 data: 0.0076 max mem: 33300 Epoch: [12] [2700/4276] eta: 1:21:07 lr: 3.553362788742127e-05 loss: 0.1325 (0.1500) time: 3.1218 data: 0.0080 max mem: 33300 Epoch: [12] [2710/4276] eta: 1:20:36 lr: 3.553089515038287e-05 loss: 0.1325 (0.1500) time: 3.0904 data: 0.0082 max mem: 33300 Epoch: [12] [2720/4276] eta: 1:20:05 lr: 3.552816238999105e-05 loss: 0.1357 (0.1500) time: 3.0645 data: 0.0079 max mem: 33300 Epoch: [12] [2730/4276] eta: 1:19:34 lr: 3.552542960624359e-05 loss: 0.1492 (0.1500) time: 3.0870 data: 0.0075 max mem: 33300 Epoch: [12] [2740/4276] eta: 1:19:04 lr: 3.55226967991383e-05 loss: 0.1546 (0.1500) time: 3.1155 data: 0.0076 max 
mem: 33300 Epoch: [12] [2750/4276] eta: 1:18:33 lr: 3.5519963968672995e-05 loss: 0.1574 (0.1500) time: 3.1157 data: 0.0080 max mem: 33300 Epoch: [12] [2760/4276] eta: 1:18:02 lr: 3.551723111484548e-05 loss: 0.1425 (0.1500) time: 3.0858 data: 0.0082 max mem: 33300 Epoch: [12] [2770/4276] eta: 1:17:31 lr: 3.551449823765354e-05 loss: 0.1385 (0.1500) time: 3.0802 data: 0.0082 max mem: 33300 Epoch: [12] [2780/4276] eta: 1:17:00 lr: 3.5511765337094995e-05 loss: 0.1385 (0.1500) time: 3.0848 data: 0.0085 max mem: 33300 Epoch: [12] [2790/4276] eta: 1:16:30 lr: 3.5509032413167645e-05 loss: 0.1551 (0.1500) time: 3.1167 data: 0.0089 max mem: 33300 Epoch: [12] [2800/4276] eta: 1:15:59 lr: 3.550629946586927e-05 loss: 0.1474 (0.1500) time: 3.1242 data: 0.0085 max mem: 33300 Epoch: [12] [2810/4276] eta: 1:15:28 lr: 3.55035664951977e-05 loss: 0.1259 (0.1499) time: 3.1013 data: 0.0079 max mem: 33300 Epoch: [12] [2820/4276] eta: 1:14:57 lr: 3.550083350115072e-05 loss: 0.1259 (0.1498) time: 3.0906 data: 0.0078 max mem: 33300 Epoch: [12] [2830/4276] eta: 1:14:26 lr: 3.549810048372613e-05 loss: 0.1366 (0.1498) time: 3.0961 data: 0.0079 max mem: 33300 Epoch: [12] [2840/4276] eta: 1:13:55 lr: 3.5495367442921736e-05 loss: 0.1430 (0.1498) time: 3.1135 data: 0.0082 max mem: 33300 Epoch: [12] [2850/4276] eta: 1:13:25 lr: 3.549263437873534e-05 loss: 0.1661 (0.1499) time: 3.1046 data: 0.0082 max mem: 33300 Epoch: [12] [2860/4276] eta: 1:12:54 lr: 3.548990129116473e-05 loss: 0.1532 (0.1499) time: 3.0880 data: 0.0082 max mem: 33300 Epoch: [12] [2870/4276] eta: 1:12:23 lr: 3.548716818020773e-05 loss: 0.1417 (0.1498) time: 3.0980 data: 0.0085 max mem: 33300 Epoch: [12] [2880/4276] eta: 1:11:52 lr: 3.548443504586211e-05 loss: 0.1454 (0.1498) time: 3.1241 data: 0.0084 max mem: 33300 Epoch: [12] [2890/4276] eta: 1:11:21 lr: 3.548170188812567e-05 loss: 0.1397 (0.1498) time: 3.1241 data: 0.0076 max mem: 33300 Epoch: [12] [2900/4276] eta: 1:10:51 lr: 3.547896870699623e-05 loss: 0.1366 (0.1498) time: 
3.0979 data: 0.0073 max mem: 33300 Epoch: [12] [2910/4276] eta: 1:10:20 lr: 3.547623550247157e-05 loss: 0.1315 (0.1497) time: 3.0884 data: 0.0074 max mem: 33300 Epoch: [12] [2920/4276] eta: 1:09:49 lr: 3.547350227454949e-05 loss: 0.1359 (0.1497) time: 3.0701 data: 0.0087 max mem: 33300 Epoch: [12] [2930/4276] eta: 1:09:18 lr: 3.54707690232278e-05 loss: 0.1393 (0.1497) time: 3.0527 data: 0.0097 max mem: 33300 Epoch: [12] [2940/4276] eta: 1:08:46 lr: 3.5468035748504275e-05 loss: 0.1371 (0.1496) time: 3.0513 data: 0.0091 max mem: 33300 Epoch: [12] [2950/4276] eta: 1:08:16 lr: 3.546530245037673e-05 loss: 0.1360 (0.1496) time: 3.0663 data: 0.0084 max mem: 33300 Epoch: [12] [2960/4276] eta: 1:07:45 lr: 3.546256912884293e-05 loss: 0.1356 (0.1496) time: 3.0854 data: 0.0079 max mem: 33300 Epoch: [12] [2970/4276] eta: 1:07:14 lr: 3.54598357839007e-05 loss: 0.1361 (0.1497) time: 3.1099 data: 0.0079 max mem: 33300 Epoch: [12] [2980/4276] eta: 1:06:43 lr: 3.5457102415547834e-05 loss: 0.1505 (0.1497) time: 3.1315 data: 0.0082 max mem: 33300 Epoch: [12] [2990/4276] eta: 1:06:12 lr: 3.545436902378211e-05 loss: 0.1416 (0.1496) time: 3.1100 data: 0.0081 max mem: 33300 Epoch: [12] [3000/4276] eta: 1:05:41 lr: 3.5451635608601325e-05 loss: 0.1415 (0.1496) time: 3.0924 data: 0.0080 max mem: 33300 Epoch: [12] [3010/4276] eta: 1:05:11 lr: 3.5448902170003284e-05 loss: 0.1415 (0.1496) time: 3.0992 data: 0.0079 max mem: 33300 Epoch: [12] [3020/4276] eta: 1:04:40 lr: 3.544616870798577e-05 loss: 0.1445 (0.1496) time: 3.1121 data: 0.0076 max mem: 33300 Epoch: [12] [3030/4276] eta: 1:04:09 lr: 3.5443435222546586e-05 loss: 0.1445 (0.1496) time: 3.1014 data: 0.0079 max mem: 33300 Epoch: [12] [3040/4276] eta: 1:03:38 lr: 3.5440701713683504e-05 loss: 0.1457 (0.1497) time: 3.0828 data: 0.0083 max mem: 33300 Epoch: [12] [3050/4276] eta: 1:03:07 lr: 3.543796818139433e-05 loss: 0.1440 (0.1496) time: 3.0860 data: 0.0084 max mem: 33300 Epoch: [12] [3060/4276] eta: 1:02:36 lr: 3.543523462567686e-05 loss: 
0.1225 (0.1496) time: 3.0972 data: 0.0083 max mem: 33300 Epoch: [12] [3070/4276] eta: 1:02:06 lr: 3.5432501046528874e-05 loss: 0.1358 (0.1496) time: 3.1446 data: 0.0085 max mem: 33300 Epoch: [12] [3080/4276] eta: 1:01:35 lr: 3.542976744394817e-05 loss: 0.1423 (0.1495) time: 3.1480 data: 0.0090 max mem: 33300 Epoch: [12] [3090/4276] eta: 1:01:04 lr: 3.542703381793254e-05 loss: 0.1308 (0.1495) time: 3.1069 data: 0.0086 max mem: 33300 Epoch: [12] [3100/4276] eta: 1:00:33 lr: 3.5424300168479765e-05 loss: 0.1395 (0.1495) time: 3.0874 data: 0.0079 max mem: 33300 Epoch: [12] [3110/4276] eta: 1:00:02 lr: 3.542156649558764e-05 loss: 0.1393 (0.1494) time: 3.0904 data: 0.0080 max mem: 33300 Epoch: [12] [3120/4276] eta: 0:59:31 lr: 3.5418832799253955e-05 loss: 0.1264 (0.1494) time: 3.0774 data: 0.0078 max mem: 33300 Epoch: [12] [3130/4276] eta: 0:59:00 lr: 3.54160990794765e-05 loss: 0.1325 (0.1494) time: 3.0468 data: 0.0086 max mem: 33300 Epoch: [12] [3140/4276] eta: 0:58:29 lr: 3.5413365336253045e-05 loss: 0.1376 (0.1493) time: 3.0476 data: 0.0092 max mem: 33300 Epoch: [12] [3150/4276] eta: 0:57:58 lr: 3.5410631569581404e-05 loss: 0.1440 (0.1494) time: 3.0881 data: 0.0085 max mem: 33300 Epoch: [12] [3160/4276] eta: 0:57:28 lr: 3.540789777945935e-05 loss: 0.1578 (0.1494) time: 3.1538 data: 0.0080 max mem: 33300 Epoch: [12] [3170/4276] eta: 0:56:57 lr: 3.5405163965884683e-05 loss: 0.1578 (0.1494) time: 3.1340 data: 0.0074 max mem: 33300 Epoch: [12] [3180/4276] eta: 0:56:26 lr: 3.540243012885517e-05 loss: 0.1599 (0.1494) time: 3.0873 data: 0.0070 max mem: 33300 Epoch: [12] [3190/4276] eta: 0:55:55 lr: 3.5399696268368615e-05 loss: 0.1457 (0.1493) time: 3.0745 data: 0.0071 max mem: 33300 Epoch: [12] [3200/4276] eta: 0:55:24 lr: 3.539696238442279e-05 loss: 0.1400 (0.1493) time: 3.0664 data: 0.0072 max mem: 33300 Epoch: [12] [3210/4276] eta: 0:54:53 lr: 3.5394228477015485e-05 loss: 0.1422 (0.1494) time: 3.0738 data: 0.0081 max mem: 33300 Epoch: [12] [3220/4276] eta: 0:54:22 lr: 
3.539149454614449e-05 loss: 0.1485 (0.1494) time: 3.0890 data: 0.0090 max mem: 33300 Epoch: [12] [3230/4276] eta: 0:53:51 lr: 3.538876059180759e-05 loss: 0.1485 (0.1494) time: 3.0755 data: 0.0089 max mem: 33300 Epoch: [12] [3240/4276] eta: 0:53:20 lr: 3.538602661400256e-05 loss: 0.1572 (0.1494) time: 3.0451 data: 0.0082 max mem: 33300 Epoch: [12] [3250/4276] eta: 0:52:49 lr: 3.538329261272719e-05 loss: 0.1507 (0.1494) time: 3.0387 data: 0.0075 max mem: 33300 Epoch: [12] [3260/4276] eta: 0:52:18 lr: 3.538055858797927e-05 loss: 0.1396 (0.1494) time: 3.0604 data: 0.0073 max mem: 33300 Epoch: [12] [3270/4276] eta: 0:51:47 lr: 3.5377824539756564e-05 loss: 0.1518 (0.1494) time: 3.0823 data: 0.0073 max mem: 33300 Epoch: [12] [3280/4276] eta: 0:51:16 lr: 3.537509046805687e-05 loss: 0.1512 (0.1495) time: 3.0696 data: 0.0070 max mem: 33300 Epoch: [12] [3290/4276] eta: 0:50:45 lr: 3.537235637287797e-05 loss: 0.1528 (0.1495) time: 3.0504 data: 0.0070 max mem: 33300 Epoch: [12] [3300/4276] eta: 0:50:14 lr: 3.536962225421765e-05 loss: 0.1538 (0.1495) time: 3.0773 data: 0.0072 max mem: 33300 Epoch: [12] [3310/4276] eta: 0:49:44 lr: 3.5366888112073676e-05 loss: 0.1537 (0.1495) time: 3.0993 data: 0.0072 max mem: 33300 Epoch: [12] [3320/4276] eta: 0:49:13 lr: 3.536415394644384e-05 loss: 0.1601 (0.1496) time: 3.0993 data: 0.0070 max mem: 33300 Epoch: [12] [3330/4276] eta: 0:48:42 lr: 3.536141975732592e-05 loss: 0.1356 (0.1496) time: 3.0945 data: 0.0072 max mem: 33300 Epoch: [12] [3340/4276] eta: 0:48:11 lr: 3.53586855447177e-05 loss: 0.1357 (0.1495) time: 3.0617 data: 0.0071 max mem: 33300 Epoch: [12] [3350/4276] eta: 0:47:40 lr: 3.535595130861695e-05 loss: 0.1365 (0.1495) time: 3.0303 data: 0.0071 max mem: 33300 Epoch: [12] [3360/4276] eta: 0:47:09 lr: 3.535321704902146e-05 loss: 0.1402 (0.1495) time: 3.0260 data: 0.0071 max mem: 33300 Epoch: [12] [3370/4276] eta: 0:46:38 lr: 3.5350482765929e-05 loss: 0.1483 (0.1496) time: 3.0753 data: 0.0072 max mem: 33300 Epoch: [12] [3380/4276] 
eta: 0:46:07 lr: 3.534774845933735e-05 loss: 0.1540 (0.1496) time: 3.0970 data: 0.0070 max mem: 33300 Epoch: [12] [3390/4276] eta: 0:45:36 lr: 3.5345014129244305e-05 loss: 0.1621 (0.1496) time: 3.0458 data: 0.0067 max mem: 33300 Epoch: [12] [3400/4276] eta: 0:45:05 lr: 3.5342279775647624e-05 loss: 0.1576 (0.1496) time: 3.0127 data: 0.0068 max mem: 33300 Epoch: [12] [3410/4276] eta: 0:44:34 lr: 3.5339545398545094e-05 loss: 0.1576 (0.1497) time: 3.0197 data: 0.0071 max mem: 33300 Epoch: [12] [3420/4276] eta: 0:44:03 lr: 3.533681099793448e-05 loss: 0.1606 (0.1497) time: 3.0284 data: 0.0070 max mem: 33300 Epoch: [12] [3430/4276] eta: 0:43:32 lr: 3.5334076573813565e-05 loss: 0.1604 (0.1498) time: 3.0344 data: 0.0069 max mem: 33300 Epoch: [12] [3440/4276] eta: 0:43:01 lr: 3.533134212618013e-05 loss: 0.1471 (0.1497) time: 3.0679 data: 0.0073 max mem: 33300 Epoch: [12] [3450/4276] eta: 0:42:30 lr: 3.5328607655031956e-05 loss: 0.1534 (0.1498) time: 3.1008 data: 0.0075 max mem: 33300 Epoch: [12] [3460/4276] eta: 0:41:59 lr: 3.53258731603668e-05 loss: 0.1586 (0.1498) time: 3.0964 data: 0.0072 max mem: 33300 Epoch: [12] [3470/4276] eta: 0:41:28 lr: 3.532313864218246e-05 loss: 0.1374 (0.1498) time: 3.0862 data: 0.0070 max mem: 33300 Epoch: [12] [3480/4276] eta: 0:40:57 lr: 3.532040410047669e-05 loss: 0.1533 (0.1498) time: 3.0780 data: 0.0071 max mem: 33300 Epoch: [12] [3490/4276] eta: 0:40:26 lr: 3.531766953524727e-05 loss: 0.1590 (0.1498) time: 3.0442 data: 0.0072 max mem: 33300 Epoch: [12] [3500/4276] eta: 0:39:55 lr: 3.531493494649198e-05 loss: 0.1551 (0.1498) time: 3.0409 data: 0.0072 max mem: 33300 Epoch: [12] [3510/4276] eta: 0:39:24 lr: 3.531220033420858e-05 loss: 0.1397 (0.1498) time: 3.0701 data: 0.0072 max mem: 33300 Epoch: [12] [3520/4276] eta: 0:38:54 lr: 3.530946569839486e-05 loss: 0.1401 (0.1498) time: 3.1073 data: 0.0073 max mem: 33300 Epoch: [12] [3530/4276] eta: 0:38:23 lr: 3.530673103904859e-05 loss: 0.1499 (0.1498) time: 3.1269 data: 0.0073 max mem: 33300 
Epoch: [12] [3540/4276] eta: 0:37:52 lr: 3.530399635616753e-05 loss: 0.1443 (0.1498) time: 3.1023 data: 0.0073 max mem: 33300 Epoch: [12] [3550/4276] eta: 0:37:21 lr: 3.530126164974947e-05 loss: 0.1443 (0.1498) time: 3.1038 data: 0.0074 max mem: 33300 Epoch: [12] [3560/4276] eta: 0:36:50 lr: 3.5298526919792155e-05 loss: 0.1474 (0.1498) time: 3.1138 data: 0.0073 max mem: 33300 Epoch: [12] [3570/4276] eta: 0:36:19 lr: 3.5295792166293384e-05 loss: 0.1598 (0.1498) time: 3.0982 data: 0.0071 max mem: 33300 Epoch: [12] [3580/4276] eta: 0:35:49 lr: 3.529305738925091e-05 loss: 0.1413 (0.1498) time: 3.1017 data: 0.0070 max mem: 33300 Epoch: [12] [3590/4276] eta: 0:35:18 lr: 3.529032258866251e-05 loss: 0.1330 (0.1498) time: 3.1161 data: 0.0071 max mem: 33300 Epoch: [12] [3600/4276] eta: 0:34:47 lr: 3.528758776452596e-05 loss: 0.1455 (0.1498) time: 3.1168 data: 0.0072 max mem: 33300 Epoch: [12] [3610/4276] eta: 0:34:16 lr: 3.5284852916839006e-05 loss: 0.1484 (0.1498) time: 3.1208 data: 0.0069 max mem: 33300 Epoch: [12] [3620/4276] eta: 0:33:45 lr: 3.5282118045599444e-05 loss: 0.1499 (0.1497) time: 3.0970 data: 0.0067 max mem: 33300 Epoch: [12] [3630/4276] eta: 0:33:14 lr: 3.527938315080503e-05 loss: 0.1395 (0.1498) time: 3.0721 data: 0.0070 max mem: 33300 Epoch: [12] [3640/4276] eta: 0:32:43 lr: 3.5276648232453534e-05 loss: 0.1323 (0.1497) time: 3.0649 data: 0.0072 max mem: 33300 Epoch: [12] [3650/4276] eta: 0:32:13 lr: 3.5273913290542726e-05 loss: 0.1267 (0.1497) time: 3.0710 data: 0.0072 max mem: 33300 Epoch: [12] [3660/4276] eta: 0:31:42 lr: 3.527117832507036e-05 loss: 0.1267 (0.1497) time: 3.0890 data: 0.0072 max mem: 33300 Epoch: [12] [3670/4276] eta: 0:31:11 lr: 3.526844333603423e-05 loss: 0.1443 (0.1497) time: 3.1118 data: 0.0071 max mem: 33300 Epoch: [12] [3680/4276] eta: 0:30:40 lr: 3.5265708323432076e-05 loss: 0.1440 (0.1497) time: 3.1224 data: 0.0073 max mem: 33300 Epoch: [12] [3690/4276] eta: 0:30:09 lr: 3.5262973287261673e-05 loss: 0.1505 (0.1497) time: 3.0994 
data: 0.0073 max mem: 33300 Epoch: [12] [3700/4276] eta: 0:29:38 lr: 3.52602382275208e-05 loss: 0.1397 (0.1497) time: 3.0865 data: 0.0072 max mem: 33300 Epoch: [12] [3710/4276] eta: 0:29:07 lr: 3.52575031442072e-05 loss: 0.1320 (0.1496) time: 3.0977 data: 0.0073 max mem: 33300 Epoch: [12] [3720/4276] eta: 0:28:37 lr: 3.525476803731865e-05 loss: 0.1250 (0.1496) time: 3.0894 data: 0.0074 max mem: 33300 Epoch: [12] [3730/4276] eta: 0:28:06 lr: 3.5252032906852916e-05 loss: 0.1362 (0.1496) time: 3.0675 data: 0.0077 max mem: 33300 Epoch: [12] [3740/4276] eta: 0:27:35 lr: 3.524929775280776e-05 loss: 0.1404 (0.1496) time: 3.0767 data: 0.0076 max mem: 33300 Epoch: [12] [3750/4276] eta: 0:27:04 lr: 3.524656257518095e-05 loss: 0.1444 (0.1496) time: 3.1215 data: 0.0077 max mem: 33300 Epoch: [12] [3760/4276] eta: 0:26:33 lr: 3.5243827373970234e-05 loss: 0.1379 (0.1496) time: 3.1125 data: 0.0079 max mem: 33300 Epoch: [12] [3770/4276] eta: 0:26:02 lr: 3.5241092149173387e-05 loss: 0.1407 (0.1496) time: 3.0979 data: 0.0075 max mem: 33300 Epoch: [12] [3780/4276] eta: 0:25:31 lr: 3.5238356900788176e-05 loss: 0.1442 (0.1496) time: 3.1006 data: 0.0073 max mem: 33300 Epoch: [12] [3790/4276] eta: 0:25:00 lr: 3.523562162881236e-05 loss: 0.1385 (0.1495) time: 3.0740 data: 0.0079 max mem: 33300 Epoch: [12] [3800/4276] eta: 0:24:30 lr: 3.5232886333243694e-05 loss: 0.1450 (0.1496) time: 3.0919 data: 0.0081 max mem: 33300 Epoch: [12] [3810/4276] eta: 0:23:59 lr: 3.5230151014079944e-05 loss: 0.1441 (0.1496) time: 3.1101 data: 0.0078 max mem: 33300 Epoch: [12] [3820/4276] eta: 0:23:28 lr: 3.522741567131887e-05 loss: 0.1281 (0.1495) time: 3.1298 data: 0.0080 max mem: 33300 Epoch: [12] [3830/4276] eta: 0:22:57 lr: 3.522468030495823e-05 loss: 0.1277 (0.1495) time: 3.1034 data: 0.0086 max mem: 33300 Epoch: [12] [3840/4276] eta: 0:22:26 lr: 3.5221944914995784e-05 loss: 0.1251 (0.1495) time: 3.0613 data: 0.0081 max mem: 33300 Epoch: [12] [3850/4276] eta: 0:21:55 lr: 3.52192095014293e-05 loss: 0.1223 
(0.1494) time: 3.0691 data: 0.0078 max mem: 33300 Epoch: [12] [3860/4276] eta: 0:21:24 lr: 3.5216474064256536e-05 loss: 0.1326 (0.1494) time: 3.0910 data: 0.0081 max mem: 33300 Epoch: [12] [3870/4276] eta: 0:20:53 lr: 3.5213738603475246e-05 loss: 0.1495 (0.1494) time: 3.1297 data: 0.0083 max mem: 33300 Epoch: [12] [3880/4276] eta: 0:20:23 lr: 3.521100311908318e-05 loss: 0.1404 (0.1494) time: 3.0885 data: 0.0085 max mem: 33300 Epoch: [12] [3890/4276] eta: 0:19:52 lr: 3.520826761107812e-05 loss: 0.1380 (0.1494) time: 3.0598 data: 0.0085 max mem: 33300 Epoch: [12] [3900/4276] eta: 0:19:21 lr: 3.52055320794578e-05 loss: 0.1361 (0.1494) time: 3.0990 data: 0.0083 max mem: 33300 Epoch: [12] [3910/4276] eta: 0:18:50 lr: 3.520279652421998e-05 loss: 0.1342 (0.1493) time: 3.0839 data: 0.0083 max mem: 33300 Epoch: [12] [3920/4276] eta: 0:18:19 lr: 3.520006094536243e-05 loss: 0.1342 (0.1493) time: 3.0635 data: 0.0086 max mem: 33300 Epoch: [12] [3930/4276] eta: 0:17:48 lr: 3.5197325342882906e-05 loss: 0.1405 (0.1493) time: 3.0637 data: 0.0081 max mem: 33300 Epoch: [12] [3940/4276] eta: 0:17:17 lr: 3.5194589716779155e-05 loss: 0.1328 (0.1494) time: 3.0617 data: 0.0074 max mem: 33300 Epoch: [12] [3950/4276] eta: 0:16:46 lr: 3.519185406704893e-05 loss: 0.1323 (0.1493) time: 3.0656 data: 0.0077 max mem: 33300 Epoch: [12] [3960/4276] eta: 0:16:15 lr: 3.518911839369e-05 loss: 0.1439 (0.1494) time: 3.1246 data: 0.0085 max mem: 33300 Epoch: [12] [3970/4276] eta: 0:15:45 lr: 3.51863826967001e-05 loss: 0.1517 (0.1494) time: 3.1405 data: 0.0088 max mem: 33300 Epoch: [12] [3980/4276] eta: 0:15:14 lr: 3.5183646976077e-05 loss: 0.1423 (0.1494) time: 3.1028 data: 0.0083 max mem: 33300 Epoch: [12] [3990/4276] eta: 0:14:43 lr: 3.518091123181845e-05 loss: 0.1398 (0.1493) time: 3.1096 data: 0.0079 max mem: 33300 Epoch: [12] [4000/4276] eta: 0:14:12 lr: 3.51781754639222e-05 loss: 0.1396 (0.1494) time: 3.0876 data: 0.0079 max mem: 33300 Epoch: [12] [4010/4276] eta: 0:13:41 lr: 3.517543967238602e-05 
loss: 0.1399 (0.1494) time: 3.0671 data: 0.0077 max mem: 33300 Epoch: [12] [4020/4276] eta: 0:13:10 lr: 3.517270385720764e-05 loss: 0.1438 (0.1494) time: 3.0713 data: 0.0077 max mem: 33300 Epoch: [12] [4030/4276] eta: 0:12:39 lr: 3.5169968018384826e-05 loss: 0.1422 (0.1494) time: 3.1061 data: 0.0077 max mem: 33300 Epoch: [12] [4040/4276] eta: 0:12:08 lr: 3.516723215591532e-05 loss: 0.1433 (0.1494) time: 3.1353 data: 0.0077 max mem: 33300 Epoch: [12] [4050/4276] eta: 0:11:38 lr: 3.516449626979688e-05 loss: 0.1419 (0.1494) time: 3.1003 data: 0.0075 max mem: 33300 Epoch: [12] [4060/4276] eta: 0:11:07 lr: 3.516176036002726e-05 loss: 0.1419 (0.1494) time: 3.0813 data: 0.0085 max mem: 33300 Epoch: [12] [4070/4276] eta: 0:10:36 lr: 3.5159024426604206e-05 loss: 0.1488 (0.1494) time: 3.0580 data: 0.0094 max mem: 33300 Epoch: [12] [4080/4276] eta: 0:10:05 lr: 3.515628846952546e-05 loss: 0.1522 (0.1495) time: 3.0457 data: 0.0085 max mem: 33300 Epoch: [12] [4090/4276] eta: 0:09:34 lr: 3.5153552488788796e-05 loss: 0.1638 (0.1495) time: 3.0745 data: 0.0077 max mem: 33300 Epoch: [12] [4100/4276] eta: 0:09:03 lr: 3.5150816484391945e-05 loss: 0.1642 (0.1495) time: 3.1053 data: 0.0076 max mem: 33300 Epoch: [12] [4110/4276] eta: 0:08:32 lr: 3.5148080456332666e-05 loss: 0.1633 (0.1495) time: 3.1097 data: 0.0078 max mem: 33300 Epoch: [12] [4120/4276] eta: 0:08:01 lr: 3.514534440460869e-05 loss: 0.1556 (0.1495) time: 3.0735 data: 0.0081 max mem: 33300 Epoch: [12] [4130/4276] eta: 0:07:30 lr: 3.514260832921778e-05 loss: 0.1500 (0.1495) time: 3.0581 data: 0.0078 max mem: 33300 Epoch: [12] [4140/4276] eta: 0:06:59 lr: 3.5139872230157685e-05 loss: 0.1426 (0.1495) time: 3.0451 data: 0.0078 max mem: 33300 Epoch: [12] [4150/4276] eta: 0:06:29 lr: 3.5137136107426144e-05 loss: 0.1426 (0.1495) time: 3.0740 data: 0.0077 max mem: 33300 Epoch: [12] [4160/4276] eta: 0:05:58 lr: 3.513439996102091e-05 loss: 0.1511 (0.1496) time: 3.0946 data: 0.0072 max mem: 33300 Epoch: [12] [4170/4276] eta: 0:05:27 
lr: 3.5131663790939735e-05 loss: 0.1539 (0.1496) time: 3.0918 data: 0.0072 max mem: 33300 Epoch: [12] [4180/4276] eta: 0:04:56 lr: 3.5128927597180355e-05 loss: 0.1470 (0.1496) time: 3.1205 data: 0.0073 max mem: 33300 Epoch: [12] [4190/4276] eta: 0:04:25 lr: 3.512619137974052e-05 loss: 0.1411 (0.1496) time: 3.0790 data: 0.0074 max mem: 33300 Epoch: [12] [4200/4276] eta: 0:03:54 lr: 3.512345513861797e-05 loss: 0.1418 (0.1496) time: 3.0310 data: 0.0073 max mem: 33300 Epoch: [12] [4210/4276] eta: 0:03:23 lr: 3.512071887381046e-05 loss: 0.1534 (0.1496) time: 3.0408 data: 0.0073 max mem: 33300 Epoch: [12] [4220/4276] eta: 0:02:52 lr: 3.511798258531572e-05 loss: 0.1603 (0.1497) time: 3.0400 data: 0.0073 max mem: 33300 Epoch: [12] [4230/4276] eta: 0:02:22 lr: 3.511524627313151e-05 loss: 0.1682 (0.1497) time: 3.0346 data: 0.0073 max mem: 33300 Epoch: [12] [4240/4276] eta: 0:01:51 lr: 3.5112509937255574e-05 loss: 0.1653 (0.1497) time: 3.0352 data: 0.0074 max mem: 33300 Epoch: [12] [4250/4276] eta: 0:01:20 lr: 3.5109773577685645e-05 loss: 0.1617 (0.1498) time: 3.0525 data: 0.0076 max mem: 33300 Epoch: [12] [4260/4276] eta: 0:00:49 lr: 3.510703719441947e-05 loss: 0.1544 (0.1498) time: 3.0504 data: 0.0080 max mem: 33300 Epoch: [12] [4270/4276] eta: 0:00:18 lr: 3.5104300787454786e-05 loss: 0.1458 (0.1498) time: 3.0014 data: 0.0076 max mem: 33300 Epoch: [12] Total time: 3:40:01 Test: [ 0/21770] eta: 8:55:38 time: 1.4763 data: 1.4363 max mem: 33300 Test: [ 100/21770] eta: 0:19:20 time: 0.0393 data: 0.0010 max mem: 33300 Test: [ 200/21770] eta: 0:16:33 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 300/21770] eta: 0:15:39 time: 0.0396 data: 0.0010 max mem: 33300 Test: [ 400/21770] eta: 0:15:04 time: 0.0380 data: 0.0011 max mem: 33300 Test: [ 500/21770] eta: 0:14:41 time: 0.0380 data: 0.0011 max mem: 33300 Test: [ 600/21770] eta: 0:14:29 time: 0.0397 data: 0.0010 max mem: 33300 Test: [ 700/21770] eta: 0:14:21 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 800/21770] eta: 0:14:09 
time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 900/21770] eta: 0:14:01 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 1000/21770] eta: 0:13:53 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 1100/21770] eta: 0:13:45 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 1200/21770] eta: 0:13:39 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 1300/21770] eta: 0:13:32 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 1400/21770] eta: 0:13:27 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 1500/21770] eta: 0:13:21 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 1600/21770] eta: 0:13:16 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 1700/21770] eta: 0:13:10 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 1800/21770] eta: 0:13:05 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 1900/21770] eta: 0:13:01 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 2000/21770] eta: 0:12:56 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 2100/21770] eta: 0:12:51 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 2200/21770] eta: 0:12:46 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 2300/21770] eta: 0:12:41 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 2400/21770] eta: 0:12:37 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 2500/21770] eta: 0:12:32 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 2600/21770] eta: 0:12:28 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 2700/21770] eta: 0:12:23 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 2800/21770] eta: 0:12:19 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 2900/21770] eta: 0:12:15 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 3000/21770] eta: 0:12:10 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 3100/21770] eta: 0:12:06 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 3200/21770] eta: 0:12:02 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 3300/21770] eta: 0:11:58 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 3400/21770] eta: 0:11:53 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 3500/21770] eta: 0:11:49 
time: 0.0380 data: 0.0010 max mem: 33300 Test: [ 3600/21770] eta: 0:11:45 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 3700/21770] eta: 0:11:41 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 3800/21770] eta: 0:11:36 time: 0.0381 data: 0.0011 max mem: 33300 Test: [ 3900/21770] eta: 0:11:32 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 4000/21770] eta: 0:11:28 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 4100/21770] eta: 0:11:24 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 4200/21770] eta: 0:11:20 time: 0.0380 data: 0.0010 max mem: 33300 Test: [ 4300/21770] eta: 0:11:16 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 4400/21770] eta: 0:11:12 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 4500/21770] eta: 0:11:08 time: 0.0379 data: 0.0010 max mem: 33300 Test: [ 4600/21770] eta: 0:11:04 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 4700/21770] eta: 0:11:00 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 4800/21770] eta: 0:10:56 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 4900/21770] eta: 0:10:52 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 5000/21770] eta: 0:10:48 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 5100/21770] eta: 0:10:44 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 5200/21770] eta: 0:10:40 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 5300/21770] eta: 0:10:36 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 5400/21770] eta: 0:10:32 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 5500/21770] eta: 0:10:28 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 5600/21770] eta: 0:10:24 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 5700/21770] eta: 0:10:20 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 5800/21770] eta: 0:10:16 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 5900/21770] eta: 0:10:12 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 6000/21770] eta: 0:10:08 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 6100/21770] eta: 0:10:05 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 6200/21770] eta: 0:10:01 
time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 6300/21770] eta: 0:09:57 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 6400/21770] eta: 0:09:53 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 6500/21770] eta: 0:09:49 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 6600/21770] eta: 0:09:45 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 6700/21770] eta: 0:09:41 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 6800/21770] eta: 0:09:37 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 6900/21770] eta: 0:09:33 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 7000/21770] eta: 0:09:29 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 7100/21770] eta: 0:09:25 time: 0.0380 data: 0.0010 max mem: 33300 Test: [ 7200/21770] eta: 0:09:21 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 7300/21770] eta: 0:09:17 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 7400/21770] eta: 0:09:13 time: 0.0386 data: 0.0012 max mem: 33300 Test: [ 7500/21770] eta: 0:09:09 time: 0.0387 data: 0.0012 max mem: 33300 Test: [ 7600/21770] eta: 0:09:06 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 7700/21770] eta: 0:09:02 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 7800/21770] eta: 0:08:58 time: 0.0382 data: 0.0011 max mem: 33300 Test: [ 7900/21770] eta: 0:08:54 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 8000/21770] eta: 0:08:50 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 8100/21770] eta: 0:08:46 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 8200/21770] eta: 0:08:42 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 8300/21770] eta: 0:08:38 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 8400/21770] eta: 0:08:34 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 8500/21770] eta: 0:08:30 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 8600/21770] eta: 0:08:26 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 8700/21770] eta: 0:08:23 time: 0.0379 data: 0.0010 max mem: 33300 Test: [ 8800/21770] eta: 0:08:19 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 8900/21770] eta: 0:08:15 
time: 0.0380 data: 0.0010 max mem: 33300 Test: [ 9000/21770] eta: 0:08:11 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 9100/21770] eta: 0:08:07 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 9200/21770] eta: 0:08:03 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 9300/21770] eta: 0:07:59 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 9400/21770] eta: 0:07:56 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 9500/21770] eta: 0:07:52 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 9600/21770] eta: 0:07:48 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 9700/21770] eta: 0:07:44 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 9800/21770] eta: 0:07:40 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 9900/21770] eta: 0:07:36 time: 0.0384 data: 0.0010 max mem: 33300 Test: [10000/21770] eta: 0:07:32 time: 0.0386 data: 0.0010 max mem: 33300 Test: [10100/21770] eta: 0:07:29 time: 0.0383 data: 0.0010 max mem: 33300 Test: [10200/21770] eta: 0:07:25 time: 0.0381 data: 0.0010 max mem: 33300 Test: [10300/21770] eta: 0:07:21 time: 0.0381 data: 0.0010 max mem: 33300 Test: [10400/21770] eta: 0:07:17 time: 0.0382 data: 0.0010 max mem: 33300 Test: [10500/21770] eta: 0:07:13 time: 0.0381 data: 0.0010 max mem: 33300 Test: [10600/21770] eta: 0:07:09 time: 0.0386 data: 0.0010 max mem: 33300 Test: [10700/21770] eta: 0:07:05 time: 0.0387 data: 0.0011 max mem: 33300 Test: [10800/21770] eta: 0:07:01 time: 0.0382 data: 0.0010 max mem: 33300 Test: [10900/21770] eta: 0:06:58 time: 0.0383 data: 0.0010 max mem: 33300 Test: [11000/21770] eta: 0:06:54 time: 0.0381 data: 0.0010 max mem: 33300 Test: [11100/21770] eta: 0:06:50 time: 0.0383 data: 0.0010 max mem: 33300 Test: [11200/21770] eta: 0:06:46 time: 0.0388 data: 0.0010 max mem: 33300 Test: [11300/21770] eta: 0:06:42 time: 0.0389 data: 0.0010 max mem: 33300 Test: [11400/21770] eta: 0:06:38 time: 0.0391 data: 0.0010 max mem: 33300 Test: [11500/21770] eta: 0:06:35 time: 0.0389 data: 0.0010 max mem: 33300 Test: [11600/21770] eta: 0:06:31 
time: 0.0389 data: 0.0011 max mem: 33300 Test: [11700/21770] eta: 0:06:27 time: 0.0389 data: 0.0010 max mem: 33300 Test: [11800/21770] eta: 0:06:23 time: 0.0386 data: 0.0010 max mem: 33300 Test: [11900/21770] eta: 0:06:19 time: 0.0399 data: 0.0010 max mem: 33300 Test: [12000/21770] eta: 0:06:16 time: 0.0396 data: 0.0010 max mem: 33300 Test: [12100/21770] eta: 0:06:12 time: 0.0387 data: 0.0011 max mem: 33300 Test: [12200/21770] eta: 0:06:08 time: 0.0387 data: 0.0011 max mem: 33300 Test: [12300/21770] eta: 0:06:04 time: 0.0390 data: 0.0010 max mem: 33300 Test: [12400/21770] eta: 0:06:00 time: 0.0387 data: 0.0010 max mem: 33300 Test: [12500/21770] eta: 0:05:57 time: 0.0391 data: 0.0010 max mem: 33300 Test: [12600/21770] eta: 0:05:53 time: 0.0394 data: 0.0011 max mem: 33300 Test: [12700/21770] eta: 0:05:49 time: 0.0391 data: 0.0010 max mem: 33300 Test: [12800/21770] eta: 0:05:45 time: 0.0389 data: 0.0011 max mem: 33300 Test: [12900/21770] eta: 0:05:41 time: 0.0400 data: 0.0012 max mem: 33300 Test: [13000/21770] eta: 0:05:37 time: 0.0395 data: 0.0011 max mem: 33300 Test: [13100/21770] eta: 0:05:34 time: 0.0394 data: 0.0010 max mem: 33300 Test: [13200/21770] eta: 0:05:30 time: 0.0390 data: 0.0009 max mem: 33300 Test: [13300/21770] eta: 0:05:26 time: 0.0390 data: 0.0009 max mem: 33300 Test: [13400/21770] eta: 0:05:22 time: 0.0388 data: 0.0009 max mem: 33300 Test: [13500/21770] eta: 0:05:18 time: 0.0386 data: 0.0009 max mem: 33300 Test: [13600/21770] eta: 0:05:15 time: 0.0383 data: 0.0010 max mem: 33300 Test: [13700/21770] eta: 0:05:11 time: 0.0387 data: 0.0010 max mem: 33300 Test: [13800/21770] eta: 0:05:07 time: 0.0387 data: 0.0010 max mem: 33300 Test: [13900/21770] eta: 0:05:03 time: 0.0387 data: 0.0010 max mem: 33300 Test: [14000/21770] eta: 0:04:59 time: 0.0386 data: 0.0010 max mem: 33300 Test: [14100/21770] eta: 0:04:55 time: 0.0389 data: 0.0010 max mem: 33300 Test: [14200/21770] eta: 0:04:51 time: 0.0384 data: 0.0010 max mem: 33300 Test: [14300/21770] eta: 0:04:48 
time: 0.0390 data: 0.0010 max mem: 33300 Test: [14400/21770] eta: 0:04:44 time: 0.0387 data: 0.0010 max mem: 33300 Test: [14500/21770] eta: 0:04:40 time: 0.0390 data: 0.0010 max mem: 33300 Test: [14600/21770] eta: 0:04:36 time: 0.0393 data: 0.0012 max mem: 33300 Test: [14700/21770] eta: 0:04:32 time: 0.0399 data: 0.0011 max mem: 33300 Test: [14800/21770] eta: 0:04:28 time: 0.0386 data: 0.0011 max mem: 33300 Test: [14900/21770] eta: 0:04:25 time: 0.0388 data: 0.0011 max mem: 33300 Test: [15000/21770] eta: 0:04:21 time: 0.0389 data: 0.0012 max mem: 33300 Test: [15100/21770] eta: 0:04:17 time: 0.0399 data: 0.0011 max mem: 33300 Test: [15200/21770] eta: 0:04:13 time: 0.0399 data: 0.0010 max mem: 33300 Test: [15300/21770] eta: 0:04:09 time: 0.0399 data: 0.0010 max mem: 33300 Test: [15400/21770] eta: 0:04:05 time: 0.0385 data: 0.0010 max mem: 33300 Test: [15500/21770] eta: 0:04:02 time: 0.0387 data: 0.0011 max mem: 33300 Test: [15600/21770] eta: 0:03:58 time: 0.0401 data: 0.0010 max mem: 33300 Test: [15700/21770] eta: 0:03:54 time: 0.0394 data: 0.0010 max mem: 33300 Test: [15800/21770] eta: 0:03:50 time: 0.0399 data: 0.0010 max mem: 33300 Test: [15900/21770] eta: 0:03:46 time: 0.0390 data: 0.0010 max mem: 33300 Test: [16000/21770] eta: 0:03:42 time: 0.0387 data: 0.0010 max mem: 33300 Test: [16100/21770] eta: 0:03:39 time: 0.0396 data: 0.0010 max mem: 33300 Test: [16200/21770] eta: 0:03:35 time: 0.0390 data: 0.0009 max mem: 33300 Test: [16300/21770] eta: 0:03:31 time: 0.0391 data: 0.0010 max mem: 33300 Test: [16400/21770] eta: 0:03:27 time: 0.0392 data: 0.0010 max mem: 33300 Test: [16500/21770] eta: 0:03:23 time: 0.0394 data: 0.0010 max mem: 33300 Test: [16600/21770] eta: 0:03:19 time: 0.0394 data: 0.0010 max mem: 33300 Test: [16700/21770] eta: 0:03:16 time: 0.0389 data: 0.0011 max mem: 33300 Test: [16800/21770] eta: 0:03:12 time: 0.0388 data: 0.0011 max mem: 33300 Test: [16900/21770] eta: 0:03:08 time: 0.0388 data: 0.0011 max mem: 33300 Test: [17000/21770] eta: 0:03:04 
time: 0.0386 data: 0.0011 max mem: 33300 Test: [17100/21770] eta: 0:03:00 time: 0.0387 data: 0.0010 max mem: 33300 Test: [17200/21770] eta: 0:02:56 time: 0.0390 data: 0.0011 max mem: 33300 Test: [17300/21770] eta: 0:02:52 time: 0.0382 data: 0.0011 max mem: 33300 Test: [17400/21770] eta: 0:02:48 time: 0.0386 data: 0.0011 max mem: 33300 Test: [17500/21770] eta: 0:02:45 time: 0.0399 data: 0.0010 max mem: 33300 Test: [17600/21770] eta: 0:02:41 time: 0.0396 data: 0.0011 max mem: 33300 Test: [17700/21770] eta: 0:02:37 time: 0.0391 data: 0.0010 max mem: 33300 Test: [17800/21770] eta: 0:02:33 time: 0.0382 data: 0.0011 max mem: 33300 Test: [17900/21770] eta: 0:02:29 time: 0.0383 data: 0.0011 max mem: 33300 Test: [18000/21770] eta: 0:02:25 time: 0.0386 data: 0.0009 max mem: 33300 Test: [18100/21770] eta: 0:02:21 time: 0.0390 data: 0.0010 max mem: 33300 Test: [18200/21770] eta: 0:02:18 time: 0.0390 data: 0.0010 max mem: 33300 Test: [18300/21770] eta: 0:02:14 time: 0.0387 data: 0.0010 max mem: 33300 Test: [18400/21770] eta: 0:02:10 time: 0.0387 data: 0.0010 max mem: 33300 Test: [18500/21770] eta: 0:02:06 time: 0.0386 data: 0.0009 max mem: 33300 Test: [18600/21770] eta: 0:02:02 time: 0.0388 data: 0.0010 max mem: 33300 Test: [18700/21770] eta: 0:01:58 time: 0.0384 data: 0.0009 max mem: 33300 Test: [18800/21770] eta: 0:01:54 time: 0.0384 data: 0.0010 max mem: 33300 Test: [18900/21770] eta: 0:01:50 time: 0.0384 data: 0.0009 max mem: 33300 Test: [19000/21770] eta: 0:01:47 time: 0.0385 data: 0.0010 max mem: 33300 Test: [19100/21770] eta: 0:01:43 time: 0.0386 data: 0.0010 max mem: 33300 Test: [19200/21770] eta: 0:01:39 time: 0.0384 data: 0.0010 max mem: 33300 Test: [19300/21770] eta: 0:01:35 time: 0.0387 data: 0.0010 max mem: 33300 Test: [19400/21770] eta: 0:01:31 time: 0.0384 data: 0.0010 max mem: 33300 Test: [19500/21770] eta: 0:01:27 time: 0.0387 data: 0.0010 max mem: 33300 Test: [19600/21770] eta: 0:01:23 time: 0.0386 data: 0.0010 max mem: 33300 Test: [19700/21770] eta: 0:01:20 
time: 0.0386 data: 0.0010 max mem: 33300 Test: [19800/21770] eta: 0:01:16 time: 0.0386 data: 0.0009 max mem: 33300 Test: [19900/21770] eta: 0:01:12 time: 0.0387 data: 0.0010 max mem: 33300 Test: [20000/21770] eta: 0:01:08 time: 0.0380 data: 0.0010 max mem: 33300 Test: [20100/21770] eta: 0:01:04 time: 0.0381 data: 0.0010 max mem: 33300 Test: [20200/21770] eta: 0:01:00 time: 0.0385 data: 0.0009 max mem: 33300 Test: [20300/21770] eta: 0:00:56 time: 0.0382 data: 0.0010 max mem: 33300 Test: [20400/21770] eta: 0:00:52 time: 0.0380 data: 0.0009 max mem: 33300 Test: [20500/21770] eta: 0:00:49 time: 0.0389 data: 0.0011 max mem: 33300 Test: [20600/21770] eta: 0:00:45 time: 0.0386 data: 0.0011 max mem: 33300 Test: [20700/21770] eta: 0:00:41 time: 0.0388 data: 0.0010 max mem: 33300 Test: [20800/21770] eta: 0:00:37 time: 0.0386 data: 0.0010 max mem: 33300 Test: [20900/21770] eta: 0:00:33 time: 0.0388 data: 0.0011 max mem: 33300 Test: [21000/21770] eta: 0:00:29 time: 0.0386 data: 0.0010 max mem: 33300 Test: [21100/21770] eta: 0:00:25 time: 0.0388 data: 0.0010 max mem: 33300 Test: [21200/21770] eta: 0:00:22 time: 0.0387 data: 0.0010 max mem: 33300 Test: [21300/21770] eta: 0:00:18 time: 0.0393 data: 0.0011 max mem: 33300 Test: [21400/21770] eta: 0:00:14 time: 0.0396 data: 0.0010 max mem: 33300 Test: [21500/21770] eta: 0:00:10 time: 0.0389 data: 0.0009 max mem: 33300 Test: [21600/21770] eta: 0:00:06 time: 0.0392 data: 0.0009 max mem: 33300 Test: [21700/21770] eta: 0:00:02 time: 0.0391 data: 0.0010 max mem: 33300
Test: Total time: 0:14:02
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [13] [ 0/4276] eta: 6:45:28 lr: 3.510265893189975e-05 loss: 0.1308 (0.1308) time: 5.6896 data: 2.5855 max mem: 33300 Epoch: [13] [ 10/4276] eta: 3:54:40 lr: 3.509992248701277e-05 loss:
0.1399 (0.1545) time: 3.3006 data: 0.2416 max mem: 33300 Epoch: [13] [ 20/4276] eta: 3:45:49 lr: 3.509718601842141e-05 loss: 0.1438 (0.1545) time: 3.0583 data: 0.0073 max mem: 33300 Epoch: [13] [ 30/4276] eta: 3:41:32 lr: 3.5094449526123424e-05 loss: 0.1474 (0.1553) time: 3.0371 data: 0.0073 max mem: 33300 Epoch: [13] [ 40/4276] eta: 3:39:13 lr: 3.509171301011654e-05 loss: 0.1544 (0.1551) time: 3.0230 data: 0.0077 max mem: 33300 Epoch: [13] [ 50/4276] eta: 3:37:41 lr: 3.5088976470398504e-05 loss: 0.1541 (0.1519) time: 3.0290 data: 0.0083 max mem: 33300 Epoch: [13] [ 60/4276] eta: 3:36:18 lr: 3.508623990696705e-05 loss: 0.1361 (0.1524) time: 3.0234 data: 0.0085 max mem: 33300 Epoch: [13] [ 70/4276] eta: 3:35:14 lr: 3.508350331981993e-05 loss: 0.1414 (0.1515) time: 3.0191 data: 0.0083 max mem: 33300 Epoch: [13] [ 80/4276] eta: 3:34:15 lr: 3.5080766708954866e-05 loss: 0.1516 (0.1521) time: 3.0187 data: 0.0084 max mem: 33300 Epoch: [13] [ 90/4276] eta: 3:33:04 lr: 3.507803007436961e-05 loss: 0.1372 (0.1499) time: 2.9960 data: 0.0084 max mem: 33300 Epoch: [13] [ 100/4276] eta: 3:31:59 lr: 3.50752934160619e-05 loss: 0.1346 (0.1517) time: 2.9741 data: 0.0083 max mem: 33300 Epoch: [13] [ 110/4276] eta: 3:31:17 lr: 3.507255673402947e-05 loss: 0.1505 (0.1526) time: 2.9923 data: 0.0087 max mem: 33300 Epoch: [13] [ 120/4276] eta: 3:30:35 lr: 3.506982002827006e-05 loss: 0.1470 (0.1517) time: 3.0116 data: 0.0090 max mem: 33300 Epoch: [13] [ 130/4276] eta: 3:29:55 lr: 3.506708329878139e-05 loss: 0.1470 (0.1525) time: 3.0093 data: 0.0088 max mem: 33300 Epoch: [13] [ 140/4276] eta: 3:29:22 lr: 3.5064346545561224e-05 loss: 0.1441 (0.1517) time: 3.0197 data: 0.0085 max mem: 33300 Epoch: [13] [ 150/4276] eta: 3:28:48 lr: 3.506160976860728e-05 loss: 0.1347 (0.1515) time: 3.0271 data: 0.0086 max mem: 33300 Epoch: [13] [ 160/4276] eta: 3:28:13 lr: 3.5058872967917296e-05 loss: 0.1381 (0.1515) time: 3.0221 data: 0.0084 max mem: 33300 Epoch: [13] [ 170/4276] eta: 3:27:40 lr: 
3.505613614348902e-05 loss: 0.1381 (0.1512) time: 3.0204 data: 0.0082 max mem: 33300 Epoch: [13] [ 180/4276] eta: 3:27:13 lr: 3.505339929532017e-05 loss: 0.1359 (0.1508) time: 3.0361 data: 0.0078 max mem: 33300 Epoch: [13] [ 190/4276] eta: 3:26:38 lr: 3.505066242340849e-05 loss: 0.1348 (0.1509) time: 3.0319 data: 0.0078 max mem: 33300 Epoch: [13] [ 200/4276] eta: 3:26:10 lr: 3.5047925527751706e-05 loss: 0.1485 (0.1512) time: 3.0303 data: 0.0084 max mem: 33300 Epoch: [13] [ 210/4276] eta: 3:25:40 lr: 3.504518860834756e-05 loss: 0.1564 (0.1512) time: 3.0423 data: 0.0086 max mem: 33300 Epoch: [13] [ 220/4276] eta: 3:25:09 lr: 3.504245166519378e-05 loss: 0.1455 (0.1509) time: 3.0327 data: 0.0080 max mem: 33300 Epoch: [13] [ 230/4276] eta: 3:24:28 lr: 3.503971469828811e-05 loss: 0.1332 (0.1500) time: 3.0026 data: 0.0078 max mem: 33300 Epoch: [13] [ 240/4276] eta: 3:23:56 lr: 3.5036977707628265e-05 loss: 0.1387 (0.1501) time: 2.9986 data: 0.0085 max mem: 33300 Epoch: [13] [ 250/4276] eta: 3:23:22 lr: 3.5034240693211995e-05 loss: 0.1592 (0.1508) time: 3.0166 data: 0.0087 max mem: 33300 Epoch: [13] [ 260/4276] eta: 3:22:51 lr: 3.503150365503701e-05 loss: 0.1664 (0.1510) time: 3.0169 data: 0.0080 max mem: 33300 Epoch: [13] [ 270/4276] eta: 3:22:20 lr: 3.502876659310107e-05 loss: 0.1391 (0.1507) time: 3.0267 data: 0.0078 max mem: 33300 Epoch: [13] [ 280/4276] eta: 3:21:54 lr: 3.502602950740188e-05 loss: 0.1378 (0.1510) time: 3.0434 data: 0.0082 max mem: 33300 Epoch: [13] [ 290/4276] eta: 3:21:24 lr: 3.502329239793718e-05 loss: 0.1378 (0.1504) time: 3.0450 data: 0.0080 max mem: 33300 Epoch: [13] [ 300/4276] eta: 3:20:56 lr: 3.50205552647047e-05 loss: 0.1356 (0.1503) time: 3.0441 data: 0.0076 max mem: 33300 Epoch: [13] [ 310/4276] eta: 3:20:25 lr: 3.501781810770216e-05 loss: 0.1380 (0.1500) time: 3.0395 data: 0.0078 max mem: 33300 Epoch: [13] [ 320/4276] eta: 3:19:54 lr: 3.5015080926927305e-05 loss: 0.1409 (0.1501) time: 3.0248 data: 0.0080 max mem: 33300 Epoch: [13] [ 
330/4276] eta: 3:19:21 lr: 3.501234372237787e-05 loss: 0.1474 (0.1502) time: 3.0189 data: 0.0082 max mem: 33300 Epoch: [13] [ 340/4276] eta: 3:18:50 lr: 3.500960649405156e-05 loss: 0.1429 (0.1500) time: 3.0197 data: 0.0082 max mem: 33300 Epoch: [13] [ 350/4276] eta: 3:18:19 lr: 3.500686924194611e-05 loss: 0.1342 (0.1499) time: 3.0245 data: 0.0078 max mem: 33300 Epoch: [13] [ 360/4276] eta: 3:17:48 lr: 3.500413196605925e-05 loss: 0.1502 (0.1505) time: 3.0218 data: 0.0075 max mem: 33300 Epoch: [13] [ 370/4276] eta: 3:17:17 lr: 3.500139466638871e-05 loss: 0.1432 (0.1499) time: 3.0251 data: 0.0077 max mem: 33300 Epoch: [13] [ 380/4276] eta: 3:16:48 lr: 3.4998657342932215e-05 loss: 0.1415 (0.1500) time: 3.0340 data: 0.0077 max mem: 33300 Epoch: [13] [ 390/4276] eta: 3:16:17 lr: 3.499591999568749e-05 loss: 0.1416 (0.1500) time: 3.0304 data: 0.0075 max mem: 33300 Epoch: [13] [ 400/4276] eta: 3:15:48 lr: 3.4993182624652266e-05 loss: 0.1416 (0.1499) time: 3.0369 data: 0.0079 max mem: 33300 Epoch: [13] [ 410/4276] eta: 3:15:15 lr: 3.4990445229824256e-05 loss: 0.1494 (0.1498) time: 3.0241 data: 0.0081 max mem: 33300 Epoch: [13] [ 420/4276] eta: 3:14:40 lr: 3.498770781120121e-05 loss: 0.1460 (0.1499) time: 2.9874 data: 0.0081 max mem: 33300 Epoch: [13] [ 430/4276] eta: 3:14:04 lr: 3.4984970368780814e-05 loss: 0.1374 (0.1498) time: 2.9716 data: 0.0076 max mem: 33300 Epoch: [13] [ 440/4276] eta: 3:13:29 lr: 3.498223290256082e-05 loss: 0.1397 (0.1496) time: 2.9688 data: 0.0071 max mem: 33300 Epoch: [13] [ 450/4276] eta: 3:12:57 lr: 3.4979495412538946e-05 loss: 0.1366 (0.1496) time: 2.9913 data: 0.0079 max mem: 33300 Epoch: [13] [ 460/4276] eta: 3:12:27 lr: 3.497675789871291e-05 loss: 0.1347 (0.1492) time: 3.0179 data: 0.0086 max mem: 33300 Epoch: [13] [ 470/4276] eta: 3:11:58 lr: 3.497402036108044e-05 loss: 0.1327 (0.1488) time: 3.0360 data: 0.0087 max mem: 33300 Epoch: [13] [ 480/4276] eta: 3:11:28 lr: 3.4971282799639264e-05 loss: 0.1374 (0.1486) time: 3.0353 data: 0.0089 max 
mem: 33300 Epoch: [13] [ 490/4276] eta: 3:11:00 lr: 3.49685452143871e-05 loss: 0.1292 (0.1481) time: 3.0398 data: 0.0090 max mem: 33300 Epoch: [13] [ 500/4276] eta: 3:10:30 lr: 3.496580760532166e-05 loss: 0.1241 (0.1481) time: 3.0456 data: 0.0087 max mem: 33300 Epoch: [13] [ 510/4276] eta: 3:10:00 lr: 3.4963069972440674e-05 loss: 0.1307 (0.1478) time: 3.0324 data: 0.0082 max mem: 33300 Epoch: [13] [ 520/4276] eta: 3:09:30 lr: 3.496033231574187e-05 loss: 0.1307 (0.1478) time: 3.0300 data: 0.0085 max mem: 33300 Epoch: [13] [ 530/4276] eta: 3:09:01 lr: 3.495759463522294e-05 loss: 0.1396 (0.1477) time: 3.0377 data: 0.0092 max mem: 33300 Epoch: [13] [ 540/4276] eta: 3:08:35 lr: 3.495485693088164e-05 loss: 0.1413 (0.1475) time: 3.0709 data: 0.0093 max mem: 33300 Epoch: [13] [ 550/4276] eta: 3:08:11 lr: 3.495211920271567e-05 loss: 0.1517 (0.1478) time: 3.1073 data: 0.0091 max mem: 33300 Epoch: [13] [ 560/4276] eta: 3:07:46 lr: 3.4949381450722754e-05 loss: 0.1625 (0.1479) time: 3.1170 data: 0.0092 max mem: 33300 Epoch: [13] [ 570/4276] eta: 3:07:23 lr: 3.494664367490061e-05 loss: 0.1533 (0.1478) time: 3.1287 data: 0.0090 max mem: 33300 Epoch: [13] [ 580/4276] eta: 3:06:59 lr: 3.494390587524696e-05 loss: 0.1422 (0.1479) time: 3.1321 data: 0.0082 max mem: 33300 Epoch: [13] [ 590/4276] eta: 3:06:34 lr: 3.494116805175951e-05 loss: 0.1166 (0.1475) time: 3.1228 data: 0.0078 max mem: 33300 Epoch: [13] [ 600/4276] eta: 3:06:08 lr: 3.493843020443598e-05 loss: 0.1225 (0.1476) time: 3.1182 data: 0.0084 max mem: 33300 Epoch: [13] [ 610/4276] eta: 3:05:43 lr: 3.49356923332741e-05 loss: 0.1362 (0.1474) time: 3.1203 data: 0.0086 max mem: 33300 Epoch: [13] [ 620/4276] eta: 3:05:17 lr: 3.493295443827158e-05 loss: 0.1362 (0.1474) time: 3.1184 data: 0.0082 max mem: 33300 Epoch: [13] [ 630/4276] eta: 3:04:50 lr: 3.4930216519426126e-05 loss: 0.1403 (0.1475) time: 3.1085 data: 0.0082 max mem: 33300 Epoch: [13] [ 640/4276] eta: 3:04:23 lr: 3.492747857673547e-05 loss: 0.1398 (0.1474) time: 3.1021 
data: 0.0087 max mem: 33300 Epoch: [13] [ 650/4276] eta: 3:03:56 lr: 3.492474061019732e-05 loss: 0.1398 (0.1475) time: 3.1033 data: 0.0087 max mem: 33300 Epoch: [13] [ 660/4276] eta: 3:03:31 lr: 3.4922002619809394e-05 loss: 0.1517 (0.1476) time: 3.1269 data: 0.0082 max mem: 33300 Epoch: [13] [ 670/4276] eta: 3:03:04 lr: 3.4919264605569405e-05 loss: 0.1445 (0.1476) time: 3.1298 data: 0.0076 max mem: 33300 Epoch: [13] [ 680/4276] eta: 3:02:37 lr: 3.491652656747506e-05 loss: 0.1382 (0.1475) time: 3.1120 data: 0.0074 max mem: 33300 Epoch: [13] [ 690/4276] eta: 3:02:09 lr: 3.4913788505524074e-05 loss: 0.1334 (0.1474) time: 3.1038 data: 0.0073 max mem: 33300 Epoch: [13] [ 700/4276] eta: 3:01:42 lr: 3.491105041971417e-05 loss: 0.1392 (0.1474) time: 3.1062 data: 0.0072 max mem: 33300 Epoch: [13] [ 710/4276] eta: 3:01:15 lr: 3.490831231004306e-05 loss: 0.1438 (0.1474) time: 3.1157 data: 0.0072 max mem: 33300 Epoch: [13] [ 720/4276] eta: 3:00:47 lr: 3.490557417650844e-05 loss: 0.1430 (0.1472) time: 3.1019 data: 0.0072 max mem: 33300 Epoch: [13] [ 730/4276] eta: 3:00:19 lr: 3.4902836019108046e-05 loss: 0.1398 (0.1472) time: 3.0948 data: 0.0072 max mem: 33300 Epoch: [13] [ 740/4276] eta: 2:59:51 lr: 3.490009783783958e-05 loss: 0.1393 (0.1472) time: 3.1078 data: 0.0073 max mem: 33300 Epoch: [13] [ 750/4276] eta: 2:59:24 lr: 3.489735963270074e-05 loss: 0.1393 (0.1472) time: 3.1200 data: 0.0075 max mem: 33300 Epoch: [13] [ 760/4276] eta: 2:58:57 lr: 3.4894621403689256e-05 loss: 0.1303 (0.1472) time: 3.1315 data: 0.0076 max mem: 33300 Epoch: [13] [ 770/4276] eta: 2:58:30 lr: 3.489188315080282e-05 loss: 0.1303 (0.1473) time: 3.1255 data: 0.0074 max mem: 33300 Epoch: [13] [ 780/4276] eta: 2:58:01 lr: 3.488914487403916e-05 loss: 0.1443 (0.1474) time: 3.1103 data: 0.0074 max mem: 33300 Epoch: [13] [ 790/4276] eta: 2:57:30 lr: 3.488640657339598e-05 loss: 0.1487 (0.1474) time: 3.0710 data: 0.0072 max mem: 33300 Epoch: [13] [ 800/4276] eta: 2:57:02 lr: 3.4883668248870984e-05 loss: 0.1446 
(0.1474) time: 3.0721 data: 0.0072 max mem: 33300 Epoch: [13] [ 810/4276] eta: 2:56:33 lr: 3.488092990046188e-05 loss: 0.1405 (0.1475) time: 3.1028 data: 0.0073 max mem: 33300 Epoch: [13] [ 820/4276] eta: 2:56:04 lr: 3.487819152816639e-05 loss: 0.1402 (0.1472) time: 3.0931 data: 0.0072 max mem: 33300 Epoch: [13] [ 830/4276] eta: 2:55:35 lr: 3.48754531319822e-05 loss: 0.1310 (0.1473) time: 3.0894 data: 0.0070 max mem: 33300 Epoch: [13] [ 840/4276] eta: 2:55:05 lr: 3.487271471190703e-05 loss: 0.1376 (0.1473) time: 3.0890 data: 0.0072 max mem: 33300 Epoch: [13] [ 850/4276] eta: 2:54:38 lr: 3.486997626793859e-05 loss: 0.1310 (0.1470) time: 3.1083 data: 0.0073 max mem: 33300 Epoch: [13] [ 860/4276] eta: 2:54:10 lr: 3.486723780007458e-05 loss: 0.1335 (0.1471) time: 3.1312 data: 0.0072 max mem: 33300 Epoch: [13] [ 870/4276] eta: 2:53:42 lr: 3.486449930831271e-05 loss: 0.1379 (0.1471) time: 3.1278 data: 0.0070 max mem: 33300 Epoch: [13] [ 880/4276] eta: 2:53:13 lr: 3.4861760792650684e-05 loss: 0.1374 (0.1472) time: 3.1190 data: 0.0071 max mem: 33300 Epoch: [13] [ 890/4276] eta: 2:52:44 lr: 3.4859022253086214e-05 loss: 0.1594 (0.1473) time: 3.1028 data: 0.0073 max mem: 33300 Epoch: [13] [ 900/4276] eta: 2:52:13 lr: 3.485628368961699e-05 loss: 0.1552 (0.1473) time: 3.0795 data: 0.0073 max mem: 33300 Epoch: [13] [ 910/4276] eta: 2:51:42 lr: 3.485354510224073e-05 loss: 0.1484 (0.1473) time: 3.0577 data: 0.0072 max mem: 33300 Epoch: [13] [ 920/4276] eta: 2:51:13 lr: 3.485080649095513e-05 loss: 0.1484 (0.1474) time: 3.0654 data: 0.0075 max mem: 33300 Epoch: [13] [ 930/4276] eta: 2:50:43 lr: 3.48480678557579e-05 loss: 0.1469 (0.1474) time: 3.0874 data: 0.0074 max mem: 33300 Epoch: [13] [ 940/4276] eta: 2:50:12 lr: 3.484532919664674e-05 loss: 0.1378 (0.1473) time: 3.0635 data: 0.0071 max mem: 33300 Epoch: [13] [ 950/4276] eta: 2:49:44 lr: 3.484259051361935e-05 loss: 0.1430 (0.1475) time: 3.0897 data: 0.0071 max mem: 33300 Epoch: [13] [ 960/4276] eta: 2:49:15 lr: 
3.483985180667344e-05 loss: 0.1475 (0.1476) time: 3.1250 data: 0.0072 max mem: 33300 Epoch: [13] [ 970/4276] eta: 2:48:47 lr: 3.4837113075806706e-05 loss: 0.1441 (0.1476) time: 3.1201 data: 0.0073 max mem: 33300 Epoch: [13] [ 980/4276] eta: 2:48:17 lr: 3.483437432101685e-05 loss: 0.1438 (0.1476) time: 3.1143 data: 0.0072 max mem: 33300 Epoch: [13] [ 990/4276] eta: 2:47:47 lr: 3.483163554230157e-05 loss: 0.1438 (0.1475) time: 3.0874 data: 0.0072 max mem: 33300 Epoch: [13] [1000/4276] eta: 2:47:16 lr: 3.4828896739658575e-05 loss: 0.1239 (0.1475) time: 3.0650 data: 0.0071 max mem: 33300 Epoch: [13] [1010/4276] eta: 2:46:45 lr: 3.4826157913085564e-05 loss: 0.1403 (0.1475) time: 3.0484 data: 0.0070 max mem: 33300 Epoch: [13] [1020/4276] eta: 2:46:15 lr: 3.4823419062580225e-05 loss: 0.1403 (0.1474) time: 3.0684 data: 0.0071 max mem: 33300 Epoch: [13] [1030/4276] eta: 2:45:44 lr: 3.482068018814028e-05 loss: 0.1323 (0.1474) time: 3.0804 data: 0.0072 max mem: 33300 Epoch: [13] [1040/4276] eta: 2:45:14 lr: 3.48179412897634e-05 loss: 0.1357 (0.1473) time: 3.0644 data: 0.0072 max mem: 33300 Epoch: [13] [1050/4276] eta: 2:44:43 lr: 3.48152023674473e-05 loss: 0.1368 (0.1474) time: 3.0683 data: 0.0072 max mem: 33300 Epoch: [13] [1060/4276] eta: 2:44:14 lr: 3.4812463421189686e-05 loss: 0.1525 (0.1475) time: 3.0857 data: 0.0071 max mem: 33300 Epoch: [13] [1070/4276] eta: 2:43:45 lr: 3.480972445098824e-05 loss: 0.1510 (0.1476) time: 3.1117 data: 0.0073 max mem: 33300 Epoch: [13] [1080/4276] eta: 2:43:14 lr: 3.480698545684066e-05 loss: 0.1510 (0.1475) time: 3.0888 data: 0.0072 max mem: 33300 Epoch: [13] [1090/4276] eta: 2:42:44 lr: 3.480424643874465e-05 loss: 0.1526 (0.1476) time: 3.0741 data: 0.0070 max mem: 33300 Epoch: [13] [1100/4276] eta: 2:42:14 lr: 3.480150739669791e-05 loss: 0.1452 (0.1476) time: 3.0954 data: 0.0071 max mem: 33300 Epoch: [13] [1110/4276] eta: 2:41:45 lr: 3.4798768330698124e-05 loss: 0.1393 (0.1476) time: 3.0946 data: 0.0071 max mem: 33300 Epoch: [13] 
[1120/4276] eta: 2:41:15 lr: 3.4796029240743e-05 loss: 0.1358 (0.1476) time: 3.0925 data: 0.0072 max mem: 33300 Epoch: [13] [1130/4276] eta: 2:40:44 lr: 3.4793290126830225e-05 loss: 0.1300 (0.1474) time: 3.0775 data: 0.0072 max mem: 33300 Epoch: [13] [1140/4276] eta: 2:40:14 lr: 3.4790550988957494e-05 loss: 0.1251 (0.1472) time: 3.0838 data: 0.0071 max mem: 33300 Epoch: [13] [1150/4276] eta: 2:39:45 lr: 3.47878118271225e-05 loss: 0.1288 (0.1472) time: 3.1062 data: 0.0071 max mem: 33300 Epoch: [13] [1160/4276] eta: 2:39:15 lr: 3.478507264132295e-05 loss: 0.1447 (0.1472) time: 3.1061 data: 0.0072 max mem: 33300 Epoch: [13] [1170/4276] eta: 2:38:45 lr: 3.478233343155652e-05 loss: 0.1508 (0.1473) time: 3.0974 data: 0.0075 max mem: 33300 Epoch: [13] [1180/4276] eta: 2:38:16 lr: 3.477959419782091e-05 loss: 0.1504 (0.1472) time: 3.1142 data: 0.0075 max mem: 33300 Epoch: [13] [1190/4276] eta: 2:37:46 lr: 3.4776854940113814e-05 loss: 0.1389 (0.1471) time: 3.1157 data: 0.0072 max mem: 33300 Epoch: [13] [1200/4276] eta: 2:37:16 lr: 3.4774115658432934e-05 loss: 0.1239 (0.1471) time: 3.0973 data: 0.0077 max mem: 33300 Epoch: [13] [1210/4276] eta: 2:36:46 lr: 3.4771376352775946e-05 loss: 0.1276 (0.1470) time: 3.0962 data: 0.0078 max mem: 33300 Epoch: [13] [1220/4276] eta: 2:36:17 lr: 3.4768637023140546e-05 loss: 0.1395 (0.1470) time: 3.1036 data: 0.0075 max mem: 33300 Epoch: [13] [1230/4276] eta: 2:35:47 lr: 3.4765897669524424e-05 loss: 0.1405 (0.1470) time: 3.1072 data: 0.0077 max mem: 33300 Epoch: [13] [1240/4276] eta: 2:35:17 lr: 3.476315829192527e-05 loss: 0.1459 (0.1470) time: 3.1006 data: 0.0082 max mem: 33300 Epoch: [13] [1250/4276] eta: 2:34:47 lr: 3.476041889034078e-05 loss: 0.1490 (0.1471) time: 3.1038 data: 0.0083 max mem: 33300 Epoch: [13] [1260/4276] eta: 2:34:18 lr: 3.475767946476864e-05 loss: 0.1433 (0.1469) time: 3.1230 data: 0.0082 max mem: 33300 Epoch: [13] [1270/4276] eta: 2:33:49 lr: 3.475494001520654e-05 loss: 0.1447 (0.1470) time: 3.1232 data: 0.0079 max 
mem: 33300 Epoch: [13] [1280/4276] eta: 2:33:19 lr: 3.475220054165218e-05 loss: 0.1534 (0.1470) time: 3.1138 data: 0.0080 max mem: 33300 Epoch: [13] [1290/4276] eta: 2:32:49 lr: 3.474946104410322e-05 loss: 0.1504 (0.1471) time: 3.1086 data: 0.0081 max mem: 33300 Epoch: [13] [1300/4276] eta: 2:32:19 lr: 3.4746721522557374e-05 loss: 0.1256 (0.1470) time: 3.1025 data: 0.0080 max mem: 33300 Epoch: [13] [1310/4276] eta: 2:31:49 lr: 3.4743981977012314e-05 loss: 0.1245 (0.1470) time: 3.1038 data: 0.0082 max mem: 33300 Epoch: [13] [1320/4276] eta: 2:31:19 lr: 3.4741242407465735e-05 loss: 0.1521 (0.1471) time: 3.1067 data: 0.0085 max mem: 33300 Epoch: [13] [1330/4276] eta: 2:30:49 lr: 3.473850281391533e-05 loss: 0.1521 (0.1471) time: 3.1083 data: 0.0087 max mem: 33300 Epoch: [13] [1340/4276] eta: 2:30:19 lr: 3.473576319635877e-05 loss: 0.1442 (0.1470) time: 3.1068 data: 0.0087 max mem: 33300 Epoch: [13] [1350/4276] eta: 2:29:50 lr: 3.473302355479375e-05 loss: 0.1330 (0.1470) time: 3.1160 data: 0.0083 max mem: 33300 Epoch: [13] [1360/4276] eta: 2:29:20 lr: 3.4730283889217955e-05 loss: 0.1420 (0.1470) time: 3.1215 data: 0.0084 max mem: 33300 Epoch: [13] [1370/4276] eta: 2:28:50 lr: 3.472754419962907e-05 loss: 0.1423 (0.1470) time: 3.1242 data: 0.0081 max mem: 33300 Epoch: [13] [1380/4276] eta: 2:28:20 lr: 3.472480448602477e-05 loss: 0.1569 (0.1472) time: 3.1120 data: 0.0076 max mem: 33300 Epoch: [13] [1390/4276] eta: 2:27:50 lr: 3.472206474840275e-05 loss: 0.1611 (0.1473) time: 3.1005 data: 0.0078 max mem: 33300 Epoch: [13] [1400/4276] eta: 2:27:20 lr: 3.471932498676069e-05 loss: 0.1567 (0.1473) time: 3.1111 data: 0.0080 max mem: 33300 Epoch: [13] [1410/4276] eta: 2:26:50 lr: 3.4716585201096285e-05 loss: 0.1432 (0.1473) time: 3.1153 data: 0.0083 max mem: 33300 Epoch: [13] [1420/4276] eta: 2:26:20 lr: 3.47138453914072e-05 loss: 0.1418 (0.1473) time: 3.1087 data: 0.0086 max mem: 33300 Epoch: [13] [1430/4276] eta: 2:25:50 lr: 3.471110555769113e-05 loss: 0.1328 (0.1472) time: 
3.1002 data: 0.0084 max mem: 33300 Epoch: [13] [1440/4276] eta: 2:25:20 lr: 3.470836569994575e-05 loss: 0.1449 (0.1472) time: 3.0985 data: 0.0089 max mem: 33300 Epoch: [13] [1450/4276] eta: 2:24:49 lr: 3.470562581816874e-05 loss: 0.1479 (0.1472) time: 3.1007 data: 0.0087 max mem: 33300 Epoch: [13] [1460/4276] eta: 2:24:19 lr: 3.470288591235778e-05 loss: 0.1345 (0.1471) time: 3.1022 data: 0.0079 max mem: 33300 Epoch: [13] [1470/4276] eta: 2:23:49 lr: 3.470014598251057e-05 loss: 0.1322 (0.1470) time: 3.1021 data: 0.0078 max mem: 33300 Epoch: [13] [1480/4276] eta: 2:23:20 lr: 3.469740602862476e-05 loss: 0.1211 (0.1469) time: 3.1277 data: 0.0078 max mem: 33300 Epoch: [13] [1490/4276] eta: 2:22:49 lr: 3.469466605069806e-05 loss: 0.1210 (0.1468) time: 3.1285 data: 0.0080 max mem: 33300 Epoch: [13] [1500/4276] eta: 2:22:19 lr: 3.469192604872813e-05 loss: 0.1354 (0.1467) time: 3.1065 data: 0.0080 max mem: 33300 Epoch: [13] [1510/4276] eta: 2:21:49 lr: 3.4689186022712655e-05 loss: 0.1316 (0.1466) time: 3.1168 data: 0.0078 max mem: 33300 Epoch: [13] [1520/4276] eta: 2:21:20 lr: 3.468644597264931e-05 loss: 0.1197 (0.1466) time: 3.1287 data: 0.0078 max mem: 33300 Epoch: [13] [1530/4276] eta: 2:20:49 lr: 3.468370589853577e-05 loss: 0.1205 (0.1464) time: 3.1220 data: 0.0080 max mem: 33300 Epoch: [13] [1540/4276] eta: 2:20:20 lr: 3.468096580036973e-05 loss: 0.1363 (0.1465) time: 3.1186 data: 0.0077 max mem: 33300 Epoch: [13] [1550/4276] eta: 2:19:50 lr: 3.467822567814885e-05 loss: 0.1576 (0.1466) time: 3.1323 data: 0.0075 max mem: 33300 Epoch: [13] [1560/4276] eta: 2:19:20 lr: 3.467548553187081e-05 loss: 0.1493 (0.1465) time: 3.1315 data: 0.0079 max mem: 33300 Epoch: [13] [1570/4276] eta: 2:18:49 lr: 3.46727453615333e-05 loss: 0.1484 (0.1466) time: 3.1116 data: 0.0082 max mem: 33300 Epoch: [13] [1580/4276] eta: 2:18:19 lr: 3.467000516713397e-05 loss: 0.1281 (0.1465) time: 3.1003 data: 0.0084 max mem: 33300 Epoch: [13] [1590/4276] eta: 2:17:49 lr: 3.466726494867053e-05 loss: 
0.1373 (0.1464) time: 3.1060 data: 0.0084 max mem: 33300 Epoch: [13] [1600/4276] eta: 2:17:18 lr: 3.466452470614062e-05 loss: 0.1401 (0.1464) time: 3.1085 data: 0.0083 max mem: 33300 Epoch: [13] [1610/4276] eta: 2:16:48 lr: 3.466178443954194e-05 loss: 0.1352 (0.1463) time: 3.0927 data: 0.0080 max mem: 33300 Epoch: [13] [1620/4276] eta: 2:16:18 lr: 3.465904414887215e-05 loss: 0.1291 (0.1462) time: 3.0994 data: 0.0081 max mem: 33300 Epoch: [13] [1630/4276] eta: 2:15:48 lr: 3.4656303834128926e-05 loss: 0.1331 (0.1462) time: 3.1437 data: 0.0082 max mem: 33300 Epoch: [13] [1640/4276] eta: 2:15:18 lr: 3.465356349530996e-05 loss: 0.1300 (0.1462) time: 3.1454 data: 0.0083 max mem: 33300 Epoch: [13] [1650/4276] eta: 2:14:48 lr: 3.465082313241289e-05 loss: 0.1413 (0.1461) time: 3.1323 data: 0.0084 max mem: 33300 Epoch: [13] [1660/4276] eta: 2:14:18 lr: 3.4648082745435427e-05 loss: 0.1424 (0.1462) time: 3.1252 data: 0.0079 max mem: 33300 Epoch: [13] [1670/4276] eta: 2:13:48 lr: 3.4645342334375214e-05 loss: 0.1384 (0.1461) time: 3.1086 data: 0.0079 max mem: 33300 Epoch: [13] [1680/4276] eta: 2:13:17 lr: 3.464260189922993e-05 loss: 0.1384 (0.1461) time: 3.1118 data: 0.0083 max mem: 33300 Epoch: [13] [1690/4276] eta: 2:12:47 lr: 3.4639861439997255e-05 loss: 0.1411 (0.1461) time: 3.1102 data: 0.0085 max mem: 33300 Epoch: [13] [1700/4276] eta: 2:12:16 lr: 3.463712095667485e-05 loss: 0.1436 (0.1461) time: 3.1071 data: 0.0090 max mem: 33300 Epoch: [13] [1710/4276] eta: 2:11:46 lr: 3.4634380449260394e-05 loss: 0.1600 (0.1462) time: 3.1045 data: 0.0089 max mem: 33300 Epoch: [13] [1720/4276] eta: 2:11:16 lr: 3.4631639917751554e-05 loss: 0.1501 (0.1462) time: 3.1056 data: 0.0088 max mem: 33300 Epoch: [13] [1730/4276] eta: 2:10:46 lr: 3.4628899362145996e-05 loss: 0.1501 (0.1463) time: 3.1250 data: 0.0089 max mem: 33300 Epoch: [13] [1740/4276] eta: 2:10:16 lr: 3.4626158782441395e-05 loss: 0.1504 (0.1463) time: 3.1364 data: 0.0082 max mem: 33300 Epoch: [13] [1750/4276] eta: 2:09:45 lr: 
3.462341817863541e-05 loss: 0.1451 (0.1462) time: 3.1293 data: 0.0083 max mem: 33300 Epoch: [13] [1760/4276] eta: 2:09:15 lr: 3.4620677550725714e-05 loss: 0.1349 (0.1461) time: 3.1074 data: 0.0088 max mem: 33300 Epoch: [13] [1770/4276] eta: 2:08:44 lr: 3.4617936898709985e-05 loss: 0.1335 (0.1461) time: 3.1016 data: 0.0087 max mem: 33300 Epoch: [13] [1780/4276] eta: 2:08:13 lr: 3.4615196222585875e-05 loss: 0.1335 (0.1460) time: 3.0949 data: 0.0082 max mem: 33300 Epoch: [13] [1790/4276] eta: 2:07:42 lr: 3.461245552235106e-05 loss: 0.1378 (0.1459) time: 3.0704 data: 0.0080 max mem: 33300 Epoch: [13] [1800/4276] eta: 2:07:11 lr: 3.4609714798003204e-05 loss: 0.1381 (0.1459) time: 3.0572 data: 0.0082 max mem: 33300 Epoch: [13] [1810/4276] eta: 2:06:40 lr: 3.460697404953998e-05 loss: 0.1451 (0.1460) time: 3.0659 data: 0.0085 max mem: 33300 Epoch: [13] [1820/4276] eta: 2:06:09 lr: 3.4604233276959045e-05 loss: 0.1552 (0.1460) time: 3.0836 data: 0.0083 max mem: 33300 Epoch: [13] [1830/4276] eta: 2:05:38 lr: 3.460149248025806e-05 loss: 0.1385 (0.1460) time: 3.0737 data: 0.0081 max mem: 33300 Epoch: [13] [1840/4276] eta: 2:05:07 lr: 3.45987516594347e-05 loss: 0.1335 (0.1459) time: 3.0662 data: 0.0080 max mem: 33300 Epoch: [13] [1850/4276] eta: 2:04:36 lr: 3.459601081448663e-05 loss: 0.1364 (0.1460) time: 3.0608 data: 0.0077 max mem: 33300 Epoch: [13] [1860/4276] eta: 2:04:06 lr: 3.45932699454115e-05 loss: 0.1412 (0.1459) time: 3.0776 data: 0.0081 max mem: 33300 Epoch: [13] [1870/4276] eta: 2:03:37 lr: 3.459052905220699e-05 loss: 0.1426 (0.1461) time: 3.1751 data: 0.0083 max mem: 33300 Epoch: [13] [1880/4276] eta: 2:03:06 lr: 3.458778813487076e-05 loss: 0.1426 (0.1461) time: 3.1573 data: 0.0080 max mem: 33300 Epoch: [13] [1890/4276] eta: 2:02:35 lr: 3.458504719340047e-05 loss: 0.1480 (0.1461) time: 3.0668 data: 0.0082 max mem: 33300 Epoch: [13] [1900/4276] eta: 2:02:04 lr: 3.4582306227793776e-05 loss: 0.1478 (0.1461) time: 3.0689 data: 0.0083 max mem: 33300 Epoch: [13] 
[1910/4276] eta: 2:01:33 lr: 3.4579565238048345e-05 loss: 0.1466 (0.1462) time: 3.0921 data: 0.0084 max mem: 33300 Epoch: [13] [1920/4276] eta: 2:01:03 lr: 3.457682422416184e-05 loss: 0.1485 (0.1462) time: 3.0981 data: 0.0085 max mem: 33300 Epoch: [13] [1930/4276] eta: 2:00:32 lr: 3.457408318613192e-05 loss: 0.1507 (0.1461) time: 3.0894 data: 0.0082 max mem: 33300 Epoch: [13] [1940/4276] eta: 2:00:01 lr: 3.4571342123956245e-05 loss: 0.1413 (0.1462) time: 3.0763 data: 0.0081 max mem: 33300 Epoch: [13] [1950/4276] eta: 1:59:30 lr: 3.4568601037632484e-05 loss: 0.1468 (0.1462) time: 3.0817 data: 0.0080 max mem: 33300 Epoch: [13] [1960/4276] eta: 1:59:00 lr: 3.456585992715828e-05 loss: 0.1523 (0.1462) time: 3.1379 data: 0.0081 max mem: 33300 Epoch: [13] [1970/4276] eta: 1:58:29 lr: 3.4563118792531314e-05 loss: 0.1210 (0.1461) time: 3.1182 data: 0.0083 max mem: 33300 Epoch: [13] [1980/4276] eta: 1:57:58 lr: 3.456037763374922e-05 loss: 0.1365 (0.1461) time: 3.0628 data: 0.0084 max mem: 33300 Epoch: [13] [1990/4276] eta: 1:57:28 lr: 3.455763645080967e-05 loss: 0.1383 (0.1461) time: 3.0747 data: 0.0084 max mem: 33300 Epoch: [13] [2000/4276] eta: 1:56:57 lr: 3.455489524371032e-05 loss: 0.1514 (0.1461) time: 3.1025 data: 0.0085 max mem: 33300 Epoch: [13] [2010/4276] eta: 1:56:26 lr: 3.455215401244883e-05 loss: 0.1495 (0.1461) time: 3.0943 data: 0.0089 max mem: 33300 Epoch: [13] [2020/4276] eta: 1:55:55 lr: 3.454941275702285e-05 loss: 0.1409 (0.1461) time: 3.0579 data: 0.0087 max mem: 33300 Epoch: [13] [2030/4276] eta: 1:55:24 lr: 3.454667147743005e-05 loss: 0.1298 (0.1460) time: 3.0669 data: 0.0085 max mem: 33300 Epoch: [13] [2040/4276] eta: 1:54:54 lr: 3.4543930173668075e-05 loss: 0.1310 (0.1460) time: 3.0995 data: 0.0089 max mem: 33300 Epoch: [13] [2050/4276] eta: 1:54:23 lr: 3.454118884573459e-05 loss: 0.1437 (0.1460) time: 3.1069 data: 0.0094 max mem: 33300 Epoch: [13] [2060/4276] eta: 1:53:52 lr: 3.453844749362724e-05 loss: 0.1361 (0.1460) time: 3.0814 data: 0.0092 max 
mem: 33300 Epoch: [13] [2070/4276] eta: 1:53:21 lr: 3.4535706117343674e-05 loss: 0.1305 (0.1459) time: 3.0579 data: 0.0085 max mem: 33300 Epoch: [13] [2080/4276] eta: 1:52:50 lr: 3.453296471688156e-05 loss: 0.1413 (0.1460) time: 3.0556 data: 0.0086 max mem: 33300 Epoch: [13] [2090/4276] eta: 1:52:19 lr: 3.453022329223855e-05 loss: 0.1595 (0.1460) time: 3.0863 data: 0.0090 max mem: 33300 Epoch: [13] [2100/4276] eta: 1:51:48 lr: 3.4527481843412294e-05 loss: 0.1585 (0.1460) time: 3.1087 data: 0.0092 max mem: 33300 Epoch: [13] [2110/4276] eta: 1:51:18 lr: 3.452474037040046e-05 loss: 0.1369 (0.1459) time: 3.0889 data: 0.0093 max mem: 33300 Epoch: [13] [2120/4276] eta: 1:50:47 lr: 3.4521998873200675e-05 loss: 0.1145 (0.1458) time: 3.0989 data: 0.0092 max mem: 33300 Epoch: [13] [2130/4276] eta: 1:50:16 lr: 3.451925735181061e-05 loss: 0.1145 (0.1457) time: 3.1078 data: 0.0090 max mem: 33300 Epoch: [13] [2140/4276] eta: 1:49:46 lr: 3.451651580622791e-05 loss: 0.1332 (0.1457) time: 3.1118 data: 0.0091 max mem: 33300 Epoch: [13] [2150/4276] eta: 1:49:15 lr: 3.451377423645022e-05 loss: 0.1324 (0.1456) time: 3.1041 data: 0.0091 max mem: 33300 Epoch: [13] [2160/4276] eta: 1:48:44 lr: 3.4511032642475205e-05 loss: 0.1322 (0.1456) time: 3.0772 data: 0.0091 max mem: 33300 Epoch: [13] [2170/4276] eta: 1:48:13 lr: 3.4508291024300506e-05 loss: 0.1360 (0.1457) time: 3.0556 data: 0.0087 max mem: 33300 Epoch: [13] [2180/4276] eta: 1:47:42 lr: 3.450554938192377e-05 loss: 0.1530 (0.1456) time: 3.0766 data: 0.0085 max mem: 33300 Epoch: [13] [2190/4276] eta: 1:47:12 lr: 3.4502807715342665e-05 loss: 0.1482 (0.1457) time: 3.1061 data: 0.0082 max mem: 33300 Epoch: [13] [2200/4276] eta: 1:46:41 lr: 3.450006602455482e-05 loss: 0.1423 (0.1457) time: 3.0722 data: 0.0081 max mem: 33300 Epoch: [13] [2210/4276] eta: 1:46:09 lr: 3.44973243095579e-05 loss: 0.1485 (0.1458) time: 3.0443 data: 0.0090 max mem: 33300 Epoch: [13] [2220/4276] eta: 1:45:39 lr: 3.449458257034954e-05 loss: 0.1522 (0.1458) time: 
3.0689 data: 0.0095 max mem: 33300 Epoch: [13] [2230/4276] eta: 1:45:08 lr: 3.449184080692739e-05 loss: 0.1404 (0.1457) time: 3.1054 data: 0.0090 max mem: 33300 Epoch: [13] [2240/4276] eta: 1:44:37 lr: 3.44890990192891e-05 loss: 0.1277 (0.1456) time: 3.1072 data: 0.0087 max mem: 33300 Epoch: [13] [2250/4276] eta: 1:44:06 lr: 3.448635720743232e-05 loss: 0.1223 (0.1456) time: 3.0787 data: 0.0090 max mem: 33300 Epoch: [13] [2260/4276] eta: 1:43:35 lr: 3.44836153713547e-05 loss: 0.1314 (0.1456) time: 3.0613 data: 0.0086 max mem: 33300 Epoch: [13] [2270/4276] eta: 1:43:05 lr: 3.4480873511053876e-05 loss: 0.1251 (0.1456) time: 3.0801 data: 0.0087 max mem: 33300 Epoch: [13] [2280/4276] eta: 1:42:34 lr: 3.44781316265275e-05 loss: 0.1377 (0.1456) time: 3.1053 data: 0.0092 max mem: 33300 Epoch: [13] [2290/4276] eta: 1:42:03 lr: 3.447538971777321e-05 loss: 0.1377 (0.1456) time: 3.0846 data: 0.0091 max mem: 33300 Epoch: [13] [2300/4276] eta: 1:41:33 lr: 3.447264778478866e-05 loss: 0.1350 (0.1457) time: 3.0955 data: 0.0094 max mem: 33300 Epoch: [13] [2310/4276] eta: 1:41:02 lr: 3.4469905827571485e-05 loss: 0.1395 (0.1457) time: 3.1252 data: 0.0093 max mem: 33300 Epoch: [13] [2320/4276] eta: 1:40:31 lr: 3.446716384611934e-05 loss: 0.1425 (0.1457) time: 3.1160 data: 0.0086 max mem: 33300 Epoch: [13] [2330/4276] eta: 1:40:00 lr: 3.446442184042986e-05 loss: 0.1476 (0.1457) time: 3.0741 data: 0.0087 max mem: 33300 Epoch: [13] [2340/4276] eta: 1:39:29 lr: 3.44616798105007e-05 loss: 0.1517 (0.1457) time: 3.0188 data: 0.0088 max mem: 33300 Epoch: [13] [2350/4276] eta: 1:38:58 lr: 3.4458937756329486e-05 loss: 0.1397 (0.1457) time: 3.0321 data: 0.0084 max mem: 33300 Epoch: [13] [2360/4276] eta: 1:38:27 lr: 3.445619567791387e-05 loss: 0.1305 (0.1457) time: 3.0686 data: 0.0082 max mem: 33300 Epoch: [13] [2370/4276] eta: 1:37:56 lr: 3.445345357525149e-05 loss: 0.1413 (0.1457) time: 3.0957 data: 0.0078 max mem: 33300 Epoch: [13] [2380/4276] eta: 1:37:25 lr: 3.445071144833999e-05 loss: 0.1378 
(0.1457) time: 3.0986 data: 0.0079 max mem: 33300 Epoch: [13] [2390/4276] eta: 1:36:54 lr: 3.444796929717701e-05 loss: 0.1341 (0.1456) time: 3.0706 data: 0.0081 max mem: 33300 Epoch: [13] [2400/4276] eta: 1:36:24 lr: 3.4445227121760194e-05 loss: 0.1341 (0.1457) time: 3.0770 data: 0.0078 max mem: 33300 Epoch: [13] [2410/4276] eta: 1:35:53 lr: 3.444248492208718e-05 loss: 0.1392 (0.1457) time: 3.1049 data: 0.0077 max mem: 33300 Epoch: [13] [2420/4276] eta: 1:35:22 lr: 3.443974269815561e-05 loss: 0.1334 (0.1456) time: 3.1097 data: 0.0082 max mem: 33300 Epoch: [13] [2430/4276] eta: 1:34:51 lr: 3.443700044996311e-05 loss: 0.1430 (0.1457) time: 3.0904 data: 0.0084 max mem: 33300 Epoch: [13] [2440/4276] eta: 1:34:20 lr: 3.4434258177507336e-05 loss: 0.1441 (0.1457) time: 3.0676 data: 0.0082 max mem: 33300 Epoch: [13] [2450/4276] eta: 1:33:49 lr: 3.443151588078592e-05 loss: 0.1358 (0.1457) time: 3.0607 data: 0.0079 max mem: 33300 Epoch: [13] [2460/4276] eta: 1:33:19 lr: 3.442877355979649e-05 loss: 0.1498 (0.1457) time: 3.0874 data: 0.0076 max mem: 33300 Epoch: [13] [2470/4276] eta: 1:32:48 lr: 3.44260312145367e-05 loss: 0.1399 (0.1457) time: 3.1086 data: 0.0076 max mem: 33300 Epoch: [13] [2480/4276] eta: 1:32:17 lr: 3.442328884500417e-05 loss: 0.1423 (0.1458) time: 3.0797 data: 0.0080 max mem: 33300 Epoch: [13] [2490/4276] eta: 1:31:46 lr: 3.442054645119655e-05 loss: 0.1468 (0.1458) time: 3.0626 data: 0.0082 max mem: 33300 Epoch: [13] [2500/4276] eta: 1:31:16 lr: 3.441780403311148e-05 loss: 0.1356 (0.1458) time: 3.1021 data: 0.0083 max mem: 33300 Epoch: [13] [2510/4276] eta: 1:30:45 lr: 3.441506159074658e-05 loss: 0.1447 (0.1458) time: 3.1263 data: 0.0084 max mem: 33300 Epoch: [13] [2520/4276] eta: 1:30:14 lr: 3.44123191240995e-05 loss: 0.1301 (0.1458) time: 3.0922 data: 0.0081 max mem: 33300 Epoch: [13] [2530/4276] eta: 1:29:43 lr: 3.440957663316786e-05 loss: 0.1145 (0.1457) time: 3.0626 data: 0.0078 max mem: 33300 Epoch: [13] [2540/4276] eta: 1:29:12 lr: 
3.440683411794931e-05 loss: 0.1184 (0.1456) time: 3.0643 data: 0.0078 max mem: 33300 Epoch: [13] [2550/4276] eta: 1:28:41 lr: 3.440409157844146e-05 loss: 0.1239 (0.1455) time: 3.0841 data: 0.0078 max mem: 33300 Epoch: [13] [2560/4276] eta: 1:28:11 lr: 3.4401349014641974e-05 loss: 0.1216 (0.1455) time: 3.0955 data: 0.0079 max mem: 33300 Epoch: [13] [2570/4276] eta: 1:27:39 lr: 3.439860642654846e-05 loss: 0.1216 (0.1454) time: 3.0516 data: 0.0081 max mem: 33300 Epoch: [13] [2580/4276] eta: 1:27:08 lr: 3.439586381415857e-05 loss: 0.1316 (0.1454) time: 3.0353 data: 0.0084 max mem: 33300 Epoch: [13] [2590/4276] eta: 1:26:38 lr: 3.439312117746993e-05 loss: 0.1362 (0.1454) time: 3.0859 data: 0.0079 max mem: 33300 Epoch: [13] [2600/4276] eta: 1:26:07 lr: 3.439037851648015e-05 loss: 0.1397 (0.1454) time: 3.1227 data: 0.0078 max mem: 33300 Epoch: [13] [2610/4276] eta: 1:25:36 lr: 3.43876358311869e-05 loss: 0.1340 (0.1453) time: 3.1059 data: 0.0081 max mem: 33300 Epoch: [13] [2620/4276] eta: 1:25:05 lr: 3.438489312158778e-05 loss: 0.1375 (0.1453) time: 3.0658 data: 0.0081 max mem: 33300 Epoch: [13] [2630/4276] eta: 1:24:34 lr: 3.4382150387680435e-05 loss: 0.1364 (0.1453) time: 3.0365 data: 0.0076 max mem: 33300 Epoch: [13] [2640/4276] eta: 1:24:04 lr: 3.437940762946248e-05 loss: 0.1364 (0.1453) time: 3.1100 data: 0.0079 max mem: 33300 Epoch: [13] [2650/4276] eta: 1:23:33 lr: 3.4376664846931575e-05 loss: 0.1434 (0.1453) time: 3.1583 data: 0.0085 max mem: 33300 Epoch: [13] [2660/4276] eta: 1:23:02 lr: 3.437392204008532e-05 loss: 0.1482 (0.1453) time: 3.0877 data: 0.0084 max mem: 33300 Epoch: [13] [2670/4276] eta: 1:22:31 lr: 3.437117920892136e-05 loss: 0.1482 (0.1454) time: 3.0587 data: 0.0083 max mem: 33300 Epoch: [13] [2680/4276] eta: 1:22:00 lr: 3.4368436353437304e-05 loss: 0.1474 (0.1454) time: 3.0588 data: 0.0080 max mem: 33300 Epoch: [13] [2690/4276] eta: 1:21:30 lr: 3.43656934736308e-05 loss: 0.1422 (0.1454) time: 3.1147 data: 0.0079 max mem: 33300 Epoch: [13] 
[2700/4276] eta: 1:20:59 lr: 3.436295056949947e-05 loss: 0.1306 (0.1453) time: 3.1321 data: 0.0080 max mem: 33300 Epoch: [13] [2710/4276] eta: 1:20:28 lr: 3.4360207641040934e-05 loss: 0.1347 (0.1453) time: 3.0809 data: 0.0078 max mem: 33300 Epoch: [13] [2720/4276] eta: 1:19:57 lr: 3.435746468825282e-05 loss: 0.1347 (0.1453) time: 3.0648 data: 0.0082 max mem: 33300 Epoch: [13] [2730/4276] eta: 1:19:27 lr: 3.435472171113277e-05 loss: 0.1282 (0.1453) time: 3.0625 data: 0.0085 max mem: 33300 Epoch: [13] [2740/4276] eta: 1:18:56 lr: 3.435197870967838e-05 loss: 0.1476 (0.1453) time: 3.0853 data: 0.0086 max mem: 33300 Epoch: [13] [2750/4276] eta: 1:18:25 lr: 3.43492356838873e-05 loss: 0.1528 (0.1453) time: 3.0997 data: 0.0084 max mem: 33300 Epoch: [13] [2760/4276] eta: 1:17:54 lr: 3.434649263375715e-05 loss: 0.1378 (0.1453) time: 3.0703 data: 0.0080 max mem: 33300 Epoch: [13] [2770/4276] eta: 1:17:23 lr: 3.4343749559285544e-05 loss: 0.1376 (0.1453) time: 3.0535 data: 0.0084 max mem: 33300 Epoch: [13] [2780/4276] eta: 1:16:52 lr: 3.434100646047011e-05 loss: 0.1374 (0.1453) time: 3.0969 data: 0.0084 max mem: 33300 Epoch: [13] [2790/4276] eta: 1:16:22 lr: 3.433826333730847e-05 loss: 0.1397 (0.1453) time: 3.1333 data: 0.0081 max mem: 33300 Epoch: [13] [2800/4276] eta: 1:15:51 lr: 3.433552018979826e-05 loss: 0.1410 (0.1453) time: 3.0988 data: 0.0080 max mem: 33300 Epoch: [13] [2810/4276] eta: 1:15:20 lr: 3.4332777017937096e-05 loss: 0.1208 (0.1452) time: 3.0642 data: 0.0081 max mem: 33300 Epoch: [13] [2820/4276] eta: 1:14:49 lr: 3.433003382172259e-05 loss: 0.1171 (0.1451) time: 3.0520 data: 0.0079 max mem: 33300 Epoch: [13] [2830/4276] eta: 1:14:18 lr: 3.432729060115237e-05 loss: 0.1258 (0.1451) time: 3.0769 data: 0.0082 max mem: 33300 Epoch: [13] [2840/4276] eta: 1:13:48 lr: 3.432454735622406e-05 loss: 0.1444 (0.1451) time: 3.1026 data: 0.0092 max mem: 33300 Epoch: [13] [2850/4276] eta: 1:13:17 lr: 3.432180408693528e-05 loss: 0.1508 (0.1452) time: 3.0738 data: 0.0087 max mem: 
33300 Epoch: [13] [2860/4276] eta: 1:12:46 lr: 3.431906079328364e-05 loss: 0.1441 (0.1451) time: 3.0536 data: 0.0080 max mem: 33300 Epoch: [13] [2870/4276] eta: 1:12:15 lr: 3.431631747526677e-05 loss: 0.1294 (0.1451) time: 3.0678 data: 0.0080 max mem: 33300 Epoch: [13] [2880/4276] eta: 1:11:44 lr: 3.431357413288229e-05 loss: 0.1369 (0.1451) time: 3.1215 data: 0.0084 max mem: 33300 Epoch: [13] [2890/4276] eta: 1:11:13 lr: 3.4310830766127815e-05 loss: 0.1369 (0.1451) time: 3.1158 data: 0.0085 max mem: 33300 Epoch: [13] [2900/4276] eta: 1:10:42 lr: 3.430808737500097e-05 loss: 0.1240 (0.1450) time: 3.0599 data: 0.0082 max mem: 33300 Epoch: [13] [2910/4276] eta: 1:10:11 lr: 3.430534395949936e-05 loss: 0.1236 (0.1450) time: 3.0523 data: 0.0081 max mem: 33300 Epoch: [13] [2920/4276] eta: 1:09:41 lr: 3.4302600519620614e-05 loss: 0.1320 (0.1449) time: 3.0717 data: 0.0088 max mem: 33300 Epoch: [13] [2930/4276] eta: 1:09:10 lr: 3.429985705536235e-05 loss: 0.1227 (0.1449) time: 3.0956 data: 0.0094 max mem: 33300 Epoch: [13] [2940/4276] eta: 1:08:39 lr: 3.429711356672217e-05 loss: 0.1227 (0.1449) time: 3.0578 data: 0.0089 max mem: 33300 Epoch: [13] [2950/4276] eta: 1:08:08 lr: 3.42943700536977e-05 loss: 0.1401 (0.1449) time: 3.0304 data: 0.0082 max mem: 33300 Epoch: [13] [2960/4276] eta: 1:07:37 lr: 3.4291626516286564e-05 loss: 0.1241 (0.1449) time: 3.0799 data: 0.0082 max mem: 33300 Epoch: [13] [2970/4276] eta: 1:07:06 lr: 3.428888295448637e-05 loss: 0.1408 (0.1449) time: 3.1323 data: 0.0086 max mem: 33300 Epoch: [13] [2980/4276] eta: 1:06:36 lr: 3.428613936829473e-05 loss: 0.1431 (0.1449) time: 3.1904 data: 0.0086 max mem: 33300 Epoch: [13] [2990/4276] eta: 1:06:06 lr: 3.4283395757709255e-05 loss: 0.1270 (0.1448) time: 3.1826 data: 0.0087 max mem: 33300 Epoch: [13] [3000/4276] eta: 1:05:35 lr: 3.4280652122727567e-05 loss: 0.1260 (0.1448) time: 3.1190 data: 0.0086 max mem: 33300 Epoch: [13] [3010/4276] eta: 1:05:04 lr: 3.427790846334728e-05 loss: 0.1414 (0.1448) time: 3.0804 
data: 0.0088 max mem: 33300 Epoch: [13] [3020/4276] eta: 1:04:33 lr: 3.4275164779566e-05 loss: 0.1414 (0.1448) time: 3.0945 data: 0.0095 max mem: 33300 Epoch: [13] [3030/4276] eta: 1:04:03 lr: 3.427242107138134e-05 loss: 0.1321 (0.1448) time: 3.1355 data: 0.0098 max mem: 33300 Epoch: [13] [3040/4276] eta: 1:03:32 lr: 3.426967733879093e-05 loss: 0.1401 (0.1448) time: 3.1269 data: 0.0093 max mem: 33300 Epoch: [13] [3050/4276] eta: 1:03:01 lr: 3.426693358179235e-05 loss: 0.1401 (0.1448) time: 3.1168 data: 0.0085 max mem: 33300 Epoch: [13] [3060/4276] eta: 1:02:31 lr: 3.426418980038325e-05 loss: 0.1203 (0.1447) time: 3.1210 data: 0.0082 max mem: 33300 Epoch: [13] [3070/4276] eta: 1:02:00 lr: 3.426144599456121e-05 loss: 0.1329 (0.1447) time: 3.1415 data: 0.0083 max mem: 33300 Epoch: [13] [3080/4276] eta: 1:01:29 lr: 3.4258702164323846e-05 loss: 0.1329 (0.1447) time: 3.1365 data: 0.0081 max mem: 33300 Epoch: [13] [3090/4276] eta: 1:00:58 lr: 3.425595830966877e-05 loss: 0.1199 (0.1446) time: 3.0975 data: 0.0087 max mem: 33300 Epoch: [13] [3100/4276] eta: 1:00:28 lr: 3.42532144305936e-05 loss: 0.1233 (0.1446) time: 3.0959 data: 0.0088 max mem: 33300 Epoch: [13] [3110/4276] eta: 0:59:57 lr: 3.425047052709594e-05 loss: 0.1233 (0.1445) time: 3.1108 data: 0.0080 max mem: 33300 Epoch: [13] [3120/4276] eta: 0:59:26 lr: 3.42477265991734e-05 loss: 0.1218 (0.1445) time: 3.1123 data: 0.0082 max mem: 33300 Epoch: [13] [3130/4276] eta: 0:58:55 lr: 3.4244982646823584e-05 loss: 0.1294 (0.1444) time: 3.1047 data: 0.0086 max mem: 33300 Epoch: [13] [3140/4276] eta: 0:58:24 lr: 3.42422386700441e-05 loss: 0.1321 (0.1444) time: 3.0966 data: 0.0082 max mem: 33300 Epoch: [13] [3150/4276] eta: 0:57:54 lr: 3.423949466883255e-05 loss: 0.1428 (0.1444) time: 3.1031 data: 0.0078 max mem: 33300 Epoch: [13] [3160/4276] eta: 0:57:23 lr: 3.4236750643186554e-05 loss: 0.1484 (0.1444) time: 3.1463 data: 0.0083 max mem: 33300 Epoch: [13] [3170/4276] eta: 0:56:52 lr: 3.423400659310371e-05 loss: 0.1402 
(0.1445) time: 3.1440 data: 0.0089 max mem: 33300 Epoch: [13] [3180/4276] eta: 0:56:22 lr: 3.423126251858163e-05 loss: 0.1462 (0.1445) time: 3.1129 data: 0.0085 max mem: 33300 Epoch: [13] [3190/4276] eta: 0:55:51 lr: 3.4228518419617915e-05 loss: 0.1462 (0.1445) time: 3.0800 data: 0.0084 max mem: 33300 Epoch: [13] [3200/4276] eta: 0:55:20 lr: 3.422577429621017e-05 loss: 0.1306 (0.1444) time: 3.0741 data: 0.0086 max mem: 33300 Epoch: [13] [3210/4276] eta: 0:54:49 lr: 3.4223030148356e-05 loss: 0.1358 (0.1445) time: 3.1062 data: 0.0086 max mem: 33300 Epoch: [13] [3220/4276] eta: 0:54:18 lr: 3.4220285976053006e-05 loss: 0.1435 (0.1445) time: 3.1065 data: 0.0082 max mem: 33300 Epoch: [13] [3230/4276] eta: 0:53:47 lr: 3.42175417792988e-05 loss: 0.1352 (0.1445) time: 3.1046 data: 0.0076 max mem: 33300 Epoch: [13] [3240/4276] eta: 0:53:17 lr: 3.4214797558090976e-05 loss: 0.1593 (0.1446) time: 3.0976 data: 0.0078 max mem: 33300 Epoch: [13] [3250/4276] eta: 0:52:46 lr: 3.421205331242714e-05 loss: 0.1492 (0.1446) time: 3.1076 data: 0.0081 max mem: 33300 Epoch: [13] [3260/4276] eta: 0:52:15 lr: 3.42093090423049e-05 loss: 0.1484 (0.1446) time: 3.1383 data: 0.0081 max mem: 33300 Epoch: [13] [3270/4276] eta: 0:51:44 lr: 3.420656474772185e-05 loss: 0.1429 (0.1446) time: 3.1314 data: 0.0076 max mem: 33300 Epoch: [13] [3280/4276] eta: 0:51:14 lr: 3.4203820428675596e-05 loss: 0.1429 (0.1447) time: 3.1105 data: 0.0076 max mem: 33300 Epoch: [13] [3290/4276] eta: 0:50:43 lr: 3.420107608516375e-05 loss: 0.1546 (0.1447) time: 3.1050 data: 0.0080 max mem: 33300 Epoch: [13] [3300/4276] eta: 0:50:12 lr: 3.4198331717183886e-05 loss: 0.1462 (0.1447) time: 3.1025 data: 0.0080 max mem: 33300 Epoch: [13] [3310/4276] eta: 0:49:41 lr: 3.419558732473362e-05 loss: 0.1462 (0.1448) time: 3.0889 data: 0.0083 max mem: 33300 Epoch: [13] [3320/4276] eta: 0:49:11 lr: 3.419284290781055e-05 loss: 0.1405 (0.1448) time: 3.1512 data: 0.0091 max mem: 33300 Epoch: [13] [3330/4276] eta: 0:48:40 lr: 
3.419009846641227e-05 loss: 0.1365 (0.1448) time: 3.1648 data: 0.0091 max mem: 33300 Epoch: [13] [3340/4276] eta: 0:48:09 lr: 3.4187354000536394e-05 loss: 0.1376 (0.1448) time: 3.0995 data: 0.0084 max mem: 33300 Epoch: [13] [3350/4276] eta: 0:47:38 lr: 3.418460951018051e-05 loss: 0.1365 (0.1447) time: 3.1375 data: 0.0084 max mem: 33300 Epoch: [13] [3360/4276] eta: 0:47:08 lr: 3.418186499534221e-05 loss: 0.1363 (0.1447) time: 3.1422 data: 0.0087 max mem: 33300 Epoch: [13] [3370/4276] eta: 0:46:37 lr: 3.417912045601911e-05 loss: 0.1474 (0.1448) time: 3.0899 data: 0.0086 max mem: 33300 Epoch: [13] [3380/4276] eta: 0:46:06 lr: 3.4176375892208784e-05 loss: 0.1422 (0.1448) time: 3.0831 data: 0.0088 max mem: 33300 Epoch: [13] [3390/4276] eta: 0:45:35 lr: 3.417363130390884e-05 loss: 0.1422 (0.1448) time: 3.1045 data: 0.0088 max mem: 33300 Epoch: [13] [3400/4276] eta: 0:45:04 lr: 3.417088669111688e-05 loss: 0.1549 (0.1449) time: 3.1103 data: 0.0088 max mem: 33300 Epoch: [13] [3410/4276] eta: 0:44:33 lr: 3.416814205383048e-05 loss: 0.1485 (0.1449) time: 3.1122 data: 0.0091 max mem: 33300 Epoch: [13] [3420/4276] eta: 0:44:03 lr: 3.4165397392047257e-05 loss: 0.1485 (0.1449) time: 3.1106 data: 0.0085 max mem: 33300 Epoch: [13] [3430/4276] eta: 0:43:32 lr: 3.41626527057648e-05 loss: 0.1492 (0.1450) time: 3.1077 data: 0.0084 max mem: 33300 Epoch: [13] [3440/4276] eta: 0:43:01 lr: 3.415990799498069e-05 loss: 0.1415 (0.1449) time: 3.1259 data: 0.0090 max mem: 33300 Epoch: [13] [3450/4276] eta: 0:42:30 lr: 3.415716325969255e-05 loss: 0.1477 (0.1450) time: 3.1558 data: 0.0091 max mem: 33300 Epoch: [13] [3460/4276] eta: 0:42:00 lr: 3.4154418499897936e-05 loss: 0.1529 (0.1450) time: 3.1654 data: 0.0086 max mem: 33300 Epoch: [13] [3470/4276] eta: 0:41:29 lr: 3.415167371559446e-05 loss: 0.1312 (0.1450) time: 3.1257 data: 0.0086 max mem: 33300 Epoch: [13] [3480/4276] eta: 0:40:58 lr: 3.414892890677972e-05 loss: 0.1305 (0.1450) time: 3.0914 data: 0.0085 max mem: 33300 Epoch: [13] 
[3490/4276] eta: 0:40:27 lr: 3.41461840734513e-05 loss: 0.1441 (0.1450) time: 3.0940 data: 0.0083 max mem: 33300 Epoch: [13] [3500/4276] eta: 0:39:56 lr: 3.41434392156068e-05 loss: 0.1465 (0.1450) time: 3.0934 data: 0.0082 max mem: 33300 Epoch: [13] [3510/4276] eta: 0:39:25 lr: 3.4140694333243795e-05 loss: 0.1282 (0.1449) time: 3.0937 data: 0.0081 max mem: 33300 Epoch: [13] [3520/4276] eta: 0:38:54 lr: 3.4137949426359887e-05 loss: 0.1350 (0.1449) time: 3.0964 data: 0.0084 max mem: 33300 Epoch: [13] [3530/4276] eta: 0:38:23 lr: 3.4135204494952667e-05 loss: 0.1462 (0.1449) time: 3.0976 data: 0.0086 max mem: 33300 Epoch: [13] [3540/4276] eta: 0:37:53 lr: 3.4132459539019716e-05 loss: 0.1314 (0.1449) time: 3.1176 data: 0.0085 max mem: 33300 Epoch: [13] [3550/4276] eta: 0:37:22 lr: 3.4129714558558636e-05 loss: 0.1271 (0.1449) time: 3.1649 data: 0.0088 max mem: 33300 Epoch: [13] [3560/4276] eta: 0:36:51 lr: 3.4126969553567e-05 loss: 0.1417 (0.1449) time: 3.1692 data: 0.0089 max mem: 33300 Epoch: [13] [3570/4276] eta: 0:36:20 lr: 3.412422452404241e-05 loss: 0.1620 (0.1450) time: 3.1132 data: 0.0083 max mem: 33300 Epoch: [13] [3580/4276] eta: 0:35:49 lr: 3.4121479469982456e-05 loss: 0.1370 (0.1450) time: 3.0800 data: 0.0077 max mem: 33300 Epoch: [13] [3590/4276] eta: 0:35:19 lr: 3.4118734391384715e-05 loss: 0.1321 (0.1450) time: 3.0788 data: 0.0078 max mem: 33300 Epoch: [13] [3600/4276] eta: 0:34:48 lr: 3.411598928824678e-05 loss: 0.1421 (0.1450) time: 3.0721 data: 0.0082 max mem: 33300 Epoch: [13] [3610/4276] eta: 0:34:17 lr: 3.411324416056623e-05 loss: 0.1356 (0.1449) time: 3.0531 data: 0.0087 max mem: 33300 Epoch: [13] [3620/4276] eta: 0:33:46 lr: 3.4110499008340654e-05 loss: 0.1323 (0.1449) time: 3.0357 data: 0.0097 max mem: 33300 Epoch: [13] [3630/4276] eta: 0:33:15 lr: 3.4107753831567654e-05 loss: 0.1401 (0.1449) time: 3.0374 data: 0.0099 max mem: 33300 Epoch: [13] [3640/4276] eta: 0:32:44 lr: 3.4105008630244786e-05 loss: 0.1284 (0.1448) time: 3.0588 data: 0.0085 max 
mem: 33300 Epoch: [13] [3650/4276] eta: 0:32:13 lr: 3.410226340436965e-05 loss: 0.1245 (0.1448) time: 3.1206 data: 0.0085 max mem: 33300 Epoch: [13] [3660/4276] eta: 0:31:42 lr: 3.4099518153939846e-05 loss: 0.1309 (0.1448) time: 3.1840 data: 0.0091 max mem: 33300 Epoch: [13] [3670/4276] eta: 0:31:11 lr: 3.4096772878952935e-05 loss: 0.1340 (0.1447) time: 3.1632 data: 0.0084 max mem: 33300 Epoch: [13] [3680/4276] eta: 0:30:41 lr: 3.4094027579406514e-05 loss: 0.1419 (0.1448) time: 3.1252 data: 0.0079 max mem: 33300 Epoch: [13] [3690/4276] eta: 0:30:10 lr: 3.409128225529815e-05 loss: 0.1428 (0.1447) time: 3.1379 data: 0.0081 max mem: 33300 Epoch: [13] [3700/4276] eta: 0:29:39 lr: 3.408853690662544e-05 loss: 0.1399 (0.1447) time: 3.1329 data: 0.0085 max mem: 33300 Epoch: [13] [3710/4276] eta: 0:29:08 lr: 3.408579153338596e-05 loss: 0.1315 (0.1447) time: 3.0977 data: 0.0085 max mem: 33300 Epoch: [13] [3720/4276] eta: 0:28:37 lr: 3.4083046135577284e-05 loss: 0.1243 (0.1446) time: 3.0795 data: 0.0082 max mem: 33300 Epoch: [13] [3730/4276] eta: 0:28:06 lr: 3.408030071319701e-05 loss: 0.1352 (0.1446) time: 3.1080 data: 0.0084 max mem: 33300 Epoch: [13] [3740/4276] eta: 0:27:36 lr: 3.4077555266242714e-05 loss: 0.1321 (0.1446) time: 3.1909 data: 0.0089 max mem: 33300 Epoch: [13] [3750/4276] eta: 0:27:05 lr: 3.407480979471198e-05 loss: 0.1389 (0.1447) time: 3.2737 data: 0.0092 max mem: 33300 Epoch: [13] [3760/4276] eta: 0:26:34 lr: 3.407206429860237e-05 loss: 0.1429 (0.1446) time: 3.2240 data: 0.0091 max mem: 33300 Epoch: [13] [3770/4276] eta: 0:26:03 lr: 3.406931877791147e-05 loss: 0.1249 (0.1446) time: 3.1437 data: 0.0085 max mem: 33300 Epoch: [13] [3780/4276] eta: 0:25:33 lr: 3.406657323263687e-05 loss: 0.1291 (0.1446) time: 3.1477 data: 0.0084 max mem: 33300 Epoch: [13] [3790/4276] eta: 0:25:02 lr: 3.406382766277614e-05 loss: 0.1254 (0.1445) time: 3.1482 data: 0.0089 max mem: 33300 Epoch: [13] [3800/4276] eta: 0:24:31 lr: 3.406108206832686e-05 loss: 0.1324 (0.1445) time: 
3.1434 data: 0.0090 max mem: 33300 Epoch: [13] [3810/4276] eta: 0:24:00 lr: 3.40583364492866e-05 loss: 0.1405 (0.1446) time: 3.1410 data: 0.0090 max mem: 33300 Epoch: [13] [3820/4276] eta: 0:23:29 lr: 3.4055590805652954e-05 loss: 0.1175 (0.1445) time: 3.1092 data: 0.0088 max mem: 33300 Epoch: [13] [3830/4276] eta: 0:22:58 lr: 3.405284513742349e-05 loss: 0.1179 (0.1445) time: 3.0894 data: 0.0085 max mem: 33300 Epoch: [13] [3840/4276] eta: 0:22:28 lr: 3.4050099444595775e-05 loss: 0.1318 (0.1444) time: 3.1844 data: 0.0087 max mem: 33300 Epoch: [13] [3850/4276] eta: 0:21:57 lr: 3.404735372716739e-05 loss: 0.1270 (0.1444) time: 3.2232 data: 0.0088 max mem: 33300 Epoch: [13] [3860/4276] eta: 0:21:26 lr: 3.404460798513592e-05 loss: 0.1363 (0.1444) time: 3.1258 data: 0.0085 max mem: 33300 Epoch: [13] [3870/4276] eta: 0:20:55 lr: 3.4041862218498925e-05 loss: 0.1409 (0.1443) time: 3.0802 data: 0.0084 max mem: 33300 Epoch: [13] [3880/4276] eta: 0:20:24 lr: 3.4039116427253984e-05 loss: 0.1303 (0.1443) time: 3.0840 data: 0.0088 max mem: 33300 Epoch: [13] [3890/4276] eta: 0:19:53 lr: 3.403637061139869e-05 loss: 0.1308 (0.1443) time: 3.0842 data: 0.0089 max mem: 33300 Epoch: [13] [3900/4276] eta: 0:19:22 lr: 3.4033624770930586e-05 loss: 0.1308 (0.1443) time: 3.0784 data: 0.0087 max mem: 33300 Epoch: [13] [3910/4276] eta: 0:18:51 lr: 3.403087890584726e-05 loss: 0.1330 (0.1443) time: 3.0781 data: 0.0086 max mem: 33300 Epoch: [13] [3920/4276] eta: 0:18:20 lr: 3.402813301614629e-05 loss: 0.1277 (0.1443) time: 3.0825 data: 0.0088 max mem: 33300 Epoch: [13] [3930/4276] eta: 0:17:49 lr: 3.402538710182523e-05 loss: 0.1275 (0.1443) time: 3.0881 data: 0.0084 max mem: 33300 Epoch: [13] [3940/4276] eta: 0:17:18 lr: 3.402264116288167e-05 loss: 0.1284 (0.1443) time: 3.1600 data: 0.0084 max mem: 33300 Epoch: [13] [3950/4276] eta: 0:16:48 lr: 3.401989519931317e-05 loss: 0.1284 (0.1442) time: 3.1841 data: 0.0088 max mem: 33300 Epoch: [13] [3960/4276] eta: 0:16:17 lr: 3.401714921111731e-05 loss: 
0.1381 (0.1442) time: 3.1124 data: 0.0085 max mem: 33300 Epoch: [13] [3970/4276] eta: 0:15:46 lr: 3.401440319829166e-05 loss: 0.1501 (0.1442) time: 3.0693 data: 0.0089 max mem: 33300 Epoch: [13] [3980/4276] eta: 0:15:15 lr: 3.401165716083377e-05 loss: 0.1427 (0.1443) time: 3.0697 data: 0.0088 max mem: 33300 Epoch: [13] [3990/4276] eta: 0:14:44 lr: 3.400891109874123e-05 loss: 0.1418 (0.1442) time: 3.0745 data: 0.0080 max mem: 33300 Epoch: [13] [4000/4276] eta: 0:14:13 lr: 3.40061650120116e-05 loss: 0.1198 (0.1442) time: 3.0846 data: 0.0082 max mem: 33300 Epoch: [13] [4010/4276] eta: 0:13:42 lr: 3.400341890064245e-05 loss: 0.1221 (0.1442) time: 3.0884 data: 0.0083 max mem: 33300 Epoch: [13] [4020/4276] eta: 0:13:11 lr: 3.400067276463135e-05 loss: 0.1381 (0.1442) time: 3.0731 data: 0.0081 max mem: 33300 Epoch: [13] [4030/4276] eta: 0:12:40 lr: 3.3997926603975874e-05 loss: 0.1381 (0.1442) time: 3.1048 data: 0.0080 max mem: 33300 Epoch: [13] [4040/4276] eta: 0:12:09 lr: 3.3995180418673574e-05 loss: 0.1391 (0.1443) time: 3.1687 data: 0.0089 max mem: 33300 Epoch: [13] [4050/4276] eta: 0:11:38 lr: 3.3992434208722027e-05 loss: 0.1348 (0.1442) time: 3.1513 data: 0.0087 max mem: 33300 Epoch: [13] [4060/4276] eta: 0:11:07 lr: 3.398968797411879e-05 loss: 0.1324 (0.1443) time: 3.0920 data: 0.0073 max mem: 33300 Epoch: [13] [4070/4276] eta: 0:10:37 lr: 3.398694171486144e-05 loss: 0.1476 (0.1443) time: 3.0793 data: 0.0075 max mem: 33300 Epoch: [13] [4080/4276] eta: 0:10:06 lr: 3.398419543094753e-05 loss: 0.1465 (0.1443) time: 3.0747 data: 0.0083 max mem: 33300 Epoch: [13] [4090/4276] eta: 0:09:35 lr: 3.398144912237464e-05 loss: 0.1521 (0.1443) time: 3.0767 data: 0.0085 max mem: 33300 Epoch: [13] [4100/4276] eta: 0:09:04 lr: 3.397870278914032e-05 loss: 0.1547 (0.1443) time: 3.0813 data: 0.0082 max mem: 33300 Epoch: [13] [4110/4276] eta: 0:08:33 lr: 3.397595643124213e-05 loss: 0.1484 (0.1443) time: 3.0818 data: 0.0080 max mem: 33300 Epoch: [13] [4120/4276] eta: 0:08:02 lr: 
3.3973210048677665e-05 loss: 0.1477 (0.1444) time: 3.0805 data: 0.0082 max mem: 33300 Epoch: [13] [4130/4276] eta: 0:07:31 lr: 3.397046364144445e-05 loss: 0.1456 (0.1444) time: 3.1287 data: 0.0084 max mem: 33300 Epoch: [13] [4140/4276] eta: 0:07:00 lr: 3.396771720954007e-05 loss: 0.1342 (0.1444) time: 3.1680 data: 0.0081 max mem: 33300 Epoch: [13] [4150/4276] eta: 0:06:29 lr: 3.3964970752962073e-05 loss: 0.1334 (0.1444) time: 3.1198 data: 0.0080 max mem: 33300 Epoch: [13] [4160/4276] eta: 0:05:58 lr: 3.396222427170803e-05 loss: 0.1414 (0.1444) time: 3.0880 data: 0.0082 max mem: 33300 Epoch: [13] [4170/4276] eta: 0:05:27 lr: 3.3959477765775494e-05 loss: 0.1548 (0.1444) time: 3.0731 data: 0.0088 max mem: 33300 Epoch: [13] [4180/4276] eta: 0:04:56 lr: 3.395673123516203e-05 loss: 0.1417 (0.1444) time: 3.0387 data: 0.0093 max mem: 33300 Epoch: [13] [4190/4276] eta: 0:04:25 lr: 3.3953984679865205e-05 loss: 0.1417 (0.1445) time: 3.0239 data: 0.0088 max mem: 33300 Epoch: [13] [4200/4276] eta: 0:03:54 lr: 3.395123809988258e-05 loss: 0.1471 (0.1445) time: 3.0495 data: 0.0079 max mem: 33300 Epoch: [13] [4210/4276] eta: 0:03:24 lr: 3.3948491495211684e-05 loss: 0.1574 (0.1445) time: 3.0814 data: 0.0079 max mem: 33300 Epoch: [13] [4220/4276] eta: 0:02:53 lr: 3.394574486585012e-05 loss: 0.1607 (0.1446) time: 3.0813 data: 0.0085 max mem: 33300 Epoch: [13] [4230/4276] eta: 0:02:22 lr: 3.3942998211795416e-05 loss: 0.1588 (0.1446) time: 3.1139 data: 0.0083 max mem: 33300 Epoch: [13] [4240/4276] eta: 0:01:51 lr: 3.3940251533045135e-05 loss: 0.1479 (0.1446) time: 3.1312 data: 0.0081 max mem: 33300 Epoch: [13] [4250/4276] eta: 0:01:20 lr: 3.393750482959684e-05 loss: 0.1484 (0.1447) time: 3.1080 data: 0.0080 max mem: 33300 Epoch: [13] [4260/4276] eta: 0:00:49 lr: 3.393475810144809e-05 loss: 0.1531 (0.1447) time: 3.0974 data: 0.0081 max mem: 33300 Epoch: [13] [4270/4276] eta: 0:00:18 lr: 3.3932011348596426e-05 loss: 0.1595 (0.1447) time: 3.0891 data: 0.0077 max mem: 33300 Epoch: [13] 
Total time: 3:40:22 Test: [ 0/21770] eta: 9:04:20 time: 1.5003 data: 1.4577 max mem: 33300 Test: [ 100/21770] eta: 0:19:06 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 200/21770] eta: 0:16:30 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 300/21770] eta: 0:15:36 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 400/21770] eta: 0:15:09 time: 0.0393 data: 0.0012 max mem: 33300 Test: [ 500/21770] eta: 0:14:50 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 600/21770] eta: 0:14:36 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 700/21770] eta: 0:14:24 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 800/21770] eta: 0:14:15 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 900/21770] eta: 0:14:06 time: 0.0387 data: 0.0010 max mem: 33300 Test: [ 1000/21770] eta: 0:13:58 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 1100/21770] eta: 0:13:52 time: 0.0387 data: 0.0010 max mem: 33300 Test: [ 1200/21770] eta: 0:13:45 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 1300/21770] eta: 0:13:39 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 1400/21770] eta: 0:13:33 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 1500/21770] eta: 0:13:28 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 1600/21770] eta: 0:13:23 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 1700/21770] eta: 0:13:19 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 1800/21770] eta: 0:13:14 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 1900/21770] eta: 0:13:10 time: 0.0397 data: 0.0011 max mem: 33300 Test: [ 2000/21770] eta: 0:13:05 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 2100/21770] eta: 0:13:01 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 2200/21770] eta: 0:12:56 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 2300/21770] eta: 0:12:51 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 2400/21770] eta: 0:12:46 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 2500/21770] eta: 0:12:41 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 2600/21770] eta: 0:12:37 time: 0.0400 data: 0.0011 max mem: 
33300 Test: [ 2700/21770] eta: 0:12:33 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 2800/21770] eta: 0:12:29 time: 0.0392 data: 0.0011 max mem: 33300 Test: [ 2900/21770] eta: 0:12:25 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 3000/21770] eta: 0:12:21 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 3100/21770] eta: 0:12:18 time: 0.0397 data: 0.0011 max mem: 33300 Test: [ 3200/21770] eta: 0:12:14 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 3300/21770] eta: 0:12:10 time: 0.0395 data: 0.0011 max mem: 33300 Test: [ 3400/21770] eta: 0:12:06 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 3500/21770] eta: 0:12:01 time: 0.0394 data: 0.0012 max mem: 33300 Test: [ 3600/21770] eta: 0:11:57 time: 0.0393 data: 0.0012 max mem: 33300 Test: [ 3700/21770] eta: 0:11:53 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 3800/21770] eta: 0:11:49 time: 0.0393 data: 0.0012 max mem: 33300 Test: [ 3900/21770] eta: 0:11:45 time: 0.0392 data: 0.0012 max mem: 33300 Test: [ 4000/21770] eta: 0:11:41 time: 0.0400 data: 0.0012 max mem: 33300 Test: [ 4100/21770] eta: 0:11:38 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 4200/21770] eta: 0:11:34 time: 0.0402 data: 0.0012 max mem: 33300 Test: [ 4300/21770] eta: 0:11:30 time: 0.0401 data: 0.0011 max mem: 33300 Test: [ 4400/21770] eta: 0:11:26 time: 0.0403 data: 0.0011 max mem: 33300 Test: [ 4500/21770] eta: 0:11:23 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 4600/21770] eta: 0:11:19 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 4700/21770] eta: 0:11:15 time: 0.0402 data: 0.0011 max mem: 33300 Test: [ 4800/21770] eta: 0:11:11 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 4900/21770] eta: 0:11:07 time: 0.0401 data: 0.0010 max mem: 33300 Test: [ 5000/21770] eta: 0:11:03 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 5100/21770] eta: 0:11:00 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 5200/21770] eta: 0:10:56 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 5300/21770] eta: 0:10:52 time: 0.0400 data: 0.0011 max mem: 
33300 Test: [ 5400/21770] eta: 0:10:48 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 5500/21770] eta: 0:10:44 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 5600/21770] eta: 0:10:40 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 5700/21770] eta: 0:10:36 time: 0.0400 data: 0.0010 max mem: 33300 Test: [ 5800/21770] eta: 0:10:32 time: 0.0394 data: 0.0010 max mem: 33300 Test: [ 5900/21770] eta: 0:10:28 time: 0.0395 data: 0.0011 max mem: 33300 Test: [ 6000/21770] eta: 0:10:24 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 6100/21770] eta: 0:10:20 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 6200/21770] eta: 0:10:17 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 6300/21770] eta: 0:10:13 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 6400/21770] eta: 0:10:09 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 6500/21770] eta: 0:10:05 time: 0.0399 data: 0.0010 max mem: 33300 Test: [ 6600/21770] eta: 0:10:01 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 6700/21770] eta: 0:09:57 time: 0.0401 data: 0.0011 max mem: 33300 Test: [ 6800/21770] eta: 0:09:53 time: 0.0402 data: 0.0011 max mem: 33300 Test: [ 6900/21770] eta: 0:09:49 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 7000/21770] eta: 0:09:45 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 7100/21770] eta: 0:09:41 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 7200/21770] eta: 0:09:37 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 7300/21770] eta: 0:09:33 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 7400/21770] eta: 0:09:29 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 7500/21770] eta: 0:09:25 time: 0.0394 data: 0.0010 max mem: 33300 Test: [ 7600/21770] eta: 0:09:21 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 7700/21770] eta: 0:09:17 time: 0.0401 data: 0.0010 max mem: 33300 Test: [ 7800/21770] eta: 0:09:13 time: 0.0401 data: 0.0011 max mem: 33300 Test: [ 7900/21770] eta: 0:09:09 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 8000/21770] eta: 0:09:05 time: 0.0395 data: 0.0011 max mem: 
33300 Test: [ 8100/21770] eta: 0:09:01 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 8200/21770] eta: 0:08:57 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 8300/21770] eta: 0:08:53 time: 0.0397 data: 0.0011 max mem: 33300 Test: [ 8400/21770] eta: 0:08:49 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 8500/21770] eta: 0:08:45 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 8600/21770] eta: 0:08:41 time: 0.0390 data: 0.0011 max mem: 33300 Test: [ 8700/21770] eta: 0:08:37 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 8800/21770] eta: 0:08:33 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 8900/21770] eta: 0:08:29 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 9000/21770] eta: 0:08:25 time: 0.0395 data: 0.0011 max mem: 33300 Test: [ 9100/21770] eta: 0:08:21 time: 0.0397 data: 0.0011 max mem: 33300 Test: [ 9200/21770] eta: 0:08:17 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 9300/21770] eta: 0:08:13 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 9400/21770] eta: 0:08:09 time: 0.0398 data: 0.0011 max mem: 33300 Test: [ 9500/21770] eta: 0:08:05 time: 0.0394 data: 0.0010 max mem: 33300 Test: [ 9600/21770] eta: 0:08:02 time: 0.0396 data: 0.0010 max mem: 33300 Test: [ 9700/21770] eta: 0:07:58 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 9800/21770] eta: 0:07:53 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 9900/21770] eta: 0:07:49 time: 0.0391 data: 0.0011 max mem: 33300 Test: [10000/21770] eta: 0:07:45 time: 0.0395 data: 0.0011 max mem: 33300 Test: [10100/21770] eta: 0:07:41 time: 0.0389 data: 0.0011 max mem: 33300 Test: [10200/21770] eta: 0:07:37 time: 0.0387 data: 0.0011 max mem: 33300 Test: [10300/21770] eta: 0:07:33 time: 0.0391 data: 0.0011 max mem: 33300 Test: [10400/21770] eta: 0:07:29 time: 0.0391 data: 0.0011 max mem: 33300 Test: [10500/21770] eta: 0:07:25 time: 0.0397 data: 0.0010 max mem: 33300 Test: [10600/21770] eta: 0:07:21 time: 0.0386 data: 0.0011 max mem: 33300 Test: [10700/21770] eta: 0:07:17 time: 0.0387 data: 0.0011 max mem: 
33300 Test: [10800/21770] eta: 0:07:13 time: 0.0400 data: 0.0011 max mem: 33300 Test: [10900/21770] eta: 0:07:09 time: 0.0392 data: 0.0011 max mem: 33300 Test: [11000/21770] eta: 0:07:05 time: 0.0388 data: 0.0011 max mem: 33300 Test: [11100/21770] eta: 0:07:01 time: 0.0391 data: 0.0010 max mem: 33300 Test: [11200/21770] eta: 0:06:57 time: 0.0401 data: 0.0011 max mem: 33300 Test: [11300/21770] eta: 0:06:53 time: 0.0401 data: 0.0012 max mem: 33300 Test: [11400/21770] eta: 0:06:50 time: 0.0392 data: 0.0012 max mem: 33300 Test: [11500/21770] eta: 0:06:46 time: 0.0393 data: 0.0012 max mem: 33300 Test: [11600/21770] eta: 0:06:42 time: 0.0400 data: 0.0012 max mem: 33300 Test: [11700/21770] eta: 0:06:38 time: 0.0397 data: 0.0011 max mem: 33300 Test: [11800/21770] eta: 0:06:34 time: 0.0398 data: 0.0012 max mem: 33300 Test: [11900/21770] eta: 0:06:30 time: 0.0399 data: 0.0012 max mem: 33300 Test: [12000/21770] eta: 0:06:26 time: 0.0389 data: 0.0011 max mem: 33300 Test: [12100/21770] eta: 0:06:22 time: 0.0392 data: 0.0011 max mem: 33300 Test: [12200/21770] eta: 0:06:18 time: 0.0398 data: 0.0011 max mem: 33300 Test: [12300/21770] eta: 0:06:14 time: 0.0393 data: 0.0011 max mem: 33300 Test: [12400/21770] eta: 0:06:10 time: 0.0397 data: 0.0011 max mem: 33300 Test: [12500/21770] eta: 0:06:06 time: 0.0387 data: 0.0010 max mem: 33300 Test: [12600/21770] eta: 0:06:02 time: 0.0386 data: 0.0010 max mem: 33300 Test: [12700/21770] eta: 0:05:58 time: 0.0390 data: 0.0010 max mem: 33300 Test: [12800/21770] eta: 0:05:54 time: 0.0391 data: 0.0011 max mem: 33300 Test: [12900/21770] eta: 0:05:50 time: 0.0390 data: 0.0010 max mem: 33300 Test: [13000/21770] eta: 0:05:46 time: 0.0397 data: 0.0010 max mem: 33300 Test: [13100/21770] eta: 0:05:42 time: 0.0401 data: 0.0011 max mem: 33300 Test: [13200/21770] eta: 0:05:38 time: 0.0398 data: 0.0010 max mem: 33300 Test: [13300/21770] eta: 0:05:34 time: 0.0400 data: 0.0010 max mem: 33300 Test: [13400/21770] eta: 0:05:30 time: 0.0399 data: 0.0010 max mem: 
33300 Test: [13500/21770] eta: 0:05:27 time: 0.0395 data: 0.0011 max mem: 33300 Test: [13600/21770] eta: 0:05:23 time: 0.0389 data: 0.0011 max mem: 33300 Test: [13700/21770] eta: 0:05:19 time: 0.0386 data: 0.0011 max mem: 33300 Test: [13800/21770] eta: 0:05:15 time: 0.0385 data: 0.0010 max mem: 33300 Test: [13900/21770] eta: 0:05:11 time: 0.0386 data: 0.0011 max mem: 33300 Test: [14000/21770] eta: 0:05:07 time: 0.0386 data: 0.0011 max mem: 33300 Test: [14100/21770] eta: 0:05:03 time: 0.0387 data: 0.0011 max mem: 33300 Test: [14200/21770] eta: 0:04:59 time: 0.0386 data: 0.0011 max mem: 33300 Test: [14300/21770] eta: 0:04:55 time: 0.0387 data: 0.0011 max mem: 33300 Test: [14400/21770] eta: 0:04:51 time: 0.0383 data: 0.0010 max mem: 33300 Test: [14500/21770] eta: 0:04:47 time: 0.0391 data: 0.0011 max mem: 33300 Test: [14600/21770] eta: 0:04:43 time: 0.0391 data: 0.0011 max mem: 33300 Test: [14700/21770] eta: 0:04:39 time: 0.0391 data: 0.0011 max mem: 33300 Test: [14800/21770] eta: 0:04:35 time: 0.0391 data: 0.0011 max mem: 33300 Test: [14900/21770] eta: 0:04:31 time: 0.0398 data: 0.0011 max mem: 33300 Test: [15000/21770] eta: 0:04:27 time: 0.0398 data: 0.0011 max mem: 33300 Test: [15100/21770] eta: 0:04:23 time: 0.0400 data: 0.0011 max mem: 33300 Test: [15200/21770] eta: 0:04:19 time: 0.0398 data: 0.0011 max mem: 33300 Test: [15300/21770] eta: 0:04:15 time: 0.0397 data: 0.0011 max mem: 33300 Test: [15400/21770] eta: 0:04:11 time: 0.0394 data: 0.0010 max mem: 33300 Test: [15500/21770] eta: 0:04:07 time: 0.0396 data: 0.0010 max mem: 33300 Test: [15600/21770] eta: 0:04:03 time: 0.0394 data: 0.0010 max mem: 33300 Test: [15700/21770] eta: 0:03:59 time: 0.0401 data: 0.0011 max mem: 33300 Test: [15800/21770] eta: 0:03:55 time: 0.0392 data: 0.0011 max mem: 33300 Test: [15900/21770] eta: 0:03:51 time: 0.0403 data: 0.0011 max mem: 33300 Test: [16000/21770] eta: 0:03:47 time: 0.0398 data: 0.0011 max mem: 33300 Test: [16100/21770] eta: 0:03:43 time: 0.0394 data: 0.0011 max mem: 
33300 Test: [16200/21770] eta: 0:03:39 time: 0.0391 data: 0.0011 max mem: 33300 Test: [16300/21770] eta: 0:03:35 time: 0.0392 data: 0.0011 max mem: 33300 Test: [16400/21770] eta: 0:03:31 time: 0.0394 data: 0.0011 max mem: 33300 Test: [16500/21770] eta: 0:03:28 time: 0.0391 data: 0.0010 max mem: 33300 Test: [16600/21770] eta: 0:03:24 time: 0.0386 data: 0.0010 max mem: 33300 Test: [16700/21770] eta: 0:03:20 time: 0.0387 data: 0.0011 max mem: 33300 Test: [16800/21770] eta: 0:03:16 time: 0.0392 data: 0.0011 max mem: 33300 Test: [16900/21770] eta: 0:03:12 time: 0.0391 data: 0.0010 max mem: 33300 Test: [17000/21770] eta: 0:03:08 time: 0.0394 data: 0.0010 max mem: 33300 Test: [17100/21770] eta: 0:03:04 time: 0.0386 data: 0.0010 max mem: 33300 Test: [17200/21770] eta: 0:03:00 time: 0.0387 data: 0.0010 max mem: 33300 Test: [17300/21770] eta: 0:02:56 time: 0.0387 data: 0.0010 max mem: 33300 Test: [17400/21770] eta: 0:02:52 time: 0.0388 data: 0.0010 max mem: 33300 Test: [17500/21770] eta: 0:02:48 time: 0.0398 data: 0.0010 max mem: 33300 Test: [17600/21770] eta: 0:02:44 time: 0.0400 data: 0.0011 max mem: 33300 Test: [17700/21770] eta: 0:02:40 time: 0.0391 data: 0.0010 max mem: 33300 Test: [17800/21770] eta: 0:02:36 time: 0.0390 data: 0.0011 max mem: 33300 Test: [17900/21770] eta: 0:02:32 time: 0.0396 data: 0.0011 max mem: 33300 Test: [18000/21770] eta: 0:02:28 time: 0.0391 data: 0.0010 max mem: 33300 Test: [18100/21770] eta: 0:02:24 time: 0.0391 data: 0.0011 max mem: 33300 Test: [18200/21770] eta: 0:02:20 time: 0.0389 data: 0.0011 max mem: 33300 Test: [18300/21770] eta: 0:02:16 time: 0.0394 data: 0.0011 max mem: 33300 Test: [18400/21770] eta: 0:02:12 time: 0.0396 data: 0.0010 max mem: 33300 Test: [18500/21770] eta: 0:02:08 time: 0.0395 data: 0.0011 max mem: 33300 Test: [18600/21770] eta: 0:02:05 time: 0.0386 data: 0.0011 max mem: 33300 Test: [18700/21770] eta: 0:02:01 time: 0.0394 data: 0.0011 max mem: 33300 Test: [18800/21770] eta: 0:01:57 time: 0.0389 data: 0.0010 max mem: 
33300 Test: [18900/21770] eta: 0:01:53 time: 0.0395 data: 0.0012 max mem: 33300 Test: [19000/21770] eta: 0:01:49 time: 0.0395 data: 0.0011 max mem: 33300 Test: [19100/21770] eta: 0:01:45 time: 0.0391 data: 0.0012 max mem: 33300 Test: [19200/21770] eta: 0:01:41 time: 0.0394 data: 0.0011 max mem: 33300 Test: [19300/21770] eta: 0:01:37 time: 0.0392 data: 0.0011 max mem: 33300 Test: [19400/21770] eta: 0:01:33 time: 0.0392 data: 0.0011 max mem: 33300 Test: [19500/21770] eta: 0:01:29 time: 0.0392 data: 0.0012 max mem: 33300 Test: [19600/21770] eta: 0:01:25 time: 0.0389 data: 0.0011 max mem: 33300 Test: [19700/21770] eta: 0:01:21 time: 0.0388 data: 0.0011 max mem: 33300 Test: [19800/21770] eta: 0:01:17 time: 0.0387 data: 0.0011 max mem: 33300 Test: [19900/21770] eta: 0:01:13 time: 0.0388 data: 0.0011 max mem: 33300 Test: [20000/21770] eta: 0:01:09 time: 0.0392 data: 0.0012 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0386 data: 0.0011 max mem: 33300 Test: [20200/21770] eta: 0:01:01 time: 0.0393 data: 0.0011 max mem: 33300 Test: [20300/21770] eta: 0:00:57 time: 0.0387 data: 0.0011 max mem: 33300 Test: [20400/21770] eta: 0:00:53 time: 0.0386 data: 0.0011 max mem: 33300 Test: [20500/21770] eta: 0:00:50 time: 0.0387 data: 0.0011 max mem: 33300 Test: [20600/21770] eta: 0:00:46 time: 0.0388 data: 0.0011 max mem: 33300 Test: [20700/21770] eta: 0:00:42 time: 0.0388 data: 0.0012 max mem: 33300 Test: [20800/21770] eta: 0:00:38 time: 0.0387 data: 0.0011 max mem: 33300 Test: [20900/21770] eta: 0:00:34 time: 0.0387 data: 0.0011 max mem: 33300 Test: [21000/21770] eta: 0:00:30 time: 0.0391 data: 0.0011 max mem: 33300 Test: [21100/21770] eta: 0:00:26 time: 0.0390 data: 0.0011 max mem: 33300 Test: [21200/21770] eta: 0:00:22 time: 0.0396 data: 0.0011 max mem: 33300 Test: [21300/21770] eta: 0:00:18 time: 0.0397 data: 0.0011 max mem: 33300 Test: [21400/21770] eta: 0:00:14 time: 0.0393 data: 0.0011 max mem: 33300 Test: [21500/21770] eta: 0:00:10 time: 0.0399 data: 0.0011 max mem: 
33300
Test: [21600/21770] eta: 0:00:06 time: 0.0408 data: 0.0011 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0409 data: 0.0011 max mem: 33300
Test: Total time: 0:14:18
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [14] [ 0/4276] eta: 6:23:14 lr: 3.393036328502703e-05 loss: 0.1173 (0.1173) time: 5.3776 data: 2.1847 max mem: 33300
Epoch: [14] [ 10/4276] eta: 3:54:45 lr: 3.3927616492645645e-05 loss: 0.1500 (0.1533) time: 3.3018 data: 0.2064 max mem: 33300
Epoch: [14] [ 20/4276] eta: 3:47:39 lr: 3.3924869675554996e-05 loss: 0.1500 (0.1507) time: 3.1010 data: 0.0084 max mem: 33300
Epoch: [14] [ 30/4276] eta: 3:44:38 lr: 3.392212283375265e-05 loss: 0.1502 (0.1509) time: 3.1043 data: 0.0084 max mem: 33300
Epoch: [14] [ 40/4276] eta: 3:42:46 lr: 3.391937596723615e-05 loss: 0.1634 (0.1512) time: 3.0988 data: 0.0087 max mem: 33300
Epoch: [14] [ 50/4276] eta: 3:41:17 lr: 3.391662907600306e-05 loss: 0.1367 (0.1483) time: 3.0917 data: 0.0085 max mem: 33300
Epoch: [14] [ 60/4276] eta: 3:40:19 lr: 3.391388216005092e-05 loss: 0.1314 (0.1465) time: 3.0945 data: 0.0079 max mem: 33300
Epoch: [14] [ 70/4276] eta: 3:40:46 lr: 3.39111352193773e-05 loss: 0.1395 (0.1457) time: 3.1685 data: 0.0079 max mem: 33300
Epoch: [14] [ 80/4276] eta: 3:40:03 lr: 3.3908388253979735e-05 loss: 0.1427 (0.1481) time: 3.1809 data: 0.0076 max mem: 33300
Epoch: [14] [ 90/4276] eta: 3:39:04 lr: 3.3905641263855786e-05 loss: 0.1343 (0.1465) time: 3.1072 data: 0.0081 max mem: 33300
Epoch: [14] [ 100/4276] eta: 3:37:52 lr: 3.3902894249003e-05 loss: 0.1337 (0.1479) time: 3.0638 data: 0.0084 max mem: 33300
Epoch: [14] [ 110/4276] eta: 3:36:54 lr: 3.3900147209418935e-05 loss: 0.1511 (0.1483) time: 3.0498 data: 0.0081 max mem: 33300
Epoch: [14] [ 120/4276] eta: 3:36:03 lr: 
3.389740014510114e-05 loss: 0.1453 (0.1477) time: 3.0639 data: 0.0080 max mem: 33300 Epoch: [14] [ 130/4276] eta: 3:35:27 lr: 3.389465305604716e-05 loss: 0.1403 (0.1476) time: 3.0855 data: 0.0083 max mem: 33300 Epoch: [14] [ 140/4276] eta: 3:34:45 lr: 3.389190594225455e-05 loss: 0.1348 (0.1468) time: 3.0926 data: 0.0087 max mem: 33300 Epoch: [14] [ 150/4276] eta: 3:34:07 lr: 3.388915880372086e-05 loss: 0.1348 (0.1471) time: 3.0856 data: 0.0085 max mem: 33300 Epoch: [14] [ 160/4276] eta: 3:33:42 lr: 3.388641164044363e-05 loss: 0.1513 (0.1466) time: 3.1145 data: 0.0081 max mem: 33300 Epoch: [14] [ 170/4276] eta: 3:33:39 lr: 3.388366445242041e-05 loss: 0.1431 (0.1468) time: 3.1866 data: 0.0083 max mem: 33300 Epoch: [14] [ 180/4276] eta: 3:33:07 lr: 3.388091723964876e-05 loss: 0.1508 (0.1475) time: 3.1738 data: 0.0084 max mem: 33300 Epoch: [14] [ 190/4276] eta: 3:32:25 lr: 3.387817000212622e-05 loss: 0.1631 (0.1474) time: 3.0946 data: 0.0078 max mem: 33300 Epoch: [14] [ 200/4276] eta: 3:31:36 lr: 3.3875422739850324e-05 loss: 0.1395 (0.1473) time: 3.0527 data: 0.0077 max mem: 33300 Epoch: [14] [ 210/4276] eta: 3:30:53 lr: 3.387267545281864e-05 loss: 0.1395 (0.1475) time: 3.0430 data: 0.0085 max mem: 33300 Epoch: [14] [ 220/4276] eta: 3:30:18 lr: 3.3869928141028704e-05 loss: 0.1395 (0.1469) time: 3.0728 data: 0.0086 max mem: 33300 Epoch: [14] [ 230/4276] eta: 3:29:38 lr: 3.386718080447806e-05 loss: 0.1319 (0.1466) time: 3.0745 data: 0.0079 max mem: 33300 Epoch: [14] [ 240/4276] eta: 3:28:59 lr: 3.3864433443164254e-05 loss: 0.1341 (0.1467) time: 3.0589 data: 0.0081 max mem: 33300 Epoch: [14] [ 250/4276] eta: 3:28:26 lr: 3.3861686057084826e-05 loss: 0.1482 (0.1474) time: 3.0790 data: 0.0083 max mem: 33300 Epoch: [14] [ 260/4276] eta: 3:27:59 lr: 3.385893864623733e-05 loss: 0.1591 (0.1474) time: 3.1150 data: 0.0084 max mem: 33300 Epoch: [14] [ 270/4276] eta: 3:27:40 lr: 3.385619121061931e-05 loss: 0.1220 (0.1473) time: 3.1590 data: 0.0089 max mem: 33300 Epoch: [14] [ 
280/4276] eta: 3:27:07 lr: 3.3853443750228304e-05 loss: 0.1263 (0.1471) time: 3.1422 data: 0.0087 max mem: 33300 Epoch: [14] [ 290/4276] eta: 3:26:33 lr: 3.385069626506185e-05 loss: 0.1263 (0.1468) time: 3.0943 data: 0.0083 max mem: 33300 Epoch: [14] [ 300/4276] eta: 3:26:00 lr: 3.3847948755117495e-05 loss: 0.1281 (0.1466) time: 3.0921 data: 0.0083 max mem: 33300 Epoch: [14] [ 310/4276] eta: 3:25:28 lr: 3.384520122039278e-05 loss: 0.1305 (0.1463) time: 3.0965 data: 0.0085 max mem: 33300 Epoch: [14] [ 320/4276] eta: 3:24:55 lr: 3.384245366088525e-05 loss: 0.1366 (0.1463) time: 3.0970 data: 0.0084 max mem: 33300 Epoch: [14] [ 330/4276] eta: 3:24:20 lr: 3.3839706076592436e-05 loss: 0.1367 (0.1462) time: 3.0862 data: 0.0080 max mem: 33300 Epoch: [14] [ 340/4276] eta: 3:23:46 lr: 3.383695846751189e-05 loss: 0.1319 (0.1460) time: 3.0800 data: 0.0079 max mem: 33300 Epoch: [14] [ 350/4276] eta: 3:23:14 lr: 3.383421083364114e-05 loss: 0.1176 (0.1458) time: 3.0882 data: 0.0084 max mem: 33300 Epoch: [14] [ 360/4276] eta: 3:22:53 lr: 3.383146317497775e-05 loss: 0.1510 (0.1465) time: 3.1476 data: 0.0086 max mem: 33300 Epoch: [14] [ 370/4276] eta: 3:22:22 lr: 3.382871549151923e-05 loss: 0.1362 (0.1461) time: 3.1535 data: 0.0084 max mem: 33300 Epoch: [14] [ 380/4276] eta: 3:21:45 lr: 3.382596778326314e-05 loss: 0.1317 (0.1465) time: 3.0797 data: 0.0085 max mem: 33300 Epoch: [14] [ 390/4276] eta: 3:21:10 lr: 3.382322005020699e-05 loss: 0.1389 (0.1467) time: 3.0623 data: 0.0087 max mem: 33300 Epoch: [14] [ 400/4276] eta: 3:20:36 lr: 3.382047229234834e-05 loss: 0.1495 (0.1466) time: 3.0749 data: 0.0088 max mem: 33300 Epoch: [14] [ 410/4276] eta: 3:20:03 lr: 3.381772450968473e-05 loss: 0.1416 (0.1465) time: 3.0785 data: 0.0084 max mem: 33300 Epoch: [14] [ 420/4276] eta: 3:19:24 lr: 3.3814976702213686e-05 loss: 0.1322 (0.1464) time: 3.0480 data: 0.0077 max mem: 33300 Epoch: [14] [ 430/4276] eta: 3:18:49 lr: 3.3812228869932745e-05 loss: 0.1367 (0.1464) time: 3.0416 data: 0.0078 max 
mem: 33300 Epoch: [14] [ 440/4276] eta: 3:18:16 lr: 3.380948101283945e-05 loss: 0.1312 (0.1459) time: 3.0703 data: 0.0084 max mem: 33300 Epoch: [14] [ 450/4276] eta: 3:17:45 lr: 3.380673313093133e-05 loss: 0.1220 (0.1458) time: 3.0871 data: 0.0088 max mem: 33300 Epoch: [14] [ 460/4276] eta: 3:17:23 lr: 3.380398522420592e-05 loss: 0.1220 (0.1453) time: 3.1531 data: 0.0089 max mem: 33300 Epoch: [14] [ 470/4276] eta: 3:16:53 lr: 3.380123729266075e-05 loss: 0.1337 (0.1449) time: 3.1644 data: 0.0089 max mem: 33300 Epoch: [14] [ 480/4276] eta: 3:16:20 lr: 3.379848933629337e-05 loss: 0.1283 (0.1446) time: 3.0985 data: 0.0084 max mem: 33300 Epoch: [14] [ 490/4276] eta: 3:15:46 lr: 3.3795741355101295e-05 loss: 0.1166 (0.1442) time: 3.0687 data: 0.0077 max mem: 33300 Epoch: [14] [ 500/4276] eta: 3:15:13 lr: 3.3792993349082065e-05 loss: 0.1131 (0.1440) time: 3.0687 data: 0.0075 max mem: 33300 Epoch: [14] [ 510/4276] eta: 3:14:39 lr: 3.3790245318233216e-05 loss: 0.1125 (0.1437) time: 3.0750 data: 0.0073 max mem: 33300 Epoch: [14] [ 520/4276] eta: 3:14:06 lr: 3.3787497262552275e-05 loss: 0.1153 (0.1435) time: 3.0676 data: 0.0077 max mem: 33300 Epoch: [14] [ 530/4276] eta: 3:13:31 lr: 3.378474918203678e-05 loss: 0.1355 (0.1435) time: 3.0586 data: 0.0081 max mem: 33300 Epoch: [14] [ 540/4276] eta: 3:12:55 lr: 3.3782001076684256e-05 loss: 0.1355 (0.1433) time: 3.0401 data: 0.0081 max mem: 33300 Epoch: [14] [ 550/4276] eta: 3:12:21 lr: 3.377925294649223e-05 loss: 0.1315 (0.1436) time: 3.0388 data: 0.0081 max mem: 33300 Epoch: [14] [ 560/4276] eta: 3:11:57 lr: 3.3776504791458237e-05 loss: 0.1416 (0.1435) time: 3.1295 data: 0.0085 max mem: 33300 Epoch: [14] [ 570/4276] eta: 3:11:25 lr: 3.37737566115798e-05 loss: 0.1404 (0.1435) time: 3.1425 data: 0.0085 max mem: 33300 Epoch: [14] [ 580/4276] eta: 3:10:49 lr: 3.377100840685446e-05 loss: 0.1255 (0.1435) time: 3.0519 data: 0.0083 max mem: 33300 Epoch: [14] [ 590/4276] eta: 3:10:16 lr: 3.376826017727975e-05 loss: 0.1226 (0.1431) time: 
3.0449 data: 0.0077 max mem: 33300 Epoch: [14] [ 600/4276] eta: 3:09:45 lr: 3.376551192285318e-05 loss: 0.1263 (0.1429) time: 3.0763 data: 0.0075 max mem: 33300 Epoch: [14] [ 610/4276] eta: 3:09:12 lr: 3.376276364357229e-05 loss: 0.1362 (0.1428) time: 3.0805 data: 0.0080 max mem: 33300 Epoch: [14] [ 620/4276] eta: 3:08:40 lr: 3.3760015339434606e-05 loss: 0.1307 (0.1427) time: 3.0741 data: 0.0082 max mem: 33300 Epoch: [14] [ 630/4276] eta: 3:08:08 lr: 3.375726701043764e-05 loss: 0.1258 (0.1430) time: 3.0739 data: 0.0077 max mem: 33300 Epoch: [14] [ 640/4276] eta: 3:07:34 lr: 3.3754518656578944e-05 loss: 0.1413 (0.1429) time: 3.0584 data: 0.0077 max mem: 33300 Epoch: [14] [ 650/4276] eta: 3:07:05 lr: 3.375177027785602e-05 loss: 0.1413 (0.1432) time: 3.0893 data: 0.0086 max mem: 33300 Epoch: [14] [ 660/4276] eta: 3:06:38 lr: 3.374902187426641e-05 loss: 0.1527 (0.1433) time: 3.1555 data: 0.0085 max mem: 33300 Epoch: [14] [ 670/4276] eta: 3:06:09 lr: 3.374627344580763e-05 loss: 0.1481 (0.1433) time: 3.1489 data: 0.0083 max mem: 33300 Epoch: [14] [ 680/4276] eta: 3:05:37 lr: 3.3743524992477204e-05 loss: 0.1333 (0.1433) time: 3.0992 data: 0.0084 max mem: 33300 Epoch: [14] [ 690/4276] eta: 3:05:04 lr: 3.374077651427267e-05 loss: 0.1394 (0.1432) time: 3.0743 data: 0.0084 max mem: 33300 Epoch: [14] [ 700/4276] eta: 3:04:32 lr: 3.373802801119153e-05 loss: 0.1407 (0.1431) time: 3.0671 data: 0.0082 max mem: 33300 Epoch: [14] [ 710/4276] eta: 3:03:58 lr: 3.373527948323132e-05 loss: 0.1457 (0.1432) time: 3.0539 data: 0.0085 max mem: 33300 Epoch: [14] [ 720/4276] eta: 3:03:26 lr: 3.373253093038956e-05 loss: 0.1460 (0.1431) time: 3.0559 data: 0.0086 max mem: 33300 Epoch: [14] [ 730/4276] eta: 3:02:54 lr: 3.3729782352663775e-05 loss: 0.1396 (0.1432) time: 3.0693 data: 0.0083 max mem: 33300 Epoch: [14] [ 740/4276] eta: 3:02:22 lr: 3.372703375005148e-05 loss: 0.1238 (0.1430) time: 3.0770 data: 0.0081 max mem: 33300 Epoch: [14] [ 750/4276] eta: 3:01:54 lr: 3.37242851225502e-05 loss: 
0.1300 (0.1431) time: 3.1190 data: 0.0081 max mem: 33300 Epoch: [14] [ 760/4276] eta: 3:01:22 lr: 3.372153647015746e-05 loss: 0.1272 (0.1430) time: 3.1092 data: 0.0081 max mem: 33300 Epoch: [14] [ 770/4276] eta: 3:00:45 lr: 3.371878779287078e-05 loss: 0.1354 (0.1430) time: 3.0101 data: 0.0085 max mem: 33300 Epoch: [14] [ 780/4276] eta: 3:00:06 lr: 3.371603909068767e-05 loss: 0.1372 (0.1430) time: 2.9417 data: 0.0091 max mem: 33300 Epoch: [14] [ 790/4276] eta: 2:59:28 lr: 3.371329036360565e-05 loss: 0.1311 (0.1429) time: 2.9228 data: 0.0083 max mem: 33300 Epoch: [14] [ 800/4276] eta: 2:58:50 lr: 3.3710541611622244e-05 loss: 0.1259 (0.1429) time: 2.9264 data: 0.0077 max mem: 33300 Epoch: [14] [ 810/4276] eta: 2:58:12 lr: 3.370779283473497e-05 loss: 0.1332 (0.1431) time: 2.9237 data: 0.0078 max mem: 33300 Epoch: [14] [ 820/4276] eta: 2:57:34 lr: 3.370504403294136e-05 loss: 0.1322 (0.1428) time: 2.9199 data: 0.0079 max mem: 33300 Epoch: [14] [ 830/4276] eta: 2:56:56 lr: 3.3702295206238906e-05 loss: 0.1242 (0.1428) time: 2.9125 data: 0.0078 max mem: 33300 Epoch: [14] [ 840/4276] eta: 2:56:19 lr: 3.369954635462514e-05 loss: 0.1351 (0.1428) time: 2.9130 data: 0.0074 max mem: 33300 Epoch: [14] [ 850/4276] eta: 2:55:45 lr: 3.3696797478097567e-05 loss: 0.1259 (0.1427) time: 2.9727 data: 0.0077 max mem: 33300 Epoch: [14] [ 860/4276] eta: 2:55:15 lr: 3.369404857665371e-05 loss: 0.1259 (0.1427) time: 3.0595 data: 0.0085 max mem: 33300 Epoch: [14] [ 870/4276] eta: 2:54:42 lr: 3.369129965029109e-05 loss: 0.1319 (0.1426) time: 3.0531 data: 0.0090 max mem: 33300 Epoch: [14] [ 880/4276] eta: 2:54:07 lr: 3.3688550699007215e-05 loss: 0.1319 (0.1428) time: 2.9852 data: 0.0088 max mem: 33300 Epoch: [14] [ 890/4276] eta: 2:53:32 lr: 3.36858017227996e-05 loss: 0.1618 (0.1431) time: 2.9630 data: 0.0080 max mem: 33300 Epoch: [14] [ 900/4276] eta: 2:52:57 lr: 3.368305272166577e-05 loss: 0.1527 (0.1431) time: 2.9701 data: 0.0077 max mem: 33300 Epoch: [14] [ 910/4276] eta: 2:52:23 lr: 
3.3680303695603214e-05 loss: 0.1406 (0.1431) time: 2.9765 data: 0.0079 max mem: 33300 Epoch: [14] [ 920/4276] eta: 2:51:48 lr: 3.367755464460948e-05 loss: 0.1380 (0.1431) time: 2.9665 data: 0.0079 max mem: 33300 Epoch: [14] [ 930/4276] eta: 2:51:13 lr: 3.367480556868204e-05 loss: 0.1400 (0.1432) time: 2.9559 data: 0.0080 max mem: 33300 Epoch: [14] [ 940/4276] eta: 2:50:39 lr: 3.367205646781844e-05 loss: 0.1370 (0.1431) time: 2.9634 data: 0.0081 max mem: 33300 Epoch: [14] [ 950/4276] eta: 2:50:07 lr: 3.366930734201617e-05 loss: 0.1321 (0.1431) time: 3.0088 data: 0.0080 max mem: 33300 Epoch: [14] [ 960/4276] eta: 2:49:37 lr: 3.366655819127275e-05 loss: 0.1476 (0.1433) time: 3.0689 data: 0.0081 max mem: 33300 Epoch: [14] [ 970/4276] eta: 2:49:04 lr: 3.36638090155857e-05 loss: 0.1508 (0.1433) time: 3.0486 data: 0.0081 max mem: 33300 Epoch: [14] [ 980/4276] eta: 2:48:29 lr: 3.366105981495252e-05 loss: 0.1433 (0.1436) time: 2.9597 data: 0.0083 max mem: 33300 Epoch: [14] [ 990/4276] eta: 2:47:53 lr: 3.365831058937071e-05 loss: 0.1433 (0.1436) time: 2.9090 data: 0.0082 max mem: 33300 Epoch: [14] [1000/4276] eta: 2:47:17 lr: 3.36555613388378e-05 loss: 0.1286 (0.1436) time: 2.9053 data: 0.0078 max mem: 33300 Epoch: [14] [1010/4276] eta: 2:46:42 lr: 3.3652812063351285e-05 loss: 0.1271 (0.1436) time: 2.9200 data: 0.0079 max mem: 33300 Epoch: [14] [1020/4276] eta: 2:46:07 lr: 3.365006276290867e-05 loss: 0.1266 (0.1435) time: 2.9275 data: 0.0078 max mem: 33300 Epoch: [14] [1030/4276] eta: 2:45:33 lr: 3.364731343750748e-05 loss: 0.1377 (0.1435) time: 2.9463 data: 0.0080 max mem: 33300 Epoch: [14] [1040/4276] eta: 2:44:59 lr: 3.364456408714521e-05 loss: 0.1334 (0.1434) time: 2.9587 data: 0.0083 max mem: 33300 Epoch: [14] [1050/4276] eta: 2:44:27 lr: 3.364181471181937e-05 loss: 0.1334 (0.1436) time: 2.9766 data: 0.0086 max mem: 33300 Epoch: [14] [1060/4276] eta: 2:43:58 lr: 3.363906531152747e-05 loss: 0.1469 (0.1436) time: 3.0569 data: 0.0091 max mem: 33300 Epoch: [14] [1070/4276] 
eta: 2:43:26 lr: 3.363631588626701e-05 loss: 0.1476 (0.1438) time: 3.0627 data: 0.0091 max mem: 33300 Epoch: [14] [1080/4276] eta: 2:42:53 lr: 3.363356643603549e-05 loss: 0.1476 (0.1437) time: 2.9907 data: 0.0096 max mem: 33300 Epoch: [14] [1090/4276] eta: 2:42:19 lr: 3.363081696083042e-05 loss: 0.1472 (0.1437) time: 2.9656 data: 0.0097 max mem: 33300 Epoch: [14] [1100/4276] eta: 2:41:48 lr: 3.3628067460649314e-05 loss: 0.1434 (0.1437) time: 3.0010 data: 0.0094 max mem: 33300 Epoch: [14] [1110/4276] eta: 2:41:18 lr: 3.3625317935489666e-05 loss: 0.1403 (0.1438) time: 3.0464 data: 0.0094 max mem: 33300 Epoch: [14] [1120/4276] eta: 2:40:48 lr: 3.3622568385348984e-05 loss: 0.1359 (0.1438) time: 3.0802 data: 0.0099 max mem: 33300 Epoch: [14] [1130/4276] eta: 2:40:17 lr: 3.361981881022478e-05 loss: 0.1245 (0.1436) time: 3.0671 data: 0.0103 max mem: 33300 Epoch: [14] [1140/4276] eta: 2:39:44 lr: 3.3617069210114534e-05 loss: 0.1245 (0.1435) time: 3.0001 data: 0.0095 max mem: 33300 Epoch: [14] [1150/4276] eta: 2:39:13 lr: 3.3614319585015766e-05 loss: 0.1259 (0.1434) time: 3.0111 data: 0.0093 max mem: 33300 Epoch: [14] [1160/4276] eta: 2:38:44 lr: 3.3611569934925975e-05 loss: 0.1244 (0.1433) time: 3.0766 data: 0.0098 max mem: 33300 Epoch: [14] [1170/4276] eta: 2:38:13 lr: 3.360882025984265e-05 loss: 0.1410 (0.1435) time: 3.0640 data: 0.0098 max mem: 33300 Epoch: [14] [1180/4276] eta: 2:37:40 lr: 3.36060705597633e-05 loss: 0.1410 (0.1434) time: 3.0053 data: 0.0092 max mem: 33300 Epoch: [14] [1190/4276] eta: 2:37:07 lr: 3.3603320834685435e-05 loss: 0.1288 (0.1434) time: 2.9731 data: 0.0092 max mem: 33300 Epoch: [14] [1200/4276] eta: 2:36:35 lr: 3.360057108460655e-05 loss: 0.1295 (0.1434) time: 2.9778 data: 0.0091 max mem: 33300 Epoch: [14] [1210/4276] eta: 2:36:02 lr: 3.359782130952414e-05 loss: 0.1246 (0.1433) time: 2.9788 data: 0.0087 max mem: 33300 Epoch: [14] [1220/4276] eta: 2:35:30 lr: 3.3595071509435695e-05 loss: 0.1201 (0.1432) time: 2.9725 data: 0.0086 max mem: 33300 
Epoch: [14] [1230/4276] eta: 2:34:57 lr: 3.3592321684338726e-05 loss: 0.1325 (0.1434) time: 2.9790 data: 0.0090 max mem: 33300 Epoch: [14] [1240/4276] eta: 2:34:25 lr: 3.358957183423073e-05 loss: 0.1510 (0.1434) time: 2.9727 data: 0.0094 max mem: 33300 Epoch: [14] [1250/4276] eta: 2:33:54 lr: 3.35868219591092e-05 loss: 0.1448 (0.1434) time: 3.0047 data: 0.0090 max mem: 33300 Epoch: [14] [1260/4276] eta: 2:33:24 lr: 3.358407205897163e-05 loss: 0.1310 (0.1434) time: 3.0564 data: 0.0086 max mem: 33300 Epoch: [14] [1270/4276] eta: 2:32:53 lr: 3.358132213381553e-05 loss: 0.1310 (0.1433) time: 3.0462 data: 0.0090 max mem: 33300 Epoch: [14] [1280/4276] eta: 2:32:20 lr: 3.3578572183638384e-05 loss: 0.1351 (0.1432) time: 2.9954 data: 0.0094 max mem: 33300 Epoch: [14] [1290/4276] eta: 2:31:48 lr: 3.3575822208437696e-05 loss: 0.1351 (0.1433) time: 2.9700 data: 0.0089 max mem: 33300 Epoch: [14] [1300/4276] eta: 2:31:15 lr: 3.357307220821095e-05 loss: 0.1175 (0.1431) time: 2.9649 data: 0.0088 max mem: 33300 Epoch: [14] [1310/4276] eta: 2:30:42 lr: 3.3570322182955654e-05 loss: 0.1076 (0.1430) time: 2.9336 data: 0.0084 max mem: 33300 Epoch: [14] [1320/4276] eta: 2:30:08 lr: 3.356757213266929e-05 loss: 0.1417 (0.1431) time: 2.9105 data: 0.0078 max mem: 33300 Epoch: [14] [1330/4276] eta: 2:29:35 lr: 3.3564822057349356e-05 loss: 0.1431 (0.1431) time: 2.9161 data: 0.0081 max mem: 33300 Epoch: [14] [1340/4276] eta: 2:29:03 lr: 3.3562071956993344e-05 loss: 0.1361 (0.1430) time: 2.9436 data: 0.0086 max mem: 33300 Epoch: [14] [1350/4276] eta: 2:28:32 lr: 3.355932183159875e-05 loss: 0.1387 (0.1430) time: 3.0012 data: 0.0089 max mem: 33300 Epoch: [14] [1360/4276] eta: 2:28:02 lr: 3.355657168116306e-05 loss: 0.1456 (0.1430) time: 3.0614 data: 0.0094 max mem: 33300 Epoch: [14] [1370/4276] eta: 2:27:33 lr: 3.355382150568378e-05 loss: 0.1328 (0.1430) time: 3.0824 data: 0.0091 max mem: 33300 Epoch: [14] [1380/4276] eta: 2:27:01 lr: 3.355107130515839e-05 loss: 0.1411 (0.1431) time: 3.0460 data: 
0.0086 max mem: 33300 Epoch: [14] [1390/4276] eta: 2:26:30 lr: 3.354832107958437e-05 loss: 0.1661 (0.1432) time: 3.0090 data: 0.0085 max mem: 33300 Epoch: [14] [1400/4276] eta: 2:25:59 lr: 3.354557082895923e-05 loss: 0.1518 (0.1433) time: 3.0058 data: 0.0083 max mem: 33300 Epoch: [14] [1410/4276] eta: 2:25:28 lr: 3.3542820553280445e-05 loss: 0.1361 (0.1432) time: 3.0065 data: 0.0085 max mem: 33300 Epoch: [14] [1420/4276] eta: 2:24:56 lr: 3.3540070252545515e-05 loss: 0.1299 (0.1432) time: 3.0029 data: 0.0082 max mem: 33300 Epoch: [14] [1430/4276] eta: 2:24:25 lr: 3.353731992675193e-05 loss: 0.1321 (0.1433) time: 3.0001 data: 0.0085 max mem: 33300 Epoch: [14] [1440/4276] eta: 2:23:54 lr: 3.353456957589716e-05 loss: 0.1395 (0.1433) time: 3.0011 data: 0.0086 max mem: 33300 Epoch: [14] [1450/4276] eta: 2:23:22 lr: 3.3531819199978714e-05 loss: 0.1446 (0.1433) time: 3.0022 data: 0.0081 max mem: 33300 Epoch: [14] [1460/4276] eta: 2:22:52 lr: 3.352906879899408e-05 loss: 0.1448 (0.1433) time: 3.0211 data: 0.0080 max mem: 33300 Epoch: [14] [1470/4276] eta: 2:22:22 lr: 3.3526318372940726e-05 loss: 0.1372 (0.1433) time: 3.0689 data: 0.0089 max mem: 33300 Epoch: [14] [1480/4276] eta: 2:21:55 lr: 3.3523567921816146e-05 loss: 0.1269 (0.1432) time: 3.1458 data: 0.0097 max mem: 33300 Epoch: [14] [1490/4276] eta: 2:21:25 lr: 3.352081744561783e-05 loss: 0.1236 (0.1431) time: 3.1477 data: 0.0093 max mem: 33300 Epoch: [14] [1500/4276] eta: 2:20:56 lr: 3.351806694434326e-05 loss: 0.1257 (0.1431) time: 3.0953 data: 0.0084 max mem: 33300 Epoch: [14] [1510/4276] eta: 2:20:26 lr: 3.351531641798992e-05 loss: 0.1295 (0.1431) time: 3.0892 data: 0.0076 max mem: 33300 Epoch: [14] [1520/4276] eta: 2:19:57 lr: 3.3512565866555314e-05 loss: 0.1286 (0.1430) time: 3.0956 data: 0.0071 max mem: 33300 Epoch: [14] [1530/4276] eta: 2:19:27 lr: 3.350981529003689e-05 loss: 0.1301 (0.1429) time: 3.1027 data: 0.0068 max mem: 33300 Epoch: [14] [1540/4276] eta: 2:18:57 lr: 3.3507064688432165e-05 loss: 0.1433 
(0.1430) time: 3.0796 data: 0.0069 max mem: 33300 Epoch: [14] [1550/4276] eta: 2:18:26 lr: 3.3504314061738594e-05 loss: 0.1471 (0.1430) time: 3.0497 data: 0.0072 max mem: 33300 Epoch: [14] [1560/4276] eta: 2:17:56 lr: 3.3501563409953684e-05 loss: 0.1405 (0.1429) time: 3.0387 data: 0.0074 max mem: 33300 Epoch: [14] [1570/4276] eta: 2:17:25 lr: 3.3498812733074894e-05 loss: 0.1415 (0.1430) time: 3.0316 data: 0.0075 max mem: 33300 Epoch: [14] [1580/4276] eta: 2:16:54 lr: 3.349606203109973e-05 loss: 0.1384 (0.1429) time: 3.0343 data: 0.0077 max mem: 33300 Epoch: [14] [1590/4276] eta: 2:16:25 lr: 3.3493311304025654e-05 loss: 0.1346 (0.1429) time: 3.0735 data: 0.0077 max mem: 33300 Epoch: [14] [1600/4276] eta: 2:15:55 lr: 3.3490560551850156e-05 loss: 0.1356 (0.1428) time: 3.1017 data: 0.0083 max mem: 33300 Epoch: [14] [1610/4276] eta: 2:15:25 lr: 3.348780977457071e-05 loss: 0.1282 (0.1427) time: 3.0855 data: 0.0083 max mem: 33300 Epoch: [14] [1620/4276] eta: 2:14:55 lr: 3.348505897218481e-05 loss: 0.1231 (0.1426) time: 3.0796 data: 0.0084 max mem: 33300 Epoch: [14] [1630/4276] eta: 2:14:25 lr: 3.348230814468991e-05 loss: 0.1227 (0.1426) time: 3.0790 data: 0.0087 max mem: 33300 Epoch: [14] [1640/4276] eta: 2:13:55 lr: 3.347955729208351e-05 loss: 0.1227 (0.1425) time: 3.0799 data: 0.0087 max mem: 33300 Epoch: [14] [1650/4276] eta: 2:13:25 lr: 3.347680641436308e-05 loss: 0.1223 (0.1425) time: 3.0765 data: 0.0085 max mem: 33300 Epoch: [14] [1660/4276] eta: 2:12:55 lr: 3.347405551152609e-05 loss: 0.1293 (0.1425) time: 3.0762 data: 0.0082 max mem: 33300 Epoch: [14] [1670/4276] eta: 2:12:25 lr: 3.3471304583570034e-05 loss: 0.1304 (0.1424) time: 3.0761 data: 0.0078 max mem: 33300 Epoch: [14] [1680/4276] eta: 2:11:55 lr: 3.346855363049238e-05 loss: 0.1338 (0.1423) time: 3.0845 data: 0.0081 max mem: 33300 Epoch: [14] [1690/4276] eta: 2:11:25 lr: 3.346580265229061e-05 loss: 0.1325 (0.1423) time: 3.0958 data: 0.0084 max mem: 33300 Epoch: [14] [1700/4276] eta: 2:10:56 lr: 
3.3463051648962194e-05 loss: 0.1275 (0.1423) time: 3.0919 data: 0.0076 max mem: 33300 Epoch: [14] [1710/4276] eta: 2:10:26 lr: 3.3460300620504606e-05 loss: 0.1428 (0.1423) time: 3.0965 data: 0.0080 max mem: 33300 Epoch: [14] [1720/4276] eta: 2:09:56 lr: 3.345754956691531e-05 loss: 0.1436 (0.1424) time: 3.1058 data: 0.0087 max mem: 33300 Epoch: [14] [1730/4276] eta: 2:09:26 lr: 3.345479848819181e-05 loss: 0.1569 (0.1425) time: 3.1026 data: 0.0085 max mem: 33300 Epoch: [14] [1740/4276] eta: 2:08:56 lr: 3.345204738433155e-05 loss: 0.1473 (0.1425) time: 3.0818 data: 0.0084 max mem: 33300 Epoch: [14] [1750/4276] eta: 2:08:27 lr: 3.344929625533203e-05 loss: 0.1351 (0.1424) time: 3.0941 data: 0.0087 max mem: 33300 Epoch: [14] [1760/4276] eta: 2:07:56 lr: 3.34465451011907e-05 loss: 0.1237 (0.1424) time: 3.0958 data: 0.0087 max mem: 33300 Epoch: [14] [1770/4276] eta: 2:07:27 lr: 3.344379392190505e-05 loss: 0.1277 (0.1424) time: 3.0918 data: 0.0085 max mem: 33300 Epoch: [14] [1780/4276] eta: 2:06:56 lr: 3.344104271747253e-05 loss: 0.1412 (0.1423) time: 3.0802 data: 0.0082 max mem: 33300 Epoch: [14] [1790/4276] eta: 2:06:25 lr: 3.343829148789063e-05 loss: 0.1335 (0.1423) time: 3.0221 data: 0.0082 max mem: 33300 Epoch: [14] [1800/4276] eta: 2:05:53 lr: 3.3435540233156815e-05 loss: 0.1335 (0.1423) time: 2.9826 data: 0.0083 max mem: 33300 Epoch: [14] [1810/4276] eta: 2:05:22 lr: 3.3432788953268555e-05 loss: 0.1440 (0.1424) time: 2.9677 data: 0.0076 max mem: 33300 Epoch: [14] [1820/4276] eta: 2:04:50 lr: 3.343003764822332e-05 loss: 0.1462 (0.1424) time: 2.9656 data: 0.0074 max mem: 33300 Epoch: [14] [1830/4276] eta: 2:04:19 lr: 3.3427286318018594e-05 loss: 0.1315 (0.1423) time: 2.9865 data: 0.0076 max mem: 33300 Epoch: [14] [1840/4276] eta: 2:03:50 lr: 3.342453496265182e-05 loss: 0.1310 (0.1423) time: 3.0802 data: 0.0083 max mem: 33300 Epoch: [14] [1850/4276] eta: 2:03:20 lr: 3.342178358212048e-05 loss: 0.1339 (0.1423) time: 3.1189 data: 0.0089 max mem: 33300 Epoch: [14] 
[1860/4276] eta: 2:02:49 lr: 3.341903217642204e-05 loss: 0.1389 (0.1423) time: 3.0712 data: 0.0090 max mem: 33300 Epoch: [14] [1870/4276] eta: 2:02:19 lr: 3.341628074555397e-05 loss: 0.1407 (0.1425) time: 3.0453 data: 0.0085 max mem: 33300 Epoch: [14] [1880/4276] eta: 2:01:49 lr: 3.3413529289513734e-05 loss: 0.1455 (0.1425) time: 3.0649 data: 0.0093 max mem: 33300 Epoch: [14] [1890/4276] eta: 2:01:17 lr: 3.34107778082988e-05 loss: 0.1333 (0.1424) time: 3.0326 data: 0.0094 max mem: 33300 Epoch: [14] [1900/4276] eta: 2:00:46 lr: 3.340802630190663e-05 loss: 0.1257 (0.1424) time: 3.0017 data: 0.0085 max mem: 33300 Epoch: [14] [1910/4276] eta: 2:00:16 lr: 3.34052747703347e-05 loss: 0.1379 (0.1424) time: 3.0330 data: 0.0084 max mem: 33300 Epoch: [14] [1920/4276] eta: 1:59:45 lr: 3.340252321358047e-05 loss: 0.1326 (0.1423) time: 3.0267 data: 0.0085 max mem: 33300 Epoch: [14] [1930/4276] eta: 1:59:15 lr: 3.3399771631641406e-05 loss: 0.1221 (0.1423) time: 3.0591 data: 0.0088 max mem: 33300 Epoch: [14] [1940/4276] eta: 1:58:45 lr: 3.339702002451496e-05 loss: 0.1379 (0.1423) time: 3.0869 data: 0.0088 max mem: 33300 Epoch: [14] [1950/4276] eta: 1:58:14 lr: 3.339426839219861e-05 loss: 0.1460 (0.1424) time: 3.0561 data: 0.0088 max mem: 33300 Epoch: [14] [1960/4276] eta: 1:57:44 lr: 3.3391516734689806e-05 loss: 0.1488 (0.1424) time: 3.0644 data: 0.0087 max mem: 33300 Epoch: [14] [1970/4276] eta: 1:57:13 lr: 3.338876505198603e-05 loss: 0.1320 (0.1423) time: 3.0663 data: 0.0081 max mem: 33300 Epoch: [14] [1980/4276] eta: 1:56:43 lr: 3.338601334408472e-05 loss: 0.1285 (0.1423) time: 3.0309 data: 0.0078 max mem: 33300 Epoch: [14] [1990/4276] eta: 1:56:12 lr: 3.338326161098336e-05 loss: 0.1428 (0.1423) time: 3.0363 data: 0.0081 max mem: 33300 Epoch: [14] [2000/4276] eta: 1:55:41 lr: 3.338050985267941e-05 loss: 0.1494 (0.1423) time: 3.0388 data: 0.0079 max mem: 33300 Epoch: [14] [2010/4276] eta: 1:55:11 lr: 3.337775806917031e-05 loss: 0.1383 (0.1423) time: 3.0287 data: 0.0076 max mem: 
33300 Epoch: [14] [2020/4276] eta: 1:54:40 lr: 3.337500626045353e-05 loss: 0.1383 (0.1423) time: 3.0272 data: 0.0082 max mem: 33300 Epoch: [14] [2030/4276] eta: 1:54:09 lr: 3.337225442652654e-05 loss: 0.1367 (0.1423) time: 3.0071 data: 0.0084 max mem: 33300 Epoch: [14] [2040/4276] eta: 1:53:37 lr: 3.336950256738679e-05 loss: 0.1279 (0.1422) time: 2.9793 data: 0.0075 max mem: 33300 Epoch: [14] [2050/4276] eta: 1:53:07 lr: 3.336675068303173e-05 loss: 0.1395 (0.1422) time: 3.0057 data: 0.0073 max mem: 33300 Epoch: [14] [2060/4276] eta: 1:52:36 lr: 3.336399877345884e-05 loss: 0.1472 (0.1422) time: 3.0345 data: 0.0080 max mem: 33300 Epoch: [14] [2070/4276] eta: 1:52:05 lr: 3.3361246838665564e-05 loss: 0.1278 (0.1422) time: 3.0309 data: 0.0089 max mem: 33300 Epoch: [14] [2080/4276] eta: 1:51:35 lr: 3.335849487864936e-05 loss: 0.1290 (0.1422) time: 3.0348 data: 0.0090 max mem: 33300 Epoch: [14] [2090/4276] eta: 1:51:04 lr: 3.335574289340769e-05 loss: 0.1460 (0.1422) time: 3.0313 data: 0.0082 max mem: 33300 Epoch: [14] [2100/4276] eta: 1:50:33 lr: 3.3352990882938e-05 loss: 0.1423 (0.1422) time: 3.0308 data: 0.0080 max mem: 33300 Epoch: [14] [2110/4276] eta: 1:50:03 lr: 3.3350238847237755e-05 loss: 0.1274 (0.1421) time: 3.0328 data: 0.0082 max mem: 33300 Epoch: [14] [2120/4276] eta: 1:49:32 lr: 3.3347486786304406e-05 loss: 0.1131 (0.1420) time: 3.0351 data: 0.0079 max mem: 33300 Epoch: [14] [2130/4276] eta: 1:49:02 lr: 3.3344734700135404e-05 loss: 0.1119 (0.1419) time: 3.0393 data: 0.0081 max mem: 33300 Epoch: [14] [2140/4276] eta: 1:48:31 lr: 3.334198258872822e-05 loss: 0.1258 (0.1418) time: 3.0479 data: 0.0084 max mem: 33300 Epoch: [14] [2150/4276] eta: 1:48:01 lr: 3.33392304520803e-05 loss: 0.1277 (0.1418) time: 3.0514 data: 0.0088 max mem: 33300 Epoch: [14] [2160/4276] eta: 1:47:30 lr: 3.3336478290189085e-05 loss: 0.1278 (0.1418) time: 3.0401 data: 0.0088 max mem: 33300 Epoch: [14] [2170/4276] eta: 1:46:59 lr: 3.3333726103052034e-05 loss: 0.1391 (0.1418) time: 3.0353 
data: 0.0084 max mem: 33300 Epoch: [14] [2180/4276] eta: 1:46:29 lr: 3.33309738906666e-05 loss: 0.1450 (0.1418) time: 3.0347 data: 0.0086 max mem: 33300 Epoch: [14] [2190/4276] eta: 1:45:58 lr: 3.3328221653030243e-05 loss: 0.1450 (0.1419) time: 3.0384 data: 0.0089 max mem: 33300 Epoch: [14] [2200/4276] eta: 1:45:28 lr: 3.332546939014041e-05 loss: 0.1398 (0.1419) time: 3.0450 data: 0.0091 max mem: 33300 Epoch: [14] [2210/4276] eta: 1:44:57 lr: 3.332271710199454e-05 loss: 0.1420 (0.1419) time: 3.0517 data: 0.0088 max mem: 33300 Epoch: [14] [2220/4276] eta: 1:44:27 lr: 3.331996478859011e-05 loss: 0.1420 (0.1419) time: 3.0467 data: 0.0084 max mem: 33300 Epoch: [14] [2230/4276] eta: 1:43:56 lr: 3.331721244992455e-05 loss: 0.1400 (0.1419) time: 3.0155 data: 0.0084 max mem: 33300 Epoch: [14] [2240/4276] eta: 1:43:25 lr: 3.33144600859953e-05 loss: 0.1301 (0.1418) time: 2.9969 data: 0.0083 max mem: 33300 Epoch: [14] [2250/4276] eta: 1:42:54 lr: 3.331170769679983e-05 loss: 0.1301 (0.1418) time: 3.0245 data: 0.0083 max mem: 33300 Epoch: [14] [2260/4276] eta: 1:42:24 lr: 3.330895528233558e-05 loss: 0.1402 (0.1418) time: 3.0490 data: 0.0083 max mem: 33300 Epoch: [14] [2270/4276] eta: 1:41:53 lr: 3.33062028426e-05 loss: 0.1338 (0.1418) time: 3.0489 data: 0.0087 max mem: 33300 Epoch: [14] [2280/4276] eta: 1:41:23 lr: 3.330345037759053e-05 loss: 0.1292 (0.1418) time: 3.0501 data: 0.0092 max mem: 33300 Epoch: [14] [2290/4276] eta: 1:40:52 lr: 3.330069788730463e-05 loss: 0.1343 (0.1417) time: 3.0414 data: 0.0090 max mem: 33300 Epoch: [14] [2300/4276] eta: 1:40:22 lr: 3.3297945371739736e-05 loss: 0.1299 (0.1417) time: 3.0370 data: 0.0090 max mem: 33300 Epoch: [14] [2310/4276] eta: 1:39:52 lr: 3.32951928308933e-05 loss: 0.1338 (0.1418) time: 3.0541 data: 0.0088 max mem: 33300 Epoch: [14] [2320/4276] eta: 1:39:21 lr: 3.3292440264762756e-05 loss: 0.1550 (0.1418) time: 3.0351 data: 0.0077 max mem: 33300 Epoch: [14] [2330/4276] eta: 1:38:50 lr: 3.328968767334557e-05 loss: 0.1563 (0.1419) 
time: 3.0058 data: 0.0078 max mem: 33300 Epoch: [14] [2340/4276] eta: 1:38:19 lr: 3.328693505663916e-05 loss: 0.1452 (0.1419) time: 2.9969 data: 0.0083 max mem: 33300 Epoch: [14] [2350/4276] eta: 1:37:48 lr: 3.3284182414640984e-05 loss: 0.1378 (0.1419) time: 2.9863 data: 0.0077 max mem: 33300 Epoch: [14] [2360/4276] eta: 1:37:17 lr: 3.3281429747348495e-05 loss: 0.1327 (0.1418) time: 3.0011 data: 0.0082 max mem: 33300 Epoch: [14] [2370/4276] eta: 1:36:47 lr: 3.327867705475913e-05 loss: 0.1326 (0.1419) time: 3.0279 data: 0.0085 max mem: 33300 Epoch: [14] [2380/4276] eta: 1:36:16 lr: 3.327592433687032e-05 loss: 0.1341 (0.1419) time: 3.0403 data: 0.0085 max mem: 33300 Epoch: [14] [2390/4276] eta: 1:35:46 lr: 3.327317159367952e-05 loss: 0.1333 (0.1419) time: 3.0448 data: 0.0091 max mem: 33300 Epoch: [14] [2400/4276] eta: 1:35:15 lr: 3.3270418825184154e-05 loss: 0.1333 (0.1419) time: 3.0615 data: 0.0092 max mem: 33300 Epoch: [14] [2410/4276] eta: 1:34:45 lr: 3.326766603138169e-05 loss: 0.1317 (0.1419) time: 3.0618 data: 0.0084 max mem: 33300 Epoch: [14] [2420/4276] eta: 1:34:15 lr: 3.326491321226954e-05 loss: 0.1314 (0.1418) time: 3.0578 data: 0.0081 max mem: 33300 Epoch: [14] [2430/4276] eta: 1:33:44 lr: 3.326216036784517e-05 loss: 0.1397 (0.1419) time: 3.0568 data: 0.0082 max mem: 33300 Epoch: [14] [2440/4276] eta: 1:33:14 lr: 3.3259407498106006e-05 loss: 0.1397 (0.1419) time: 3.0497 data: 0.0082 max mem: 33300 Epoch: [14] [2450/4276] eta: 1:32:43 lr: 3.325665460304949e-05 loss: 0.1379 (0.1419) time: 3.0507 data: 0.0080 max mem: 33300 Epoch: [14] [2460/4276] eta: 1:32:13 lr: 3.325390168267305e-05 loss: 0.1528 (0.1419) time: 3.0461 data: 0.0078 max mem: 33300 Epoch: [14] [2470/4276] eta: 1:31:42 lr: 3.325114873697415e-05 loss: 0.1379 (0.1420) time: 3.0436 data: 0.0076 max mem: 33300 Epoch: [14] [2480/4276] eta: 1:31:12 lr: 3.3248395765950195e-05 loss: 0.1448 (0.1420) time: 3.0492 data: 0.0077 max mem: 33300 Epoch: [14] [2490/4276] eta: 1:30:42 lr: 3.3245642769598646e-05 
loss: 0.1443 (0.1420) time: 3.0575 data: 0.0080 max mem: 33300 Epoch: [14] [2500/4276] eta: 1:30:11 lr: 3.324288974791693e-05 loss: 0.1333 (0.1420) time: 3.0552 data: 0.0078 max mem: 33300 Epoch: [14] [2510/4276] eta: 1:29:41 lr: 3.324013670090248e-05 loss: 0.1474 (0.1420) time: 3.0582 data: 0.0074 max mem: 33300 Epoch: [14] [2520/4276] eta: 1:29:10 lr: 3.323738362855274e-05 loss: 0.1371 (0.1420) time: 3.0584 data: 0.0080 max mem: 33300 Epoch: [14] [2530/4276] eta: 1:28:40 lr: 3.3234630530865144e-05 loss: 0.1095 (0.1419) time: 3.0456 data: 0.0083 max mem: 33300 Epoch: [14] [2540/4276] eta: 1:28:09 lr: 3.323187740783712e-05 loss: 0.1210 (0.1419) time: 3.0427 data: 0.0082 max mem: 33300 Epoch: [14] [2550/4276] eta: 1:27:39 lr: 3.3229124259466116e-05 loss: 0.1222 (0.1418) time: 3.0454 data: 0.0084 max mem: 33300 Epoch: [14] [2560/4276] eta: 1:27:08 lr: 3.322637108574955e-05 loss: 0.1189 (0.1418) time: 3.0444 data: 0.0088 max mem: 33300 Epoch: [14] [2570/4276] eta: 1:26:38 lr: 3.3223617886684855e-05 loss: 0.1184 (0.1417) time: 3.0465 data: 0.0088 max mem: 33300 Epoch: [14] [2580/4276] eta: 1:26:07 lr: 3.322086466226947e-05 loss: 0.1181 (0.1417) time: 3.0344 data: 0.0089 max mem: 33300 Epoch: [14] [2590/4276] eta: 1:25:37 lr: 3.321811141250083e-05 loss: 0.1207 (0.1416) time: 3.0161 data: 0.0090 max mem: 33300 Epoch: [14] [2600/4276] eta: 1:25:06 lr: 3.321535813737637e-05 loss: 0.1291 (0.1416) time: 3.0246 data: 0.0087 max mem: 33300 Epoch: [14] [2610/4276] eta: 1:24:36 lr: 3.32126048368935e-05 loss: 0.1250 (0.1416) time: 3.0460 data: 0.0085 max mem: 33300 Epoch: [14] [2620/4276] eta: 1:24:05 lr: 3.320985151104968e-05 loss: 0.1328 (0.1416) time: 3.0377 data: 0.0081 max mem: 33300 Epoch: [14] [2630/4276] eta: 1:23:34 lr: 3.320709815984232e-05 loss: 0.1433 (0.1416) time: 2.9748 data: 0.0081 max mem: 33300 Epoch: [14] [2640/4276] eta: 1:23:03 lr: 3.3204344783268846e-05 loss: 0.1216 (0.1416) time: 2.9309 data: 0.0079 max mem: 33300 Epoch: [14] [2650/4276] eta: 1:22:31 lr: 
3.3201591381326704e-05 loss: 0.1316 (0.1415) time: 2.9323 data: 0.0079 max mem: 33300 Epoch: [14] [2660/4276] eta: 1:22:00 lr: 3.319883795401331e-05 loss: 0.1409 (0.1416) time: 2.9330 data: 0.0078 max mem: 33300 Epoch: [14] [2670/4276] eta: 1:21:29 lr: 3.3196084501326094e-05 loss: 0.1381 (0.1416) time: 2.9340 data: 0.0077 max mem: 33300 Epoch: [14] [2680/4276] eta: 1:20:58 lr: 3.31933310232625e-05 loss: 0.1381 (0.1416) time: 2.9359 data: 0.0077 max mem: 33300 Epoch: [14] [2690/4276] eta: 1:20:27 lr: 3.319057751981993e-05 loss: 0.1375 (0.1416) time: 2.9359 data: 0.0076 max mem: 33300 Epoch: [14] [2700/4276] eta: 1:19:56 lr: 3.318782399099583e-05 loss: 0.1262 (0.1415) time: 2.9336 data: 0.0076 max mem: 33300 Epoch: [14] [2710/4276] eta: 1:19:25 lr: 3.3185070436787616e-05 loss: 0.1281 (0.1415) time: 2.9343 data: 0.0077 max mem: 33300 Epoch: [14] [2720/4276] eta: 1:18:54 lr: 3.318231685719271e-05 loss: 0.1261 (0.1415) time: 2.9526 data: 0.0075 max mem: 33300 Epoch: [14] [2730/4276] eta: 1:18:23 lr: 3.317956325220855e-05 loss: 0.1325 (0.1415) time: 2.9567 data: 0.0075 max mem: 33300 Epoch: [14] [2740/4276] eta: 1:17:52 lr: 3.317680962183254e-05 loss: 0.1415 (0.1415) time: 2.9399 data: 0.0076 max mem: 33300 Epoch: [14] [2750/4276] eta: 1:17:21 lr: 3.317405596606213e-05 loss: 0.1459 (0.1416) time: 2.9415 data: 0.0080 max mem: 33300 Epoch: [14] [2760/4276] eta: 1:16:50 lr: 3.317130228489473e-05 loss: 0.1307 (0.1416) time: 2.9502 data: 0.0082 max mem: 33300 Epoch: [14] [2770/4276] eta: 1:16:19 lr: 3.316854857832777e-05 loss: 0.1307 (0.1416) time: 2.9553 data: 0.0078 max mem: 33300 Epoch: [14] [2780/4276] eta: 1:15:48 lr: 3.316579484635866e-05 loss: 0.1412 (0.1416) time: 2.9564 data: 0.0075 max mem: 33300 Epoch: [14] [2790/4276] eta: 1:15:17 lr: 3.316304108898483e-05 loss: 0.1412 (0.1416) time: 2.9491 data: 0.0078 max mem: 33300 Epoch: [14] [2800/4276] eta: 1:14:46 lr: 3.3160287306203704e-05 loss: 0.1413 (0.1416) time: 2.9313 data: 0.0085 max mem: 33300 Epoch: [14] 
[2810/4276] eta: 1:14:16 lr: 3.31575334980127e-05 loss: 0.1175 (0.1415) time: 2.9379 data: 0.0086 max mem: 33300 Epoch: [14] [2820/4276] eta: 1:13:45 lr: 3.3154779664409234e-05 loss: 0.1206 (0.1414) time: 2.9564 data: 0.0079 max mem: 33300 Epoch: [14] [2830/4276] eta: 1:13:14 lr: 3.315202580539073e-05 loss: 0.1330 (0.1414) time: 2.9462 data: 0.0075 max mem: 33300 Epoch: [14] [2840/4276] eta: 1:12:43 lr: 3.314927192095462e-05 loss: 0.1492 (0.1414) time: 2.9470 data: 0.0075 max mem: 33300 Epoch: [14] [2850/4276] eta: 1:12:12 lr: 3.31465180110983e-05 loss: 0.1533 (0.1416) time: 2.9559 data: 0.0074 max mem: 33300 Epoch: [14] [2860/4276] eta: 1:11:41 lr: 3.3143764075819214e-05 loss: 0.1458 (0.1416) time: 2.9498 data: 0.0077 max mem: 33300 Epoch: [14] [2870/4276] eta: 1:11:11 lr: 3.314101011511476e-05 loss: 0.1343 (0.1416) time: 2.9429 data: 0.0085 max mem: 33300 Epoch: [14] [2880/4276] eta: 1:10:40 lr: 3.313825612898236e-05 loss: 0.1374 (0.1416) time: 2.9632 data: 0.0083 max mem: 33300 Epoch: [14] [2890/4276] eta: 1:10:09 lr: 3.3135502117419434e-05 loss: 0.1373 (0.1415) time: 2.9634 data: 0.0081 max mem: 33300 Epoch: [14] [2900/4276] eta: 1:09:38 lr: 3.313274808042341e-05 loss: 0.1249 (0.1415) time: 2.9407 data: 0.0083 max mem: 33300 Epoch: [14] [2910/4276] eta: 1:09:08 lr: 3.312999401799168e-05 loss: 0.1253 (0.1415) time: 2.9594 data: 0.0078 max mem: 33300 Epoch: [14] [2920/4276] eta: 1:08:37 lr: 3.312723993012167e-05 loss: 0.1308 (0.1414) time: 2.9570 data: 0.0077 max mem: 33300 Epoch: [14] [2930/4276] eta: 1:08:06 lr: 3.312448581681081e-05 loss: 0.1270 (0.1414) time: 2.9348 data: 0.0078 max mem: 33300 Epoch: [14] [2940/4276] eta: 1:07:35 lr: 3.312173167805649e-05 loss: 0.1282 (0.1414) time: 2.9348 data: 0.0076 max mem: 33300 Epoch: [14] [2950/4276] eta: 1:07:04 lr: 3.311897751385614e-05 loss: 0.1308 (0.1414) time: 2.9418 data: 0.0075 max mem: 33300 Epoch: [14] [2960/4276] eta: 1:06:34 lr: 3.311622332420716e-05 loss: 0.1233 (0.1414) time: 2.9391 data: 0.0076 max mem: 
33300 Epoch: [14] [2970/4276] eta: 1:06:03 lr: 3.311346910910699e-05 loss: 0.1255 (0.1414) time: 2.9352 data: 0.0076 max mem: 33300 Epoch: [14] [2980/4276] eta: 1:05:32 lr: 3.311071486855301e-05 loss: 0.1434 (0.1414) time: 2.9368 data: 0.0076 max mem: 33300 Epoch: [14] [2990/4276] eta: 1:05:01 lr: 3.3107960602542667e-05 loss: 0.1279 (0.1413) time: 2.9365 data: 0.0075 max mem: 33300 Epoch: [14] [3000/4276] eta: 1:04:30 lr: 3.310520631107333e-05 loss: 0.1279 (0.1413) time: 2.9370 data: 0.0076 max mem: 33300 Epoch: [14] [3010/4276] eta: 1:04:00 lr: 3.3102451994142454e-05 loss: 0.1357 (0.1413) time: 2.9456 data: 0.0079 max mem: 33300 Epoch: [14] [3020/4276] eta: 1:03:29 lr: 3.309969765174742e-05 loss: 0.1341 (0.1413) time: 2.9671 data: 0.0079 max mem: 33300 Epoch: [14] [3030/4276] eta: 1:02:59 lr: 3.309694328388564e-05 loss: 0.1270 (0.1413) time: 2.9615 data: 0.0074 max mem: 33300 Epoch: [14] [3040/4276] eta: 1:02:28 lr: 3.309418889055454e-05 loss: 0.1325 (0.1413) time: 2.9503 data: 0.0074 max mem: 33300 Epoch: [14] [3050/4276] eta: 1:01:57 lr: 3.309143447175152e-05 loss: 0.1493 (0.1413) time: 2.9728 data: 0.0074 max mem: 33300 Epoch: [14] [3060/4276] eta: 1:01:27 lr: 3.308868002747399e-05 loss: 0.1178 (0.1412) time: 3.0140 data: 0.0075 max mem: 33300 Epoch: [14] [3070/4276] eta: 1:00:57 lr: 3.3085925557719353e-05 loss: 0.1264 (0.1412) time: 3.0469 data: 0.0075 max mem: 33300 Epoch: [14] [3080/4276] eta: 1:00:26 lr: 3.308317106248503e-05 loss: 0.1326 (0.1412) time: 3.0323 data: 0.0072 max mem: 33300 Epoch: [14] [3090/4276] eta: 0:59:56 lr: 3.3080416541768405e-05 loss: 0.1230 (0.1411) time: 2.9923 data: 0.0073 max mem: 33300 Epoch: [14] [3100/4276] eta: 0:59:25 lr: 3.30776619955669e-05 loss: 0.1173 (0.1411) time: 2.9895 data: 0.0073 max mem: 33300 Epoch: [14] [3110/4276] eta: 0:58:55 lr: 3.307490742387792e-05 loss: 0.1136 (0.1410) time: 2.9986 data: 0.0075 max mem: 33300 Epoch: [14] [3120/4276] eta: 0:58:25 lr: 3.307215282669887e-05 loss: 0.1147 (0.1410) time: 2.9870 
data: 0.0075 max mem: 33300 Epoch: [14] [3130/4276] eta: 0:57:54 lr: 3.306939820402716e-05 loss: 0.1333 (0.1409) time: 2.9897 data: 0.0072 max mem: 33300 Epoch: [14] [3140/4276] eta: 0:57:24 lr: 3.3066643555860185e-05 loss: 0.1338 (0.1409) time: 3.0103 data: 0.0073 max mem: 33300 Epoch: [14] [3150/4276] eta: 0:56:53 lr: 3.306388888219535e-05 loss: 0.1569 (0.1410) time: 3.0272 data: 0.0075 max mem: 33300 Epoch: [14] [3160/4276] eta: 0:56:23 lr: 3.306113418303007e-05 loss: 0.1405 (0.1410) time: 3.0239 data: 0.0073 max mem: 33300 Epoch: [14] [3170/4276] eta: 0:55:53 lr: 3.3058379458361747e-05 loss: 0.1332 (0.1410) time: 3.0132 data: 0.0073 max mem: 33300 Epoch: [14] [3180/4276] eta: 0:55:22 lr: 3.305562470818776e-05 loss: 0.1278 (0.1409) time: 3.0063 data: 0.0075 max mem: 33300 Epoch: [14] [3190/4276] eta: 0:54:52 lr: 3.3052869932505536e-05 loss: 0.1330 (0.1409) time: 2.9890 data: 0.0072 max mem: 33300 Epoch: [14] [3200/4276] eta: 0:54:21 lr: 3.305011513131247e-05 loss: 0.1404 (0.1409) time: 2.9777 data: 0.0071 max mem: 33300 Epoch: [14] [3210/4276] eta: 0:53:51 lr: 3.304736030460595e-05 loss: 0.1269 (0.1409) time: 2.9856 data: 0.0072 max mem: 33300 Epoch: [14] [3220/4276] eta: 0:53:20 lr: 3.30446054523834e-05 loss: 0.1269 (0.1409) time: 2.9859 data: 0.0072 max mem: 33300 Epoch: [14] [3230/4276] eta: 0:52:50 lr: 3.304185057464221e-05 loss: 0.1366 (0.1409) time: 3.0040 data: 0.0071 max mem: 33300 Epoch: [14] [3240/4276] eta: 0:52:20 lr: 3.3039095671379774e-05 loss: 0.1446 (0.1410) time: 3.0185 data: 0.0075 max mem: 33300 Epoch: [14] [3250/4276] eta: 0:51:49 lr: 3.303634074259349e-05 loss: 0.1446 (0.1410) time: 3.0116 data: 0.0077 max mem: 33300 Epoch: [14] [3260/4276] eta: 0:51:19 lr: 3.3033585788280764e-05 loss: 0.1382 (0.1410) time: 3.0047 data: 0.0073 max mem: 33300 Epoch: [14] [3270/4276] eta: 0:50:48 lr: 3.3030830808438986e-05 loss: 0.1414 (0.1410) time: 2.9928 data: 0.0071 max mem: 33300 Epoch: [14] [3280/4276] eta: 0:50:18 lr: 3.302807580306557e-05 loss: 0.1323 
(0.1410) time: 3.0042 data: 0.0073 max mem: 33300 Epoch: [14] [3290/4276] eta: 0:49:48 lr: 3.302532077215789e-05 loss: 0.1397 (0.1410) time: 3.0078 data: 0.0073 max mem: 33300 Epoch: [14] [3300/4276] eta: 0:49:17 lr: 3.3022565715713366e-05 loss: 0.1443 (0.1410) time: 2.9967 data: 0.0071 max mem: 33300 Epoch: [14] [3310/4276] eta: 0:48:47 lr: 3.3019810633729376e-05 loss: 0.1479 (0.1410) time: 3.0145 data: 0.0072 max mem: 33300 Epoch: [14] [3320/4276] eta: 0:48:17 lr: 3.3017055526203324e-05 loss: 0.1482 (0.1411) time: 3.0236 data: 0.0073 max mem: 33300 Epoch: [14] [3330/4276] eta: 0:47:46 lr: 3.30143003931326e-05 loss: 0.1245 (0.1411) time: 3.0030 data: 0.0073 max mem: 33300 Epoch: [14] [3340/4276] eta: 0:47:16 lr: 3.30115452345146e-05 loss: 0.1308 (0.1411) time: 3.0098 data: 0.0071 max mem: 33300 Epoch: [14] [3350/4276] eta: 0:46:45 lr: 3.3008790050346724e-05 loss: 0.1323 (0.1410) time: 3.0161 data: 0.0070 max mem: 33300 Epoch: [14] [3360/4276] eta: 0:46:15 lr: 3.300603484062635e-05 loss: 0.1323 (0.1410) time: 2.9982 data: 0.0072 max mem: 33300 Epoch: [14] [3370/4276] eta: 0:45:45 lr: 3.300327960535089e-05 loss: 0.1439 (0.1411) time: 2.9719 data: 0.0075 max mem: 33300 Epoch: [14] [3380/4276] eta: 0:45:14 lr: 3.300052434451774e-05 loss: 0.1369 (0.1411) time: 2.9702 data: 0.0078 max mem: 33300 Epoch: [14] [3390/4276] eta: 0:44:44 lr: 3.299776905812426e-05 loss: 0.1362 (0.1411) time: 2.9945 data: 0.0077 max mem: 33300 Epoch: [14] [3400/4276] eta: 0:44:13 lr: 3.299501374616787e-05 loss: 0.1420 (0.1411) time: 3.0114 data: 0.0075 max mem: 33300 Epoch: [14] [3410/4276] eta: 0:43:43 lr: 3.2992258408645954e-05 loss: 0.1418 (0.1411) time: 3.0080 data: 0.0077 max mem: 33300 Epoch: [14] [3420/4276] eta: 0:43:13 lr: 3.29895030455559e-05 loss: 0.1399 (0.1412) time: 2.9871 data: 0.0079 max mem: 33300 Epoch: [14] [3430/4276] eta: 0:42:42 lr: 3.298674765689509e-05 loss: 0.1423 (0.1412) time: 2.9751 data: 0.0080 max mem: 33300 Epoch: [14] [3440/4276] eta: 0:42:12 lr: 
3.298399224266093e-05 loss: 0.1423 (0.1411) time: 2.9610 data: 0.0074 max mem: 33300 Epoch: [14] [3450/4276] eta: 0:41:41 lr: 3.29812368028508e-05 loss: 0.1417 (0.1412) time: 2.9661 data: 0.0076 max mem: 33300 Epoch: [14] [3460/4276] eta: 0:41:11 lr: 3.297848133746209e-05 loss: 0.1562 (0.1412) time: 2.9697 data: 0.0080 max mem: 33300 Epoch: [14] [3470/4276] eta: 0:40:40 lr: 3.2975725846492186e-05 loss: 0.1409 (0.1412) time: 2.9569 data: 0.0077 max mem: 33300 Epoch: [14] [3480/4276] eta: 0:40:10 lr: 3.2972970329938485e-05 loss: 0.1405 (0.1412) time: 2.9588 data: 0.0075 max mem: 33300 Epoch: [14] [3490/4276] eta: 0:39:40 lr: 3.297021478779835e-05 loss: 0.1363 (0.1412) time: 2.9615 data: 0.0073 max mem: 33300 Epoch: [14] [3500/4276] eta: 0:39:09 lr: 3.296745922006919e-05 loss: 0.1345 (0.1412) time: 2.9545 data: 0.0078 max mem: 33300 Epoch: [14] [3510/4276] eta: 0:38:39 lr: 3.296470362674838e-05 loss: 0.1279 (0.1412) time: 2.9588 data: 0.0080 max mem: 33300 Epoch: [14] [3520/4276] eta: 0:38:08 lr: 3.296194800783332e-05 loss: 0.1343 (0.1412) time: 2.9738 data: 0.0075 max mem: 33300 Epoch: [14] [3530/4276] eta: 0:37:38 lr: 3.295919236332138e-05 loss: 0.1473 (0.1412) time: 2.9943 data: 0.0073 max mem: 33300 Epoch: [14] [3540/4276] eta: 0:37:08 lr: 3.295643669320994e-05 loss: 0.1464 (0.1412) time: 2.9972 data: 0.0075 max mem: 33300 Epoch: [14] [3550/4276] eta: 0:36:37 lr: 3.29536809974964e-05 loss: 0.1297 (0.1412) time: 2.9834 data: 0.0079 max mem: 33300 Epoch: [14] [3560/4276] eta: 0:36:07 lr: 3.2950925276178126e-05 loss: 0.1297 (0.1412) time: 3.0052 data: 0.0082 max mem: 33300 Epoch: [14] [3570/4276] eta: 0:35:37 lr: 3.294816952925252e-05 loss: 0.1558 (0.1413) time: 3.0006 data: 0.0079 max mem: 33300 Epoch: [14] [3580/4276] eta: 0:35:06 lr: 3.294541375671695e-05 loss: 0.1324 (0.1412) time: 2.9859 data: 0.0079 max mem: 33300 Epoch: [14] [3590/4276] eta: 0:34:36 lr: 3.29426579585688e-05 loss: 0.1247 (0.1412) time: 3.0045 data: 0.0081 max mem: 33300 Epoch: [14] [3600/4276] 
eta: 0:34:06 lr: 3.293990213480545e-05 loss: 0.1247 (0.1412) time: 3.0090 data: 0.0082 max mem: 33300 Epoch: [14] [3610/4276] eta: 0:33:35 lr: 3.293714628542429e-05 loss: 0.1223 (0.1412) time: 2.9997 data: 0.0080 max mem: 33300 Epoch: [14] [3620/4276] eta: 0:33:05 lr: 3.2934390410422686e-05 loss: 0.1206 (0.1412) time: 2.9953 data: 0.0080 max mem: 33300 Epoch: [14] [3630/4276] eta: 0:32:35 lr: 3.293163450979804e-05 loss: 0.1304 (0.1412) time: 2.9885 data: 0.0086 max mem: 33300 Epoch: [14] [3640/4276] eta: 0:32:04 lr: 3.2928878583547705e-05 loss: 0.1114 (0.1411) time: 2.9979 data: 0.0087 max mem: 33300 Epoch: [14] [3650/4276] eta: 0:31:34 lr: 3.292612263166908e-05 loss: 0.1114 (0.1411) time: 3.0098 data: 0.0084 max mem: 33300 Epoch: [14] [3660/4276] eta: 0:31:04 lr: 3.2923366654159524e-05 loss: 0.1184 (0.1411) time: 3.0026 data: 0.0082 max mem: 33300 Epoch: [14] [3670/4276] eta: 0:30:33 lr: 3.292061065101643e-05 loss: 0.1347 (0.1411) time: 2.9975 data: 0.0083 max mem: 33300 Epoch: [14] [3680/4276] eta: 0:30:03 lr: 3.291785462223717e-05 loss: 0.1456 (0.1411) time: 2.9859 data: 0.0087 max mem: 33300 Epoch: [14] [3690/4276] eta: 0:29:33 lr: 3.291509856781913e-05 loss: 0.1483 (0.1411) time: 2.9936 data: 0.0088 max mem: 33300 Epoch: [14] [3700/4276] eta: 0:29:03 lr: 3.291234248775967e-05 loss: 0.1348 (0.1411) time: 3.0080 data: 0.0083 max mem: 33300 Epoch: [14] [3710/4276] eta: 0:28:32 lr: 3.290958638205617e-05 loss: 0.1209 (0.1410) time: 3.0002 data: 0.0080 max mem: 33300 Epoch: [14] [3720/4276] eta: 0:28:02 lr: 3.2906830250706015e-05 loss: 0.1165 (0.1410) time: 3.0018 data: 0.0079 max mem: 33300 Epoch: [14] [3730/4276] eta: 0:27:32 lr: 3.290407409370657e-05 loss: 0.1361 (0.1411) time: 3.0117 data: 0.0078 max mem: 33300 Epoch: [14] [3740/4276] eta: 0:27:02 lr: 3.290131791105521e-05 loss: 0.1380 (0.1410) time: 3.0782 data: 0.0083 max mem: 33300 Epoch: [14] [3750/4276] eta: 0:26:31 lr: 3.289856170274931e-05 loss: 0.1334 (0.1410) time: 3.1384 data: 0.0088 max mem: 33300 
Epoch: [14] [3760/4276] eta: 0:26:01 lr: 3.289580546878624e-05 loss: 0.1203 (0.1410) time: 3.0706 data: 0.0088 max mem: 33300 Epoch: [14] [3770/4276] eta: 0:25:31 lr: 3.289304920916339e-05 loss: 0.1307 (0.1410) time: 3.0123 data: 0.0081 max mem: 33300 Epoch: [14] [3780/4276] eta: 0:25:01 lr: 3.2890292923878105e-05 loss: 0.1376 (0.1410) time: 3.0359 data: 0.0084 max mem: 33300 Epoch: [14] [3790/4276] eta: 0:24:30 lr: 3.288753661292777e-05 loss: 0.1285 (0.1410) time: 3.0657 data: 0.0083 max mem: 33300 Epoch: [14] [3800/4276] eta: 0:24:00 lr: 3.2884780276309756e-05 loss: 0.1370 (0.1410) time: 3.0544 data: 0.0079 max mem: 33300 Epoch: [14] [3810/4276] eta: 0:23:30 lr: 3.288202391402143e-05 loss: 0.1370 (0.1409) time: 3.0622 data: 0.0087 max mem: 33300 Epoch: [14] [3820/4276] eta: 0:23:00 lr: 3.287926752606017e-05 loss: 0.1186 (0.1409) time: 3.0768 data: 0.0086 max mem: 33300 Epoch: [14] [3830/4276] eta: 0:22:30 lr: 3.2876511112423335e-05 loss: 0.1262 (0.1409) time: 3.0504 data: 0.0079 max mem: 33300 Epoch: [14] [3840/4276] eta: 0:22:00 lr: 3.287375467310831e-05 loss: 0.1276 (0.1408) time: 3.1562 data: 0.0084 max mem: 33300 Epoch: [14] [3850/4276] eta: 0:21:29 lr: 3.287099820811245e-05 loss: 0.1184 (0.1408) time: 3.1833 data: 0.0087 max mem: 33300 Epoch: [14] [3860/4276] eta: 0:20:59 lr: 3.2868241717433115e-05 loss: 0.1253 (0.1408) time: 3.0987 data: 0.0082 max mem: 33300 Epoch: [14] [3870/4276] eta: 0:20:29 lr: 3.286548520106769e-05 loss: 0.1407 (0.1408) time: 3.0984 data: 0.0077 max mem: 33300 Epoch: [14] [3880/4276] eta: 0:19:59 lr: 3.286272865901353e-05 loss: 0.1341 (0.1408) time: 3.0958 data: 0.0076 max mem: 33300 Epoch: [14] [3890/4276] eta: 0:19:28 lr: 3.285997209126801e-05 loss: 0.1273 (0.1408) time: 3.0730 data: 0.0076 max mem: 33300 Epoch: [14] [3900/4276] eta: 0:18:58 lr: 3.285721549782849e-05 loss: 0.1324 (0.1408) time: 3.0146 data: 0.0073 max mem: 33300 Epoch: [14] [3910/4276] eta: 0:18:28 lr: 3.2854458878692335e-05 loss: 0.1249 (0.1407) time: 3.0325 data: 
0.0080 max mem: 33300 Epoch: [14] [3920/4276] eta: 0:17:58 lr: 3.285170223385692e-05 loss: 0.1176 (0.1407) time: 3.0574 data: 0.0084 max mem: 33300 Epoch: [14] [3930/4276] eta: 0:17:27 lr: 3.284894556331959e-05 loss: 0.1176 (0.1407) time: 3.0494 data: 0.0075 max mem: 33300 Epoch: [14] [3940/4276] eta: 0:16:57 lr: 3.2846188867077736e-05 loss: 0.1280 (0.1407) time: 3.0589 data: 0.0074 max mem: 33300 Epoch: [14] [3950/4276] eta: 0:16:27 lr: 3.284343214512869e-05 loss: 0.1204 (0.1406) time: 3.0620 data: 0.0080 max mem: 33300 Epoch: [14] [3960/4276] eta: 0:15:57 lr: 3.284067539746984e-05 loss: 0.1204 (0.1406) time: 3.0890 data: 0.0078 max mem: 33300 Epoch: [14] [3970/4276] eta: 0:15:26 lr: 3.2837918624098526e-05 loss: 0.1366 (0.1406) time: 3.1122 data: 0.0080 max mem: 33300 Epoch: [14] [3980/4276] eta: 0:14:56 lr: 3.283516182501213e-05 loss: 0.1347 (0.1406) time: 3.0617 data: 0.0082 max mem: 33300 Epoch: [14] [3990/4276] eta: 0:14:26 lr: 3.2832405000208005e-05 loss: 0.1331 (0.1406) time: 3.0508 data: 0.0082 max mem: 33300 Epoch: [14] [4000/4276] eta: 0:13:56 lr: 3.2829648149683515e-05 loss: 0.1205 (0.1406) time: 3.0420 data: 0.0083 max mem: 33300 Epoch: [14] [4010/4276] eta: 0:13:25 lr: 3.282689127343601e-05 loss: 0.1250 (0.1406) time: 2.9879 data: 0.0081 max mem: 33300 Epoch: [14] [4020/4276] eta: 0:12:55 lr: 3.2824134371462866e-05 loss: 0.1343 (0.1406) time: 2.9957 data: 0.0082 max mem: 33300 Epoch: [14] [4030/4276] eta: 0:12:25 lr: 3.282137744376142e-05 loss: 0.1308 (0.1405) time: 2.9957 data: 0.0079 max mem: 33300 Epoch: [14] [4040/4276] eta: 0:11:54 lr: 3.281862049032905e-05 loss: 0.1342 (0.1406) time: 2.9865 data: 0.0075 max mem: 33300 Epoch: [14] [4050/4276] eta: 0:11:24 lr: 3.281586351116311e-05 loss: 0.1342 (0.1406) time: 3.0055 data: 0.0075 max mem: 33300 Epoch: [14] [4060/4276] eta: 0:10:54 lr: 3.281310650626095e-05 loss: 0.1338 (0.1406) time: 3.0108 data: 0.0077 max mem: 33300 Epoch: [14] [4070/4276] eta: 0:10:23 lr: 3.281034947561993e-05 loss: 0.1403 
(0.1406) time: 2.9880 data: 0.0077 max mem: 33300 Epoch: [14] [4080/4276] eta: 0:09:53 lr: 3.280759241923742e-05 loss: 0.1353 (0.1406) time: 2.9734 data: 0.0079 max mem: 33300 Epoch: [14] [4090/4276] eta: 0:09:23 lr: 3.280483533711075e-05 loss: 0.1391 (0.1407) time: 2.9637 data: 0.0083 max mem: 33300 Epoch: [14] [4100/4276] eta: 0:08:52 lr: 3.2802078229237306e-05 loss: 0.1488 (0.1407) time: 2.9689 data: 0.0082 max mem: 33300 Epoch: [14] [4110/4276] eta: 0:08:22 lr: 3.2799321095614414e-05 loss: 0.1468 (0.1407) time: 2.9782 data: 0.0083 max mem: 33300 Epoch: [14] [4120/4276] eta: 0:07:52 lr: 3.2796563936239444e-05 loss: 0.1624 (0.1408) time: 2.9853 data: 0.0081 max mem: 33300 Epoch: [14] [4130/4276] eta: 0:07:22 lr: 3.279380675110975e-05 loss: 0.1480 (0.1408) time: 2.9955 data: 0.0076 max mem: 33300 Epoch: [14] [4140/4276] eta: 0:06:51 lr: 3.279104954022268e-05 loss: 0.1405 (0.1407) time: 3.0134 data: 0.0080 max mem: 33300 Epoch: [14] [4150/4276] eta: 0:06:21 lr: 3.2788292303575596e-05 loss: 0.1361 (0.1407) time: 3.0116 data: 0.0083 max mem: 33300 Epoch: [14] [4160/4276] eta: 0:05:51 lr: 3.278553504116584e-05 loss: 0.1385 (0.1408) time: 2.9691 data: 0.0083 max mem: 33300 Epoch: [14] [4170/4276] eta: 0:05:20 lr: 3.2782777752990766e-05 loss: 0.1553 (0.1408) time: 2.9393 data: 0.0089 max mem: 33300 Epoch: [14] [4180/4276] eta: 0:04:50 lr: 3.278002043904773e-05 loss: 0.1391 (0.1408) time: 2.9474 data: 0.0094 max mem: 33300 Epoch: [14] [4190/4276] eta: 0:04:20 lr: 3.2777263099334075e-05 loss: 0.1285 (0.1408) time: 2.9509 data: 0.0091 max mem: 33300 Epoch: [14] [4200/4276] eta: 0:03:50 lr: 3.277450573384716e-05 loss: 0.1435 (0.1408) time: 2.9439 data: 0.0085 max mem: 33300 Epoch: [14] [4210/4276] eta: 0:03:19 lr: 3.277174834258433e-05 loss: 0.1501 (0.1409) time: 2.9713 data: 0.0085 max mem: 33300 Epoch: [14] [4220/4276] eta: 0:02:49 lr: 3.276899092554294e-05 loss: 0.1464 (0.1409) time: 3.0159 data: 0.0083 max mem: 33300 Epoch: [14] [4230/4276] eta: 0:02:19 lr: 
3.276623348272033e-05 loss: 0.1598 (0.1410) time: 3.0239 data: 0.0081 max mem: 33300 Epoch: [14] [4240/4276] eta: 0:01:48 lr: 3.276347601411385e-05 loss: 0.1540 (0.1410) time: 3.0141 data: 0.0084 max mem: 33300 Epoch: [14] [4250/4276] eta: 0:01:18 lr: 3.2760718519720865e-05 loss: 0.1400 (0.1410) time: 3.0051 data: 0.0082 max mem: 33300 Epoch: [14] [4260/4276] eta: 0:00:48 lr: 3.275796099953869e-05 loss: 0.1318 (0.1410) time: 3.0021 data: 0.0079 max mem: 33300 Epoch: [14] [4270/4276] eta: 0:00:18 lr: 3.27552034535647e-05 loss: 0.1450 (0.1411) time: 2.9937 data: 0.0075 max mem: 33300 Epoch: [14] Total time: 3:35:41 Test: [ 0/21770] eta: 8:51:35 time: 1.4651 data: 1.4219 max mem: 33300 Test: [ 100/21770] eta: 0:19:08 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 200/21770] eta: 0:16:35 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 300/21770] eta: 0:15:42 time: 0.0394 data: 0.0009 max mem: 33300 Test: [ 400/21770] eta: 0:15:13 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 500/21770] eta: 0:14:54 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 600/21770] eta: 0:14:41 time: 0.0394 data: 0.0009 max mem: 33300 Test: [ 700/21770] eta: 0:14:31 time: 0.0399 data: 0.0010 max mem: 33300 Test: [ 800/21770] eta: 0:14:22 time: 0.0398 data: 0.0009 max mem: 33300 Test: [ 900/21770] eta: 0:14:14 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 1000/21770] eta: 0:14:08 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 1100/21770] eta: 0:14:02 time: 0.0399 data: 0.0010 max mem: 33300 Test: [ 1200/21770] eta: 0:13:56 time: 0.0399 data: 0.0010 max mem: 33300 Test: [ 1300/21770] eta: 0:13:50 time: 0.0394 data: 0.0009 max mem: 33300 Test: [ 1400/21770] eta: 0:13:45 time: 0.0395 data: 0.0009 max mem: 33300 Test: [ 1500/21770] eta: 0:13:38 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 1600/21770] eta: 0:13:33 time: 0.0400 data: 0.0011 max mem: 33300 Test: [ 1700/21770] eta: 0:13:27 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 1800/21770] eta: 0:13:22 time: 0.0383 data: 0.0009 
max mem: 33300 Test: [ 1900/21770] eta: 0:13:16 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 2000/21770] eta: 0:13:10 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 2100/21770] eta: 0:13:04 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 2200/21770] eta: 0:12:59 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 2300/21770] eta: 0:12:53 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 2400/21770] eta: 0:12:48 time: 0.0387 data: 0.0010 max mem: 33300 Test: [ 2500/21770] eta: 0:12:43 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 2600/21770] eta: 0:12:38 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 2700/21770] eta: 0:12:34 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 2800/21770] eta: 0:12:29 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 2900/21770] eta: 0:12:25 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 3000/21770] eta: 0:12:21 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 3100/21770] eta: 0:12:17 time: 0.0397 data: 0.0010 max mem: 33300 Test: [ 3200/21770] eta: 0:12:13 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 3300/21770] eta: 0:12:09 time: 0.0396 data: 0.0010 max mem: 33300 Test: [ 3400/21770] eta: 0:12:05 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 3500/21770] eta: 0:12:01 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 3600/21770] eta: 0:11:57 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 3700/21770] eta: 0:11:53 time: 0.0387 data: 0.0010 max mem: 33300 Test: [ 3800/21770] eta: 0:11:48 time: 0.0387 data: 0.0010 max mem: 33300 Test: [ 3900/21770] eta: 0:11:44 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 4000/21770] eta: 0:11:40 time: 0.0398 data: 0.0009 max mem: 33300 Test: [ 4100/21770] eta: 0:11:36 time: 0.0397 data: 0.0010 max mem: 33300 Test: [ 4200/21770] eta: 0:11:32 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 4300/21770] eta: 0:11:29 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 4400/21770] eta: 0:11:25 time: 0.0397 data: 0.0010 max mem: 33300 Test: [ 4500/21770] eta: 0:11:21 time: 0.0391 data: 0.0010 
max mem: 33300 Test: [ 4600/21770] eta: 0:11:17 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 4700/21770] eta: 0:11:12 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 4800/21770] eta: 0:11:08 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 4900/21770] eta: 0:11:05 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 5000/21770] eta: 0:11:01 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 5100/21770] eta: 0:10:57 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 5200/21770] eta: 0:10:53 time: 0.0395 data: 0.0009 max mem: 33300 Test: [ 5300/21770] eta: 0:10:49 time: 0.0394 data: 0.0009 max mem: 33300 Test: [ 5400/21770] eta: 0:10:45 time: 0.0396 data: 0.0009 max mem: 33300 Test: [ 5500/21770] eta: 0:10:41 time: 0.0397 data: 0.0010 max mem: 33300 Test: [ 5600/21770] eta: 0:10:37 time: 0.0396 data: 0.0009 max mem: 33300 Test: [ 5700/21770] eta: 0:10:33 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 5800/21770] eta: 0:10:29 time: 0.0401 data: 0.0009 max mem: 33300 Test: [ 5900/21770] eta: 0:10:25 time: 0.0395 data: 0.0009 max mem: 33300 Test: [ 6000/21770] eta: 0:10:21 time: 0.0397 data: 0.0009 max mem: 33300 Test: [ 6100/21770] eta: 0:10:17 time: 0.0395 data: 0.0009 max mem: 33300 Test: [ 6200/21770] eta: 0:10:14 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 6300/21770] eta: 0:10:10 time: 0.0398 data: 0.0009 max mem: 33300 Test: [ 6400/21770] eta: 0:10:06 time: 0.0394 data: 0.0009 max mem: 33300 Test: [ 6500/21770] eta: 0:10:02 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 6600/21770] eta: 0:09:58 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 6700/21770] eta: 0:09:54 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 6800/21770] eta: 0:09:50 time: 0.0397 data: 0.0010 max mem: 33300 Test: [ 6900/21770] eta: 0:09:46 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 7000/21770] eta: 0:09:42 time: 0.0400 data: 0.0010 max mem: 33300 Test: [ 7100/21770] eta: 0:09:38 time: 0.0395 data: 0.0009 max mem: 33300 Test: [ 7200/21770] eta: 0:09:34 time: 0.0390 data: 0.0010 
max mem: 33300 Test: [ 7300/21770] eta: 0:09:30 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 7400/21770] eta: 0:09:26 time: 0.0398 data: 0.0010 max mem: 33300 Test: [ 7500/21770] eta: 0:09:22 time: 0.0395 data: 0.0009 max mem: 33300 Test: [ 7600/21770] eta: 0:09:18 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 7700/21770] eta: 0:09:14 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 7800/21770] eta: 0:09:10 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 7900/21770] eta: 0:09:06 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 8000/21770] eta: 0:09:02 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 8100/21770] eta: 0:08:58 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 8200/21770] eta: 0:08:54 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 8300/21770] eta: 0:08:50 time: 0.0396 data: 0.0010 max mem: 33300 Test: [ 8400/21770] eta: 0:08:46 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 8500/21770] eta: 0:08:42 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 8600/21770] eta: 0:08:38 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 8700/21770] eta: 0:08:34 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 8800/21770] eta: 0:08:30 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 8900/21770] eta: 0:08:26 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 9000/21770] eta: 0:08:22 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 9100/21770] eta: 0:08:18 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 9200/21770] eta: 0:08:14 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 9300/21770] eta: 0:08:10 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 9400/21770] eta: 0:08:06 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 9500/21770] eta: 0:08:02 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 9600/21770] eta: 0:07:58 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 9700/21770] eta: 0:07:54 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 9800/21770] eta: 0:07:50 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 9900/21770] eta: 0:07:46 time: 0.0392 data: 0.0011 
max mem: 33300 Test: [10000/21770] eta: 0:07:42 time: 0.0390 data: 0.0009 max mem: 33300 Test: [10100/21770] eta: 0:07:39 time: 0.0394 data: 0.0010 max mem: 33300 Test: [10200/21770] eta: 0:07:35 time: 0.0389 data: 0.0010 max mem: 33300 Test: [10300/21770] eta: 0:07:31 time: 0.0396 data: 0.0011 max mem: 33300 Test: [10400/21770] eta: 0:07:27 time: 0.0389 data: 0.0009 max mem: 33300 Test: [10500/21770] eta: 0:07:23 time: 0.0386 data: 0.0010 max mem: 33300 Test: [10600/21770] eta: 0:07:19 time: 0.0387 data: 0.0010 max mem: 33300 Test: [10700/21770] eta: 0:07:15 time: 0.0388 data: 0.0010 max mem: 33300 Test: [10800/21770] eta: 0:07:11 time: 0.0380 data: 0.0009 max mem: 33300 Test: [10900/21770] eta: 0:07:07 time: 0.0381 data: 0.0010 max mem: 33300 Test: [11000/21770] eta: 0:07:03 time: 0.0381 data: 0.0009 max mem: 33300 Test: [11100/21770] eta: 0:06:59 time: 0.0383 data: 0.0009 max mem: 33300 Test: [11200/21770] eta: 0:06:55 time: 0.0382 data: 0.0010 max mem: 33300 Test: [11300/21770] eta: 0:06:51 time: 0.0384 data: 0.0009 max mem: 33300 Test: [11400/21770] eta: 0:06:47 time: 0.0385 data: 0.0010 max mem: 33300 Test: [11500/21770] eta: 0:06:43 time: 0.0386 data: 0.0010 max mem: 33300 Test: [11600/21770] eta: 0:06:39 time: 0.0381 data: 0.0010 max mem: 33300 Test: [11700/21770] eta: 0:06:35 time: 0.0381 data: 0.0010 max mem: 33300 Test: [11800/21770] eta: 0:06:31 time: 0.0383 data: 0.0010 max mem: 33300 Test: [11900/21770] eta: 0:06:27 time: 0.0387 data: 0.0009 max mem: 33300 Test: [12000/21770] eta: 0:06:23 time: 0.0383 data: 0.0009 max mem: 33300 Test: [12100/21770] eta: 0:06:19 time: 0.0385 data: 0.0010 max mem: 33300 Test: [12200/21770] eta: 0:06:15 time: 0.0389 data: 0.0009 max mem: 33300 Test: [12300/21770] eta: 0:06:11 time: 0.0396 data: 0.0011 max mem: 33300 Test: [12400/21770] eta: 0:06:07 time: 0.0399 data: 0.0012 max mem: 33300 Test: [12500/21770] eta: 0:06:03 time: 0.0391 data: 0.0011 max mem: 33300 Test: [12600/21770] eta: 0:05:59 time: 0.0394 data: 0.0012 
max mem: 33300 Test: [12700/21770] eta: 0:05:55 time: 0.0392 data: 0.0012 max mem: 33300 Test: [12800/21770] eta: 0:05:51 time: 0.0403 data: 0.0013 max mem: 33300 Test: [12900/21770] eta: 0:05:47 time: 0.0394 data: 0.0010 max mem: 33300 Test: [13000/21770] eta: 0:05:43 time: 0.0391 data: 0.0010 max mem: 33300 Test: [13100/21770] eta: 0:05:40 time: 0.0396 data: 0.0010 max mem: 33300 Test: [13200/21770] eta: 0:05:36 time: 0.0400 data: 0.0010 max mem: 33300 Test: [13300/21770] eta: 0:05:32 time: 0.0398 data: 0.0011 max mem: 33300 Test: [13400/21770] eta: 0:05:28 time: 0.0387 data: 0.0010 max mem: 33300 Test: [13500/21770] eta: 0:05:24 time: 0.0409 data: 0.0010 max mem: 33300 Test: [13600/21770] eta: 0:05:20 time: 0.0405 data: 0.0010 max mem: 33300 Test: [13700/21770] eta: 0:05:16 time: 0.0402 data: 0.0010 max mem: 33300 Test: [13800/21770] eta: 0:05:12 time: 0.0402 data: 0.0011 max mem: 33300 Test: [13900/21770] eta: 0:05:09 time: 0.0397 data: 0.0011 max mem: 33300 Test: [14000/21770] eta: 0:05:05 time: 0.0393 data: 0.0011 max mem: 33300 Test: [14100/21770] eta: 0:05:01 time: 0.0392 data: 0.0012 max mem: 33300 Test: [14200/21770] eta: 0:04:57 time: 0.0393 data: 0.0011 max mem: 33300 Test: [14300/21770] eta: 0:04:53 time: 0.0393 data: 0.0011 max mem: 33300 Test: [14400/21770] eta: 0:04:49 time: 0.0406 data: 0.0013 max mem: 33300 Test: [14500/21770] eta: 0:04:45 time: 0.0400 data: 0.0011 max mem: 33300 Test: [14600/21770] eta: 0:04:41 time: 0.0394 data: 0.0010 max mem: 33300 Test: [14700/21770] eta: 0:04:37 time: 0.0401 data: 0.0010 max mem: 33300 Test: [14800/21770] eta: 0:04:33 time: 0.0392 data: 0.0010 max mem: 33300 Test: [14900/21770] eta: 0:04:29 time: 0.0393 data: 0.0010 max mem: 33300 Test: [15000/21770] eta: 0:04:26 time: 0.0391 data: 0.0010 max mem: 33300 Test: [15100/21770] eta: 0:04:22 time: 0.0384 data: 0.0010 max mem: 33300 Test: [15200/21770] eta: 0:04:18 time: 0.0389 data: 0.0010 max mem: 33300 Test: [15300/21770] eta: 0:04:14 time: 0.0389 data: 0.0010 
max mem: 33300 Test: [15400/21770] eta: 0:04:10 time: 0.0387 data: 0.0010 max mem: 33300 Test: [15500/21770] eta: 0:04:06 time: 0.0397 data: 0.0010 max mem: 33300 Test: [15600/21770] eta: 0:04:02 time: 0.0393 data: 0.0011 max mem: 33300 Test: [15700/21770] eta: 0:03:58 time: 0.0400 data: 0.0012 max mem: 33300 Test: [15800/21770] eta: 0:03:54 time: 0.0396 data: 0.0012 max mem: 33300 Test: [15900/21770] eta: 0:03:50 time: 0.0394 data: 0.0012 max mem: 33300 Test: [16000/21770] eta: 0:03:46 time: 0.0390 data: 0.0012 max mem: 33300 Test: [16100/21770] eta: 0:03:42 time: 0.0401 data: 0.0013 max mem: 33300 Test: [16200/21770] eta: 0:03:38 time: 0.0402 data: 0.0013 max mem: 33300 Test: [16300/21770] eta: 0:03:34 time: 0.0402 data: 0.0010 max mem: 33300 Test: [16400/21770] eta: 0:03:31 time: 0.0399 data: 0.0010 max mem: 33300 Test: [16500/21770] eta: 0:03:27 time: 0.0386 data: 0.0010 max mem: 33300 Test: [16600/21770] eta: 0:03:23 time: 0.0386 data: 0.0010 max mem: 33300 Test: [16700/21770] eta: 0:03:19 time: 0.0389 data: 0.0011 max mem: 33300 Test: [16800/21770] eta: 0:03:15 time: 0.0386 data: 0.0010 max mem: 33300 Test: [16900/21770] eta: 0:03:11 time: 0.0386 data: 0.0010 max mem: 33300 Test: [17000/21770] eta: 0:03:07 time: 0.0396 data: 0.0011 max mem: 33300 Test: [17100/21770] eta: 0:03:03 time: 0.0391 data: 0.0010 max mem: 33300 Test: [17200/21770] eta: 0:02:59 time: 0.0396 data: 0.0010 max mem: 33300 Test: [17300/21770] eta: 0:02:55 time: 0.0396 data: 0.0012 max mem: 33300 Test: [17400/21770] eta: 0:02:51 time: 0.0386 data: 0.0010 max mem: 33300 Test: [17500/21770] eta: 0:02:47 time: 0.0395 data: 0.0011 max mem: 33300 Test: [17600/21770] eta: 0:02:43 time: 0.0383 data: 0.0009 max mem: 33300 Test: [17700/21770] eta: 0:02:39 time: 0.0387 data: 0.0010 max mem: 33300 Test: [17800/21770] eta: 0:02:35 time: 0.0385 data: 0.0009 max mem: 33300 Test: [17900/21770] eta: 0:02:31 time: 0.0387 data: 0.0009 max mem: 33300 Test: [18000/21770] eta: 0:02:28 time: 0.0385 data: 0.0009 
max mem: 33300 Test: [18100/21770] eta: 0:02:24 time: 0.0390 data: 0.0010 max mem: 33300 Test: [18200/21770] eta: 0:02:20 time: 0.0388 data: 0.0010 max mem: 33300 Test: [18300/21770] eta: 0:02:16 time: 0.0386 data: 0.0010 max mem: 33300 Test: [18400/21770] eta: 0:02:12 time: 0.0384 data: 0.0010 max mem: 33300 Test: [18500/21770] eta: 0:02:08 time: 0.0387 data: 0.0010 max mem: 33300 Test: [18600/21770] eta: 0:02:04 time: 0.0389 data: 0.0010 max mem: 33300 Test: [18700/21770] eta: 0:02:00 time: 0.0388 data: 0.0009 max mem: 33300 Test: [18800/21770] eta: 0:01:56 time: 0.0388 data: 0.0009 max mem: 33300 Test: [18900/21770] eta: 0:01:52 time: 0.0387 data: 0.0010 max mem: 33300 Test: [19000/21770] eta: 0:01:48 time: 0.0386 data: 0.0010 max mem: 33300 Test: [19100/21770] eta: 0:01:44 time: 0.0390 data: 0.0010 max mem: 33300 Test: [19200/21770] eta: 0:01:40 time: 0.0388 data: 0.0010 max mem: 33300 Test: [19300/21770] eta: 0:01:36 time: 0.0386 data: 0.0009 max mem: 33300 Test: [19400/21770] eta: 0:01:32 time: 0.0387 data: 0.0009 max mem: 33300 Test: [19500/21770] eta: 0:01:29 time: 0.0386 data: 0.0010 max mem: 33300 Test: [19600/21770] eta: 0:01:25 time: 0.0385 data: 0.0010 max mem: 33300 Test: [19700/21770] eta: 0:01:21 time: 0.0385 data: 0.0010 max mem: 33300 Test: [19800/21770] eta: 0:01:17 time: 0.0387 data: 0.0011 max mem: 33300 Test: [19900/21770] eta: 0:01:13 time: 0.0385 data: 0.0010 max mem: 33300 Test: [20000/21770] eta: 0:01:09 time: 0.0381 data: 0.0009 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0385 data: 0.0010 max mem: 33300 Test: [20200/21770] eta: 0:01:01 time: 0.0387 data: 0.0011 max mem: 33300 Test: [20300/21770] eta: 0:00:57 time: 0.0385 data: 0.0010 max mem: 33300 Test: [20400/21770] eta: 0:00:53 time: 0.0384 data: 0.0010 max mem: 33300 Test: [20500/21770] eta: 0:00:49 time: 0.0389 data: 0.0011 max mem: 33300 Test: [20600/21770] eta: 0:00:45 time: 0.0391 data: 0.0009 max mem: 33300 Test: [20700/21770] eta: 0:00:41 time: 0.0398 data: 0.0011 
max mem: 33300
Test: [20800/21770] eta: 0:00:38 time: 0.0388 data: 0.0011 max mem: 33300
Test: [20900/21770] eta: 0:00:34 time: 0.0397 data: 0.0011 max mem: 33300
Test: [21000/21770] eta: 0:00:30 time: 0.0390 data: 0.0009 max mem: 33300
Test: [21100/21770] eta: 0:00:26 time: 0.0392 data: 0.0011 max mem: 33300
Test: [21200/21770] eta: 0:00:22 time: 0.0396 data: 0.0012 max mem: 33300
Test: [21300/21770] eta: 0:00:18 time: 0.0400 data: 0.0012 max mem: 33300
Test: [21400/21770] eta: 0:00:14 time: 0.0408 data: 0.0014 max mem: 33300
Test: [21500/21770] eta: 0:00:10 time: 0.0387 data: 0.0010 max mem: 33300
Test: [21600/21770] eta: 0:00:06 time: 0.0386 data: 0.0010 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0388 data: 0.0011 max mem: 33300
Test: Total time: 0:14:13
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [15] [ 0/4276] eta: 6:16:59 lr: 3.275354891359912e-05 loss: 0.1102 (0.1102) time: 5.2900 data: 2.1394 max mem: 33300
Epoch: [15] [ 10/4276] eta: 3:46:10 lr: 3.275079132635268e-05 loss: 0.1318 (0.1391) time: 3.1812 data: 0.2015 max mem: 33300
Epoch: [15] [ 20/4276] eta: 3:39:55 lr: 3.274803371330752e-05 loss: 0.1343 (0.1490) time: 2.9910 data: 0.0075 max mem: 33300
Epoch: [15] [ 30/4276] eta: 3:37:25 lr: 3.274527607446097e-05 loss: 0.1385 (0.1459) time: 3.0127 data: 0.0079 max mem: 33300
Epoch: [15] [ 40/4276] eta: 3:35:39 lr: 3.2742518409810386e-05 loss: 0.1446 (0.1484) time: 3.0064 data: 0.0082 max mem: 33300
Epoch: [15] [ 50/4276] eta: 3:34:30 lr: 3.273976071935311e-05 loss: 0.1450 (0.1465) time: 3.0039 data: 0.0082 max mem: 33300
Epoch: [15] [ 60/4276] eta: 3:33:29 lr: 3.2737003003086476e-05 loss: 0.1336 (0.1448) time: 3.0052 data: 0.0079 max mem: 33300
Epoch: [15] [ 70/4276] eta: 3:32:47 lr: 3.2734245261007835e-05 loss: 0.1303 (0.1435)
time: 3.0097 data: 0.0078 max mem: 33300 Epoch: [15] [ 80/4276] eta: 3:32:03 lr: 3.273148749311453e-05 loss: 0.1303 (0.1444) time: 3.0135 data: 0.0079 max mem: 33300 Epoch: [15] [ 90/4276] eta: 3:31:11 lr: 3.2728729699403903e-05 loss: 0.1280 (0.1427) time: 2.9975 data: 0.0076 max mem: 33300 Epoch: [15] [ 100/4276] eta: 3:30:23 lr: 3.272597187987329e-05 loss: 0.1280 (0.1443) time: 2.9855 data: 0.0076 max mem: 33300 Epoch: [15] [ 110/4276] eta: 3:29:46 lr: 3.272321403452003e-05 loss: 0.1341 (0.1442) time: 2.9950 data: 0.0074 max mem: 33300 Epoch: [15] [ 120/4276] eta: 3:29:15 lr: 3.2720456163341474e-05 loss: 0.1308 (0.1437) time: 3.0116 data: 0.0074 max mem: 33300 Epoch: [15] [ 130/4276] eta: 3:28:36 lr: 3.271769826633495e-05 loss: 0.1395 (0.1445) time: 3.0063 data: 0.0076 max mem: 33300 Epoch: [15] [ 140/4276] eta: 3:28:02 lr: 3.27149403434978e-05 loss: 0.1320 (0.1436) time: 2.9999 data: 0.0075 max mem: 33300 Epoch: [15] [ 150/4276] eta: 3:27:39 lr: 3.2712182394827365e-05 loss: 0.1315 (0.1434) time: 3.0251 data: 0.0073 max mem: 33300 Epoch: [15] [ 160/4276] eta: 3:27:00 lr: 3.2709424420320986e-05 loss: 0.1372 (0.1437) time: 3.0136 data: 0.0071 max mem: 33300 Epoch: [15] [ 170/4276] eta: 3:26:27 lr: 3.2706666419976e-05 loss: 0.1304 (0.1432) time: 2.9939 data: 0.0068 max mem: 33300 Epoch: [15] [ 180/4276] eta: 3:25:55 lr: 3.270390839378974e-05 loss: 0.1286 (0.1432) time: 3.0078 data: 0.0071 max mem: 33300 Epoch: [15] [ 190/4276] eta: 3:25:23 lr: 3.270115034175953e-05 loss: 0.1286 (0.1432) time: 3.0082 data: 0.0073 max mem: 33300 Epoch: [15] [ 200/4276] eta: 3:24:51 lr: 3.269839226388273e-05 loss: 0.1318 (0.1427) time: 3.0087 data: 0.0073 max mem: 33300 Epoch: [15] [ 210/4276] eta: 3:24:27 lr: 3.269563416015666e-05 loss: 0.1341 (0.1430) time: 3.0274 data: 0.0074 max mem: 33300 Epoch: [15] [ 220/4276] eta: 3:23:57 lr: 3.269287603057866e-05 loss: 0.1285 (0.1427) time: 3.0322 data: 0.0078 max mem: 33300 Epoch: [15] [ 230/4276] eta: 3:23:30 lr: 3.269011787514607e-05 loss: 
0.1268 (0.1421) time: 3.0262 data: 0.0080 max mem: 33300 Epoch: [15] [ 240/4276] eta: 3:22:59 lr: 3.268735969385621e-05 loss: 0.1337 (0.1420) time: 3.0236 data: 0.0078 max mem: 33300 Epoch: [15] [ 250/4276] eta: 3:22:27 lr: 3.268460148670643e-05 loss: 0.1406 (0.1425) time: 3.0102 data: 0.0078 max mem: 33300 Epoch: [15] [ 260/4276] eta: 3:21:58 lr: 3.268184325369404e-05 loss: 0.1425 (0.1425) time: 3.0143 data: 0.0076 max mem: 33300 Epoch: [15] [ 270/4276] eta: 3:21:22 lr: 3.267908499481639e-05 loss: 0.1354 (0.1424) time: 3.0036 data: 0.0075 max mem: 33300 Epoch: [15] [ 280/4276] eta: 3:20:47 lr: 3.267632671007081e-05 loss: 0.1345 (0.1419) time: 2.9827 data: 0.0077 max mem: 33300 Epoch: [15] [ 290/4276] eta: 3:20:18 lr: 3.267356839945462e-05 loss: 0.1295 (0.1415) time: 3.0015 data: 0.0086 max mem: 33300 Epoch: [15] [ 300/4276] eta: 3:19:50 lr: 3.267081006296517e-05 loss: 0.1289 (0.1413) time: 3.0261 data: 0.0094 max mem: 33300 Epoch: [15] [ 310/4276] eta: 3:19:21 lr: 3.2668051700599774e-05 loss: 0.1279 (0.1407) time: 3.0264 data: 0.0093 max mem: 33300 Epoch: [15] [ 320/4276] eta: 3:18:52 lr: 3.266529331235577e-05 loss: 0.1305 (0.1410) time: 3.0245 data: 0.0091 max mem: 33300 Epoch: [15] [ 330/4276] eta: 3:18:20 lr: 3.266253489823048e-05 loss: 0.1404 (0.1411) time: 3.0161 data: 0.0089 max mem: 33300 Epoch: [15] [ 340/4276] eta: 3:17:47 lr: 3.265977645822124e-05 loss: 0.1230 (0.1408) time: 2.9954 data: 0.0088 max mem: 33300 Epoch: [15] [ 350/4276] eta: 3:17:26 lr: 3.2657017992325365e-05 loss: 0.1166 (0.1404) time: 3.0445 data: 0.0089 max mem: 33300 Epoch: [15] [ 360/4276] eta: 3:16:56 lr: 3.26542595005402e-05 loss: 0.1395 (0.1413) time: 3.0575 data: 0.0090 max mem: 33300 Epoch: [15] [ 370/4276] eta: 3:16:21 lr: 3.265150098286305e-05 loss: 0.1378 (0.1409) time: 2.9921 data: 0.0087 max mem: 33300 Epoch: [15] [ 380/4276] eta: 3:15:51 lr: 3.264874243929127e-05 loss: 0.1281 (0.1408) time: 2.9978 data: 0.0085 max mem: 33300 Epoch: [15] [ 390/4276] eta: 3:15:24 lr: 
3.264598386982217e-05 loss: 0.1321 (0.1411) time: 3.0357 data: 0.0085 max mem: 33300 Epoch: [15] [ 400/4276] eta: 3:14:57 lr: 3.264322527445307e-05 loss: 0.1416 (0.1411) time: 3.0468 data: 0.0085 max mem: 33300 Epoch: [15] [ 410/4276] eta: 3:14:24 lr: 3.264046665318131e-05 loss: 0.1357 (0.1410) time: 3.0171 data: 0.0079 max mem: 33300 Epoch: [15] [ 420/4276] eta: 3:13:51 lr: 3.2637708006004206e-05 loss: 0.1343 (0.1409) time: 2.9901 data: 0.0077 max mem: 33300 Epoch: [15] [ 430/4276] eta: 3:13:20 lr: 3.2634949332919065e-05 loss: 0.1391 (0.1410) time: 2.9965 data: 0.0081 max mem: 33300 Epoch: [15] [ 440/4276] eta: 3:12:49 lr: 3.263219063392324e-05 loss: 0.1431 (0.1407) time: 3.0026 data: 0.0081 max mem: 33300 Epoch: [15] [ 450/4276] eta: 3:12:16 lr: 3.2629431909014035e-05 loss: 0.1340 (0.1406) time: 2.9983 data: 0.0080 max mem: 33300 Epoch: [15] [ 460/4276] eta: 3:11:44 lr: 3.262667315818878e-05 loss: 0.1264 (0.1401) time: 2.9930 data: 0.0079 max mem: 33300 Epoch: [15] [ 470/4276] eta: 3:11:16 lr: 3.26239143814448e-05 loss: 0.1142 (0.1398) time: 3.0158 data: 0.0077 max mem: 33300 Epoch: [15] [ 480/4276] eta: 3:10:49 lr: 3.26211555787794e-05 loss: 0.1142 (0.1394) time: 3.0436 data: 0.0078 max mem: 33300 Epoch: [15] [ 490/4276] eta: 3:10:17 lr: 3.261839675018992e-05 loss: 0.1133 (0.1389) time: 3.0199 data: 0.0077 max mem: 33300 Epoch: [15] [ 500/4276] eta: 3:09:45 lr: 3.261563789567366e-05 loss: 0.1109 (0.1385) time: 2.9925 data: 0.0075 max mem: 33300 Epoch: [15] [ 510/4276] eta: 3:09:13 lr: 3.261287901522796e-05 loss: 0.1109 (0.1383) time: 2.9936 data: 0.0079 max mem: 33300 Epoch: [15] [ 520/4276] eta: 3:08:40 lr: 3.2610120108850127e-05 loss: 0.1348 (0.1382) time: 2.9871 data: 0.0082 max mem: 33300 Epoch: [15] [ 530/4276] eta: 3:08:09 lr: 3.2607361176537484e-05 loss: 0.1287 (0.1380) time: 2.9857 data: 0.0079 max mem: 33300 Epoch: [15] [ 540/4276] eta: 3:07:38 lr: 3.2604602218287345e-05 loss: 0.1206 (0.1376) time: 2.9994 data: 0.0081 max mem: 33300 Epoch: [15] [ 
550/4276] eta: 3:07:08 lr: 3.260184323409703e-05 loss: 0.1260 (0.1376) time: 3.0093 data: 0.0087 max mem: 33300 Epoch: [15] [ 560/4276] eta: 3:06:40 lr: 3.2599084223963853e-05 loss: 0.1328 (0.1375) time: 3.0289 data: 0.0082 max mem: 33300 Epoch: [15] [ 570/4276] eta: 3:06:10 lr: 3.2596325187885135e-05 loss: 0.1320 (0.1374) time: 3.0361 data: 0.0075 max mem: 33300 Epoch: [15] [ 580/4276] eta: 3:05:42 lr: 3.2593566125858186e-05 loss: 0.1287 (0.1374) time: 3.0343 data: 0.0074 max mem: 33300 Epoch: [15] [ 590/4276] eta: 3:05:11 lr: 3.2590807037880335e-05 loss: 0.1120 (0.1370) time: 3.0222 data: 0.0074 max mem: 33300 Epoch: [15] [ 600/4276] eta: 3:04:40 lr: 3.258804792394888e-05 loss: 0.1149 (0.1369) time: 3.0025 data: 0.0073 max mem: 33300 Epoch: [15] [ 610/4276] eta: 3:04:09 lr: 3.258528878406114e-05 loss: 0.1237 (0.1367) time: 2.9962 data: 0.0077 max mem: 33300 Epoch: [15] [ 620/4276] eta: 3:03:38 lr: 3.258252961821444e-05 loss: 0.1331 (0.1367) time: 2.9972 data: 0.0079 max mem: 33300 Epoch: [15] [ 630/4276] eta: 3:03:07 lr: 3.257977042640607e-05 loss: 0.1337 (0.1369) time: 3.0006 data: 0.0077 max mem: 33300 Epoch: [15] [ 640/4276] eta: 3:02:36 lr: 3.257701120863337e-05 loss: 0.1375 (0.1369) time: 3.0023 data: 0.0076 max mem: 33300 Epoch: [15] [ 650/4276] eta: 3:02:07 lr: 3.2574251964893635e-05 loss: 0.1333 (0.1370) time: 3.0154 data: 0.0075 max mem: 33300 Epoch: [15] [ 660/4276] eta: 3:01:37 lr: 3.2571492695184175e-05 loss: 0.1372 (0.1372) time: 3.0241 data: 0.0076 max mem: 33300 Epoch: [15] [ 670/4276] eta: 3:01:08 lr: 3.256873339950231e-05 loss: 0.1363 (0.1371) time: 3.0278 data: 0.0076 max mem: 33300 Epoch: [15] [ 680/4276] eta: 3:00:37 lr: 3.2565974077845345e-05 loss: 0.1245 (0.1371) time: 3.0080 data: 0.0072 max mem: 33300 Epoch: [15] [ 690/4276] eta: 3:00:06 lr: 3.256321473021059e-05 loss: 0.1316 (0.1371) time: 2.9990 data: 0.0075 max mem: 33300 Epoch: [15] [ 700/4276] eta: 2:59:41 lr: 3.2560455356595366e-05 loss: 0.1296 (0.1370) time: 3.0639 data: 0.0084 max 
mem: 33300 Epoch: [15] [ 710/4276] eta: 2:59:11 lr: 3.255769595699697e-05 loss: 0.1267 (0.1370) time: 3.0608 data: 0.0083 max mem: 33300 Epoch: [15] [ 720/4276] eta: 2:58:40 lr: 3.255493653141271e-05 loss: 0.1325 (0.1369) time: 3.0050 data: 0.0080 max mem: 33300 Epoch: [15] [ 730/4276] eta: 2:58:10 lr: 3.2552177079839896e-05 loss: 0.1325 (0.1369) time: 3.0072 data: 0.0080 max mem: 33300 Epoch: [15] [ 740/4276] eta: 2:57:39 lr: 3.254941760227584e-05 loss: 0.1221 (0.1367) time: 3.0113 data: 0.0080 max mem: 33300 Epoch: [15] [ 750/4276] eta: 2:57:11 lr: 3.2546658098717844e-05 loss: 0.1242 (0.1367) time: 3.0372 data: 0.0082 max mem: 33300 Epoch: [15] [ 760/4276] eta: 2:56:43 lr: 3.254389856916321e-05 loss: 0.1300 (0.1366) time: 3.0514 data: 0.0081 max mem: 33300 Epoch: [15] [ 770/4276] eta: 2:56:13 lr: 3.254113901360925e-05 loss: 0.1332 (0.1368) time: 3.0351 data: 0.0077 max mem: 33300 Epoch: [15] [ 780/4276] eta: 2:55:43 lr: 3.253837943205328e-05 loss: 0.1351 (0.1369) time: 3.0209 data: 0.0076 max mem: 33300 Epoch: [15] [ 790/4276] eta: 2:55:11 lr: 3.253561982449258e-05 loss: 0.1451 (0.1370) time: 2.9996 data: 0.0082 max mem: 33300 Epoch: [15] [ 800/4276] eta: 2:54:41 lr: 3.253286019092448e-05 loss: 0.1330 (0.1370) time: 2.9976 data: 0.0085 max mem: 33300 Epoch: [15] [ 810/4276] eta: 2:54:10 lr: 3.253010053134626e-05 loss: 0.1301 (0.1371) time: 3.0001 data: 0.0079 max mem: 33300 Epoch: [15] [ 820/4276] eta: 2:53:39 lr: 3.2527340845755234e-05 loss: 0.1288 (0.1370) time: 2.9991 data: 0.0076 max mem: 33300 Epoch: [15] [ 830/4276] eta: 2:53:09 lr: 3.25245811341487e-05 loss: 0.1310 (0.1371) time: 3.0166 data: 0.0080 max mem: 33300 Epoch: [15] [ 840/4276] eta: 2:52:40 lr: 3.2521821396523975e-05 loss: 0.1404 (0.1372) time: 3.0302 data: 0.0079 max mem: 33300 Epoch: [15] [ 850/4276] eta: 2:52:09 lr: 3.251906163287835e-05 loss: 0.1217 (0.1371) time: 3.0174 data: 0.0074 max mem: 33300 Epoch: [15] [ 860/4276] eta: 2:51:38 lr: 3.251630184320912e-05 loss: 0.1293 (0.1371) time: 
2.9993 data: 0.0073 max mem: 33300 Epoch: [15] [ 870/4276] eta: 2:51:08 lr: 3.2513542027513596e-05 loss: 0.1367 (0.1371) time: 3.0053 data: 0.0075 max mem: 33300 Epoch: [15] [ 880/4276] eta: 2:50:38 lr: 3.2510782185789065e-05 loss: 0.1312 (0.1372) time: 3.0111 data: 0.0074 max mem: 33300 Epoch: [15] [ 890/4276] eta: 2:50:07 lr: 3.2508022318032846e-05 loss: 0.1520 (0.1374) time: 3.0083 data: 0.0078 max mem: 33300 Epoch: [15] [ 900/4276] eta: 2:49:36 lr: 3.2505262424242216e-05 loss: 0.1414 (0.1373) time: 2.9995 data: 0.0084 max mem: 33300 Epoch: [15] [ 910/4276] eta: 2:49:07 lr: 3.250250250441449e-05 loss: 0.1345 (0.1374) time: 3.0104 data: 0.0080 max mem: 33300 Epoch: [15] [ 920/4276] eta: 2:48:38 lr: 3.2499742558546954e-05 loss: 0.1365 (0.1375) time: 3.0449 data: 0.0080 max mem: 33300 Epoch: [15] [ 930/4276] eta: 2:48:08 lr: 3.2496982586636915e-05 loss: 0.1365 (0.1375) time: 3.0345 data: 0.0078 max mem: 33300 Epoch: [15] [ 940/4276] eta: 2:47:37 lr: 3.249422258868167e-05 loss: 0.1291 (0.1374) time: 2.9987 data: 0.0074 max mem: 33300 Epoch: [15] [ 950/4276] eta: 2:47:05 lr: 3.249146256467851e-05 loss: 0.1392 (0.1375) time: 2.9803 data: 0.0072 max mem: 33300 Epoch: [15] [ 960/4276] eta: 2:46:33 lr: 3.248870251462472e-05 loss: 0.1447 (0.1375) time: 2.9594 data: 0.0076 max mem: 33300 Epoch: [15] [ 970/4276] eta: 2:46:01 lr: 3.2485942438517605e-05 loss: 0.1447 (0.1376) time: 2.9604 data: 0.0078 max mem: 33300 Epoch: [15] [ 980/4276] eta: 2:45:29 lr: 3.248318233635447e-05 loss: 0.1426 (0.1377) time: 2.9586 data: 0.0078 max mem: 33300 Epoch: [15] [ 990/4276] eta: 2:44:58 lr: 3.248042220813259e-05 loss: 0.1384 (0.1376) time: 2.9658 data: 0.0086 max mem: 33300 Epoch: [15] [1000/4276] eta: 2:44:28 lr: 3.247766205384928e-05 loss: 0.1342 (0.1377) time: 2.9993 data: 0.0087 max mem: 33300 Epoch: [15] [1010/4276] eta: 2:43:56 lr: 3.247490187350182e-05 loss: 0.1250 (0.1376) time: 2.9925 data: 0.0076 max mem: 33300 Epoch: [15] [1020/4276] eta: 2:43:25 lr: 3.2472141667087495e-05 
loss: 0.1250 (0.1376) time: 2.9717 data: 0.0075 max mem: 33300 Epoch: [15] [1030/4276] eta: 2:42:54 lr: 3.246938143460362e-05 loss: 0.1295 (0.1376) time: 2.9803 data: 0.0088 max mem: 33300 Epoch: [15] [1040/4276] eta: 2:42:24 lr: 3.246662117604746e-05 loss: 0.1297 (0.1376) time: 2.9947 data: 0.0093 max mem: 33300 Epoch: [15] [1050/4276] eta: 2:41:54 lr: 3.246386089141631e-05 loss: 0.1309 (0.1377) time: 3.0138 data: 0.0087 max mem: 33300 Epoch: [15] [1060/4276] eta: 2:41:25 lr: 3.246110058070748e-05 loss: 0.1413 (0.1379) time: 3.0423 data: 0.0088 max mem: 33300 Epoch: [15] [1070/4276] eta: 2:40:57 lr: 3.245834024391824e-05 loss: 0.1694 (0.1382) time: 3.0685 data: 0.0089 max mem: 33300 Epoch: [15] [1080/4276] eta: 2:40:29 lr: 3.245557988104589e-05 loss: 0.1410 (0.1381) time: 3.0755 data: 0.0089 max mem: 33300 Epoch: [15] [1090/4276] eta: 2:40:01 lr: 3.2452819492087715e-05 loss: 0.1339 (0.1381) time: 3.0825 data: 0.0088 max mem: 33300 Epoch: [15] [1100/4276] eta: 2:39:31 lr: 3.2450059077041e-05 loss: 0.1337 (0.1382) time: 3.0503 data: 0.0081 max mem: 33300 Epoch: [15] [1110/4276] eta: 2:38:59 lr: 3.2447298635903044e-05 loss: 0.1362 (0.1382) time: 2.9901 data: 0.0079 max mem: 33300 Epoch: [15] [1120/4276] eta: 2:38:29 lr: 3.2444538168671116e-05 loss: 0.1362 (0.1381) time: 2.9882 data: 0.0081 max mem: 33300 Epoch: [15] [1130/4276] eta: 2:38:00 lr: 3.2441777675342515e-05 loss: 0.1169 (0.1380) time: 3.0263 data: 0.0091 max mem: 33300 Epoch: [15] [1140/4276] eta: 2:37:30 lr: 3.2439017155914526e-05 loss: 0.1181 (0.1380) time: 3.0440 data: 0.0101 max mem: 33300 Epoch: [15] [1150/4276] eta: 2:37:02 lr: 3.243625661038442e-05 loss: 0.1289 (0.1378) time: 3.0567 data: 0.0099 max mem: 33300 Epoch: [15] [1160/4276] eta: 2:36:33 lr: 3.243349603874952e-05 loss: 0.1311 (0.1380) time: 3.0693 data: 0.0097 max mem: 33300 Epoch: [15] [1170/4276] eta: 2:36:05 lr: 3.2430735441007064e-05 loss: 0.1405 (0.1381) time: 3.0730 data: 0.0098 max mem: 33300 Epoch: [15] [1180/4276] eta: 2:35:36 lr: 
3.2427974817154366e-05 loss: 0.1367 (0.1380) time: 3.0833 data: 0.0101 max mem: 33300 Epoch: [15] [1190/4276] eta: 2:35:07 lr: 3.242521416718869e-05 loss: 0.1280 (0.1380) time: 3.0562 data: 0.0095 max mem: 33300 Epoch: [15] [1200/4276] eta: 2:34:37 lr: 3.242245349110733e-05 loss: 0.1215 (0.1379) time: 3.0302 data: 0.0084 max mem: 33300 Epoch: [15] [1210/4276] eta: 2:34:07 lr: 3.241969278890758e-05 loss: 0.1183 (0.1379) time: 3.0251 data: 0.0088 max mem: 33300 Epoch: [15] [1220/4276] eta: 2:33:38 lr: 3.2416932060586693e-05 loss: 0.1366 (0.1380) time: 3.0433 data: 0.0093 max mem: 33300 Epoch: [15] [1230/4276] eta: 2:33:09 lr: 3.2414171306141966e-05 loss: 0.1427 (0.1380) time: 3.0647 data: 0.0093 max mem: 33300 Epoch: [15] [1240/4276] eta: 2:32:39 lr: 3.241141052557069e-05 loss: 0.1366 (0.1380) time: 3.0375 data: 0.0088 max mem: 33300 Epoch: [15] [1250/4276] eta: 2:32:06 lr: 3.240864971887012e-05 loss: 0.1359 (0.1381) time: 2.9748 data: 0.0078 max mem: 33300 Epoch: [15] [1260/4276] eta: 2:31:36 lr: 3.240588888603756e-05 loss: 0.1295 (0.1380) time: 2.9684 data: 0.0076 max mem: 33300 Epoch: [15] [1270/4276] eta: 2:31:05 lr: 3.240312802707027e-05 loss: 0.1278 (0.1380) time: 2.9837 data: 0.0077 max mem: 33300 Epoch: [15] [1280/4276] eta: 2:30:36 lr: 3.2400367141965546e-05 loss: 0.1350 (0.1380) time: 3.0296 data: 0.0086 max mem: 33300 Epoch: [15] [1290/4276] eta: 2:30:05 lr: 3.239760623072065e-05 loss: 0.1350 (0.1380) time: 3.0392 data: 0.0096 max mem: 33300 Epoch: [15] [1300/4276] eta: 2:29:35 lr: 3.2394845293332856e-05 loss: 0.1213 (0.1379) time: 3.0018 data: 0.0095 max mem: 33300 Epoch: [15] [1310/4276] eta: 2:29:05 lr: 3.239208432979946e-05 loss: 0.1210 (0.1379) time: 3.0042 data: 0.0091 max mem: 33300 Epoch: [15] [1320/4276] eta: 2:28:34 lr: 3.238932334011773e-05 loss: 0.1311 (0.1380) time: 2.9933 data: 0.0090 max mem: 33300 Epoch: [15] [1330/4276] eta: 2:28:04 lr: 3.238656232428493e-05 loss: 0.1334 (0.1380) time: 3.0115 data: 0.0088 max mem: 33300 Epoch: [15] 
[1340/4276] eta: 2:27:34 lr: 3.2383801282298364e-05 loss: 0.1334 (0.1380) time: 3.0249 data: 0.0087 max mem: 33300 Epoch: [15] [1350/4276] eta: 2:27:04 lr: 3.2381040214155276e-05 loss: 0.1333 (0.1379) time: 3.0290 data: 0.0090 max mem: 33300 Epoch: [15] [1360/4276] eta: 2:26:34 lr: 3.237827911985295e-05 loss: 0.1295 (0.1379) time: 3.0228 data: 0.0089 max mem: 33300 Epoch: [15] [1370/4276] eta: 2:26:04 lr: 3.2375517999388664e-05 loss: 0.1267 (0.1379) time: 3.0157 data: 0.0092 max mem: 33300 Epoch: [15] [1380/4276] eta: 2:25:33 lr: 3.2372756852759685e-05 loss: 0.1304 (0.1379) time: 3.0102 data: 0.0093 max mem: 33300 Epoch: [15] [1390/4276] eta: 2:25:02 lr: 3.236999567996329e-05 loss: 0.1470 (0.1380) time: 2.9852 data: 0.0095 max mem: 33300 Epoch: [15] [1400/4276] eta: 2:24:33 lr: 3.236723448099676e-05 loss: 0.1470 (0.1381) time: 3.0106 data: 0.0099 max mem: 33300 Epoch: [15] [1410/4276] eta: 2:24:02 lr: 3.2364473255857345e-05 loss: 0.1360 (0.1381) time: 3.0118 data: 0.0096 max mem: 33300 Epoch: [15] [1420/4276] eta: 2:23:31 lr: 3.2361712004542324e-05 loss: 0.1265 (0.1381) time: 2.9796 data: 0.0091 max mem: 33300 Epoch: [15] [1430/4276] eta: 2:23:01 lr: 3.235895072704898e-05 loss: 0.1265 (0.1380) time: 3.0030 data: 0.0093 max mem: 33300 Epoch: [15] [1440/4276] eta: 2:22:31 lr: 3.235618942337456e-05 loss: 0.1273 (0.1380) time: 3.0180 data: 0.0093 max mem: 33300 Epoch: [15] [1450/4276] eta: 2:22:01 lr: 3.235342809351635e-05 loss: 0.1289 (0.1380) time: 3.0175 data: 0.0091 max mem: 33300 Epoch: [15] [1460/4276] eta: 2:21:31 lr: 3.2350666737471616e-05 loss: 0.1343 (0.1380) time: 3.0122 data: 0.0088 max mem: 33300 Epoch: [15] [1470/4276] eta: 2:21:00 lr: 3.234790535523763e-05 loss: 0.1356 (0.1380) time: 3.0011 data: 0.0087 max mem: 33300 Epoch: [15] [1480/4276] eta: 2:20:29 lr: 3.234514394681165e-05 loss: 0.1219 (0.1379) time: 2.9868 data: 0.0090 max mem: 33300 Epoch: [15] [1490/4276] eta: 2:19:59 lr: 3.2342382512190945e-05 loss: 0.1138 (0.1378) time: 2.9837 data: 0.0092 
max mem: 33300 Epoch: [15] [1500/4276] eta: 2:19:28 lr: 3.233962105137278e-05 loss: 0.1138 (0.1377) time: 2.9846 data: 0.0091 max mem: 33300 Epoch: [15] [1510/4276] eta: 2:18:57 lr: 3.2336859564354424e-05 loss: 0.1178 (0.1376) time: 2.9839 data: 0.0087 max mem: 33300 Epoch: [15] [1520/4276] eta: 2:18:29 lr: 3.233409805113315e-05 loss: 0.1170 (0.1376) time: 3.0467 data: 0.0093 max mem: 33300 Epoch: [15] [1530/4276] eta: 2:17:58 lr: 3.233133651170621e-05 loss: 0.1201 (0.1375) time: 3.0478 data: 0.0092 max mem: 33300 Epoch: [15] [1540/4276] eta: 2:17:27 lr: 3.2328574946070875e-05 loss: 0.1230 (0.1376) time: 2.9877 data: 0.0086 max mem: 33300 Epoch: [15] [1550/4276] eta: 2:16:57 lr: 3.2325813354224405e-05 loss: 0.1533 (0.1377) time: 2.9922 data: 0.0085 max mem: 33300 Epoch: [15] [1560/4276] eta: 2:16:26 lr: 3.232305173616407e-05 loss: 0.1402 (0.1376) time: 2.9962 data: 0.0083 max mem: 33300 Epoch: [15] [1570/4276] eta: 2:15:55 lr: 3.2320290091887126e-05 loss: 0.1393 (0.1376) time: 2.9806 data: 0.0087 max mem: 33300 Epoch: [15] [1580/4276] eta: 2:15:25 lr: 3.2317528421390834e-05 loss: 0.1233 (0.1376) time: 2.9742 data: 0.0092 max mem: 33300 Epoch: [15] [1590/4276] eta: 2:14:54 lr: 3.2314766724672456e-05 loss: 0.1273 (0.1376) time: 2.9848 data: 0.0089 max mem: 33300 Epoch: [15] [1600/4276] eta: 2:14:24 lr: 3.231200500172926e-05 loss: 0.1371 (0.1377) time: 3.0106 data: 0.0084 max mem: 33300 Epoch: [15] [1610/4276] eta: 2:13:54 lr: 3.23092432525585e-05 loss: 0.1266 (0.1376) time: 3.0145 data: 0.0083 max mem: 33300 Epoch: [15] [1620/4276] eta: 2:13:23 lr: 3.230648147715744e-05 loss: 0.1223 (0.1375) time: 2.9931 data: 0.0084 max mem: 33300 Epoch: [15] [1630/4276] eta: 2:12:53 lr: 3.2303719675523343e-05 loss: 0.1244 (0.1375) time: 2.9877 data: 0.0082 max mem: 33300 Epoch: [15] [1640/4276] eta: 2:12:22 lr: 3.230095784765345e-05 loss: 0.1227 (0.1374) time: 2.9908 data: 0.0082 max mem: 33300 Epoch: [15] [1650/4276] eta: 2:11:53 lr: 3.2298195993545044e-05 loss: 0.1227 (0.1373) 
time: 3.0230 data: 0.0084 max mem: 33300 Epoch: [15] [1660/4276] eta: 2:11:23 lr: 3.229543411319536e-05 loss: 0.1256 (0.1373) time: 3.0514 data: 0.0085 max mem: 33300 Epoch: [15] [1670/4276] eta: 2:10:54 lr: 3.229267220660166e-05 loss: 0.1222 (0.1372) time: 3.0435 data: 0.0088 max mem: 33300 Epoch: [15] [1680/4276] eta: 2:10:24 lr: 3.2289910273761216e-05 loss: 0.1343 (0.1373) time: 3.0466 data: 0.0081 max mem: 33300 Epoch: [15] [1690/4276] eta: 2:09:55 lr: 3.228714831467127e-05 loss: 0.1290 (0.1372) time: 3.0634 data: 0.0073 max mem: 33300 Epoch: [15] [1700/4276] eta: 2:09:25 lr: 3.228438632932908e-05 loss: 0.1275 (0.1372) time: 3.0555 data: 0.0073 max mem: 33300 Epoch: [15] [1710/4276] eta: 2:08:55 lr: 3.228162431773191e-05 loss: 0.1403 (0.1372) time: 3.0324 data: 0.0074 max mem: 33300 Epoch: [15] [1720/4276] eta: 2:08:25 lr: 3.2278862279876995e-05 loss: 0.1363 (0.1373) time: 3.0202 data: 0.0076 max mem: 33300 Epoch: [15] [1730/4276] eta: 2:07:55 lr: 3.2276100215761614e-05 loss: 0.1358 (0.1373) time: 3.0349 data: 0.0076 max mem: 33300 Epoch: [15] [1740/4276] eta: 2:07:26 lr: 3.227333812538299e-05 loss: 0.1374 (0.1373) time: 3.0482 data: 0.0074 max mem: 33300 Epoch: [15] [1750/4276] eta: 2:06:56 lr: 3.227057600873841e-05 loss: 0.1351 (0.1373) time: 3.0548 data: 0.0073 max mem: 33300 Epoch: [15] [1760/4276] eta: 2:06:27 lr: 3.226781386582509e-05 loss: 0.1228 (0.1372) time: 3.0577 data: 0.0073 max mem: 33300 Epoch: [15] [1770/4276] eta: 2:05:57 lr: 3.2265051696640306e-05 loss: 0.1225 (0.1372) time: 3.0626 data: 0.0074 max mem: 33300 Epoch: [15] [1780/4276] eta: 2:05:27 lr: 3.226228950118131e-05 loss: 0.1250 (0.1371) time: 3.0543 data: 0.0074 max mem: 33300 Epoch: [15] [1790/4276] eta: 2:04:57 lr: 3.225952727944534e-05 loss: 0.1250 (0.1371) time: 3.0330 data: 0.0072 max mem: 33300 Epoch: [15] [1800/4276] eta: 2:04:28 lr: 3.225676503142966e-05 loss: 0.1260 (0.1370) time: 3.0348 data: 0.0074 max mem: 33300 Epoch: [15] [1810/4276] eta: 2:03:57 lr: 3.225400275713151e-05 
loss: 0.1347 (0.1371) time: 3.0310 data: 0.0074 max mem: 33300 Epoch: [15] [1820/4276] eta: 2:03:28 lr: 3.225124045654814e-05 loss: 0.1358 (0.1371) time: 3.0455 data: 0.0073 max mem: 33300 Epoch: [15] [1830/4276] eta: 2:02:58 lr: 3.224847812967679e-05 loss: 0.1310 (0.1371) time: 3.0480 data: 0.0073 max mem: 33300 Epoch: [15] [1840/4276] eta: 2:02:28 lr: 3.2245715776514727e-05 loss: 0.1310 (0.1370) time: 3.0316 data: 0.0072 max mem: 33300 Epoch: [15] [1850/4276] eta: 2:01:58 lr: 3.224295339705918e-05 loss: 0.1369 (0.1371) time: 3.0207 data: 0.0071 max mem: 33300 Epoch: [15] [1860/4276] eta: 2:01:27 lr: 3.224019099130741e-05 loss: 0.1229 (0.1370) time: 3.0072 data: 0.0075 max mem: 33300 Epoch: [15] [1870/4276] eta: 2:00:57 lr: 3.223742855925665e-05 loss: 0.1305 (0.1372) time: 2.9892 data: 0.0079 max mem: 33300 Epoch: [15] [1880/4276] eta: 2:00:26 lr: 3.223466610090416e-05 loss: 0.1374 (0.1372) time: 2.9775 data: 0.0078 max mem: 33300 Epoch: [15] [1890/4276] eta: 1:59:55 lr: 3.223190361624717e-05 loss: 0.1309 (0.1372) time: 2.9740 data: 0.0077 max mem: 33300 Epoch: [15] [1900/4276] eta: 1:59:25 lr: 3.2229141105282934e-05 loss: 0.1225 (0.1372) time: 2.9763 data: 0.0081 max mem: 33300 Epoch: [15] [1910/4276] eta: 1:58:54 lr: 3.2226378568008694e-05 loss: 0.1282 (0.1372) time: 2.9829 data: 0.0083 max mem: 33300 Epoch: [15] [1920/4276] eta: 1:58:24 lr: 3.222361600442169e-05 loss: 0.1310 (0.1371) time: 2.9804 data: 0.0077 max mem: 33300 Epoch: [15] [1930/4276] eta: 1:57:53 lr: 3.2220853414519174e-05 loss: 0.1310 (0.1370) time: 2.9818 data: 0.0075 max mem: 33300 Epoch: [15] [1940/4276] eta: 1:57:23 lr: 3.2218090798298387e-05 loss: 0.1323 (0.1371) time: 2.9990 data: 0.0079 max mem: 33300 Epoch: [15] [1950/4276] eta: 1:56:53 lr: 3.221532815575655e-05 loss: 0.1355 (0.1372) time: 3.0035 data: 0.0079 max mem: 33300 Epoch: [15] [1960/4276] eta: 1:56:22 lr: 3.221256548689094e-05 loss: 0.1347 (0.1372) time: 2.9763 data: 0.0078 max mem: 33300 Epoch: [15] [1970/4276] eta: 1:55:51 lr: 
3.220980279169877e-05 loss: 0.1160 (0.1371) time: 2.9730 data: 0.0079 max mem: 33300 Epoch: [15] [1980/4276] eta: 1:55:21 lr: 3.220704007017728e-05 loss: 0.1160 (0.1371) time: 2.9846 data: 0.0080 max mem: 33300 Epoch: [15] [1990/4276] eta: 1:54:50 lr: 3.220427732232372e-05 loss: 0.1301 (0.1371) time: 2.9805 data: 0.0079 max mem: 33300 Epoch: [15] [2000/4276] eta: 1:54:20 lr: 3.2201514548135334e-05 loss: 0.1427 (0.1372) time: 2.9811 data: 0.0075 max mem: 33300 Epoch: [15] [2010/4276] eta: 1:53:49 lr: 3.2198751747609345e-05 loss: 0.1427 (0.1371) time: 2.9792 data: 0.0076 max mem: 33300 Epoch: [15] [2020/4276] eta: 1:53:22 lr: 3.219598892074302e-05 loss: 0.1363 (0.1372) time: 3.1418 data: 0.0083 max mem: 33300 Epoch: [15] [2030/4276] eta: 1:52:54 lr: 3.219322606753355e-05 loss: 0.1286 (0.1371) time: 3.2371 data: 0.0089 max mem: 33300 Epoch: [15] [2040/4276] eta: 1:52:26 lr: 3.219046318797821e-05 loss: 0.1164 (0.1371) time: 3.1904 data: 0.0095 max mem: 33300 Epoch: [15] [2050/4276] eta: 1:51:55 lr: 3.218770028207422e-05 loss: 0.1370 (0.1371) time: 3.1077 data: 0.0096 max mem: 33300 Epoch: [15] [2060/4276] eta: 1:51:25 lr: 3.218493734981882e-05 loss: 0.1339 (0.1371) time: 2.9947 data: 0.0096 max mem: 33300 Epoch: [15] [2070/4276] eta: 1:50:54 lr: 3.218217439120924e-05 loss: 0.1253 (0.1370) time: 2.9874 data: 0.0098 max mem: 33300 Epoch: [15] [2080/4276] eta: 1:50:24 lr: 3.217941140624272e-05 loss: 0.1368 (0.1370) time: 2.9880 data: 0.0095 max mem: 33300 Epoch: [15] [2090/4276] eta: 1:49:54 lr: 3.21766483949165e-05 loss: 0.1369 (0.1370) time: 3.0008 data: 0.0087 max mem: 33300 Epoch: [15] [2100/4276] eta: 1:49:24 lr: 3.21738853572278e-05 loss: 0.1325 (0.1370) time: 3.0194 data: 0.0075 max mem: 33300 Epoch: [15] [2110/4276] eta: 1:48:53 lr: 3.217112229317386e-05 loss: 0.1272 (0.1370) time: 3.0176 data: 0.0064 max mem: 33300 Epoch: [15] [2120/4276] eta: 1:48:24 lr: 3.2168359202751915e-05 loss: 0.1081 (0.1368) time: 3.0454 data: 0.0058 max mem: 33300 Epoch: [15] [2130/4276] 
eta: 1:47:54 lr: 3.2165596085959185e-05 loss: 0.1081 (0.1368) time: 3.0681 data: 0.0057 max mem: 33300 Epoch: [15] [2140/4276] eta: 1:47:25 lr: 3.216283294279291e-05 loss: 0.1179 (0.1367) time: 3.0696 data: 0.0057 max mem: 33300 Epoch: [15] [2150/4276] eta: 1:46:55 lr: 3.2160069773250325e-05 loss: 0.1197 (0.1367) time: 3.0666 data: 0.0059 max mem: 33300 Epoch: [15] [2160/4276] eta: 1:46:25 lr: 3.215730657732865e-05 loss: 0.1197 (0.1367) time: 3.0649 data: 0.0059 max mem: 33300 Epoch: [15] [2170/4276] eta: 1:45:55 lr: 3.2154543355025135e-05 loss: 0.1452 (0.1367) time: 3.0408 data: 0.0061 max mem: 33300 Epoch: [15] [2180/4276] eta: 1:45:25 lr: 3.215178010633699e-05 loss: 0.1471 (0.1367) time: 3.0112 data: 0.0064 max mem: 33300 Epoch: [15] [2190/4276] eta: 1:44:54 lr: 3.214901683126144e-05 loss: 0.1375 (0.1367) time: 3.0104 data: 0.0062 max mem: 33300 Epoch: [15] [2200/4276] eta: 1:44:24 lr: 3.214625352979572e-05 loss: 0.1369 (0.1368) time: 3.0135 data: 0.0058 max mem: 33300 Epoch: [15] [2210/4276] eta: 1:43:54 lr: 3.214349020193706e-05 loss: 0.1398 (0.1368) time: 3.0027 data: 0.0057 max mem: 33300 Epoch: [15] [2220/4276] eta: 1:43:24 lr: 3.214072684768268e-05 loss: 0.1423 (0.1368) time: 2.9999 data: 0.0057 max mem: 33300 Epoch: [15] [2230/4276] eta: 1:42:53 lr: 3.213796346702981e-05 loss: 0.1300 (0.1368) time: 3.0082 data: 0.0057 max mem: 33300 Epoch: [15] [2240/4276] eta: 1:42:23 lr: 3.213520005997568e-05 loss: 0.1264 (0.1368) time: 3.0236 data: 0.0057 max mem: 33300 Epoch: [15] [2250/4276] eta: 1:41:53 lr: 3.213243662651752e-05 loss: 0.1238 (0.1368) time: 3.0344 data: 0.0057 max mem: 33300 Epoch: [15] [2260/4276] eta: 1:41:23 lr: 3.212967316665254e-05 loss: 0.1309 (0.1368) time: 3.0071 data: 0.0057 max mem: 33300 Epoch: [15] [2270/4276] eta: 1:40:52 lr: 3.212690968037797e-05 loss: 0.1263 (0.1367) time: 2.9801 data: 0.0056 max mem: 33300 Epoch: [15] [2280/4276] eta: 1:40:22 lr: 3.2124146167691035e-05 loss: 0.1227 (0.1367) time: 3.0000 data: 0.0056 max mem: 33300 
Epoch: [15] [2290/4276] eta: 1:39:52 lr: 3.2121382628588955e-05 loss: 0.1225 (0.1366) time: 3.0142 data: 0.0057 max mem: 33300 Epoch: [15] [2300/4276] eta: 1:39:22 lr: 3.211861906306895e-05 loss: 0.1245 (0.1366) time: 3.0238 data: 0.0057 max mem: 33300 Epoch: [15] [2310/4276] eta: 1:38:51 lr: 3.211585547112825e-05 loss: 0.1292 (0.1366) time: 3.0203 data: 0.0057 max mem: 33300 Epoch: [15] [2320/4276] eta: 1:38:21 lr: 3.2113091852764065e-05 loss: 0.1372 (0.1366) time: 3.0076 data: 0.0056 max mem: 33300 Epoch: [15] [2330/4276] eta: 1:37:51 lr: 3.211032820797363e-05 loss: 0.1435 (0.1367) time: 3.0052 data: 0.0056 max mem: 33300 Epoch: [15] [2340/4276] eta: 1:37:21 lr: 3.210756453675415e-05 loss: 0.1454 (0.1367) time: 3.0112 data: 0.0057 max mem: 33300 Epoch: [15] [2350/4276] eta: 1:36:51 lr: 3.210480083910287e-05 loss: 0.1282 (0.1367) time: 3.0455 data: 0.0057 max mem: 33300 Epoch: [15] [2360/4276] eta: 1:36:22 lr: 3.210203711501698e-05 loss: 0.1175 (0.1366) time: 3.0741 data: 0.0057 max mem: 33300 Epoch: [15] [2370/4276] eta: 1:35:51 lr: 3.2099273364493706e-05 loss: 0.1326 (0.1366) time: 3.0361 data: 0.0055 max mem: 33300 Epoch: [15] [2380/4276] eta: 1:35:21 lr: 3.2096509587530267e-05 loss: 0.1396 (0.1366) time: 3.0007 data: 0.0055 max mem: 33300 Epoch: [15] [2390/4276] eta: 1:34:50 lr: 3.209374578412389e-05 loss: 0.1393 (0.1366) time: 2.9992 data: 0.0055 max mem: 33300 Epoch: [15] [2400/4276] eta: 1:34:21 lr: 3.209098195427178e-05 loss: 0.1287 (0.1366) time: 3.0154 data: 0.0056 max mem: 33300 Epoch: [15] [2410/4276] eta: 1:33:50 lr: 3.208821809797116e-05 loss: 0.1229 (0.1365) time: 3.0360 data: 0.0056 max mem: 33300 Epoch: [15] [2420/4276] eta: 1:33:20 lr: 3.2085454215219244e-05 loss: 0.1154 (0.1365) time: 3.0263 data: 0.0058 max mem: 33300 Epoch: [15] [2430/4276] eta: 1:32:50 lr: 3.208269030601325e-05 loss: 0.1397 (0.1366) time: 3.0271 data: 0.0057 max mem: 33300 Epoch: [15] [2440/4276] eta: 1:32:20 lr: 3.2079926370350386e-05 loss: 0.1372 (0.1366) time: 3.0210 data: 
0.0055 max mem: 33300 Epoch: [15] [2450/4276] eta: 1:31:49 lr: 3.207716240822786e-05 loss: 0.1336 (0.1366) time: 2.9990 data: 0.0058 max mem: 33300 Epoch: [15] [2460/4276] eta: 1:31:20 lr: 3.2074398419642904e-05 loss: 0.1374 (0.1366) time: 3.0071 data: 0.0060 max mem: 33300 Epoch: [15] [2470/4276] eta: 1:30:49 lr: 3.207163440459272e-05 loss: 0.1422 (0.1366) time: 3.0243 data: 0.0059 max mem: 33300 Epoch: [15] [2480/4276] eta: 1:30:20 lr: 3.206887036307452e-05 loss: 0.1437 (0.1367) time: 3.0486 data: 0.0059 max mem: 33300 Epoch: [15] [2490/4276] eta: 1:29:51 lr: 3.206610629508552e-05 loss: 0.1306 (0.1367) time: 3.1557 data: 0.0057 max mem: 33300 Epoch: [15] [2500/4276] eta: 1:29:21 lr: 3.206334220062293e-05 loss: 0.1411 (0.1367) time: 3.1459 data: 0.0059 max mem: 33300 Epoch: [15] [2510/4276] eta: 1:28:51 lr: 3.2060578079683956e-05 loss: 0.1510 (0.1368) time: 3.0610 data: 0.0065 max mem: 33300 Epoch: [15] [2520/4276] eta: 1:28:21 lr: 3.2057813932265806e-05 loss: 0.1288 (0.1367) time: 3.0578 data: 0.0064 max mem: 33300 Epoch: [15] [2530/4276] eta: 1:27:51 lr: 3.20550497583657e-05 loss: 0.1035 (0.1366) time: 3.0524 data: 0.0066 max mem: 33300 Epoch: [15] [2540/4276] eta: 1:27:21 lr: 3.205228555798084e-05 loss: 0.1152 (0.1365) time: 3.0342 data: 0.0065 max mem: 33300 Epoch: [15] [2550/4276] eta: 1:26:51 lr: 3.2049521331108435e-05 loss: 0.1180 (0.1365) time: 3.0069 data: 0.0058 max mem: 33300 Epoch: [15] [2560/4276] eta: 1:26:21 lr: 3.20467570777457e-05 loss: 0.1177 (0.1364) time: 3.0055 data: 0.0056 max mem: 33300 Epoch: [15] [2570/4276] eta: 1:25:50 lr: 3.204399279788983e-05 loss: 0.1179 (0.1364) time: 3.0114 data: 0.0057 max mem: 33300 Epoch: [15] [2580/4276] eta: 1:25:20 lr: 3.204122849153804e-05 loss: 0.1200 (0.1363) time: 3.0128 data: 0.0058 max mem: 33300 Epoch: [15] [2590/4276] eta: 1:24:50 lr: 3.2038464158687526e-05 loss: 0.1204 (0.1363) time: 3.0408 data: 0.0059 max mem: 33300 Epoch: [15] [2600/4276] eta: 1:24:20 lr: 3.2035699799335505e-05 loss: 0.1281 
(0.1363) time: 3.0307 data: 0.0059 max mem: 33300 Epoch: [15] [2610/4276] eta: 1:23:49 lr: 3.203293541347918e-05 loss: 0.1267 (0.1362) time: 2.9984 data: 0.0059 max mem: 33300 Epoch: [15] [2620/4276] eta: 1:23:20 lr: 3.203017100111575e-05 loss: 0.1246 (0.1362) time: 3.0255 data: 0.0059 max mem: 33300 Epoch: [15] [2630/4276] eta: 1:22:49 lr: 3.2027406562242426e-05 loss: 0.1246 (0.1361) time: 3.0465 data: 0.0059 max mem: 33300 Epoch: [15] [2640/4276] eta: 1:22:19 lr: 3.202464209685641e-05 loss: 0.1119 (0.1361) time: 3.0187 data: 0.0059 max mem: 33300 Epoch: [15] [2650/4276] eta: 1:21:49 lr: 3.20218776049549e-05 loss: 0.1201 (0.1361) time: 2.9976 data: 0.0059 max mem: 33300 Epoch: [15] [2660/4276] eta: 1:21:18 lr: 3.2019113086535104e-05 loss: 0.1280 (0.1361) time: 2.9914 data: 0.0058 max mem: 33300 Epoch: [15] [2670/4276] eta: 1:20:48 lr: 3.201634854159421e-05 loss: 0.1339 (0.1361) time: 2.9839 data: 0.0058 max mem: 33300 Epoch: [15] [2680/4276] eta: 1:20:18 lr: 3.2013583970129435e-05 loss: 0.1339 (0.1362) time: 3.0116 data: 0.0062 max mem: 33300 Epoch: [15] [2690/4276] eta: 1:19:48 lr: 3.201081937213797e-05 loss: 0.1332 (0.1361) time: 3.0447 data: 0.0067 max mem: 33300 Epoch: [15] [2700/4276] eta: 1:19:18 lr: 3.200805474761702e-05 loss: 0.1211 (0.1361) time: 3.0506 data: 0.0064 max mem: 33300 Epoch: [15] [2710/4276] eta: 1:18:48 lr: 3.200529009656379e-05 loss: 0.1245 (0.1360) time: 3.0471 data: 0.0057 max mem: 33300 Epoch: [15] [2720/4276] eta: 1:18:18 lr: 3.200252541897547e-05 loss: 0.1237 (0.1360) time: 3.0415 data: 0.0057 max mem: 33300 Epoch: [15] [2730/4276] eta: 1:17:48 lr: 3.199976071484925e-05 loss: 0.1337 (0.1360) time: 3.0501 data: 0.0058 max mem: 33300 Epoch: [15] [2740/4276] eta: 1:17:18 lr: 3.1996995984182345e-05 loss: 0.1421 (0.1361) time: 3.0349 data: 0.0058 max mem: 33300 Epoch: [15] [2750/4276] eta: 1:16:47 lr: 3.199423122697194e-05 loss: 0.1417 (0.1361) time: 3.0017 data: 0.0062 max mem: 33300 Epoch: [15] [2760/4276] eta: 1:16:17 lr: 
3.199146644321524e-05 loss: 0.1361 (0.1361) time: 3.0008 data: 0.0064 max mem: 33300 Epoch: [15] [2770/4276] eta: 1:15:47 lr: 3.198870163290943e-05 loss: 0.1312 (0.1361) time: 3.0416 data: 0.0066 max mem: 33300 Epoch: [15] [2780/4276] eta: 1:15:17 lr: 3.198593679605172e-05 loss: 0.1312 (0.1361) time: 3.0680 data: 0.0068 max mem: 33300 Epoch: [15] [2790/4276] eta: 1:14:47 lr: 3.19831719326393e-05 loss: 0.1322 (0.1361) time: 3.0324 data: 0.0066 max mem: 33300 Epoch: [15] [2800/4276] eta: 1:14:17 lr: 3.198040704266936e-05 loss: 0.1322 (0.1361) time: 3.0267 data: 0.0064 max mem: 33300 Epoch: [15] [2810/4276] eta: 1:13:47 lr: 3.197764212613909e-05 loss: 0.1206 (0.1360) time: 3.0513 data: 0.0065 max mem: 33300 Epoch: [15] [2820/4276] eta: 1:13:17 lr: 3.197487718304569e-05 loss: 0.1166 (0.1359) time: 3.0647 data: 0.0068 max mem: 33300 Epoch: [15] [2830/4276] eta: 1:12:47 lr: 3.1972112213386354e-05 loss: 0.1241 (0.1359) time: 3.0792 data: 0.0067 max mem: 33300 Epoch: [15] [2840/4276] eta: 1:12:17 lr: 3.196934721715826e-05 loss: 0.1321 (0.1359) time: 3.0888 data: 0.0068 max mem: 33300 Epoch: [15] [2850/4276] eta: 1:11:47 lr: 3.196658219435862e-05 loss: 0.1362 (0.1360) time: 3.0790 data: 0.0069 max mem: 33300 Epoch: [15] [2860/4276] eta: 1:11:17 lr: 3.196381714498462e-05 loss: 0.1355 (0.1360) time: 3.0411 data: 0.0067 max mem: 33300 Epoch: [15] [2870/4276] eta: 1:10:47 lr: 3.196105206903344e-05 loss: 0.1331 (0.1360) time: 3.0160 data: 0.0068 max mem: 33300 Epoch: [15] [2880/4276] eta: 1:10:16 lr: 3.195828696650227e-05 loss: 0.1331 (0.1360) time: 3.0173 data: 0.0070 max mem: 33300 Epoch: [15] [2890/4276] eta: 1:09:46 lr: 3.1955521837388305e-05 loss: 0.1254 (0.1360) time: 3.0217 data: 0.0070 max mem: 33300 Epoch: [15] [2900/4276] eta: 1:09:16 lr: 3.195275668168874e-05 loss: 0.1157 (0.1359) time: 3.0159 data: 0.0070 max mem: 33300 Epoch: [15] [2910/4276] eta: 1:08:46 lr: 3.194999149940075e-05 loss: 0.1160 (0.1359) time: 3.0150 data: 0.0069 max mem: 33300 Epoch: [15] [2920/4276] 
eta: 1:08:16 lr: 3.194722629052153e-05 loss: 0.1233 (0.1358) time: 3.0212 data: 0.0069 max mem: 33300 Epoch: [15] [2930/4276] eta: 1:07:45 lr: 3.1944461055048263e-05 loss: 0.1088 (0.1357) time: 3.0175 data: 0.0069 max mem: 33300 Epoch: [15] [2940/4276] eta: 1:07:15 lr: 3.194169579297814e-05 loss: 0.1185 (0.1357) time: 3.0144 data: 0.0068 max mem: 33300 Epoch: [15] [2950/4276] eta: 1:06:45 lr: 3.193893050430835e-05 loss: 0.1223 (0.1357) time: 3.0149 data: 0.0068 max mem: 33300 Epoch: [15] [2960/4276] eta: 1:06:15 lr: 3.193616518903607e-05 loss: 0.1192 (0.1357) time: 3.0155 data: 0.0068 max mem: 33300 Epoch: [15] [2970/4276] eta: 1:05:44 lr: 3.193339984715848e-05 loss: 0.1235 (0.1357) time: 3.0225 data: 0.0069 max mem: 33300 Epoch: [15] [2980/4276] eta: 1:05:14 lr: 3.1930634478672774e-05 loss: 0.1394 (0.1357) time: 3.0303 data: 0.0069 max mem: 33300 Epoch: [15] [2990/4276] eta: 1:04:44 lr: 3.1927869083576136e-05 loss: 0.1231 (0.1357) time: 3.0268 data: 0.0068 max mem: 33300 Epoch: [15] [3000/4276] eta: 1:04:14 lr: 3.192510366186574e-05 loss: 0.1216 (0.1356) time: 3.0299 data: 0.0069 max mem: 33300 Epoch: [15] [3010/4276] eta: 1:03:44 lr: 3.1922338213538776e-05 loss: 0.1301 (0.1357) time: 3.0362 data: 0.0069 max mem: 33300 Epoch: [15] [3020/4276] eta: 1:03:14 lr: 3.191957273859243e-05 loss: 0.1396 (0.1357) time: 3.0699 data: 0.0068 max mem: 33300 Epoch: [15] [3030/4276] eta: 1:02:44 lr: 3.191680723702387e-05 loss: 0.1336 (0.1357) time: 3.0782 data: 0.0069 max mem: 33300 Epoch: [15] [3040/4276] eta: 1:02:13 lr: 3.191404170883029e-05 loss: 0.1483 (0.1358) time: 3.0046 data: 0.0077 max mem: 33300 Epoch: [15] [3050/4276] eta: 1:01:43 lr: 3.1911276154008853e-05 loss: 0.1483 (0.1357) time: 2.9860 data: 0.0086 max mem: 33300 Epoch: [15] [3060/4276] eta: 1:01:13 lr: 3.190851057255676e-05 loss: 0.1179 (0.1357) time: 2.9676 data: 0.0080 max mem: 33300 Epoch: [15] [3070/4276] eta: 1:00:42 lr: 3.190574496447117e-05 loss: 0.1179 (0.1357) time: 2.9677 data: 0.0076 max mem: 33300 
Epoch: [15] [3080/4276] eta: 1:00:12 lr: 3.1902979329749274e-05 loss: 0.1241 (0.1356) time: 2.9704 data: 0.0077 max mem: 33300 Epoch: [15] [3090/4276] eta: 0:59:41 lr: 3.190021366838824e-05 loss: 0.1186 (0.1356) time: 2.9630 data: 0.0073 max mem: 33300 Epoch: [15] [3100/4276] eta: 0:59:11 lr: 3.189744798038527e-05 loss: 0.1194 (0.1356) time: 2.9861 data: 0.0081 max mem: 33300 Epoch: [15] [3110/4276] eta: 0:58:41 lr: 3.18946822657375e-05 loss: 0.1217 (0.1355) time: 2.9787 data: 0.0084 max mem: 33300 Epoch: [15] [3120/4276] eta: 0:58:10 lr: 3.189191652444214e-05 loss: 0.1169 (0.1355) time: 2.9429 data: 0.0078 max mem: 33300 Epoch: [15] [3130/4276] eta: 0:57:40 lr: 3.188915075649635e-05 loss: 0.1208 (0.1354) time: 2.9404 data: 0.0075 max mem: 33300 Epoch: [15] [3140/4276] eta: 0:57:09 lr: 3.1886384961897306e-05 loss: 0.1279 (0.1354) time: 2.9442 data: 0.0078 max mem: 33300 Epoch: [15] [3150/4276] eta: 0:56:39 lr: 3.1883619140642183e-05 loss: 0.1336 (0.1354) time: 2.9371 data: 0.0082 max mem: 33300 Epoch: [15] [3160/4276] eta: 0:56:08 lr: 3.188085329272816e-05 loss: 0.1362 (0.1354) time: 2.9470 data: 0.0079 max mem: 33300 Epoch: [15] [3170/4276] eta: 0:55:38 lr: 3.187808741815241e-05 loss: 0.1472 (0.1355) time: 2.9520 data: 0.0075 max mem: 33300 Epoch: [15] [3180/4276] eta: 0:55:08 lr: 3.18753215169121e-05 loss: 0.1370 (0.1355) time: 2.9616 data: 0.0074 max mem: 33300 Epoch: [15] [3190/4276] eta: 0:54:37 lr: 3.18725555890044e-05 loss: 0.1314 (0.1355) time: 2.9563 data: 0.0076 max mem: 33300 Epoch: [15] [3200/4276] eta: 0:54:07 lr: 3.186978963442649e-05 loss: 0.1346 (0.1355) time: 2.9568 data: 0.0074 max mem: 33300 Epoch: [15] [3210/4276] eta: 0:53:36 lr: 3.186702365317554e-05 loss: 0.1394 (0.1355) time: 2.9582 data: 0.0075 max mem: 33300 Epoch: [15] [3220/4276] eta: 0:53:06 lr: 3.1864257645248715e-05 loss: 0.1396 (0.1355) time: 2.9606 data: 0.0078 max mem: 33300 Epoch: [15] [3230/4276] eta: 0:52:36 lr: 3.186149161064318e-05 loss: 0.1284 (0.1355) time: 2.9818 data: 
0.0083 max mem: 33300 Epoch: [15] [3240/4276] eta: 0:52:05 lr: 3.185872554935611e-05 loss: 0.1284 (0.1356) time: 2.9836 data: 0.0088 max mem: 33300 Epoch: [15] [3250/4276] eta: 0:51:35 lr: 3.185595946138468e-05 loss: 0.1418 (0.1356) time: 2.9891 data: 0.0085 max mem: 33300 Epoch: [15] [3260/4276] eta: 0:51:05 lr: 3.1853193346726055e-05 loss: 0.1409 (0.1356) time: 3.0154 data: 0.0079 max mem: 33300 Epoch: [15] [3270/4276] eta: 0:50:35 lr: 3.1850427205377405e-05 loss: 0.1446 (0.1356) time: 3.0054 data: 0.0078 max mem: 33300 Epoch: [15] [3280/4276] eta: 0:50:05 lr: 3.184766103733589e-05 loss: 0.1434 (0.1356) time: 2.9990 data: 0.0082 max mem: 33300 Epoch: [15] [3290/4276] eta: 0:49:34 lr: 3.184489484259866e-05 loss: 0.1443 (0.1357) time: 3.0046 data: 0.0083 max mem: 33300 Epoch: [15] [3300/4276] eta: 0:49:04 lr: 3.184212862116291e-05 loss: 0.1380 (0.1357) time: 2.9944 data: 0.0080 max mem: 33300 Epoch: [15] [3310/4276] eta: 0:48:34 lr: 3.18393623730258e-05 loss: 0.1457 (0.1357) time: 3.0078 data: 0.0079 max mem: 33300 Epoch: [15] [3320/4276] eta: 0:48:04 lr: 3.183659609818448e-05 loss: 0.1592 (0.1358) time: 3.0119 data: 0.0081 max mem: 33300 Epoch: [15] [3330/4276] eta: 0:47:34 lr: 3.183382979663614e-05 loss: 0.1286 (0.1357) time: 3.0078 data: 0.0081 max mem: 33300 Epoch: [15] [3340/4276] eta: 0:47:03 lr: 3.183106346837791e-05 loss: 0.1258 (0.1357) time: 3.0046 data: 0.0079 max mem: 33300 Epoch: [15] [3350/4276] eta: 0:46:33 lr: 3.182829711340698e-05 loss: 0.1217 (0.1357) time: 3.0242 data: 0.0078 max mem: 33300 Epoch: [15] [3360/4276] eta: 0:46:03 lr: 3.1825530731720495e-05 loss: 0.1194 (0.1357) time: 3.0281 data: 0.0077 max mem: 33300 Epoch: [15] [3370/4276] eta: 0:45:33 lr: 3.182276432331562e-05 loss: 0.1357 (0.1357) time: 3.0001 data: 0.0077 max mem: 33300 Epoch: [15] [3380/4276] eta: 0:45:03 lr: 3.181999788818952e-05 loss: 0.1451 (0.1357) time: 2.9940 data: 0.0079 max mem: 33300 Epoch: [15] [3390/4276] eta: 0:44:32 lr: 3.181723142633935e-05 loss: 0.1372 (0.1357) 
time: 2.9765 data: 0.0082 max mem: 33300 Epoch: [15] [3400/4276] eta: 0:44:02 lr: 3.181446493776229e-05 loss: 0.1519 (0.1357) time: 2.9881 data: 0.0085 max mem: 33300 Epoch: [15] [3410/4276] eta: 0:43:32 lr: 3.181169842245548e-05 loss: 0.1465 (0.1357) time: 3.0149 data: 0.0086 max mem: 33300 Epoch: [15] [3420/4276] eta: 0:43:02 lr: 3.180893188041608e-05 loss: 0.1463 (0.1358) time: 3.0053 data: 0.0082 max mem: 33300 Epoch: [15] [3430/4276] eta: 0:42:32 lr: 3.1806165311641254e-05 loss: 0.1509 (0.1359) time: 3.0223 data: 0.0080 max mem: 33300 Epoch: [15] [3440/4276] eta: 0:42:01 lr: 3.1803398716128155e-05 loss: 0.1302 (0.1358) time: 3.0140 data: 0.0077 max mem: 33300 Epoch: [15] [3450/4276] eta: 0:41:31 lr: 3.180063209387394e-05 loss: 0.1374 (0.1359) time: 2.9936 data: 0.0072 max mem: 33300 Epoch: [15] [3460/4276] eta: 0:41:01 lr: 3.179786544487577e-05 loss: 0.1445 (0.1359) time: 2.9975 data: 0.0073 max mem: 33300 Epoch: [15] [3470/4276] eta: 0:40:31 lr: 3.1795098769130796e-05 loss: 0.1268 (0.1358) time: 2.9930 data: 0.0077 max mem: 33300 Epoch: [15] [3480/4276] eta: 0:40:01 lr: 3.1792332066636175e-05 loss: 0.1227 (0.1358) time: 2.9980 data: 0.0079 max mem: 33300 Epoch: [15] [3490/4276] eta: 0:39:30 lr: 3.178956533738907e-05 loss: 0.1308 (0.1359) time: 3.0014 data: 0.0077 max mem: 33300 Epoch: [15] [3500/4276] eta: 0:39:00 lr: 3.178679858138662e-05 loss: 0.1304 (0.1358) time: 3.0023 data: 0.0072 max mem: 33300 Epoch: [15] [3510/4276] eta: 0:38:30 lr: 3.1784031798626e-05 loss: 0.1121 (0.1358) time: 2.9996 data: 0.0071 max mem: 33300 Epoch: [15] [3520/4276] eta: 0:38:00 lr: 3.178126498910434e-05 loss: 0.1238 (0.1358) time: 3.0160 data: 0.0069 max mem: 33300 Epoch: [15] [3530/4276] eta: 0:37:30 lr: 3.1778498152818806e-05 loss: 0.1435 (0.1358) time: 3.0155 data: 0.0071 max mem: 33300 Epoch: [15] [3540/4276] eta: 0:36:59 lr: 3.177573128976654e-05 loss: 0.1435 (0.1358) time: 2.9854 data: 0.0073 max mem: 33300 Epoch: [15] [3550/4276] eta: 0:36:29 lr: 3.17729643999447e-05 
loss: 0.1269 (0.1358) time: 2.9881 data: 0.0074 max mem: 33300 Epoch: [15] [3560/4276] eta: 0:35:59 lr: 3.177019748335044e-05 loss: 0.1224 (0.1358) time: 3.0007 data: 0.0073 max mem: 33300 Epoch: [15] [3570/4276] eta: 0:35:29 lr: 3.176743053998091e-05 loss: 0.1457 (0.1359) time: 3.0026 data: 0.0071 max mem: 33300 Epoch: [15] [3580/4276] eta: 0:34:59 lr: 3.176466356983326e-05 loss: 0.1259 (0.1358) time: 3.0021 data: 0.0071 max mem: 33300 Epoch: [15] [3590/4276] eta: 0:34:28 lr: 3.1761896572904626e-05 loss: 0.1230 (0.1359) time: 3.0009 data: 0.0069 max mem: 33300 Epoch: [15] [3600/4276] eta: 0:33:58 lr: 3.175912954919216e-05 loss: 0.1368 (0.1359) time: 3.0208 data: 0.0069 max mem: 33300 Epoch: [15] [3610/4276] eta: 0:33:28 lr: 3.175636249869303e-05 loss: 0.1406 (0.1359) time: 3.0259 data: 0.0072 max mem: 33300 Epoch: [15] [3620/4276] eta: 0:32:58 lr: 3.175359542140435e-05 loss: 0.1406 (0.1358) time: 3.0078 data: 0.0074 max mem: 33300 Epoch: [15] [3630/4276] eta: 0:32:28 lr: 3.17508283173233e-05 loss: 0.1267 (0.1359) time: 3.0056 data: 0.0075 max mem: 33300 Epoch: [15] [3640/4276] eta: 0:31:58 lr: 3.174806118644702e-05 loss: 0.1323 (0.1358) time: 3.0051 data: 0.0077 max mem: 33300 Epoch: [15] [3650/4276] eta: 0:31:27 lr: 3.1745294028772624e-05 loss: 0.1221 (0.1358) time: 3.0013 data: 0.0079 max mem: 33300 Epoch: [15] [3660/4276] eta: 0:30:57 lr: 3.1742526844297296e-05 loss: 0.1159 (0.1358) time: 2.9815 data: 0.0074 max mem: 33300 Epoch: [15] [3670/4276] eta: 0:30:27 lr: 3.173975963301816e-05 loss: 0.1133 (0.1357) time: 2.9817 data: 0.0078 max mem: 33300 Epoch: [15] [3680/4276] eta: 0:29:57 lr: 3.173699239493236e-05 loss: 0.1403 (0.1358) time: 3.0056 data: 0.0085 max mem: 33300 Epoch: [15] [3690/4276] eta: 0:29:27 lr: 3.173422513003705e-05 loss: 0.1479 (0.1358) time: 3.0272 data: 0.0083 max mem: 33300 Epoch: [15] [3700/4276] eta: 0:28:57 lr: 3.173145783832936e-05 loss: 0.1450 (0.1358) time: 3.0394 data: 0.0081 max mem: 33300 Epoch: [15] [3710/4276] eta: 0:28:26 lr: 
3.172869051980644e-05 loss: 0.1252 (0.1357) time: 3.0246 data: 0.0077 max mem: 33300 Epoch: [15] [3720/4276] eta: 0:27:56 lr: 3.172592317446543e-05 loss: 0.1127 (0.1357) time: 3.0180 data: 0.0076 max mem: 33300 Epoch: [15] [3730/4276] eta: 0:27:26 lr: 3.1723155802303473e-05 loss: 0.1171 (0.1357) time: 3.0419 data: 0.0074 max mem: 33300 Epoch: [15] [3740/4276] eta: 0:26:56 lr: 3.172038840331771e-05 loss: 0.1229 (0.1357) time: 3.0561 data: 0.0070 max mem: 33300 Epoch: [15] [3750/4276] eta: 0:26:26 lr: 3.171762097750527e-05 loss: 0.1302 (0.1357) time: 3.0411 data: 0.0073 max mem: 33300 Epoch: [15] [3760/4276] eta: 0:25:56 lr: 3.17148535248633e-05 loss: 0.1206 (0.1357) time: 3.0062 data: 0.0076 max mem: 33300 Epoch: [15] [3770/4276] eta: 0:25:26 lr: 3.171208604538894e-05 loss: 0.1194 (0.1357) time: 3.0016 data: 0.0078 max mem: 33300 Epoch: [15] [3780/4276] eta: 0:24:55 lr: 3.170931853907933e-05 loss: 0.1298 (0.1357) time: 3.0143 data: 0.0079 max mem: 33300 Epoch: [15] [3790/4276] eta: 0:24:25 lr: 3.1706551005931594e-05 loss: 0.1256 (0.1357) time: 2.9999 data: 0.0079 max mem: 33300 Epoch: [15] [3800/4276] eta: 0:23:55 lr: 3.1703783445942884e-05 loss: 0.1410 (0.1357) time: 3.0128 data: 0.0080 max mem: 33300 Epoch: [15] [3810/4276] eta: 0:23:25 lr: 3.170101585911033e-05 loss: 0.1406 (0.1357) time: 3.0134 data: 0.0076 max mem: 33300 Epoch: [15] [3820/4276] eta: 0:22:55 lr: 3.1698248245431075e-05 loss: 0.1159 (0.1357) time: 2.9948 data: 0.0075 max mem: 33300 Epoch: [15] [3830/4276] eta: 0:22:25 lr: 3.169548060490224e-05 loss: 0.1270 (0.1357) time: 3.0323 data: 0.0079 max mem: 33300 Epoch: [15] [3840/4276] eta: 0:21:55 lr: 3.169271293752097e-05 loss: 0.1287 (0.1356) time: 3.1308 data: 0.0079 max mem: 33300 Epoch: [15] [3850/4276] eta: 0:21:24 lr: 3.168994524328439e-05 loss: 0.1162 (0.1356) time: 3.0866 data: 0.0080 max mem: 33300 Epoch: [15] [3860/4276] eta: 0:20:54 lr: 3.168717752218964e-05 loss: 0.1221 (0.1356) time: 3.0571 data: 0.0080 max mem: 33300 Epoch: [15] 
[3870/4276] eta: 0:20:24 lr: 3.1684409774233856e-05 loss: 0.1329 (0.1356) time: 3.0817 data: 0.0076 max mem: 33300 Epoch: [15] [3880/4276] eta: 0:19:54 lr: 3.1681641999414165e-05 loss: 0.1235 (0.1356) time: 3.0285 data: 0.0080 max mem: 33300 Epoch: [15] [3890/4276] eta: 0:19:24 lr: 3.167887419772771e-05 loss: 0.1251 (0.1356) time: 3.0284 data: 0.0082 max mem: 33300 Epoch: [15] [3900/4276] eta: 0:18:54 lr: 3.1676106369171595e-05 loss: 0.1324 (0.1356) time: 3.0190 data: 0.0077 max mem: 33300 Epoch: [15] [3910/4276] eta: 0:18:24 lr: 3.167333851374297e-05 loss: 0.1262 (0.1356) time: 2.9852 data: 0.0074 max mem: 33300 Epoch: [15] [3920/4276] eta: 0:17:53 lr: 3.167057063143896e-05 loss: 0.1183 (0.1355) time: 2.9554 data: 0.0076 max mem: 33300 Epoch: [15] [3930/4276] eta: 0:17:23 lr: 3.1667802722256694e-05 loss: 0.1190 (0.1355) time: 2.9629 data: 0.0080 max mem: 33300 Epoch: [15] [3940/4276] eta: 0:16:53 lr: 3.1665034786193304e-05 loss: 0.1147 (0.1355) time: 2.9794 data: 0.0081 max mem: 33300 Epoch: [15] [3950/4276] eta: 0:16:23 lr: 3.166226682324593e-05 loss: 0.1187 (0.1355) time: 2.9749 data: 0.0077 max mem: 33300 Epoch: [15] [3960/4276] eta: 0:15:53 lr: 3.1659498833411673e-05 loss: 0.1226 (0.1355) time: 2.9829 data: 0.0076 max mem: 33300 Epoch: [15] [3970/4276] eta: 0:15:22 lr: 3.1656730816687673e-05 loss: 0.1351 (0.1355) time: 2.9799 data: 0.0076 max mem: 33300 Epoch: [15] [3980/4276] eta: 0:14:52 lr: 3.165396277307106e-05 loss: 0.1228 (0.1355) time: 2.9791 data: 0.0075 max mem: 33300 Epoch: [15] [3990/4276] eta: 0:14:22 lr: 3.165119470255895e-05 loss: 0.1169 (0.1355) time: 2.9726 data: 0.0078 max mem: 33300 Epoch: [15] [4000/4276] eta: 0:13:52 lr: 3.164842660514848e-05 loss: 0.1220 (0.1354) time: 2.9744 data: 0.0075 max mem: 33300 Epoch: [15] [4010/4276] eta: 0:13:22 lr: 3.164565848083676e-05 loss: 0.1220 (0.1354) time: 2.9954 data: 0.0076 max mem: 33300 Epoch: [15] [4020/4276] eta: 0:12:51 lr: 3.164289032962093e-05 loss: 0.1376 (0.1354) time: 3.0025 data: 0.0078 max 
mem: 33300 Epoch: [15] [4030/4276] eta: 0:12:21 lr: 3.1640122151498106e-05 loss: 0.1376 (0.1354) time: 3.0278 data: 0.0074 max mem: 33300 Epoch: [15] [4040/4276] eta: 0:11:51 lr: 3.163735394646541e-05 loss: 0.1395 (0.1355) time: 3.0261 data: 0.0070 max mem: 33300 Epoch: [15] [4050/4276] eta: 0:11:21 lr: 3.1634585714519965e-05 loss: 0.1228 (0.1355) time: 3.0189 data: 0.0071 max mem: 33300 Epoch: [15] [4060/4276] eta: 0:10:51 lr: 3.1631817455658884e-05 loss: 0.1273 (0.1355) time: 3.0289 data: 0.0078 max mem: 33300 Epoch: [15] [4070/4276] eta: 0:10:21 lr: 3.162904916987931e-05 loss: 0.1349 (0.1355) time: 3.0320 data: 0.0084 max mem: 33300 Epoch: [15] [4080/4276] eta: 0:09:51 lr: 3.162628085717833e-05 loss: 0.1363 (0.1355) time: 3.0214 data: 0.0080 max mem: 33300 Epoch: [15] [4090/4276] eta: 0:09:20 lr: 3.1623512517553095e-05 loss: 0.1374 (0.1355) time: 2.9980 data: 0.0075 max mem: 33300 Epoch: [15] [4100/4276] eta: 0:08:50 lr: 3.162074415100071e-05 loss: 0.1351 (0.1356) time: 3.0037 data: 0.0079 max mem: 33300 Epoch: [15] [4110/4276] eta: 0:08:20 lr: 3.16179757575183e-05 loss: 0.1345 (0.1356) time: 3.0208 data: 0.0078 max mem: 33300 Epoch: [15] [4120/4276] eta: 0:07:50 lr: 3.161520733710298e-05 loss: 0.1359 (0.1356) time: 3.0234 data: 0.0079 max mem: 33300 Epoch: [15] [4130/4276] eta: 0:07:20 lr: 3.1612438889751875e-05 loss: 0.1359 (0.1356) time: 3.0076 data: 0.0079 max mem: 33300 Epoch: [15] [4140/4276] eta: 0:06:50 lr: 3.160967041546208e-05 loss: 0.1251 (0.1356) time: 3.0049 data: 0.0079 max mem: 33300 Epoch: [15] [4150/4276] eta: 0:06:19 lr: 3.160690191423073e-05 loss: 0.1329 (0.1356) time: 3.0220 data: 0.0079 max mem: 33300 Epoch: [15] [4160/4276] eta: 0:05:49 lr: 3.160413338605493e-05 loss: 0.1329 (0.1356) time: 3.0190 data: 0.0073 max mem: 33300 Epoch: [15] [4170/4276] eta: 0:05:19 lr: 3.16013648309318e-05 loss: 0.1391 (0.1356) time: 3.0754 data: 0.0072 max mem: 33300 Epoch: [15] [4180/4276] eta: 0:04:49 lr: 3.159859624885846e-05 loss: 0.1391 (0.1356) time: 
3.0687 data: 0.0076 max mem: 33300 Epoch: [15] [4190/4276] eta: 0:04:19 lr: 3.1595827639832026e-05 loss: 0.1326 (0.1357) time: 3.0063 data: 0.0076 max mem: 33300 Epoch: [15] [4200/4276] eta: 0:03:49 lr: 3.1593059003849604e-05 loss: 0.1416 (0.1357) time: 3.0185 data: 0.0076 max mem: 33300 Epoch: [15] [4210/4276] eta: 0:03:19 lr: 3.15902903409083e-05 loss: 0.1481 (0.1357) time: 3.0053 data: 0.0075 max mem: 33300 Epoch: [15] [4220/4276] eta: 0:02:48 lr: 3.158752165100523e-05 loss: 0.1492 (0.1357) time: 3.0535 data: 0.0075 max mem: 33300 Epoch: [15] [4230/4276] eta: 0:02:18 lr: 3.158475293413751e-05 loss: 0.1492 (0.1358) time: 3.0435 data: 0.0080 max mem: 33300 Epoch: [15] [4240/4276] eta: 0:01:48 lr: 3.1581984190302255e-05 loss: 0.1445 (0.1358) time: 2.9973 data: 0.0085 max mem: 33300 Epoch: [15] [4250/4276] eta: 0:01:18 lr: 3.157921541949657e-05 loss: 0.1419 (0.1358) time: 3.0139 data: 0.0078 max mem: 33300 Epoch: [15] [4260/4276] eta: 0:00:48 lr: 3.157644662171757e-05 loss: 0.1478 (0.1359) time: 3.0023 data: 0.0069 max mem: 33300 Epoch: [15] [4270/4276] eta: 0:00:18 lr: 3.157367779696235e-05 loss: 0.1507 (0.1359) time: 3.0025 data: 0.0071 max mem: 33300 Epoch: [15] Total time: 3:34:56 Test: [ 0/21770] eta: 10:04:37 time: 1.6664 data: 1.5804 max mem: 33300 Test: [ 100/21770] eta: 0:19:42 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 200/21770] eta: 0:16:44 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 300/21770] eta: 0:15:42 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 400/21770] eta: 0:15:08 time: 0.0387 data: 0.0011 max mem: 33300 Test: [ 500/21770] eta: 0:14:47 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 600/21770] eta: 0:14:32 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 700/21770] eta: 0:14:19 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 800/21770] eta: 0:14:09 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 900/21770] eta: 0:14:00 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 1000/21770] eta: 0:13:52 time: 0.0381 data: 0.0010 max mem: 
33300 Test: [ 1100/21770] eta: 0:13:45 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 1200/21770] eta: 0:13:38 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 1300/21770] eta: 0:13:32 time: 0.0387 data: 0.0010 max mem: 33300 Test: [ 1400/21770] eta: 0:13:26 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 1500/21770] eta: 0:13:20 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 1600/21770] eta: 0:13:15 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 1700/21770] eta: 0:13:09 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 1800/21770] eta: 0:13:04 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 1900/21770] eta: 0:12:59 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 2000/21770] eta: 0:12:54 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 2100/21770] eta: 0:12:49 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 2200/21770] eta: 0:12:45 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 2300/21770] eta: 0:12:40 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 2400/21770] eta: 0:12:36 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 2500/21770] eta: 0:12:32 time: 0.0402 data: 0.0010 max mem: 33300 Test: [ 2600/21770] eta: 0:12:29 time: 0.0397 data: 0.0011 max mem: 33300 Test: [ 2700/21770] eta: 0:12:26 time: 0.0395 data: 0.0010 max mem: 33300 Test: [ 2800/21770] eta: 0:12:22 time: 0.0402 data: 0.0012 max mem: 33300 Test: [ 2900/21770] eta: 0:12:18 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 3000/21770] eta: 0:12:14 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 3100/21770] eta: 0:12:10 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 3200/21770] eta: 0:12:06 time: 0.0391 data: 0.0011 max mem: 33300 Test: [ 3300/21770] eta: 0:12:03 time: 0.0397 data: 0.0010 max mem: 33300 Test: [ 3400/21770] eta: 0:11:59 time: 0.0393 data: 0.0011 max mem: 33300 Test: [ 3500/21770] eta: 0:11:55 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 3600/21770] eta: 0:11:51 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 3700/21770] eta: 0:11:47 time: 0.0388 data: 0.0010 max mem: 
33300 Test: [ 3800/21770] eta: 0:11:42 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 3900/21770] eta: 0:11:38 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 4000/21770] eta: 0:11:34 time: 0.0389 data: 0.0011 max mem: 33300 Test: [ 4100/21770] eta: 0:11:30 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 4200/21770] eta: 0:11:26 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 4300/21770] eta: 0:11:22 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 4400/21770] eta: 0:11:18 time: 0.0393 data: 0.0010 max mem: 33300 Test: [ 4500/21770] eta: 0:11:14 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 4600/21770] eta: 0:11:10 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 4700/21770] eta: 0:11:06 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 4800/21770] eta: 0:11:02 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 4900/21770] eta: 0:10:58 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 5000/21770] eta: 0:10:54 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 5100/21770] eta: 0:10:50 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 5200/21770] eta: 0:10:46 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 5300/21770] eta: 0:10:42 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 5400/21770] eta: 0:10:38 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 5500/21770] eta: 0:10:35 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 5600/21770] eta: 0:10:31 time: 0.0394 data: 0.0010 max mem: 33300 Test: [ 5700/21770] eta: 0:10:27 time: 0.0394 data: 0.0010 max mem: 33300 Test: [ 5800/21770] eta: 0:10:23 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 5900/21770] eta: 0:10:19 time: 0.0394 data: 0.0011 max mem: 33300 Test: [ 6000/21770] eta: 0:10:15 time: 0.0393 data: 0.0010 max mem: 33300 Test: [ 6100/21770] eta: 0:10:11 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 6200/21770] eta: 0:10:08 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 6300/21770] eta: 0:10:04 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 6400/21770] eta: 0:10:00 time: 0.0390 data: 0.0009 max mem: 
33300 Test: [ 6500/21770] eta: 0:09:56 time: 0.0399 data: 0.0011 max mem: 33300 Test: [ 6600/21770] eta: 0:09:52 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 6700/21770] eta: 0:09:48 time: 0.0393 data: 0.0010 max mem: 33300 Test: [ 6800/21770] eta: 0:09:45 time: 0.0396 data: 0.0011 max mem: 33300 Test: [ 6900/21770] eta: 0:09:41 time: 0.0394 data: 0.0010 max mem: 33300 Test: [ 7000/21770] eta: 0:09:37 time: 0.0397 data: 0.0011 max mem: 33300 Test: [ 7100/21770] eta: 0:09:33 time: 0.0384 data: 0.0011 max mem: 33300 Test: [ 7200/21770] eta: 0:09:29 time: 0.0385 data: 0.0011 max mem: 33300 Test: [ 7300/21770] eta: 0:09:25 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 7400/21770] eta: 0:09:21 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 7500/21770] eta: 0:09:17 time: 0.0400 data: 0.0010 max mem: 33300 Test: [ 7600/21770] eta: 0:09:14 time: 0.0398 data: 0.0010 max mem: 33300 Test: [ 7700/21770] eta: 0:09:10 time: 0.0394 data: 0.0010 max mem: 33300 Test: [ 7800/21770] eta: 0:09:06 time: 0.0393 data: 0.0010 max mem: 33300 Test: [ 7900/21770] eta: 0:09:02 time: 0.0398 data: 0.0010 max mem: 33300 Test: [ 8000/21770] eta: 0:08:58 time: 0.0401 data: 0.0010 max mem: 33300 Test: [ 8100/21770] eta: 0:08:55 time: 0.0400 data: 0.0010 max mem: 33300 Test: [ 8200/21770] eta: 0:08:51 time: 0.0396 data: 0.0010 max mem: 33300 Test: [ 8300/21770] eta: 0:08:47 time: 0.0398 data: 0.0010 max mem: 33300 Test: [ 8400/21770] eta: 0:08:43 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 8500/21770] eta: 0:08:39 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 8600/21770] eta: 0:08:35 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 8700/21770] eta: 0:08:31 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 8800/21770] eta: 0:08:27 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 8900/21770] eta: 0:08:23 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 9000/21770] eta: 0:08:19 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 9100/21770] eta: 0:08:15 time: 0.0386 data: 0.0010 max mem: 
33300 Test: [ 9200/21770] eta: 0:08:11 time: 0.0395 data: 0.0010 max mem: 33300 Test: [ 9300/21770] eta: 0:08:07 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 9400/21770] eta: 0:08:03 time: 0.0386 data: 0.0011 max mem: 33300 Test: [ 9500/21770] eta: 0:08:00 time: 0.0398 data: 0.0010 max mem: 33300 Test: [ 9600/21770] eta: 0:07:56 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 9700/21770] eta: 0:07:52 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 9800/21770] eta: 0:07:48 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 9900/21770] eta: 0:07:44 time: 0.0390 data: 0.0010 max mem: 33300 Test: [10000/21770] eta: 0:07:40 time: 0.0389 data: 0.0011 max mem: 33300 Test: [10100/21770] eta: 0:07:36 time: 0.0387 data: 0.0010 max mem: 33300 Test: [10200/21770] eta: 0:07:32 time: 0.0389 data: 0.0011 max mem: 33300 Test: [10300/21770] eta: 0:07:28 time: 0.0387 data: 0.0011 max mem: 33300 Test: [10400/21770] eta: 0:07:24 time: 0.0392 data: 0.0010 max mem: 33300 Test: [10500/21770] eta: 0:07:20 time: 0.0400 data: 0.0012 max mem: 33300 Test: [10600/21770] eta: 0:07:16 time: 0.0387 data: 0.0011 max mem: 33300 Test: [10700/21770] eta: 0:07:12 time: 0.0384 data: 0.0010 max mem: 33300 Test: [10800/21770] eta: 0:07:08 time: 0.0386 data: 0.0010 max mem: 33300 Test: [10900/21770] eta: 0:07:05 time: 0.0385 data: 0.0010 max mem: 33300 Test: [11000/21770] eta: 0:07:01 time: 0.0385 data: 0.0010 max mem: 33300 Test: [11100/21770] eta: 0:06:57 time: 0.0390 data: 0.0010 max mem: 33300 Test: [11200/21770] eta: 0:06:53 time: 0.0384 data: 0.0010 max mem: 33300 Test: [11300/21770] eta: 0:06:49 time: 0.0384 data: 0.0009 max mem: 33300 Test: [11400/21770] eta: 0:06:45 time: 0.0381 data: 0.0010 max mem: 33300 Test: [11500/21770] eta: 0:06:41 time: 0.0384 data: 0.0010 max mem: 33300 Test: [11600/21770] eta: 0:06:37 time: 0.0385 data: 0.0010 max mem: 33300 Test: [11700/21770] eta: 0:06:33 time: 0.0390 data: 0.0010 max mem: 33300 Test: [11800/21770] eta: 0:06:29 time: 0.0383 data: 0.0010 max mem: 
33300 Test: [11900/21770] eta: 0:06:25 time: 0.0384 data: 0.0010 max mem: 33300 Test: [12000/21770] eta: 0:06:21 time: 0.0381 data: 0.0010 max mem: 33300 Test: [12100/21770] eta: 0:06:17 time: 0.0388 data: 0.0009 max mem: 33300 Test: [12200/21770] eta: 0:06:13 time: 0.0393 data: 0.0010 max mem: 33300 Test: [12300/21770] eta: 0:06:09 time: 0.0390 data: 0.0010 max mem: 33300 Test: [12400/21770] eta: 0:06:05 time: 0.0386 data: 0.0010 max mem: 33300 Test: [12500/21770] eta: 0:06:01 time: 0.0387 data: 0.0010 max mem: 33300 Test: [12600/21770] eta: 0:05:58 time: 0.0383 data: 0.0010 max mem: 33300 Test: [12700/21770] eta: 0:05:54 time: 0.0381 data: 0.0009 max mem: 33300 Test: [12800/21770] eta: 0:05:50 time: 0.0391 data: 0.0010 max mem: 33300 Test: [12900/21770] eta: 0:05:46 time: 0.0387 data: 0.0009 max mem: 33300 Test: [13000/21770] eta: 0:05:42 time: 0.0386 data: 0.0009 max mem: 33300 Test: [13100/21770] eta: 0:05:38 time: 0.0388 data: 0.0011 max mem: 33300 Test: [13200/21770] eta: 0:05:34 time: 0.0398 data: 0.0011 max mem: 33300 Test: [13300/21770] eta: 0:05:30 time: 0.0387 data: 0.0010 max mem: 33300 Test: [13400/21770] eta: 0:05:26 time: 0.0389 data: 0.0011 max mem: 33300 Test: [13500/21770] eta: 0:05:22 time: 0.0383 data: 0.0010 max mem: 33300 Test: [13600/21770] eta: 0:05:18 time: 0.0389 data: 0.0011 max mem: 33300 Test: [13700/21770] eta: 0:05:15 time: 0.0394 data: 0.0010 max mem: 33300 Test: [13800/21770] eta: 0:05:11 time: 0.0387 data: 0.0011 max mem: 33300 Test: [13900/21770] eta: 0:05:07 time: 0.0400 data: 0.0010 max mem: 33300 Test: [14000/21770] eta: 0:05:03 time: 0.0400 data: 0.0010 max mem: 33300 Test: [14100/21770] eta: 0:04:59 time: 0.0398 data: 0.0010 max mem: 33300 Test: [14200/21770] eta: 0:04:55 time: 0.0398 data: 0.0010 max mem: 33300 Test: [14300/21770] eta: 0:04:51 time: 0.0401 data: 0.0010 max mem: 33300 Test: [14400/21770] eta: 0:04:47 time: 0.0385 data: 0.0010 max mem: 33300 Test: [14500/21770] eta: 0:04:43 time: 0.0385 data: 0.0010 max mem: 
33300 Test: [14600/21770] eta: 0:04:40 time: 0.0385 data: 0.0010 max mem: 33300 Test: [14700/21770] eta: 0:04:36 time: 0.0382 data: 0.0010 max mem: 33300 Test: [14800/21770] eta: 0:04:32 time: 0.0384 data: 0.0010 max mem: 33300 Test: [14900/21770] eta: 0:04:28 time: 0.0390 data: 0.0010 max mem: 33300 Test: [15000/21770] eta: 0:04:24 time: 0.0392 data: 0.0011 max mem: 33300 Test: [15100/21770] eta: 0:04:20 time: 0.0389 data: 0.0010 max mem: 33300 Test: [15200/21770] eta: 0:04:16 time: 0.0383 data: 0.0010 max mem: 33300 Test: [15300/21770] eta: 0:04:12 time: 0.0384 data: 0.0010 max mem: 33300 Test: [15400/21770] eta: 0:04:08 time: 0.0392 data: 0.0011 max mem: 33300 Test: [15500/21770] eta: 0:04:04 time: 0.0387 data: 0.0010 max mem: 33300 Test: [15600/21770] eta: 0:04:00 time: 0.0396 data: 0.0011 max mem: 33300 Test: [15700/21770] eta: 0:03:56 time: 0.0400 data: 0.0011 max mem: 33300 Test: [15800/21770] eta: 0:03:53 time: 0.0390 data: 0.0009 max mem: 33300 Test: [15900/21770] eta: 0:03:49 time: 0.0396 data: 0.0010 max mem: 33300 Test: [16000/21770] eta: 0:03:45 time: 0.0389 data: 0.0010 max mem: 33300 Test: [16100/21770] eta: 0:03:41 time: 0.0388 data: 0.0010 max mem: 33300 Test: [16200/21770] eta: 0:03:37 time: 0.0393 data: 0.0011 max mem: 33300 Test: [16300/21770] eta: 0:03:33 time: 0.0389 data: 0.0010 max mem: 33300 Test: [16400/21770] eta: 0:03:29 time: 0.0400 data: 0.0011 max mem: 33300 Test: [16500/21770] eta: 0:03:25 time: 0.0392 data: 0.0011 max mem: 33300 Test: [16600/21770] eta: 0:03:21 time: 0.0391 data: 0.0010 max mem: 33300 Test: [16700/21770] eta: 0:03:17 time: 0.0389 data: 0.0010 max mem: 33300 Test: [16800/21770] eta: 0:03:14 time: 0.0391 data: 0.0010 max mem: 33300 Test: [16900/21770] eta: 0:03:10 time: 0.0395 data: 0.0010 max mem: 33300 Test: [17000/21770] eta: 0:03:06 time: 0.0386 data: 0.0011 max mem: 33300 Test: [17100/21770] eta: 0:03:02 time: 0.0388 data: 0.0010 max mem: 33300 Test: [17200/21770] eta: 0:02:58 time: 0.0391 data: 0.0010 max mem: 
33300 Test: [17300/21770] eta: 0:02:54 time: 0.0389 data: 0.0010 max mem: 33300 Test: [17400/21770] eta: 0:02:50 time: 0.0391 data: 0.0010 max mem: 33300 Test: [17500/21770] eta: 0:02:46 time: 0.0385 data: 0.0010 max mem: 33300 Test: [17600/21770] eta: 0:02:42 time: 0.0389 data: 0.0011 max mem: 33300 Test: [17700/21770] eta: 0:02:38 time: 0.0391 data: 0.0010 max mem: 33300 Test: [17800/21770] eta: 0:02:34 time: 0.0386 data: 0.0009 max mem: 33300 Test: [17900/21770] eta: 0:02:31 time: 0.0384 data: 0.0010 max mem: 33300 Test: [18000/21770] eta: 0:02:27 time: 0.0390 data: 0.0010 max mem: 33300 Test: [18100/21770] eta: 0:02:23 time: 0.0390 data: 0.0010 max mem: 33300 Test: [18200/21770] eta: 0:02:19 time: 0.0389 data: 0.0010 max mem: 33300 Test: [18300/21770] eta: 0:02:15 time: 0.0384 data: 0.0009 max mem: 33300 Test: [18400/21770] eta: 0:02:11 time: 0.0389 data: 0.0010 max mem: 33300 Test: [18500/21770] eta: 0:02:07 time: 0.0391 data: 0.0010 max mem: 33300 Test: [18600/21770] eta: 0:02:03 time: 0.0387 data: 0.0010 max mem: 33300 Test: [18700/21770] eta: 0:01:59 time: 0.0385 data: 0.0010 max mem: 33300 Test: [18800/21770] eta: 0:01:55 time: 0.0385 data: 0.0010 max mem: 33300 Test: [18900/21770] eta: 0:01:52 time: 0.0392 data: 0.0010 max mem: 33300 Test: [19000/21770] eta: 0:01:48 time: 0.0387 data: 0.0010 max mem: 33300 Test: [19100/21770] eta: 0:01:44 time: 0.0387 data: 0.0010 max mem: 33300 Test: [19200/21770] eta: 0:01:40 time: 0.0392 data: 0.0009 max mem: 33300 Test: [19300/21770] eta: 0:01:36 time: 0.0393 data: 0.0010 max mem: 33300 Test: [19400/21770] eta: 0:01:32 time: 0.0391 data: 0.0010 max mem: 33300 Test: [19500/21770] eta: 0:01:28 time: 0.0391 data: 0.0010 max mem: 33300 Test: [19600/21770] eta: 0:01:24 time: 0.0401 data: 0.0010 max mem: 33300 Test: [19700/21770] eta: 0:01:20 time: 0.0403 data: 0.0011 max mem: 33300 Test: [19800/21770] eta: 0:01:16 time: 0.0396 data: 0.0011 max mem: 33300 Test: [19900/21770] eta: 0:01:13 time: 0.0388 data: 0.0010 max mem: 
33300
Test: [20000/21770] eta: 0:01:09 time: 0.0396 data: 0.0011 max mem: 33300
Test: [20100/21770] eta: 0:01:05 time: 0.0394 data: 0.0010 max mem: 33300
Test: [20200/21770] eta: 0:01:01 time: 0.0393 data: 0.0011 max mem: 33300
Test: [20300/21770] eta: 0:00:57 time: 0.0399 data: 0.0011 max mem: 33300
Test: [20400/21770] eta: 0:00:53 time: 0.0387 data: 0.0011 max mem: 33300
Test: [20500/21770] eta: 0:00:49 time: 0.0384 data: 0.0009 max mem: 33300
Test: [20600/21770] eta: 0:00:45 time: 0.0391 data: 0.0010 max mem: 33300
Test: [20700/21770] eta: 0:00:41 time: 0.0400 data: 0.0010 max mem: 33300
Test: [20800/21770] eta: 0:00:37 time: 0.0399 data: 0.0010 max mem: 33300
Test: [20900/21770] eta: 0:00:33 time: 0.0394 data: 0.0010 max mem: 33300
Test: [21000/21770] eta: 0:00:30 time: 0.0394 data: 0.0010 max mem: 33300
Test: [21100/21770] eta: 0:00:26 time: 0.0398 data: 0.0010 max mem: 33300
Test: [21200/21770] eta: 0:00:22 time: 0.0390 data: 0.0010 max mem: 33300
Test: [21300/21770] eta: 0:00:18 time: 0.0388 data: 0.0010 max mem: 33300
Test: [21400/21770] eta: 0:00:14 time: 0.0393 data: 0.0010 max mem: 33300
Test: [21500/21770] eta: 0:00:10 time: 0.0398 data: 0.0011 max mem: 33300
Test: [21600/21770] eta: 0:00:06 time: 0.0397 data: 0.0010 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0390 data: 0.0010 max mem: 33300
Test: Total time: 0:14:10
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [16] [ 0/4276] eta: 6:49:50 lr: 3.157201648915944e-05 loss: 0.1153 (0.1153) time: 5.7508 data: 2.2581 max mem: 33300
Epoch: [16] [ 10/4276] eta: 4:17:15 lr: 3.156924762123627e-05 loss: 0.1454 (0.1455) time: 3.6182 data: 0.2130 max mem: 33300
Epoch: [16] [ 20/4276] eta: 4:09:47 lr: 3.156647872632937e-05 loss: 0.1454 (0.1439) time: 3.4101 data: 0.0087 max mem: 33300
Epoch: [16] [ 30/4276] eta: 4:07:02 lr: 3.156370980443586e-05 loss: 0.1429 (0.1441) time: 3.4211 data: 0.0087 max mem: 33300 Epoch: [16] [ 40/4276] eta: 4:04:59 lr: 3.1560940855552826e-05 loss: 0.1496 (0.1444) time: 3.4161 data: 0.0085 max mem: 33300 Epoch: [16] [ 50/4276] eta: 4:03:34 lr: 3.1558171879677386e-05 loss: 0.1424 (0.1414) time: 3.4076 data: 0.0081 max mem: 33300 Epoch: [16] [ 60/4276] eta: 4:02:57 lr: 3.1555402876806635e-05 loss: 0.1270 (0.1402) time: 3.4323 data: 0.0081 max mem: 33300 Epoch: [16] [ 70/4276] eta: 4:01:43 lr: 3.1552633846937684e-05 loss: 0.1304 (0.1399) time: 3.4224 data: 0.0079 max mem: 33300 Epoch: [16] [ 80/4276] eta: 4:00:45 lr: 3.154986479006764e-05 loss: 0.1402 (0.1397) time: 3.3969 data: 0.0077 max mem: 33300 Epoch: [16] [ 90/4276] eta: 3:59:44 lr: 3.15470957061936e-05 loss: 0.1388 (0.1393) time: 3.3942 data: 0.0079 max mem: 33300 Epoch: [16] [ 100/4276] eta: 3:58:53 lr: 3.154432659531268e-05 loss: 0.1388 (0.1407) time: 3.3905 data: 0.0079 max mem: 33300 Epoch: [16] [ 110/4276] eta: 3:58:04 lr: 3.154155745742197e-05 loss: 0.1432 (0.1411) time: 3.3950 data: 0.0076 max mem: 33300 Epoch: [16] [ 120/4276] eta: 3:57:10 lr: 3.153878829251857e-05 loss: 0.1302 (0.1402) time: 3.3827 data: 0.0073 max mem: 33300 Epoch: [16] [ 130/4276] eta: 3:56:27 lr: 3.153601910059959e-05 loss: 0.1372 (0.1404) time: 3.3835 data: 0.0080 max mem: 33300 Epoch: [16] [ 140/4276] eta: 3:55:56 lr: 3.1533249881662126e-05 loss: 0.1342 (0.1392) time: 3.4142 data: 0.0081 max mem: 33300 Epoch: [16] [ 150/4276] eta: 3:55:14 lr: 3.153048063570327e-05 loss: 0.1221 (0.1390) time: 3.4139 data: 0.0074 max mem: 33300 Epoch: [16] [ 160/4276] eta: 3:54:30 lr: 3.152771136272014e-05 loss: 0.1254 (0.1384) time: 3.3877 data: 0.0073 max mem: 33300 Epoch: [16] [ 170/4276] eta: 3:53:53 lr: 3.1524942062709815e-05 loss: 0.1286 (0.1387) time: 3.3940 data: 0.0076 max mem: 33300 Epoch: [16] [ 180/4276] eta: 3:53:11 lr: 3.152217273566942e-05 loss: 0.1342 (0.1389) time: 3.3967 data: 0.0077 
max mem: 33300 Epoch: [16] [ 190/4276] eta: 3:52:28 lr: 3.151940338159602e-05 loss: 0.1342 (0.1389) time: 3.3801 data: 0.0072 max mem: 33300 Epoch: [16] [ 200/4276] eta: 3:51:45 lr: 3.151663400048673e-05 loss: 0.1318 (0.1386) time: 3.3702 data: 0.0074 max mem: 33300 Epoch: [16] [ 210/4276] eta: 3:51:16 lr: 3.151386459233865e-05 loss: 0.1318 (0.1388) time: 3.4044 data: 0.0075 max mem: 33300 Epoch: [16] [ 220/4276] eta: 3:50:33 lr: 3.151109515714886e-05 loss: 0.1306 (0.1386) time: 3.4029 data: 0.0070 max mem: 33300 Epoch: [16] [ 230/4276] eta: 3:49:54 lr: 3.150832569491447e-05 loss: 0.1238 (0.1381) time: 3.3722 data: 0.0068 max mem: 33300 Epoch: [16] [ 240/4276] eta: 3:49:17 lr: 3.150555620563257e-05 loss: 0.1358 (0.1382) time: 3.3881 data: 0.0074 max mem: 33300 Epoch: [16] [ 250/4276] eta: 3:48:39 lr: 3.150278668930025e-05 loss: 0.1396 (0.1387) time: 3.3880 data: 0.0077 max mem: 33300 Epoch: [16] [ 260/4276] eta: 3:47:57 lr: 3.1500017145914605e-05 loss: 0.1458 (0.1388) time: 3.3704 data: 0.0075 max mem: 33300 Epoch: [16] [ 270/4276] eta: 3:47:20 lr: 3.149724757547274e-05 loss: 0.1254 (0.1384) time: 3.3705 data: 0.0075 max mem: 33300 Epoch: [16] [ 280/4276] eta: 3:46:47 lr: 3.149447797797173e-05 loss: 0.1254 (0.1383) time: 3.3984 data: 0.0074 max mem: 33300 Epoch: [16] [ 290/4276] eta: 3:46:15 lr: 3.149170835340868e-05 loss: 0.1171 (0.1380) time: 3.4167 data: 0.0070 max mem: 33300 Epoch: [16] [ 300/4276] eta: 3:45:37 lr: 3.148893870178066e-05 loss: 0.1250 (0.1379) time: 3.3960 data: 0.0067 max mem: 33300 Epoch: [16] [ 310/4276] eta: 3:44:59 lr: 3.1486169023084784e-05 loss: 0.1300 (0.1376) time: 3.3753 data: 0.0069 max mem: 33300 Epoch: [16] [ 320/4276] eta: 3:44:21 lr: 3.148339931731813e-05 loss: 0.1319 (0.1380) time: 3.3755 data: 0.0071 max mem: 33300 Epoch: [16] [ 330/4276] eta: 3:43:44 lr: 3.148062958447779e-05 loss: 0.1376 (0.1381) time: 3.3768 data: 0.0070 max mem: 33300 Epoch: [16] [ 340/4276] eta: 3:43:08 lr: 3.147785982456086e-05 loss: 0.1292 (0.1378) time: 
3.3804 data: 0.0069 max mem: 33300 Epoch: [16] [ 350/4276] eta: 3:42:30 lr: 3.147509003756441e-05 loss: 0.1248 (0.1376) time: 3.3714 data: 0.0070 max mem: 33300 Epoch: [16] [ 360/4276] eta: 3:42:00 lr: 3.1472320223485545e-05 loss: 0.1400 (0.1383) time: 3.4018 data: 0.0082 max mem: 33300 Epoch: [16] [ 370/4276] eta: 3:41:24 lr: 3.146955038232134e-05 loss: 0.1319 (0.1379) time: 3.4104 data: 0.0081 max mem: 33300 Epoch: [16] [ 380/4276] eta: 3:40:46 lr: 3.14667805140689e-05 loss: 0.1208 (0.1381) time: 3.3734 data: 0.0074 max mem: 33300 Epoch: [16] [ 390/4276] eta: 3:40:09 lr: 3.1464010618725275e-05 loss: 0.1399 (0.1384) time: 3.3661 data: 0.0073 max mem: 33300 Epoch: [16] [ 400/4276] eta: 3:39:31 lr: 3.1461240696287584e-05 loss: 0.1446 (0.1384) time: 3.3624 data: 0.0072 max mem: 33300 Epoch: [16] [ 410/4276] eta: 3:38:51 lr: 3.1458470746752895e-05 loss: 0.1337 (0.1383) time: 3.3492 data: 0.0074 max mem: 33300 Epoch: [16] [ 420/4276] eta: 3:38:18 lr: 3.14557007701183e-05 loss: 0.1311 (0.1381) time: 3.3728 data: 0.0078 max mem: 33300 Epoch: [16] [ 430/4276] eta: 3:37:43 lr: 3.1452930766380876e-05 loss: 0.1311 (0.1386) time: 3.3960 data: 0.0081 max mem: 33300 Epoch: [16] [ 440/4276] eta: 3:37:09 lr: 3.145016073553771e-05 loss: 0.1330 (0.1384) time: 3.3884 data: 0.0081 max mem: 33300 Epoch: [16] [ 450/4276] eta: 3:36:34 lr: 3.144739067758588e-05 loss: 0.1330 (0.1384) time: 3.3912 data: 0.0080 max mem: 33300 Epoch: [16] [ 460/4276] eta: 3:35:58 lr: 3.144462059252247e-05 loss: 0.1298 (0.1380) time: 3.3799 data: 0.0073 max mem: 33300 Epoch: [16] [ 470/4276] eta: 3:35:22 lr: 3.144185048034456e-05 loss: 0.1071 (0.1375) time: 3.3686 data: 0.0076 max mem: 33300 Epoch: [16] [ 480/4276] eta: 3:34:48 lr: 3.143908034104923e-05 loss: 0.1156 (0.1373) time: 3.3794 data: 0.0087 max mem: 33300 Epoch: [16] [ 490/4276] eta: 3:34:15 lr: 3.1436310174633565e-05 loss: 0.1144 (0.1368) time: 3.3993 data: 0.0086 max mem: 33300 Epoch: [16] [ 500/4276] eta: 3:33:41 lr: 3.143353998109464e-05 loss: 
0.1101 (0.1364) time: 3.4053 data: 0.0080 max mem: 33300 Epoch: [16] [ 510/4276] eta: 3:33:09 lr: 3.1430769760429526e-05 loss: 0.1119 (0.1361) time: 3.4091 data: 0.0079 max mem: 33300 Epoch: [16] [ 520/4276] eta: 3:32:34 lr: 3.142799951263531e-05 loss: 0.1209 (0.1361) time: 3.3970 data: 0.0073 max mem: 33300 Epoch: [16] [ 530/4276] eta: 3:31:58 lr: 3.142522923770907e-05 loss: 0.1250 (0.1362) time: 3.3773 data: 0.0069 max mem: 33300 Epoch: [16] [ 540/4276] eta: 3:31:21 lr: 3.142245893564787e-05 loss: 0.1255 (0.1359) time: 3.3617 data: 0.0077 max mem: 33300 Epoch: [16] [ 550/4276] eta: 3:30:43 lr: 3.1419688606448804e-05 loss: 0.1324 (0.1360) time: 3.3406 data: 0.0079 max mem: 33300 Epoch: [16] [ 560/4276] eta: 3:30:08 lr: 3.141691825010894e-05 loss: 0.1365 (0.1361) time: 3.3567 data: 0.0087 max mem: 33300 Epoch: [16] [ 570/4276] eta: 3:29:34 lr: 3.141414786662535e-05 loss: 0.1369 (0.1360) time: 3.3880 data: 0.0088 max mem: 33300 Epoch: [16] [ 580/4276] eta: 3:29:02 lr: 3.141137745599511e-05 loss: 0.1303 (0.1360) time: 3.4058 data: 0.0084 max mem: 33300 Epoch: [16] [ 590/4276] eta: 3:28:30 lr: 3.1408607018215296e-05 loss: 0.1167 (0.1356) time: 3.4253 data: 0.0080 max mem: 33300 Epoch: [16] [ 600/4276] eta: 3:27:58 lr: 3.140583655328297e-05 loss: 0.1170 (0.1358) time: 3.4248 data: 0.0078 max mem: 33300 Epoch: [16] [ 610/4276] eta: 3:27:23 lr: 3.140306606119522e-05 loss: 0.1312 (0.1357) time: 3.3934 data: 0.0077 max mem: 33300 Epoch: [16] [ 620/4276] eta: 3:26:46 lr: 3.140029554194911e-05 loss: 0.1280 (0.1356) time: 3.3593 data: 0.0069 max mem: 33300 Epoch: [16] [ 630/4276] eta: 3:26:10 lr: 3.1397524995541714e-05 loss: 0.1249 (0.1357) time: 3.3500 data: 0.0075 max mem: 33300 Epoch: [16] [ 640/4276] eta: 3:25:32 lr: 3.13947544219701e-05 loss: 0.1199 (0.1355) time: 3.3426 data: 0.0080 max mem: 33300 Epoch: [16] [ 650/4276] eta: 3:24:56 lr: 3.139198382123134e-05 loss: 0.1305 (0.1356) time: 3.3373 data: 0.0073 max mem: 33300 Epoch: [16] [ 660/4276] eta: 3:24:20 lr: 
3.13892131933225e-05 loss: 0.1477 (0.1358) time: 3.3475 data: 0.0071 max mem: 33300 Epoch: [16] [ 670/4276] eta: 3:23:46 lr: 3.138644253824066e-05 loss: 0.1400 (0.1358) time: 3.3760 data: 0.0071 max mem: 33300 Epoch: [16] [ 680/4276] eta: 3:23:15 lr: 3.138367185598286e-05 loss: 0.1371 (0.1358) time: 3.4177 data: 0.0069 max mem: 33300 Epoch: [16] [ 690/4276] eta: 3:22:41 lr: 3.1380901146546206e-05 loss: 0.1299 (0.1356) time: 3.4180 data: 0.0074 max mem: 33300 Epoch: [16] [ 700/4276] eta: 3:22:06 lr: 3.137813040992773e-05 loss: 0.1236 (0.1355) time: 3.3875 data: 0.0075 max mem: 33300 Epoch: [16] [ 710/4276] eta: 3:21:31 lr: 3.1375359646124524e-05 loss: 0.1336 (0.1356) time: 3.3641 data: 0.0075 max mem: 33300 Epoch: [16] [ 720/4276] eta: 3:20:55 lr: 3.1372588855133644e-05 loss: 0.1301 (0.1355) time: 3.3588 data: 0.0072 max mem: 33300 Epoch: [16] [ 730/4276] eta: 3:20:21 lr: 3.136981803695216e-05 loss: 0.1244 (0.1354) time: 3.3741 data: 0.0071 max mem: 33300 Epoch: [16] [ 740/4276] eta: 3:19:46 lr: 3.136704719157712e-05 loss: 0.1200 (0.1353) time: 3.3747 data: 0.0074 max mem: 33300 Epoch: [16] [ 750/4276] eta: 3:19:15 lr: 3.136427631900561e-05 loss: 0.1279 (0.1354) time: 3.4066 data: 0.0074 max mem: 33300 Epoch: [16] [ 760/4276] eta: 3:18:43 lr: 3.136150541923468e-05 loss: 0.1300 (0.1353) time: 3.4422 data: 0.0080 max mem: 33300 Epoch: [16] [ 770/4276] eta: 3:18:07 lr: 3.135873449226139e-05 loss: 0.1372 (0.1353) time: 3.3936 data: 0.0082 max mem: 33300 Epoch: [16] [ 780/4276] eta: 3:17:31 lr: 3.135596353808282e-05 loss: 0.1386 (0.1353) time: 3.3426 data: 0.0080 max mem: 33300 Epoch: [16] [ 790/4276] eta: 3:16:55 lr: 3.135319255669601e-05 loss: 0.1310 (0.1353) time: 3.3437 data: 0.0076 max mem: 33300 Epoch: [16] [ 800/4276] eta: 3:16:21 lr: 3.1350421548098033e-05 loss: 0.1310 (0.1352) time: 3.3669 data: 0.0080 max mem: 33300 Epoch: [16] [ 810/4276] eta: 3:15:47 lr: 3.1347650512285955e-05 loss: 0.1234 (0.1352) time: 3.3884 data: 0.0092 max mem: 33300 Epoch: [16] [ 
820/4276] eta: 3:15:13 lr: 3.134487944925682e-05 loss: 0.1173 (0.1349) time: 3.3842 data: 0.0085 max mem: 33300 Epoch: [16] [ 830/4276] eta: 3:14:42 lr: 3.13421083590077e-05 loss: 0.1179 (0.1349) time: 3.4225 data: 0.0079 max mem: 33300 Epoch: [16] [ 840/4276] eta: 3:14:11 lr: 3.133933724153564e-05 loss: 0.1273 (0.1349) time: 3.4575 data: 0.0084 max mem: 33300 Epoch: [16] [ 850/4276] eta: 3:13:36 lr: 3.1336566096837714e-05 loss: 0.1255 (0.1350) time: 3.4123 data: 0.0081 max mem: 33300 Epoch: [16] [ 860/4276] eta: 3:13:02 lr: 3.1333794924910964e-05 loss: 0.1270 (0.1351) time: 3.3781 data: 0.0074 max mem: 33300 Epoch: [16] [ 870/4276] eta: 3:12:28 lr: 3.133102372575246e-05 loss: 0.1282 (0.1351) time: 3.3777 data: 0.0074 max mem: 33300 Epoch: [16] [ 880/4276] eta: 3:11:53 lr: 3.132825249935925e-05 loss: 0.1315 (0.1352) time: 3.3748 data: 0.0075 max mem: 33300 Epoch: [16] [ 890/4276] eta: 3:11:19 lr: 3.1325481245728396e-05 loss: 0.1434 (0.1354) time: 3.3797 data: 0.0075 max mem: 33300 Epoch: [16] [ 900/4276] eta: 3:10:45 lr: 3.132270996485695e-05 loss: 0.1327 (0.1354) time: 3.3918 data: 0.0076 max mem: 33300 Epoch: [16] [ 910/4276] eta: 3:10:14 lr: 3.131993865674196e-05 loss: 0.1416 (0.1354) time: 3.4354 data: 0.0074 max mem: 33300 Epoch: [16] [ 920/4276] eta: 3:09:41 lr: 3.1317167321380484e-05 loss: 0.1427 (0.1355) time: 3.4472 data: 0.0074 max mem: 33300 Epoch: [16] [ 930/4276] eta: 3:09:06 lr: 3.131439595876958e-05 loss: 0.1383 (0.1355) time: 3.3879 data: 0.0073 max mem: 33300 Epoch: [16] [ 940/4276] eta: 3:08:32 lr: 3.13116245689063e-05 loss: 0.1197 (0.1354) time: 3.3655 data: 0.0080 max mem: 33300 Epoch: [16] [ 950/4276] eta: 3:07:57 lr: 3.130885315178768e-05 loss: 0.1203 (0.1356) time: 3.3713 data: 0.0081 max mem: 33300 Epoch: [16] [ 960/4276] eta: 3:07:23 lr: 3.13060817074108e-05 loss: 0.1376 (0.1356) time: 3.3752 data: 0.0077 max mem: 33300 Epoch: [16] [ 970/4276] eta: 3:06:49 lr: 3.130331023577268e-05 loss: 0.1345 (0.1356) time: 3.3802 data: 0.0075 max mem: 
33300 Epoch: [16] [ 980/4276] eta: 3:06:15 lr: 3.13005387368704e-05 loss: 0.1317 (0.1356) time: 3.3941 data: 0.0074 max mem: 33300 Epoch: [16] [ 990/4276] eta: 3:05:42 lr: 3.129776721070097e-05 loss: 0.1251 (0.1355) time: 3.4121 data: 0.0073 max mem: 33300 Epoch: [16] [1000/4276] eta: 3:05:08 lr: 3.129499565726148e-05 loss: 0.1236 (0.1355) time: 3.3963 data: 0.0073 max mem: 33300 Epoch: [16] [1010/4276] eta: 3:04:33 lr: 3.129222407654895e-05 loss: 0.1339 (0.1354) time: 3.3693 data: 0.0073 max mem: 33300 Epoch: [16] [1020/4276] eta: 3:03:58 lr: 3.128945246856044e-05 loss: 0.1180 (0.1354) time: 3.3557 data: 0.0076 max mem: 33300 Epoch: [16] [1030/4276] eta: 3:03:23 lr: 3.1286680833293e-05 loss: 0.1410 (0.1355) time: 3.3555 data: 0.0080 max mem: 33300 Epoch: [16] [1040/4276] eta: 3:02:48 lr: 3.128390917074367e-05 loss: 0.1352 (0.1353) time: 3.3620 data: 0.0078 max mem: 33300 Epoch: [16] [1050/4276] eta: 3:02:14 lr: 3.12811374809095e-05 loss: 0.1271 (0.1354) time: 3.3687 data: 0.0073 max mem: 33300 Epoch: [16] [1060/4276] eta: 3:01:41 lr: 3.127836576378753e-05 loss: 0.1364 (0.1355) time: 3.3998 data: 0.0074 max mem: 33300 Epoch: [16] [1070/4276] eta: 3:01:08 lr: 3.127559401937481e-05 loss: 0.1559 (0.1358) time: 3.4152 data: 0.0080 max mem: 33300 Epoch: [16] [1080/4276] eta: 3:00:28 lr: 3.127282224766837e-05 loss: 0.1535 (0.1358) time: 3.2992 data: 0.0078 max mem: 33300 Epoch: [16] [1090/4276] eta: 2:59:43 lr: 3.1270050448665275e-05 loss: 0.1374 (0.1359) time: 3.1069 data: 0.0076 max mem: 33300 Epoch: [16] [1100/4276] eta: 2:59:00 lr: 3.126727862236255e-05 loss: 0.1362 (0.1359) time: 3.0354 data: 0.0078 max mem: 33300 Epoch: [16] [1110/4276] eta: 2:58:16 lr: 3.126450676875724e-05 loss: 0.1314 (0.1360) time: 3.0390 data: 0.0076 max mem: 33300 Epoch: [16] [1120/4276] eta: 2:57:32 lr: 3.12617348878464e-05 loss: 0.1347 (0.1359) time: 3.0291 data: 0.0075 max mem: 33300 Epoch: [16] [1130/4276] eta: 2:56:49 lr: 3.125896297962706e-05 loss: 0.1180 (0.1358) time: 3.0296 data: 
0.0074 max mem: 33300 Epoch: [16] [1140/4276] eta: 2:56:06 lr: 3.125619104409626e-05 loss: 0.1257 (0.1358) time: 3.0419 data: 0.0068 max mem: 33300 Epoch: [16] [1150/4276] eta: 2:55:25 lr: 3.125341908125104e-05 loss: 0.1296 (0.1357) time: 3.0669 data: 0.0070 max mem: 33300 Epoch: [16] [1160/4276] eta: 2:54:43 lr: 3.1250647091088436e-05 loss: 0.1283 (0.1357) time: 3.0669 data: 0.0071 max mem: 33300 Epoch: [16] [1170/4276] eta: 2:54:00 lr: 3.124787507360549e-05 loss: 0.1346 (0.1357) time: 3.0278 data: 0.0066 max mem: 33300 Epoch: [16] [1180/4276] eta: 2:53:18 lr: 3.124510302879925e-05 loss: 0.1247 (0.1358) time: 3.0331 data: 0.0070 max mem: 33300 Epoch: [16] [1190/4276] eta: 2:52:36 lr: 3.124233095666673e-05 loss: 0.1208 (0.1357) time: 3.0523 data: 0.0077 max mem: 33300 Epoch: [16] [1200/4276] eta: 2:51:55 lr: 3.123955885720499e-05 loss: 0.1231 (0.1357) time: 3.0377 data: 0.0076 max mem: 33300 Epoch: [16] [1210/4276] eta: 2:51:13 lr: 3.123678673041105e-05 loss: 0.1232 (0.1356) time: 3.0263 data: 0.0073 max mem: 33300 Epoch: [16] [1220/4276] eta: 2:50:30 lr: 3.123401457628196e-05 loss: 0.1234 (0.1356) time: 3.0098 data: 0.0073 max mem: 33300 Epoch: [16] [1230/4276] eta: 2:49:50 lr: 3.123124239481474e-05 loss: 0.1285 (0.1357) time: 3.0396 data: 0.0075 max mem: 33300 Epoch: [16] [1240/4276] eta: 2:49:10 lr: 3.122847018600643e-05 loss: 0.1268 (0.1356) time: 3.0784 data: 0.0073 max mem: 33300 Epoch: [16] [1250/4276] eta: 2:48:29 lr: 3.122569794985405e-05 loss: 0.1215 (0.1356) time: 3.0581 data: 0.0071 max mem: 33300 Epoch: [16] [1260/4276] eta: 2:47:47 lr: 3.1222925686354664e-05 loss: 0.1167 (0.1355) time: 3.0089 data: 0.0073 max mem: 33300 Epoch: [16] [1270/4276] eta: 2:47:05 lr: 3.1220153395505285e-05 loss: 0.1209 (0.1355) time: 2.9788 data: 0.0074 max mem: 33300 Epoch: [16] [1280/4276] eta: 2:46:24 lr: 3.121738107730294e-05 loss: 0.1307 (0.1355) time: 2.9810 data: 0.0073 max mem: 33300 Epoch: [16] [1290/4276] eta: 2:45:42 lr: 3.1214608731744675e-05 loss: 0.1254 
(0.1354) time: 2.9865 data: 0.0071 max mem: 33300 Epoch: [16] [1300/4276] eta: 2:45:02 lr: 3.121183635882751e-05 loss: 0.1182 (0.1353) time: 3.0062 data: 0.0071 max mem: 33300 Epoch: [16] [1310/4276] eta: 2:44:22 lr: 3.1209063958548465e-05 loss: 0.1180 (0.1352) time: 3.0315 data: 0.0073 max mem: 33300 Epoch: [16] [1320/4276] eta: 2:43:44 lr: 3.120629153090459e-05 loss: 0.1373 (0.1353) time: 3.0724 data: 0.0076 max mem: 33300 Epoch: [16] [1330/4276] eta: 2:43:05 lr: 3.12035190758929e-05 loss: 0.1373 (0.1353) time: 3.0851 data: 0.0079 max mem: 33300 Epoch: [16] [1340/4276] eta: 2:42:26 lr: 3.120074659351043e-05 loss: 0.1317 (0.1352) time: 3.0557 data: 0.0074 max mem: 33300 Epoch: [16] [1350/4276] eta: 2:41:46 lr: 3.119797408375421e-05 loss: 0.1188 (0.1352) time: 3.0235 data: 0.0071 max mem: 33300 Epoch: [16] [1360/4276] eta: 2:41:07 lr: 3.119520154662125e-05 loss: 0.1287 (0.1351) time: 3.0173 data: 0.0077 max mem: 33300 Epoch: [16] [1370/4276] eta: 2:40:28 lr: 3.119242898210859e-05 loss: 0.1224 (0.1351) time: 3.0361 data: 0.0076 max mem: 33300 Epoch: [16] [1380/4276] eta: 2:39:49 lr: 3.118965639021326e-05 loss: 0.1288 (0.1352) time: 3.0405 data: 0.0076 max mem: 33300 Epoch: [16] [1390/4276] eta: 2:39:10 lr: 3.118688377093227e-05 loss: 0.1390 (0.1352) time: 3.0370 data: 0.0076 max mem: 33300 Epoch: [16] [1400/4276] eta: 2:38:32 lr: 3.118411112426266e-05 loss: 0.1369 (0.1352) time: 3.0506 data: 0.0073 max mem: 33300 Epoch: [16] [1410/4276] eta: 2:37:54 lr: 3.1181338450201437e-05 loss: 0.1250 (0.1352) time: 3.0777 data: 0.0074 max mem: 33300 Epoch: [16] [1420/4276] eta: 2:37:17 lr: 3.1178565748745634e-05 loss: 0.1222 (0.1352) time: 3.0794 data: 0.0071 max mem: 33300 Epoch: [16] [1430/4276] eta: 2:36:39 lr: 3.117579301989228e-05 loss: 0.1157 (0.1351) time: 3.0626 data: 0.0078 max mem: 33300 Epoch: [16] [1440/4276] eta: 2:36:00 lr: 3.117302026363838e-05 loss: 0.1178 (0.1351) time: 3.0438 data: 0.0077 max mem: 33300 Epoch: [16] [1450/4276] eta: 2:35:22 lr: 
3.1170247479980974e-05 loss: 0.1258 (0.1351) time: 3.0407 data: 0.0071 max mem: 33300 Epoch: [16] [1460/4276] eta: 2:34:44 lr: 3.116747466891706e-05 loss: 0.1320 (0.1350) time: 3.0338 data: 0.0074 max mem: 33300 Epoch: [16] [1470/4276] eta: 2:34:06 lr: 3.1164701830443684e-05 loss: 0.1340 (0.1350) time: 3.0302 data: 0.0075 max mem: 33300 Epoch: [16] [1480/4276] eta: 2:33:28 lr: 3.116192896455784e-05 loss: 0.1142 (0.1349) time: 3.0230 data: 0.0074 max mem: 33300 Epoch: [16] [1490/4276] eta: 2:32:50 lr: 3.115915607125656e-05 loss: 0.1142 (0.1349) time: 3.0355 data: 0.0076 max mem: 33300 Epoch: [16] [1500/4276] eta: 2:32:13 lr: 3.115638315053687e-05 loss: 0.1284 (0.1349) time: 3.0414 data: 0.0079 max mem: 33300 Epoch: [16] [1510/4276] eta: 2:31:35 lr: 3.1153610202395765e-05 loss: 0.1267 (0.1348) time: 3.0235 data: 0.0076 max mem: 33300 Epoch: [16] [1520/4276] eta: 2:30:57 lr: 3.115083722683029e-05 loss: 0.1289 (0.1348) time: 3.0178 data: 0.0072 max mem: 33300 Epoch: [16] [1530/4276] eta: 2:30:19 lr: 3.114806422383743e-05 loss: 0.1281 (0.1347) time: 3.0025 data: 0.0069 max mem: 33300 Epoch: [16] [1540/4276] eta: 2:29:41 lr: 3.114529119341423e-05 loss: 0.1171 (0.1347) time: 3.0144 data: 0.0069 max mem: 33300 Epoch: [16] [1550/4276] eta: 2:29:04 lr: 3.114251813555768e-05 loss: 0.1258 (0.1347) time: 3.0198 data: 0.0073 max mem: 33300 Epoch: [16] [1560/4276] eta: 2:28:26 lr: 3.113974505026481e-05 loss: 0.1258 (0.1346) time: 3.0152 data: 0.0074 max mem: 33300 Epoch: [16] [1570/4276] eta: 2:27:49 lr: 3.113697193753263e-05 loss: 0.1332 (0.1347) time: 3.0203 data: 0.0070 max mem: 33300 Epoch: [16] [1580/4276] eta: 2:27:13 lr: 3.113419879735816e-05 loss: 0.1173 (0.1346) time: 3.0395 data: 0.0072 max mem: 33300 Epoch: [16] [1590/4276] eta: 2:26:37 lr: 3.1131425629738395e-05 loss: 0.1200 (0.1346) time: 3.0681 data: 0.0072 max mem: 33300 Epoch: [16] [1600/4276] eta: 2:26:00 lr: 3.112865243467036e-05 loss: 0.1280 (0.1346) time: 3.0643 data: 0.0072 max mem: 33300 Epoch: [16] 
[1610/4276] eta: 2:25:24 lr: 3.112587921215107e-05 loss: 0.1195 (0.1345) time: 3.0492 data: 0.0070 max mem: 33300 Epoch: [16] [1620/4276] eta: 2:24:47 lr: 3.112310596217752e-05 loss: 0.1171 (0.1344) time: 3.0395 data: 0.0070 max mem: 33300 Epoch: [16] [1630/4276] eta: 2:24:10 lr: 3.1120332684746735e-05 loss: 0.1283 (0.1345) time: 3.0226 data: 0.0074 max mem: 33300 Epoch: [16] [1640/4276] eta: 2:23:33 lr: 3.1117559379855716e-05 loss: 0.1248 (0.1343) time: 3.0189 data: 0.0073 max mem: 33300 Epoch: [16] [1650/4276] eta: 2:22:56 lr: 3.111478604750147e-05 loss: 0.1092 (0.1342) time: 3.0084 data: 0.0074 max mem: 33300 Epoch: [16] [1660/4276] eta: 2:22:20 lr: 3.111201268768102e-05 loss: 0.1169 (0.1342) time: 3.0072 data: 0.0073 max mem: 33300 Epoch: [16] [1670/4276] eta: 2:21:44 lr: 3.1109239300391355e-05 loss: 0.1211 (0.1341) time: 3.0324 data: 0.0070 max mem: 33300 Epoch: [16] [1680/4276] eta: 2:21:07 lr: 3.1106465885629496e-05 loss: 0.1337 (0.1342) time: 3.0300 data: 0.0067 max mem: 33300 Epoch: [16] [1690/4276] eta: 2:20:31 lr: 3.110369244339244e-05 loss: 0.1223 (0.1342) time: 3.0260 data: 0.0067 max mem: 33300 Epoch: [16] [1700/4276] eta: 2:19:56 lr: 3.110091897367719e-05 loss: 0.1210 (0.1342) time: 3.0445 data: 0.0070 max mem: 33300 Epoch: [16] [1710/4276] eta: 2:19:19 lr: 3.1098145476480755e-05 loss: 0.1300 (0.1342) time: 3.0364 data: 0.0069 max mem: 33300 Epoch: [16] [1720/4276] eta: 2:18:42 lr: 3.109537195180015e-05 loss: 0.1300 (0.1342) time: 2.9957 data: 0.0069 max mem: 33300 Epoch: [16] [1730/4276] eta: 2:18:06 lr: 3.109259839963237e-05 loss: 0.1367 (0.1342) time: 2.9832 data: 0.0071 max mem: 33300 Epoch: [16] [1740/4276] eta: 2:17:30 lr: 3.108982481997441e-05 loss: 0.1367 (0.1342) time: 2.9917 data: 0.0072 max mem: 33300 Epoch: [16] [1750/4276] eta: 2:16:55 lr: 3.1087051212823284e-05 loss: 0.1290 (0.1341) time: 3.0331 data: 0.0075 max mem: 33300 Epoch: [16] [1760/4276] eta: 2:16:20 lr: 3.1084277578176e-05 loss: 0.1128 (0.1340) time: 3.0842 data: 0.0074 max 
mem: 33300 Epoch: [16] [1770/4276] eta: 2:15:45 lr: 3.108150391602954e-05 loss: 0.1134 (0.1340) time: 3.0808 data: 0.0071 max mem: 33300 Epoch: [16] [1780/4276] eta: 2:15:09 lr: 3.107873022638091e-05 loss: 0.1186 (0.1339) time: 3.0517 data: 0.0067 max mem: 33300 Epoch: [16] [1790/4276] eta: 2:14:33 lr: 3.107595650922712e-05 loss: 0.1186 (0.1339) time: 3.0160 data: 0.0067 max mem: 33300 Epoch: [16] [1800/4276] eta: 2:13:57 lr: 3.1073182764565166e-05 loss: 0.1188 (0.1339) time: 3.0045 data: 0.0074 max mem: 33300 Epoch: [16] [1810/4276] eta: 2:13:22 lr: 3.107040899239205e-05 loss: 0.1301 (0.1340) time: 3.0178 data: 0.0075 max mem: 33300 Epoch: [16] [1820/4276] eta: 2:12:47 lr: 3.1067635192704755e-05 loss: 0.1301 (0.1339) time: 3.0248 data: 0.0070 max mem: 33300 Epoch: [16] [1830/4276] eta: 2:12:11 lr: 3.1064861365500295e-05 loss: 0.1223 (0.1338) time: 3.0351 data: 0.0071 max mem: 33300 Epoch: [16] [1840/4276] eta: 2:11:37 lr: 3.106208751077565e-05 loss: 0.1179 (0.1337) time: 3.0663 data: 0.0076 max mem: 33300 Epoch: [16] [1850/4276] eta: 2:11:02 lr: 3.105931362852784e-05 loss: 0.1252 (0.1338) time: 3.0644 data: 0.0075 max mem: 33300 Epoch: [16] [1860/4276] eta: 2:10:26 lr: 3.105653971875384e-05 loss: 0.1327 (0.1337) time: 3.0207 data: 0.0072 max mem: 33300 Epoch: [16] [1870/4276] eta: 2:09:51 lr: 3.105376578145065e-05 loss: 0.1333 (0.1338) time: 3.0138 data: 0.0073 max mem: 33300 Epoch: [16] [1880/4276] eta: 2:09:16 lr: 3.105099181661527e-05 loss: 0.1306 (0.1338) time: 3.0241 data: 0.0074 max mem: 33300 Epoch: [16] [1890/4276] eta: 2:08:41 lr: 3.104821782424469e-05 loss: 0.1175 (0.1338) time: 3.0102 data: 0.0074 max mem: 33300 Epoch: [16] [1900/4276] eta: 2:08:05 lr: 3.1045443804335904e-05 loss: 0.1178 (0.1338) time: 2.9988 data: 0.0073 max mem: 33300 Epoch: [16] [1910/4276] eta: 2:07:30 lr: 3.104266975688591e-05 loss: 0.1274 (0.1338) time: 2.9927 data: 0.0072 max mem: 33300 Epoch: [16] [1920/4276] eta: 2:06:55 lr: 3.1039895681891684e-05 loss: 0.1253 (0.1337) time: 
3.0129 data: 0.0074 max mem: 33300 Epoch: [16] [1930/4276] eta: 2:06:21 lr: 3.103712157935023e-05 loss: 0.1193 (0.1336) time: 3.0525 data: 0.0073 max mem: 33300 Epoch: [16] [1940/4276] eta: 2:05:46 lr: 3.103434744925854e-05 loss: 0.1236 (0.1336) time: 3.0464 data: 0.0069 max mem: 33300 Epoch: [16] [1950/4276] eta: 2:05:12 lr: 3.103157329161359e-05 loss: 0.1258 (0.1337) time: 3.0567 data: 0.0069 max mem: 33300 Epoch: [16] [1960/4276] eta: 2:04:38 lr: 3.1028799106412384e-05 loss: 0.1274 (0.1337) time: 3.0758 data: 0.0072 max mem: 33300 Epoch: [16] [1970/4276] eta: 2:04:03 lr: 3.1026024893651916e-05 loss: 0.1146 (0.1335) time: 3.0354 data: 0.0069 max mem: 33300 Epoch: [16] [1980/4276] eta: 2:03:28 lr: 3.102325065332916e-05 loss: 0.1071 (0.1335) time: 2.9879 data: 0.0064 max mem: 33300 Epoch: [16] [1990/4276] eta: 2:02:53 lr: 3.10204763854411e-05 loss: 0.1237 (0.1335) time: 2.9790 data: 0.0064 max mem: 33300 Epoch: [16] [2000/4276] eta: 2:02:18 lr: 3.101770208998474e-05 loss: 0.1340 (0.1335) time: 2.9987 data: 0.0066 max mem: 33300 Epoch: [16] [2010/4276] eta: 2:01:44 lr: 3.1014927766957055e-05 loss: 0.1340 (0.1334) time: 3.0325 data: 0.0069 max mem: 33300 Epoch: [16] [2020/4276] eta: 2:01:09 lr: 3.101215341635502e-05 loss: 0.1259 (0.1335) time: 3.0323 data: 0.0071 max mem: 33300 Epoch: [16] [2030/4276] eta: 2:00:34 lr: 3.1009379038175644e-05 loss: 0.1227 (0.1334) time: 2.9955 data: 0.0073 max mem: 33300 Epoch: [16] [2040/4276] eta: 1:59:59 lr: 3.10066046324159e-05 loss: 0.1121 (0.1333) time: 2.9826 data: 0.0071 max mem: 33300 Epoch: [16] [2050/4276] eta: 1:59:25 lr: 3.100383019907277e-05 loss: 0.1259 (0.1334) time: 2.9831 data: 0.0069 max mem: 33300 Epoch: [16] [2060/4276] eta: 1:58:51 lr: 3.100105573814324e-05 loss: 0.1236 (0.1334) time: 3.0258 data: 0.0067 max mem: 33300 Epoch: [16] [2070/4276] eta: 1:58:16 lr: 3.099828124962429e-05 loss: 0.1188 (0.1333) time: 3.0243 data: 0.0068 max mem: 33300 Epoch: [16] [2080/4276] eta: 1:57:42 lr: 3.09955067335129e-05 loss: 
0.1287 (0.1334) time: 2.9855 data: 0.0071 max mem: 33300 Epoch: [16] [2090/4276] eta: 1:57:07 lr: 3.0992732189806056e-05 loss: 0.1326 (0.1333) time: 3.0058 data: 0.0071 max mem: 33300 Epoch: [16] [2100/4276] eta: 1:56:33 lr: 3.0989957618500736e-05 loss: 0.1251 (0.1333) time: 3.0219 data: 0.0070 max mem: 33300 Epoch: [16] [2110/4276] eta: 1:55:59 lr: 3.0987183019593926e-05 loss: 0.1251 (0.1333) time: 3.0357 data: 0.0073 max mem: 33300 Epoch: [16] [2120/4276] eta: 1:55:25 lr: 3.0984408393082596e-05 loss: 0.1089 (0.1331) time: 3.0397 data: 0.0075 max mem: 33300 Epoch: [16] [2130/4276] eta: 1:54:51 lr: 3.0981633738963725e-05 loss: 0.1078 (0.1330) time: 3.0078 data: 0.0070 max mem: 33300 Epoch: [16] [2140/4276] eta: 1:54:17 lr: 3.0978859057234305e-05 loss: 0.1115 (0.1330) time: 2.9847 data: 0.0065 max mem: 33300 Epoch: [16] [2150/4276] eta: 1:53:42 lr: 3.097608434789129e-05 loss: 0.1115 (0.1329) time: 2.9881 data: 0.0066 max mem: 33300 Epoch: [16] [2160/4276] eta: 1:53:08 lr: 3.0973309610931676e-05 loss: 0.1081 (0.1329) time: 2.9794 data: 0.0069 max mem: 33300 Epoch: [16] [2170/4276] eta: 1:52:34 lr: 3.097053484635243e-05 loss: 0.1216 (0.1329) time: 2.9775 data: 0.0070 max mem: 33300 Epoch: [16] [2180/4276] eta: 1:52:00 lr: 3.096776005415054e-05 loss: 0.1251 (0.1328) time: 3.0215 data: 0.0070 max mem: 33300 Epoch: [16] [2190/4276] eta: 1:51:27 lr: 3.096498523432296e-05 loss: 0.1250 (0.1328) time: 3.0701 data: 0.0069 max mem: 33300 Epoch: [16] [2200/4276] eta: 1:50:53 lr: 3.0962210386866693e-05 loss: 0.1250 (0.1328) time: 3.0467 data: 0.0070 max mem: 33300 Epoch: [16] [2210/4276] eta: 1:50:19 lr: 3.095943551177868e-05 loss: 0.1342 (0.1329) time: 3.0198 data: 0.0072 max mem: 33300 Epoch: [16] [2220/4276] eta: 1:49:45 lr: 3.0956660609055915e-05 loss: 0.1365 (0.1329) time: 3.0142 data: 0.0076 max mem: 33300 Epoch: [16] [2230/4276] eta: 1:49:11 lr: 3.095388567869537e-05 loss: 0.1331 (0.1329) time: 2.9982 data: 0.0074 max mem: 33300 Epoch: [16] [2240/4276] eta: 1:48:38 lr: 
3.0951110720694e-05 loss: 0.1226 (0.1329) time: 3.0051 data: 0.0069 max mem: 33300 Epoch: [16] [2250/4276] eta: 1:48:04 lr: 3.094833573504879e-05 loss: 0.1226 (0.1328) time: 3.0242 data: 0.0068 max mem: 33300 Epoch: [16] [2260/4276] eta: 1:47:30 lr: 3.094556072175671e-05 loss: 0.1326 (0.1328) time: 3.0109 data: 0.0066 max mem: 33300 Epoch: [16] [2270/4276] eta: 1:46:57 lr: 3.094278568081473e-05 loss: 0.1251 (0.1328) time: 3.0180 data: 0.0067 max mem: 33300 Epoch: [16] [2280/4276] eta: 1:46:23 lr: 3.0940010612219815e-05 loss: 0.1237 (0.1329) time: 3.0352 data: 0.0070 max mem: 33300 Epoch: [16] [2290/4276] eta: 1:45:50 lr: 3.0937235515968936e-05 loss: 0.1237 (0.1328) time: 3.0350 data: 0.0070 max mem: 33300 Epoch: [16] [2300/4276] eta: 1:45:16 lr: 3.093446039205907e-05 loss: 0.1253 (0.1328) time: 3.0037 data: 0.0071 max mem: 33300 Epoch: [16] [2310/4276] eta: 1:44:42 lr: 3.093168524048716e-05 loss: 0.1315 (0.1329) time: 2.9752 data: 0.0072 max mem: 33300 Epoch: [16] [2320/4276] eta: 1:44:09 lr: 3.0928910061250194e-05 loss: 0.1386 (0.1329) time: 2.9817 data: 0.0069 max mem: 33300 Epoch: [16] [2330/4276] eta: 1:43:35 lr: 3.092613485434513e-05 loss: 0.1312 (0.1329) time: 2.9830 data: 0.0067 max mem: 33300 Epoch: [16] [2340/4276] eta: 1:43:01 lr: 3.092335961976893e-05 loss: 0.1319 (0.1329) time: 2.9921 data: 0.0067 max mem: 33300 Epoch: [16] [2350/4276] eta: 1:42:28 lr: 3.092058435751857e-05 loss: 0.1312 (0.1329) time: 2.9920 data: 0.0068 max mem: 33300 Epoch: [16] [2360/4276] eta: 1:41:55 lr: 3.0917809067591015e-05 loss: 0.1179 (0.1329) time: 3.0108 data: 0.0069 max mem: 33300 Epoch: [16] [2370/4276] eta: 1:41:21 lr: 3.0915033749983214e-05 loss: 0.1269 (0.1330) time: 3.0128 data: 0.0070 max mem: 33300 Epoch: [16] [2380/4276] eta: 1:40:47 lr: 3.091225840469214e-05 loss: 0.1389 (0.1330) time: 2.9890 data: 0.0071 max mem: 33300 Epoch: [16] [2390/4276] eta: 1:40:14 lr: 3.090948303171475e-05 loss: 0.1249 (0.1330) time: 2.9656 data: 0.0072 max mem: 33300 Epoch: [16] 
[2400/4276] eta: 1:39:40 lr: 3.0906707631048e-05 loss: 0.1303 (0.1330) time: 2.9529 data: 0.0070 max mem: 33300 Epoch: [16] [2410/4276] eta: 1:39:07 lr: 3.090393220268887e-05 loss: 0.1276 (0.1330) time: 2.9934 data: 0.0069 max mem: 33300 Epoch: [16] [2420/4276] eta: 1:38:33 lr: 3.09011567466343e-05 loss: 0.1229 (0.1330) time: 3.0060 data: 0.0070 max mem: 33300 Epoch: [16] [2430/4276] eta: 1:38:00 lr: 3.089838126288127e-05 loss: 0.1351 (0.1331) time: 2.9879 data: 0.0070 max mem: 33300 Epoch: [16] [2440/4276] eta: 1:37:27 lr: 3.089560575142672e-05 loss: 0.1320 (0.1331) time: 3.0062 data: 0.0069 max mem: 33300 Epoch: [16] [2450/4276] eta: 1:36:54 lr: 3.089283021226762e-05 loss: 0.1300 (0.1331) time: 3.0243 data: 0.0074 max mem: 33300 Epoch: [16] [2460/4276] eta: 1:36:21 lr: 3.089005464540092e-05 loss: 0.1420 (0.1331) time: 3.0362 data: 0.0077 max mem: 33300 Epoch: [16] [2470/4276] eta: 1:35:48 lr: 3.0887279050823584e-05 loss: 0.1291 (0.1332) time: 3.0251 data: 0.0076 max mem: 33300 Epoch: [16] [2480/4276] eta: 1:35:15 lr: 3.088450342853256e-05 loss: 0.1483 (0.1333) time: 3.0000 data: 0.0074 max mem: 33300 Epoch: [16] [2490/4276] eta: 1:34:42 lr: 3.0881727778524814e-05 loss: 0.1510 (0.1333) time: 2.9913 data: 0.0070 max mem: 33300 Epoch: [16] [2500/4276] eta: 1:34:08 lr: 3.0878952100797296e-05 loss: 0.1385 (0.1334) time: 2.9922 data: 0.0069 max mem: 33300 Epoch: [16] [2510/4276] eta: 1:33:35 lr: 3.0876176395346965e-05 loss: 0.1414 (0.1334) time: 3.0004 data: 0.0070 max mem: 33300 Epoch: [16] [2520/4276] eta: 1:33:02 lr: 3.087340066217077e-05 loss: 0.1235 (0.1333) time: 3.0019 data: 0.0076 max mem: 33300 Epoch: [16] [2530/4276] eta: 1:32:30 lr: 3.087062490126566e-05 loss: 0.1089 (0.1332) time: 3.0207 data: 0.0079 max mem: 33300 Epoch: [16] [2540/4276] eta: 1:31:57 lr: 3.08678491126286e-05 loss: 0.1196 (0.1333) time: 3.0287 data: 0.0076 max mem: 33300 Epoch: [16] [2550/4276] eta: 1:31:24 lr: 3.086507329625653e-05 loss: 0.1215 (0.1332) time: 3.0028 data: 0.0072 max mem: 
33300 Epoch: [16] [2560/4276] eta: 1:30:51 lr: 3.0862297452146414e-05 loss: 0.1044 (0.1331) time: 2.9890 data: 0.0074 max mem: 33300 Epoch: [16] [2570/4276] eta: 1:30:18 lr: 3.085952158029518e-05 loss: 0.1105 (0.1331) time: 2.9907 data: 0.0075 max mem: 33300 Epoch: [16] [2580/4276] eta: 1:29:45 lr: 3.08567456806998e-05 loss: 0.1168 (0.1330) time: 2.9938 data: 0.0069 max mem: 33300 Epoch: [16] [2590/4276] eta: 1:29:12 lr: 3.0853969753357224e-05 loss: 0.1210 (0.1330) time: 2.9870 data: 0.0067 max mem: 33300 Epoch: [16] [2600/4276] eta: 1:28:39 lr: 3.085119379826439e-05 loss: 0.1263 (0.1329) time: 2.9818 data: 0.0069 max mem: 33300 Epoch: [16] [2610/4276] eta: 1:28:06 lr: 3.084841781541824e-05 loss: 0.1196 (0.1329) time: 2.9876 data: 0.0070 max mem: 33300 Epoch: [16] [2620/4276] eta: 1:27:33 lr: 3.084564180481574e-05 loss: 0.1249 (0.1329) time: 3.0123 data: 0.0070 max mem: 33300 Epoch: [16] [2630/4276] eta: 1:27:00 lr: 3.084286576645383e-05 loss: 0.1249 (0.1329) time: 3.0219 data: 0.0073 max mem: 33300 Epoch: [16] [2640/4276] eta: 1:26:27 lr: 3.084008970032945e-05 loss: 0.1140 (0.1329) time: 2.9890 data: 0.0073 max mem: 33300 Epoch: [16] [2650/4276] eta: 1:25:54 lr: 3.0837313606439547e-05 loss: 0.1206 (0.1328) time: 2.9692 data: 0.0073 max mem: 33300 Epoch: [16] [2660/4276] eta: 1:25:22 lr: 3.0834537484781064e-05 loss: 0.1270 (0.1329) time: 2.9856 data: 0.0076 max mem: 33300 Epoch: [16] [2670/4276] eta: 1:24:49 lr: 3.083176133535097e-05 loss: 0.1321 (0.1328) time: 2.9894 data: 0.0072 max mem: 33300 Epoch: [16] [2680/4276] eta: 1:24:16 lr: 3.082898515814617e-05 loss: 0.1321 (0.1329) time: 2.9846 data: 0.0068 max mem: 33300 Epoch: [16] [2690/4276] eta: 1:23:43 lr: 3.0826208953163636e-05 loss: 0.1321 (0.1328) time: 2.9810 data: 0.0068 max mem: 33300 Epoch: [16] [2700/4276] eta: 1:23:11 lr: 3.0823432720400296e-05 loss: 0.1187 (0.1328) time: 2.9994 data: 0.0071 max mem: 33300 Epoch: [16] [2710/4276] eta: 1:22:39 lr: 3.08206564598531e-05 loss: 0.1194 (0.1328) time: 3.0575 
data: 0.0074 max mem: 33300 Epoch: [16] [2720/4276] eta: 1:22:06 lr: 3.0817880171518975e-05 loss: 0.1120 (0.1327) time: 3.0546 data: 0.0074 max mem: 33300 Epoch: [16] [2730/4276] eta: 1:21:33 lr: 3.081510385539488e-05 loss: 0.1242 (0.1328) time: 3.0012 data: 0.0074 max mem: 33300 Epoch: [16] [2740/4276] eta: 1:21:01 lr: 3.0812327511477745e-05 loss: 0.1389 (0.1328) time: 3.0040 data: 0.0075 max mem: 33300 Epoch: [16] [2750/4276] eta: 1:20:29 lr: 3.0809551139764505e-05 loss: 0.1407 (0.1329) time: 3.0209 data: 0.0077 max mem: 33300 Epoch: [16] [2760/4276] eta: 1:19:56 lr: 3.080677474025211e-05 loss: 0.1327 (0.1328) time: 2.9896 data: 0.0074 max mem: 33300 Epoch: [16] [2770/4276] eta: 1:19:23 lr: 3.0803998312937494e-05 loss: 0.1220 (0.1328) time: 2.9684 data: 0.0072 max mem: 33300 Epoch: [16] [2780/4276] eta: 1:18:50 lr: 3.080122185781758e-05 loss: 0.1220 (0.1328) time: 2.9748 data: 0.0072 max mem: 33300 Epoch: [16] [2790/4276] eta: 1:18:18 lr: 3.0798445374889325e-05 loss: 0.1290 (0.1329) time: 2.9746 data: 0.0070 max mem: 33300 Epoch: [16] [2800/4276] eta: 1:17:45 lr: 3.079566886414966e-05 loss: 0.1305 (0.1328) time: 2.9875 data: 0.0072 max mem: 33300 Epoch: [16] [2810/4276] eta: 1:17:13 lr: 3.079289232559551e-05 loss: 0.1095 (0.1327) time: 3.0138 data: 0.0073 max mem: 33300 Epoch: [16] [2820/4276] eta: 1:16:40 lr: 3.079011575922382e-05 loss: 0.1045 (0.1326) time: 2.9929 data: 0.0072 max mem: 33300 Epoch: [16] [2830/4276] eta: 1:16:08 lr: 3.078733916503152e-05 loss: 0.1152 (0.1326) time: 2.9664 data: 0.0077 max mem: 33300 Epoch: [16] [2840/4276] eta: 1:15:35 lr: 3.078456254301554e-05 loss: 0.1347 (0.1327) time: 2.9617 data: 0.0079 max mem: 33300 Epoch: [16] [2850/4276] eta: 1:15:03 lr: 3.078178589317282e-05 loss: 0.1444 (0.1328) time: 2.9538 data: 0.0078 max mem: 33300 Epoch: [16] [2860/4276] eta: 1:14:30 lr: 3.0779009215500295e-05 loss: 0.1392 (0.1328) time: 2.9691 data: 0.0077 max mem: 33300 Epoch: [16] [2870/4276] eta: 1:13:58 lr: 3.077623250999488e-05 loss: 0.1307 
(0.1328) time: 2.9931 data: 0.0077 max mem: 33300 Epoch: [16] [2880/4276] eta: 1:13:26 lr: 3.077345577665352e-05 loss: 0.1335 (0.1328) time: 3.0127 data: 0.0075 max mem: 33300 Epoch: [16] [2890/4276] eta: 1:12:53 lr: 3.077067901547313e-05 loss: 0.1280 (0.1328) time: 3.0254 data: 0.0074 max mem: 33300 Epoch: [16] [2900/4276] eta: 1:12:21 lr: 3.076790222645067e-05 loss: 0.1228 (0.1327) time: 3.0223 data: 0.0073 max mem: 33300 Epoch: [16] [2910/4276] eta: 1:11:49 lr: 3.076512540958304e-05 loss: 0.1214 (0.1327) time: 3.0171 data: 0.0075 max mem: 33300 Epoch: [16] [2920/4276] eta: 1:11:17 lr: 3.0762348564867184e-05 loss: 0.1266 (0.1328) time: 3.0234 data: 0.0075 max mem: 33300 Epoch: [16] [2930/4276] eta: 1:10:45 lr: 3.0759571692300014e-05 loss: 0.1228 (0.1327) time: 3.0131 data: 0.0071 max mem: 33300 Epoch: [16] [2940/4276] eta: 1:10:12 lr: 3.0756794791878465e-05 loss: 0.1039 (0.1327) time: 2.9996 data: 0.0072 max mem: 33300 Epoch: [16] [2950/4276] eta: 1:09:40 lr: 3.075401786359947e-05 loss: 0.1128 (0.1326) time: 2.9956 data: 0.0072 max mem: 33300 Epoch: [16] [2960/4276] eta: 1:09:08 lr: 3.0751240907459946e-05 loss: 0.1152 (0.1326) time: 2.9969 data: 0.0072 max mem: 33300 Epoch: [16] [2970/4276] eta: 1:08:36 lr: 3.0748463923456824e-05 loss: 0.1206 (0.1326) time: 3.0081 data: 0.0071 max mem: 33300 Epoch: [16] [2980/4276] eta: 1:08:04 lr: 3.074568691158703e-05 loss: 0.1276 (0.1326) time: 3.0196 data: 0.0071 max mem: 33300 Epoch: [16] [2990/4276] eta: 1:07:32 lr: 3.074290987184747e-05 loss: 0.1275 (0.1326) time: 3.0162 data: 0.0074 max mem: 33300 Epoch: [16] [3000/4276] eta: 1:06:59 lr: 3.074013280423509e-05 loss: 0.1126 (0.1325) time: 2.9772 data: 0.0073 max mem: 33300 Epoch: [16] [3010/4276] eta: 1:06:27 lr: 3.07373557087468e-05 loss: 0.1155 (0.1325) time: 2.9513 data: 0.0072 max mem: 33300 Epoch: [16] [3020/4276] eta: 1:05:55 lr: 3.0734578585379525e-05 loss: 0.1268 (0.1325) time: 2.9588 data: 0.0074 max mem: 33300 Epoch: [16] [3030/4276] eta: 1:05:22 lr: 
3.073180143413018e-05 loss: 0.1268 (0.1325) time: 2.9542 data: 0.0071 max mem: 33300 Epoch: [16] [3040/4276] eta: 1:04:50 lr: 3.072902425499569e-05 loss: 0.1324 (0.1325) time: 2.9760 data: 0.0073 max mem: 33300 Epoch: [16] [3050/4276] eta: 1:04:18 lr: 3.072624704797299e-05 loss: 0.1203 (0.1325) time: 3.0013 data: 0.0074 max mem: 33300 Epoch: [16] [3060/4276] eta: 1:03:46 lr: 3.072346981305897e-05 loss: 0.1079 (0.1324) time: 2.9908 data: 0.0071 max mem: 33300 Epoch: [16] [3070/4276] eta: 1:03:14 lr: 3.072069255025057e-05 loss: 0.1191 (0.1324) time: 2.9823 data: 0.0074 max mem: 33300 Epoch: [16] [3080/4276] eta: 1:02:42 lr: 3.071791525954469e-05 loss: 0.1191 (0.1324) time: 3.0032 data: 0.0074 max mem: 33300 Epoch: [16] [3090/4276] eta: 1:02:10 lr: 3.071513794093827e-05 loss: 0.1190 (0.1323) time: 3.0014 data: 0.0074 max mem: 33300 Epoch: [16] [3100/4276] eta: 1:01:38 lr: 3.0712360594428205e-05 loss: 0.1108 (0.1323) time: 2.9870 data: 0.0074 max mem: 33300 Epoch: [16] [3110/4276] eta: 1:01:06 lr: 3.070958322001142e-05 loss: 0.1108 (0.1322) time: 2.9824 data: 0.0072 max mem: 33300 Epoch: [16] [3120/4276] eta: 1:00:34 lr: 3.070680581768484e-05 loss: 0.1123 (0.1322) time: 2.9687 data: 0.0071 max mem: 33300 Epoch: [16] [3130/4276] eta: 1:00:02 lr: 3.0704028387445365e-05 loss: 0.1177 (0.1322) time: 2.9615 data: 0.0070 max mem: 33300 Epoch: [16] [3140/4276] eta: 0:59:29 lr: 3.0701250929289916e-05 loss: 0.1312 (0.1322) time: 2.9594 data: 0.0073 max mem: 33300 Epoch: [16] [3150/4276] eta: 0:58:58 lr: 3.06984734432154e-05 loss: 0.1330 (0.1322) time: 2.9826 data: 0.0077 max mem: 33300 Epoch: [16] [3160/4276] eta: 0:58:26 lr: 3.0695695929218736e-05 loss: 0.1286 (0.1322) time: 2.9893 data: 0.0074 max mem: 33300 Epoch: [16] [3170/4276] eta: 0:57:54 lr: 3.069291838729684e-05 loss: 0.1293 (0.1322) time: 2.9783 data: 0.0074 max mem: 33300 Epoch: [16] [3180/4276] eta: 0:57:22 lr: 3.0690140817446604e-05 loss: 0.1232 (0.1321) time: 2.9782 data: 0.0074 max mem: 33300 Epoch: [16] 
[3190/4276] eta: 0:56:50 lr: 3.068736321966496e-05 loss: 0.1232 (0.1321) time: 2.9706 data: 0.0071 max mem: 33300 Epoch: [16] [3200/4276] eta: 0:56:18 lr: 3.0684585593948806e-05 loss: 0.1246 (0.1321) time: 2.9708 data: 0.0073 max mem: 33300 Epoch: [16] [3210/4276] eta: 0:55:46 lr: 3.068180794029506e-05 loss: 0.1272 (0.1321) time: 2.9609 data: 0.0078 max mem: 33300 Epoch: [16] [3220/4276] eta: 0:55:14 lr: 3.0679030258700624e-05 loss: 0.1400 (0.1321) time: 2.9523 data: 0.0075 max mem: 33300 Epoch: [16] [3230/4276] eta: 0:54:42 lr: 3.067625254916241e-05 loss: 0.1304 (0.1321) time: 2.9613 data: 0.0075 max mem: 33300 Epoch: [16] [3240/4276] eta: 0:54:10 lr: 3.0673474811677324e-05 loss: 0.1437 (0.1322) time: 2.9675 data: 0.0076 max mem: 33300 Epoch: [16] [3250/4276] eta: 0:53:38 lr: 3.067069704624226e-05 loss: 0.1377 (0.1322) time: 2.9936 data: 0.0072 max mem: 33300 Epoch: [16] [3260/4276] eta: 0:53:06 lr: 3.066791925285415e-05 loss: 0.1306 (0.1322) time: 3.0099 data: 0.0070 max mem: 33300 Epoch: [16] [3270/4276] eta: 0:52:34 lr: 3.0665141431509874e-05 loss: 0.1234 (0.1322) time: 3.0044 data: 0.0070 max mem: 33300 Epoch: [16] [3280/4276] eta: 0:52:03 lr: 3.066236358220635e-05 loss: 0.1293 (0.1322) time: 2.9833 data: 0.0070 max mem: 33300 Epoch: [16] [3290/4276] eta: 0:51:31 lr: 3.06595857049405e-05 loss: 0.1315 (0.1323) time: 2.9804 data: 0.0072 max mem: 33300 Epoch: [16] [3300/4276] eta: 0:50:59 lr: 3.065680779970919e-05 loss: 0.1323 (0.1323) time: 2.9890 data: 0.0072 max mem: 33300 Epoch: [16] [3310/4276] eta: 0:50:27 lr: 3.0654029866509346e-05 loss: 0.1360 (0.1323) time: 2.9659 data: 0.0069 max mem: 33300 Epoch: [16] [3320/4276] eta: 0:49:55 lr: 3.0651251905337866e-05 loss: 0.1420 (0.1323) time: 2.9508 data: 0.0073 max mem: 33300 Epoch: [16] [3330/4276] eta: 0:49:23 lr: 3.0648473916191645e-05 loss: 0.1255 (0.1323) time: 2.9675 data: 0.0081 max mem: 33300 Epoch: [16] [3340/4276] eta: 0:48:52 lr: 3.064569589906759e-05 loss: 0.1138 (0.1323) time: 2.9744 data: 0.0077 max 
mem: 33300 Epoch: [16] [3350/4276] eta: 0:48:20 lr: 3.06429178539626e-05 loss: 0.1171 (0.1322) time: 2.9722 data: 0.0073 max mem: 33300 Epoch: [16] [3360/4276] eta: 0:47:48 lr: 3.064013978087358e-05 loss: 0.1171 (0.1322) time: 2.9843 data: 0.0076 max mem: 33300 Epoch: [16] [3370/4276] eta: 0:47:16 lr: 3.0637361679797424e-05 loss: 0.1211 (0.1322) time: 2.9739 data: 0.0075 max mem: 33300 Epoch: [16] [3380/4276] eta: 0:46:45 lr: 3.063458355073103e-05 loss: 0.1343 (0.1322) time: 2.9580 data: 0.0074 max mem: 33300 Epoch: [16] [3390/4276] eta: 0:46:13 lr: 3.0631805393671296e-05 loss: 0.1299 (0.1322) time: 2.9707 data: 0.0079 max mem: 33300 Epoch: [16] [3400/4276] eta: 0:45:41 lr: 3.0629027208615116e-05 loss: 0.1328 (0.1322) time: 2.9974 data: 0.0081 max mem: 33300 Epoch: [16] [3410/4276] eta: 0:45:10 lr: 3.062624899555938e-05 loss: 0.1199 (0.1322) time: 2.9984 data: 0.0081 max mem: 33300 Epoch: [16] [3420/4276] eta: 0:44:38 lr: 3.0623470754501e-05 loss: 0.1297 (0.1323) time: 2.9850 data: 0.0080 max mem: 33300 Epoch: [16] [3430/4276] eta: 0:44:06 lr: 3.062069248543687e-05 loss: 0.1358 (0.1323) time: 3.0021 data: 0.0075 max mem: 33300 Epoch: [16] [3440/4276] eta: 0:43:35 lr: 3.061791418836387e-05 loss: 0.1249 (0.1322) time: 3.0280 data: 0.0074 max mem: 33300 Epoch: [16] [3450/4276] eta: 0:43:03 lr: 3.0615135863278904e-05 loss: 0.1320 (0.1323) time: 3.0216 data: 0.0068 max mem: 33300 Epoch: [16] [3460/4276] eta: 0:42:32 lr: 3.061235751017887e-05 loss: 0.1495 (0.1323) time: 3.0405 data: 0.0062 max mem: 33300 Epoch: [16] [3470/4276] eta: 0:42:01 lr: 3.0609579129060637e-05 loss: 0.1342 (0.1323) time: 3.0702 data: 0.0072 max mem: 33300 Epoch: [16] [3480/4276] eta: 0:41:29 lr: 3.0606800719921125e-05 loss: 0.1348 (0.1323) time: 3.0623 data: 0.0077 max mem: 33300 Epoch: [16] [3490/4276] eta: 0:40:58 lr: 3.06040222827572e-05 loss: 0.1348 (0.1324) time: 3.0618 data: 0.0073 max mem: 33300 Epoch: [16] [3500/4276] eta: 0:40:26 lr: 3.060124381756578e-05 loss: 0.1288 (0.1323) time: 
3.0703 data: 0.0080 max mem: 33300 Epoch: [16] [3510/4276] eta: 0:39:55 lr: 3.059846532434373e-05 loss: 0.1170 (0.1323) time: 3.0706 data: 0.0083 max mem: 33300 Epoch: [16] [3520/4276] eta: 0:39:24 lr: 3.0595686803087954e-05 loss: 0.1176 (0.1323) time: 3.0896 data: 0.0081 max mem: 33300 Epoch: [16] [3530/4276] eta: 0:38:52 lr: 3.059290825379533e-05 loss: 0.1292 (0.1323) time: 3.1006 data: 0.0082 max mem: 33300 Epoch: [16] [3540/4276] eta: 0:38:21 lr: 3.0590129676462754e-05 loss: 0.1269 (0.1323) time: 3.0798 data: 0.0077 max mem: 33300 Epoch: [16] [3550/4276] eta: 0:37:49 lr: 3.05873510710871e-05 loss: 0.1282 (0.1323) time: 3.0630 data: 0.0073 max mem: 33300 Epoch: [16] [3560/4276] eta: 0:37:18 lr: 3.058457243766527e-05 loss: 0.1326 (0.1323) time: 3.0558 data: 0.0074 max mem: 33300 Epoch: [16] [3570/4276] eta: 0:36:47 lr: 3.0581793776194145e-05 loss: 0.1382 (0.1324) time: 3.0441 data: 0.0075 max mem: 33300 Epoch: [16] [3580/4276] eta: 0:36:15 lr: 3.057901508667061e-05 loss: 0.1246 (0.1323) time: 3.0535 data: 0.0075 max mem: 33300 Epoch: [16] [3590/4276] eta: 0:35:44 lr: 3.057623636909154e-05 loss: 0.1165 (0.1324) time: 3.0948 data: 0.0077 max mem: 33300 Epoch: [16] [3600/4276] eta: 0:35:13 lr: 3.057345762345384e-05 loss: 0.1268 (0.1324) time: 3.1094 data: 0.0080 max mem: 33300 Epoch: [16] [3610/4276] eta: 0:34:41 lr: 3.057067884975437e-05 loss: 0.1273 (0.1324) time: 3.1074 data: 0.0082 max mem: 33300 Epoch: [16] [3620/4276] eta: 0:34:10 lr: 3.056790004799003e-05 loss: 0.1273 (0.1324) time: 3.1041 data: 0.0084 max mem: 33300 Epoch: [16] [3630/4276] eta: 0:33:39 lr: 3.056512121815768e-05 loss: 0.1297 (0.1324) time: 3.0957 data: 0.0082 max mem: 33300 Epoch: [16] [3640/4276] eta: 0:33:07 lr: 3.0562342360254224e-05 loss: 0.1196 (0.1324) time: 3.0900 data: 0.0078 max mem: 33300 Epoch: [16] [3650/4276] eta: 0:32:36 lr: 3.055956347427654e-05 loss: 0.1089 (0.1323) time: 3.0716 data: 0.0077 max mem: 33300 Epoch: [16] [3660/4276] eta: 0:32:05 lr: 3.0556784560221494e-05 loss: 
0.1099 (0.1322) time: 3.0623 data: 0.0080 max mem: 33300 Epoch: [16] [3670/4276] eta: 0:31:33 lr: 3.055400561808597e-05 loss: 0.1171 (0.1323) time: 3.0672 data: 0.0081 max mem: 33300 Epoch: [16] [3680/4276] eta: 0:31:02 lr: 3.0551226647866856e-05 loss: 0.1256 (0.1323) time: 3.0622 data: 0.0077 max mem: 33300 Epoch: [16] [3690/4276] eta: 0:30:31 lr: 3.054844764956102e-05 loss: 0.1278 (0.1323) time: 3.0597 data: 0.0075 max mem: 33300 Epoch: [16] [3700/4276] eta: 0:29:59 lr: 3.054566862316534e-05 loss: 0.1260 (0.1323) time: 3.0733 data: 0.0076 max mem: 33300 Epoch: [16] [3710/4276] eta: 0:29:28 lr: 3.054288956867669e-05 loss: 0.1206 (0.1322) time: 3.0707 data: 0.0078 max mem: 33300 Epoch: [16] [3720/4276] eta: 0:28:57 lr: 3.054011048609196e-05 loss: 0.1193 (0.1322) time: 3.0472 data: 0.0077 max mem: 33300 Epoch: [16] [3730/4276] eta: 0:28:25 lr: 3.0537331375408005e-05 loss: 0.1221 (0.1322) time: 3.0501 data: 0.0081 max mem: 33300 Epoch: [16] [3740/4276] eta: 0:27:54 lr: 3.0534552236621716e-05 loss: 0.1251 (0.1322) time: 3.0534 data: 0.0082 max mem: 33300 Epoch: [16] [3750/4276] eta: 0:27:23 lr: 3.0531773069729965e-05 loss: 0.1275 (0.1322) time: 3.0692 data: 0.0081 max mem: 33300 Epoch: [16] [3760/4276] eta: 0:26:51 lr: 3.052899387472961e-05 loss: 0.1212 (0.1322) time: 3.0750 data: 0.0084 max mem: 33300 Epoch: [16] [3770/4276] eta: 0:26:20 lr: 3.0526214651617545e-05 loss: 0.1213 (0.1322) time: 3.0626 data: 0.0079 max mem: 33300 Epoch: [16] [3780/4276] eta: 0:25:49 lr: 3.0523435400390627e-05 loss: 0.1338 (0.1322) time: 3.0655 data: 0.0082 max mem: 33300 Epoch: [16] [3790/4276] eta: 0:25:17 lr: 3.0520656121045725e-05 loss: 0.1246 (0.1322) time: 3.0644 data: 0.0086 max mem: 33300 Epoch: [16] [3800/4276] eta: 0:24:46 lr: 3.051787681357972e-05 loss: 0.1292 (0.1322) time: 3.0776 data: 0.0078 max mem: 33300 Epoch: [16] [3810/4276] eta: 0:24:15 lr: 3.0515097477989475e-05 loss: 0.1292 (0.1322) time: 3.0915 data: 0.0081 max mem: 33300 Epoch: [16] [3820/4276] eta: 0:23:43 lr: 
3.0512318114271865e-05 loss: 0.1071 (0.1321) time: 3.0701 data: 0.0079 max mem: 33300 Epoch: [16] [3830/4276] eta: 0:23:12 lr: 3.050953872242376e-05 loss: 0.1097 (0.1322) time: 3.0697 data: 0.0074 max mem: 33300 Epoch: [16] [3840/4276] eta: 0:22:41 lr: 3.0506759302442013e-05 loss: 0.1195 (0.1321) time: 3.0746 data: 0.0080 max mem: 33300 Epoch: [16] [3850/4276] eta: 0:22:10 lr: 3.0503979854323506e-05 loss: 0.1063 (0.1321) time: 3.0234 data: 0.0083 max mem: 33300 Epoch: [16] [3860/4276] eta: 0:21:38 lr: 3.0501200378065098e-05 loss: 0.1235 (0.1321) time: 2.9798 data: 0.0085 max mem: 33300 Epoch: [16] [3870/4276] eta: 0:21:07 lr: 3.0498420873663657e-05 loss: 0.1282 (0.1320) time: 2.9846 data: 0.0080 max mem: 33300 Epoch: [16] [3880/4276] eta: 0:20:36 lr: 3.049564134111605e-05 loss: 0.1186 (0.1320) time: 3.0245 data: 0.0074 max mem: 33300 Epoch: [16] [3890/4276] eta: 0:20:04 lr: 3.049286178041914e-05 loss: 0.1186 (0.1320) time: 3.0240 data: 0.0074 max mem: 33300 Epoch: [16] [3900/4276] eta: 0:19:33 lr: 3.049008219156979e-05 loss: 0.1197 (0.1320) time: 2.9896 data: 0.0071 max mem: 33300 Epoch: [16] [3910/4276] eta: 0:19:02 lr: 3.0487302574564868e-05 loss: 0.1184 (0.1320) time: 2.9808 data: 0.0073 max mem: 33300 Epoch: [16] [3920/4276] eta: 0:18:30 lr: 3.0484522929401226e-05 loss: 0.1117 (0.1320) time: 2.9789 data: 0.0082 max mem: 33300 Epoch: [16] [3930/4276] eta: 0:17:59 lr: 3.048174325607574e-05 loss: 0.1131 (0.1319) time: 2.9793 data: 0.0078 max mem: 33300 Epoch: [16] [3940/4276] eta: 0:17:28 lr: 3.047896355458526e-05 loss: 0.1225 (0.1319) time: 2.9976 data: 0.0073 max mem: 33300 Epoch: [16] [3950/4276] eta: 0:16:56 lr: 3.0476183824926646e-05 loss: 0.1112 (0.1319) time: 2.9744 data: 0.0075 max mem: 33300 Epoch: [16] [3960/4276] eta: 0:16:25 lr: 3.0473404067096766e-05 loss: 0.1144 (0.1319) time: 2.9431 data: 0.0079 max mem: 33300 Epoch: [16] [3970/4276] eta: 0:15:54 lr: 3.047062428109247e-05 loss: 0.1241 (0.1319) time: 2.9681 data: 0.0083 max mem: 33300 Epoch: [16] 
[3980/4276] eta: 0:15:22 lr: 3.046784446691063e-05 loss: 0.1265 (0.1319) time: 2.9883 data: 0.0080 max mem: 33300 Epoch: [16] [3990/4276] eta: 0:14:51 lr: 3.046506462454809e-05 loss: 0.1265 (0.1319) time: 2.9756 data: 0.0074 max mem: 33300 Epoch: [16] [4000/4276] eta: 0:14:20 lr: 3.0462284754001714e-05 loss: 0.1223 (0.1319) time: 3.0041 data: 0.0079 max mem: 33300 Epoch: [16] [4010/4276] eta: 0:13:49 lr: 3.0459504855268357e-05 loss: 0.1237 (0.1319) time: 3.0187 data: 0.0081 max mem: 33300 Epoch: [16] [4020/4276] eta: 0:13:17 lr: 3.045672492834487e-05 loss: 0.1273 (0.1319) time: 2.9957 data: 0.0070 max mem: 33300 Epoch: [16] [4030/4276] eta: 0:12:46 lr: 3.0453944973228117e-05 loss: 0.1301 (0.1319) time: 3.0097 data: 0.0074 max mem: 33300 Epoch: [16] [4040/4276] eta: 0:12:15 lr: 3.0451164989914943e-05 loss: 0.1309 (0.1319) time: 2.9996 data: 0.0078 max mem: 33300 Epoch: [16] [4050/4276] eta: 0:11:44 lr: 3.0448384978402212e-05 loss: 0.1141 (0.1319) time: 3.0048 data: 0.0078 max mem: 33300 Epoch: [16] [4060/4276] eta: 0:11:12 lr: 3.0445604938686774e-05 loss: 0.1178 (0.1319) time: 3.0407 data: 0.0083 max mem: 33300 Epoch: [16] [4070/4276] eta: 0:10:41 lr: 3.0442824870765474e-05 loss: 0.1318 (0.1319) time: 3.0203 data: 0.0080 max mem: 33300 Epoch: [16] [4080/4276] eta: 0:10:10 lr: 3.0440044774635178e-05 loss: 0.1318 (0.1320) time: 2.9858 data: 0.0077 max mem: 33300 Epoch: [16] [4090/4276] eta: 0:09:39 lr: 3.043726465029272e-05 loss: 0.1367 (0.1320) time: 2.9893 data: 0.0079 max mem: 33300 Epoch: [16] [4100/4276] eta: 0:09:08 lr: 3.0434484497734962e-05 loss: 0.1422 (0.1320) time: 3.0237 data: 0.0076 max mem: 33300 Epoch: [16] [4110/4276] eta: 0:08:36 lr: 3.043170431695875e-05 loss: 0.1317 (0.1320) time: 3.0373 data: 0.0070 max mem: 33300 Epoch: [16] [4120/4276] eta: 0:08:05 lr: 3.042892410796093e-05 loss: 0.1308 (0.1321) time: 2.9757 data: 0.0070 max mem: 33300 Epoch: [16] [4130/4276] eta: 0:07:34 lr: 3.042614387073836e-05 loss: 0.1269 (0.1321) time: 2.9616 data: 0.0071 
max mem: 33300 Epoch: [16] [4140/4276] eta: 0:07:03 lr: 3.0423363605287887e-05 loss: 0.1358 (0.1321) time: 2.9829 data: 0.0073 max mem: 33300 Epoch: [16] [4150/4276] eta: 0:06:32 lr: 3.0420583311606343e-05 loss: 0.1403 (0.1321) time: 2.9572 data: 0.0072 max mem: 33300 Epoch: [16] [4160/4276] eta: 0:06:01 lr: 3.0417802989690593e-05 loss: 0.1327 (0.1322) time: 2.9603 data: 0.0072 max mem: 33300 Epoch: [16] [4170/4276] eta: 0:05:29 lr: 3.0415022639537468e-05 loss: 0.1445 (0.1322) time: 2.9627 data: 0.0071 max mem: 33300 Epoch: [16] [4180/4276] eta: 0:04:58 lr: 3.041224226114382e-05 loss: 0.1317 (0.1322) time: 2.9477 data: 0.0068 max mem: 33300 Epoch: [16] [4190/4276] eta: 0:04:27 lr: 3.040946185450649e-05 loss: 0.1260 (0.1322) time: 2.9800 data: 0.0070 max mem: 33300 Epoch: [16] [4200/4276] eta: 0:03:56 lr: 3.0406681419622324e-05 loss: 0.1380 (0.1322) time: 2.9770 data: 0.0071 max mem: 33300 Epoch: [16] [4210/4276] eta: 0:03:25 lr: 3.040390095648817e-05 loss: 0.1394 (0.1322) time: 2.9473 data: 0.0074 max mem: 33300 Epoch: [16] [4220/4276] eta: 0:02:54 lr: 3.040112046510087e-05 loss: 0.1394 (0.1323) time: 2.9809 data: 0.0081 max mem: 33300 Epoch: [16] [4230/4276] eta: 0:02:23 lr: 3.0398339945457254e-05 loss: 0.1412 (0.1323) time: 2.9782 data: 0.0080 max mem: 33300 Epoch: [16] [4240/4276] eta: 0:01:51 lr: 3.039555939755418e-05 loss: 0.1386 (0.1323) time: 2.9646 data: 0.0072 max mem: 33300 Epoch: [16] [4250/4276] eta: 0:01:20 lr: 3.0392778821388467e-05 loss: 0.1390 (0.1324) time: 3.0096 data: 0.0072 max mem: 33300 Epoch: [16] [4260/4276] eta: 0:00:49 lr: 3.0389998216956973e-05 loss: 0.1274 (0.1324) time: 3.0322 data: 0.0072 max mem: 33300 Epoch: [16] [4270/4276] eta: 0:00:18 lr: 3.0387217584256526e-05 loss: 0.1350 (0.1324) time: 3.0068 data: 0.0071 max mem: 33300 Epoch: [16] Total time: 3:41:34 Test: [ 0/21770] eta: 11:00:42 time: 1.8210 data: 1.7333 max mem: 33300 Test: [ 100/21770] eta: 0:20:01 time: 0.0378 data: 0.0010 max mem: 33300 Test: [ 200/21770] eta: 0:16:47 
time: 0.0376 data: 0.0008 max mem: 33300 Test: [ 300/21770] eta: 0:15:40 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 400/21770] eta: 0:15:06 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 500/21770] eta: 0:14:43 time: 0.0380 data: 0.0010 max mem: 33300 Test: [ 600/21770] eta: 0:14:26 time: 0.0379 data: 0.0010 max mem: 33300 Test: [ 700/21770] eta: 0:14:13 time: 0.0383 data: 0.0011 max mem: 33300 Test: [ 800/21770] eta: 0:14:02 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 900/21770] eta: 0:13:52 time: 0.0377 data: 0.0009 max mem: 33300 Test: [ 1000/21770] eta: 0:13:44 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 1100/21770] eta: 0:13:37 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 1200/21770] eta: 0:13:30 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 1300/21770] eta: 0:13:24 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 1400/21770] eta: 0:13:19 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 1500/21770] eta: 0:13:13 time: 0.0379 data: 0.0010 max mem: 33300 Test: [ 1600/21770] eta: 0:13:07 time: 0.0381 data: 0.0010 max mem: 33300 Test: [ 1700/21770] eta: 0:13:02 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 1800/21770] eta: 0:12:58 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 1900/21770] eta: 0:12:53 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 2000/21770] eta: 0:12:48 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 2100/21770] eta: 0:12:44 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 2200/21770] eta: 0:12:39 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 2300/21770] eta: 0:12:35 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 2400/21770] eta: 0:12:31 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 2500/21770] eta: 0:12:27 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 2600/21770] eta: 0:12:23 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 2700/21770] eta: 0:12:19 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 2800/21770] eta: 0:12:15 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 2900/21770] eta: 0:12:11 time: 
0.0389 data: 0.0009 max mem: 33300 Test: [ 3000/21770] eta: 0:12:07 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 3100/21770] eta: 0:12:03 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 3200/21770] eta: 0:11:58 time: 0.0384 data: 0.0010 max mem: 33300 Test: [ 3300/21770] eta: 0:11:55 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 3400/21770] eta: 0:11:51 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 3500/21770] eta: 0:11:47 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 3600/21770] eta: 0:11:43 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 3700/21770] eta: 0:11:39 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 3800/21770] eta: 0:11:35 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 3900/21770] eta: 0:11:31 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 4000/21770] eta: 0:11:27 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 4100/21770] eta: 0:11:23 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 4200/21770] eta: 0:11:19 time: 0.0387 data: 0.0010 max mem: 33300 Test: [ 4300/21770] eta: 0:11:15 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 4400/21770] eta: 0:11:11 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 4500/21770] eta: 0:11:07 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 4600/21770] eta: 0:11:03 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 4700/21770] eta: 0:11:00 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 4800/21770] eta: 0:10:56 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 4900/21770] eta: 0:10:52 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 5000/21770] eta: 0:10:48 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 5100/21770] eta: 0:10:44 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 5200/21770] eta: 0:10:40 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 5300/21770] eta: 0:10:36 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 5400/21770] eta: 0:10:32 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 5500/21770] eta: 0:10:29 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 5600/21770] eta: 0:10:25 time: 
0.0381 data: 0.0009 max mem: 33300 Test: [ 5700/21770] eta: 0:10:21 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 5800/21770] eta: 0:10:17 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 5900/21770] eta: 0:10:13 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 6000/21770] eta: 0:10:09 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 6100/21770] eta: 0:10:05 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 6200/21770] eta: 0:10:01 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 6300/21770] eta: 0:09:57 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 6400/21770] eta: 0:09:53 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 6500/21770] eta: 0:09:50 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 6600/21770] eta: 0:09:46 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 6700/21770] eta: 0:09:42 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 6800/21770] eta: 0:09:38 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 6900/21770] eta: 0:09:34 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 7000/21770] eta: 0:09:30 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 7100/21770] eta: 0:09:26 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 7200/21770] eta: 0:09:23 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 7300/21770] eta: 0:09:19 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 7400/21770] eta: 0:09:15 time: 0.0377 data: 0.0009 max mem: 33300 Test: [ 7500/21770] eta: 0:09:11 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 7600/21770] eta: 0:09:07 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 7700/21770] eta: 0:09:03 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 7800/21770] eta: 0:08:59 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 7900/21770] eta: 0:08:55 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 8000/21770] eta: 0:08:51 time: 0.0387 data: 0.0008 max mem: 33300 Test: [ 8100/21770] eta: 0:08:47 time: 0.0386 data: 0.0008 max mem: 33300 Test: [ 8200/21770] eta: 0:08:43 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 8300/21770] eta: 0:08:40 time: 
0.0384 data: 0.0008 max mem: 33300 Test: [ 8400/21770] eta: 0:08:36 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 8500/21770] eta: 0:08:32 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 8600/21770] eta: 0:08:28 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 8700/21770] eta: 0:08:24 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 8800/21770] eta: 0:08:20 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 8900/21770] eta: 0:08:16 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 9000/21770] eta: 0:08:12 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 9100/21770] eta: 0:08:09 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 9200/21770] eta: 0:08:05 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 9300/21770] eta: 0:08:01 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 9400/21770] eta: 0:07:57 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 9500/21770] eta: 0:07:53 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 9600/21770] eta: 0:07:49 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 9700/21770] eta: 0:07:45 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 9800/21770] eta: 0:07:42 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 9900/21770] eta: 0:07:38 time: 0.0382 data: 0.0010 max mem: 33300 Test: [10000/21770] eta: 0:07:34 time: 0.0388 data: 0.0009 max mem: 33300 Test: [10100/21770] eta: 0:07:30 time: 0.0387 data: 0.0010 max mem: 33300 Test: [10200/21770] eta: 0:07:26 time: 0.0392 data: 0.0010 max mem: 33300 Test: [10300/21770] eta: 0:07:22 time: 0.0389 data: 0.0009 max mem: 33300 Test: [10400/21770] eta: 0:07:19 time: 0.0398 data: 0.0009 max mem: 33300 Test: [10500/21770] eta: 0:07:15 time: 0.0389 data: 0.0010 max mem: 33300 Test: [10600/21770] eta: 0:07:11 time: 0.0382 data: 0.0009 max mem: 33300 Test: [10700/21770] eta: 0:07:07 time: 0.0380 data: 0.0009 max mem: 33300 Test: [10800/21770] eta: 0:07:03 time: 0.0378 data: 0.0008 max mem: 33300 Test: [10900/21770] eta: 0:06:59 time: 0.0381 data: 0.0009 max mem: 33300 Test: [11000/21770] eta: 0:06:55 time: 
0.0379 data: 0.0009 max mem: 33300 Test: [11100/21770] eta: 0:06:51 time: 0.0381 data: 0.0009 max mem: 33300 Test: [11200/21770] eta: 0:06:47 time: 0.0379 data: 0.0009 max mem: 33300 Test: [11300/21770] eta: 0:06:43 time: 0.0380 data: 0.0009 max mem: 33300 Test: [11400/21770] eta: 0:06:39 time: 0.0379 data: 0.0008 max mem: 33300 Test: [11500/21770] eta: 0:06:36 time: 0.0381 data: 0.0009 max mem: 33300 Test: [11600/21770] eta: 0:06:32 time: 0.0381 data: 0.0009 max mem: 33300 Test: [11700/21770] eta: 0:06:28 time: 0.0384 data: 0.0009 max mem: 33300 Test: [11800/21770] eta: 0:06:24 time: 0.0382 data: 0.0009 max mem: 33300 Test: [11900/21770] eta: 0:06:20 time: 0.0383 data: 0.0009 max mem: 33300 Test: [12000/21770] eta: 0:06:16 time: 0.0380 data: 0.0008 max mem: 33300 Test: [12100/21770] eta: 0:06:12 time: 0.0378 data: 0.0008 max mem: 33300 Test: [12200/21770] eta: 0:06:08 time: 0.0380 data: 0.0009 max mem: 33300 Test: [12300/21770] eta: 0:06:04 time: 0.0381 data: 0.0009 max mem: 33300 Test: [12400/21770] eta: 0:06:01 time: 0.0381 data: 0.0009 max mem: 33300 Test: [12500/21770] eta: 0:05:57 time: 0.0382 data: 0.0009 max mem: 33300 Test: [12600/21770] eta: 0:05:53 time: 0.0380 data: 0.0009 max mem: 33300 Test: [12700/21770] eta: 0:05:49 time: 0.0381 data: 0.0009 max mem: 33300 Test: [12800/21770] eta: 0:05:45 time: 0.0379 data: 0.0009 max mem: 33300 Test: [12900/21770] eta: 0:05:41 time: 0.0380 data: 0.0009 max mem: 33300 Test: [13000/21770] eta: 0:05:37 time: 0.0381 data: 0.0009 max mem: 33300 Test: [13100/21770] eta: 0:05:33 time: 0.0382 data: 0.0009 max mem: 33300 Test: [13200/21770] eta: 0:05:30 time: 0.0387 data: 0.0009 max mem: 33300 Test: [13300/21770] eta: 0:05:26 time: 0.0396 data: 0.0010 max mem: 33300 Test: [13400/21770] eta: 0:05:22 time: 0.0391 data: 0.0010 max mem: 33300 Test: [13500/21770] eta: 0:05:18 time: 0.0390 data: 0.0010 max mem: 33300 Test: [13600/21770] eta: 0:05:14 time: 0.0383 data: 0.0009 max mem: 33300 Test: [13700/21770] eta: 0:05:10 time: 
0.0390 data: 0.0009 max mem: 33300 Test: [13800/21770] eta: 0:05:07 time: 0.0389 data: 0.0009 max mem: 33300 Test: [13900/21770] eta: 0:05:03 time: 0.0390 data: 0.0009 max mem: 33300 Test: [14000/21770] eta: 0:04:59 time: 0.0389 data: 0.0009 max mem: 33300 Test: [14100/21770] eta: 0:04:55 time: 0.0392 data: 0.0009 max mem: 33300 Test: [14200/21770] eta: 0:04:51 time: 0.0389 data: 0.0009 max mem: 33300 Test: [14300/21770] eta: 0:04:47 time: 0.0394 data: 0.0010 max mem: 33300 Test: [14400/21770] eta: 0:04:44 time: 0.0389 data: 0.0009 max mem: 33300 Test: [14500/21770] eta: 0:04:40 time: 0.0389 data: 0.0010 max mem: 33300 Test: [14600/21770] eta: 0:04:36 time: 0.0384 data: 0.0008 max mem: 33300 Test: [14700/21770] eta: 0:04:32 time: 0.0386 data: 0.0008 max mem: 33300 Test: [14800/21770] eta: 0:04:28 time: 0.0382 data: 0.0008 max mem: 33300 Test: [14900/21770] eta: 0:04:24 time: 0.0393 data: 0.0008 max mem: 33300 Test: [15000/21770] eta: 0:04:20 time: 0.0385 data: 0.0009 max mem: 33300 Test: [15100/21770] eta: 0:04:17 time: 0.0390 data: 0.0008 max mem: 33300 Test: [15200/21770] eta: 0:04:13 time: 0.0388 data: 0.0008 max mem: 33300 Test: [15300/21770] eta: 0:04:09 time: 0.0387 data: 0.0008 max mem: 33300 Test: [15400/21770] eta: 0:04:05 time: 0.0382 data: 0.0008 max mem: 33300 Test: [15500/21770] eta: 0:04:01 time: 0.0383 data: 0.0008 max mem: 33300 Test: [15600/21770] eta: 0:03:57 time: 0.0384 data: 0.0009 max mem: 33300 Test: [15700/21770] eta: 0:03:54 time: 0.0394 data: 0.0009 max mem: 33300 Test: [15800/21770] eta: 0:03:50 time: 0.0396 data: 0.0009 max mem: 33300 Test: [15900/21770] eta: 0:03:46 time: 0.0395 data: 0.0009 max mem: 33300 Test: [16000/21770] eta: 0:03:42 time: 0.0396 data: 0.0010 max mem: 33300 Test: [16100/21770] eta: 0:03:38 time: 0.0388 data: 0.0009 max mem: 33300 Test: [16200/21770] eta: 0:03:34 time: 0.0382 data: 0.0009 max mem: 33300 Test: [16300/21770] eta: 0:03:31 time: 0.0382 data: 0.0009 max mem: 33300 Test: [16400/21770] eta: 0:03:27 time: 
0.0380 data: 0.0009 max mem: 33300 Test: [16500/21770] eta: 0:03:23 time: 0.0383 data: 0.0009 max mem: 33300 Test: [16600/21770] eta: 0:03:19 time: 0.0381 data: 0.0009 max mem: 33300 Test: [16700/21770] eta: 0:03:15 time: 0.0378 data: 0.0008 max mem: 33300 Test: [16800/21770] eta: 0:03:11 time: 0.0380 data: 0.0009 max mem: 33300 Test: [16900/21770] eta: 0:03:07 time: 0.0382 data: 0.0009 max mem: 33300 Test: [17000/21770] eta: 0:03:03 time: 0.0378 data: 0.0008 max mem: 33300 Test: [17100/21770] eta: 0:03:00 time: 0.0382 data: 0.0008 max mem: 33300 Test: [17200/21770] eta: 0:02:56 time: 0.0382 data: 0.0009 max mem: 33300 Test: [17300/21770] eta: 0:02:52 time: 0.0380 data: 0.0008 max mem: 33300 Test: [17400/21770] eta: 0:02:48 time: 0.0379 data: 0.0008 max mem: 33300 Test: [17500/21770] eta: 0:02:44 time: 0.0381 data: 0.0008 max mem: 33300 Test: [17600/21770] eta: 0:02:40 time: 0.0378 data: 0.0008 max mem: 33300 Test: [17700/21770] eta: 0:02:36 time: 0.0380 data: 0.0008 max mem: 33300 Test: [17800/21770] eta: 0:02:32 time: 0.0379 data: 0.0008 max mem: 33300 Test: [17900/21770] eta: 0:02:29 time: 0.0381 data: 0.0009 max mem: 33300 Test: [18000/21770] eta: 0:02:25 time: 0.0385 data: 0.0010 max mem: 33300 Test: [18100/21770] eta: 0:02:21 time: 0.0384 data: 0.0009 max mem: 33300 Test: [18200/21770] eta: 0:02:17 time: 0.0382 data: 0.0009 max mem: 33300 Test: [18300/21770] eta: 0:02:13 time: 0.0382 data: 0.0008 max mem: 33300 Test: [18400/21770] eta: 0:02:09 time: 0.0379 data: 0.0008 max mem: 33300 Test: [18500/21770] eta: 0:02:05 time: 0.0379 data: 0.0009 max mem: 33300 Test: [18600/21770] eta: 0:02:02 time: 0.0377 data: 0.0008 max mem: 33300 Test: [18700/21770] eta: 0:01:58 time: 0.0377 data: 0.0008 max mem: 33300 Test: [18800/21770] eta: 0:01:54 time: 0.0378 data: 0.0008 max mem: 33300 Test: [18900/21770] eta: 0:01:50 time: 0.0379 data: 0.0009 max mem: 33300 Test: [19000/21770] eta: 0:01:46 time: 0.0379 data: 0.0008 max mem: 33300 Test: [19100/21770] eta: 0:01:42 time: 
0.0377 data: 0.0009 max mem: 33300 Test: [19200/21770] eta: 0:01:38 time: 0.0377 data: 0.0009 max mem: 33300 Test: [19300/21770] eta: 0:01:35 time: 0.0381 data: 0.0008 max mem: 33300 Test: [19400/21770] eta: 0:01:31 time: 0.0382 data: 0.0009 max mem: 33300 Test: [19500/21770] eta: 0:01:27 time: 0.0380 data: 0.0008 max mem: 33300 Test: [19600/21770] eta: 0:01:23 time: 0.0381 data: 0.0009 max mem: 33300 Test: [19700/21770] eta: 0:01:19 time: 0.0378 data: 0.0008 max mem: 33300 Test: [19800/21770] eta: 0:01:15 time: 0.0381 data: 0.0009 max mem: 33300 Test: [19900/21770] eta: 0:01:11 time: 0.0378 data: 0.0008 max mem: 33300 Test: [20000/21770] eta: 0:01:08 time: 0.0380 data: 0.0009 max mem: 33300 Test: [20100/21770] eta: 0:01:04 time: 0.0383 data: 0.0009 max mem: 33300 Test: [20200/21770] eta: 0:01:00 time: 0.0380 data: 0.0008 max mem: 33300 Test: [20300/21770] eta: 0:00:56 time: 0.0380 data: 0.0008 max mem: 33300 Test: [20400/21770] eta: 0:00:52 time: 0.0382 data: 0.0009 max mem: 33300 Test: [20500/21770] eta: 0:00:48 time: 0.0377 data: 0.0008 max mem: 33300 Test: [20600/21770] eta: 0:00:44 time: 0.0380 data: 0.0009 max mem: 33300 Test: [20700/21770] eta: 0:00:41 time: 0.0378 data: 0.0008 max mem: 33300 Test: [20800/21770] eta: 0:00:37 time: 0.0378 data: 0.0008 max mem: 33300 Test: [20900/21770] eta: 0:00:33 time: 0.0387 data: 0.0008 max mem: 33300 Test: [21000/21770] eta: 0:00:29 time: 0.0394 data: 0.0008 max mem: 33300 Test: [21100/21770] eta: 0:00:25 time: 0.0391 data: 0.0008 max mem: 33300 Test: [21200/21770] eta: 0:00:21 time: 0.0395 data: 0.0008 max mem: 33300 Test: [21300/21770] eta: 0:00:18 time: 0.0393 data: 0.0008 max mem: 33300 Test: [21400/21770] eta: 0:00:14 time: 0.0391 data: 0.0009 max mem: 33300 Test: [21500/21770] eta: 0:00:10 time: 0.0391 data: 0.0008 max mem: 33300 Test: [21600/21770] eta: 0:00:06 time: 0.0392 data: 0.0008 max mem: 33300 Test: [21700/21770] eta: 0:00:02 time: 0.0392 data: 0.0008 max mem: 33300 Test: Total time: 0:13:57 Final results: 
Mean IoU is 0.00 precision@0.5 = 0.00 precision@0.6 = 0.00 precision@0.7 = 0.00 precision@0.8 = 0.00 precision@0.9 = 0.00 overall IoU = 0.00 mean IoU = 0.00 Mean accuracy for one-to-zero sample is 100.00 Average object IoU 0.0 Overall IoU 0.0 Epoch: [17] [ 0/4276] eta: 6:03:33 lr: 3.038554919106585e-05 loss: 0.0995 (0.0995) time: 5.1015 data: 2.0752 max mem: 33300 Epoch: [17] [ 10/4276] eta: 3:42:07 lr: 3.0382768513128516e-05 loss: 0.1260 (0.1377) time: 3.1240 data: 0.1947 max mem: 33300 Epoch: [17] [ 20/4276] eta: 3:34:52 lr: 3.0379987806914006e-05 loss: 0.1260 (0.1357) time: 2.9257 data: 0.0071 max mem: 33300 Epoch: [17] [ 30/4276] eta: 3:31:40 lr: 3.0377207072419166e-05 loss: 0.1232 (0.1361) time: 2.9182 data: 0.0073 max mem: 33300 Epoch: [17] [ 40/4276] eta: 3:29:43 lr: 3.0374426309640825e-05 loss: 0.1253 (0.1355) time: 2.9088 data: 0.0069 max mem: 33300 Epoch: [17] [ 50/4276] eta: 3:28:14 lr: 3.0371645518575826e-05 loss: 0.1277 (0.1331) time: 2.9027 data: 0.0070 max mem: 33300 Epoch: [17] [ 60/4276] eta: 3:26:54 lr: 3.0368864699220993e-05 loss: 0.1148 (0.1329) time: 2.8914 data: 0.0072 max mem: 33300 Epoch: [17] [ 70/4276] eta: 3:25:54 lr: 3.0366083851573173e-05 loss: 0.1217 (0.1314) time: 2.8889 data: 0.0070 max mem: 33300 Epoch: [17] [ 80/4276] eta: 3:24:58 lr: 3.0363302975629194e-05 loss: 0.1283 (0.1318) time: 2.8899 data: 0.0069 max mem: 33300 Epoch: [17] [ 90/4276] eta: 3:24:09 lr: 3.036052207138588e-05 loss: 0.1300 (0.1308) time: 2.8868 data: 0.0071 max mem: 33300 Epoch: [17] [ 100/4276] eta: 3:23:24 lr: 3.035774113884007e-05 loss: 0.1310 (0.1326) time: 2.8878 data: 0.0071 max mem: 33300 Epoch: [17] [ 110/4276] eta: 3:22:41 lr: 3.035496017798859e-05 loss: 0.1481 (0.1335) time: 2.8868 data: 0.0074 max mem: 33300 Epoch: [17] [ 120/4276] eta: 3:22:09 lr: 3.0352179188828282e-05 loss: 0.1309 (0.1332) time: 2.8985 data: 0.0073 max mem: 33300 Epoch: [17] [ 130/4276] eta: 3:21:30 lr: 3.0349398171355964e-05 loss: 0.1195 (0.1338) time: 2.9001 data: 0.0071 max mem: 
33300 Epoch: [17] [ 140/4276] eta: 3:21:05 lr: 3.034661712556848e-05 loss: 0.1195 (0.1331) time: 2.9090 data: 0.0072 max mem: 33300 Epoch: [17] [ 150/4276] eta: 3:20:44 lr: 3.0343836051462643e-05 loss: 0.1200 (0.1328) time: 2.9383 data: 0.0069 max mem: 33300 Epoch: [17] [ 160/4276] eta: 3:20:22 lr: 3.0341054949035285e-05 loss: 0.1200 (0.1324) time: 2.9482 data: 0.0071 max mem: 33300 Epoch: [17] [ 170/4276] eta: 3:20:13 lr: 3.0338273818283236e-05 loss: 0.1178 (0.1322) time: 2.9760 data: 0.0073 max mem: 33300 Epoch: [17] [ 180/4276] eta: 3:20:06 lr: 3.033549265920332e-05 loss: 0.1334 (0.1323) time: 3.0129 data: 0.0073 max mem: 33300 Epoch: [17] [ 190/4276] eta: 3:19:47 lr: 3.0332711471792356e-05 loss: 0.1378 (0.1326) time: 3.0024 data: 0.0067 max mem: 33300 Epoch: [17] [ 200/4276] eta: 3:19:06 lr: 3.0329930256047185e-05 loss: 0.1453 (0.1329) time: 2.9294 data: 0.0058 max mem: 33300 Epoch: [17] [ 210/4276] eta: 3:18:33 lr: 3.0327149011964622e-05 loss: 0.1337 (0.1328) time: 2.8923 data: 0.0064 max mem: 33300 Epoch: [17] [ 220/4276] eta: 3:18:05 lr: 3.0324367739541492e-05 loss: 0.1305 (0.1330) time: 2.9234 data: 0.0076 max mem: 33300 Epoch: [17] [ 230/4276] eta: 3:17:30 lr: 3.032158643877462e-05 loss: 0.1137 (0.1323) time: 2.9186 data: 0.0076 max mem: 33300 Epoch: [17] [ 240/4276] eta: 3:16:56 lr: 3.0318805109660825e-05 loss: 0.1254 (0.1327) time: 2.8999 data: 0.0073 max mem: 33300 Epoch: [17] [ 250/4276] eta: 3:16:54 lr: 3.0316023752196925e-05 loss: 0.1539 (0.1336) time: 2.9977 data: 0.0073 max mem: 33300 Epoch: [17] [ 260/4276] eta: 3:16:28 lr: 3.0313242366379746e-05 loss: 0.1395 (0.1337) time: 3.0278 data: 0.0071 max mem: 33300 Epoch: [17] [ 270/4276] eta: 3:16:23 lr: 3.031046095220611e-05 loss: 0.1312 (0.1332) time: 3.0293 data: 0.0076 max mem: 33300 Epoch: [17] [ 280/4276] eta: 3:15:48 lr: 3.0307679509672837e-05 loss: 0.1198 (0.1331) time: 3.0009 data: 0.0078 max mem: 33300 Epoch: [17] [ 290/4276] eta: 3:15:17 lr: 3.0304898038776734e-05 loss: 0.1198 (0.1333) time: 
2.9143 data: 0.0072 max mem: 33300 Epoch: [17] [ 300/4276] eta: 3:14:40 lr: 3.0302116539514647e-05 loss: 0.1261 (0.1332) time: 2.9057 data: 0.0070 max mem: 33300 Epoch: [17] [ 310/4276] eta: 3:14:09 lr: 3.0299335011883363e-05 loss: 0.1233 (0.1327) time: 2.9046 data: 0.0072 max mem: 33300 Epoch: [17] [ 320/4276] eta: 3:13:35 lr: 3.029655345587971e-05 loss: 0.1233 (0.1329) time: 2.9102 data: 0.0076 max mem: 33300 Epoch: [17] [ 330/4276] eta: 3:13:00 lr: 3.029377187150051e-05 loss: 0.1329 (0.1330) time: 2.8929 data: 0.0073 max mem: 33300 Epoch: [17] [ 340/4276] eta: 3:12:30 lr: 3.029099025874257e-05 loss: 0.1244 (0.1327) time: 2.9117 data: 0.0072 max mem: 33300 Epoch: [17] [ 350/4276] eta: 3:12:02 lr: 3.028820861760271e-05 loss: 0.1244 (0.1325) time: 2.9411 data: 0.0079 max mem: 33300 Epoch: [17] [ 360/4276] eta: 3:11:35 lr: 3.0285426948077745e-05 loss: 0.1366 (0.1331) time: 2.9483 data: 0.0077 max mem: 33300 Epoch: [17] [ 370/4276] eta: 3:11:18 lr: 3.0282645250164486e-05 loss: 0.1393 (0.1328) time: 3.0048 data: 0.0074 max mem: 33300 Epoch: [17] [ 380/4276] eta: 3:10:47 lr: 3.027986352385974e-05 loss: 0.1228 (0.1328) time: 2.9915 data: 0.0073 max mem: 33300 Epoch: [17] [ 390/4276] eta: 3:10:19 lr: 3.0277081769160336e-05 loss: 0.1308 (0.1329) time: 2.9354 data: 0.0070 max mem: 33300 Epoch: [17] [ 400/4276] eta: 3:10:04 lr: 3.027429998606307e-05 loss: 0.1317 (0.1330) time: 3.0190 data: 0.0079 max mem: 33300 Epoch: [17] [ 410/4276] eta: 3:09:35 lr: 3.0271518174564755e-05 loss: 0.1193 (0.1328) time: 3.0155 data: 0.0080 max mem: 33300 Epoch: [17] [ 420/4276] eta: 3:09:10 lr: 3.0268736334662208e-05 loss: 0.1193 (0.1328) time: 2.9691 data: 0.0073 max mem: 33300 Epoch: [17] [ 430/4276] eta: 3:08:47 lr: 3.0265954466352226e-05 loss: 0.1238 (0.1328) time: 3.0052 data: 0.0074 max mem: 33300 Epoch: [17] [ 440/4276] eta: 3:08:13 lr: 3.026317256963163e-05 loss: 0.1144 (0.1325) time: 2.9541 data: 0.0076 max mem: 33300 Epoch: [17] [ 450/4276] eta: 3:07:41 lr: 3.0260390644497227e-05 
loss: 0.1178 (0.1324) time: 2.9018 data: 0.0075 max mem: 33300 Epoch: [17] [ 460/4276] eta: 3:07:07 lr: 3.025760869094582e-05 loss: 0.1176 (0.1320) time: 2.8993 data: 0.0074 max mem: 33300 Epoch: [17] [ 470/4276] eta: 3:06:33 lr: 3.0254826708974215e-05 loss: 0.1112 (0.1316) time: 2.8898 data: 0.0072 max mem: 33300 Epoch: [17] [ 480/4276] eta: 3:06:00 lr: 3.0252044698579217e-05 loss: 0.1207 (0.1315) time: 2.8902 data: 0.0072 max mem: 33300 Epoch: [17] [ 490/4276] eta: 3:05:27 lr: 3.0249262659757637e-05 loss: 0.1171 (0.1311) time: 2.8938 data: 0.0074 max mem: 33300 Epoch: [17] [ 500/4276] eta: 3:04:54 lr: 3.0246480592506277e-05 loss: 0.1097 (0.1309) time: 2.8908 data: 0.0072 max mem: 33300 Epoch: [17] [ 510/4276] eta: 3:04:20 lr: 3.0243698496821938e-05 loss: 0.1077 (0.1306) time: 2.8840 data: 0.0074 max mem: 33300 Epoch: [17] [ 520/4276] eta: 3:03:48 lr: 3.0240916372701423e-05 loss: 0.1125 (0.1304) time: 2.8871 data: 0.0076 max mem: 33300 Epoch: [17] [ 530/4276] eta: 3:03:15 lr: 3.0238134220141546e-05 loss: 0.1270 (0.1303) time: 2.8898 data: 0.0072 max mem: 33300 Epoch: [17] [ 540/4276] eta: 3:02:43 lr: 3.0235352039139093e-05 loss: 0.1250 (0.1301) time: 2.8965 data: 0.0072 max mem: 33300 Epoch: [17] [ 550/4276] eta: 3:02:14 lr: 3.0232569829690877e-05 loss: 0.1250 (0.1301) time: 2.9168 data: 0.0074 max mem: 33300 Epoch: [17] [ 560/4276] eta: 3:01:43 lr: 3.022978759179369e-05 loss: 0.1337 (0.1300) time: 2.9230 data: 0.0071 max mem: 33300 Epoch: [17] [ 570/4276] eta: 3:01:15 lr: 3.022700532544433e-05 loss: 0.1295 (0.1300) time: 2.9372 data: 0.0070 max mem: 33300 Epoch: [17] [ 580/4276] eta: 3:00:43 lr: 3.0224223030639608e-05 loss: 0.1251 (0.1300) time: 2.9219 data: 0.0072 max mem: 33300 Epoch: [17] [ 590/4276] eta: 3:00:10 lr: 3.0221440707376314e-05 loss: 0.1105 (0.1296) time: 2.8795 data: 0.0071 max mem: 33300 Epoch: [17] [ 600/4276] eta: 2:59:36 lr: 3.0218658355651243e-05 loss: 0.1105 (0.1295) time: 2.8700 data: 0.0069 max mem: 33300 Epoch: [17] [ 610/4276] eta: 
2:59:04 lr: 3.0215875975461206e-05 loss: 0.1274 (0.1293) time: 2.8712 data: 0.0069 max mem: 33300 Epoch: [17] [ 620/4276] eta: 2:58:31 lr: 3.0213093566802987e-05 loss: 0.1159 (0.1294) time: 2.8769 data: 0.0071 max mem: 33300 Epoch: [17] [ 630/4276] eta: 2:58:00 lr: 3.021031112967338e-05 loss: 0.1176 (0.1295) time: 2.8855 data: 0.0072 max mem: 33300 Epoch: [17] [ 640/4276] eta: 2:57:29 lr: 3.020752866406919e-05 loss: 0.1343 (0.1295) time: 2.8997 data: 0.0075 max mem: 33300 Epoch: [17] [ 650/4276] eta: 2:56:58 lr: 3.0204746169987198e-05 loss: 0.1295 (0.1295) time: 2.9047 data: 0.0076 max mem: 33300 Epoch: [17] [ 660/4276] eta: 2:56:27 lr: 3.0201963647424207e-05 loss: 0.1295 (0.1297) time: 2.8988 data: 0.0072 max mem: 33300 Epoch: [17] [ 670/4276] eta: 2:55:56 lr: 3.019918109637701e-05 loss: 0.1269 (0.1297) time: 2.8928 data: 0.0072 max mem: 33300 Epoch: [17] [ 680/4276] eta: 2:55:25 lr: 3.0196398516842407e-05 loss: 0.1182 (0.1296) time: 2.8949 data: 0.0076 max mem: 33300 Epoch: [17] [ 690/4276] eta: 2:54:55 lr: 3.019361590881717e-05 loss: 0.1153 (0.1295) time: 2.9015 data: 0.0074 max mem: 33300 Epoch: [17] [ 700/4276] eta: 2:54:26 lr: 3.0190833272298103e-05 loss: 0.1133 (0.1293) time: 2.9145 data: 0.0074 max mem: 33300 Epoch: [17] [ 710/4276] eta: 2:53:55 lr: 3.0188050607281986e-05 loss: 0.1189 (0.1294) time: 2.9098 data: 0.0076 max mem: 33300 Epoch: [17] [ 720/4276] eta: 2:53:23 lr: 3.0185267913765618e-05 loss: 0.1251 (0.1292) time: 2.8867 data: 0.0073 max mem: 33300 Epoch: [17] [ 730/4276] eta: 2:52:52 lr: 3.018248519174579e-05 loss: 0.1156 (0.1291) time: 2.8851 data: 0.0070 max mem: 33300 Epoch: [17] [ 740/4276] eta: 2:52:21 lr: 3.0179702441219283e-05 loss: 0.1233 (0.1290) time: 2.8869 data: 0.0069 max mem: 33300 Epoch: [17] [ 750/4276] eta: 2:51:50 lr: 3.0176919662182878e-05 loss: 0.1162 (0.1291) time: 2.8899 data: 0.0071 max mem: 33300 Epoch: [17] [ 760/4276] eta: 2:51:19 lr: 3.0174136854633382e-05 loss: 0.1126 (0.1290) time: 2.8880 data: 0.0072 max mem: 33300 
Epoch: [17] [ 770/4276] eta: 2:50:48 lr: 3.0171354018567567e-05 loss: 0.1172 (0.1290) time: 2.8816 data: 0.0070 max mem: 33300 Epoch: [17] [ 780/4276] eta: 2:50:17 lr: 3.0168571153982223e-05 loss: 0.1308 (0.1291) time: 2.8821 data: 0.0069 max mem: 33300 Epoch: [17] [ 790/4276] eta: 2:49:46 lr: 3.0165788260874122e-05 loss: 0.1338 (0.1292) time: 2.8805 data: 0.0069 max mem: 33300 Epoch: [17] [ 800/4276] eta: 2:49:16 lr: 3.0163005339240063e-05 loss: 0.1290 (0.1292) time: 2.8933 data: 0.0070 max mem: 33300 Epoch: [17] [ 810/4276] eta: 2:48:47 lr: 3.0160222389076825e-05 loss: 0.1290 (0.1292) time: 2.9226 data: 0.0078 max mem: 33300 Epoch: [17] [ 820/4276] eta: 2:48:19 lr: 3.015743941038119e-05 loss: 0.1224 (0.1290) time: 2.9363 data: 0.0083 max mem: 33300 Epoch: [17] [ 830/4276] eta: 2:47:50 lr: 3.0154656403149933e-05 loss: 0.1174 (0.1292) time: 2.9333 data: 0.0080 max mem: 33300 Epoch: [17] [ 840/4276] eta: 2:47:21 lr: 3.015187336737985e-05 loss: 0.1276 (0.1292) time: 2.9312 data: 0.0082 max mem: 33300 Epoch: [17] [ 850/4276] eta: 2:46:51 lr: 3.0149090303067713e-05 loss: 0.1207 (0.1291) time: 2.9193 data: 0.0079 max mem: 33300 Epoch: [17] [ 860/4276] eta: 2:46:21 lr: 3.01463072102103e-05 loss: 0.1233 (0.1291) time: 2.9009 data: 0.0072 max mem: 33300 Epoch: [17] [ 870/4276] eta: 2:45:51 lr: 3.0143524088804394e-05 loss: 0.1233 (0.1291) time: 2.9047 data: 0.0069 max mem: 33300 Epoch: [17] [ 880/4276] eta: 2:45:21 lr: 3.014074093884677e-05 loss: 0.1311 (0.1293) time: 2.9024 data: 0.0075 max mem: 33300 Epoch: [17] [ 890/4276] eta: 2:44:50 lr: 3.01379577603342e-05 loss: 0.1370 (0.1293) time: 2.8849 data: 0.0080 max mem: 33300 Epoch: [17] [ 900/4276] eta: 2:44:20 lr: 3.0135174553263474e-05 loss: 0.1264 (0.1293) time: 2.8874 data: 0.0077 max mem: 33300 Epoch: [17] [ 910/4276] eta: 2:43:50 lr: 3.013239131763136e-05 loss: 0.1369 (0.1295) time: 2.8944 data: 0.0075 max mem: 33300 Epoch: [17] [ 920/4276] eta: 2:43:20 lr: 3.0129608053434642e-05 loss: 0.1369 (0.1296) time: 2.8988 
data: 0.0081 max mem: 33300 Epoch: [17] [ 930/4276] eta: 2:42:51 lr: 3.0126824760670085e-05 loss: 0.1244 (0.1295) time: 2.9196 data: 0.0082 max mem: 33300 Epoch: [17] [ 940/4276] eta: 2:42:22 lr: 3.0124041439334472e-05 loss: 0.1222 (0.1294) time: 2.9358 data: 0.0075 max mem: 33300 Epoch: [17] [ 950/4276] eta: 2:41:54 lr: 3.0121258089424566e-05 loss: 0.1240 (0.1295) time: 2.9352 data: 0.0078 max mem: 33300 Epoch: [17] [ 960/4276] eta: 2:41:25 lr: 3.011847471093714e-05 loss: 0.1284 (0.1295) time: 2.9348 data: 0.0080 max mem: 33300 Epoch: [17] [ 970/4276] eta: 2:40:56 lr: 3.0115691303868976e-05 loss: 0.1257 (0.1295) time: 2.9267 data: 0.0074 max mem: 33300 Epoch: [17] [ 980/4276] eta: 2:40:25 lr: 3.0112907868216838e-05 loss: 0.1245 (0.1295) time: 2.9006 data: 0.0073 max mem: 33300 Epoch: [17] [ 990/4276] eta: 2:39:55 lr: 3.0110124403977508e-05 loss: 0.1240 (0.1295) time: 2.8821 data: 0.0071 max mem: 33300 Epoch: [17] [1000/4276] eta: 2:39:24 lr: 3.0107340911147742e-05 loss: 0.1191 (0.1295) time: 2.8798 data: 0.0071 max mem: 33300 Epoch: [17] [1010/4276] eta: 2:38:54 lr: 3.010455738972432e-05 loss: 0.1179 (0.1295) time: 2.8814 data: 0.0073 max mem: 33300 Epoch: [17] [1020/4276] eta: 2:38:23 lr: 3.0101773839703995e-05 loss: 0.1219 (0.1295) time: 2.8805 data: 0.0071 max mem: 33300 Epoch: [17] [1030/4276] eta: 2:37:53 lr: 3.0098990261083554e-05 loss: 0.1375 (0.1296) time: 2.8792 data: 0.0072 max mem: 33300 Epoch: [17] [1040/4276] eta: 2:37:22 lr: 3.009620665385975e-05 loss: 0.1348 (0.1296) time: 2.8794 data: 0.0076 max mem: 33300 Epoch: [17] [1050/4276] eta: 2:36:53 lr: 3.0093423018029356e-05 loss: 0.1301 (0.1297) time: 2.8883 data: 0.0075 max mem: 33300 Epoch: [17] [1060/4276] eta: 2:36:23 lr: 3.009063935358914e-05 loss: 0.1301 (0.1297) time: 2.8955 data: 0.0071 max mem: 33300 Epoch: [17] [1070/4276] eta: 2:35:52 lr: 3.0087855660535863e-05 loss: 0.1482 (0.1300) time: 2.8843 data: 0.0071 max mem: 33300 Epoch: [17] [1080/4276] eta: 2:35:22 lr: 3.0085071938866293e-05 loss: 
0.1455 (0.1300) time: 2.8766 data: 0.0073 max mem: 33300 Epoch: [17] [1090/4276] eta: 2:34:51 lr: 3.008228818857719e-05 loss: 0.1362 (0.1301) time: 2.8723 data: 0.0073 max mem: 33300 Epoch: [17] [1100/4276] eta: 2:34:21 lr: 3.007950440966532e-05 loss: 0.1296 (0.1301) time: 2.8688 data: 0.0075 max mem: 33300 Epoch: [17] [1110/4276] eta: 2:33:51 lr: 3.007672060212744e-05 loss: 0.1266 (0.1302) time: 2.8875 data: 0.0073 max mem: 33300 Epoch: [17] [1120/4276] eta: 2:33:23 lr: 3.0073936765960316e-05 loss: 0.1165 (0.1301) time: 2.9175 data: 0.0073 max mem: 33300 Epoch: [17] [1130/4276] eta: 2:32:54 lr: 3.0071152901160704e-05 loss: 0.1173 (0.1300) time: 2.9280 data: 0.0076 max mem: 33300 Epoch: [17] [1140/4276] eta: 2:32:25 lr: 3.006836900772537e-05 loss: 0.1184 (0.1299) time: 2.9269 data: 0.0075 max mem: 33300 Epoch: [17] [1150/4276] eta: 2:31:56 lr: 3.0065585085651082e-05 loss: 0.1254 (0.1299) time: 2.9293 data: 0.0077 max mem: 33300 Epoch: [17] [1160/4276] eta: 2:31:28 lr: 3.006280113493458e-05 loss: 0.1272 (0.1299) time: 2.9347 data: 0.0078 max mem: 33300 Epoch: [17] [1170/4276] eta: 2:30:59 lr: 3.006001715557264e-05 loss: 0.1311 (0.1299) time: 2.9347 data: 0.0076 max mem: 33300 Epoch: [17] [1180/4276] eta: 2:30:30 lr: 3.0057233147562004e-05 loss: 0.1228 (0.1298) time: 2.9328 data: 0.0075 max mem: 33300 Epoch: [17] [1190/4276] eta: 2:30:00 lr: 3.0054449110899434e-05 loss: 0.1204 (0.1298) time: 2.9107 data: 0.0077 max mem: 33300 Epoch: [17] [1200/4276] eta: 2:29:30 lr: 3.005166504558169e-05 loss: 0.1195 (0.1298) time: 2.8840 data: 0.0078 max mem: 33300 Epoch: [17] [1210/4276] eta: 2:29:01 lr: 3.0048880951605522e-05 loss: 0.1119 (0.1297) time: 2.8965 data: 0.0074 max mem: 33300 Epoch: [17] [1220/4276] eta: 2:28:32 lr: 3.0046096828967685e-05 loss: 0.1294 (0.1298) time: 2.9137 data: 0.0072 max mem: 33300 Epoch: [17] [1230/4276] eta: 2:28:03 lr: 3.004331267766495e-05 loss: 0.1209 (0.1298) time: 2.9268 data: 0.0074 max mem: 33300 Epoch: [17] [1240/4276] eta: 2:27:34 lr: 
3.0040528497694037e-05 loss: 0.1218 (0.1298) time: 2.9382 data: 0.0078 max mem: 33300 Epoch: [17] [1250/4276] eta: 2:27:06 lr: 3.0037744289051734e-05 loss: 0.1277 (0.1299) time: 2.9366 data: 0.0082 max mem: 33300 Epoch: [17] [1260/4276] eta: 2:26:36 lr: 3.0034960051734763e-05 loss: 0.1154 (0.1299) time: 2.9240 data: 0.0080 max mem: 33300 Epoch: [17] [1270/4276] eta: 2:26:07 lr: 3.003217578573989e-05 loss: 0.1191 (0.1298) time: 2.8999 data: 0.0078 max mem: 33300 Epoch: [17] [1280/4276] eta: 2:25:38 lr: 3.0029391491063868e-05 loss: 0.1245 (0.1299) time: 2.9086 data: 0.0077 max mem: 33300 Epoch: [17] [1290/4276] eta: 2:25:09 lr: 3.0026607167703434e-05 loss: 0.1320 (0.1299) time: 2.9319 data: 0.0075 max mem: 33300 Epoch: [17] [1300/4276] eta: 2:24:40 lr: 3.0023822815655356e-05 loss: 0.1190 (0.1299) time: 2.9213 data: 0.0072 max mem: 33300 Epoch: [17] [1310/4276] eta: 2:24:10 lr: 3.0021038434916365e-05 loss: 0.1091 (0.1298) time: 2.9096 data: 0.0071 max mem: 33300 Epoch: [17] [1320/4276] eta: 2:23:41 lr: 3.001825402548322e-05 loss: 0.1112 (0.1299) time: 2.9233 data: 0.0079 max mem: 33300 Epoch: [17] [1330/4276] eta: 2:23:12 lr: 3.0015469587352662e-05 loss: 0.1422 (0.1299) time: 2.9175 data: 0.0083 max mem: 33300 Epoch: [17] [1340/4276] eta: 2:22:42 lr: 3.0012685120521432e-05 loss: 0.1203 (0.1298) time: 2.8881 data: 0.0075 max mem: 33300 Epoch: [17] [1350/4276] eta: 2:22:12 lr: 3.0009900624986288e-05 loss: 0.1184 (0.1298) time: 2.8752 data: 0.0071 max mem: 33300 Epoch: [17] [1360/4276] eta: 2:21:42 lr: 3.0007116100743964e-05 loss: 0.1209 (0.1298) time: 2.8766 data: 0.0073 max mem: 33300 Epoch: [17] [1370/4276] eta: 2:21:12 lr: 3.0004331547791208e-05 loss: 0.1180 (0.1298) time: 2.8833 data: 0.0073 max mem: 33300 Epoch: [17] [1380/4276] eta: 2:20:43 lr: 3.0001546966124773e-05 loss: 0.1284 (0.1298) time: 2.8968 data: 0.0074 max mem: 33300 Epoch: [17] [1390/4276] eta: 2:20:14 lr: 2.9998762355741384e-05 loss: 0.1499 (0.1299) time: 2.9136 data: 0.0077 max mem: 33300 Epoch: 
[17] [1400/4276] eta: 2:19:45 lr: 2.99959777166378e-05 loss: 0.1326 (0.1299) time: 2.9346 data: 0.0076 max mem: 33300 Epoch: [17] [1410/4276] eta: 2:19:17 lr: 2.9993193048810748e-05 loss: 0.1183 (0.1298) time: 2.9441 data: 0.0076 max mem: 33300 Epoch: [17] [1420/4276] eta: 2:18:47 lr: 2.9990408352256973e-05 loss: 0.1098 (0.1297) time: 2.9212 data: 0.0079 max mem: 33300 Epoch: [17] [1430/4276] eta: 2:18:18 lr: 2.9987623626973215e-05 loss: 0.1197 (0.1297) time: 2.9004 data: 0.0078 max mem: 33300 Epoch: [17] [1440/4276] eta: 2:17:48 lr: 2.9984838872956223e-05 loss: 0.1241 (0.1297) time: 2.9064 data: 0.0078 max mem: 33300 Epoch: [17] [1450/4276] eta: 2:17:20 lr: 2.9982054090202717e-05 loss: 0.1252 (0.1298) time: 2.9295 data: 0.0082 max mem: 33300 Epoch: [17] [1460/4276] eta: 2:16:51 lr: 2.9979269278709454e-05 loss: 0.1252 (0.1297) time: 2.9442 data: 0.0082 max mem: 33300 Epoch: [17] [1470/4276] eta: 2:16:23 lr: 2.9976484438473157e-05 loss: 0.1199 (0.1297) time: 2.9450 data: 0.0081 max mem: 33300 Epoch: [17] [1480/4276] eta: 2:15:54 lr: 2.9973699569490572e-05 loss: 0.1186 (0.1297) time: 2.9499 data: 0.0087 max mem: 33300 Epoch: [17] [1490/4276] eta: 2:15:26 lr: 2.9970914671758428e-05 loss: 0.1167 (0.1296) time: 2.9503 data: 0.0086 max mem: 33300 Epoch: [17] [1500/4276] eta: 2:14:57 lr: 2.9968129745273453e-05 loss: 0.1167 (0.1296) time: 2.9471 data: 0.0081 max mem: 33300 Epoch: [17] [1510/4276] eta: 2:14:28 lr: 2.9965344790032397e-05 loss: 0.1159 (0.1295) time: 2.9470 data: 0.0082 max mem: 33300 Epoch: [17] [1520/4276] eta: 2:14:00 lr: 2.9962559806031987e-05 loss: 0.1083 (0.1294) time: 2.9453 data: 0.0082 max mem: 33300 Epoch: [17] [1530/4276] eta: 2:13:31 lr: 2.9959774793268954e-05 loss: 0.1083 (0.1293) time: 2.9442 data: 0.0082 max mem: 33300 Epoch: [17] [1540/4276] eta: 2:13:02 lr: 2.9956989751740034e-05 loss: 0.1148 (0.1293) time: 2.9445 data: 0.0089 max mem: 33300 Epoch: [17] [1550/4276] eta: 2:12:34 lr: 2.995420468144195e-05 loss: 0.1197 (0.1293) time: 2.9439 data: 
0.0089 max mem: 33300 Epoch: [17] [1560/4276] eta: 2:12:05 lr: 2.9951419582371443e-05 loss: 0.1206 (0.1292) time: 2.9455 data: 0.0087 max mem: 33300 Epoch: [17] [1570/4276] eta: 2:11:36 lr: 2.994863445452523e-05 loss: 0.1266 (0.1292) time: 2.9443 data: 0.0084 max mem: 33300 Epoch: [17] [1580/4276] eta: 2:11:07 lr: 2.994584929790006e-05 loss: 0.1174 (0.1291) time: 2.9416 data: 0.0081 max mem: 33300 Epoch: [17] [1590/4276] eta: 2:10:39 lr: 2.9943064112492643e-05 loss: 0.1174 (0.1292) time: 2.9410 data: 0.0082 max mem: 33300 Epoch: [17] [1600/4276] eta: 2:10:09 lr: 2.9940278898299713e-05 loss: 0.1406 (0.1292) time: 2.9115 data: 0.0078 max mem: 33300 Epoch: [17] [1610/4276] eta: 2:09:39 lr: 2.9937493655318004e-05 loss: 0.1208 (0.1291) time: 2.8867 data: 0.0077 max mem: 33300 Epoch: [17] [1620/4276] eta: 2:09:09 lr: 2.993470838354423e-05 loss: 0.1198 (0.1291) time: 2.8897 data: 0.0079 max mem: 33300 Epoch: [17] [1630/4276] eta: 2:08:41 lr: 2.9931923082975127e-05 loss: 0.1230 (0.1291) time: 2.9118 data: 0.0082 max mem: 33300 Epoch: [17] [1640/4276] eta: 2:08:12 lr: 2.9929137753607418e-05 loss: 0.1289 (0.1290) time: 2.9317 data: 0.0082 max mem: 33300 Epoch: [17] [1650/4276] eta: 2:07:43 lr: 2.9926352395437817e-05 loss: 0.1095 (0.1290) time: 2.9353 data: 0.0084 max mem: 33300 Epoch: [17] [1660/4276] eta: 2:07:14 lr: 2.992356700846306e-05 loss: 0.1145 (0.1289) time: 2.9427 data: 0.0086 max mem: 33300 Epoch: [17] [1670/4276] eta: 2:06:45 lr: 2.9920781592679864e-05 loss: 0.1160 (0.1288) time: 2.9440 data: 0.0079 max mem: 33300 Epoch: [17] [1680/4276] eta: 2:06:16 lr: 2.9917996148084954e-05 loss: 0.1267 (0.1289) time: 2.9446 data: 0.0081 max mem: 33300 Epoch: [17] [1690/4276] eta: 2:05:48 lr: 2.9915210674675053e-05 loss: 0.1192 (0.1288) time: 2.9424 data: 0.0084 max mem: 33300 Epoch: [17] [1700/4276] eta: 2:05:19 lr: 2.991242517244688e-05 loss: 0.1156 (0.1288) time: 2.9403 data: 0.0080 max mem: 33300 Epoch: [17] [1710/4276] eta: 2:04:50 lr: 2.9909639641397152e-05 loss: 0.1346 
(0.1288) time: 2.9412 data: 0.0080 max mem: 33300 Epoch: [17] [1720/4276] eta: 2:04:21 lr: 2.990685408152259e-05 loss: 0.1267 (0.1288) time: 2.9442 data: 0.0083 max mem: 33300 Epoch: [17] [1730/4276] eta: 2:03:52 lr: 2.990406849281991e-05 loss: 0.1249 (0.1288) time: 2.9464 data: 0.0085 max mem: 33300 Epoch: [17] [1740/4276] eta: 2:03:23 lr: 2.990128287528583e-05 loss: 0.1205 (0.1288) time: 2.9450 data: 0.0083 max mem: 33300 Epoch: [17] [1750/4276] eta: 2:02:55 lr: 2.989849722891707e-05 loss: 0.1160 (0.1288) time: 2.9457 data: 0.0081 max mem: 33300 Epoch: [17] [1760/4276] eta: 2:02:26 lr: 2.9895711553710353e-05 loss: 0.1160 (0.1288) time: 2.9462 data: 0.0080 max mem: 33300 Epoch: [17] [1770/4276] eta: 2:01:57 lr: 2.989292584966239e-05 loss: 0.1203 (0.1287) time: 2.9439 data: 0.0080 max mem: 33300 Epoch: [17] [1780/4276] eta: 2:01:28 lr: 2.989014011676988e-05 loss: 0.1203 (0.1287) time: 2.9414 data: 0.0082 max mem: 33300 Epoch: [17] [1790/4276] eta: 2:00:59 lr: 2.988735435502957e-05 loss: 0.1257 (0.1287) time: 2.9399 data: 0.0085 max mem: 33300 Epoch: [17] [1800/4276] eta: 2:00:30 lr: 2.9884568564438142e-05 loss: 0.1321 (0.1287) time: 2.9381 data: 0.0085 max mem: 33300 Epoch: [17] [1810/4276] eta: 2:00:01 lr: 2.9881782744992325e-05 loss: 0.1266 (0.1288) time: 2.9379 data: 0.0081 max mem: 33300 Epoch: [17] [1820/4276] eta: 1:59:32 lr: 2.9878996896688826e-05 loss: 0.1287 (0.1288) time: 2.9400 data: 0.0081 max mem: 33300 Epoch: [17] [1830/4276] eta: 1:59:03 lr: 2.9876211019524357e-05 loss: 0.1270 (0.1287) time: 2.9414 data: 0.0079 max mem: 33300 Epoch: [17] [1840/4276] eta: 1:58:34 lr: 2.9873425113495628e-05 loss: 0.1127 (0.1286) time: 2.9382 data: 0.0083 max mem: 33300 Epoch: [17] [1850/4276] eta: 1:58:05 lr: 2.9870639178599362e-05 loss: 0.1103 (0.1287) time: 2.9365 data: 0.0085 max mem: 33300 Epoch: [17] [1860/4276] eta: 1:57:36 lr: 2.986785321483225e-05 loss: 0.1236 (0.1287) time: 2.9370 data: 0.0082 max mem: 33300 Epoch: [17] [1870/4276] eta: 1:57:07 lr: 
2.9865067222191006e-05 loss: 0.1337 (0.1288) time: 2.9388 data: 0.0082 max mem: 33300 Epoch: [17] [1880/4276] eta: 1:56:38 lr: 2.9862281200672342e-05 loss: 0.1263 (0.1287) time: 2.9396 data: 0.0078 max mem: 33300 Epoch: [17] [1890/4276] eta: 1:56:09 lr: 2.9859495150272966e-05 loss: 0.1263 (0.1288) time: 2.9375 data: 0.0078 max mem: 33300 Epoch: [17] [1900/4276] eta: 1:55:40 lr: 2.9856709070989573e-05 loss: 0.1192 (0.1288) time: 2.9380 data: 0.0084 max mem: 33300 Epoch: [17] [1910/4276] eta: 1:55:11 lr: 2.985392296281888e-05 loss: 0.1195 (0.1288) time: 2.9389 data: 0.0083 max mem: 33300 Epoch: [17] [1920/4276] eta: 1:54:42 lr: 2.9851136825757592e-05 loss: 0.1224 (0.1287) time: 2.9411 data: 0.0077 max mem: 33300 Epoch: [17] [1930/4276] eta: 1:54:13 lr: 2.9848350659802403e-05 loss: 0.1269 (0.1287) time: 2.9429 data: 0.0076 max mem: 33300 Epoch: [17] [1940/4276] eta: 1:53:44 lr: 2.9845564464950038e-05 loss: 0.1269 (0.1287) time: 2.9414 data: 0.0079 max mem: 33300 Epoch: [17] [1950/4276] eta: 1:53:15 lr: 2.9842778241197173e-05 loss: 0.1280 (0.1288) time: 2.9389 data: 0.0081 max mem: 33300 Epoch: [17] [1960/4276] eta: 1:52:46 lr: 2.9839991988540522e-05 loss: 0.1313 (0.1287) time: 2.9381 data: 0.0079 max mem: 33300 Epoch: [17] [1970/4276] eta: 1:52:17 lr: 2.983720570697679e-05 loss: 0.1067 (0.1287) time: 2.9335 data: 0.0079 max mem: 33300 Epoch: [17] [1980/4276] eta: 1:51:48 lr: 2.983441939650267e-05 loss: 0.1146 (0.1286) time: 2.9369 data: 0.0080 max mem: 33300 Epoch: [17] [1990/4276] eta: 1:51:19 lr: 2.983163305711487e-05 loss: 0.1188 (0.1286) time: 2.9446 data: 0.0082 max mem: 33300 Epoch: [17] [2000/4276] eta: 1:50:50 lr: 2.9828846688810097e-05 loss: 0.1252 (0.1286) time: 2.9444 data: 0.0083 max mem: 33300 Epoch: [17] [2010/4276] eta: 1:50:21 lr: 2.9826060291585023e-05 loss: 0.1342 (0.1286) time: 2.9426 data: 0.0083 max mem: 33300 Epoch: [17] [2020/4276] eta: 1:49:52 lr: 2.9823273865436374e-05 loss: 0.1354 (0.1287) time: 2.9158 data: 0.0084 max mem: 33300 Epoch: [17] 
[2030/4276] eta: 1:49:22 lr: 2.982048741036082e-05 loss: 0.1196 (0.1286) time: 2.8882 data: 0.0079 max mem: 33300 Epoch: [17] [2040/4276] eta: 1:48:53 lr: 2.981770092635508e-05 loss: 0.1151 (0.1286) time: 2.8865 data: 0.0076 max mem: 33300 Epoch: [17] [2050/4276] eta: 1:48:23 lr: 2.9814914413415836e-05 loss: 0.1228 (0.1286) time: 2.8865 data: 0.0074 max mem: 33300 Epoch: [17] [2060/4276] eta: 1:47:53 lr: 2.981212787153979e-05 loss: 0.1233 (0.1286) time: 2.8867 data: 0.0076 max mem: 33300 Epoch: [17] [2070/4276] eta: 1:47:24 lr: 2.9809341300723637e-05 loss: 0.1122 (0.1285) time: 2.9083 data: 0.0077 max mem: 33300 Epoch: [17] [2080/4276] eta: 1:46:55 lr: 2.9806554700964068e-05 loss: 0.1122 (0.1285) time: 2.9358 data: 0.0075 max mem: 33300 Epoch: [17] [2090/4276] eta: 1:46:26 lr: 2.9803768072257772e-05 loss: 0.1248 (0.1285) time: 2.9440 data: 0.0077 max mem: 33300 Epoch: [17] [2100/4276] eta: 1:45:57 lr: 2.9800981414601448e-05 loss: 0.1248 (0.1286) time: 2.9460 data: 0.0081 max mem: 33300 Epoch: [17] [2110/4276] eta: 1:45:28 lr: 2.979819472799178e-05 loss: 0.1175 (0.1285) time: 2.9484 data: 0.0079 max mem: 33300 Epoch: [17] [2120/4276] eta: 1:44:59 lr: 2.9795408012425463e-05 loss: 0.1046 (0.1284) time: 2.9500 data: 0.0075 max mem: 33300 Epoch: [17] [2130/4276] eta: 1:44:31 lr: 2.9792621267899185e-05 loss: 0.1043 (0.1283) time: 2.9498 data: 0.0076 max mem: 33300 Epoch: [17] [2140/4276] eta: 1:44:02 lr: 2.9789834494409642e-05 loss: 0.1099 (0.1283) time: 2.9455 data: 0.0077 max mem: 33300 Epoch: [17] [2150/4276] eta: 1:43:32 lr: 2.9787047691953507e-05 loss: 0.1099 (0.1282) time: 2.9137 data: 0.0077 max mem: 33300 Epoch: [17] [2160/4276] eta: 1:43:02 lr: 2.978426086052749e-05 loss: 0.1117 (0.1281) time: 2.8857 data: 0.0077 max mem: 33300 Epoch: [17] [2170/4276] eta: 1:42:33 lr: 2.9781474000128263e-05 loss: 0.1330 (0.1282) time: 2.8879 data: 0.0076 max mem: 33300 Epoch: [17] [2180/4276] eta: 1:42:03 lr: 2.9778687110752507e-05 loss: 0.1407 (0.1283) time: 2.8906 data: 0.0076 
max mem: 33300 Epoch: [17] [2190/4276] eta: 1:41:34 lr: 2.977590019239692e-05 loss: 0.1341 (0.1283) time: 2.8974 data: 0.0076 max mem: 33300 Epoch: [17] [2200/4276] eta: 1:41:04 lr: 2.9773113245058175e-05 loss: 0.1255 (0.1283) time: 2.8949 data: 0.0070 max mem: 33300 Epoch: [17] [2210/4276] eta: 1:40:35 lr: 2.9770326268732973e-05 loss: 0.1265 (0.1283) time: 2.8849 data: 0.0072 max mem: 33300 Epoch: [17] [2220/4276] eta: 1:40:05 lr: 2.9767539263417983e-05 loss: 0.1317 (0.1283) time: 2.8792 data: 0.0076 max mem: 33300 Epoch: [17] [2230/4276] eta: 1:39:36 lr: 2.9764752229109892e-05 loss: 0.1219 (0.1283) time: 2.8930 data: 0.0073 max mem: 33300 Epoch: [17] [2240/4276] eta: 1:39:06 lr: 2.9761965165805383e-05 loss: 0.1138 (0.1282) time: 2.8974 data: 0.0072 max mem: 33300 Epoch: [17] [2250/4276] eta: 1:38:37 lr: 2.9759178073501136e-05 loss: 0.1114 (0.1282) time: 2.8876 data: 0.0073 max mem: 33300 Epoch: [17] [2260/4276] eta: 1:38:08 lr: 2.975639095219383e-05 loss: 0.1310 (0.1282) time: 2.9022 data: 0.0074 max mem: 33300 Epoch: [17] [2270/4276] eta: 1:37:39 lr: 2.975360380188015e-05 loss: 0.1253 (0.1282) time: 2.9281 data: 0.0079 max mem: 33300 Epoch: [17] [2280/4276] eta: 1:37:10 lr: 2.9750816622556765e-05 loss: 0.1253 (0.1283) time: 2.9425 data: 0.0080 max mem: 33300 Epoch: [17] [2290/4276] eta: 1:36:40 lr: 2.9748029414220365e-05 loss: 0.1258 (0.1282) time: 2.9404 data: 0.0079 max mem: 33300 Epoch: [17] [2300/4276] eta: 1:36:11 lr: 2.9745242176867617e-05 loss: 0.1199 (0.1282) time: 2.9397 data: 0.0084 max mem: 33300 Epoch: [17] [2310/4276] eta: 1:35:42 lr: 2.974245491049521e-05 loss: 0.1246 (0.1283) time: 2.9282 data: 0.0082 max mem: 33300 Epoch: [17] [2320/4276] eta: 1:35:13 lr: 2.9739667615099808e-05 loss: 0.1246 (0.1283) time: 2.9036 data: 0.0075 max mem: 33300 Epoch: [17] [2330/4276] eta: 1:34:44 lr: 2.9736880290678094e-05 loss: 0.1286 (0.1283) time: 2.9151 data: 0.0073 max mem: 33300 Epoch: [17] [2340/4276] eta: 1:34:15 lr: 2.973409293722674e-05 loss: 0.1320 
(0.1284) time: 2.9410 data: 0.0074 max mem: 33300 Epoch: [17] [2350/4276] eta: 1:33:46 lr: 2.9731305554742417e-05 loss: 0.1222 (0.1284) time: 2.9446 data: 0.0077 max mem: 33300 Epoch: [17] [2360/4276] eta: 1:33:17 lr: 2.97285181432218e-05 loss: 0.1131 (0.1284) time: 2.9462 data: 0.0077 max mem: 33300 Epoch: [17] [2370/4276] eta: 1:32:48 lr: 2.9725730702661563e-05 loss: 0.1294 (0.1284) time: 2.9441 data: 0.0074 max mem: 33300 Epoch: [17] [2380/4276] eta: 1:32:19 lr: 2.9722943233058382e-05 loss: 0.1299 (0.1284) time: 2.9429 data: 0.0074 max mem: 33300 Epoch: [17] [2390/4276] eta: 1:31:49 lr: 2.9720155734408923e-05 loss: 0.1212 (0.1284) time: 2.9398 data: 0.0075 max mem: 33300 Epoch: [17] [2400/4276] eta: 1:31:20 lr: 2.971736820670985e-05 loss: 0.1304 (0.1284) time: 2.9360 data: 0.0075 max mem: 33300 Epoch: [17] [2410/4276] eta: 1:30:51 lr: 2.971458064995785e-05 loss: 0.1311 (0.1284) time: 2.9356 data: 0.0075 max mem: 33300 Epoch: [17] [2420/4276] eta: 1:30:22 lr: 2.9711793064149573e-05 loss: 0.1217 (0.1284) time: 2.9299 data: 0.0072 max mem: 33300 Epoch: [17] [2430/4276] eta: 1:29:53 lr: 2.9709005449281695e-05 loss: 0.1271 (0.1284) time: 2.9277 data: 0.0070 max mem: 33300 Epoch: [17] [2440/4276] eta: 1:29:24 lr: 2.9706217805350883e-05 loss: 0.1358 (0.1284) time: 2.9222 data: 0.0070 max mem: 33300 Epoch: [17] [2450/4276] eta: 1:28:54 lr: 2.9703430132353804e-05 loss: 0.1193 (0.1284) time: 2.9038 data: 0.0071 max mem: 33300 Epoch: [17] [2460/4276] eta: 1:28:25 lr: 2.9700642430287123e-05 loss: 0.1219 (0.1284) time: 2.8863 data: 0.0070 max mem: 33300 Epoch: [17] [2470/4276] eta: 1:27:55 lr: 2.9697854699147513e-05 loss: 0.1272 (0.1284) time: 2.8849 data: 0.0075 max mem: 33300 Epoch: [17] [2480/4276] eta: 1:27:26 lr: 2.969506693893163e-05 loss: 0.1360 (0.1285) time: 2.8991 data: 0.0078 max mem: 33300 Epoch: [17] [2490/4276] eta: 1:26:57 lr: 2.969227914963614e-05 loss: 0.1360 (0.1285) time: 2.9057 data: 0.0076 max mem: 33300 Epoch: [17] [2500/4276] eta: 1:26:27 lr: 
2.9689491331257695e-05 loss: 0.1195 (0.1285) time: 2.8931 data: 0.0071 max mem: 33300 Epoch: [17] [2510/4276] eta: 1:25:58 lr: 2.9686703483792978e-05 loss: 0.1361 (0.1286) time: 2.8889 data: 0.0071 max mem: 33300 Epoch: [17] [2520/4276] eta: 1:25:28 lr: 2.968391560723863e-05 loss: 0.1223 (0.1285) time: 2.8917 data: 0.0081 max mem: 33300 Epoch: [17] [2530/4276] eta: 1:24:59 lr: 2.9681127701591325e-05 loss: 0.1100 (0.1285) time: 2.8843 data: 0.0084 max mem: 33300 Epoch: [17] [2540/4276] eta: 1:24:29 lr: 2.967833976684773e-05 loss: 0.1126 (0.1285) time: 2.8810 data: 0.0077 max mem: 33300 Epoch: [17] [2550/4276] eta: 1:24:00 lr: 2.9675551803004487e-05 loss: 0.1094 (0.1284) time: 2.8801 data: 0.0073 max mem: 33300 Epoch: [17] [2560/4276] eta: 1:23:30 lr: 2.9672763810058262e-05 loss: 0.1017 (0.1284) time: 2.8877 data: 0.0075 max mem: 33300 Epoch: [17] [2570/4276] eta: 1:23:01 lr: 2.9669975788005712e-05 loss: 0.1070 (0.1283) time: 2.8912 data: 0.0075 max mem: 33300 Epoch: [17] [2580/4276] eta: 1:22:32 lr: 2.9667187736843492e-05 loss: 0.1070 (0.1283) time: 2.8836 data: 0.0070 max mem: 33300 Epoch: [17] [2590/4276] eta: 1:22:02 lr: 2.9664399656568266e-05 loss: 0.1086 (0.1282) time: 2.8827 data: 0.0071 max mem: 33300 Epoch: [17] [2600/4276] eta: 1:21:33 lr: 2.9661611547176678e-05 loss: 0.1192 (0.1282) time: 2.8998 data: 0.0070 max mem: 33300 Epoch: [17] [2610/4276] eta: 1:21:04 lr: 2.9658823408665392e-05 loss: 0.1134 (0.1281) time: 2.9024 data: 0.0072 max mem: 33300 Epoch: [17] [2620/4276] eta: 1:20:34 lr: 2.9656035241031066e-05 loss: 0.1121 (0.1282) time: 2.8929 data: 0.0075 max mem: 33300 Epoch: [17] [2630/4276] eta: 1:20:05 lr: 2.9653247044270337e-05 loss: 0.1121 (0.1281) time: 2.9126 data: 0.0079 max mem: 33300 Epoch: [17] [2640/4276] eta: 1:19:36 lr: 2.9650458818379878e-05 loss: 0.1154 (0.1281) time: 2.9071 data: 0.0078 max mem: 33300 Epoch: [17] [2650/4276] eta: 1:19:06 lr: 2.9647670563356326e-05 loss: 0.1120 (0.1281) time: 2.8853 data: 0.0069 max mem: 33300 Epoch: 
[17] [2660/4276] eta: 1:18:37 lr: 2.9644882279196336e-05 loss: 0.1188 (0.1281) time: 2.8894 data: 0.0071 max mem: 33300 Epoch: [17] [2670/4276] eta: 1:18:07 lr: 2.9642093965896562e-05 loss: 0.1202 (0.1281) time: 2.8877 data: 0.0073 max mem: 33300 Epoch: [17] [2680/4276] eta: 1:17:38 lr: 2.963930562345365e-05 loss: 0.1193 (0.1281) time: 2.9023 data: 0.0073 max mem: 33300 Epoch: [17] [2690/4276] eta: 1:17:09 lr: 2.9636517251864247e-05 loss: 0.1206 (0.1281) time: 2.9029 data: 0.0074 max mem: 33300 Epoch: [17] [2700/4276] eta: 1:16:39 lr: 2.9633728851125015e-05 loss: 0.1153 (0.1280) time: 2.8821 data: 0.0076 max mem: 33300 Epoch: [17] [2710/4276] eta: 1:16:10 lr: 2.9630940421232582e-05 loss: 0.1100 (0.1280) time: 2.8832 data: 0.0078 max mem: 33300 Epoch: [17] [2720/4276] eta: 1:15:41 lr: 2.962815196218361e-05 loss: 0.1099 (0.1279) time: 2.8919 data: 0.0074 max mem: 33300 Epoch: [17] [2730/4276] eta: 1:15:11 lr: 2.9625363473974736e-05 loss: 0.1198 (0.1280) time: 2.8964 data: 0.0071 max mem: 33300 Epoch: [17] [2740/4276] eta: 1:14:42 lr: 2.962257495660261e-05 loss: 0.1252 (0.1280) time: 2.8890 data: 0.0070 max mem: 33300 Epoch: [17] [2750/4276] eta: 1:14:13 lr: 2.961978641006387e-05 loss: 0.1334 (0.1281) time: 2.8807 data: 0.0071 max mem: 33300 Epoch: [17] [2760/4276] eta: 1:13:43 lr: 2.961699783435517e-05 loss: 0.1221 (0.1281) time: 2.8798 data: 0.0070 max mem: 33300 Epoch: [17] [2770/4276] eta: 1:13:14 lr: 2.961420922947315e-05 loss: 0.1148 (0.1281) time: 2.8938 data: 0.0070 max mem: 33300 Epoch: [17] [2780/4276] eta: 1:12:45 lr: 2.9611420595414452e-05 loss: 0.1226 (0.1281) time: 2.9082 data: 0.0072 max mem: 33300 Epoch: [17] [2790/4276] eta: 1:12:16 lr: 2.9608631932175718e-05 loss: 0.1275 (0.1281) time: 2.9249 data: 0.0073 max mem: 33300 Epoch: [17] [2800/4276] eta: 1:11:47 lr: 2.9605843239753578e-05 loss: 0.1264 (0.1281) time: 2.9395 data: 0.0076 max mem: 33300 Epoch: [17] [2810/4276] eta: 1:11:18 lr: 2.9603054518144686e-05 loss: 0.1103 (0.1280) time: 2.9380 data: 
0.0076 max mem: 33300 Epoch: [17] [2820/4276] eta: 1:10:49 lr: 2.9600265767345676e-05 loss: 0.1026 (0.1279) time: 2.9387 data: 0.0082 max mem: 33300 Epoch: [17] [2830/4276] eta: 1:10:19 lr: 2.9597476987353184e-05 loss: 0.1128 (0.1279) time: 2.9387 data: 0.0086 max mem: 33300 Epoch: [17] [2840/4276] eta: 1:09:50 lr: 2.9594688178163853e-05 loss: 0.1286 (0.1280) time: 2.9389 data: 0.0080 max mem: 33300 Epoch: [17] [2850/4276] eta: 1:09:21 lr: 2.9591899339774332e-05 loss: 0.1427 (0.1280) time: 2.9399 data: 0.0076 max mem: 33300 Epoch: [17] [2860/4276] eta: 1:08:52 lr: 2.9589110472181232e-05 loss: 0.1367 (0.1280) time: 2.9310 data: 0.0077 max mem: 33300 Epoch: [17] [2870/4276] eta: 1:08:23 lr: 2.958632157538121e-05 loss: 0.1235 (0.1281) time: 2.8986 data: 0.0076 max mem: 33300 Epoch: [17] [2880/4276] eta: 1:07:54 lr: 2.9583532649370883e-05 loss: 0.1344 (0.1281) time: 2.8890 data: 0.0075 max mem: 33300 Epoch: [17] [2890/4276] eta: 1:07:24 lr: 2.9580743694146896e-05 loss: 0.1282 (0.1281) time: 2.8979 data: 0.0079 max mem: 33300 Epoch: [17] [2900/4276] eta: 1:06:55 lr: 2.957795470970588e-05 loss: 0.1173 (0.1280) time: 2.8916 data: 0.0081 max mem: 33300 Epoch: [17] [2910/4276] eta: 1:06:26 lr: 2.957516569604447e-05 loss: 0.1194 (0.1280) time: 2.8888 data: 0.0075 max mem: 33300 Epoch: [17] [2920/4276] eta: 1:05:56 lr: 2.95723766531593e-05 loss: 0.1330 (0.1281) time: 2.8891 data: 0.0074 max mem: 33300 Epoch: [17] [2930/4276] eta: 1:05:27 lr: 2.9569587581046997e-05 loss: 0.1132 (0.1280) time: 2.8924 data: 0.0075 max mem: 33300 Epoch: [17] [2940/4276] eta: 1:04:58 lr: 2.9566798479704195e-05 loss: 0.1116 (0.1280) time: 2.9174 data: 0.0073 max mem: 33300 Epoch: [17] [2950/4276] eta: 1:04:29 lr: 2.956400934912752e-05 loss: 0.1131 (0.1280) time: 2.9410 data: 0.0073 max mem: 33300 Epoch: [17] [2960/4276] eta: 1:04:00 lr: 2.9561220189313597e-05 loss: 0.1103 (0.1280) time: 2.9416 data: 0.0074 max mem: 33300 Epoch: [17] [2970/4276] eta: 1:03:31 lr: 2.955843100025907e-05 loss: 0.1241 
(0.1280) time: 2.9394 data: 0.0071 max mem: 33300 Epoch: [17] [2980/4276] eta: 1:03:02 lr: 2.9555641781960547e-05 loss: 0.1306 (0.1280) time: 2.9427 data: 0.0069 max mem: 33300 Epoch: [17] [2990/4276] eta: 1:02:33 lr: 2.955285253441467e-05 loss: 0.1133 (0.1279) time: 2.9516 data: 0.0073 max mem: 33300 Epoch: [17] [3000/4276] eta: 1:02:03 lr: 2.9550063257618053e-05 loss: 0.1133 (0.1279) time: 2.9493 data: 0.0078 max mem: 33300 Epoch: [17] [3010/4276] eta: 1:01:34 lr: 2.954727395156734e-05 loss: 0.1186 (0.1279) time: 2.9430 data: 0.0080 max mem: 33300 Epoch: [17] [3020/4276] eta: 1:01:05 lr: 2.9544484616259137e-05 loss: 0.1219 (0.1279) time: 2.9234 data: 0.0077 max mem: 33300 Epoch: [17] [3030/4276] eta: 1:00:36 lr: 2.9541695251690082e-05 loss: 0.1225 (0.1280) time: 2.9112 data: 0.0074 max mem: 33300 Epoch: [17] [3040/4276] eta: 1:00:07 lr: 2.953890585785678e-05 loss: 0.1273 (0.1281) time: 2.9071 data: 0.0074 max mem: 33300 Epoch: [17] [3050/4276] eta: 0:59:37 lr: 2.9536116434755866e-05 loss: 0.1250 (0.1280) time: 2.8883 data: 0.0071 max mem: 33300 Epoch: [17] [3060/4276] eta: 0:59:08 lr: 2.9533326982383962e-05 loss: 0.1076 (0.1280) time: 2.8781 data: 0.0069 max mem: 33300 Epoch: [17] [3070/4276] eta: 0:58:39 lr: 2.9530537500737688e-05 loss: 0.1149 (0.1280) time: 2.8802 data: 0.0074 max mem: 33300 Epoch: [17] [3080/4276] eta: 0:58:09 lr: 2.952774798981366e-05 loss: 0.1155 (0.1280) time: 2.8832 data: 0.0076 max mem: 33300 Epoch: [17] [3090/4276] eta: 0:57:40 lr: 2.95249584496085e-05 loss: 0.1095 (0.1279) time: 2.9046 data: 0.0079 max mem: 33300 Epoch: [17] [3100/4276] eta: 0:57:11 lr: 2.952216888011884e-05 loss: 0.1147 (0.1279) time: 2.9061 data: 0.0080 max mem: 33300 Epoch: [17] [3110/4276] eta: 0:56:42 lr: 2.9519379281341268e-05 loss: 0.1098 (0.1279) time: 2.8875 data: 0.0078 max mem: 33300 Epoch: [17] [3120/4276] eta: 0:56:12 lr: 2.9516589653272424e-05 loss: 0.1074 (0.1278) time: 2.8854 data: 0.0076 max mem: 33300 Epoch: [17] [3130/4276] eta: 0:55:43 lr: 
2.951379999590892e-05 loss: 0.1119 (0.1278) time: 2.8822 data: 0.0072 max mem: 33300 Epoch: [17] [3140/4276] eta: 0:55:14 lr: 2.9511010309247365e-05 loss: 0.1109 (0.1277) time: 2.8849 data: 0.0075 max mem: 33300 Epoch: [17] [3150/4276] eta: 0:54:44 lr: 2.950822059328438e-05 loss: 0.1250 (0.1277) time: 2.8837 data: 0.0073 max mem: 33300 Epoch: [17] [3160/4276] eta: 0:54:15 lr: 2.9505430848016585e-05 loss: 0.1272 (0.1277) time: 2.8850 data: 0.0071 max mem: 33300 Epoch: [17] [3170/4276] eta: 0:53:46 lr: 2.9502641073440586e-05 loss: 0.1254 (0.1277) time: 2.8820 data: 0.0070 max mem: 33300 Epoch: [17] [3180/4276] eta: 0:53:17 lr: 2.9499851269552997e-05 loss: 0.1248 (0.1277) time: 2.9096 data: 0.0078 max mem: 33300 Epoch: [17] [3190/4276] eta: 0:52:48 lr: 2.9497061436350425e-05 loss: 0.1248 (0.1277) time: 2.9391 data: 0.0083 max mem: 33300 Epoch: [17] [3200/4276] eta: 0:52:19 lr: 2.9494271573829484e-05 loss: 0.1248 (0.1277) time: 2.9382 data: 0.0073 max mem: 33300 Epoch: [17] [3210/4276] eta: 0:51:49 lr: 2.9491481681986794e-05 loss: 0.1248 (0.1277) time: 2.9419 data: 0.0073 max mem: 33300 Epoch: [17] [3220/4276] eta: 0:51:20 lr: 2.948869176081895e-05 loss: 0.1343 (0.1277) time: 2.9407 data: 0.0072 max mem: 33300 Epoch: [17] [3230/4276] eta: 0:50:51 lr: 2.948590181032257e-05 loss: 0.1136 (0.1277) time: 2.9253 data: 0.0071 max mem: 33300 Epoch: [17] [3240/4276] eta: 0:50:22 lr: 2.9483111830494266e-05 loss: 0.1394 (0.1278) time: 2.9020 data: 0.0075 max mem: 33300 Epoch: [17] [3250/4276] eta: 0:49:53 lr: 2.948032182133063e-05 loss: 0.1387 (0.1277) time: 2.9167 data: 0.0082 max mem: 33300 Epoch: [17] [3260/4276] eta: 0:49:24 lr: 2.947753178282829e-05 loss: 0.1197 (0.1277) time: 2.9382 data: 0.0077 max mem: 33300 Epoch: [17] [3270/4276] eta: 0:48:54 lr: 2.9474741714983827e-05 loss: 0.1244 (0.1277) time: 2.9159 data: 0.0075 max mem: 33300 Epoch: [17] [3280/4276] eta: 0:48:25 lr: 2.947195161779387e-05 loss: 0.1162 (0.1277) time: 2.8977 data: 0.0077 max mem: 33300 Epoch: [17] 
[3290/4276] eta: 0:47:56 lr: 2.9469161491255005e-05 loss: 0.1260 (0.1277) time: 2.9058 data: 0.0072 max mem: 33300 Epoch: [17] [3300/4276] eta: 0:47:27 lr: 2.9466371335363848e-05 loss: 0.1260 (0.1277) time: 2.9063 data: 0.0072 max mem: 33300 Epoch: [17] [3310/4276] eta: 0:46:58 lr: 2.946358115011699e-05 loss: 0.1341 (0.1278) time: 2.9087 data: 0.0077 max mem: 33300 Epoch: [17] [3320/4276] eta: 0:46:28 lr: 2.9460790935511058e-05 loss: 0.1427 (0.1278) time: 2.9050 data: 0.0076 max mem: 33300 Epoch: [17] [3330/4276] eta: 0:45:59 lr: 2.9458000691542626e-05 loss: 0.1293 (0.1278) time: 2.8985 data: 0.0070 max mem: 33300 Epoch: [17] [3340/4276] eta: 0:45:30 lr: 2.9455210418208307e-05 loss: 0.1293 (0.1278) time: 2.9240 data: 0.0077 max mem: 33300 Epoch: [17] [3350/4276] eta: 0:45:01 lr: 2.9452420115504697e-05 loss: 0.1142 (0.1278) time: 2.9446 data: 0.0084 max mem: 33300 Epoch: [17] [3360/4276] eta: 0:44:32 lr: 2.9449629783428402e-05 loss: 0.1122 (0.1278) time: 2.9436 data: 0.0077 max mem: 33300 Epoch: [17] [3370/4276] eta: 0:44:03 lr: 2.944683942197601e-05 loss: 0.1281 (0.1278) time: 2.9139 data: 0.0076 max mem: 33300 Epoch: [17] [3380/4276] eta: 0:43:33 lr: 2.9444049031144126e-05 loss: 0.1263 (0.1278) time: 2.8826 data: 0.0078 max mem: 33300 Epoch: [17] [3390/4276] eta: 0:43:04 lr: 2.944125861092935e-05 loss: 0.1151 (0.1278) time: 2.8895 data: 0.0076 max mem: 33300 Epoch: [17] [3400/4276] eta: 0:42:35 lr: 2.943846816132827e-05 loss: 0.1183 (0.1278) time: 2.8939 data: 0.0074 max mem: 33300 Epoch: [17] [3410/4276] eta: 0:42:06 lr: 2.9435677682337488e-05 loss: 0.1243 (0.1279) time: 2.8962 data: 0.0071 max mem: 33300 Epoch: [17] [3420/4276] eta: 0:41:36 lr: 2.9432887173953593e-05 loss: 0.1295 (0.1279) time: 2.8965 data: 0.0073 max mem: 33300 Epoch: [17] [3430/4276] eta: 0:41:07 lr: 2.9430096636173182e-05 loss: 0.1254 (0.1279) time: 2.9091 data: 0.0074 max mem: 33300 Epoch: [17] [3440/4276] eta: 0:40:38 lr: 2.9427306068992848e-05 loss: 0.1174 (0.1279) time: 2.9314 data: 
0.0076 max mem: 33300 Epoch: [17] [3450/4276] eta: 0:40:09 lr: 2.942451547240918e-05 loss: 0.1279 (0.1280) time: 2.9353 data: 0.0079 max mem: 33300 Epoch: [17] [3460/4276] eta: 0:39:40 lr: 2.942172484641878e-05 loss: 0.1334 (0.1280) time: 2.9366 data: 0.0078 max mem: 33300 Epoch: [17] [3470/4276] eta: 0:39:11 lr: 2.9418934191018232e-05 loss: 0.1188 (0.1280) time: 2.9369 data: 0.0077 max mem: 33300 Epoch: [17] [3480/4276] eta: 0:38:42 lr: 2.941614350620412e-05 loss: 0.1163 (0.1280) time: 2.9372 data: 0.0074 max mem: 33300 Epoch: [17] [3490/4276] eta: 0:38:13 lr: 2.941335279197305e-05 loss: 0.1163 (0.1280) time: 2.9395 data: 0.0077 max mem: 33300 Epoch: [17] [3500/4276] eta: 0:37:43 lr: 2.9410562048321595e-05 loss: 0.1147 (0.1279) time: 2.9400 data: 0.0078 max mem: 33300 Epoch: [17] [3510/4276] eta: 0:37:14 lr: 2.9407771275246342e-05 loss: 0.1113 (0.1279) time: 2.9330 data: 0.0077 max mem: 33300 Epoch: [17] [3520/4276] eta: 0:36:45 lr: 2.9404980472743886e-05 loss: 0.1180 (0.1279) time: 2.9059 data: 0.0079 max mem: 33300 Epoch: [17] [3530/4276] eta: 0:36:16 lr: 2.9402189640810817e-05 loss: 0.1343 (0.1279) time: 2.8853 data: 0.0079 max mem: 33300 Epoch: [17] [3540/4276] eta: 0:35:47 lr: 2.9399398779443716e-05 loss: 0.1330 (0.1279) time: 2.9069 data: 0.0079 max mem: 33300 Epoch: [17] [3550/4276] eta: 0:35:18 lr: 2.9396607888639167e-05 loss: 0.1233 (0.1279) time: 2.9355 data: 0.0081 max mem: 33300 Epoch: [17] [3560/4276] eta: 0:34:48 lr: 2.939381696839375e-05 loss: 0.1137 (0.1279) time: 2.9259 data: 0.0081 max mem: 33300 Epoch: [17] [3570/4276] eta: 0:34:19 lr: 2.939102601870406e-05 loss: 0.1367 (0.1280) time: 2.8958 data: 0.0076 max mem: 33300 Epoch: [17] [3580/4276] eta: 0:33:50 lr: 2.938823503956667e-05 loss: 0.1163 (0.1279) time: 2.9085 data: 0.0077 max mem: 33300 Epoch: [17] [3590/4276] eta: 0:33:21 lr: 2.938544403097816e-05 loss: 0.1131 (0.1279) time: 2.9362 data: 0.0080 max mem: 33300 Epoch: [17] [3600/4276] eta: 0:32:52 lr: 2.9382652992935116e-05 loss: 0.1268 
(0.1279) time: 2.9382 data: 0.0075 max mem: 33300 Epoch: [17] [3610/4276] eta: 0:32:23 lr: 2.937986192543412e-05 loss: 0.1356 (0.1279) time: 2.9407 data: 0.0071 max mem: 33300 Epoch: [17] [3620/4276] eta: 0:31:53 lr: 2.9377070828471747e-05 loss: 0.1301 (0.1279) time: 2.9433 data: 0.0072 max mem: 33300 Epoch: [17] [3630/4276] eta: 0:31:24 lr: 2.9374279702044587e-05 loss: 0.1193 (0.1279) time: 2.9416 data: 0.0072 max mem: 33300 Epoch: [17] [3640/4276] eta: 0:30:55 lr: 2.9371488546149205e-05 loss: 0.1157 (0.1278) time: 2.9419 data: 0.0072 max mem: 33300 Epoch: [17] [3650/4276] eta: 0:30:26 lr: 2.9368697360782182e-05 loss: 0.1023 (0.1278) time: 2.9409 data: 0.0072 max mem: 33300 Epoch: [17] [3660/4276] eta: 0:29:57 lr: 2.9365906145940097e-05 loss: 0.1134 (0.1278) time: 2.9385 data: 0.0075 max mem: 33300 Epoch: [17] [3670/4276] eta: 0:29:28 lr: 2.936311490161952e-05 loss: 0.1186 (0.1278) time: 2.9302 data: 0.0076 max mem: 33300 Epoch: [17] [3680/4276] eta: 0:28:58 lr: 2.9360323627817026e-05 loss: 0.1188 (0.1278) time: 2.9023 data: 0.0075 max mem: 33300 Epoch: [17] [3690/4276] eta: 0:28:29 lr: 2.9357532324529204e-05 loss: 0.1245 (0.1278) time: 2.8915 data: 0.0078 max mem: 33300 Epoch: [17] [3700/4276] eta: 0:28:00 lr: 2.9354740991752617e-05 loss: 0.1225 (0.1278) time: 2.9187 data: 0.0077 max mem: 33300 Epoch: [17] [3710/4276] eta: 0:27:31 lr: 2.9351949629483827e-05 loss: 0.1178 (0.1278) time: 2.9382 data: 0.0072 max mem: 33300 Epoch: [17] [3720/4276] eta: 0:27:02 lr: 2.9349158237719426e-05 loss: 0.1112 (0.1277) time: 2.9371 data: 0.0069 max mem: 33300 Epoch: [17] [3730/4276] eta: 0:26:33 lr: 2.9346366816455974e-05 loss: 0.1229 (0.1277) time: 2.9428 data: 0.0069 max mem: 33300 Epoch: [17] [3740/4276] eta: 0:26:04 lr: 2.934357536569004e-05 loss: 0.1180 (0.1277) time: 2.9443 data: 0.0071 max mem: 33300 Epoch: [17] [3750/4276] eta: 0:25:34 lr: 2.93407838854182e-05 loss: 0.1096 (0.1277) time: 2.9391 data: 0.0070 max mem: 33300 Epoch: [17] [3760/4276] eta: 0:25:05 lr: 
2.9337992375637012e-05 loss: 0.1085 (0.1277) time: 2.9401 data: 0.0070 max mem: 33300 Epoch: [17] [3770/4276] eta: 0:24:36 lr: 2.9335200836343063e-05 loss: 0.1141 (0.1278) time: 2.9377 data: 0.0075 max mem: 33300 Epoch: [17] [3780/4276] eta: 0:24:07 lr: 2.9332409267532905e-05 loss: 0.1181 (0.1277) time: 2.9193 data: 0.0078 max mem: 33300 Epoch: [17] [3790/4276] eta: 0:23:38 lr: 2.9329617669203114e-05 loss: 0.1103 (0.1277) time: 2.9138 data: 0.0078 max mem: 33300 Epoch: [17] [3800/4276] eta: 0:23:09 lr: 2.9326826041350247e-05 loss: 0.1295 (0.1277) time: 2.9327 data: 0.0075 max mem: 33300 Epoch: [17] [3810/4276] eta: 0:22:39 lr: 2.9324034383970872e-05 loss: 0.1186 (0.1277) time: 2.9392 data: 0.0073 max mem: 33300 Epoch: [17] [3820/4276] eta: 0:22:10 lr: 2.9321242697061557e-05 loss: 0.1075 (0.1277) time: 2.9101 data: 0.0076 max mem: 33300 Epoch: [17] [3830/4276] eta: 0:21:41 lr: 2.9318450980618863e-05 loss: 0.1107 (0.1277) time: 2.9048 data: 0.0079 max mem: 33300 Epoch: [17] [3840/4276] eta: 0:21:12 lr: 2.9315659234639354e-05 loss: 0.1206 (0.1276) time: 2.9340 data: 0.0082 max mem: 33300 Epoch: [17] [3850/4276] eta: 0:20:43 lr: 2.9312867459119587e-05 loss: 0.1128 (0.1276) time: 2.9378 data: 0.0079 max mem: 33300 Epoch: [17] [3860/4276] eta: 0:20:14 lr: 2.9310075654056135e-05 loss: 0.1151 (0.1276) time: 2.9384 data: 0.0078 max mem: 33300 Epoch: [17] [3870/4276] eta: 0:19:44 lr: 2.9307283819445548e-05 loss: 0.1197 (0.1275) time: 2.9419 data: 0.0084 max mem: 33300 Epoch: [17] [3880/4276] eta: 0:19:15 lr: 2.930449195528439e-05 loss: 0.1153 (0.1275) time: 2.9402 data: 0.0085 max mem: 33300 Epoch: [17] [3890/4276] eta: 0:18:46 lr: 2.9301700061569217e-05 loss: 0.1153 (0.1275) time: 2.9386 data: 0.0082 max mem: 33300 Epoch: [17] [3900/4276] eta: 0:18:17 lr: 2.9298908138296587e-05 loss: 0.1222 (0.1275) time: 2.9402 data: 0.0082 max mem: 33300 Epoch: [17] [3910/4276] eta: 0:17:48 lr: 2.9296116185463064e-05 loss: 0.1146 (0.1275) time: 2.9415 data: 0.0080 max mem: 33300 Epoch: 
[17] [3920/4276] eta: 0:17:19 lr: 2.9293324203065194e-05 loss: 0.1079 (0.1275) time: 2.9409 data: 0.0080 max mem: 33300 Epoch: [17] [3930/4276] eta: 0:16:49 lr: 2.9290532191099545e-05 loss: 0.1173 (0.1274) time: 2.9415 data: 0.0084 max mem: 33300 Epoch: [17] [3940/4276] eta: 0:16:20 lr: 2.928774014956267e-05 loss: 0.1195 (0.1275) time: 2.9419 data: 0.0085 max mem: 33300 Epoch: [17] [3950/4276] eta: 0:15:51 lr: 2.9284948078451114e-05 loss: 0.1195 (0.1274) time: 2.9418 data: 0.0082 max mem: 33300 Epoch: [17] [3960/4276] eta: 0:15:22 lr: 2.9282155977761443e-05 loss: 0.1132 (0.1274) time: 2.9443 data: 0.0080 max mem: 33300 Epoch: [17] [3970/4276] eta: 0:14:53 lr: 2.9279363847490198e-05 loss: 0.1242 (0.1274) time: 2.9423 data: 0.0082 max mem: 33300 Epoch: [17] [3980/4276] eta: 0:14:24 lr: 2.9276571687633934e-05 loss: 0.1240 (0.1274) time: 2.9385 data: 0.0079 max mem: 33300 Epoch: [17] [3990/4276] eta: 0:13:54 lr: 2.9273779498189214e-05 loss: 0.1217 (0.1274) time: 2.9123 data: 0.0072 max mem: 33300 Epoch: [17] [4000/4276] eta: 0:13:25 lr: 2.9270987279152572e-05 loss: 0.1151 (0.1274) time: 2.8924 data: 0.0071 max mem: 33300 Epoch: [17] [4010/4276] eta: 0:12:56 lr: 2.9268195030520574e-05 loss: 0.1167 (0.1274) time: 2.9048 data: 0.0075 max mem: 33300 Epoch: [17] [4020/4276] eta: 0:12:27 lr: 2.9265402752289756e-05 loss: 0.1190 (0.1274) time: 2.9242 data: 0.0079 max mem: 33300 Epoch: [17] [4030/4276] eta: 0:11:58 lr: 2.9262610444456674e-05 loss: 0.1230 (0.1274) time: 2.9368 data: 0.0077 max mem: 33300 Epoch: [17] [4040/4276] eta: 0:11:28 lr: 2.9259818107017866e-05 loss: 0.1303 (0.1274) time: 2.9380 data: 0.0075 max mem: 33300 Epoch: [17] [4050/4276] eta: 0:10:59 lr: 2.925702573996989e-05 loss: 0.1219 (0.1274) time: 2.9419 data: 0.0078 max mem: 33300 Epoch: [17] [4060/4276] eta: 0:10:30 lr: 2.9254233343309283e-05 loss: 0.1175 (0.1274) time: 2.9270 data: 0.0076 max mem: 33300 Epoch: [17] [4070/4276] eta: 0:10:01 lr: 2.9251440917032596e-05 loss: 0.1384 (0.1275) time: 2.8970 
data: 0.0075 max mem: 33300 Epoch: [17] [4080/4276] eta: 0:09:32 lr: 2.9248648461136376e-05 loss: 0.1326 (0.1275) time: 2.9123 data: 0.0080 max mem: 33300 Epoch: [17] [4090/4276] eta: 0:09:02 lr: 2.9245855975617164e-05 loss: 0.1226 (0.1275) time: 2.9398 data: 0.0082 max mem: 33300 Epoch: [17] [4100/4276] eta: 0:08:33 lr: 2.924306346047149e-05 loss: 0.1398 (0.1275) time: 2.9413 data: 0.0080 max mem: 33300 Epoch: [17] [4110/4276] eta: 0:08:04 lr: 2.924027091569592e-05 loss: 0.1210 (0.1275) time: 2.9406 data: 0.0080 max mem: 33300 Epoch: [17] [4120/4276] eta: 0:07:35 lr: 2.9237478341286985e-05 loss: 0.1259 (0.1276) time: 2.9387 data: 0.0081 max mem: 33300 Epoch: [17] [4130/4276] eta: 0:07:06 lr: 2.9234685737241213e-05 loss: 0.1318 (0.1275) time: 2.9393 data: 0.0081 max mem: 33300 Epoch: [17] [4140/4276] eta: 0:06:37 lr: 2.923189310355516e-05 loss: 0.1302 (0.1275) time: 2.9383 data: 0.0080 max mem: 33300 Epoch: [17] [4150/4276] eta: 0:06:07 lr: 2.9229100440225355e-05 loss: 0.1302 (0.1276) time: 2.9373 data: 0.0079 max mem: 33300 Epoch: [17] [4160/4276] eta: 0:05:38 lr: 2.9226307747248344e-05 loss: 0.1301 (0.1276) time: 2.9375 data: 0.0079 max mem: 33300 Epoch: [17] [4170/4276] eta: 0:05:09 lr: 2.9223515024620663e-05 loss: 0.1316 (0.1276) time: 2.9355 data: 0.0082 max mem: 33300 Epoch: [17] [4180/4276] eta: 0:04:40 lr: 2.9220722272338846e-05 loss: 0.1307 (0.1276) time: 2.9338 data: 0.0077 max mem: 33300 Epoch: [17] [4190/4276] eta: 0:04:11 lr: 2.921792949039943e-05 loss: 0.1204 (0.1276) time: 2.9361 data: 0.0071 max mem: 33300 Epoch: [17] [4200/4276] eta: 0:03:41 lr: 2.9215136678798956e-05 loss: 0.1270 (0.1276) time: 2.9380 data: 0.0074 max mem: 33300 Epoch: [17] [4210/4276] eta: 0:03:12 lr: 2.9212343837533946e-05 loss: 0.1371 (0.1277) time: 2.9386 data: 0.0073 max mem: 33300 Epoch: [17] [4220/4276] eta: 0:02:43 lr: 2.9209550966600933e-05 loss: 0.1382 (0.1277) time: 2.9380 data: 0.0071 max mem: 33300 Epoch: [17] [4230/4276] eta: 0:02:14 lr: 2.920675806599647e-05 loss: 
0.1384 (0.1277) time: 2.9371 data: 0.0070 max mem: 33300 Epoch: [17] [4240/4276] eta: 0:01:45 lr: 2.920396513571707e-05 loss: 0.1384 (0.1277) time: 2.9346 data: 0.0071 max mem: 33300 Epoch: [17] [4250/4276] eta: 0:01:15 lr: 2.9201172175759277e-05 loss: 0.1335 (0.1278) time: 2.9285 data: 0.0072 max mem: 33300 Epoch: [17] [4260/4276] eta: 0:00:46 lr: 2.9198379186119607e-05 loss: 0.1376 (0.1279) time: 2.9216 data: 0.0070 max mem: 33300 Epoch: [17] [4270/4276] eta: 0:00:17 lr: 2.9195586166794608e-05 loss: 0.1376 (0.1279) time: 2.9182 data: 0.0069 max mem: 33300 Epoch: [17] Total time: 3:28:05 Test: [ 0/21770] eta: 7:26:47 time: 1.2314 data: 1.1765 max mem: 33300 Test: [ 100/21770] eta: 0:18:22 time: 0.0385 data: 0.0008 max mem: 33300 Test: [ 200/21770] eta: 0:16:05 time: 0.0387 data: 0.0008 max mem: 33300 Test: [ 300/21770] eta: 0:15:18 time: 0.0386 data: 0.0008 max mem: 33300 Test: [ 400/21770] eta: 0:14:52 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 500/21770] eta: 0:14:35 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 600/21770] eta: 0:14:23 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 700/21770] eta: 0:14:13 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 800/21770] eta: 0:14:06 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 900/21770] eta: 0:13:59 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 1000/21770] eta: 0:13:53 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 1100/21770] eta: 0:13:47 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 1200/21770] eta: 0:13:41 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 1300/21770] eta: 0:13:36 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 1400/21770] eta: 0:13:31 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 1500/21770] eta: 0:13:26 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 1600/21770] eta: 0:13:22 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 1700/21770] eta: 0:13:17 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 1800/21770] eta: 0:13:12 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 
1900/21770] eta: 0:13:08 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 2000/21770] eta: 0:13:03 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 2100/21770] eta: 0:12:59 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 2200/21770] eta: 0:12:55 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 2300/21770] eta: 0:12:51 time: 0.0398 data: 0.0009 max mem: 33300 Test: [ 2400/21770] eta: 0:12:47 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 2500/21770] eta: 0:12:43 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 2600/21770] eta: 0:12:39 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 2700/21770] eta: 0:12:35 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 2800/21770] eta: 0:12:31 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 2900/21770] eta: 0:12:27 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 3000/21770] eta: 0:12:23 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 3100/21770] eta: 0:12:19 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3200/21770] eta: 0:12:16 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3300/21770] eta: 0:12:12 time: 0.0397 data: 0.0008 max mem: 33300 Test: [ 3400/21770] eta: 0:12:08 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 3500/21770] eta: 0:12:04 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 3600/21770] eta: 0:12:00 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 3700/21770] eta: 0:11:56 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3800/21770] eta: 0:11:52 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 3900/21770] eta: 0:11:48 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 4000/21770] eta: 0:11:44 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 4100/21770] eta: 0:11:40 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 4200/21770] eta: 0:11:36 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 4300/21770] eta: 0:11:32 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 4400/21770] eta: 0:11:28 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 4500/21770] eta: 0:11:24 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 
4600/21770] eta: 0:11:19 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 4700/21770] eta: 0:11:16 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 4800/21770] eta: 0:11:12 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 4900/21770] eta: 0:11:08 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5000/21770] eta: 0:11:04 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5100/21770] eta: 0:11:00 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 5200/21770] eta: 0:10:56 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5300/21770] eta: 0:10:52 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 5400/21770] eta: 0:10:49 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 5500/21770] eta: 0:10:45 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5600/21770] eta: 0:10:41 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5700/21770] eta: 0:10:37 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 5800/21770] eta: 0:10:33 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5900/21770] eta: 0:10:29 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 6000/21770] eta: 0:10:25 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 6100/21770] eta: 0:10:21 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 6200/21770] eta: 0:10:17 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 6300/21770] eta: 0:10:13 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 6400/21770] eta: 0:10:09 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 6500/21770] eta: 0:10:05 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 6600/21770] eta: 0:10:02 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 6700/21770] eta: 0:09:58 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 6800/21770] eta: 0:09:54 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 6900/21770] eta: 0:09:50 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 7000/21770] eta: 0:09:46 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 7100/21770] eta: 0:09:42 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 7200/21770] eta: 0:09:38 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 
7300/21770] eta: 0:09:34 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 7400/21770] eta: 0:09:30 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 7500/21770] eta: 0:09:26 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 7600/21770] eta: 0:09:22 time: 0.0397 data: 0.0008 max mem: 33300 Test: [ 7700/21770] eta: 0:09:18 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 7800/21770] eta: 0:09:14 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 7900/21770] eta: 0:09:10 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 8000/21770] eta: 0:09:06 time: 0.0396 data: 0.0008 max mem: 33300 Test: [ 8100/21770] eta: 0:09:02 time: 0.0396 data: 0.0008 max mem: 33300 Test: [ 8200/21770] eta: 0:08:57 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 8300/21770] eta: 0:08:53 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 8400/21770] eta: 0:08:49 time: 0.0396 data: 0.0008 max mem: 33300 Test: [ 8500/21770] eta: 0:08:45 time: 0.0396 data: 0.0009 max mem: 33300 Test: [ 8600/21770] eta: 0:08:41 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 8700/21770] eta: 0:08:37 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 8800/21770] eta: 0:08:33 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 8900/21770] eta: 0:08:29 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 9000/21770] eta: 0:08:25 time: 0.0396 data: 0.0008 max mem: 33300 Test: [ 9100/21770] eta: 0:08:21 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 9200/21770] eta: 0:08:17 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 9300/21770] eta: 0:08:13 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 9400/21770] eta: 0:08:09 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 9500/21770] eta: 0:08:05 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 9600/21770] eta: 0:08:01 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 9700/21770] eta: 0:07:57 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 9800/21770] eta: 0:07:53 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 9900/21770] eta: 0:07:49 time: 0.0390 data: 0.0008 max mem: 33300 Test: 
[10000/21770] eta: 0:07:45 time: 0.0391 data: 0.0008 max mem: 33300 Test: [10100/21770] eta: 0:07:41 time: 0.0390 data: 0.0008 max mem: 33300 Test: [10200/21770] eta: 0:07:37 time: 0.0393 data: 0.0008 max mem: 33300 Test: [10300/21770] eta: 0:07:33 time: 0.0391 data: 0.0008 max mem: 33300 Test: [10400/21770] eta: 0:07:29 time: 0.0390 data: 0.0008 max mem: 33300 Test: [10500/21770] eta: 0:07:25 time: 0.0390 data: 0.0008 max mem: 33300 Test: [10600/21770] eta: 0:07:21 time: 0.0393 data: 0.0008 max mem: 33300 Test: [10700/21770] eta: 0:07:17 time: 0.0388 data: 0.0008 max mem: 33300 Test: [10800/21770] eta: 0:07:13 time: 0.0392 data: 0.0008 max mem: 33300 Test: [10900/21770] eta: 0:07:09 time: 0.0390 data: 0.0008 max mem: 33300 Test: [11000/21770] eta: 0:07:05 time: 0.0392 data: 0.0008 max mem: 33300 Test: [11100/21770] eta: 0:07:01 time: 0.0390 data: 0.0008 max mem: 33300 Test: [11200/21770] eta: 0:06:57 time: 0.0395 data: 0.0008 max mem: 33300 Test: [11300/21770] eta: 0:06:53 time: 0.0397 data: 0.0008 max mem: 33300 Test: [11400/21770] eta: 0:06:49 time: 0.0394 data: 0.0008 max mem: 33300 Test: [11500/21770] eta: 0:06:45 time: 0.0394 data: 0.0008 max mem: 33300 Test: [11600/21770] eta: 0:06:41 time: 0.0394 data: 0.0008 max mem: 33300 Test: [11700/21770] eta: 0:06:37 time: 0.0392 data: 0.0008 max mem: 33300 Test: [11800/21770] eta: 0:06:33 time: 0.0394 data: 0.0008 max mem: 33300 Test: [11900/21770] eta: 0:06:29 time: 0.0396 data: 0.0008 max mem: 33300 Test: [12000/21770] eta: 0:06:25 time: 0.0395 data: 0.0008 max mem: 33300 Test: [12100/21770] eta: 0:06:22 time: 0.0393 data: 0.0008 max mem: 33300 Test: [12200/21770] eta: 0:06:18 time: 0.0393 data: 0.0008 max mem: 33300 Test: [12300/21770] eta: 0:06:14 time: 0.0391 data: 0.0008 max mem: 33300 Test: [12400/21770] eta: 0:06:10 time: 0.0393 data: 0.0008 max mem: 33300 Test: [12500/21770] eta: 0:06:06 time: 0.0392 data: 0.0008 max mem: 33300 Test: [12600/21770] eta: 0:06:02 time: 0.0397 data: 0.0008 max mem: 33300 Test: 
[12700/21770] eta: 0:05:58 time: 0.0395 data: 0.0008 max mem: 33300 Test: [12800/21770] eta: 0:05:54 time: 0.0393 data: 0.0008 max mem: 33300 Test: [12900/21770] eta: 0:05:50 time: 0.0395 data: 0.0008 max mem: 33300 Test: [13000/21770] eta: 0:05:46 time: 0.0397 data: 0.0008 max mem: 33300 Test: [13100/21770] eta: 0:05:42 time: 0.0395 data: 0.0008 max mem: 33300 Test: [13200/21770] eta: 0:05:38 time: 0.0395 data: 0.0009 max mem: 33300 Test: [13300/21770] eta: 0:05:34 time: 0.0392 data: 0.0008 max mem: 33300 Test: [13400/21770] eta: 0:05:30 time: 0.0397 data: 0.0008 max mem: 33300 Test: [13500/21770] eta: 0:05:26 time: 0.0395 data: 0.0008 max mem: 33300 Test: [13600/21770] eta: 0:05:22 time: 0.0395 data: 0.0008 max mem: 33300 Test: [13700/21770] eta: 0:05:18 time: 0.0393 data: 0.0008 max mem: 33300 Test: [13800/21770] eta: 0:05:14 time: 0.0395 data: 0.0008 max mem: 33300 Test: [13900/21770] eta: 0:05:10 time: 0.0393 data: 0.0008 max mem: 33300 Test: [14000/21770] eta: 0:05:06 time: 0.0396 data: 0.0008 max mem: 33300 Test: [14100/21770] eta: 0:05:02 time: 0.0395 data: 0.0008 max mem: 33300 Test: [14200/21770] eta: 0:04:58 time: 0.0398 data: 0.0008 max mem: 33300 Test: [14300/21770] eta: 0:04:55 time: 0.0394 data: 0.0008 max mem: 33300 Test: [14400/21770] eta: 0:04:51 time: 0.0395 data: 0.0008 max mem: 33300 Test: [14500/21770] eta: 0:04:47 time: 0.0394 data: 0.0008 max mem: 33300 Test: [14600/21770] eta: 0:04:43 time: 0.0396 data: 0.0008 max mem: 33300 Test: [14700/21770] eta: 0:04:39 time: 0.0397 data: 0.0008 max mem: 33300 Test: [14800/21770] eta: 0:04:35 time: 0.0394 data: 0.0008 max mem: 33300 Test: [14900/21770] eta: 0:04:31 time: 0.0397 data: 0.0008 max mem: 33300 Test: [15000/21770] eta: 0:04:27 time: 0.0395 data: 0.0008 max mem: 33300 Test: [15100/21770] eta: 0:04:23 time: 0.0393 data: 0.0008 max mem: 33300 Test: [15200/21770] eta: 0:04:19 time: 0.0397 data: 0.0008 max mem: 33300 Test: [15300/21770] eta: 0:04:15 time: 0.0395 data: 0.0008 max mem: 33300 Test: 
[15400/21770] eta: 0:04:11 time: 0.0394 data: 0.0008 max mem: 33300 Test: [15500/21770] eta: 0:04:07 time: 0.0393 data: 0.0008 max mem: 33300 Test: [15600/21770] eta: 0:04:03 time: 0.0396 data: 0.0008 max mem: 33300 Test: [15700/21770] eta: 0:03:59 time: 0.0393 data: 0.0008 max mem: 33300 Test: [15800/21770] eta: 0:03:55 time: 0.0398 data: 0.0008 max mem: 33300 Test: [15900/21770] eta: 0:03:51 time: 0.0397 data: 0.0008 max mem: 33300 Test: [16000/21770] eta: 0:03:47 time: 0.0397 data: 0.0008 max mem: 33300 Test: [16100/21770] eta: 0:03:43 time: 0.0393 data: 0.0008 max mem: 33300 Test: [16200/21770] eta: 0:03:40 time: 0.0393 data: 0.0008 max mem: 33300 Test: [16300/21770] eta: 0:03:36 time: 0.0395 data: 0.0009 max mem: 33300 Test: [16400/21770] eta: 0:03:32 time: 0.0395 data: 0.0008 max mem: 33300 Test: [16500/21770] eta: 0:03:28 time: 0.0393 data: 0.0008 max mem: 33300 Test: [16600/21770] eta: 0:03:24 time: 0.0390 data: 0.0008 max mem: 33300 Test: [16700/21770] eta: 0:03:20 time: 0.0390 data: 0.0008 max mem: 33300 Test: [16800/21770] eta: 0:03:16 time: 0.0389 data: 0.0008 max mem: 33300 Test: [16900/21770] eta: 0:03:12 time: 0.0391 data: 0.0008 max mem: 33300 Test: [17000/21770] eta: 0:03:08 time: 0.0390 data: 0.0008 max mem: 33300 Test: [17100/21770] eta: 0:03:04 time: 0.0390 data: 0.0008 max mem: 33300 Test: [17200/21770] eta: 0:03:00 time: 0.0390 data: 0.0008 max mem: 33300 Test: [17300/21770] eta: 0:02:56 time: 0.0389 data: 0.0008 max mem: 33300 Test: [17400/21770] eta: 0:02:52 time: 0.0386 data: 0.0008 max mem: 33300 Test: [17500/21770] eta: 0:02:48 time: 0.0384 data: 0.0008 max mem: 33300 Test: [17600/21770] eta: 0:02:44 time: 0.0385 data: 0.0008 max mem: 33300 Test: [17700/21770] eta: 0:02:40 time: 0.0384 data: 0.0008 max mem: 33300 Test: [17800/21770] eta: 0:02:36 time: 0.0380 data: 0.0008 max mem: 33300 Test: [17900/21770] eta: 0:02:32 time: 0.0383 data: 0.0009 max mem: 33300 Test: [18000/21770] eta: 0:02:28 time: 0.0383 data: 0.0008 max mem: 33300 Test: 
[18100/21770] eta: 0:02:24 time: 0.0383 data: 0.0008 max mem: 33300 Test: [18200/21770] eta: 0:02:20 time: 0.0384 data: 0.0008 max mem: 33300 Test: [18300/21770] eta: 0:02:16 time: 0.0384 data: 0.0008 max mem: 33300 Test: [18400/21770] eta: 0:02:12 time: 0.0384 data: 0.0008 max mem: 33300 Test: [18500/21770] eta: 0:02:08 time: 0.0384 data: 0.0008 max mem: 33300 Test: [18600/21770] eta: 0:02:04 time: 0.0384 data: 0.0008 max mem: 33300 Test: [18700/21770] eta: 0:02:00 time: 0.0383 data: 0.0008 max mem: 33300 Test: [18800/21770] eta: 0:01:56 time: 0.0383 data: 0.0008 max mem: 33300 Test: [18900/21770] eta: 0:01:53 time: 0.0383 data: 0.0008 max mem: 33300 Test: [19000/21770] eta: 0:01:49 time: 0.0383 data: 0.0008 max mem: 33300 Test: [19100/21770] eta: 0:01:45 time: 0.0383 data: 0.0008 max mem: 33300 Test: [19200/21770] eta: 0:01:41 time: 0.0383 data: 0.0008 max mem: 33300 Test: [19300/21770] eta: 0:01:37 time: 0.0383 data: 0.0008 max mem: 33300 Test: [19400/21770] eta: 0:01:33 time: 0.0384 data: 0.0008 max mem: 33300 Test: [19500/21770] eta: 0:01:29 time: 0.0387 data: 0.0008 max mem: 33300 Test: [19600/21770] eta: 0:01:25 time: 0.0388 data: 0.0008 max mem: 33300 Test: [19700/21770] eta: 0:01:21 time: 0.0390 data: 0.0008 max mem: 33300 Test: [19800/21770] eta: 0:01:17 time: 0.0391 data: 0.0008 max mem: 33300 Test: [19900/21770] eta: 0:01:13 time: 0.0390 data: 0.0008 max mem: 33300 Test: [20000/21770] eta: 0:01:09 time: 0.0392 data: 0.0008 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0388 data: 0.0008 max mem: 33300 Test: [20200/21770] eta: 0:01:01 time: 0.0392 data: 0.0008 max mem: 33300 Test: [20300/21770] eta: 0:00:57 time: 0.0389 data: 0.0008 max mem: 33300 Test: [20400/21770] eta: 0:00:53 time: 0.0389 data: 0.0008 max mem: 33300 Test: [20500/21770] eta: 0:00:49 time: 0.0392 data: 0.0008 max mem: 33300 Test: [20600/21770] eta: 0:00:46 time: 0.0393 data: 0.0008 max mem: 33300 Test: [20700/21770] eta: 0:00:42 time: 0.0392 data: 0.0008 max mem: 33300 Test: 
[20800/21770] eta: 0:00:38 time: 0.0393 data: 0.0008 max mem: 33300
Test: [20900/21770] eta: 0:00:34 time: 0.0389 data: 0.0008 max mem: 33300
Test: [21000/21770] eta: 0:00:30 time: 0.0395 data: 0.0008 max mem: 33300
Test: [21100/21770] eta: 0:00:26 time: 0.0394 data: 0.0008 max mem: 33300
Test: [21200/21770] eta: 0:00:22 time: 0.0398 data: 0.0008 max mem: 33300
Test: [21300/21770] eta: 0:00:18 time: 0.0394 data: 0.0008 max mem: 33300
Test: [21400/21770] eta: 0:00:14 time: 0.0400 data: 0.0008 max mem: 33300
Test: [21500/21770] eta: 0:00:10 time: 0.0395 data: 0.0008 max mem: 33300
Test: [21600/21770] eta: 0:00:06 time: 0.0395 data: 0.0008 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0393 data: 0.0008 max mem: 33300
Test: Total time: 0:14:16
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [18] [ 0/4276] eta: 6:01:32 lr: 2.9193910340949192e-05 loss: 0.1046 (0.1046) time: 5.0731 data: 1.9851 max mem: 33300
Epoch: [18] [ 10/4276] eta: 3:42:58 lr: 2.9191117274120423e-05 loss: 0.1226 (0.1229) time: 3.1360 data: 0.1866 max mem: 33300
Epoch: [18] [ 20/4276] eta: 3:35:25 lr: 2.91883241775973e-05 loss: 0.1234 (0.1334) time: 2.9351 data: 0.0068 max mem: 33300
Epoch: [18] [ 30/4276] eta: 3:32:19 lr: 2.9185531051376324e-05 loss: 0.1241 (0.1305) time: 2.9257 data: 0.0072 max mem: 33300
Epoch: [18] [ 40/4276] eta: 3:30:34 lr: 2.9182737895454042e-05 loss: 0.1241 (0.1291) time: 2.9255 data: 0.0077 max mem: 33300
Epoch: [18] [ 50/4276] eta: 3:29:14 lr: 2.9179944709826975e-05 loss: 0.1215 (0.1264) time: 2.9249 data: 0.0077 max mem: 33300
Epoch: [18] [ 60/4276] eta: 3:28:14 lr: 2.9177151494491646e-05 loss: 0.1102 (0.1264) time: 2.9246 data: 0.0077 max mem: 33300
Epoch: [18] [ 70/4276] eta: 3:27:20 lr: 2.917435824944457e-05 loss: 0.1242 (0.1255) time: 2.9245 data: 
0.0077 max mem: 33300 Epoch: [18] [ 80/4276] eta: 3:26:39 lr: 2.9171564974682285e-05 loss: 0.1239 (0.1259) time: 2.9291 data: 0.0075 max mem: 33300 Epoch: [18] [ 90/4276] eta: 3:25:55 lr: 2.9168771670201313e-05 loss: 0.1185 (0.1255) time: 2.9301 data: 0.0075 max mem: 33300 Epoch: [18] [ 100/4276] eta: 3:25:18 lr: 2.9165978335998168e-05 loss: 0.1199 (0.1277) time: 2.9292 data: 0.0077 max mem: 33300 Epoch: [18] [ 110/4276] eta: 3:24:42 lr: 2.9163184972069374e-05 loss: 0.1304 (0.1291) time: 2.9331 data: 0.0077 max mem: 33300 Epoch: [18] [ 120/4276] eta: 3:24:08 lr: 2.9160391578411445e-05 loss: 0.1214 (0.1281) time: 2.9338 data: 0.0075 max mem: 33300 Epoch: [18] [ 130/4276] eta: 3:23:35 lr: 2.9157598155020905e-05 loss: 0.1249 (0.1291) time: 2.9355 data: 0.0075 max mem: 33300 Epoch: [18] [ 140/4276] eta: 3:23:03 lr: 2.9154804701894274e-05 loss: 0.1249 (0.1286) time: 2.9365 data: 0.0077 max mem: 33300 Epoch: [18] [ 150/4276] eta: 3:22:32 lr: 2.9152011219028065e-05 loss: 0.1199 (0.1280) time: 2.9387 data: 0.0077 max mem: 33300 Epoch: [18] [ 160/4276] eta: 3:22:01 lr: 2.9149217706418795e-05 loss: 0.1204 (0.1281) time: 2.9393 data: 0.0075 max mem: 33300 Epoch: [18] [ 170/4276] eta: 3:21:31 lr: 2.9146424164062996e-05 loss: 0.1175 (0.1275) time: 2.9413 data: 0.0076 max mem: 33300 Epoch: [18] [ 180/4276] eta: 3:21:02 lr: 2.9143630591957166e-05 loss: 0.1175 (0.1279) time: 2.9442 data: 0.0078 max mem: 33300 Epoch: [18] [ 190/4276] eta: 3:20:32 lr: 2.9140836990097824e-05 loss: 0.1309 (0.1273) time: 2.9439 data: 0.0077 max mem: 33300 Epoch: [18] [ 200/4276] eta: 3:20:02 lr: 2.9138043358481483e-05 loss: 0.1130 (0.1276) time: 2.9434 data: 0.0075 max mem: 33300 Epoch: [18] [ 210/4276] eta: 3:19:32 lr: 2.9135249697104655e-05 loss: 0.1147 (0.1278) time: 2.9411 data: 0.0075 max mem: 33300 Epoch: [18] [ 220/4276] eta: 3:19:02 lr: 2.913245600596385e-05 loss: 0.1356 (0.1282) time: 2.9417 data: 0.0077 max mem: 33300 Epoch: [18] [ 230/4276] eta: 3:18:32 lr: 2.9129662285055585e-05 loss: 
0.1289 (0.1276) time: 2.9436 data: 0.0077 max mem: 33300 Epoch: [18] [ 240/4276] eta: 3:18:02 lr: 2.9126868534376372e-05 loss: 0.1229 (0.1274) time: 2.9409 data: 0.0075 max mem: 33300 Epoch: [18] [ 250/4276] eta: 3:17:32 lr: 2.9124074753922725e-05 loss: 0.1265 (0.1278) time: 2.9384 data: 0.0075 max mem: 33300 Epoch: [18] [ 260/4276] eta: 3:17:00 lr: 2.9121280943691137e-05 loss: 0.1268 (0.1281) time: 2.9352 data: 0.0079 max mem: 33300 Epoch: [18] [ 270/4276] eta: 3:16:29 lr: 2.9118487103678134e-05 loss: 0.1219 (0.1281) time: 2.9311 data: 0.0078 max mem: 33300 Epoch: [18] [ 280/4276] eta: 3:15:58 lr: 2.911569323388021e-05 loss: 0.1222 (0.1279) time: 2.9323 data: 0.0075 max mem: 33300 Epoch: [18] [ 290/4276] eta: 3:15:28 lr: 2.911289933429387e-05 loss: 0.1219 (0.1276) time: 2.9361 data: 0.0079 max mem: 33300 Epoch: [18] [ 300/4276] eta: 3:14:52 lr: 2.911010540491563e-05 loss: 0.1219 (0.1274) time: 2.9158 data: 0.0084 max mem: 33300 Epoch: [18] [ 310/4276] eta: 3:14:23 lr: 2.9107311445741997e-05 loss: 0.1288 (0.1274) time: 2.9161 data: 0.0084 max mem: 33300 Epoch: [18] [ 320/4276] eta: 3:13:53 lr: 2.9104517456769465e-05 loss: 0.1331 (0.1278) time: 2.9400 data: 0.0080 max mem: 33300 Epoch: [18] [ 330/4276] eta: 3:13:23 lr: 2.910172343799455e-05 loss: 0.1380 (0.1284) time: 2.9392 data: 0.0080 max mem: 33300 Epoch: [18] [ 340/4276] eta: 3:12:54 lr: 2.9098929389413737e-05 loss: 0.1256 (0.1283) time: 2.9362 data: 0.0080 max mem: 33300 Epoch: [18] [ 350/4276] eta: 3:12:24 lr: 2.909613531102355e-05 loss: 0.1106 (0.1283) time: 2.9371 data: 0.0080 max mem: 33300 Epoch: [18] [ 360/4276] eta: 3:11:52 lr: 2.9093341202820474e-05 loss: 0.1292 (0.1290) time: 2.9304 data: 0.0079 max mem: 33300 Epoch: [18] [ 370/4276] eta: 3:11:21 lr: 2.909054706480101e-05 loss: 0.1206 (0.1287) time: 2.9238 data: 0.0078 max mem: 33300 Epoch: [18] [ 380/4276] eta: 3:10:51 lr: 2.908775289696167e-05 loss: 0.1179 (0.1291) time: 2.9282 data: 0.0077 max mem: 33300 Epoch: [18] [ 390/4276] eta: 3:10:22 lr: 
2.9084958699298938e-05 loss: 0.1378 (0.1294) time: 2.9343 data: 0.0075 max mem: 33300 Epoch: [18] [ 400/4276] eta: 3:09:52 lr: 2.9082164471809324e-05 loss: 0.1378 (0.1297) time: 2.9346 data: 0.0076 max mem: 33300 Epoch: [18] [ 410/4276] eta: 3:09:21 lr: 2.907937021448932e-05 loss: 0.1291 (0.1297) time: 2.9277 data: 0.0077 max mem: 33300 Epoch: [18] [ 420/4276] eta: 3:08:47 lr: 2.9076575927335425e-05 loss: 0.1234 (0.1296) time: 2.9098 data: 0.0078 max mem: 33300 Epoch: [18] [ 430/4276] eta: 3:08:18 lr: 2.907378161034413e-05 loss: 0.1279 (0.1298) time: 2.9167 data: 0.0080 max mem: 33300 Epoch: [18] [ 440/4276] eta: 3:07:48 lr: 2.9070987263511927e-05 loss: 0.1241 (0.1294) time: 2.9364 data: 0.0081 max mem: 33300 Epoch: [18] [ 450/4276] eta: 3:07:19 lr: 2.9068192886835322e-05 loss: 0.1206 (0.1294) time: 2.9353 data: 0.0077 max mem: 33300 Epoch: [18] [ 460/4276] eta: 3:06:50 lr: 2.90653984803108e-05 loss: 0.1204 (0.1289) time: 2.9373 data: 0.0078 max mem: 33300 Epoch: [18] [ 470/4276] eta: 3:06:20 lr: 2.906260404393486e-05 loss: 0.0999 (0.1284) time: 2.9359 data: 0.0079 max mem: 33300 Epoch: [18] [ 480/4276] eta: 3:05:51 lr: 2.9059809577703996e-05 loss: 0.1006 (0.1282) time: 2.9376 data: 0.0080 max mem: 33300 Epoch: [18] [ 490/4276] eta: 3:05:22 lr: 2.905701508161468e-05 loss: 0.1012 (0.1276) time: 2.9408 data: 0.0079 max mem: 33300 Epoch: [18] [ 500/4276] eta: 3:04:52 lr: 2.9054220555663424e-05 loss: 0.1059 (0.1275) time: 2.9367 data: 0.0078 max mem: 33300 Epoch: [18] [ 510/4276] eta: 3:04:23 lr: 2.9051425999846703e-05 loss: 0.1104 (0.1272) time: 2.9378 data: 0.0082 max mem: 33300 Epoch: [18] [ 520/4276] eta: 3:03:53 lr: 2.9048631414161015e-05 loss: 0.1160 (0.1271) time: 2.9394 data: 0.0080 max mem: 33300 Epoch: [18] [ 530/4276] eta: 3:03:24 lr: 2.9045836798602843e-05 loss: 0.1160 (0.1270) time: 2.9389 data: 0.0076 max mem: 33300 Epoch: [18] [ 540/4276] eta: 3:02:55 lr: 2.9043042153168674e-05 loss: 0.1137 (0.1268) time: 2.9393 data: 0.0081 max mem: 33300 Epoch: [18] [ 
550/4276] eta: 3:02:26 lr: 2.904024747785499e-05 loss: 0.1203 (0.1268) time: 2.9417 data: 0.0087 max mem: 33300 Epoch: [18] [ 560/4276] eta: 3:01:57 lr: 2.9037452772658298e-05 loss: 0.1227 (0.1268) time: 2.9440 data: 0.0086 max mem: 33300 Epoch: [18] [ 570/4276] eta: 3:01:28 lr: 2.903465803757506e-05 loss: 0.1227 (0.1267) time: 2.9414 data: 0.0082 max mem: 33300 Epoch: [18] [ 580/4276] eta: 3:00:59 lr: 2.9031863272601767e-05 loss: 0.1185 (0.1266) time: 2.9405 data: 0.0083 max mem: 33300 Epoch: [18] [ 590/4276] eta: 3:00:29 lr: 2.90290684777349e-05 loss: 0.1056 (0.1262) time: 2.9399 data: 0.0087 max mem: 33300 Epoch: [18] [ 600/4276] eta: 3:00:00 lr: 2.9026273652970943e-05 loss: 0.1106 (0.1262) time: 2.9392 data: 0.0085 max mem: 33300 Epoch: [18] [ 610/4276] eta: 2:59:31 lr: 2.902347879830638e-05 loss: 0.1229 (0.1262) time: 2.9400 data: 0.0081 max mem: 33300 Epoch: [18] [ 620/4276] eta: 2:59:01 lr: 2.9020683913737685e-05 loss: 0.1184 (0.1262) time: 2.9392 data: 0.0083 max mem: 33300 Epoch: [18] [ 630/4276] eta: 2:58:32 lr: 2.901788899926135e-05 loss: 0.1245 (0.1263) time: 2.9405 data: 0.0087 max mem: 33300 Epoch: [18] [ 640/4276] eta: 2:58:03 lr: 2.9015094054873847e-05 loss: 0.1228 (0.1263) time: 2.9430 data: 0.0085 max mem: 33300 Epoch: [18] [ 650/4276] eta: 2:57:34 lr: 2.9012299080571648e-05 loss: 0.1151 (0.1264) time: 2.9414 data: 0.0081 max mem: 33300 Epoch: [18] [ 660/4276] eta: 2:57:04 lr: 2.9009504076351247e-05 loss: 0.1352 (0.1267) time: 2.9357 data: 0.0083 max mem: 33300 Epoch: [18] [ 670/4276] eta: 2:56:34 lr: 2.90067090422091e-05 loss: 0.1352 (0.1268) time: 2.9329 data: 0.0086 max mem: 33300 Epoch: [18] [ 680/4276] eta: 2:56:04 lr: 2.9003913978141702e-05 loss: 0.1262 (0.1267) time: 2.9269 data: 0.0082 max mem: 33300 Epoch: [18] [ 690/4276] eta: 2:55:34 lr: 2.9001118884145518e-05 loss: 0.1283 (0.1269) time: 2.9253 data: 0.0075 max mem: 33300 Epoch: [18] [ 700/4276] eta: 2:55:05 lr: 2.8998323760217023e-05 loss: 0.1321 (0.1270) time: 2.9360 data: 0.0075 max 
mem: 33300 Epoch: [18] [ 710/4276] eta: 2:54:35 lr: 2.8995528606352696e-05 loss: 0.1270 (0.1270) time: 2.9370 data: 0.0075 max mem: 33300 Epoch: [18] [ 720/4276] eta: 2:54:06 lr: 2.8992733422549e-05 loss: 0.1105 (0.1269) time: 2.9341 data: 0.0073 max mem: 33300 Epoch: [18] [ 730/4276] eta: 2:53:36 lr: 2.8989938208802426e-05 loss: 0.1022 (0.1268) time: 2.9297 data: 0.0074 max mem: 33300 Epoch: [18] [ 740/4276] eta: 2:53:07 lr: 2.8987142965109422e-05 loss: 0.1081 (0.1266) time: 2.9326 data: 0.0076 max mem: 33300 Epoch: [18] [ 750/4276] eta: 2:52:37 lr: 2.8984347691466468e-05 loss: 0.1172 (0.1266) time: 2.9319 data: 0.0076 max mem: 33300 Epoch: [18] [ 760/4276] eta: 2:52:07 lr: 2.8981552387870036e-05 loss: 0.1108 (0.1265) time: 2.9224 data: 0.0074 max mem: 33300 Epoch: [18] [ 770/4276] eta: 2:51:37 lr: 2.8978757054316598e-05 loss: 0.1129 (0.1265) time: 2.9247 data: 0.0075 max mem: 33300 Epoch: [18] [ 780/4276] eta: 2:51:07 lr: 2.897596169080261e-05 loss: 0.1190 (0.1266) time: 2.9269 data: 0.0076 max mem: 33300 Epoch: [18] [ 790/4276] eta: 2:50:37 lr: 2.8973166297324563e-05 loss: 0.1327 (0.1266) time: 2.9288 data: 0.0076 max mem: 33300 Epoch: [18] [ 800/4276] eta: 2:50:08 lr: 2.8970370873878895e-05 loss: 0.1242 (0.1266) time: 2.9344 data: 0.0074 max mem: 33300 Epoch: [18] [ 810/4276] eta: 2:49:39 lr: 2.8967575420462088e-05 loss: 0.1242 (0.1267) time: 2.9439 data: 0.0074 max mem: 33300 Epoch: [18] [ 820/4276] eta: 2:49:10 lr: 2.89647799370706e-05 loss: 0.1195 (0.1265) time: 2.9443 data: 0.0077 max mem: 33300 Epoch: [18] [ 830/4276] eta: 2:48:41 lr: 2.89619844237009e-05 loss: 0.1189 (0.1266) time: 2.9383 data: 0.0079 max mem: 33300 Epoch: [18] [ 840/4276] eta: 2:48:11 lr: 2.8959188880349453e-05 loss: 0.1194 (0.1266) time: 2.9365 data: 0.0076 max mem: 33300 Epoch: [18] [ 850/4276] eta: 2:47:42 lr: 2.8956393307012713e-05 loss: 0.1167 (0.1265) time: 2.9347 data: 0.0073 max mem: 33300 Epoch: [18] [ 860/4276] eta: 2:47:12 lr: 2.8953597703687148e-05 loss: 0.1101 (0.1265) time: 
2.9338 data: 0.0075 max mem: 33300 Epoch: [18] [ 870/4276] eta: 2:46:43 lr: 2.8950802070369227e-05 loss: 0.1101 (0.1264) time: 2.9329 data: 0.0076 max mem: 33300 Epoch: [18] [ 880/4276] eta: 2:46:13 lr: 2.8948006407055396e-05 loss: 0.1182 (0.1265) time: 2.9303 data: 0.0073 max mem: 33300 Epoch: [18] [ 890/4276] eta: 2:45:43 lr: 2.8945210713742117e-05 loss: 0.1338 (0.1267) time: 2.9257 data: 0.0075 max mem: 33300 Epoch: [18] [ 900/4276] eta: 2:45:13 lr: 2.8942414990425855e-05 loss: 0.1294 (0.1266) time: 2.9283 data: 0.0078 max mem: 33300 Epoch: [18] [ 910/4276] eta: 2:44:44 lr: 2.8939619237103054e-05 loss: 0.1234 (0.1267) time: 2.9383 data: 0.0076 max mem: 33300 Epoch: [18] [ 920/4276] eta: 2:44:15 lr: 2.893682345377019e-05 loss: 0.1237 (0.1267) time: 2.9379 data: 0.0073 max mem: 33300 Epoch: [18] [ 930/4276] eta: 2:43:45 lr: 2.8934027640423707e-05 loss: 0.1355 (0.1268) time: 2.9335 data: 0.0074 max mem: 33300 Epoch: [18] [ 940/4276] eta: 2:43:16 lr: 2.8931231797060067e-05 loss: 0.1141 (0.1267) time: 2.9398 data: 0.0076 max mem: 33300 Epoch: [18] [ 950/4276] eta: 2:42:47 lr: 2.892843592367572e-05 loss: 0.1156 (0.1269) time: 2.9390 data: 0.0078 max mem: 33300 Epoch: [18] [ 960/4276] eta: 2:42:17 lr: 2.8925640020267116e-05 loss: 0.1324 (0.1270) time: 2.9207 data: 0.0086 max mem: 33300 Epoch: [18] [ 970/4276] eta: 2:41:45 lr: 2.892284408683072e-05 loss: 0.1324 (0.1270) time: 2.8976 data: 0.0090 max mem: 33300 Epoch: [18] [ 980/4276] eta: 2:41:14 lr: 2.8920048123362968e-05 loss: 0.1234 (0.1269) time: 2.8842 data: 0.0088 max mem: 33300 Epoch: [18] [ 990/4276] eta: 2:40:43 lr: 2.891725212986033e-05 loss: 0.1168 (0.1268) time: 2.8870 data: 0.0087 max mem: 33300 Epoch: [18] [1000/4276] eta: 2:40:14 lr: 2.891445610631924e-05 loss: 0.1105 (0.1268) time: 2.9166 data: 0.0085 max mem: 33300 Epoch: [18] [1010/4276] eta: 2:39:45 lr: 2.891166005273615e-05 loss: 0.1130 (0.1267) time: 2.9382 data: 0.0078 max mem: 33300 Epoch: [18] [1020/4276] eta: 2:39:16 lr: 2.8908863969107524e-05 
loss: 0.1155 (0.1267) time: 2.9361 data: 0.0075 max mem: 33300 Epoch: [18] [1030/4276] eta: 2:38:46 lr: 2.8906067855429792e-05 loss: 0.1196 (0.1268) time: 2.9344 data: 0.0074 max mem: 33300 Epoch: [18] [1040/4276] eta: 2:38:16 lr: 2.8903271711699416e-05 loss: 0.1262 (0.1267) time: 2.9244 data: 0.0073 max mem: 33300 Epoch: [18] [1050/4276] eta: 2:37:46 lr: 2.890047553791283e-05 loss: 0.1172 (0.1268) time: 2.9077 data: 0.0082 max mem: 33300 Epoch: [18] [1060/4276] eta: 2:37:15 lr: 2.8897679334066486e-05 loss: 0.1322 (0.1269) time: 2.8867 data: 0.0092 max mem: 33300 Epoch: [18] [1070/4276] eta: 2:36:43 lr: 2.889488310015683e-05 loss: 0.1521 (0.1271) time: 2.8685 data: 0.0094 max mem: 33300 Epoch: [18] [1080/4276] eta: 2:36:14 lr: 2.88920868361803e-05 loss: 0.1325 (0.1271) time: 2.8965 data: 0.0087 max mem: 33300 Epoch: [18] [1090/4276] eta: 2:35:44 lr: 2.8889290542133342e-05 loss: 0.1303 (0.1272) time: 2.9214 data: 0.0076 max mem: 33300 Epoch: [18] [1100/4276] eta: 2:35:14 lr: 2.8886494218012412e-05 loss: 0.1307 (0.1272) time: 2.9135 data: 0.0075 max mem: 33300 Epoch: [18] [1110/4276] eta: 2:34:44 lr: 2.888369786381393e-05 loss: 0.1308 (0.1273) time: 2.9027 data: 0.0080 max mem: 33300 Epoch: [18] [1120/4276] eta: 2:34:13 lr: 2.888090147953435e-05 loss: 0.1353 (0.1274) time: 2.8892 data: 0.0092 max mem: 33300 Epoch: [18] [1130/4276] eta: 2:33:43 lr: 2.887810506517011e-05 loss: 0.1128 (0.1272) time: 2.8912 data: 0.0090 max mem: 33300 Epoch: [18] [1140/4276] eta: 2:33:13 lr: 2.8875308620717646e-05 loss: 0.1105 (0.1271) time: 2.9045 data: 0.0078 max mem: 33300 Epoch: [18] [1150/4276] eta: 2:32:43 lr: 2.88725121461734e-05 loss: 0.1122 (0.1270) time: 2.9117 data: 0.0075 max mem: 33300 Epoch: [18] [1160/4276] eta: 2:32:13 lr: 2.88697156415338e-05 loss: 0.1194 (0.1270) time: 2.9116 data: 0.0076 max mem: 33300 Epoch: [18] [1170/4276] eta: 2:31:43 lr: 2.8866919106795298e-05 loss: 0.1262 (0.1269) time: 2.9139 data: 0.0074 max mem: 33300 Epoch: [18] [1180/4276] eta: 2:31:14 lr: 
2.886412254195433e-05 loss: 0.1268 (0.1269) time: 2.9137 data: 0.0074 max mem: 33300 Epoch: [18] [1190/4276] eta: 2:30:44 lr: 2.8861325947007317e-05 loss: 0.1072 (0.1268) time: 2.9139 data: 0.0074 max mem: 33300 Epoch: [18] [1200/4276] eta: 2:30:14 lr: 2.8858529321950712e-05 loss: 0.1155 (0.1267) time: 2.9147 data: 0.0073 max mem: 33300 Epoch: [18] [1210/4276] eta: 2:29:45 lr: 2.8855732666780927e-05 loss: 0.1113 (0.1266) time: 2.9174 data: 0.0073 max mem: 33300 Epoch: [18] [1220/4276] eta: 2:29:15 lr: 2.8852935981494405e-05 loss: 0.1181 (0.1268) time: 2.9177 data: 0.0075 max mem: 33300 Epoch: [18] [1230/4276] eta: 2:28:45 lr: 2.8850139266087584e-05 loss: 0.1329 (0.1268) time: 2.9168 data: 0.0075 max mem: 33300 Epoch: [18] [1240/4276] eta: 2:28:15 lr: 2.8847342520556887e-05 loss: 0.1267 (0.1269) time: 2.9135 data: 0.0074 max mem: 33300 Epoch: [18] [1250/4276] eta: 2:27:46 lr: 2.884454574489875e-05 loss: 0.1248 (0.1269) time: 2.9073 data: 0.0076 max mem: 33300 Epoch: [18] [1260/4276] eta: 2:27:16 lr: 2.8841748939109603e-05 loss: 0.1136 (0.1268) time: 2.9075 data: 0.0075 max mem: 33300 Epoch: [18] [1270/4276] eta: 2:26:46 lr: 2.8838952103185867e-05 loss: 0.1152 (0.1267) time: 2.9120 data: 0.0072 max mem: 33300 Epoch: [18] [1280/4276] eta: 2:26:16 lr: 2.883615523712398e-05 loss: 0.1250 (0.1267) time: 2.9147 data: 0.0069 max mem: 33300 Epoch: [18] [1290/4276] eta: 2:25:47 lr: 2.8833358340920362e-05 loss: 0.1250 (0.1268) time: 2.9170 data: 0.0069 max mem: 33300 Epoch: [18] [1300/4276] eta: 2:25:17 lr: 2.8830561414571438e-05 loss: 0.1077 (0.1267) time: 2.9169 data: 0.0069 max mem: 33300 Epoch: [18] [1310/4276] eta: 2:24:48 lr: 2.8827764458073643e-05 loss: 0.1077 (0.1267) time: 2.9129 data: 0.0070 max mem: 33300 Epoch: [18] [1320/4276] eta: 2:24:18 lr: 2.8824967471423392e-05 loss: 0.1170 (0.1268) time: 2.9190 data: 0.0072 max mem: 33300 Epoch: [18] [1330/4276] eta: 2:23:49 lr: 2.882217045461712e-05 loss: 0.1240 (0.1268) time: 2.9326 data: 0.0071 max mem: 33300 Epoch: [18] 
[1340/4276] eta: 2:23:20 lr: 2.8819373407651235e-05 loss: 0.1181 (0.1268) time: 2.9358 data: 0.0069 max mem: 33300 Epoch: [18] [1350/4276] eta: 2:22:51 lr: 2.8816576330522175e-05 loss: 0.1221 (0.1268) time: 2.9340 data: 0.0069 max mem: 33300 Epoch: [18] [1360/4276] eta: 2:22:21 lr: 2.881377922322635e-05 loss: 0.1231 (0.1267) time: 2.9306 data: 0.0069 max mem: 33300 Epoch: [18] [1370/4276] eta: 2:21:52 lr: 2.8810982085760185e-05 loss: 0.1108 (0.1267) time: 2.9279 data: 0.0070 max mem: 33300 Epoch: [18] [1380/4276] eta: 2:21:23 lr: 2.8808184918120102e-05 loss: 0.1202 (0.1267) time: 2.9267 data: 0.0070 max mem: 33300 Epoch: [18] [1390/4276] eta: 2:20:53 lr: 2.8805387720302512e-05 loss: 0.1318 (0.1268) time: 2.9261 data: 0.0070 max mem: 33300 Epoch: [18] [1400/4276] eta: 2:20:24 lr: 2.8802590492303843e-05 loss: 0.1288 (0.1268) time: 2.9276 data: 0.0069 max mem: 33300 Epoch: [18] [1410/4276] eta: 2:19:55 lr: 2.8799793234120516e-05 loss: 0.1108 (0.1267) time: 2.9306 data: 0.0071 max mem: 33300 Epoch: [18] [1420/4276] eta: 2:19:26 lr: 2.8796995945748934e-05 loss: 0.1108 (0.1267) time: 2.9358 data: 0.0072 max mem: 33300 Epoch: [18] [1430/4276] eta: 2:18:56 lr: 2.8794198627185526e-05 loss: 0.1197 (0.1267) time: 2.9307 data: 0.0070 max mem: 33300 Epoch: [18] [1440/4276] eta: 2:18:27 lr: 2.8791401278426695e-05 loss: 0.1233 (0.1267) time: 2.9240 data: 0.0071 max mem: 33300 Epoch: [18] [1450/4276] eta: 2:17:57 lr: 2.8788603899468856e-05 loss: 0.1146 (0.1266) time: 2.9201 data: 0.0078 max mem: 33300 Epoch: [18] [1460/4276] eta: 2:17:28 lr: 2.878580649030843e-05 loss: 0.1146 (0.1266) time: 2.9252 data: 0.0078 max mem: 33300 Epoch: [18] [1470/4276] eta: 2:16:59 lr: 2.878300905094183e-05 loss: 0.1173 (0.1265) time: 2.9298 data: 0.0074 max mem: 33300 Epoch: [18] [1480/4276] eta: 2:16:29 lr: 2.8780211581365464e-05 loss: 0.1101 (0.1265) time: 2.9250 data: 0.0075 max mem: 33300 Epoch: [18] [1490/4276] eta: 2:16:00 lr: 2.8777414081575742e-05 loss: 0.1089 (0.1265) time: 2.9247 data: 
0.0077 max mem: 33300 Epoch: [18] [1500/4276] eta: 2:15:31 lr: 2.877461655156908e-05 loss: 0.1089 (0.1265) time: 2.9252 data: 0.0075 max mem: 33300 Epoch: [18] [1510/4276] eta: 2:15:01 lr: 2.877181899134188e-05 loss: 0.1080 (0.1264) time: 2.9281 data: 0.0075 max mem: 33300 Epoch: [18] [1520/4276] eta: 2:14:32 lr: 2.8769021400890556e-05 loss: 0.1070 (0.1263) time: 2.9297 data: 0.0075 max mem: 33300 Epoch: [18] [1530/4276] eta: 2:14:03 lr: 2.876622378021151e-05 loss: 0.1107 (0.1262) time: 2.9295 data: 0.0075 max mem: 33300 Epoch: [18] [1540/4276] eta: 2:13:34 lr: 2.8763426129301152e-05 loss: 0.1198 (0.1262) time: 2.9280 data: 0.0075 max mem: 33300 Epoch: [18] [1550/4276] eta: 2:13:04 lr: 2.876062844815589e-05 loss: 0.1282 (0.1262) time: 2.9250 data: 0.0073 max mem: 33300 Epoch: [18] [1560/4276] eta: 2:12:35 lr: 2.875783073677213e-05 loss: 0.1304 (0.1263) time: 2.9166 data: 0.0075 max mem: 33300 Epoch: [18] [1570/4276] eta: 2:12:05 lr: 2.8755032995146276e-05 loss: 0.1233 (0.1263) time: 2.9175 data: 0.0077 max mem: 33300 Epoch: [18] [1580/4276] eta: 2:11:36 lr: 2.8752235223274725e-05 loss: 0.1177 (0.1262) time: 2.9334 data: 0.0075 max mem: 33300 Epoch: [18] [1590/4276] eta: 2:11:07 lr: 2.874943742115389e-05 loss: 0.1297 (0.1262) time: 2.9391 data: 0.0073 max mem: 33300 Epoch: [18] [1600/4276] eta: 2:10:38 lr: 2.8746639588780168e-05 loss: 0.1224 (0.1263) time: 2.9376 data: 0.0073 max mem: 33300 Epoch: [18] [1610/4276] eta: 2:10:08 lr: 2.874384172614995e-05 loss: 0.1207 (0.1262) time: 2.9309 data: 0.0075 max mem: 33300 Epoch: [18] [1620/4276] eta: 2:09:39 lr: 2.8741043833259652e-05 loss: 0.1094 (0.1261) time: 2.9170 data: 0.0074 max mem: 33300 Epoch: [18] [1630/4276] eta: 2:09:09 lr: 2.8738245910105672e-05 loss: 0.1268 (0.1262) time: 2.9125 data: 0.0072 max mem: 33300 Epoch: [18] [1640/4276] eta: 2:08:40 lr: 2.8735447956684403e-05 loss: 0.1101 (0.1260) time: 2.9235 data: 0.0072 max mem: 33300 Epoch: [18] [1650/4276] eta: 2:08:11 lr: 2.8732649972992243e-05 loss: 0.0978 
(0.1259) time: 2.9358 data: 0.0074 max mem: 33300 Epoch: [18] [1660/4276] eta: 2:07:42 lr: 2.8729851959025594e-05 loss: 0.1067 (0.1258) time: 2.9361 data: 0.0075 max mem: 33300 Epoch: [18] [1670/4276] eta: 2:07:13 lr: 2.8727053914780848e-05 loss: 0.1178 (0.1258) time: 2.9405 data: 0.0073 max mem: 33300 Epoch: [18] [1680/4276] eta: 2:06:44 lr: 2.87242558402544e-05 loss: 0.1296 (0.1258) time: 2.9431 data: 0.0075 max mem: 33300 Epoch: [18] [1690/4276] eta: 2:06:14 lr: 2.8721457735442647e-05 loss: 0.1277 (0.1258) time: 2.9319 data: 0.0077 max mem: 33300 Epoch: [18] [1700/4276] eta: 2:05:45 lr: 2.871865960034198e-05 loss: 0.1261 (0.1258) time: 2.9139 data: 0.0084 max mem: 33300 Epoch: [18] [1710/4276] eta: 2:05:14 lr: 2.8715861434948793e-05 loss: 0.1235 (0.1258) time: 2.8911 data: 0.0085 max mem: 33300 Epoch: [18] [1720/4276] eta: 2:04:44 lr: 2.8713063239259493e-05 loss: 0.1205 (0.1258) time: 2.8782 data: 0.0078 max mem: 33300 Epoch: [18] [1730/4276] eta: 2:04:15 lr: 2.8710265013270444e-05 loss: 0.1117 (0.1258) time: 2.8895 data: 0.0081 max mem: 33300 Epoch: [18] [1740/4276] eta: 2:03:45 lr: 2.8707466756978057e-05 loss: 0.1273 (0.1258) time: 2.9159 data: 0.0079 max mem: 33300 Epoch: [18] [1750/4276] eta: 2:03:16 lr: 2.870466847037871e-05 loss: 0.1286 (0.1258) time: 2.9298 data: 0.0075 max mem: 33300 Epoch: [18] [1760/4276] eta: 2:02:47 lr: 2.8701870153468796e-05 loss: 0.1137 (0.1257) time: 2.9306 data: 0.0074 max mem: 33300 Epoch: [18] [1770/4276] eta: 2:02:18 lr: 2.869907180624471e-05 loss: 0.1094 (0.1257) time: 2.9317 data: 0.0072 max mem: 33300 Epoch: [18] [1780/4276] eta: 2:01:48 lr: 2.869627342870283e-05 loss: 0.1122 (0.1256) time: 2.9296 data: 0.0071 max mem: 33300 Epoch: [18] [1790/4276] eta: 2:01:19 lr: 2.8693475020839545e-05 loss: 0.1190 (0.1256) time: 2.9294 data: 0.0073 max mem: 33300 Epoch: [18] [1800/4276] eta: 2:00:50 lr: 2.8690676582651244e-05 loss: 0.1190 (0.1256) time: 2.9336 data: 0.0073 max mem: 33300 Epoch: [18] [1810/4276] eta: 2:00:21 lr: 
2.868787811413431e-05 loss: 0.1243 (0.1256) time: 2.9350 data: 0.0071 max mem: 33300 Epoch: [18] [1820/4276] eta: 1:59:52 lr: 2.8685079615285126e-05 loss: 0.1339 (0.1257) time: 2.9341 data: 0.0072 max mem: 33300 Epoch: [18] [1830/4276] eta: 1:59:22 lr: 2.8682281086100075e-05 loss: 0.1257 (0.1257) time: 2.9356 data: 0.0073 max mem: 33300 Epoch: [18] [1840/4276] eta: 1:58:53 lr: 2.867948252657554e-05 loss: 0.1141 (0.1256) time: 2.9372 data: 0.0073 max mem: 33300 Epoch: [18] [1850/4276] eta: 1:58:24 lr: 2.8676683936707898e-05 loss: 0.1188 (0.1256) time: 2.9368 data: 0.0071 max mem: 33300 Epoch: [18] [1860/4276] eta: 1:57:55 lr: 2.8673885316493538e-05 loss: 0.1156 (0.1255) time: 2.9369 data: 0.0071 max mem: 33300 Epoch: [18] [1870/4276] eta: 1:57:26 lr: 2.8671086665928832e-05 loss: 0.1167 (0.1256) time: 2.9301 data: 0.0072 max mem: 33300 Epoch: [18] [1880/4276] eta: 1:56:56 lr: 2.866828798501017e-05 loss: 0.1237 (0.1256) time: 2.9265 data: 0.0078 max mem: 33300 Epoch: [18] [1890/4276] eta: 1:56:27 lr: 2.8665489273733918e-05 loss: 0.1176 (0.1255) time: 2.9296 data: 0.0077 max mem: 33300 Epoch: [18] [1900/4276] eta: 1:55:58 lr: 2.8662690532096463e-05 loss: 0.1036 (0.1255) time: 2.9316 data: 0.0073 max mem: 33300 Epoch: [18] [1910/4276] eta: 1:55:29 lr: 2.8659891760094175e-05 loss: 0.1093 (0.1255) time: 2.9310 data: 0.0072 max mem: 33300 Epoch: [18] [1920/4276] eta: 1:54:59 lr: 2.865709295772343e-05 loss: 0.1187 (0.1254) time: 2.9295 data: 0.0070 max mem: 33300 Epoch: [18] [1930/4276] eta: 1:54:30 lr: 2.8654294124980607e-05 loss: 0.1088 (0.1254) time: 2.9309 data: 0.0071 max mem: 33300 Epoch: [18] [1940/4276] eta: 1:54:01 lr: 2.865149526186207e-05 loss: 0.1155 (0.1254) time: 2.9313 data: 0.0072 max mem: 33300 Epoch: [18] [1950/4276] eta: 1:53:32 lr: 2.8648696368364215e-05 loss: 0.1247 (0.1254) time: 2.9349 data: 0.0072 max mem: 33300 Epoch: [18] [1960/4276] eta: 1:53:02 lr: 2.864589744448339e-05 loss: 0.1114 (0.1253) time: 2.9357 data: 0.0071 max mem: 33300 Epoch: [18] 
[1970/4276] eta: 1:52:33 lr: 2.864309849021598e-05 loss: 0.1052 (0.1253) time: 2.9343 data: 0.0071 max mem: 33300 Epoch: [18] [1980/4276] eta: 1:52:04 lr: 2.8640299505558348e-05 loss: 0.1052 (0.1252) time: 2.9347 data: 0.0071 max mem: 33300 Epoch: [18] [1990/4276] eta: 1:51:35 lr: 2.8637500490506868e-05 loss: 0.1067 (0.1252) time: 2.9327 data: 0.0070 max mem: 33300 Epoch: [18] [2000/4276] eta: 1:51:05 lr: 2.863470144505791e-05 loss: 0.1212 (0.1252) time: 2.9309 data: 0.0071 max mem: 33300 Epoch: [18] [2010/4276] eta: 1:50:36 lr: 2.8631902369207843e-05 loss: 0.1279 (0.1252) time: 2.9314 data: 0.0071 max mem: 33300 Epoch: [18] [2020/4276] eta: 1:50:07 lr: 2.8629103262953032e-05 loss: 0.1205 (0.1252) time: 2.9332 data: 0.0072 max mem: 33300 Epoch: [18] [2030/4276] eta: 1:49:38 lr: 2.862630412628985e-05 loss: 0.1096 (0.1251) time: 2.9333 data: 0.0072 max mem: 33300 Epoch: [18] [2040/4276] eta: 1:49:08 lr: 2.8623504959214647e-05 loss: 0.1078 (0.1251) time: 2.9319 data: 0.0070 max mem: 33300 Epoch: [18] [2050/4276] eta: 1:48:39 lr: 2.862070576172381e-05 loss: 0.1232 (0.1251) time: 2.9311 data: 0.0071 max mem: 33300 Epoch: [18] [2060/4276] eta: 1:48:10 lr: 2.861790653381368e-05 loss: 0.1259 (0.1251) time: 2.9318 data: 0.0073 max mem: 33300 Epoch: [18] [2070/4276] eta: 1:47:40 lr: 2.8615107275480636e-05 loss: 0.1105 (0.1251) time: 2.9237 data: 0.0075 max mem: 33300 Epoch: [18] [2080/4276] eta: 1:47:11 lr: 2.8612307986721036e-05 loss: 0.1105 (0.1251) time: 2.9004 data: 0.0084 max mem: 33300 Epoch: [18] [2090/4276] eta: 1:46:41 lr: 2.8609508667531243e-05 loss: 0.1263 (0.1251) time: 2.9080 data: 0.0087 max mem: 33300 Epoch: [18] [2100/4276] eta: 1:46:12 lr: 2.8606709317907616e-05 loss: 0.1285 (0.1251) time: 2.9216 data: 0.0083 max mem: 33300 Epoch: [18] [2110/4276] eta: 1:45:43 lr: 2.8603909937846518e-05 loss: 0.1173 (0.1250) time: 2.9209 data: 0.0081 max mem: 33300 Epoch: [18] [2120/4276] eta: 1:45:13 lr: 2.8601110527344306e-05 loss: 0.1069 (0.1249) time: 2.9283 data: 0.0075 
max mem: 33300 Epoch: [18] [2130/4276] eta: 1:44:44 lr: 2.859831108639734e-05 loss: 0.0954 (0.1248) time: 2.9304 data: 0.0075 max mem: 33300 Epoch: [18] [2140/4276] eta: 1:44:15 lr: 2.8595511615001967e-05 loss: 0.1136 (0.1248) time: 2.9296 data: 0.0076 max mem: 33300 Epoch: [18] [2150/4276] eta: 1:43:46 lr: 2.8592712113154556e-05 loss: 0.1064 (0.1247) time: 2.9261 data: 0.0074 max mem: 33300 Epoch: [18] [2160/4276] eta: 1:43:16 lr: 2.8589912580851463e-05 loss: 0.1064 (0.1247) time: 2.9256 data: 0.0073 max mem: 33300 Epoch: [18] [2170/4276] eta: 1:42:47 lr: 2.8587113018089034e-05 loss: 0.1147 (0.1247) time: 2.9277 data: 0.0075 max mem: 33300 Epoch: [18] [2180/4276] eta: 1:42:18 lr: 2.8584313424863634e-05 loss: 0.1201 (0.1246) time: 2.9298 data: 0.0077 max mem: 33300 Epoch: [18] [2190/4276] eta: 1:41:48 lr: 2.858151380117161e-05 loss: 0.1279 (0.1247) time: 2.9310 data: 0.0076 max mem: 33300 Epoch: [18] [2200/4276] eta: 1:41:19 lr: 2.8578714147009317e-05 loss: 0.1279 (0.1247) time: 2.9308 data: 0.0076 max mem: 33300 Epoch: [18] [2210/4276] eta: 1:40:50 lr: 2.8575914462373104e-05 loss: 0.1268 (0.1247) time: 2.9309 data: 0.0078 max mem: 33300 Epoch: [18] [2220/4276] eta: 1:40:20 lr: 2.857311474725932e-05 loss: 0.1261 (0.1248) time: 2.9133 data: 0.0081 max mem: 33300 Epoch: [18] [2230/4276] eta: 1:39:51 lr: 2.8570315001664316e-05 loss: 0.1190 (0.1248) time: 2.9027 data: 0.0082 max mem: 33300 Epoch: [18] [2240/4276] eta: 1:39:21 lr: 2.8567515225584445e-05 loss: 0.1153 (0.1247) time: 2.9097 data: 0.0076 max mem: 33300 Epoch: [18] [2250/4276] eta: 1:38:52 lr: 2.8564715419016058e-05 loss: 0.1202 (0.1247) time: 2.9083 data: 0.0075 max mem: 33300 Epoch: [18] [2260/4276] eta: 1:38:23 lr: 2.8561915581955494e-05 loss: 0.1267 (0.1248) time: 2.9147 data: 0.0078 max mem: 33300 Epoch: [18] [2270/4276] eta: 1:37:53 lr: 2.8559115714399104e-05 loss: 0.1267 (0.1248) time: 2.9260 data: 0.0076 max mem: 33300 Epoch: [18] [2280/4276] eta: 1:37:24 lr: 2.8556315816343237e-05 loss: 0.1217 
(0.1248) time: 2.9251 data: 0.0073 max mem: 33300 Epoch: [18] [2290/4276] eta: 1:36:55 lr: 2.8553515887784232e-05 loss: 0.1254 (0.1248) time: 2.9197 data: 0.0075 max mem: 33300 Epoch: [18] [2300/4276] eta: 1:36:25 lr: 2.8550715928718435e-05 loss: 0.1237 (0.1248) time: 2.9194 data: 0.0077 max mem: 33300 Epoch: [18] [2310/4276] eta: 1:35:56 lr: 2.854791593914219e-05 loss: 0.1266 (0.1249) time: 2.9199 data: 0.0075 max mem: 33300 Epoch: [18] [2320/4276] eta: 1:35:27 lr: 2.8545115919051835e-05 loss: 0.1304 (0.1249) time: 2.9276 data: 0.0072 max mem: 33300 Epoch: [18] [2330/4276] eta: 1:34:57 lr: 2.8542315868443718e-05 loss: 0.1310 (0.1249) time: 2.9333 data: 0.0075 max mem: 33300 Epoch: [18] [2340/4276] eta: 1:34:28 lr: 2.8539515787314185e-05 loss: 0.1323 (0.1250) time: 2.9308 data: 0.0078 max mem: 33300 Epoch: [18] [2350/4276] eta: 1:33:59 lr: 2.8536715675659566e-05 loss: 0.1250 (0.1250) time: 2.9303 data: 0.0075 max mem: 33300 Epoch: [18] [2360/4276] eta: 1:33:30 lr: 2.8533915533476206e-05 loss: 0.1171 (0.1249) time: 2.9316 data: 0.0073 max mem: 33300 Epoch: [18] [2370/4276] eta: 1:33:00 lr: 2.8531115360760436e-05 loss: 0.1226 (0.1250) time: 2.9304 data: 0.0075 max mem: 33300 Epoch: [18] [2380/4276] eta: 1:32:31 lr: 2.8528315157508596e-05 loss: 0.1261 (0.1250) time: 2.9305 data: 0.0077 max mem: 33300 Epoch: [18] [2390/4276] eta: 1:32:02 lr: 2.8525514923717023e-05 loss: 0.1216 (0.1250) time: 2.9294 data: 0.0073 max mem: 33300 Epoch: [18] [2400/4276] eta: 1:31:33 lr: 2.8522714659382056e-05 loss: 0.1225 (0.1250) time: 2.9293 data: 0.0070 max mem: 33300 Epoch: [18] [2410/4276] eta: 1:31:03 lr: 2.851991436450003e-05 loss: 0.1216 (0.1250) time: 2.9290 data: 0.0071 max mem: 33300 Epoch: [18] [2420/4276] eta: 1:30:34 lr: 2.8517114039067277e-05 loss: 0.1118 (0.1250) time: 2.9249 data: 0.0073 max mem: 33300 Epoch: [18] [2430/4276] eta: 1:30:05 lr: 2.851431368308013e-05 loss: 0.1250 (0.1250) time: 2.9284 data: 0.0073 max mem: 33300 Epoch: [18] [2440/4276] eta: 1:29:36 lr: 
2.8511513296534924e-05 loss: 0.1250 (0.1250) time: 2.9336 data: 0.0071 max mem: 33300 Epoch: [18] [2450/4276] eta: 1:29:06 lr: 2.850871287942799e-05 loss: 0.1196 (0.1250) time: 2.9300 data: 0.0072 max mem: 33300 Epoch: [18] [2460/4276] eta: 1:28:37 lr: 2.8505912431755655e-05 loss: 0.1246 (0.1251) time: 2.9261 data: 0.0074 max mem: 33300 Epoch: [18] [2470/4276] eta: 1:28:08 lr: 2.8503111953514243e-05 loss: 0.1281 (0.1251) time: 2.9250 data: 0.0072 max mem: 33300 Epoch: [18] [2480/4276] eta: 1:27:38 lr: 2.85003114447001e-05 loss: 0.1322 (0.1252) time: 2.9233 data: 0.0071 max mem: 33300 Epoch: [18] [2490/4276] eta: 1:27:09 lr: 2.8497510905309543e-05 loss: 0.1312 (0.1252) time: 2.9267 data: 0.0073 max mem: 33300 Epoch: [18] [2500/4276] eta: 1:26:40 lr: 2.8494710335338908e-05 loss: 0.1163 (0.1252) time: 2.9282 data: 0.0073 max mem: 33300 Epoch: [18] [2510/4276] eta: 1:26:10 lr: 2.8491909734784506e-05 loss: 0.1293 (0.1252) time: 2.9279 data: 0.0072 max mem: 33300 Epoch: [18] [2520/4276] eta: 1:25:41 lr: 2.848910910364268e-05 loss: 0.1081 (0.1252) time: 2.9304 data: 0.0072 max mem: 33300 Epoch: [18] [2530/4276] eta: 1:25:12 lr: 2.8486308441909744e-05 loss: 0.0984 (0.1251) time: 2.9314 data: 0.0073 max mem: 33300 Epoch: [18] [2540/4276] eta: 1:24:43 lr: 2.8483507749582022e-05 loss: 0.1056 (0.1251) time: 2.9296 data: 0.0074 max mem: 33300 Epoch: [18] [2550/4276] eta: 1:24:14 lr: 2.8480707026655838e-05 loss: 0.1094 (0.1250) time: 2.9347 data: 0.0073 max mem: 33300 Epoch: [18] [2560/4276] eta: 1:23:44 lr: 2.8477906273127526e-05 loss: 0.1056 (0.1250) time: 2.9352 data: 0.0072 max mem: 33300 Epoch: [18] [2570/4276] eta: 1:23:15 lr: 2.8475105488993393e-05 loss: 0.1131 (0.1249) time: 2.9278 data: 0.0073 max mem: 33300 Epoch: [18] [2580/4276] eta: 1:22:46 lr: 2.8472304674249762e-05 loss: 0.1155 (0.1249) time: 2.9283 data: 0.0073 max mem: 33300 Epoch: [18] [2590/4276] eta: 1:22:16 lr: 2.846950382889296e-05 loss: 0.1149 (0.1249) time: 2.9287 data: 0.0071 max mem: 33300 Epoch: [18] 
[2600/4276] eta: 1:21:47 lr: 2.84667029529193e-05 loss: 0.1136 (0.1249) time: 2.9278 data: 0.0071 max mem: 33300 Epoch: [18] [2610/4276] eta: 1:21:18 lr: 2.84639020463251e-05 loss: 0.1134 (0.1248) time: 2.9291 data: 0.0073 max mem: 33300 Epoch: [18] [2620/4276] eta: 1:20:49 lr: 2.8461101109106673e-05 loss: 0.1070 (0.1248) time: 2.9338 data: 0.0072 max mem: 33300 Epoch: [18] [2630/4276] eta: 1:20:19 lr: 2.8458300141260347e-05 loss: 0.1070 (0.1248) time: 2.9331 data: 0.0071 max mem: 33300 Epoch: [18] [2640/4276] eta: 1:19:50 lr: 2.8455499142782432e-05 loss: 0.1112 (0.1247) time: 2.9308 data: 0.0071 max mem: 33300 Epoch: [18] [2650/4276] eta: 1:19:21 lr: 2.8452698113669247e-05 loss: 0.1112 (0.1247) time: 2.9309 data: 0.0073 max mem: 33300 Epoch: [18] [2660/4276] eta: 1:18:51 lr: 2.844989705391709e-05 loss: 0.1174 (0.1247) time: 2.9269 data: 0.0072 max mem: 33300 Epoch: [18] [2670/4276] eta: 1:18:22 lr: 2.8447095963522296e-05 loss: 0.1243 (0.1248) time: 2.9234 data: 0.0070 max mem: 33300 Epoch: [18] [2680/4276] eta: 1:17:53 lr: 2.8444294842481163e-05 loss: 0.1234 (0.1248) time: 2.9197 data: 0.0075 max mem: 33300 Epoch: [18] [2690/4276] eta: 1:17:24 lr: 2.844149369079e-05 loss: 0.1174 (0.1248) time: 2.9629 data: 0.0078 max mem: 33300 Epoch: [18] [2700/4276] eta: 1:16:55 lr: 2.8438692508445124e-05 loss: 0.1081 (0.1248) time: 3.0109 data: 0.0077 max mem: 33300 Epoch: [18] [2710/4276] eta: 1:16:26 lr: 2.843589129544285e-05 loss: 0.1081 (0.1247) time: 2.9768 data: 0.0077 max mem: 33300 Epoch: [18] [2720/4276] eta: 1:15:57 lr: 2.8433090051779476e-05 loss: 0.1048 (0.1247) time: 2.9341 data: 0.0077 max mem: 33300 Epoch: [18] [2730/4276] eta: 1:15:27 lr: 2.843028877745132e-05 loss: 0.1155 (0.1247) time: 2.9259 data: 0.0079 max mem: 33300 Epoch: [18] [2740/4276] eta: 1:14:58 lr: 2.842748747245468e-05 loss: 0.1236 (0.1247) time: 2.9258 data: 0.0080 max mem: 33300 Epoch: [18] [2750/4276] eta: 1:14:29 lr: 2.8424686136785867e-05 loss: 0.1258 (0.1248) time: 2.9255 data: 0.0078 max 
mem: 33300 Epoch: [18] [2760/4276] eta: 1:14:00 lr: 2.842188477044118e-05 loss: 0.1268 (0.1248) time: 2.9267 data: 0.0078 max mem: 33300 Epoch: [18] [2770/4276] eta: 1:13:30 lr: 2.8419083373416932e-05 loss: 0.1247 (0.1247) time: 2.9302 data: 0.0080 max mem: 33300 Epoch: [18] [2780/4276] eta: 1:13:01 lr: 2.841628194570942e-05 loss: 0.1179 (0.1247) time: 2.9339 data: 0.0080 max mem: 33300 Epoch: [18] [2790/4276] eta: 1:12:32 lr: 2.8413480487314958e-05 loss: 0.1166 (0.1248) time: 2.9335 data: 0.0077 max mem: 33300 Epoch: [18] [2800/4276] eta: 1:12:02 lr: 2.8410678998229835e-05 loss: 0.1160 (0.1247) time: 2.9308 data: 0.0077 max mem: 33300 Epoch: [18] [2810/4276] eta: 1:11:33 lr: 2.840787747845036e-05 loss: 0.1001 (0.1246) time: 2.9315 data: 0.0079 max mem: 33300 Epoch: [18] [2820/4276] eta: 1:11:04 lr: 2.840507592797283e-05 loss: 0.0989 (0.1245) time: 2.9371 data: 0.0079 max mem: 33300 Epoch: [18] [2830/4276] eta: 1:10:35 lr: 2.840227434679355e-05 loss: 0.1017 (0.1245) time: 2.9338 data: 0.0077 max mem: 33300 Epoch: [18] [2840/4276] eta: 1:10:05 lr: 2.8399472734908806e-05 loss: 0.1169 (0.1245) time: 2.9289 data: 0.0078 max mem: 33300 Epoch: [18] [2850/4276] eta: 1:09:36 lr: 2.83966710923149e-05 loss: 0.1300 (0.1246) time: 2.9316 data: 0.0076 max mem: 33300 Epoch: [18] [2860/4276] eta: 1:09:07 lr: 2.839386941900814e-05 loss: 0.1317 (0.1246) time: 2.9305 data: 0.0073 max mem: 33300 Epoch: [18] [2870/4276] eta: 1:08:38 lr: 2.8391067714984813e-05 loss: 0.1214 (0.1245) time: 2.9302 data: 0.0073 max mem: 33300 Epoch: [18] [2880/4276] eta: 1:08:08 lr: 2.838826598024122e-05 loss: 0.1234 (0.1246) time: 2.9287 data: 0.0073 max mem: 33300 Epoch: [18] [2890/4276] eta: 1:07:39 lr: 2.8385464214773643e-05 loss: 0.1200 (0.1245) time: 2.9282 data: 0.0073 max mem: 33300 Epoch: [18] [2900/4276] eta: 1:07:10 lr: 2.8382662418578388e-05 loss: 0.1134 (0.1245) time: 2.9292 data: 0.0073 max mem: 33300 Epoch: [18] [2910/4276] eta: 1:06:40 lr: 2.837986059165174e-05 loss: 0.1039 (0.1244) time: 
2.9314 data: 0.0073 max mem: 33300 Epoch: [18] [2920/4276] eta: 1:06:11 lr: 2.837705873399e-05 loss: 0.1102 (0.1244) time: 2.9319 data: 0.0073 max mem: 33300 Epoch: [18] [2930/4276] eta: 1:05:42 lr: 2.837425684558944e-05 loss: 0.1102 (0.1244) time: 2.9312 data: 0.0074 max mem: 33300 Epoch: [18] [2940/4276] eta: 1:05:13 lr: 2.8371454926446367e-05 loss: 0.1142 (0.1244) time: 2.9321 data: 0.0074 max mem: 33300 Epoch: [18] [2950/4276] eta: 1:04:43 lr: 2.836865297655707e-05 loss: 0.1126 (0.1244) time: 2.9312 data: 0.0074 max mem: 33300 Epoch: [18] [2960/4276] eta: 1:04:14 lr: 2.8365850995917838e-05 loss: 0.1143 (0.1244) time: 2.9292 data: 0.0074 max mem: 33300 Epoch: [18] [2970/4276] eta: 1:03:45 lr: 2.8363048984524942e-05 loss: 0.1145 (0.1244) time: 2.9309 data: 0.0074 max mem: 33300 Epoch: [18] [2980/4276] eta: 1:03:15 lr: 2.836024694237469e-05 loss: 0.1217 (0.1244) time: 2.9322 data: 0.0074 max mem: 33300 Epoch: [18] [2990/4276] eta: 1:02:46 lr: 2.835744486946335e-05 loss: 0.1107 (0.1244) time: 2.9310 data: 0.0073 max mem: 33300 Epoch: [18] [3000/4276] eta: 1:02:17 lr: 2.835464276578722e-05 loss: 0.1116 (0.1244) time: 2.9334 data: 0.0082 max mem: 33300 Epoch: [18] [3010/4276] eta: 1:01:48 lr: 2.8351840631342575e-05 loss: 0.1145 (0.1244) time: 2.9309 data: 0.0089 max mem: 33300 Epoch: [18] [3020/4276] eta: 1:01:18 lr: 2.8349038466125703e-05 loss: 0.1208 (0.1243) time: 2.9289 data: 0.0083 max mem: 33300 Epoch: [18] [3030/4276] eta: 1:00:49 lr: 2.8346236270132886e-05 loss: 0.1193 (0.1243) time: 2.9244 data: 0.0079 max mem: 33300 Epoch: [18] [3040/4276] eta: 1:00:20 lr: 2.8343434043360413e-05 loss: 0.1251 (0.1244) time: 2.9084 data: 0.0083 max mem: 33300 Epoch: [18] [3050/4276] eta: 0:59:50 lr: 2.834063178580455e-05 loss: 0.1158 (0.1244) time: 2.8860 data: 0.0095 max mem: 33300 Epoch: [18] [3060/4276] eta: 0:59:21 lr: 2.8337829497461587e-05 loss: 0.1118 (0.1243) time: 2.8786 data: 0.0099 max mem: 33300 Epoch: [18] [3070/4276] eta: 0:58:51 lr: 2.8335027178327794e-05 loss: 
0.1107 (0.1243) time: 2.8933 data: 0.0089 max mem: 33300 Epoch: [18] [3080/4276] eta: 0:58:22 lr: 2.8332224828399458e-05 loss: 0.1107 (0.1243) time: 2.8822 data: 0.0088 max mem: 33300 Epoch: [18] [3090/4276] eta: 0:57:52 lr: 2.8329422447672844e-05 loss: 0.1053 (0.1242) time: 2.8723 data: 0.0099 max mem: 33300 Epoch: [18] [3100/4276] eta: 0:57:23 lr: 2.832662003614424e-05 loss: 0.1149 (0.1242) time: 2.8809 data: 0.0102 max mem: 33300 Epoch: [18] [3110/4276] eta: 0:56:53 lr: 2.8323817593809925e-05 loss: 0.1097 (0.1241) time: 2.8830 data: 0.0094 max mem: 33300 Epoch: [18] [3120/4276] eta: 0:56:24 lr: 2.8321015120666165e-05 loss: 0.1081 (0.1241) time: 2.8819 data: 0.0093 max mem: 33300 Epoch: [18] [3130/4276] eta: 0:55:54 lr: 2.8318212616709234e-05 loss: 0.1098 (0.1241) time: 2.8824 data: 0.0096 max mem: 33300 Epoch: [18] [3140/4276] eta: 0:55:25 lr: 2.8315410081935412e-05 loss: 0.1192 (0.1241) time: 2.9022 data: 0.0094 max mem: 33300 Epoch: [18] [3150/4276] eta: 0:54:56 lr: 2.831260751634096e-05 loss: 0.1195 (0.1241) time: 2.9202 data: 0.0089 max mem: 33300 Epoch: [18] [3160/4276] eta: 0:54:27 lr: 2.8309804919922157e-05 loss: 0.1121 (0.1240) time: 2.9313 data: 0.0086 max mem: 33300 Epoch: [18] [3170/4276] eta: 0:53:57 lr: 2.8307002292675268e-05 loss: 0.1140 (0.1241) time: 2.9547 data: 0.0079 max mem: 33300 Epoch: [18] [3180/4276] eta: 0:53:28 lr: 2.8304199634596562e-05 loss: 0.1142 (0.1240) time: 2.9608 data: 0.0072 max mem: 33300 Epoch: [18] [3190/4276] eta: 0:52:59 lr: 2.8301396945682323e-05 loss: 0.1202 (0.1241) time: 2.9382 data: 0.0079 max mem: 33300 Epoch: [18] [3200/4276] eta: 0:52:30 lr: 2.8298594225928797e-05 loss: 0.1337 (0.1241) time: 2.9109 data: 0.0087 max mem: 33300 Epoch: [18] [3210/4276] eta: 0:52:00 lr: 2.8295791475332274e-05 loss: 0.1245 (0.1241) time: 2.9196 data: 0.0093 max mem: 33300 Epoch: [18] [3220/4276] eta: 0:51:31 lr: 2.8292988693888993e-05 loss: 0.1232 (0.1241) time: 2.9382 data: 0.0093 max mem: 33300 Epoch: [18] [3230/4276] eta: 0:51:03 
lr: 2.8290185881595237e-05 loss: 0.1154 (0.1241) time: 3.0840 data: 0.0090 max mem: 33300 Epoch: [18] [3240/4276] eta: 0:50:35 lr: 2.8287383038447262e-05 loss: 0.1372 (0.1241) time: 3.2514 data: 0.0090 max mem: 33300 Epoch: [18] [3250/4276] eta: 0:50:06 lr: 2.8284580164441342e-05 loss: 0.1344 (0.1241) time: 3.2685 data: 0.0088 max mem: 33300 Epoch: [18] [3260/4276] eta: 0:49:38 lr: 2.8281777259573726e-05 loss: 0.1193 (0.1241) time: 3.2592 data: 0.0089 max mem: 33300 Epoch: [18] [3270/4276] eta: 0:49:10 lr: 2.8278974323840696e-05 loss: 0.1234 (0.1241) time: 3.2583 data: 0.0089 max mem: 33300 Epoch: [18] [3280/4276] eta: 0:48:41 lr: 2.827617135723849e-05 loss: 0.1309 (0.1242) time: 3.2665 data: 0.0089 max mem: 33300 Epoch: [18] [3290/4276] eta: 0:48:13 lr: 2.827336835976338e-05 loss: 0.1317 (0.1242) time: 3.2665 data: 0.0089 max mem: 33300 Epoch: [18] [3300/4276] eta: 0:47:45 lr: 2.8270565331411626e-05 loss: 0.1386 (0.1242) time: 3.2635 data: 0.0094 max mem: 33300 Epoch: [18] [3310/4276] eta: 0:47:16 lr: 2.8267762272179475e-05 loss: 0.1383 (0.1243) time: 3.2636 data: 0.0093 max mem: 33300 Epoch: [18] [3320/4276] eta: 0:46:48 lr: 2.8264959182063195e-05 loss: 0.1326 (0.1243) time: 3.2645 data: 0.0094 max mem: 33300 Epoch: [18] [3330/4276] eta: 0:46:19 lr: 2.8262156061059042e-05 loss: 0.1137 (0.1242) time: 3.2446 data: 0.0092 max mem: 33300 Epoch: [18] [3340/4276] eta: 0:45:51 lr: 2.8259352909163266e-05 loss: 0.1149 (0.1242) time: 3.2234 data: 0.0089 max mem: 33300 Epoch: [18] [3350/4276] eta: 0:45:22 lr: 2.825654972637213e-05 loss: 0.1149 (0.1242) time: 3.2215 data: 0.0087 max mem: 33300 Epoch: [18] [3360/4276] eta: 0:44:54 lr: 2.8253746512681876e-05 loss: 0.1042 (0.1241) time: 3.2246 data: 0.0081 max mem: 33300 Epoch: [18] [3370/4276] eta: 0:44:25 lr: 2.8250943268088776e-05 loss: 0.1097 (0.1242) time: 3.2474 data: 0.0080 max mem: 33300 Epoch: [18] [3380/4276] eta: 0:43:56 lr: 2.8248139992589062e-05 loss: 0.1213 (0.1242) time: 3.2517 data: 0.0079 max mem: 33300 Epoch: 
[18] [3390/4276] eta: 0:43:28 lr: 2.8245336686178992e-05 loss: 0.1225 (0.1242) time: 3.2150 data: 0.0077 max mem: 33300 Epoch: [18] [3400/4276] eta: 0:42:59 lr: 2.8242533348854822e-05 loss: 0.1325 (0.1242) time: 3.2051 data: 0.0078 max mem: 33300 Epoch: [18] [3410/4276] eta: 0:42:30 lr: 2.823972998061279e-05 loss: 0.1228 (0.1242) time: 3.2170 data: 0.0085 max mem: 33300 Epoch: [18] [3420/4276] eta: 0:42:01 lr: 2.8236926581449157e-05 loss: 0.1248 (0.1242) time: 3.2078 data: 0.0089 max mem: 33300 Epoch: [18] [3430/4276] eta: 0:41:33 lr: 2.823412315136017e-05 loss: 0.1289 (0.1242) time: 3.2257 data: 0.0086 max mem: 33300 Epoch: [18] [3440/4276] eta: 0:41:04 lr: 2.8231319690342068e-05 loss: 0.1251 (0.1242) time: 3.2414 data: 0.0086 max mem: 33300 Epoch: [18] [3450/4276] eta: 0:40:35 lr: 2.8228516198391108e-05 loss: 0.1235 (0.1242) time: 3.2187 data: 0.0089 max mem: 33300 Epoch: [18] [3460/4276] eta: 0:40:06 lr: 2.822571267550352e-05 loss: 0.1449 (0.1243) time: 3.2123 data: 0.0085 max mem: 33300 Epoch: [18] [3470/4276] eta: 0:39:37 lr: 2.8222909121675556e-05 loss: 0.1225 (0.1242) time: 3.1652 data: 0.0078 max mem: 33300 Epoch: [18] [3480/4276] eta: 0:39:08 lr: 2.8220105536903463e-05 loss: 0.1225 (0.1243) time: 3.0578 data: 0.0074 max mem: 33300 Epoch: [18] [3490/4276] eta: 0:38:38 lr: 2.8217301921183476e-05 loss: 0.1250 (0.1243) time: 3.0280 data: 0.0075 max mem: 33300 Epoch: [18] [3500/4276] eta: 0:38:09 lr: 2.821449827451185e-05 loss: 0.1176 (0.1242) time: 2.9855 data: 0.0080 max mem: 33300 Epoch: [18] [3510/4276] eta: 0:37:39 lr: 2.821169459688481e-05 loss: 0.1063 (0.1242) time: 2.9327 data: 0.0077 max mem: 33300 Epoch: [18] [3520/4276] eta: 0:37:10 lr: 2.820889088829861e-05 loss: 0.1197 (0.1242) time: 2.9543 data: 0.0072 max mem: 33300 Epoch: [18] [3530/4276] eta: 0:36:40 lr: 2.8206087148749477e-05 loss: 0.1197 (0.1242) time: 2.9440 data: 0.0076 max mem: 33300 Epoch: [18] [3540/4276] eta: 0:36:11 lr: 2.8203283378233653e-05 loss: 0.1222 (0.1243) time: 2.9337 data: 
0.0076 max mem: 33300 Epoch: [18] [3550/4276] eta: 0:35:41 lr: 2.8200479576747374e-05 loss: 0.1171 (0.1243) time: 2.9565 data: 0.0074 max mem: 33300 Epoch: [18] [3560/4276] eta: 0:35:12 lr: 2.8197675744286883e-05 loss: 0.1171 (0.1243) time: 2.9590 data: 0.0071 max mem: 33300 Epoch: [18] [3570/4276] eta: 0:34:42 lr: 2.819487188084841e-05 loss: 0.1316 (0.1243) time: 2.9433 data: 0.0072 max mem: 33300 Epoch: [18] [3580/4276] eta: 0:34:13 lr: 2.81920679864282e-05 loss: 0.1230 (0.1243) time: 2.9466 data: 0.0071 max mem: 33300 Epoch: [18] [3590/4276] eta: 0:33:43 lr: 2.8189264061022476e-05 loss: 0.1216 (0.1243) time: 2.9448 data: 0.0068 max mem: 33300 Epoch: [18] [3600/4276] eta: 0:33:14 lr: 2.818646010462747e-05 loss: 0.1205 (0.1243) time: 2.9407 data: 0.0068 max mem: 33300 Epoch: [18] [3610/4276] eta: 0:32:44 lr: 2.818365611723942e-05 loss: 0.1205 (0.1243) time: 2.9512 data: 0.0068 max mem: 33300 Epoch: [18] [3620/4276] eta: 0:32:15 lr: 2.8180852098854554e-05 loss: 0.1172 (0.1243) time: 2.9307 data: 0.0073 max mem: 33300 Epoch: [18] [3630/4276] eta: 0:31:45 lr: 2.8178048049469097e-05 loss: 0.1197 (0.1243) time: 2.8987 data: 0.0078 max mem: 33300 Epoch: [18] [3640/4276] eta: 0:31:16 lr: 2.817524396907929e-05 loss: 0.1175 (0.1243) time: 2.9210 data: 0.0075 max mem: 33300 Epoch: [18] [3650/4276] eta: 0:30:46 lr: 2.8172439857681356e-05 loss: 0.1106 (0.1243) time: 2.9441 data: 0.0066 max mem: 33300 Epoch: [18] [3660/4276] eta: 0:30:17 lr: 2.8169635715271532e-05 loss: 0.1141 (0.1242) time: 2.9424 data: 0.0062 max mem: 33300 Epoch: [18] [3670/4276] eta: 0:29:47 lr: 2.8166831541846022e-05 loss: 0.1146 (0.1242) time: 2.9653 data: 0.0061 max mem: 33300 Epoch: [18] [3680/4276] eta: 0:29:18 lr: 2.8164027337401077e-05 loss: 0.1146 (0.1242) time: 2.9650 data: 0.0063 max mem: 33300 Epoch: [18] [3690/4276] eta: 0:28:48 lr: 2.8161223101932904e-05 loss: 0.1198 (0.1242) time: 2.9433 data: 0.0065 max mem: 33300 Epoch: [18] [3700/4276] eta: 0:28:19 lr: 2.8158418835437734e-05 loss: 0.1169 
(0.1242) time: 2.9450 data: 0.0064 max mem: 33300 Epoch: [18] [3710/4276] eta: 0:27:49 lr: 2.815561453791179e-05 loss: 0.1054 (0.1242) time: 2.9389 data: 0.0062 max mem: 33300 Epoch: [18] [3720/4276] eta: 0:27:19 lr: 2.8152810209351298e-05 loss: 0.1037 (0.1241) time: 2.9245 data: 0.0067 max mem: 33300 Epoch: [18] [3730/4276] eta: 0:26:50 lr: 2.8150005849752475e-05 loss: 0.1053 (0.1241) time: 2.9145 data: 0.0068 max mem: 33300 Epoch: [18] [3740/4276] eta: 0:26:20 lr: 2.814720145911155e-05 loss: 0.1154 (0.1241) time: 2.9229 data: 0.0069 max mem: 33300 Epoch: [18] [3750/4276] eta: 0:25:51 lr: 2.814439703742473e-05 loss: 0.1258 (0.1241) time: 2.9382 data: 0.0073 max mem: 33300 Epoch: [18] [3760/4276] eta: 0:25:21 lr: 2.814159258468825e-05 loss: 0.1204 (0.1241) time: 2.9437 data: 0.0073 max mem: 33300 Epoch: [18] [3770/4276] eta: 0:24:52 lr: 2.8138788100898307e-05 loss: 0.1004 (0.1241) time: 2.9426 data: 0.0075 max mem: 33300 Epoch: [18] [3780/4276] eta: 0:24:22 lr: 2.813598358605113e-05 loss: 0.1134 (0.1240) time: 2.9502 data: 0.0078 max mem: 33300 Epoch: [18] [3790/4276] eta: 0:23:53 lr: 2.8133179040142937e-05 loss: 0.1088 (0.1240) time: 2.9602 data: 0.0078 max mem: 33300 Epoch: [18] [3800/4276] eta: 0:23:23 lr: 2.8130374463169938e-05 loss: 0.1096 (0.1241) time: 2.9504 data: 0.0075 max mem: 33300 Epoch: [18] [3810/4276] eta: 0:22:54 lr: 2.812756985512836e-05 loss: 0.1096 (0.1240) time: 2.9395 data: 0.0074 max mem: 33300 Epoch: [18] [3820/4276] eta: 0:22:24 lr: 2.8124765216014398e-05 loss: 0.1035 (0.1240) time: 2.9410 data: 0.0076 max mem: 33300 Epoch: [18] [3830/4276] eta: 0:21:55 lr: 2.8121960545824284e-05 loss: 0.1035 (0.1240) time: 2.9545 data: 0.0076 max mem: 33300 Epoch: [18] [3840/4276] eta: 0:21:25 lr: 2.8119155844554214e-05 loss: 0.1141 (0.1239) time: 2.9273 data: 0.0073 max mem: 33300 Epoch: [18] [3850/4276] eta: 0:20:56 lr: 2.8116351112200407e-05 loss: 0.1027 (0.1239) time: 2.8824 data: 0.0070 max mem: 33300 Epoch: [18] [3860/4276] eta: 0:20:26 lr: 
2.8113546348759067e-05 loss: 0.1056 (0.1239) time: 2.8749 data: 0.0069 max mem: 33300 Epoch: [18] [3870/4276] eta: 0:19:57 lr: 2.811074155422641e-05 loss: 0.1160 (0.1238) time: 2.8818 data: 0.0067 max mem: 33300 Epoch: [18] [3880/4276] eta: 0:19:27 lr: 2.8107936728598643e-05 loss: 0.1115 (0.1238) time: 2.8890 data: 0.0062 max mem: 33300 Epoch: [18] [3890/4276] eta: 0:18:58 lr: 2.8105131871871983e-05 loss: 0.1232 (0.1239) time: 2.8858 data: 0.0061 max mem: 33300 Epoch: [18] [3900/4276] eta: 0:18:28 lr: 2.8102326984042615e-05 loss: 0.1247 (0.1239) time: 2.8878 data: 0.0063 max mem: 33300 Epoch: [18] [3910/4276] eta: 0:17:59 lr: 2.8099522065106758e-05 loss: 0.1013 (0.1238) time: 2.9429 data: 0.0065 max mem: 33300 Epoch: [18] [3920/4276] eta: 0:17:29 lr: 2.8096717115060618e-05 loss: 0.1013 (0.1238) time: 2.9747 data: 0.0074 max mem: 33300 Epoch: [18] [3930/4276] eta: 0:17:00 lr: 2.809391213390039e-05 loss: 0.1091 (0.1238) time: 2.9273 data: 0.0079 max mem: 33300 Epoch: [18] [3940/4276] eta: 0:16:30 lr: 2.8091107121622288e-05 loss: 0.1306 (0.1238) time: 2.8964 data: 0.0073 max mem: 33300 Epoch: [18] [3950/4276] eta: 0:16:01 lr: 2.8088302078222506e-05 loss: 0.1221 (0.1238) time: 2.8913 data: 0.0065 max mem: 33300 Epoch: [18] [3960/4276] eta: 0:15:31 lr: 2.8085497003697247e-05 loss: 0.1221 (0.1238) time: 2.8889 data: 0.0062 max mem: 33300 Epoch: [18] [3970/4276] eta: 0:15:02 lr: 2.8082691898042725e-05 loss: 0.1262 (0.1238) time: 2.9050 data: 0.0064 max mem: 33300 Epoch: [18] [3980/4276] eta: 0:14:32 lr: 2.8079886761255114e-05 loss: 0.1194 (0.1238) time: 2.9251 data: 0.0068 max mem: 33300 Epoch: [18] [3990/4276] eta: 0:14:03 lr: 2.8077081593330636e-05 loss: 0.1063 (0.1238) time: 2.9442 data: 0.0069 max mem: 33300 Epoch: [18] [4000/4276] eta: 0:13:33 lr: 2.8074276394265476e-05 loss: 0.1094 (0.1238) time: 2.9294 data: 0.0068 max mem: 33300 Epoch: [18] [4010/4276] eta: 0:13:04 lr: 2.807147116405583e-05 loss: 0.1222 (0.1238) time: 2.9207 data: 0.0069 max mem: 33300 Epoch: [18] 
[4020/4276] eta: 0:12:34 lr: 2.8068665902697898e-05 loss: 0.1188 (0.1238) time: 2.9441 data: 0.0066 max mem: 33300 Epoch: [18] [4030/4276] eta: 0:12:05 lr: 2.8065860610187877e-05 loss: 0.1131 (0.1238) time: 2.9610 data: 0.0065 max mem: 33300 Epoch: [18] [4040/4276] eta: 0:11:35 lr: 2.8063055286521955e-05 loss: 0.1131 (0.1238) time: 2.9637 data: 0.0064 max mem: 33300 Epoch: [18] [4050/4276] eta: 0:11:06 lr: 2.8060249931696344e-05 loss: 0.1136 (0.1238) time: 2.9519 data: 0.0062 max mem: 33300 Epoch: [18] [4060/4276] eta: 0:10:36 lr: 2.8057444545707207e-05 loss: 0.1235 (0.1238) time: 2.9546 data: 0.0066 max mem: 33300 Epoch: [18] [4070/4276] eta: 0:10:07 lr: 2.8054639128550762e-05 loss: 0.1259 (0.1238) time: 2.9509 data: 0.0064 max mem: 33300 Epoch: [18] [4080/4276] eta: 0:09:37 lr: 2.8051833680223176e-05 loss: 0.1259 (0.1238) time: 2.9453 data: 0.0062 max mem: 33300 Epoch: [18] [4090/4276] eta: 0:09:08 lr: 2.804902820072066e-05 loss: 0.1255 (0.1238) time: 2.9459 data: 0.0061 max mem: 33300 Epoch: [18] [4100/4276] eta: 0:08:38 lr: 2.804622269003939e-05 loss: 0.1222 (0.1238) time: 2.9616 data: 0.0060 max mem: 33300 Epoch: [18] [4110/4276] eta: 0:08:09 lr: 2.804341714817556e-05 loss: 0.1173 (0.1238) time: 2.9643 data: 0.0062 max mem: 33300 Epoch: [18] [4120/4276] eta: 0:07:39 lr: 2.804061157512536e-05 loss: 0.1164 (0.1238) time: 2.9529 data: 0.0064 max mem: 33300 Epoch: [18] [4130/4276] eta: 0:07:10 lr: 2.803780597088496e-05 loss: 0.1140 (0.1238) time: 2.9550 data: 0.0066 max mem: 33300 Epoch: [18] [4140/4276] eta: 0:06:40 lr: 2.8035000335450572e-05 loss: 0.1088 (0.1238) time: 2.9314 data: 0.0069 max mem: 33300 Epoch: [18] [4150/4276] eta: 0:06:11 lr: 2.8032194668818353e-05 loss: 0.1200 (0.1238) time: 2.9323 data: 0.0065 max mem: 33300 Epoch: [18] [4160/4276] eta: 0:05:41 lr: 2.80293889709845e-05 loss: 0.1319 (0.1239) time: 2.9695 data: 0.0061 max mem: 33300 Epoch: [18] [4170/4276] eta: 0:05:12 lr: 2.8026583241945197e-05 loss: 0.1300 (0.1239) time: 2.9401 data: 0.0061 
max mem: 33300 Epoch: [18] [4180/4276] eta: 0:04:42 lr: 2.8023777481696624e-05 loss: 0.1274 (0.1239) time: 2.8973 data: 0.0060 max mem: 33300 Epoch: [18] [4190/4276] eta: 0:04:13 lr: 2.8020971690234954e-05 loss: 0.1176 (0.1239) time: 2.8994 data: 0.0059 max mem: 33300 Epoch: [18] [4200/4276] eta: 0:03:44 lr: 2.801816586755639e-05 loss: 0.1180 (0.1240) time: 2.9026 data: 0.0059 max mem: 33300 Epoch: [18] [4210/4276] eta: 0:03:14 lr: 2.801536001365708e-05 loss: 0.1330 (0.1240) time: 2.9155 data: 0.0060 max mem: 33300 Epoch: [18] [4220/4276] eta: 0:02:45 lr: 2.801255412853323e-05 loss: 0.1359 (0.1241) time: 2.9437 data: 0.0061 max mem: 33300 Epoch: [18] [4230/4276] eta: 0:02:15 lr: 2.8009748212180993e-05 loss: 0.1350 (0.1241) time: 2.9459 data: 0.0064 max mem: 33300 Epoch: [18] [4240/4276] eta: 0:01:46 lr: 2.8006942264596565e-05 loss: 0.1303 (0.1241) time: 2.9411 data: 0.0068 max mem: 33300 Epoch: [18] [4250/4276] eta: 0:01:16 lr: 2.8004136285776106e-05 loss: 0.1273 (0.1242) time: 2.9477 data: 0.0065 max mem: 33300 Epoch: [18] [4260/4276] eta: 0:00:47 lr: 2.8001330275715805e-05 loss: 0.1273 (0.1242) time: 2.9515 data: 0.0063 max mem: 33300 Epoch: [18] [4270/4276] eta: 0:00:17 lr: 2.799852423441182e-05 loss: 0.1299 (0.1242) time: 2.9703 data: 0.0062 max mem: 33300 Epoch: [18] Total time: 3:30:03 Test: [ 0/21770] eta: 11:01:12 time: 1.8224 data: 1.7838 max mem: 33300 Test: [ 100/21770] eta: 0:19:56 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 200/21770] eta: 0:16:41 time: 0.0376 data: 0.0008 max mem: 33300 Test: [ 300/21770] eta: 0:15:33 time: 0.0375 data: 0.0008 max mem: 33300 Test: [ 400/21770] eta: 0:14:57 time: 0.0376 data: 0.0008 max mem: 33300 Test: [ 500/21770] eta: 0:14:34 time: 0.0376 data: 0.0008 max mem: 33300 Test: [ 600/21770] eta: 0:14:18 time: 0.0379 data: 0.0010 max mem: 33300 Test: [ 700/21770] eta: 0:14:06 time: 0.0377 data: 0.0009 max mem: 33300 Test: [ 800/21770] eta: 0:13:56 time: 0.0379 data: 0.0010 max mem: 33300 Test: [ 900/21770] eta: 
0:13:48 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 1000/21770] eta: 0:13:41 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 1100/21770] eta: 0:13:34 time: 0.0380 data: 0.0010 max mem: 33300 Test: [ 1200/21770] eta: 0:13:27 time: 0.0375 data: 0.0008 max mem: 33300 Test: [ 1300/21770] eta: 0:13:21 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 1400/21770] eta: 0:13:14 time: 0.0376 data: 0.0008 max mem: 33300 Test: [ 1500/21770] eta: 0:13:09 time: 0.0376 data: 0.0008 max mem: 33300 Test: [ 1600/21770] eta: 0:13:03 time: 0.0375 data: 0.0008 max mem: 33300 Test: [ 1700/21770] eta: 0:12:58 time: 0.0376 data: 0.0008 max mem: 33300 Test: [ 1800/21770] eta: 0:12:52 time: 0.0376 data: 0.0008 max mem: 33300 Test: [ 1900/21770] eta: 0:12:47 time: 0.0375 data: 0.0008 max mem: 33300 Test: [ 2000/21770] eta: 0:12:42 time: 0.0376 data: 0.0009 max mem: 33300 Test: [ 2100/21770] eta: 0:12:38 time: 0.0374 data: 0.0008 max mem: 33300 Test: [ 2200/21770] eta: 0:12:33 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 2300/21770] eta: 0:12:28 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 2400/21770] eta: 0:12:24 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 2500/21770] eta: 0:12:20 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 2600/21770] eta: 0:12:16 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 2700/21770] eta: 0:12:12 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 2800/21770] eta: 0:12:08 time: 0.0383 data: 0.0008 max mem: 33300 Test: [ 2900/21770] eta: 0:12:04 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 3000/21770] eta: 0:12:00 time: 0.0384 data: 0.0008 max mem: 33300 Test: [ 3100/21770] eta: 0:11:56 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 3200/21770] eta: 0:11:52 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 3300/21770] eta: 0:11:48 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 3400/21770] eta: 0:11:44 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 3500/21770] eta: 0:11:40 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 3600/21770] eta: 
0:11:36 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 3700/21770] eta: 0:11:32 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 3800/21770] eta: 0:11:28 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 3900/21770] eta: 0:11:24 time: 0.0377 data: 0.0009 max mem: 33300 Test: [ 4000/21770] eta: 0:11:20 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 4100/21770] eta: 0:11:16 time: 0.0377 data: 0.0009 max mem: 33300 Test: [ 4200/21770] eta: 0:11:12 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 4300/21770] eta: 0:11:07 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 4400/21770] eta: 0:11:03 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 4500/21770] eta: 0:11:00 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 4600/21770] eta: 0:10:56 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 4700/21770] eta: 0:10:52 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 4800/21770] eta: 0:10:48 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 4900/21770] eta: 0:10:44 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 5000/21770] eta: 0:10:40 time: 0.0384 data: 0.0008 max mem: 33300 Test: [ 5100/21770] eta: 0:10:36 time: 0.0383 data: 0.0008 max mem: 33300 Test: [ 5200/21770] eta: 0:10:33 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 5300/21770] eta: 0:10:29 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 5400/21770] eta: 0:10:25 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 5500/21770] eta: 0:10:22 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 5600/21770] eta: 0:10:18 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 5700/21770] eta: 0:10:14 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 5800/21770] eta: 0:10:11 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 5900/21770] eta: 0:10:07 time: 0.0387 data: 0.0008 max mem: 33300 Test: [ 6000/21770] eta: 0:10:03 time: 0.0387 data: 0.0008 max mem: 33300 Test: [ 6100/21770] eta: 0:10:00 time: 0.0387 data: 0.0008 max mem: 33300 Test: [ 6200/21770] eta: 0:09:56 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 6300/21770] eta: 
0:09:52 time: 0.0383 data: 0.0008 max mem: 33300 Test: [ 6400/21770] eta: 0:09:48 time: 0.0383 data: 0.0008 max mem: 33300 Test: [ 6500/21770] eta: 0:09:44 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 6600/21770] eta: 0:09:41 time: 0.0383 data: 0.0008 max mem: 33300 Test: [ 6700/21770] eta: 0:09:37 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 6800/21770] eta: 0:09:33 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 6900/21770] eta: 0:09:29 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 7000/21770] eta: 0:09:25 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 7100/21770] eta: 0:09:21 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 7200/21770] eta: 0:09:17 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 7300/21770] eta: 0:09:13 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 7400/21770] eta: 0:09:09 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 7500/21770] eta: 0:09:06 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 7600/21770] eta: 0:09:02 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 7700/21770] eta: 0:08:58 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 7800/21770] eta: 0:08:54 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 7900/21770] eta: 0:08:50 time: 0.0377 data: 0.0009 max mem: 33300 Test: [ 8000/21770] eta: 0:08:46 time: 0.0377 data: 0.0008 max mem: 33300 Test: [ 8100/21770] eta: 0:08:42 time: 0.0376 data: 0.0008 max mem: 33300 Test: [ 8200/21770] eta: 0:08:38 time: 0.0376 data: 0.0008 max mem: 33300 Test: [ 8300/21770] eta: 0:08:34 time: 0.0378 data: 0.0008 max mem: 33300 Test: [ 8400/21770] eta: 0:08:30 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 8500/21770] eta: 0:08:26 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 8600/21770] eta: 0:08:23 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 8700/21770] eta: 0:08:19 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 8800/21770] eta: 0:08:15 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 8900/21770] eta: 0:08:11 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 9000/21770] eta: 
0:08:07 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 9100/21770] eta: 0:08:03 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 9200/21770] eta: 0:08:00 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 9300/21770] eta: 0:07:56 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 9400/21770] eta: 0:07:52 time: 0.0388 data: 0.0011 max mem: 33300 Test: [ 9500/21770] eta: 0:07:48 time: 0.0383 data: 0.0010 max mem: 33300 Test: [ 9600/21770] eta: 0:07:44 time: 0.0380 data: 0.0010 max mem: 33300 Test: [ 9700/21770] eta: 0:07:41 time: 0.0382 data: 0.0010 max mem: 33300 Test: [ 9800/21770] eta: 0:07:37 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 9900/21770] eta: 0:07:33 time: 0.0380 data: 0.0009 max mem: 33300 Test: [10000/21770] eta: 0:07:29 time: 0.0379 data: 0.0009 max mem: 33300 Test: [10100/21770] eta: 0:07:25 time: 0.0381 data: 0.0009 max mem: 33300 Test: [10200/21770] eta: 0:07:21 time: 0.0380 data: 0.0009 max mem: 33300 Test: [10300/21770] eta: 0:07:18 time: 0.0380 data: 0.0009 max mem: 33300 Test: [10400/21770] eta: 0:07:14 time: 0.0382 data: 0.0009 max mem: 33300 Test: [10500/21770] eta: 0:07:10 time: 0.0378 data: 0.0009 max mem: 33300 Test: [10600/21770] eta: 0:07:06 time: 0.0378 data: 0.0009 max mem: 33300 Test: [10700/21770] eta: 0:07:02 time: 0.0378 data: 0.0009 max mem: 33300 Test: [10800/21770] eta: 0:06:58 time: 0.0376 data: 0.0009 max mem: 33300 Test: [10900/21770] eta: 0:06:54 time: 0.0377 data: 0.0009 max mem: 33300 Test: [11000/21770] eta: 0:06:51 time: 0.0380 data: 0.0009 max mem: 33300 Test: [11100/21770] eta: 0:06:47 time: 0.0379 data: 0.0009 max mem: 33300 Test: [11200/21770] eta: 0:06:43 time: 0.0378 data: 0.0008 max mem: 33300 Test: [11300/21770] eta: 0:06:39 time: 0.0378 data: 0.0009 max mem: 33300 Test: [11400/21770] eta: 0:06:35 time: 0.0378 data: 0.0008 max mem: 33300 Test: [11500/21770] eta: 0:06:31 time: 0.0376 data: 0.0009 max mem: 33300 Test: [11600/21770] eta: 0:06:28 time: 0.0380 data: 0.0009 max mem: 33300 Test: [11700/21770] eta: 
0:06:24 time: 0.0379 data: 0.0009 max mem: 33300 Test: [11800/21770] eta: 0:06:20 time: 0.0376 data: 0.0009 max mem: 33300 Test: [11900/21770] eta: 0:06:16 time: 0.0377 data: 0.0009 max mem: 33300 Test: [12000/21770] eta: 0:06:12 time: 0.0380 data: 0.0009 max mem: 33300 Test: [12100/21770] eta: 0:06:08 time: 0.0379 data: 0.0009 max mem: 33300 Test: [12200/21770] eta: 0:06:04 time: 0.0377 data: 0.0009 max mem: 33300 Test: [12300/21770] eta: 0:06:01 time: 0.0377 data: 0.0009 max mem: 33300 Test: [12400/21770] eta: 0:05:57 time: 0.0377 data: 0.0008 max mem: 33300 Test: [12500/21770] eta: 0:05:53 time: 0.0379 data: 0.0009 max mem: 33300 Test: [12600/21770] eta: 0:05:49 time: 0.0376 data: 0.0009 max mem: 33300 Test: [12700/21770] eta: 0:05:45 time: 0.0377 data: 0.0009 max mem: 33300 Test: [12800/21770] eta: 0:05:41 time: 0.0378 data: 0.0009 max mem: 33300 Test: [12900/21770] eta: 0:05:38 time: 0.0376 data: 0.0008 max mem: 33300 Test: [13000/21770] eta: 0:05:34 time: 0.0377 data: 0.0009 max mem: 33300 Test: [13100/21770] eta: 0:05:30 time: 0.0383 data: 0.0009 max mem: 33300 Test: [13200/21770] eta: 0:05:26 time: 0.0382 data: 0.0009 max mem: 33300 Test: [13300/21770] eta: 0:05:22 time: 0.0376 data: 0.0009 max mem: 33300 Test: [13400/21770] eta: 0:05:18 time: 0.0377 data: 0.0009 max mem: 33300 Test: [13500/21770] eta: 0:05:15 time: 0.0377 data: 0.0009 max mem: 33300 Test: [13600/21770] eta: 0:05:11 time: 0.0377 data: 0.0009 max mem: 33300 Test: [13700/21770] eta: 0:05:07 time: 0.0377 data: 0.0009 max mem: 33300 Test: [13800/21770] eta: 0:05:03 time: 0.0376 data: 0.0009 max mem: 33300 Test: [13900/21770] eta: 0:04:59 time: 0.0377 data: 0.0009 max mem: 33300 Test: [14000/21770] eta: 0:04:55 time: 0.0376 data: 0.0009 max mem: 33300 Test: [14100/21770] eta: 0:04:52 time: 0.0376 data: 0.0009 max mem: 33300 Test: [14200/21770] eta: 0:04:48 time: 0.0377 data: 0.0009 max mem: 33300 Test: [14300/21770] eta: 0:04:44 time: 0.0376 data: 0.0009 max mem: 33300 Test: [14400/21770] eta: 
0:04:40 time: 0.0384 data: 0.0010 max mem: 33300 Test: [14500/21770] eta: 0:04:36 time: 0.0383 data: 0.0010 max mem: 33300 Test: [14600/21770] eta: 0:04:33 time: 0.0382 data: 0.0009 max mem: 33300 Test: [14700/21770] eta: 0:04:29 time: 0.0384 data: 0.0010 max mem: 33300 Test: [14800/21770] eta: 0:04:25 time: 0.0385 data: 0.0010 max mem: 33300 Test: [14900/21770] eta: 0:04:21 time: 0.0384 data: 0.0009 max mem: 33300 Test: [15000/21770] eta: 0:04:17 time: 0.0380 data: 0.0009 max mem: 33300 Test: [15100/21770] eta: 0:04:14 time: 0.0383 data: 0.0009 max mem: 33300 Test: [15200/21770] eta: 0:04:10 time: 0.0386 data: 0.0009 max mem: 33300 Test: [15300/21770] eta: 0:04:06 time: 0.0390 data: 0.0009 max mem: 33300 Test: [15400/21770] eta: 0:04:02 time: 0.0387 data: 0.0009 max mem: 33300 Test: [15500/21770] eta: 0:03:58 time: 0.0391 data: 0.0009 max mem: 33300 Test: [15600/21770] eta: 0:03:55 time: 0.0392 data: 0.0009 max mem: 33300 Test: [15700/21770] eta: 0:03:51 time: 0.0397 data: 0.0009 max mem: 33300 Test: [15800/21770] eta: 0:03:47 time: 0.0387 data: 0.0009 max mem: 33300 Test: [15900/21770] eta: 0:03:43 time: 0.0390 data: 0.0009 max mem: 33300 Test: [16000/21770] eta: 0:03:40 time: 0.0386 data: 0.0009 max mem: 33300 Test: [16100/21770] eta: 0:03:36 time: 0.0389 data: 0.0009 max mem: 33300 Test: [16200/21770] eta: 0:03:32 time: 0.0378 data: 0.0009 max mem: 33300 Test: [16300/21770] eta: 0:03:28 time: 0.0381 data: 0.0009 max mem: 33300 Test: [16400/21770] eta: 0:03:24 time: 0.0381 data: 0.0009 max mem: 33300 Test: [16500/21770] eta: 0:03:20 time: 0.0378 data: 0.0009 max mem: 33300 Test: [16600/21770] eta: 0:03:17 time: 0.0380 data: 0.0009 max mem: 33300 Test: [16700/21770] eta: 0:03:13 time: 0.0378 data: 0.0009 max mem: 33300 Test: [16800/21770] eta: 0:03:09 time: 0.0380 data: 0.0009 max mem: 33300 Test: [16900/21770] eta: 0:03:05 time: 0.0379 data: 0.0008 max mem: 33300 Test: [17000/21770] eta: 0:03:01 time: 0.0378 data: 0.0009 max mem: 33300 Test: [17100/21770] eta: 
0:02:58 time: 0.0382 data: 0.0008 max mem: 33300 Test: [17200/21770] eta: 0:02:54 time: 0.0379 data: 0.0009 max mem: 33300 Test: [17300/21770] eta: 0:02:50 time: 0.0380 data: 0.0009 max mem: 33300 Test: [17400/21770] eta: 0:02:46 time: 0.0378 data: 0.0008 max mem: 33300 Test: [17500/21770] eta: 0:02:42 time: 0.0378 data: 0.0009 max mem: 33300 Test: [17600/21770] eta: 0:02:38 time: 0.0377 data: 0.0008 max mem: 33300 Test: [17700/21770] eta: 0:02:35 time: 0.0379 data: 0.0009 max mem: 33300 Test: [17800/21770] eta: 0:02:31 time: 0.0379 data: 0.0009 max mem: 33300 Test: [17900/21770] eta: 0:02:27 time: 0.0380 data: 0.0009 max mem: 33300 Test: [18000/21770] eta: 0:02:23 time: 0.0378 data: 0.0009 max mem: 33300 Test: [18100/21770] eta: 0:02:19 time: 0.0376 data: 0.0009 max mem: 33300 Test: [18200/21770] eta: 0:02:16 time: 0.0377 data: 0.0009 max mem: 33300 Test: [18300/21770] eta: 0:02:12 time: 0.0378 data: 0.0009 max mem: 33300 Test: [18400/21770] eta: 0:02:08 time: 0.0381 data: 0.0009 max mem: 33300 Test: [18500/21770] eta: 0:02:04 time: 0.0379 data: 0.0009 max mem: 33300 Test: [18600/21770] eta: 0:02:00 time: 0.0379 data: 0.0008 max mem: 33300 Test: [18700/21770] eta: 0:01:56 time: 0.0381 data: 0.0008 max mem: 33300 Test: [18800/21770] eta: 0:01:53 time: 0.0377 data: 0.0008 max mem: 33300 Test: [18900/21770] eta: 0:01:49 time: 0.0377 data: 0.0008 max mem: 33300 Test: [19000/21770] eta: 0:01:45 time: 0.0384 data: 0.0009 max mem: 33300 Test: [19100/21770] eta: 0:01:41 time: 0.0380 data: 0.0009 max mem: 33300 Test: [19200/21770] eta: 0:01:37 time: 0.0383 data: 0.0010 max mem: 33300 Test: [19300/21770] eta: 0:01:34 time: 0.0384 data: 0.0009 max mem: 33300 Test: [19400/21770] eta: 0:01:30 time: 0.0382 data: 0.0008 max mem: 33300 Test: [19500/21770] eta: 0:01:26 time: 0.0384 data: 0.0009 max mem: 33300 Test: [19600/21770] eta: 0:01:22 time: 0.0378 data: 0.0009 max mem: 33300 Test: [19700/21770] eta: 0:01:18 time: 0.0378 data: 0.0008 max mem: 33300 Test: [19800/21770] eta: 
0:01:15 time: 0.0379 data: 0.0008 max mem: 33300 Test: [19900/21770] eta: 0:01:11 time: 0.0379 data: 0.0009 max mem: 33300 Test: [20000/21770] eta: 0:01:07 time: 0.0381 data: 0.0009 max mem: 33300 Test: [20100/21770] eta: 0:01:03 time: 0.0382 data: 0.0008 max mem: 33300 Test: [20200/21770] eta: 0:00:59 time: 0.0381 data: 0.0008 max mem: 33300 Test: [20300/21770] eta: 0:00:56 time: 0.0380 data: 0.0009 max mem: 33300 Test: [20400/21770] eta: 0:00:52 time: 0.0378 data: 0.0008 max mem: 33300 Test: [20500/21770] eta: 0:00:48 time: 0.0379 data: 0.0008 max mem: 33300 Test: [20600/21770] eta: 0:00:44 time: 0.0378 data: 0.0008 max mem: 33300 Test: [20700/21770] eta: 0:00:40 time: 0.0378 data: 0.0008 max mem: 33300 Test: [20800/21770] eta: 0:00:36 time: 0.0377 data: 0.0008 max mem: 33300 Test: [20900/21770] eta: 0:00:33 time: 0.0386 data: 0.0008 max mem: 33300 Test: [21000/21770] eta: 0:00:29 time: 0.0390 data: 0.0008 max mem: 33300 Test: [21100/21770] eta: 0:00:25 time: 0.0389 data: 0.0009 max mem: 33300 Test: [21200/21770] eta: 0:00:21 time: 0.0390 data: 0.0008 max mem: 33300 Test: [21300/21770] eta: 0:00:17 time: 0.0386 data: 0.0008 max mem: 33300 Test: [21400/21770] eta: 0:00:14 time: 0.0391 data: 0.0008 max mem: 33300 Test: [21500/21770] eta: 0:00:10 time: 0.0387 data: 0.0008 max mem: 33300 Test: [21600/21770] eta: 0:00:06 time: 0.0391 data: 0.0008 max mem: 33300 Test: [21700/21770] eta: 0:00:02 time: 0.0385 data: 0.0008 max mem: 33300 Test: Total time: 0:13:50 Final results: Mean IoU is 0.00 precision@0.5 = 0.00 precision@0.6 = 0.00 precision@0.7 = 0.00 precision@0.8 = 0.00 precision@0.9 = 0.00 overall IoU = 0.00 mean IoU = 0.00 Mean accuracy for one-to-zero sample is 100.00 Average object IoU 0.0 Overall IoU 0.0 Epoch: [19] [ 0/4276] eta: 6:12:54 lr: 2.7996840594630887e-05 loss: 0.1021 (0.1021) time: 5.2327 data: 2.1159 max mem: 33300 Epoch: [19] [ 10/4276] eta: 3:45:29 lr: 2.7994034503329064e-05 loss: 0.1166 (0.1206) time: 3.1715 data: 0.1985 max mem: 33300 Epoch: 
[19] [ 20/4276] eta: 3:38:01 lr: 2.7991228380773616e-05 loss: 0.1166 (0.1239) time: 2.9656 data: 0.0062 max mem: 33300 Epoch: [19] [ 30/4276] eta: 3:35:50 lr: 2.7988422226960714e-05 loss: 0.1100 (0.1226) time: 2.9832 data: 0.0064 max mem: 33300 Epoch: [19] [ 40/4276] eta: 3:33:49 lr: 2.798561604188652e-05 loss: 0.1163 (0.1231) time: 2.9817 data: 0.0068 max mem: 33300 Epoch: [19] [ 50/4276] eta: 3:32:25 lr: 2.798280982554722e-05 loss: 0.1204 (0.1225) time: 2.9631 data: 0.0065 max mem: 33300 Epoch: [19] [ 60/4276] eta: 3:31:16 lr: 2.7980003577938957e-05 loss: 0.1135 (0.1208) time: 2.9616 data: 0.0068 max mem: 33300 Epoch: [19] [ 70/4276] eta: 3:30:16 lr: 2.7977197299057917e-05 loss: 0.1091 (0.1198) time: 2.9584 data: 0.0069 max mem: 33300 Epoch: [19] [ 80/4276] eta: 3:29:25 lr: 2.7974390988900256e-05 loss: 0.1109 (0.1201) time: 2.9576 data: 0.0067 max mem: 33300 Epoch: [19] [ 90/4276] eta: 3:28:45 lr: 2.797158464746214e-05 loss: 0.1132 (0.1198) time: 2.9651 data: 0.0065 max mem: 33300 Epoch: [19] [ 100/4276] eta: 3:28:07 lr: 2.7968778274739736e-05 loss: 0.1178 (0.1224) time: 2.9734 data: 0.0063 max mem: 33300 Epoch: [19] [ 110/4276] eta: 3:27:17 lr: 2.7965971870729206e-05 loss: 0.1294 (0.1230) time: 2.9548 data: 0.0068 max mem: 33300 Epoch: [19] [ 120/4276] eta: 3:26:39 lr: 2.796316543542672e-05 loss: 0.1210 (0.1233) time: 2.9486 data: 0.0067 max mem: 33300 Epoch: [19] [ 130/4276] eta: 3:26:03 lr: 2.796035896882843e-05 loss: 0.1228 (0.1242) time: 2.9627 data: 0.0067 max mem: 33300 Epoch: [19] [ 140/4276] eta: 3:25:26 lr: 2.7957552470930498e-05 loss: 0.1158 (0.1234) time: 2.9614 data: 0.0072 max mem: 33300 Epoch: [19] [ 150/4276] eta: 3:24:56 lr: 2.7954745941729095e-05 loss: 0.1074 (0.1234) time: 2.9699 data: 0.0074 max mem: 33300 Epoch: [19] [ 160/4276] eta: 3:24:16 lr: 2.7951939381220354e-05 loss: 0.1158 (0.1231) time: 2.9589 data: 0.0076 max mem: 33300 Epoch: [19] [ 170/4276] eta: 3:23:40 lr: 2.7949132789400457e-05 loss: 0.1135 (0.1229) time: 2.9448 data: 0.0076 
max mem: 33300 Epoch: [19] [ 180/4276] eta: 3:23:07 lr: 2.7946326166265552e-05 loss: 0.1223 (0.1232) time: 2.9577 data: 0.0078 max mem: 33300 Epoch: [19] [ 190/4276] eta: 3:22:33 lr: 2.79435195118118e-05 loss: 0.1325 (0.1238) time: 2.9605 data: 0.0071 max mem: 33300 Epoch: [19] [ 200/4276] eta: 3:22:01 lr: 2.7940712826035354e-05 loss: 0.1207 (0.1239) time: 2.9584 data: 0.0067 max mem: 33300 Epoch: [19] [ 210/4276] eta: 3:21:29 lr: 2.793790610893236e-05 loss: 0.1224 (0.1244) time: 2.9623 data: 0.0068 max mem: 33300 Epoch: [19] [ 220/4276] eta: 3:21:04 lr: 2.7935099360498984e-05 loss: 0.1222 (0.1240) time: 2.9812 data: 0.0067 max mem: 33300 Epoch: [19] [ 230/4276] eta: 3:20:32 lr: 2.7932292580731367e-05 loss: 0.1087 (0.1236) time: 2.9791 data: 0.0067 max mem: 33300 Epoch: [19] [ 240/4276] eta: 3:20:00 lr: 2.792948576962567e-05 loss: 0.1122 (0.1235) time: 2.9622 data: 0.0064 max mem: 33300 Epoch: [19] [ 250/4276] eta: 3:19:28 lr: 2.792667892717804e-05 loss: 0.1256 (0.1241) time: 2.9617 data: 0.0068 max mem: 33300 Epoch: [19] [ 260/4276] eta: 3:18:56 lr: 2.7923872053384626e-05 loss: 0.1336 (0.1242) time: 2.9592 data: 0.0069 max mem: 33300 Epoch: [19] [ 270/4276] eta: 3:18:25 lr: 2.7921065148241578e-05 loss: 0.1207 (0.1239) time: 2.9594 data: 0.0069 max mem: 33300 Epoch: [19] [ 280/4276] eta: 3:17:58 lr: 2.7918258211745052e-05 loss: 0.1128 (0.1237) time: 2.9759 data: 0.0070 max mem: 33300 Epoch: [19] [ 290/4276] eta: 3:17:26 lr: 2.7915451243891177e-05 loss: 0.1128 (0.1234) time: 2.9745 data: 0.0067 max mem: 33300 Epoch: [19] [ 300/4276] eta: 3:16:55 lr: 2.7912644244676117e-05 loss: 0.1115 (0.1231) time: 2.9600 data: 0.0067 max mem: 33300 Epoch: [19] [ 310/4276] eta: 3:16:24 lr: 2.7909837214096003e-05 loss: 0.1112 (0.1228) time: 2.9616 data: 0.0072 max mem: 33300 Epoch: [19] [ 320/4276] eta: 3:15:52 lr: 2.790703015214699e-05 loss: 0.1152 (0.1231) time: 2.9580 data: 0.0074 max mem: 33300 Epoch: [19] [ 330/4276] eta: 3:15:20 lr: 2.7904223058825214e-05 loss: 0.1236 (0.1234) 
time: 2.9528 data: 0.0072 max mem: 33300 Epoch: [19] [ 340/4276] eta: 3:14:50 lr: 2.7901415934126824e-05 loss: 0.1235 (0.1233) time: 2.9577 data: 0.0070 max mem: 33300 Epoch: [19] [ 350/4276] eta: 3:14:20 lr: 2.7898608778047958e-05 loss: 0.1141 (0.1233) time: 2.9654 data: 0.0069 max mem: 33300 Epoch: [19] [ 360/4276] eta: 3:13:42 lr: 2.7895801590584765e-05 loss: 0.1256 (0.1239) time: 2.9310 data: 0.0070 max mem: 33300 Epoch: [19] [ 370/4276] eta: 3:13:04 lr: 2.7892994371733368e-05 loss: 0.1127 (0.1235) time: 2.8942 data: 0.0069 max mem: 33300 Epoch: [19] [ 380/4276] eta: 3:12:29 lr: 2.789018712148992e-05 loss: 0.1071 (0.1234) time: 2.8991 data: 0.0072 max mem: 33300 Epoch: [19] [ 390/4276] eta: 3:11:54 lr: 2.788737983985055e-05 loss: 0.1101 (0.1237) time: 2.9082 data: 0.0072 max mem: 33300 Epoch: [19] [ 400/4276] eta: 3:11:25 lr: 2.7884572526811404e-05 loss: 0.1395 (0.1241) time: 2.9407 data: 0.0067 max mem: 33300 Epoch: [19] [ 410/4276] eta: 3:10:55 lr: 2.7881765182368615e-05 loss: 0.1333 (0.1241) time: 2.9685 data: 0.0066 max mem: 33300 Epoch: [19] [ 420/4276] eta: 3:10:28 lr: 2.7878957806518312e-05 loss: 0.1154 (0.1241) time: 2.9761 data: 0.0063 max mem: 33300 Epoch: [19] [ 430/4276] eta: 3:09:53 lr: 2.787615039925664e-05 loss: 0.1157 (0.1242) time: 2.9452 data: 0.0063 max mem: 33300 Epoch: [19] [ 440/4276] eta: 3:09:23 lr: 2.7873342960579728e-05 loss: 0.1129 (0.1239) time: 2.9291 data: 0.0066 max mem: 33300 Epoch: [19] [ 450/4276] eta: 3:08:53 lr: 2.7870535490483707e-05 loss: 0.1173 (0.1240) time: 2.9560 data: 0.0071 max mem: 33300 Epoch: [19] [ 460/4276] eta: 3:08:22 lr: 2.7867727988964713e-05 loss: 0.1136 (0.1235) time: 2.9570 data: 0.0077 max mem: 33300 Epoch: [19] [ 470/4276] eta: 3:07:52 lr: 2.7864920456018865e-05 loss: 0.1028 (0.1231) time: 2.9567 data: 0.0081 max mem: 33300 Epoch: [19] [ 480/4276] eta: 3:07:25 lr: 2.7862112891642305e-05 loss: 0.1062 (0.1228) time: 2.9732 data: 0.0077 max mem: 33300 Epoch: [19] [ 490/4276] eta: 3:06:55 lr: 
2.7859305295831163e-05 loss: 0.1018 (0.1225) time: 2.9728 data: 0.0075 max mem: 33300 Epoch: [19] [ 500/4276] eta: 3:06:25 lr: 2.7856497668581555e-05 loss: 0.0994 (0.1222) time: 2.9560 data: 0.0078 max mem: 33300 Epoch: [19] [ 510/4276] eta: 3:05:55 lr: 2.785369000988962e-05 loss: 0.0957 (0.1219) time: 2.9609 data: 0.0080 max mem: 33300 Epoch: [19] [ 520/4276] eta: 3:05:26 lr: 2.785088231975148e-05 loss: 0.0962 (0.1218) time: 2.9635 data: 0.0078 max mem: 33300 Epoch: [19] [ 530/4276] eta: 3:04:56 lr: 2.7848074598163264e-05 loss: 0.1145 (0.1217) time: 2.9592 data: 0.0075 max mem: 33300 Epoch: [19] [ 540/4276] eta: 3:04:28 lr: 2.784526684512109e-05 loss: 0.1088 (0.1214) time: 2.9714 data: 0.0073 max mem: 33300 Epoch: [19] [ 550/4276] eta: 3:03:57 lr: 2.784245906062108e-05 loss: 0.1086 (0.1212) time: 2.9685 data: 0.0069 max mem: 33300 Epoch: [19] [ 560/4276] eta: 3:03:28 lr: 2.7839651244659365e-05 loss: 0.1149 (0.1213) time: 2.9579 data: 0.0072 max mem: 33300 Epoch: [19] [ 570/4276] eta: 3:02:58 lr: 2.7836843397232054e-05 loss: 0.1229 (0.1212) time: 2.9620 data: 0.0078 max mem: 33300 Epoch: [19] [ 580/4276] eta: 3:02:28 lr: 2.7834035518335284e-05 loss: 0.1097 (0.1211) time: 2.9583 data: 0.0074 max mem: 33300 Epoch: [19] [ 590/4276] eta: 3:01:58 lr: 2.783122760796517e-05 loss: 0.1054 (0.1209) time: 2.9587 data: 0.0069 max mem: 33300 Epoch: [19] [ 600/4276] eta: 3:01:29 lr: 2.782841966611782e-05 loss: 0.1137 (0.1209) time: 2.9625 data: 0.0065 max mem: 33300 Epoch: [19] [ 610/4276] eta: 3:01:02 lr: 2.7825611692789373e-05 loss: 0.1140 (0.1207) time: 2.9832 data: 0.0064 max mem: 33300 Epoch: [19] [ 620/4276] eta: 3:00:31 lr: 2.7822803687975924e-05 loss: 0.1100 (0.1206) time: 2.9713 data: 0.0066 max mem: 33300 Epoch: [19] [ 630/4276] eta: 3:00:01 lr: 2.7819995651673598e-05 loss: 0.1120 (0.1211) time: 2.9477 data: 0.0067 max mem: 33300 Epoch: [19] [ 640/4276] eta: 2:59:31 lr: 2.7817187583878513e-05 loss: 0.1145 (0.1212) time: 2.9571 data: 0.0073 max mem: 33300 Epoch: [19] [ 
650/4276] eta: 2:58:59 lr: 2.781437948458678e-05 loss: 0.1145 (0.1216) time: 2.9397 data: 0.0072 max mem: 33300 Epoch: [19] [ 660/4276] eta: 2:58:26 lr: 2.781157135379451e-05 loss: 0.1345 (0.1219) time: 2.9103 data: 0.0069 max mem: 33300 Epoch: [19] [ 670/4276] eta: 2:57:55 lr: 2.7808763191497828e-05 loss: 0.1292 (0.1219) time: 2.9229 data: 0.0070 max mem: 33300 Epoch: [19] [ 680/4276] eta: 2:57:27 lr: 2.780595499769283e-05 loss: 0.1261 (0.1219) time: 2.9616 data: 0.0067 max mem: 33300 Epoch: [19] [ 690/4276] eta: 2:56:57 lr: 2.7803146772375637e-05 loss: 0.1208 (0.1220) time: 2.9658 data: 0.0069 max mem: 33300 Epoch: [19] [ 700/4276] eta: 2:56:26 lr: 2.780033851554235e-05 loss: 0.1208 (0.1219) time: 2.9417 data: 0.0071 max mem: 33300 Epoch: [19] [ 710/4276] eta: 2:55:53 lr: 2.7797530227189088e-05 loss: 0.1092 (0.1219) time: 2.9176 data: 0.0070 max mem: 33300 Epoch: [19] [ 720/4276] eta: 2:55:21 lr: 2.7794721907311956e-05 loss: 0.1005 (0.1217) time: 2.9064 data: 0.0071 max mem: 33300 Epoch: [19] [ 730/4276] eta: 2:54:50 lr: 2.7791913555907052e-05 loss: 0.1011 (0.1217) time: 2.9161 data: 0.0068 max mem: 33300 Epoch: [19] [ 740/4276] eta: 2:54:19 lr: 2.7789105172970492e-05 loss: 0.1064 (0.1216) time: 2.9293 data: 0.0071 max mem: 33300 Epoch: [19] [ 750/4276] eta: 2:53:50 lr: 2.7786296758498388e-05 loss: 0.1071 (0.1216) time: 2.9467 data: 0.0069 max mem: 33300 Epoch: [19] [ 760/4276] eta: 2:53:20 lr: 2.7783488312486822e-05 loss: 0.1070 (0.1215) time: 2.9613 data: 0.0064 max mem: 33300 Epoch: [19] [ 770/4276] eta: 2:52:51 lr: 2.7780679834931922e-05 loss: 0.1057 (0.1215) time: 2.9629 data: 0.0071 max mem: 33300 Epoch: [19] [ 780/4276] eta: 2:52:21 lr: 2.7777871325829768e-05 loss: 0.1231 (0.1215) time: 2.9622 data: 0.0073 max mem: 33300 Epoch: [19] [ 790/4276] eta: 2:51:51 lr: 2.7775062785176476e-05 loss: 0.1236 (0.1216) time: 2.9457 data: 0.0075 max mem: 33300 Epoch: [19] [ 800/4276] eta: 2:51:20 lr: 2.777225421296814e-05 loss: 0.1187 (0.1216) time: 2.9295 data: 0.0075 
max mem: 33300 Epoch: [19] [ 810/4276] eta: 2:50:51 lr: 2.7769445609200866e-05 loss: 0.1220 (0.1218) time: 2.9501 data: 0.0073 max mem: 33300 Epoch: [19] [ 820/4276] eta: 2:50:19 lr: 2.7766636973870752e-05 loss: 0.1195 (0.1216) time: 2.9403 data: 0.0073 max mem: 33300 Epoch: [19] [ 830/4276] eta: 2:49:48 lr: 2.7763828306973888e-05 loss: 0.1135 (0.1218) time: 2.9127 data: 0.0076 max mem: 33300 Epoch: [19] [ 840/4276] eta: 2:49:17 lr: 2.7761019608506385e-05 loss: 0.1203 (0.1219) time: 2.9252 data: 0.0076 max mem: 33300 Epoch: [19] [ 850/4276] eta: 2:48:47 lr: 2.775821087846432e-05 loss: 0.1189 (0.1218) time: 2.9339 data: 0.0071 max mem: 33300 Epoch: [19] [ 860/4276] eta: 2:48:16 lr: 2.7755402116843805e-05 loss: 0.1124 (0.1218) time: 2.9231 data: 0.0078 max mem: 33300 Epoch: [19] [ 870/4276] eta: 2:47:45 lr: 2.7752593323640918e-05 loss: 0.1155 (0.1219) time: 2.9149 data: 0.0075 max mem: 33300 Epoch: [19] [ 880/4276] eta: 2:47:13 lr: 2.7749784498851766e-05 loss: 0.1200 (0.1220) time: 2.9121 data: 0.0068 max mem: 33300 Epoch: [19] [ 890/4276] eta: 2:46:41 lr: 2.774697564247244e-05 loss: 0.1264 (0.1221) time: 2.8983 data: 0.0073 max mem: 33300 Epoch: [19] [ 900/4276] eta: 2:46:10 lr: 2.7744166754499028e-05 loss: 0.1213 (0.1221) time: 2.8991 data: 0.0077 max mem: 33300 Epoch: [19] [ 910/4276] eta: 2:45:38 lr: 2.7741357834927624e-05 loss: 0.1183 (0.1221) time: 2.9046 data: 0.0078 max mem: 33300 Epoch: [19] [ 920/4276] eta: 2:45:07 lr: 2.7738548883754313e-05 loss: 0.1164 (0.1223) time: 2.8961 data: 0.0079 max mem: 33300 Epoch: [19] [ 930/4276] eta: 2:44:35 lr: 2.7735739900975176e-05 loss: 0.1164 (0.1223) time: 2.8940 data: 0.0074 max mem: 33300 Epoch: [19] [ 940/4276] eta: 2:44:04 lr: 2.773293088658632e-05 loss: 0.1151 (0.1223) time: 2.9069 data: 0.0068 max mem: 33300 Epoch: [19] [ 950/4276] eta: 2:43:35 lr: 2.7730121840583816e-05 loss: 0.1168 (0.1224) time: 2.9312 data: 0.0068 max mem: 33300 Epoch: [19] [ 960/4276] eta: 2:43:05 lr: 2.7727312762963752e-05 loss: 0.1295 
(0.1226) time: 2.9475 data: 0.0067 max mem: 33300 Epoch: [19] [ 970/4276] eta: 2:42:35 lr: 2.772450365372222e-05 loss: 0.1231 (0.1226) time: 2.9400 data: 0.0071 max mem: 33300 Epoch: [19] [ 980/4276] eta: 2:42:04 lr: 2.7721694512855305e-05 loss: 0.1230 (0.1226) time: 2.9241 data: 0.0078 max mem: 33300 Epoch: [19] [ 990/4276] eta: 2:41:35 lr: 2.7718885340359084e-05 loss: 0.1229 (0.1226) time: 2.9312 data: 0.0080 max mem: 33300 Epoch: [19] [1000/4276] eta: 2:41:05 lr: 2.7716076136229642e-05 loss: 0.1159 (0.1226) time: 2.9424 data: 0.0077 max mem: 33300 Epoch: [19] [1010/4276] eta: 2:40:35 lr: 2.7713266900463052e-05 loss: 0.1145 (0.1225) time: 2.9417 data: 0.0077 max mem: 33300 Epoch: [19] [1020/4276] eta: 2:40:05 lr: 2.771045763305541e-05 loss: 0.1154 (0.1226) time: 2.9409 data: 0.0078 max mem: 33300 Epoch: [19] [1030/4276] eta: 2:39:36 lr: 2.7707648334002777e-05 loss: 0.1204 (0.1225) time: 2.9421 data: 0.0077 max mem: 33300 Epoch: [19] [1040/4276] eta: 2:39:06 lr: 2.7704839003301243e-05 loss: 0.1232 (0.1225) time: 2.9467 data: 0.0076 max mem: 33300 Epoch: [19] [1050/4276] eta: 2:38:37 lr: 2.7702029640946886e-05 loss: 0.1192 (0.1226) time: 2.9590 data: 0.0075 max mem: 33300 Epoch: [19] [1060/4276] eta: 2:38:06 lr: 2.769922024693578e-05 loss: 0.1192 (0.1227) time: 2.9294 data: 0.0075 max mem: 33300 Epoch: [19] [1070/4276] eta: 2:37:34 lr: 2.7696410821264008e-05 loss: 0.1344 (0.1228) time: 2.8840 data: 0.0081 max mem: 33300 Epoch: [19] [1080/4276] eta: 2:37:04 lr: 2.7693601363927622e-05 loss: 0.1279 (0.1229) time: 2.9102 data: 0.0087 max mem: 33300 Epoch: [19] [1090/4276] eta: 2:36:35 lr: 2.769079187492272e-05 loss: 0.1279 (0.1230) time: 2.9437 data: 0.0087 max mem: 33300 Epoch: [19] [1100/4276] eta: 2:36:05 lr: 2.768798235424536e-05 loss: 0.1320 (0.1231) time: 2.9466 data: 0.0083 max mem: 33300 Epoch: [19] [1110/4276] eta: 2:35:36 lr: 2.7685172801891628e-05 loss: 0.1281 (0.1232) time: 2.9449 data: 0.0079 max mem: 33300 Epoch: [19] [1120/4276] eta: 2:35:06 lr: 
2.7682363217857582e-05 loss: 0.1252 (0.1232) time: 2.9428 data: 0.0084 max mem: 33300 Epoch: [19] [1130/4276] eta: 2:34:36 lr: 2.7679553602139303e-05 loss: 0.1197 (0.1232) time: 2.9404 data: 0.0085 max mem: 33300 Epoch: [19] [1140/4276] eta: 2:34:06 lr: 2.7676743954732847e-05 loss: 0.1273 (0.1232) time: 2.9402 data: 0.0083 max mem: 33300 Epoch: [19] [1150/4276] eta: 2:33:37 lr: 2.76739342756343e-05 loss: 0.1276 (0.1232) time: 2.9420 data: 0.0082 max mem: 33300 Epoch: [19] [1160/4276] eta: 2:33:07 lr: 2.7671124564839705e-05 loss: 0.1164 (0.1233) time: 2.9414 data: 0.0080 max mem: 33300 Epoch: [19] [1170/4276] eta: 2:32:37 lr: 2.7668314822345148e-05 loss: 0.1165 (0.1232) time: 2.9412 data: 0.0082 max mem: 33300 Epoch: [19] [1180/4276] eta: 2:32:08 lr: 2.766550504814669e-05 loss: 0.1181 (0.1232) time: 2.9422 data: 0.0082 max mem: 33300 Epoch: [19] [1190/4276] eta: 2:31:38 lr: 2.7662695242240393e-05 loss: 0.1058 (0.1231) time: 2.9438 data: 0.0080 max mem: 33300 Epoch: [19] [1200/4276] eta: 2:31:09 lr: 2.765988540462232e-05 loss: 0.1118 (0.1231) time: 2.9464 data: 0.0080 max mem: 33300 Epoch: [19] [1210/4276] eta: 2:30:39 lr: 2.765707553528854e-05 loss: 0.1134 (0.1230) time: 2.9468 data: 0.0084 max mem: 33300 Epoch: [19] [1220/4276] eta: 2:30:09 lr: 2.7654265634235104e-05 loss: 0.1134 (0.1230) time: 2.9309 data: 0.0083 max mem: 33300 Epoch: [19] [1230/4276] eta: 2:29:38 lr: 2.765145570145809e-05 loss: 0.1134 (0.1231) time: 2.9082 data: 0.0085 max mem: 33300 Epoch: [19] [1240/4276] eta: 2:29:09 lr: 2.764864573695354e-05 loss: 0.1207 (0.1231) time: 2.9298 data: 0.0090 max mem: 33300 Epoch: [19] [1250/4276] eta: 2:28:39 lr: 2.7645835740717513e-05 loss: 0.1299 (0.1232) time: 2.9391 data: 0.0086 max mem: 33300 Epoch: [19] [1260/4276] eta: 2:28:08 lr: 2.764302571274608e-05 loss: 0.1070 (0.1230) time: 2.9109 data: 0.0084 max mem: 33300 Epoch: [19] [1270/4276] eta: 2:27:38 lr: 2.7640215653035285e-05 loss: 0.1045 (0.1230) time: 2.9039 data: 0.0085 max mem: 33300 Epoch: [19] 
[1280/4276] eta: 2:27:07 lr: 2.763740556158119e-05 loss: 0.1213 (0.1230) time: 2.9045 data: 0.0084 max mem: 33300 Epoch: [19] [1290/4276] eta: 2:26:37 lr: 2.7634595438379863e-05 loss: 0.1213 (0.1231) time: 2.9039 data: 0.0081 max mem: 33300 Epoch: [19] [1300/4276] eta: 2:26:07 lr: 2.7631785283427335e-05 loss: 0.1072 (0.1230) time: 2.9074 data: 0.0085 max mem: 33300 Epoch: [19] [1310/4276] eta: 2:25:37 lr: 2.7628975096719678e-05 loss: 0.0982 (0.1229) time: 2.9134 data: 0.0092 max mem: 33300 Epoch: [19] [1320/4276] eta: 2:25:06 lr: 2.762616487825293e-05 loss: 0.1133 (0.1230) time: 2.9086 data: 0.0091 max mem: 33300 Epoch: [19] [1330/4276] eta: 2:24:36 lr: 2.762335462802315e-05 loss: 0.1226 (0.1230) time: 2.9057 data: 0.0090 max mem: 33300 Epoch: [19] [1340/4276] eta: 2:24:06 lr: 2.7620544346026387e-05 loss: 0.1155 (0.1229) time: 2.9077 data: 0.0087 max mem: 33300 Epoch: [19] [1350/4276] eta: 2:23:35 lr: 2.7617734032258695e-05 loss: 0.1225 (0.1230) time: 2.9043 data: 0.0085 max mem: 33300 Epoch: [19] [1360/4276] eta: 2:23:05 lr: 2.7614923686716114e-05 loss: 0.1169 (0.1229) time: 2.9040 data: 0.0090 max mem: 33300 Epoch: [19] [1370/4276] eta: 2:22:35 lr: 2.76121133093947e-05 loss: 0.1045 (0.1229) time: 2.9060 data: 0.0090 max mem: 33300 Epoch: [19] [1380/4276] eta: 2:22:04 lr: 2.7609302900290502e-05 loss: 0.1121 (0.1230) time: 2.9042 data: 0.0088 max mem: 33300 Epoch: [19] [1390/4276] eta: 2:21:34 lr: 2.7606492459399546e-05 loss: 0.1244 (0.1229) time: 2.9082 data: 0.0087 max mem: 33300 Epoch: [19] [1400/4276] eta: 2:21:04 lr: 2.7603681986717895e-05 loss: 0.1231 (0.1230) time: 2.9175 data: 0.0083 max mem: 33300 Epoch: [19] [1410/4276] eta: 2:20:34 lr: 2.7600871482241586e-05 loss: 0.1060 (0.1229) time: 2.9155 data: 0.0083 max mem: 33300 Epoch: [19] [1420/4276] eta: 2:20:04 lr: 2.759806094596667e-05 loss: 0.1112 (0.1229) time: 2.9107 data: 0.0087 max mem: 33300 Epoch: [19] [1430/4276] eta: 2:19:34 lr: 2.7595250377889177e-05 loss: 0.1146 (0.1228) time: 2.9077 data: 0.0087 
max mem: 33300 Epoch: [19] [1440/4276] eta: 2:19:04 lr: 2.7592439778005163e-05 loss: 0.1173 (0.1228) time: 2.9079 data: 0.0089 max mem: 33300 Epoch: [19] [1450/4276] eta: 2:18:34 lr: 2.7589629146310654e-05 loss: 0.1146 (0.1228) time: 2.9144 data: 0.0089 max mem: 33300 Epoch: [19] [1460/4276] eta: 2:18:04 lr: 2.7586818482801703e-05 loss: 0.1063 (0.1228) time: 2.9133 data: 0.0090 max mem: 33300 Epoch: [19] [1470/4276] eta: 2:17:34 lr: 2.7584007787474325e-05 loss: 0.1137 (0.1228) time: 2.9160 data: 0.0090 max mem: 33300 Epoch: [19] [1480/4276] eta: 2:17:05 lr: 2.7581197060324583e-05 loss: 0.1228 (0.1228) time: 2.9411 data: 0.0086 max mem: 33300 Epoch: [19] [1490/4276] eta: 2:16:36 lr: 2.7578386301348497e-05 loss: 0.1104 (0.1228) time: 2.9575 data: 0.0082 max mem: 33300 Epoch: [19] [1500/4276] eta: 2:16:07 lr: 2.7575575510542108e-05 loss: 0.1160 (0.1227) time: 2.9556 data: 0.0083 max mem: 33300 Epoch: [19] [1510/4276] eta: 2:15:38 lr: 2.7572764687901453e-05 loss: 0.1064 (0.1226) time: 2.9536 data: 0.0082 max mem: 33300 Epoch: [19] [1520/4276] eta: 2:15:08 lr: 2.7569953833422568e-05 loss: 0.0994 (0.1225) time: 2.9500 data: 0.0080 max mem: 33300 Epoch: [19] [1530/4276] eta: 2:14:39 lr: 2.7567142947101478e-05 loss: 0.1000 (0.1224) time: 2.9505 data: 0.0078 max mem: 33300 Epoch: [19] [1540/4276] eta: 2:14:10 lr: 2.7564332028934216e-05 loss: 0.1100 (0.1224) time: 2.9533 data: 0.0078 max mem: 33300 Epoch: [19] [1550/4276] eta: 2:13:41 lr: 2.7561521078916814e-05 loss: 0.1102 (0.1224) time: 2.9536 data: 0.0080 max mem: 33300 Epoch: [19] [1560/4276] eta: 2:13:11 lr: 2.7558710097045298e-05 loss: 0.1181 (0.1224) time: 2.9464 data: 0.0084 max mem: 33300 Epoch: [19] [1570/4276] eta: 2:12:41 lr: 2.75558990833157e-05 loss: 0.1181 (0.1224) time: 2.9360 data: 0.0083 max mem: 33300 Epoch: [19] [1580/4276] eta: 2:12:12 lr: 2.7553088037724055e-05 loss: 0.1067 (0.1223) time: 2.9422 data: 0.0082 max mem: 33300 Epoch: [19] [1590/4276] eta: 2:11:43 lr: 2.755027696026638e-05 loss: 0.1064 
(0.1223) time: 2.9515 data: 0.0079 max mem: 33300 Epoch: [19] [1600/4276] eta: 2:11:14 lr: 2.754746585093871e-05 loss: 0.1188 (0.1223) time: 2.9528 data: 0.0083 max mem: 33300 Epoch: [19] [1610/4276] eta: 2:10:44 lr: 2.7544654709737057e-05 loss: 0.1065 (0.1223) time: 2.9497 data: 0.0091 max mem: 33300 Epoch: [19] [1620/4276] eta: 2:10:15 lr: 2.754184353665746e-05 loss: 0.1084 (0.1222) time: 2.9462 data: 0.0087 max mem: 33300 Epoch: [19] [1630/4276] eta: 2:09:46 lr: 2.753903233169593e-05 loss: 0.1189 (0.1222) time: 2.9449 data: 0.0087 max mem: 33300 Epoch: [19] [1640/4276] eta: 2:09:16 lr: 2.7536221094848493e-05 loss: 0.1082 (0.1220) time: 2.9535 data: 0.0088 max mem: 33300 Epoch: [19] [1650/4276] eta: 2:08:47 lr: 2.7533409826111167e-05 loss: 0.1077 (0.1220) time: 2.9530 data: 0.0086 max mem: 33300 Epoch: [19] [1660/4276] eta: 2:08:18 lr: 2.7530598525479984e-05 loss: 0.1127 (0.1220) time: 2.9431 data: 0.0089 max mem: 33300 Epoch: [19] [1670/4276] eta: 2:07:48 lr: 2.752778719295095e-05 loss: 0.0990 (0.1219) time: 2.9428 data: 0.0086 max mem: 33300 Epoch: [19] [1680/4276] eta: 2:07:19 lr: 2.75249758285201e-05 loss: 0.1033 (0.1219) time: 2.9438 data: 0.0086 max mem: 33300 Epoch: [19] [1690/4276] eta: 2:06:49 lr: 2.7522164432183434e-05 loss: 0.1140 (0.1219) time: 2.9442 data: 0.0084 max mem: 33300 Epoch: [19] [1700/4276] eta: 2:06:20 lr: 2.7519353003936976e-05 loss: 0.1062 (0.1218) time: 2.9429 data: 0.0082 max mem: 33300 Epoch: [19] [1710/4276] eta: 2:05:51 lr: 2.7516541543776736e-05 loss: 0.1189 (0.1218) time: 2.9448 data: 0.0086 max mem: 33300 Epoch: [19] [1720/4276] eta: 2:05:21 lr: 2.7513730051698733e-05 loss: 0.1189 (0.1218) time: 2.9451 data: 0.0090 max mem: 33300 Epoch: [19] [1730/4276] eta: 2:04:52 lr: 2.7510918527698985e-05 loss: 0.1124 (0.1217) time: 2.9443 data: 0.0088 max mem: 33300 Epoch: [19] [1740/4276] eta: 2:04:22 lr: 2.7508106971773502e-05 loss: 0.1095 (0.1217) time: 2.9461 data: 0.0081 max mem: 33300 Epoch: [19] [1750/4276] eta: 2:03:53 lr: 
2.7505295383918295e-05 loss: 0.1095 (0.1217) time: 2.9485 data: 0.0079 max mem: 33300 Epoch: [19] [1760/4276] eta: 2:03:24 lr: 2.750248376412937e-05 loss: 0.0913 (0.1216) time: 2.9484 data: 0.0082 max mem: 33300 Epoch: [19] [1770/4276] eta: 2:02:54 lr: 2.7499672112402746e-05 loss: 0.1145 (0.1216) time: 2.9470 data: 0.0085 max mem: 33300 Epoch: [19] [1780/4276] eta: 2:02:25 lr: 2.7496860428734422e-05 loss: 0.1177 (0.1215) time: 2.9445 data: 0.0082 max mem: 33300 Epoch: [19] [1790/4276] eta: 2:01:56 lr: 2.7494048713120412e-05 loss: 0.1122 (0.1215) time: 2.9448 data: 0.0081 max mem: 33300 Epoch: [19] [1800/4276] eta: 2:01:26 lr: 2.7491236965556723e-05 loss: 0.1094 (0.1214) time: 2.9481 data: 0.0081 max mem: 33300 Epoch: [19] [1810/4276] eta: 2:00:57 lr: 2.7488425186039356e-05 loss: 0.1177 (0.1215) time: 2.9433 data: 0.0080 max mem: 33300 Epoch: [19] [1820/4276] eta: 2:00:27 lr: 2.7485613374564324e-05 loss: 0.1264 (0.1215) time: 2.9473 data: 0.0080 max mem: 33300 Epoch: [19] [1830/4276] eta: 1:59:58 lr: 2.7482801531127632e-05 loss: 0.1154 (0.1215) time: 2.9503 data: 0.0083 max mem: 33300 Epoch: [19] [1840/4276] eta: 1:59:28 lr: 2.7479989655725275e-05 loss: 0.1128 (0.1215) time: 2.9353 data: 0.0083 max mem: 33300 Epoch: [19] [1850/4276] eta: 1:58:59 lr: 2.7477177748353262e-05 loss: 0.1139 (0.1215) time: 2.9183 data: 0.0085 max mem: 33300 Epoch: [19] [1860/4276] eta: 1:58:29 lr: 2.7474365809007578e-05 loss: 0.1174 (0.1215) time: 2.9079 data: 0.0087 max mem: 33300 Epoch: [19] [1870/4276] eta: 1:57:59 lr: 2.7471553837684242e-05 loss: 0.1283 (0.1216) time: 2.9022 data: 0.0088 max mem: 33300 Epoch: [19] [1880/4276] eta: 1:57:29 lr: 2.7468741834379253e-05 loss: 0.1226 (0.1216) time: 2.9029 data: 0.0086 max mem: 33300 Epoch: [19] [1890/4276] eta: 1:56:59 lr: 2.7465929799088596e-05 loss: 0.1160 (0.1216) time: 2.9057 data: 0.0086 max mem: 33300 Epoch: [19] [1900/4276] eta: 1:56:29 lr: 2.746311773180828e-05 loss: 0.1160 (0.1216) time: 2.9056 data: 0.0085 max mem: 33300 Epoch: 
[19] [1910/4276] eta: 1:55:59 lr: 2.7460305632534307e-05 loss: 0.1074 (0.1216) time: 2.9074 data: 0.0086 max mem: 33300 Epoch: [19] [1920/4276] eta: 1:55:29 lr: 2.7457493501262653e-05 loss: 0.1074 (0.1216) time: 2.9068 data: 0.0082 max mem: 33300 Epoch: [19] [1930/4276] eta: 1:54:59 lr: 2.745468133798933e-05 loss: 0.1143 (0.1216) time: 2.9007 data: 0.0079 max mem: 33300 Epoch: [19] [1940/4276] eta: 1:54:30 lr: 2.745186914271032e-05 loss: 0.1170 (0.1215) time: 2.9031 data: 0.0080 max mem: 33300 Epoch: [19] [1950/4276] eta: 1:54:00 lr: 2.7449056915421616e-05 loss: 0.1195 (0.1216) time: 2.9213 data: 0.0083 max mem: 33300 Epoch: [19] [1960/4276] eta: 1:53:31 lr: 2.7446244656119217e-05 loss: 0.1215 (0.1215) time: 2.9304 data: 0.0084 max mem: 33300 Epoch: [19] [1970/4276] eta: 1:53:01 lr: 2.744343236479911e-05 loss: 0.0973 (0.1214) time: 2.9161 data: 0.0085 max mem: 33300 Epoch: [19] [1980/4276] eta: 1:52:31 lr: 2.7440620041457286e-05 loss: 0.0939 (0.1214) time: 2.9163 data: 0.0087 max mem: 33300 Epoch: [19] [1990/4276] eta: 1:52:02 lr: 2.743780768608974e-05 loss: 0.1062 (0.1214) time: 2.9432 data: 0.0083 max mem: 33300 Epoch: [19] [2000/4276] eta: 1:51:32 lr: 2.743499529869245e-05 loss: 0.1114 (0.1214) time: 2.9368 data: 0.0083 max mem: 33300 Epoch: [19] [2010/4276] eta: 1:51:03 lr: 2.7432182879261404e-05 loss: 0.1228 (0.1214) time: 2.9157 data: 0.0090 max mem: 33300 Epoch: [19] [2020/4276] eta: 1:50:33 lr: 2.7429370427792587e-05 loss: 0.1264 (0.1214) time: 2.9059 data: 0.0093 max mem: 33300 Epoch: [19] [2030/4276] eta: 1:50:03 lr: 2.7426557944281987e-05 loss: 0.1064 (0.1213) time: 2.9041 data: 0.0094 max mem: 33300 Epoch: [19] [2040/4276] eta: 1:49:33 lr: 2.7423745428725594e-05 loss: 0.1064 (0.1212) time: 2.9138 data: 0.0093 max mem: 33300 Epoch: [19] [2050/4276] eta: 1:49:03 lr: 2.7420932881119384e-05 loss: 0.1129 (0.1212) time: 2.9114 data: 0.0087 max mem: 33300 Epoch: [19] [2060/4276] eta: 1:48:35 lr: 2.7418120301459337e-05 loss: 0.1153 (0.1212) time: 2.9856 data: 
0.0089 max mem: 33300 Epoch: [19] [2070/4276] eta: 1:48:08 lr: 2.7415307689741437e-05 loss: 0.1114 (0.1212) time: 3.0779 data: 0.0087 max mem: 33300 Epoch: [19] [2080/4276] eta: 1:47:39 lr: 2.7412495045961673e-05 loss: 0.1139 (0.1212) time: 3.0753 data: 0.0078 max mem: 33300 Epoch: [19] [2090/4276] eta: 1:47:11 lr: 2.7409682370116007e-05 loss: 0.1212 (0.1212) time: 3.0487 data: 0.0083 max mem: 33300 Epoch: [19] [2100/4276] eta: 1:46:43 lr: 2.740686966220043e-05 loss: 0.1155 (0.1212) time: 3.0372 data: 0.0091 max mem: 33300 Epoch: [19] [2110/4276] eta: 1:46:14 lr: 2.7404056922210913e-05 loss: 0.1099 (0.1211) time: 3.0493 data: 0.0086 max mem: 33300 Epoch: [19] [2120/4276] eta: 1:45:47 lr: 2.7401244150143435e-05 loss: 0.0962 (0.1210) time: 3.0789 data: 0.0080 max mem: 33300 Epoch: [19] [2130/4276] eta: 1:45:18 lr: 2.739843134599397e-05 loss: 0.1054 (0.1209) time: 3.0657 data: 0.0083 max mem: 33300 Epoch: [19] [2140/4276] eta: 1:44:50 lr: 2.7395618509758497e-05 loss: 0.1071 (0.1209) time: 3.0389 data: 0.0081 max mem: 33300 Epoch: [19] [2150/4276] eta: 1:44:21 lr: 2.7392805641432982e-05 loss: 0.1058 (0.1208) time: 3.0389 data: 0.0080 max mem: 33300 Epoch: [19] [2160/4276] eta: 1:43:53 lr: 2.738999274101341e-05 loss: 0.1002 (0.1208) time: 3.0539 data: 0.0079 max mem: 33300 Epoch: [19] [2170/4276] eta: 1:43:25 lr: 2.738717980849574e-05 loss: 0.1161 (0.1207) time: 3.0808 data: 0.0075 max mem: 33300 Epoch: [19] [2180/4276] eta: 1:42:56 lr: 2.7384366843875942e-05 loss: 0.1212 (0.1207) time: 3.0674 data: 0.0075 max mem: 33300 Epoch: [19] [2190/4276] eta: 1:42:28 lr: 2.7381553847149987e-05 loss: 0.1205 (0.1207) time: 3.0386 data: 0.0078 max mem: 33300 Epoch: [19] [2200/4276] eta: 1:41:59 lr: 2.7378740818313853e-05 loss: 0.1109 (0.1208) time: 3.0284 data: 0.0080 max mem: 33300 Epoch: [19] [2210/4276] eta: 1:41:30 lr: 2.7375927757363502e-05 loss: 0.1150 (0.1208) time: 3.0360 data: 0.0075 max mem: 33300 Epoch: [19] [2220/4276] eta: 1:41:02 lr: 2.7373114664294895e-05 loss: 0.1248 
(0.1209) time: 3.0443 data: 0.0078 max mem: 33300 Epoch: [19] [2230/4276] eta: 1:40:33 lr: 2.737030153910401e-05 loss: 0.1078 (0.1208) time: 3.0323 data: 0.0074 max mem: 33300 Epoch: [19] [2240/4276] eta: 1:40:04 lr: 2.7367488381786805e-05 loss: 0.1063 (0.1208) time: 3.0232 data: 0.0070 max mem: 33300 Epoch: [19] [2250/4276] eta: 1:39:35 lr: 2.7364675192339235e-05 loss: 0.1029 (0.1207) time: 3.0266 data: 0.0073 max mem: 33300 Epoch: [19] [2260/4276] eta: 1:39:07 lr: 2.7361861970757274e-05 loss: 0.1082 (0.1207) time: 3.0300 data: 0.0074 max mem: 33300 Epoch: [19] [2270/4276] eta: 1:38:38 lr: 2.7359048717036878e-05 loss: 0.1159 (0.1207) time: 3.0281 data: 0.0073 max mem: 33300 Epoch: [19] [2280/4276] eta: 1:38:09 lr: 2.7356235431174014e-05 loss: 0.1045 (0.1207) time: 3.0266 data: 0.0073 max mem: 33300 Epoch: [19] [2290/4276] eta: 1:37:40 lr: 2.7353422113164643e-05 loss: 0.1104 (0.1206) time: 3.0295 data: 0.0073 max mem: 33300 Epoch: [19] [2300/4276] eta: 1:37:11 lr: 2.7350608763004715e-05 loss: 0.1117 (0.1206) time: 3.0304 data: 0.0074 max mem: 33300 Epoch: [19] [2310/4276] eta: 1:36:43 lr: 2.7347795380690195e-05 loss: 0.1222 (0.1208) time: 3.0452 data: 0.0074 max mem: 33300 Epoch: [19] [2320/4276] eta: 1:36:14 lr: 2.7344981966217037e-05 loss: 0.1266 (0.1208) time: 3.0625 data: 0.0074 max mem: 33300 Epoch: [19] [2330/4276] eta: 1:35:45 lr: 2.734216851958119e-05 loss: 0.1301 (0.1208) time: 3.0581 data: 0.0076 max mem: 33300 Epoch: [19] [2340/4276] eta: 1:35:16 lr: 2.7339355040778624e-05 loss: 0.1301 (0.1208) time: 3.0411 data: 0.0074 max mem: 33300 Epoch: [19] [2350/4276] eta: 1:34:48 lr: 2.7336541529805282e-05 loss: 0.1250 (0.1209) time: 3.0413 data: 0.0068 max mem: 33300 Epoch: [19] [2360/4276] eta: 1:34:19 lr: 2.733372798665712e-05 loss: 0.1164 (0.1208) time: 3.0415 data: 0.0066 max mem: 33300 Epoch: [19] [2370/4276] eta: 1:33:50 lr: 2.73309144113301e-05 loss: 0.1166 (0.1209) time: 3.0317 data: 0.0065 max mem: 33300 Epoch: [19] [2380/4276] eta: 1:33:21 lr: 
2.7328100803820156e-05 loss: 0.1238 (0.1209) time: 3.0329 data: 0.0063 max mem: 33300 Epoch: [19] [2390/4276] eta: 1:32:52 lr: 2.7325287164123254e-05 loss: 0.1206 (0.1209) time: 3.0367 data: 0.0065 max mem: 33300 Epoch: [19] [2400/4276] eta: 1:32:23 lr: 2.732247349223533e-05 loss: 0.1147 (0.1209) time: 3.0442 data: 0.0066 max mem: 33300 Epoch: [19] [2410/4276] eta: 1:31:55 lr: 2.731965978815233e-05 loss: 0.1147 (0.1209) time: 3.0688 data: 0.0069 max mem: 33300 Epoch: [19] [2420/4276] eta: 1:31:26 lr: 2.7316846051870216e-05 loss: 0.1059 (0.1208) time: 3.0897 data: 0.0073 max mem: 33300 Epoch: [19] [2430/4276] eta: 1:30:58 lr: 2.7314032283384926e-05 loss: 0.1241 (0.1209) time: 3.0876 data: 0.0077 max mem: 33300 Epoch: [19] [2440/4276] eta: 1:30:29 lr: 2.7311218482692407e-05 loss: 0.1272 (0.1209) time: 3.0727 data: 0.0080 max mem: 33300 Epoch: [19] [2450/4276] eta: 1:30:00 lr: 2.7308404649788612e-05 loss: 0.1272 (0.1209) time: 3.0735 data: 0.0075 max mem: 33300 Epoch: [19] [2460/4276] eta: 1:29:32 lr: 2.7305590784669466e-05 loss: 0.1405 (0.1210) time: 3.0862 data: 0.0071 max mem: 33300 Epoch: [19] [2470/4276] eta: 1:29:03 lr: 2.7302776887330922e-05 loss: 0.1246 (0.1211) time: 3.0701 data: 0.0071 max mem: 33300 Epoch: [19] [2480/4276] eta: 1:28:33 lr: 2.729996295776892e-05 loss: 0.1226 (0.1211) time: 3.0209 data: 0.0066 max mem: 33300 Epoch: [19] [2490/4276] eta: 1:28:04 lr: 2.72971489959794e-05 loss: 0.1169 (0.1211) time: 3.0049 data: 0.0064 max mem: 33300 Epoch: [19] [2500/4276] eta: 1:27:35 lr: 2.7294335001958306e-05 loss: 0.1153 (0.1211) time: 3.0414 data: 0.0062 max mem: 33300 Epoch: [19] [2510/4276] eta: 1:27:06 lr: 2.7291520975701568e-05 loss: 0.1242 (0.1211) time: 3.0547 data: 0.0072 max mem: 33300 Epoch: [19] [2520/4276] eta: 1:26:37 lr: 2.7288706917205126e-05 loss: 0.1093 (0.1211) time: 3.0380 data: 0.0081 max mem: 33300 Epoch: [19] [2530/4276] eta: 1:26:08 lr: 2.7285892826464924e-05 loss: 0.0955 (0.1210) time: 3.0282 data: 0.0082 max mem: 33300 Epoch: [19] 
[2540/4276] eta: 1:25:39 lr: 2.7283078703476888e-05 loss: 0.1031 (0.1210) time: 3.0306 data: 0.0087 max mem: 33300 Epoch: [19] [2550/4276] eta: 1:25:10 lr: 2.7280264548236968e-05 loss: 0.1067 (0.1210) time: 3.0449 data: 0.0083 max mem: 33300 Epoch: [19] [2560/4276] eta: 1:24:41 lr: 2.7277450360741074e-05 loss: 0.0969 (0.1209) time: 3.0737 data: 0.0078 max mem: 33300 Epoch: [19] [2570/4276] eta: 1:24:13 lr: 2.7274636140985154e-05 loss: 0.0990 (0.1209) time: 3.0897 data: 0.0077 max mem: 33300 Epoch: [19] [2580/4276] eta: 1:23:44 lr: 2.7271821888965137e-05 loss: 0.1087 (0.1208) time: 3.0903 data: 0.0076 max mem: 33300 Epoch: [19] [2590/4276] eta: 1:23:15 lr: 2.7269007604676948e-05 loss: 0.1104 (0.1208) time: 3.0905 data: 0.0076 max mem: 33300 Epoch: [19] [2600/4276] eta: 1:22:46 lr: 2.7266193288116533e-05 loss: 0.1079 (0.1208) time: 3.0925 data: 0.0077 max mem: 33300 Epoch: [19] [2610/4276] eta: 1:22:17 lr: 2.7263378939279806e-05 loss: 0.1038 (0.1207) time: 3.0912 data: 0.0077 max mem: 33300 Epoch: [19] [2620/4276] eta: 1:21:49 lr: 2.72605645581627e-05 loss: 0.1151 (0.1207) time: 3.0848 data: 0.0076 max mem: 33300 Epoch: [19] [2630/4276] eta: 1:21:20 lr: 2.7257750144761134e-05 loss: 0.1129 (0.1207) time: 3.0857 data: 0.0074 max mem: 33300 Epoch: [19] [2640/4276] eta: 1:20:51 lr: 2.7254935699071038e-05 loss: 0.1052 (0.1207) time: 3.0837 data: 0.0074 max mem: 33300 Epoch: [19] [2650/4276] eta: 1:20:22 lr: 2.7252121221088346e-05 loss: 0.1156 (0.1206) time: 3.0834 data: 0.0074 max mem: 33300 Epoch: [19] [2660/4276] eta: 1:19:53 lr: 2.7249306710808968e-05 loss: 0.1173 (0.1206) time: 3.0893 data: 0.0071 max mem: 33300 Epoch: [19] [2670/4276] eta: 1:19:24 lr: 2.7246492168228836e-05 loss: 0.1235 (0.1207) time: 3.0866 data: 0.0071 max mem: 33300 Epoch: [19] [2680/4276] eta: 1:18:55 lr: 2.7243677593343874e-05 loss: 0.1368 (0.1207) time: 3.0851 data: 0.0071 max mem: 33300 Epoch: [19] [2690/4276] eta: 1:18:26 lr: 2.7240862986149996e-05 loss: 0.1147 (0.1206) time: 3.0842 data: 
0.0071 max mem: 33300 Epoch: [19] [2700/4276] eta: 1:17:57 lr: 2.7238048346643125e-05 loss: 0.1081 (0.1206) time: 3.0885 data: 0.0073 max mem: 33300 Epoch: [19] [2710/4276] eta: 1:17:28 lr: 2.7235233674819172e-05 loss: 0.1061 (0.1206) time: 3.0904 data: 0.0071 max mem: 33300 Epoch: [19] [2720/4276] eta: 1:16:59 lr: 2.723241897067406e-05 loss: 0.0970 (0.1205) time: 3.0878 data: 0.0070 max mem: 33300 Epoch: [19] [2730/4276] eta: 1:16:30 lr: 2.7229604234203714e-05 loss: 0.1075 (0.1206) time: 3.0866 data: 0.0072 max mem: 33300 Epoch: [19] [2740/4276] eta: 1:16:01 lr: 2.722678946540404e-05 loss: 0.1217 (0.1206) time: 3.0851 data: 0.0069 max mem: 33300 Epoch: [19] [2750/4276] eta: 1:15:32 lr: 2.7223974664270957e-05 loss: 0.1192 (0.1206) time: 3.0585 data: 0.0075 max mem: 33300 Epoch: [19] [2760/4276] eta: 1:15:02 lr: 2.7221159830800385e-05 loss: 0.1170 (0.1205) time: 3.0442 data: 0.0079 max mem: 33300 Epoch: [19] [2770/4276] eta: 1:14:33 lr: 2.7218344964988225e-05 loss: 0.1099 (0.1205) time: 3.0722 data: 0.0076 max mem: 33300 Epoch: [19] [2780/4276] eta: 1:14:04 lr: 2.7215530066830397e-05 loss: 0.1114 (0.1205) time: 3.0893 data: 0.0077 max mem: 33300 Epoch: [19] [2790/4276] eta: 1:13:35 lr: 2.721271513632281e-05 loss: 0.1116 (0.1205) time: 3.0915 data: 0.0072 max mem: 33300 Epoch: [19] [2800/4276] eta: 1:13:06 lr: 2.7209900173461365e-05 loss: 0.1076 (0.1205) time: 3.0792 data: 0.0071 max mem: 33300 Epoch: [19] [2810/4276] eta: 1:12:37 lr: 2.7207085178241988e-05 loss: 0.0972 (0.1204) time: 3.0515 data: 0.0073 max mem: 33300 Epoch: [19] [2820/4276] eta: 1:12:07 lr: 2.7204270150660572e-05 loss: 0.0972 (0.1203) time: 3.0501 data: 0.0077 max mem: 33300 Epoch: [19] [2830/4276] eta: 1:11:38 lr: 2.7201455090713028e-05 loss: 0.1107 (0.1203) time: 3.0764 data: 0.0075 max mem: 33300 Epoch: [19] [2840/4276] eta: 1:11:09 lr: 2.7198639998395276e-05 loss: 0.1261 (0.1204) time: 3.0881 data: 0.0072 max mem: 33300 Epoch: [19] [2850/4276] eta: 1:10:40 lr: 2.7195824873703203e-05 loss: 
0.1322 (0.1204) time: 3.0891 data: 0.0076 max mem: 33300 Epoch: [19] [2860/4276] eta: 1:10:11 lr: 2.7193009716632723e-05 loss: 0.1202 (0.1204) time: 3.0909 data: 0.0073 max mem: 33300 Epoch: [19] [2870/4276] eta: 1:09:42 lr: 2.719019452717973e-05 loss: 0.1104 (0.1204) time: 3.0921 data: 0.0071 max mem: 33300 Epoch: [19] [2880/4276] eta: 1:09:13 lr: 2.7187379305340133e-05 loss: 0.1121 (0.1204) time: 3.0908 data: 0.0069 max mem: 33300 Epoch: [19] [2890/4276] eta: 1:08:44 lr: 2.7184564051109835e-05 loss: 0.1150 (0.1204) time: 3.1178 data: 0.0069 max mem: 33300 Epoch: [19] [2900/4276] eta: 1:08:14 lr: 2.7181748764484734e-05 loss: 0.1118 (0.1203) time: 3.1268 data: 0.0073 max mem: 33300 Epoch: [19] [2910/4276] eta: 1:07:45 lr: 2.7178933445460724e-05 loss: 0.1052 (0.1203) time: 3.0709 data: 0.0071 max mem: 33300 Epoch: [19] [2920/4276] eta: 1:07:15 lr: 2.717611809403371e-05 loss: 0.1220 (0.1203) time: 3.0203 data: 0.0070 max mem: 33300 Epoch: [19] [2930/4276] eta: 1:06:46 lr: 2.7173302710199587e-05 loss: 0.1088 (0.1202) time: 3.0416 data: 0.0076 max mem: 33300 Epoch: [19] [2940/4276] eta: 1:06:17 lr: 2.717048729395425e-05 loss: 0.1086 (0.1203) time: 3.0872 data: 0.0079 max mem: 33300 Epoch: [19] [2950/4276] eta: 1:05:47 lr: 2.7167671845293597e-05 loss: 0.1126 (0.1203) time: 3.0709 data: 0.0078 max mem: 33300 Epoch: [19] [2960/4276] eta: 1:05:18 lr: 2.7164856364213516e-05 loss: 0.1169 (0.1203) time: 3.0705 data: 0.0078 max mem: 33300 Epoch: [19] [2970/4276] eta: 1:04:49 lr: 2.71620408507099e-05 loss: 0.1210 (0.1203) time: 3.0932 data: 0.0080 max mem: 33300 Epoch: [19] [2980/4276] eta: 1:04:19 lr: 2.715922530477865e-05 loss: 0.1147 (0.1203) time: 3.0134 data: 0.0075 max mem: 33300 Epoch: [19] [2990/4276] eta: 1:03:49 lr: 2.715640972641566e-05 loss: 0.1110 (0.1203) time: 2.9454 data: 0.0072 max mem: 33300 Epoch: [19] [3000/4276] eta: 1:03:19 lr: 2.7153594115616803e-05 loss: 0.1102 (0.1202) time: 2.9859 data: 0.0083 max mem: 33300 Epoch: [19] [3010/4276] eta: 1:02:50 lr: 
2.7150778472377987e-05 loss: 0.1121 (0.1202) time: 3.0218 data: 0.0087 max mem: 33300 Epoch: [19] [3020/4276] eta: 1:02:20 lr: 2.714796279669508e-05 loss: 0.1261 (0.1202) time: 3.0434 data: 0.0085 max mem: 33300 Epoch: [19] [3030/4276] eta: 1:01:51 lr: 2.7145147088563987e-05 loss: 0.1150 (0.1202) time: 3.0171 data: 0.0081 max mem: 33300 Epoch: [19] [3040/4276] eta: 1:01:21 lr: 2.7142331347980583e-05 loss: 0.1213 (0.1203) time: 2.9634 data: 0.0084 max mem: 33300 Epoch: [19] [3050/4276] eta: 1:00:51 lr: 2.713951557494076e-05 loss: 0.1213 (0.1202) time: 2.9633 data: 0.0085 max mem: 33300 Epoch: [19] [3060/4276] eta: 1:00:21 lr: 2.71366997694404e-05 loss: 0.0999 (0.1202) time: 2.9705 data: 0.0074 max mem: 33300 Epoch: [19] [3070/4276] eta: 0:59:51 lr: 2.713388393147539e-05 loss: 0.1036 (0.1202) time: 2.9909 data: 0.0073 max mem: 33300 Epoch: [19] [3080/4276] eta: 0:59:22 lr: 2.713106806104161e-05 loss: 0.1060 (0.1201) time: 2.9938 data: 0.0076 max mem: 33300 Epoch: [19] [3090/4276] eta: 0:58:52 lr: 2.7128252158134937e-05 loss: 0.1046 (0.1201) time: 2.9711 data: 0.0075 max mem: 33300 Epoch: [19] [3100/4276] eta: 0:58:22 lr: 2.712543622275125e-05 loss: 0.1019 (0.1200) time: 2.9629 data: 0.0072 max mem: 33300 Epoch: [19] [3110/4276] eta: 0:57:52 lr: 2.7122620254886432e-05 loss: 0.0982 (0.1200) time: 2.9650 data: 0.0072 max mem: 33300 Epoch: [19] [3120/4276] eta: 0:57:22 lr: 2.7119804254536367e-05 loss: 0.0982 (0.1199) time: 2.9736 data: 0.0072 max mem: 33300 Epoch: [19] [3130/4276] eta: 0:56:53 lr: 2.711698822169692e-05 loss: 0.1022 (0.1199) time: 2.9720 data: 0.0070 max mem: 33300 Epoch: [19] [3140/4276] eta: 0:56:23 lr: 2.711417215636398e-05 loss: 0.1112 (0.1199) time: 2.9688 data: 0.0069 max mem: 33300 Epoch: [19] [3150/4276] eta: 0:55:53 lr: 2.7111356058533415e-05 loss: 0.1131 (0.1199) time: 2.9687 data: 0.0069 max mem: 33300 Epoch: [19] [3160/4276] eta: 0:55:23 lr: 2.7108539928201106e-05 loss: 0.1229 (0.1200) time: 2.9695 data: 0.0069 max mem: 33300 Epoch: [19] 
[3170/4276] eta: 0:54:53 lr: 2.7105723765362918e-05 loss: 0.1255 (0.1200) time: 2.9610 data: 0.0071 max mem: 33300 Epoch: [19] [3180/4276] eta: 0:54:23 lr: 2.710290757001472e-05 loss: 0.1188 (0.1200) time: 2.9461 data: 0.0071 max mem: 33300 Epoch: [19] [3190/4276] eta: 0:53:53 lr: 2.710009134215239e-05 loss: 0.1188 (0.1200) time: 2.9414 data: 0.0071 max mem: 33300 Epoch: [19] [3200/4276] eta: 0:53:24 lr: 2.7097275081771796e-05 loss: 0.1189 (0.1200) time: 2.9645 data: 0.0071 max mem: 33300 Epoch: [19] [3210/4276] eta: 0:52:54 lr: 2.709445878886881e-05 loss: 0.1164 (0.1200) time: 2.9815 data: 0.0068 max mem: 33300 Epoch: [19] [3220/4276] eta: 0:52:24 lr: 2.7091642463439303e-05 loss: 0.1164 (0.1200) time: 2.9747 data: 0.0079 max mem: 33300 Epoch: [19] [3230/4276] eta: 0:51:54 lr: 2.7088826105479138e-05 loss: 0.1178 (0.1200) time: 2.9705 data: 0.0079 max mem: 33300 Epoch: [19] [3240/4276] eta: 0:51:24 lr: 2.7086009714984183e-05 loss: 0.1200 (0.1200) time: 2.9645 data: 0.0072 max mem: 33300 Epoch: [19] [3250/4276] eta: 0:50:54 lr: 2.7083193291950298e-05 loss: 0.1241 (0.1200) time: 2.9383 data: 0.0079 max mem: 33300 Epoch: [19] [3260/4276] eta: 0:50:24 lr: 2.708037683637335e-05 loss: 0.1183 (0.1201) time: 2.9129 data: 0.0078 max mem: 33300 Epoch: [19] [3270/4276] eta: 0:49:54 lr: 2.7077560348249204e-05 loss: 0.1150 (0.1201) time: 2.9099 data: 0.0074 max mem: 33300 Epoch: [19] [3280/4276] eta: 0:49:25 lr: 2.7074743827573718e-05 loss: 0.1226 (0.1201) time: 2.9424 data: 0.0076 max mem: 33300 Epoch: [19] [3290/4276] eta: 0:48:55 lr: 2.7071927274342758e-05 loss: 0.1226 (0.1201) time: 2.9685 data: 0.0077 max mem: 33300 Epoch: [19] [3300/4276] eta: 0:48:25 lr: 2.706911068855219e-05 loss: 0.1188 (0.1202) time: 2.9610 data: 0.0076 max mem: 33300 Epoch: [19] [3310/4276] eta: 0:47:55 lr: 2.7066294070197858e-05 loss: 0.1326 (0.1202) time: 2.9601 data: 0.0072 max mem: 33300 Epoch: [19] [3320/4276] eta: 0:47:25 lr: 2.7063477419275634e-05 loss: 0.1326 (0.1202) time: 2.9622 data: 0.0074 
max mem: 33300 Epoch: [19] [3330/4276] eta: 0:46:56 lr: 2.7060660735781367e-05 loss: 0.1208 (0.1202) time: 2.9708 data: 0.0076 max mem: 33300 Epoch: [19] [3340/4276] eta: 0:46:26 lr: 2.7057844019710915e-05 loss: 0.1122 (0.1202) time: 2.9746 data: 0.0073 max mem: 33300 Epoch: [19] [3350/4276] eta: 0:45:56 lr: 2.705502727106013e-05 loss: 0.1119 (0.1202) time: 2.9655 data: 0.0074 max mem: 33300 Epoch: [19] [3360/4276] eta: 0:45:26 lr: 2.7052210489824868e-05 loss: 0.1071 (0.1201) time: 2.9615 data: 0.0074 max mem: 33300 Epoch: [19] [3370/4276] eta: 0:44:56 lr: 2.704939367600099e-05 loss: 0.1096 (0.1202) time: 2.9621 data: 0.0073 max mem: 33300 Epoch: [19] [3380/4276] eta: 0:44:27 lr: 2.7046576829584347e-05 loss: 0.1170 (0.1202) time: 2.9571 data: 0.0071 max mem: 33300 Epoch: [19] [3390/4276] eta: 0:43:57 lr: 2.704375995057078e-05 loss: 0.1147 (0.1202) time: 2.9609 data: 0.0070 max mem: 33300 Epoch: [19] [3400/4276] eta: 0:43:27 lr: 2.7040943038956145e-05 loss: 0.1301 (0.1202) time: 2.9769 data: 0.0071 max mem: 33300 Epoch: [19] [3410/4276] eta: 0:42:57 lr: 2.7038126094736294e-05 loss: 0.1301 (0.1203) time: 2.9789 data: 0.0072 max mem: 33300 Epoch: [19] [3420/4276] eta: 0:42:28 lr: 2.7035309117907064e-05 loss: 0.1162 (0.1203) time: 2.9678 data: 0.0071 max mem: 33300 Epoch: [19] [3430/4276] eta: 0:41:58 lr: 2.7032492108464314e-05 loss: 0.1295 (0.1203) time: 2.9705 data: 0.0073 max mem: 33300 Epoch: [19] [3440/4276] eta: 0:41:28 lr: 2.7029675066403888e-05 loss: 0.1222 (0.1203) time: 2.9677 data: 0.0079 max mem: 33300 Epoch: [19] [3450/4276] eta: 0:40:58 lr: 2.702685799172162e-05 loss: 0.1022 (0.1203) time: 2.9634 data: 0.0082 max mem: 33300 Epoch: [19] [3460/4276] eta: 0:40:28 lr: 2.7024040884413375e-05 loss: 0.1321 (0.1203) time: 2.9707 data: 0.0080 max mem: 33300 Epoch: [19] [3470/4276] eta: 0:39:59 lr: 2.7021223744474978e-05 loss: 0.1202 (0.1203) time: 2.9691 data: 0.0077 max mem: 33300 Epoch: [19] [3480/4276] eta: 0:39:29 lr: 2.7018406571902282e-05 loss: 0.1249 
(0.1203) time: 2.9586 data: 0.0081 max mem: 33300 Epoch: [19] [3490/4276] eta: 0:38:59 lr: 2.7015589366691123e-05 loss: 0.1287 (0.1203) time: 2.9630 data: 0.0081 max mem: 33300 Epoch: [19] [3500/4276] eta: 0:38:29 lr: 2.7012772128837337e-05 loss: 0.1179 (0.1203) time: 2.9678 data: 0.0076 max mem: 33300 Epoch: [19] [3510/4276] eta: 0:37:59 lr: 2.7009954858336766e-05 loss: 0.1070 (0.1203) time: 2.9596 data: 0.0075 max mem: 33300 Epoch: [19] [3520/4276] eta: 0:37:30 lr: 2.7007137555185254e-05 loss: 0.1049 (0.1203) time: 2.9606 data: 0.0078 max mem: 33300 Epoch: [19] [3530/4276] eta: 0:37:00 lr: 2.7004320219378636e-05 loss: 0.1295 (0.1203) time: 2.9415 data: 0.0078 max mem: 33300 Epoch: [19] [3540/4276] eta: 0:36:30 lr: 2.7001502850912742e-05 loss: 0.1295 (0.1203) time: 2.9192 data: 0.0076 max mem: 33300 Epoch: [19] [3550/4276] eta: 0:36:00 lr: 2.699868544978341e-05 loss: 0.1144 (0.1203) time: 2.9179 data: 0.0077 max mem: 33300 Epoch: [19] [3560/4276] eta: 0:35:30 lr: 2.6995868015986476e-05 loss: 0.1135 (0.1203) time: 2.9152 data: 0.0077 max mem: 33300 Epoch: [19] [3570/4276] eta: 0:35:00 lr: 2.699305054951777e-05 loss: 0.1292 (0.1204) time: 2.9159 data: 0.0076 max mem: 33300 Epoch: [19] [3580/4276] eta: 0:34:30 lr: 2.6990233050373122e-05 loss: 0.1083 (0.1203) time: 2.9245 data: 0.0076 max mem: 33300 Epoch: [19] [3590/4276] eta: 0:34:01 lr: 2.698741551854837e-05 loss: 0.1067 (0.1203) time: 2.9622 data: 0.0082 max mem: 33300 Epoch: [19] [3600/4276] eta: 0:33:31 lr: 2.6984597954039335e-05 loss: 0.1212 (0.1204) time: 2.9870 data: 0.0082 max mem: 33300 Epoch: [19] [3610/4276] eta: 0:33:01 lr: 2.6981780356841864e-05 loss: 0.1212 (0.1204) time: 2.9736 data: 0.0076 max mem: 33300 Epoch: [19] [3620/4276] eta: 0:32:31 lr: 2.6978962726951763e-05 loss: 0.1206 (0.1204) time: 2.9719 data: 0.0078 max mem: 33300 Epoch: [19] [3630/4276] eta: 0:32:02 lr: 2.6976145064364872e-05 loss: 0.1168 (0.1204) time: 2.9716 data: 0.0077 max mem: 33300 Epoch: [19] [3640/4276] eta: 0:31:32 lr: 
2.6973327369077007e-05 loss: 0.1083 (0.1203) time: 2.9669 data: 0.0076 max mem: 33300 Epoch: [19] [3650/4276] eta: 0:31:02 lr: 2.6970509641083998e-05 loss: 0.0961 (0.1203) time: 2.9658 data: 0.0076 max mem: 33300 Epoch: [19] [3660/4276] eta: 0:30:32 lr: 2.696769188038167e-05 loss: 0.1059 (0.1203) time: 2.9633 data: 0.0075 max mem: 33300 Epoch: [19] [3670/4276] eta: 0:30:03 lr: 2.6964874086965847e-05 loss: 0.1137 (0.1203) time: 2.9657 data: 0.0075 max mem: 33300 Epoch: [19] [3680/4276] eta: 0:29:33 lr: 2.6962056260832348e-05 loss: 0.1170 (0.1203) time: 2.9630 data: 0.0079 max mem: 33300 Epoch: [19] [3690/4276] eta: 0:29:03 lr: 2.6959238401976993e-05 loss: 0.1191 (0.1204) time: 2.9578 data: 0.0079 max mem: 33300 Epoch: [19] [3700/4276] eta: 0:28:33 lr: 2.695642051039561e-05 loss: 0.1140 (0.1204) time: 2.9570 data: 0.0075 max mem: 33300 Epoch: [19] [3710/4276] eta: 0:28:03 lr: 2.695360258608401e-05 loss: 0.1074 (0.1203) time: 2.9547 data: 0.0075 max mem: 33300 Epoch: [19] [3720/4276] eta: 0:27:34 lr: 2.6950784629038e-05 loss: 0.1057 (0.1203) time: 2.9994 data: 0.0075 max mem: 33300 Epoch: [19] [3730/4276] eta: 0:27:04 lr: 2.6947966639253418e-05 loss: 0.1082 (0.1203) time: 3.0264 data: 0.0075 max mem: 33300 Epoch: [19] [3740/4276] eta: 0:26:34 lr: 2.6945148616726066e-05 loss: 0.1067 (0.1203) time: 2.9867 data: 0.0077 max mem: 33300 Epoch: [19] [3750/4276] eta: 0:26:05 lr: 2.694233056145176e-05 loss: 0.1182 (0.1203) time: 2.9668 data: 0.0077 max mem: 33300 Epoch: [19] [3760/4276] eta: 0:25:35 lr: 2.693951247342632e-05 loss: 0.1110 (0.1203) time: 2.9701 data: 0.0077 max mem: 33300 Epoch: [19] [3770/4276] eta: 0:25:05 lr: 2.6936694352645565e-05 loss: 0.1106 (0.1203) time: 2.9728 data: 0.0078 max mem: 33300 Epoch: [19] [3780/4276] eta: 0:24:35 lr: 2.6933876199105283e-05 loss: 0.1122 (0.1202) time: 2.9703 data: 0.0080 max mem: 33300 Epoch: [19] [3790/4276] eta: 0:24:05 lr: 2.6931058012801303e-05 loss: 0.1037 (0.1202) time: 2.9698 data: 0.0081 max mem: 33300 Epoch: [19] 
[3800/4276] eta: 0:23:36 lr: 2.6928239793729426e-05 loss: 0.1116 (0.1202) time: 2.9722 data: 0.0080 max mem: 33300 Epoch: [19] [3810/4276] eta: 0:23:06 lr: 2.6925421541885464e-05 loss: 0.1116 (0.1202) time: 2.9711 data: 0.0080 max mem: 33300 Epoch: [19] [3820/4276] eta: 0:22:36 lr: 2.6922603257265228e-05 loss: 0.1002 (0.1201) time: 2.9652 data: 0.0084 max mem: 33300 Epoch: [19] [3830/4276] eta: 0:22:07 lr: 2.6919784939864518e-05 loss: 0.1107 (0.1202) time: 2.9999 data: 0.0083 max mem: 33300 Epoch: [19] [3840/4276] eta: 0:21:37 lr: 2.6916966589679144e-05 loss: 0.1107 (0.1201) time: 3.0045 data: 0.0078 max mem: 33300 Epoch: [19] [3850/4276] eta: 0:21:07 lr: 2.6914148206704904e-05 loss: 0.0980 (0.1201) time: 2.9673 data: 0.0075 max mem: 33300 Epoch: [19] [3860/4276] eta: 0:20:37 lr: 2.6911329790937618e-05 loss: 0.1084 (0.1201) time: 2.9633 data: 0.0075 max mem: 33300 Epoch: [19] [3870/4276] eta: 0:20:07 lr: 2.690851134237306e-05 loss: 0.1157 (0.1201) time: 2.9630 data: 0.0074 max mem: 33300 Epoch: [19] [3880/4276] eta: 0:19:38 lr: 2.690569286100706e-05 loss: 0.1152 (0.1201) time: 2.9639 data: 0.0075 max mem: 33300 Epoch: [19] [3890/4276] eta: 0:19:08 lr: 2.6902874346835392e-05 loss: 0.1113 (0.1201) time: 2.9624 data: 0.0072 max mem: 33300 Epoch: [19] [3900/4276] eta: 0:18:38 lr: 2.690005579985388e-05 loss: 0.1113 (0.1201) time: 2.9688 data: 0.0070 max mem: 33300 Epoch: [19] [3910/4276] eta: 0:18:08 lr: 2.6897237220058308e-05 loss: 0.1060 (0.1200) time: 2.9667 data: 0.0070 max mem: 33300 Epoch: [19] [3920/4276] eta: 0:17:39 lr: 2.689441860744448e-05 loss: 0.0930 (0.1200) time: 2.9575 data: 0.0072 max mem: 33300 Epoch: [19] [3930/4276] eta: 0:17:09 lr: 2.689159996200819e-05 loss: 0.0953 (0.1199) time: 2.9551 data: 0.0077 max mem: 33300 Epoch: [19] [3940/4276] eta: 0:16:39 lr: 2.688878128374523e-05 loss: 0.1076 (0.1200) time: 2.9534 data: 0.0075 max mem: 33300 Epoch: [19] [3950/4276] eta: 0:16:09 lr: 2.6885962572651395e-05 loss: 0.1115 (0.1199) time: 2.9528 data: 0.0075 
max mem: 33300 Epoch: [19] [3960/4276] eta: 0:15:40 lr: 2.6883143828722475e-05 loss: 0.1059 (0.1199) time: 2.9478 data: 0.0078 max mem: 33300 Epoch: [19] [3970/4276] eta: 0:15:10 lr: 2.688032505195427e-05 loss: 0.1157 (0.1200) time: 2.9465 data: 0.0081 max mem: 33300 Epoch: [19] [3980/4276] eta: 0:14:40 lr: 2.6877506242342565e-05 loss: 0.1249 (0.1200) time: 2.9471 data: 0.0079 max mem: 33300 Epoch: [19] [3990/4276] eta: 0:14:10 lr: 2.6874687399883153e-05 loss: 0.1166 (0.1200) time: 2.9497 data: 0.0075 max mem: 33300 Epoch: [19] [4000/4276] eta: 0:13:41 lr: 2.687186852457183e-05 loss: 0.1144 (0.1200) time: 2.9630 data: 0.0076 max mem: 33300 Epoch: [19] [4010/4276] eta: 0:13:11 lr: 2.686904961640437e-05 loss: 0.1127 (0.1200) time: 2.9806 data: 0.0076 max mem: 33300 Epoch: [19] [4020/4276] eta: 0:12:41 lr: 2.6866230675376563e-05 loss: 0.1108 (0.1199) time: 2.9809 data: 0.0077 max mem: 33300 Epoch: [19] [4030/4276] eta: 0:12:11 lr: 2.6863411701484204e-05 loss: 0.1125 (0.1199) time: 2.9703 data: 0.0078 max mem: 33300 Epoch: [19] [4040/4276] eta: 0:11:42 lr: 2.6860592694723064e-05 loss: 0.1131 (0.1199) time: 2.9701 data: 0.0081 max mem: 33300 Epoch: [19] [4050/4276] eta: 0:11:12 lr: 2.685777365508894e-05 loss: 0.1094 (0.1199) time: 2.9707 data: 0.0079 max mem: 33300 Epoch: [19] [4060/4276] eta: 0:10:42 lr: 2.6854954582577606e-05 loss: 0.1207 (0.1200) time: 2.9709 data: 0.0073 max mem: 33300 Epoch: [19] [4070/4276] eta: 0:10:12 lr: 2.6852135477184846e-05 loss: 0.1269 (0.1200) time: 2.9716 data: 0.0072 max mem: 33300 Epoch: [19] [4080/4276] eta: 0:09:43 lr: 2.684931633890645e-05 loss: 0.1280 (0.1200) time: 2.9746 data: 0.0079 max mem: 33300 Epoch: [19] [4090/4276] eta: 0:09:13 lr: 2.684649716773819e-05 loss: 0.1279 (0.1200) time: 2.9848 data: 0.0082 max mem: 33300 Epoch: [19] [4100/4276] eta: 0:08:43 lr: 2.684367796367584e-05 loss: 0.1239 (0.1201) time: 2.9831 data: 0.0079 max mem: 33300 Epoch: [19] [4110/4276] eta: 0:08:13 lr: 2.6840858726715177e-05 loss: 0.1190 (0.1201) 
time: 2.9732 data: 0.0078 max mem: 33300 Epoch: [19] [4120/4276] eta: 0:07:44 lr: 2.6838039456851987e-05 loss: 0.1105 (0.1200) time: 2.9741 data: 0.0076 max mem: 33300 Epoch: [19] [4130/4276] eta: 0:07:14 lr: 2.683522015408204e-05 loss: 0.1046 (0.1200) time: 2.9730 data: 0.0076 max mem: 33300 Epoch: [19] [4140/4276] eta: 0:06:44 lr: 2.683240081840111e-05 loss: 0.1119 (0.1200) time: 2.9681 data: 0.0074 max mem: 33300 Epoch: [19] [4150/4276] eta: 0:06:14 lr: 2.682958144980498e-05 loss: 0.1147 (0.1200) time: 2.9640 data: 0.0072 max mem: 33300 Epoch: [19] [4160/4276] eta: 0:05:45 lr: 2.6826762048289404e-05 loss: 0.1229 (0.1201) time: 2.9882 data: 0.0075 max mem: 33300 Epoch: [19] [4170/4276] eta: 0:05:15 lr: 2.682394261385017e-05 loss: 0.1278 (0.1201) time: 2.9955 data: 0.0083 max mem: 33300 Epoch: [19] [4180/4276] eta: 0:04:45 lr: 2.682112314648304e-05 loss: 0.1278 (0.1201) time: 2.9768 data: 0.0081 max mem: 33300 Epoch: [19] [4190/4276] eta: 0:04:15 lr: 2.6818303646183785e-05 loss: 0.1136 (0.1201) time: 2.9804 data: 0.0076 max mem: 33300 Epoch: [19] [4200/4276] eta: 0:03:46 lr: 2.6815484112948175e-05 loss: 0.1169 (0.1201) time: 2.9789 data: 0.0078 max mem: 33300 Epoch: [19] [4210/4276] eta: 0:03:16 lr: 2.6812664546771975e-05 loss: 0.1191 (0.1201) time: 2.9727 data: 0.0077 max mem: 33300 Epoch: [19] [4220/4276] eta: 0:02:46 lr: 2.6809844947650953e-05 loss: 0.1233 (0.1202) time: 2.9747 data: 0.0074 max mem: 33300 Epoch: [19] [4230/4276] eta: 0:02:16 lr: 2.6807025315580875e-05 loss: 0.1277 (0.1202) time: 2.9785 data: 0.0074 max mem: 33300 Epoch: [19] [4240/4276] eta: 0:01:47 lr: 2.6804205650557503e-05 loss: 0.1251 (0.1203) time: 2.9780 data: 0.0076 max mem: 33300 Epoch: [19] [4250/4276] eta: 0:01:17 lr: 2.6801385952576603e-05 loss: 0.1208 (0.1203) time: 2.9750 data: 0.0077 max mem: 33300 Epoch: [19] [4260/4276] eta: 0:00:47 lr: 2.679856622163393e-05 loss: 0.1208 (0.1203) time: 2.9719 data: 0.0074 max mem: 33300 Epoch: [19] [4270/4276] eta: 0:00:17 lr: 
2.6795746457725245e-05 loss: 0.1283 (0.1203) time: 2.9803 data: 0.0072 max mem: 33300 Epoch: [19] Total time: 3:32:00 Test: [ 0/21770] eta: 9:52:02 time: 1.6317 data: 1.5898 max mem: 33300 Test: [ 100/21770] eta: 0:19:37 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 200/21770] eta: 0:16:43 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 300/21770] eta: 0:15:43 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 400/21770] eta: 0:15:11 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 500/21770] eta: 0:14:51 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 600/21770] eta: 0:14:36 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 700/21770] eta: 0:14:24 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 800/21770] eta: 0:14:14 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 900/21770] eta: 0:14:05 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 1000/21770] eta: 0:13:58 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 1100/21770] eta: 0:13:50 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 1200/21770] eta: 0:13:43 time: 0.0386 data: 0.0008 max mem: 33300 Test: [ 1300/21770] eta: 0:13:37 time: 0.0385 data: 0.0008 max mem: 33300 Test: [ 1400/21770] eta: 0:13:31 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 1500/21770] eta: 0:13:25 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 1600/21770] eta: 0:13:20 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 1700/21770] eta: 0:13:15 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 1800/21770] eta: 0:13:10 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 1900/21770] eta: 0:13:05 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 2000/21770] eta: 0:13:00 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 2100/21770] eta: 0:12:56 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 2200/21770] eta: 0:12:51 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 2300/21770] eta: 0:12:46 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 2400/21770] eta: 0:12:42 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 2500/21770] eta: 0:12:37 time: 
0.0385 data: 0.0009 max mem: 33300 Test: [ 2600/21770] eta: 0:12:32 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 2700/21770] eta: 0:12:28 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 2800/21770] eta: 0:12:23 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 2900/21770] eta: 0:12:19 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 3000/21770] eta: 0:12:14 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 3100/21770] eta: 0:12:10 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 3200/21770] eta: 0:12:06 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 3300/21770] eta: 0:12:01 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 3400/21770] eta: 0:11:57 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 3500/21770] eta: 0:11:53 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 3600/21770] eta: 0:11:49 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 3700/21770] eta: 0:11:45 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 3800/21770] eta: 0:11:40 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 3900/21770] eta: 0:11:36 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 4000/21770] eta: 0:11:32 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 4100/21770] eta: 0:11:28 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 4200/21770] eta: 0:11:24 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 4300/21770] eta: 0:11:20 time: 0.0389 data: 0.0010 max mem: 33300 Test: [ 4400/21770] eta: 0:11:16 time: 0.0388 data: 0.0010 max mem: 33300 Test: [ 4500/21770] eta: 0:11:12 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 4600/21770] eta: 0:11:08 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 4700/21770] eta: 0:11:04 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 4800/21770] eta: 0:11:00 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 4900/21770] eta: 0:10:56 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 5000/21770] eta: 0:10:52 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 5100/21770] eta: 0:10:48 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 5200/21770] eta: 0:10:44 time: 
0.0383 data: 0.0009 max mem: 33300 Test: [ 5300/21770] eta: 0:10:40 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 5400/21770] eta: 0:10:36 time: 0.0387 data: 0.0010 max mem: 33300 Test: [ 5500/21770] eta: 0:10:32 time: 0.0392 data: 0.0010 max mem: 33300 Test: [ 5600/21770] eta: 0:10:28 time: 0.0396 data: 0.0010 max mem: 33300 Test: [ 5700/21770] eta: 0:10:24 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 5800/21770] eta: 0:10:21 time: 0.0395 data: 0.0010 max mem: 33300 Test: [ 5900/21770] eta: 0:10:17 time: 0.0386 data: 0.0010 max mem: 33300 Test: [ 6000/21770] eta: 0:10:13 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 6100/21770] eta: 0:10:09 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 6200/21770] eta: 0:10:05 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 6300/21770] eta: 0:10:01 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 6400/21770] eta: 0:09:57 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 6500/21770] eta: 0:09:53 time: 0.0385 data: 0.0010 max mem: 33300 Test: [ 6600/21770] eta: 0:09:49 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 6700/21770] eta: 0:09:45 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 6800/21770] eta: 0:09:41 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 6900/21770] eta: 0:09:37 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 7000/21770] eta: 0:09:33 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 7100/21770] eta: 0:09:29 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 7200/21770] eta: 0:09:26 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 7300/21770] eta: 0:09:22 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 7400/21770] eta: 0:09:18 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 7500/21770] eta: 0:09:14 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 7600/21770] eta: 0:09:10 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 7700/21770] eta: 0:09:06 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 7800/21770] eta: 0:09:02 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 7900/21770] eta: 0:08:58 time: 
0.0390 data: 0.0009 max mem: 33300 Test: [ 8000/21770] eta: 0:08:54 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 8100/21770] eta: 0:08:50 time: 0.0396 data: 0.0010 max mem: 33300 Test: [ 8200/21770] eta: 0:08:47 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 8300/21770] eta: 0:08:43 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 8400/21770] eta: 0:08:39 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 8500/21770] eta: 0:08:35 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 8600/21770] eta: 0:08:31 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 8700/21770] eta: 0:08:27 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 8800/21770] eta: 0:08:23 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 8900/21770] eta: 0:08:20 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 9000/21770] eta: 0:08:16 time: 0.0391 data: 0.0010 max mem: 33300 Test: [ 9100/21770] eta: 0:08:12 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 9200/21770] eta: 0:08:08 time: 0.0390 data: 0.0010 max mem: 33300 Test: [ 9300/21770] eta: 0:08:04 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 9400/21770] eta: 0:08:00 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 9500/21770] eta: 0:07:56 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 9600/21770] eta: 0:07:52 time: 0.0385 data: 0.0009 max mem: 33300 Test: [ 9700/21770] eta: 0:07:48 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 9800/21770] eta: 0:07:44 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 9900/21770] eta: 0:07:40 time: 0.0391 data: 0.0010 max mem: 33300 Test: [10000/21770] eta: 0:07:37 time: 0.0386 data: 0.0009 max mem: 33300 Test: [10100/21770] eta: 0:07:33 time: 0.0385 data: 0.0009 max mem: 33300 Test: [10200/21770] eta: 0:07:29 time: 0.0387 data: 0.0009 max mem: 33300 Test: [10300/21770] eta: 0:07:25 time: 0.0384 data: 0.0009 max mem: 33300 Test: [10400/21770] eta: 0:07:21 time: 0.0385 data: 0.0009 max mem: 33300 Test: [10500/21770] eta: 0:07:17 time: 0.0384 data: 0.0009 max mem: 33300 Test: [10600/21770] eta: 0:07:13 time: 
0.0384 data: 0.0009 max mem: 33300 Test: [10700/21770] eta: 0:07:09 time: 0.0382 data: 0.0009 max mem: 33300 Test: [10800/21770] eta: 0:07:05 time: 0.0393 data: 0.0010 max mem: 33300 Test: [10900/21770] eta: 0:07:01 time: 0.0382 data: 0.0009 max mem: 33300 Test: [11000/21770] eta: 0:06:57 time: 0.0383 data: 0.0009 max mem: 33300 Test: [11100/21770] eta: 0:06:53 time: 0.0386 data: 0.0009 max mem: 33300 Test: [11200/21770] eta: 0:06:50 time: 0.0385 data: 0.0009 max mem: 33300 Test: [11300/21770] eta: 0:06:46 time: 0.0380 data: 0.0009 max mem: 33300 Test: [11400/21770] eta: 0:06:42 time: 0.0383 data: 0.0009 max mem: 33300 Test: [11500/21770] eta: 0:06:38 time: 0.0382 data: 0.0009 max mem: 33300 Test: [11600/21770] eta: 0:06:34 time: 0.0389 data: 0.0009 max mem: 33300 Test: [11700/21770] eta: 0:06:30 time: 0.0380 data: 0.0009 max mem: 33300 Test: [11800/21770] eta: 0:06:26 time: 0.0382 data: 0.0010 max mem: 33300 Test: [11900/21770] eta: 0:06:22 time: 0.0386 data: 0.0011 max mem: 33300 Test: [12000/21770] eta: 0:06:18 time: 0.0386 data: 0.0009 max mem: 33300 Test: [12100/21770] eta: 0:06:14 time: 0.0386 data: 0.0009 max mem: 33300 Test: [12200/21770] eta: 0:06:10 time: 0.0385 data: 0.0009 max mem: 33300 Test: [12300/21770] eta: 0:06:07 time: 0.0380 data: 0.0009 max mem: 33300 Test: [12400/21770] eta: 0:06:03 time: 0.0384 data: 0.0009 max mem: 33300 Test: [12500/21770] eta: 0:05:59 time: 0.0380 data: 0.0009 max mem: 33300 Test: [12600/21770] eta: 0:05:55 time: 0.0382 data: 0.0009 max mem: 33300 Test: [12700/21770] eta: 0:05:51 time: 0.0397 data: 0.0009 max mem: 33300 Test: [12800/21770] eta: 0:05:47 time: 0.0400 data: 0.0010 max mem: 33300 Test: [12900/21770] eta: 0:05:43 time: 0.0387 data: 0.0009 max mem: 33300 Test: [13000/21770] eta: 0:05:39 time: 0.0390 data: 0.0009 max mem: 33300 Test: [13100/21770] eta: 0:05:36 time: 0.0390 data: 0.0009 max mem: 33300 Test: [13200/21770] eta: 0:05:32 time: 0.0392 data: 0.0009 max mem: 33300 Test: [13300/21770] eta: 0:05:28 time: 
0.0392 data: 0.0009 max mem: 33300 Test: [13400/21770] eta: 0:05:24 time: 0.0398 data: 0.0009 max mem: 33300 Test: [13500/21770] eta: 0:05:20 time: 0.0393 data: 0.0009 max mem: 33300 Test: [13600/21770] eta: 0:05:16 time: 0.0389 data: 0.0009 max mem: 33300 Test: [13700/21770] eta: 0:05:12 time: 0.0398 data: 0.0009 max mem: 33300 Test: [13800/21770] eta: 0:05:09 time: 0.0388 data: 0.0010 max mem: 33300 Test: [13900/21770] eta: 0:05:05 time: 0.0390 data: 0.0010 max mem: 33300 Test: [14000/21770] eta: 0:05:01 time: 0.0390 data: 0.0009 max mem: 33300 Test: [14100/21770] eta: 0:04:57 time: 0.0389 data: 0.0009 max mem: 33300 Test: [14200/21770] eta: 0:04:53 time: 0.0390 data: 0.0009 max mem: 33300 Test: [14300/21770] eta: 0:04:49 time: 0.0389 data: 0.0009 max mem: 33300 Test: [14400/21770] eta: 0:04:45 time: 0.0388 data: 0.0009 max mem: 33300 Test: [14500/21770] eta: 0:04:41 time: 0.0387 data: 0.0009 max mem: 33300 Test: [14600/21770] eta: 0:04:38 time: 0.0389 data: 0.0008 max mem: 33300 Test: [14700/21770] eta: 0:04:34 time: 0.0388 data: 0.0009 max mem: 33300 Test: [14800/21770] eta: 0:04:30 time: 0.0388 data: 0.0009 max mem: 33300 Test: [14900/21770] eta: 0:04:26 time: 0.0382 data: 0.0009 max mem: 33300 Test: [15000/21770] eta: 0:04:22 time: 0.0391 data: 0.0009 max mem: 33300 Test: [15100/21770] eta: 0:04:18 time: 0.0388 data: 0.0010 max mem: 33300 Test: [15200/21770] eta: 0:04:14 time: 0.0385 data: 0.0009 max mem: 33300 Test: [15300/21770] eta: 0:04:10 time: 0.0386 data: 0.0008 max mem: 33300 Test: [15400/21770] eta: 0:04:07 time: 0.0389 data: 0.0008 max mem: 33300 Test: [15500/21770] eta: 0:04:03 time: 0.0382 data: 0.0009 max mem: 33300 Test: [15600/21770] eta: 0:03:59 time: 0.0385 data: 0.0009 max mem: 33300 Test: [15700/21770] eta: 0:03:55 time: 0.0382 data: 0.0009 max mem: 33300 Test: [15800/21770] eta: 0:03:51 time: 0.0386 data: 0.0009 max mem: 33300 Test: [15900/21770] eta: 0:03:47 time: 0.0381 data: 0.0010 max mem: 33300 Test: [16000/21770] eta: 0:03:43 time: 
0.0394 data: 0.0009 max mem: 33300 Test: [16100/21770] eta: 0:03:39 time: 0.0392 data: 0.0008 max mem: 33300 Test: [16200/21770] eta: 0:03:36 time: 0.0388 data: 0.0008 max mem: 33300 Test: [16300/21770] eta: 0:03:32 time: 0.0389 data: 0.0009 max mem: 33300 Test: [16400/21770] eta: 0:03:28 time: 0.0398 data: 0.0009 max mem: 33300 Test: [16500/21770] eta: 0:03:24 time: 0.0394 data: 0.0009 max mem: 33300 Test: [16600/21770] eta: 0:03:20 time: 0.0396 data: 0.0009 max mem: 33300 Test: [16700/21770] eta: 0:03:16 time: 0.0398 data: 0.0009 max mem: 33300 Test: [16800/21770] eta: 0:03:12 time: 0.0392 data: 0.0009 max mem: 33300 Test: [16900/21770] eta: 0:03:09 time: 0.0394 data: 0.0009 max mem: 33300 Test: [17000/21770] eta: 0:03:05 time: 0.0394 data: 0.0009 max mem: 33300 Test: [17100/21770] eta: 0:03:01 time: 0.0382 data: 0.0009 max mem: 33300 Test: [17200/21770] eta: 0:02:57 time: 0.0381 data: 0.0009 max mem: 33300 Test: [17300/21770] eta: 0:02:53 time: 0.0383 data: 0.0009 max mem: 33300 Test: [17400/21770] eta: 0:02:49 time: 0.0379 data: 0.0009 max mem: 33300 Test: [17500/21770] eta: 0:02:45 time: 0.0382 data: 0.0009 max mem: 33300 Test: [17600/21770] eta: 0:02:41 time: 0.0380 data: 0.0009 max mem: 33300 Test: [17700/21770] eta: 0:02:37 time: 0.0381 data: 0.0009 max mem: 33300 Test: [17800/21770] eta: 0:02:33 time: 0.0381 data: 0.0009 max mem: 33300 Test: [17900/21770] eta: 0:02:30 time: 0.0381 data: 0.0008 max mem: 33300 Test: [18000/21770] eta: 0:02:26 time: 0.0382 data: 0.0009 max mem: 33300 Test: [18100/21770] eta: 0:02:22 time: 0.0381 data: 0.0009 max mem: 33300 Test: [18200/21770] eta: 0:02:18 time: 0.0379 data: 0.0008 max mem: 33300 Test: [18300/21770] eta: 0:02:14 time: 0.0383 data: 0.0009 max mem: 33300 Test: [18400/21770] eta: 0:02:10 time: 0.0381 data: 0.0009 max mem: 33300 Test: [18500/21770] eta: 0:02:06 time: 0.0383 data: 0.0009 max mem: 33300 Test: [18600/21770] eta: 0:02:02 time: 0.0383 data: 0.0009 max mem: 33300 Test: [18700/21770] eta: 0:01:58 time: 
0.0385 data: 0.0009 max mem: 33300 Test: [18800/21770] eta: 0:01:55 time: 0.0381 data: 0.0009 max mem: 33300 Test: [18900/21770] eta: 0:01:51 time: 0.0381 data: 0.0010 max mem: 33300 Test: [19000/21770] eta: 0:01:47 time: 0.0383 data: 0.0009 max mem: 33300 Test: [19100/21770] eta: 0:01:43 time: 0.0389 data: 0.0009 max mem: 33300 Test: [19200/21770] eta: 0:01:39 time: 0.0391 data: 0.0009 max mem: 33300 Test: [19300/21770] eta: 0:01:35 time: 0.0390 data: 0.0009 max mem: 33300 Test: [19400/21770] eta: 0:01:31 time: 0.0390 data: 0.0009 max mem: 33300 Test: [19500/21770] eta: 0:01:27 time: 0.0389 data: 0.0009 max mem: 33300 Test: [19600/21770] eta: 0:01:24 time: 0.0390 data: 0.0009 max mem: 33300 Test: [19700/21770] eta: 0:01:20 time: 0.0390 data: 0.0010 max mem: 33300 Test: [19800/21770] eta: 0:01:16 time: 0.0390 data: 0.0009 max mem: 33300 Test: [19900/21770] eta: 0:01:12 time: 0.0382 data: 0.0009 max mem: 33300 Test: [20000/21770] eta: 0:01:08 time: 0.0389 data: 0.0009 max mem: 33300 Test: [20100/21770] eta: 0:01:04 time: 0.0388 data: 0.0009 max mem: 33300 Test: [20200/21770] eta: 0:01:00 time: 0.0389 data: 0.0009 max mem: 33300 Test: [20300/21770] eta: 0:00:56 time: 0.0385 data: 0.0010 max mem: 33300 Test: [20400/21770] eta: 0:00:53 time: 0.0387 data: 0.0009 max mem: 33300 Test: [20500/21770] eta: 0:00:49 time: 0.0387 data: 0.0010 max mem: 33300 Test: [20600/21770] eta: 0:00:45 time: 0.0389 data: 0.0009 max mem: 33300 Test: [20700/21770] eta: 0:00:41 time: 0.0384 data: 0.0009 max mem: 33300 Test: [20800/21770] eta: 0:00:37 time: 0.0386 data: 0.0009 max mem: 33300 Test: [20900/21770] eta: 0:00:33 time: 0.0386 data: 0.0009 max mem: 33300 Test: [21000/21770] eta: 0:00:29 time: 0.0391 data: 0.0010 max mem: 33300 Test: [21100/21770] eta: 0:00:25 time: 0.0383 data: 0.0010 max mem: 33300 Test: [21200/21770] eta: 0:00:22 time: 0.0382 data: 0.0009 max mem: 33300 Test: [21300/21770] eta: 0:00:18 time: 0.0388 data: 0.0010 max mem: 33300 Test: [21400/21770] eta: 0:00:14 time: 
0.0383 data: 0.0010 max mem: 33300
Test: [21500/21770] eta: 0:00:10 time: 0.0383 data: 0.0009 max mem: 33300
Test: [21600/21770] eta: 0:00:06 time: 0.0385 data: 0.0009 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0385 data: 0.0009 max mem: 33300
Test: Total time: 0:14:03
Final results:
Mean IoU is 4.13
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 4.38
mean IoU = 4.13
Mean accuracy for one-to-zero sample is 0.00
Average object IoU 0.04128493680487747
Overall IoU 4.3831377029418945
Better epoch: 19
Epoch: [20] [ 0/4276] eta: 6:41:22 lr: 2.6794054583554596e-05 loss: 0.1133 (0.1133) time: 5.6319 data: 2.1165 max mem: 33300
Epoch: [20] [ 10/4276] eta: 4:14:59 lr: 2.6791234766891482e-05 loss: 0.1460 (0.1299) time: 3.5864 data: 0.1992 max mem: 33300
Epoch: [20] [ 20/4276] eta: 4:03:50 lr: 2.6788414917251336e-05 loss: 0.1300 (0.1273) time: 3.3278 data: 0.0069 max mem: 33300
Epoch: [20] [ 30/4276] eta: 3:57:08 lr: 2.6785595034629918e-05 loss: 0.1083 (0.1260) time: 3.2214 data: 0.0066 max mem: 33300
Epoch: [20] [ 40/4276] eta: 3:53:00 lr: 2.6782775119022978e-05 loss: 0.1207 (0.1254) time: 3.1563 data: 0.0068 max mem: 33300
Epoch: [20] [ 50/4276] eta: 3:50:25 lr: 2.6779955170426273e-05 loss: 0.1200 (0.1236) time: 3.1484 data: 0.0069 max mem: 33300
Epoch: [20] [ 60/4276] eta: 3:48:25 lr: 2.6777135188835557e-05 loss: 0.1155 (0.1236) time: 3.1492 data: 0.0073 max mem: 33300
Epoch: [20] [ 70/4276] eta: 3:46:48 lr: 2.6774315174246583e-05 loss: 0.1167 (0.1228) time: 3.1437 data: 0.0078 max mem: 33300
Epoch: [20] [ 80/4276] eta: 3:45:22 lr: 2.6771495126655104e-05 loss: 0.1196 (0.1227) time: 3.1367 data: 0.0082 max mem: 33300
Epoch: [20] [ 90/4276] eta: 3:44:03 lr: 2.6768675046056873e-05 loss: 0.1144 (0.1210) time: 3.1264 data: 0.0081 max mem: 33300
Epoch: [20] [ 100/4276] eta: 3:42:47 lr: 2.6765854932447633e-05 loss: 0.1167 (0.1224) time: 3.1136 data: 0.0076 max mem: 33300
Epoch: [20] [ 110/4276]
eta: 3:41:53 lr: 2.6763034785823142e-05 loss: 0.1365 (0.1233) time: 3.1241 data: 0.0076 max mem: 33300 Epoch: [20] [ 120/4276] eta: 3:40:55 lr: 2.676021460617914e-05 loss: 0.1209 (0.1231) time: 3.1309 data: 0.0080 max mem: 33300 Epoch: [20] [ 130/4276] eta: 3:39:56 lr: 2.6757394393511374e-05 loss: 0.1215 (0.1240) time: 3.1113 data: 0.0085 max mem: 33300 Epoch: [20] [ 140/4276] eta: 3:38:40 lr: 2.6754574147815596e-05 loss: 0.1209 (0.1235) time: 3.0691 data: 0.0085 max mem: 33300 Epoch: [20] [ 150/4276] eta: 3:37:19 lr: 2.6751753869087552e-05 loss: 0.1061 (0.1224) time: 3.0123 data: 0.0080 max mem: 33300 Epoch: [20] [ 160/4276] eta: 3:35:53 lr: 2.6748933557322974e-05 loss: 0.1122 (0.1227) time: 2.9691 data: 0.0078 max mem: 33300 Epoch: [20] [ 170/4276] eta: 3:34:38 lr: 2.6746113212517616e-05 loss: 0.1122 (0.1217) time: 2.9562 data: 0.0077 max mem: 33300 Epoch: [20] [ 180/4276] eta: 3:33:24 lr: 2.6743292834667215e-05 loss: 0.1063 (0.1218) time: 2.9574 data: 0.0080 max mem: 33300 Epoch: [20] [ 190/4276] eta: 3:32:14 lr: 2.6740472423767517e-05 loss: 0.1145 (0.1221) time: 2.9481 data: 0.0081 max mem: 33300 Epoch: [20] [ 200/4276] eta: 3:31:17 lr: 2.6737651979814248e-05 loss: 0.1013 (0.1215) time: 2.9666 data: 0.0075 max mem: 33300 Epoch: [20] [ 210/4276] eta: 3:30:24 lr: 2.6734831502803154e-05 loss: 0.1189 (0.1219) time: 2.9934 data: 0.0075 max mem: 33300 Epoch: [20] [ 220/4276] eta: 3:29:35 lr: 2.6732010992729985e-05 loss: 0.1213 (0.1219) time: 3.0009 data: 0.0075 max mem: 33300 Epoch: [20] [ 230/4276] eta: 3:28:45 lr: 2.672919044959046e-05 loss: 0.1024 (0.1211) time: 2.9977 data: 0.0071 max mem: 33300 Epoch: [20] [ 240/4276] eta: 3:27:57 lr: 2.6726369873380314e-05 loss: 0.1034 (0.1209) time: 2.9942 data: 0.0070 max mem: 33300 Epoch: [20] [ 250/4276] eta: 3:27:11 lr: 2.6723549264095288e-05 loss: 0.1230 (0.1214) time: 2.9976 data: 0.0069 max mem: 33300 Epoch: [20] [ 260/4276] eta: 3:26:27 lr: 2.672072862173112e-05 loss: 0.1357 (0.1214) time: 2.9985 data: 0.0066 max mem: 
33300 Epoch: [20] [ 270/4276] eta: 3:25:43 lr: 2.671790794628353e-05 loss: 0.1059 (0.1211) time: 2.9979 data: 0.0069 max mem: 33300 Epoch: [20] [ 280/4276] eta: 3:25:01 lr: 2.6715087237748266e-05 loss: 0.1010 (0.1209) time: 3.0000 data: 0.0072 max mem: 33300 Epoch: [20] [ 290/4276] eta: 3:24:18 lr: 2.671226649612104e-05 loss: 0.1143 (0.1208) time: 2.9970 data: 0.0070 max mem: 33300 Epoch: [20] [ 300/4276] eta: 3:23:37 lr: 2.670944572139759e-05 loss: 0.1143 (0.1207) time: 2.9920 data: 0.0069 max mem: 33300 Epoch: [20] [ 310/4276] eta: 3:22:56 lr: 2.6706624913573648e-05 loss: 0.1196 (0.1206) time: 2.9948 data: 0.0070 max mem: 33300 Epoch: [20] [ 320/4276] eta: 3:22:18 lr: 2.6703804072644928e-05 loss: 0.1070 (0.1208) time: 3.0037 data: 0.0070 max mem: 33300 Epoch: [20] [ 330/4276] eta: 3:21:39 lr: 2.6700983198607167e-05 loss: 0.1193 (0.1210) time: 3.0053 data: 0.0072 max mem: 33300 Epoch: [20] [ 340/4276] eta: 3:21:02 lr: 2.6698162291456086e-05 loss: 0.1165 (0.1208) time: 3.0029 data: 0.0074 max mem: 33300 Epoch: [20] [ 350/4276] eta: 3:20:24 lr: 2.669534135118741e-05 loss: 0.1046 (0.1206) time: 3.0048 data: 0.0072 max mem: 33300 Epoch: [20] [ 360/4276] eta: 3:19:48 lr: 2.6692520377796854e-05 loss: 0.1191 (0.1211) time: 3.0070 data: 0.0072 max mem: 33300 Epoch: [20] [ 370/4276] eta: 3:19:11 lr: 2.6689699371280152e-05 loss: 0.1210 (0.1211) time: 3.0041 data: 0.0072 max mem: 33300 Epoch: [20] [ 380/4276] eta: 3:18:35 lr: 2.668687833163302e-05 loss: 0.1134 (0.1210) time: 3.0066 data: 0.0075 max mem: 33300 Epoch: [20] [ 390/4276] eta: 3:17:59 lr: 2.6684057258851175e-05 loss: 0.1213 (0.1212) time: 3.0066 data: 0.0076 max mem: 33300 Epoch: [20] [ 400/4276] eta: 3:17:22 lr: 2.6681236152930335e-05 loss: 0.1376 (0.1218) time: 2.9955 data: 0.0073 max mem: 33300 Epoch: [20] [ 410/4276] eta: 3:16:46 lr: 2.667841501386622e-05 loss: 0.1298 (0.1217) time: 2.9946 data: 0.0075 max mem: 33300 Epoch: [20] [ 420/4276] eta: 3:16:10 lr: 2.667559384165455e-05 loss: 0.1237 (0.1218) time: 
2.9984 data: 0.0071 max mem: 33300 Epoch: [20] [ 430/4276] eta: 3:15:35 lr: 2.6672772636291028e-05 loss: 0.1254 (0.1220) time: 2.9993 data: 0.0072 max mem: 33300 Epoch: [20] [ 440/4276] eta: 3:14:56 lr: 2.6669951397771374e-05 loss: 0.1095 (0.1216) time: 2.9728 data: 0.0078 max mem: 33300 Epoch: [20] [ 450/4276] eta: 3:14:15 lr: 2.666713012609131e-05 loss: 0.1106 (0.1216) time: 2.9374 data: 0.0076 max mem: 33300 Epoch: [20] [ 460/4276] eta: 3:13:35 lr: 2.666430882124654e-05 loss: 0.1103 (0.1212) time: 2.9296 data: 0.0077 max mem: 33300 Epoch: [20] [ 470/4276] eta: 3:12:57 lr: 2.6661487483232777e-05 loss: 0.0898 (0.1207) time: 2.9398 data: 0.0078 max mem: 33300 Epoch: [20] [ 480/4276] eta: 3:12:24 lr: 2.665866611204573e-05 loss: 0.1034 (0.1206) time: 2.9788 data: 0.0078 max mem: 33300 Epoch: [20] [ 490/4276] eta: 3:11:49 lr: 2.6655844707681106e-05 loss: 0.1034 (0.1201) time: 2.9979 data: 0.0080 max mem: 33300 Epoch: [20] [ 500/4276] eta: 3:11:15 lr: 2.665302327013462e-05 loss: 0.1039 (0.1199) time: 2.9871 data: 0.0077 max mem: 33300 Epoch: [20] [ 510/4276] eta: 3:10:40 lr: 2.6650201799401968e-05 loss: 0.1078 (0.1198) time: 2.9861 data: 0.0069 max mem: 33300 Epoch: [20] [ 520/4276] eta: 3:10:06 lr: 2.6647380295478862e-05 loss: 0.1093 (0.1197) time: 2.9855 data: 0.0069 max mem: 33300 Epoch: [20] [ 530/4276] eta: 3:09:30 lr: 2.6644558758361016e-05 loss: 0.1097 (0.1196) time: 2.9696 data: 0.0073 max mem: 33300 Epoch: [20] [ 540/4276] eta: 3:08:51 lr: 2.6641737188044114e-05 loss: 0.1064 (0.1192) time: 2.9296 data: 0.0079 max mem: 33300 Epoch: [20] [ 550/4276] eta: 3:08:12 lr: 2.6638915584523872e-05 loss: 0.1004 (0.1192) time: 2.9067 data: 0.0083 max mem: 33300 Epoch: [20] [ 560/4276] eta: 3:07:38 lr: 2.663609394779598e-05 loss: 0.1141 (0.1191) time: 2.9410 data: 0.0093 max mem: 33300 Epoch: [20] [ 570/4276] eta: 3:07:04 lr: 2.663327227785616e-05 loss: 0.1083 (0.1189) time: 2.9770 data: 0.0092 max mem: 33300 Epoch: [20] [ 580/4276] eta: 3:06:30 lr: 2.663045057470009e-05 
loss: 0.1083 (0.1188) time: 2.9752 data: 0.0086 max mem: 33300 Epoch: [20] [ 590/4276] eta: 3:05:56 lr: 2.6627628838323476e-05 loss: 0.1026 (0.1184) time: 2.9703 data: 0.0093 max mem: 33300 Epoch: [20] [ 600/4276] eta: 3:05:23 lr: 2.6624807068722018e-05 loss: 0.0928 (0.1180) time: 2.9690 data: 0.0091 max mem: 33300 Epoch: [20] [ 610/4276] eta: 3:04:56 lr: 2.6621985265891414e-05 loss: 0.0970 (0.1178) time: 3.0289 data: 0.0086 max mem: 33300 Epoch: [20] [ 620/4276] eta: 3:04:30 lr: 2.6619163429827343e-05 loss: 0.1091 (0.1180) time: 3.0939 data: 0.0084 max mem: 33300 Epoch: [20] [ 630/4276] eta: 3:04:04 lr: 2.6616341560525517e-05 loss: 0.1157 (0.1182) time: 3.0995 data: 0.0082 max mem: 33300 Epoch: [20] [ 640/4276] eta: 3:03:37 lr: 2.661351965798162e-05 loss: 0.1111 (0.1182) time: 3.0970 data: 0.0078 max mem: 33300 Epoch: [20] [ 650/4276] eta: 3:03:10 lr: 2.6610697722191357e-05 loss: 0.1127 (0.1183) time: 3.0919 data: 0.0076 max mem: 33300 Epoch: [20] [ 660/4276] eta: 3:02:43 lr: 2.6607875753150395e-05 loss: 0.1171 (0.1185) time: 3.0917 data: 0.0076 max mem: 33300 Epoch: [20] [ 670/4276] eta: 3:02:16 lr: 2.660505375085445e-05 loss: 0.1244 (0.1186) time: 3.0874 data: 0.0084 max mem: 33300 Epoch: [20] [ 680/4276] eta: 3:01:49 lr: 2.6602231715299187e-05 loss: 0.1236 (0.1185) time: 3.0865 data: 0.0085 max mem: 33300 Epoch: [20] [ 690/4276] eta: 3:01:16 lr: 2.6599409646480312e-05 loss: 0.1077 (0.1185) time: 3.0452 data: 0.0079 max mem: 33300 Epoch: [20] [ 700/4276] eta: 3:00:48 lr: 2.6596587544393496e-05 loss: 0.1117 (0.1184) time: 3.0349 data: 0.0079 max mem: 33300 Epoch: [20] [ 710/4276] eta: 3:00:21 lr: 2.6593765409034437e-05 loss: 0.1144 (0.1184) time: 3.0834 data: 0.0082 max mem: 33300 Epoch: [20] [ 720/4276] eta: 2:59:54 lr: 2.659094324039882e-05 loss: 0.1144 (0.1185) time: 3.0999 data: 0.0085 max mem: 33300 Epoch: [20] [ 730/4276] eta: 2:59:26 lr: 2.658812103848232e-05 loss: 0.1095 (0.1185) time: 3.0988 data: 0.0081 max mem: 33300 Epoch: [20] [ 740/4276] eta: 
2:58:59 lr: 2.658529880328062e-05 loss: 0.1045 (0.1184) time: 3.0938 data: 0.0076 max mem: 33300 Epoch: [20] [ 750/4276] eta: 2:58:31 lr: 2.6582476534789412e-05 loss: 0.1074 (0.1184) time: 3.0945 data: 0.0078 max mem: 33300 Epoch: [20] [ 760/4276] eta: 2:58:03 lr: 2.6579654233004364e-05 loss: 0.1051 (0.1184) time: 3.0933 data: 0.0079 max mem: 33300 Epoch: [20] [ 770/4276] eta: 2:57:35 lr: 2.6576831897921163e-05 loss: 0.1054 (0.1184) time: 3.0880 data: 0.0076 max mem: 33300 Epoch: [20] [ 780/4276] eta: 2:57:03 lr: 2.6574009529535476e-05 loss: 0.1145 (0.1183) time: 3.0422 data: 0.0081 max mem: 33300 Epoch: [20] [ 790/4276] eta: 2:56:35 lr: 2.6571187127842996e-05 loss: 0.1173 (0.1183) time: 3.0502 data: 0.0090 max mem: 33300 Epoch: [20] [ 800/4276] eta: 2:56:07 lr: 2.656836469283938e-05 loss: 0.1161 (0.1184) time: 3.0927 data: 0.0089 max mem: 33300 Epoch: [20] [ 810/4276] eta: 2:55:38 lr: 2.6565542224520323e-05 loss: 0.1161 (0.1187) time: 3.0836 data: 0.0081 max mem: 33300 Epoch: [20] [ 820/4276] eta: 2:55:10 lr: 2.6562719722881486e-05 loss: 0.1161 (0.1185) time: 3.0841 data: 0.0080 max mem: 33300 Epoch: [20] [ 830/4276] eta: 2:54:41 lr: 2.6559897187918543e-05 loss: 0.1028 (0.1187) time: 3.0842 data: 0.0081 max mem: 33300 Epoch: [20] [ 840/4276] eta: 2:54:12 lr: 2.6557074619627175e-05 loss: 0.1176 (0.1187) time: 3.0846 data: 0.0079 max mem: 33300 Epoch: [20] [ 850/4276] eta: 2:53:44 lr: 2.6554252018003034e-05 loss: 0.1132 (0.1187) time: 3.0863 data: 0.0078 max mem: 33300 Epoch: [20] [ 860/4276] eta: 2:53:15 lr: 2.6551429383041804e-05 loss: 0.1128 (0.1188) time: 3.0839 data: 0.0077 max mem: 33300 Epoch: [20] [ 870/4276] eta: 2:52:43 lr: 2.654860671473915e-05 loss: 0.1129 (0.1187) time: 3.0515 data: 0.0077 max mem: 33300 Epoch: [20] [ 880/4276] eta: 2:52:15 lr: 2.6545784013090747e-05 loss: 0.1081 (0.1187) time: 3.0556 data: 0.0074 max mem: 33300 Epoch: [20] [ 890/4276] eta: 2:51:46 lr: 2.6542961278092243e-05 loss: 0.1241 (0.1189) time: 3.0886 data: 0.0074 max mem: 33300 
Epoch: [20] [ 900/4276] eta: 2:51:18 lr: 2.6540138509739316e-05 loss: 0.1254 (0.1189) time: 3.0914 data: 0.0073 max mem: 33300 Epoch: [20] [ 910/4276] eta: 2:50:49 lr: 2.6537315708027626e-05 loss: 0.1225 (0.1190) time: 3.0919 data: 0.0073 max mem: 33300 Epoch: [20] [ 920/4276] eta: 2:50:20 lr: 2.6534492872952843e-05 loss: 0.1178 (0.1191) time: 3.0857 data: 0.0073 max mem: 33300 Epoch: [20] [ 930/4276] eta: 2:49:51 lr: 2.6531670004510616e-05 loss: 0.1134 (0.1191) time: 3.0842 data: 0.0072 max mem: 33300 Epoch: [20] [ 940/4276] eta: 2:49:22 lr: 2.652884710269662e-05 loss: 0.1158 (0.1191) time: 3.0853 data: 0.0073 max mem: 33300 Epoch: [20] [ 950/4276] eta: 2:48:52 lr: 2.6526024167506504e-05 loss: 0.1158 (0.1192) time: 3.0851 data: 0.0072 max mem: 33300 Epoch: [20] [ 960/4276] eta: 2:48:21 lr: 2.6523201198935933e-05 loss: 0.1260 (0.1193) time: 3.0567 data: 0.0073 max mem: 33300 Epoch: [20] [ 970/4276] eta: 2:47:52 lr: 2.6520378196980562e-05 loss: 0.1284 (0.1194) time: 3.0544 data: 0.0075 max mem: 33300 Epoch: [20] [ 980/4276] eta: 2:47:22 lr: 2.6517555161636043e-05 loss: 0.1186 (0.1194) time: 3.0664 data: 0.0073 max mem: 33300 Epoch: [20] [ 990/4276] eta: 2:46:52 lr: 2.6514732092898043e-05 loss: 0.1108 (0.1193) time: 3.0522 data: 0.0077 max mem: 33300 Epoch: [20] [1000/4276] eta: 2:46:20 lr: 2.651190899076221e-05 loss: 0.1163 (0.1193) time: 3.0377 data: 0.0080 max mem: 33300 Epoch: [20] [1010/4276] eta: 2:45:49 lr: 2.6509085855224192e-05 loss: 0.1163 (0.1192) time: 3.0214 data: 0.0074 max mem: 33300 Epoch: [20] [1020/4276] eta: 2:45:18 lr: 2.650626268627965e-05 loss: 0.1111 (0.1192) time: 3.0203 data: 0.0072 max mem: 33300 Epoch: [20] [1030/4276] eta: 2:44:46 lr: 2.6503439483924225e-05 loss: 0.1090 (0.1192) time: 3.0203 data: 0.0071 max mem: 33300 Epoch: [20] [1040/4276] eta: 2:44:15 lr: 2.6500616248153588e-05 loss: 0.1113 (0.1192) time: 3.0224 data: 0.0072 max mem: 33300 Epoch: [20] [1050/4276] eta: 2:43:43 lr: 2.649779297896336e-05 loss: 0.1146 (0.1193) time: 3.0023 
data: 0.0076 max mem: 33300 Epoch: [20] [1060/4276] eta: 2:43:13 lr: 2.6494969676349206e-05 loss: 0.1154 (0.1193) time: 3.0180 data: 0.0078 max mem: 33300 Epoch: [20] [1070/4276] eta: 2:42:44 lr: 2.6492146340306768e-05 loss: 0.1237 (0.1195) time: 3.0737 data: 0.0086 max mem: 33300 Epoch: [20] [1080/4276] eta: 2:42:14 lr: 2.64893229708317e-05 loss: 0.1295 (0.1196) time: 3.0876 data: 0.0088 max mem: 33300 Epoch: [20] [1090/4276] eta: 2:41:45 lr: 2.648649956791963e-05 loss: 0.1207 (0.1197) time: 3.0844 data: 0.0084 max mem: 33300 Epoch: [20] [1100/4276] eta: 2:41:16 lr: 2.6483676131566215e-05 loss: 0.1255 (0.1198) time: 3.0857 data: 0.0085 max mem: 33300 Epoch: [20] [1110/4276] eta: 2:40:46 lr: 2.6480852661767097e-05 loss: 0.1255 (0.1199) time: 3.0870 data: 0.0088 max mem: 33300 Epoch: [20] [1120/4276] eta: 2:40:16 lr: 2.6478029158517915e-05 loss: 0.1169 (0.1198) time: 3.0632 data: 0.0084 max mem: 33300 Epoch: [20] [1130/4276] eta: 2:39:44 lr: 2.6475205621814304e-05 loss: 0.1079 (0.1197) time: 3.0302 data: 0.0075 max mem: 33300 Epoch: [20] [1140/4276] eta: 2:39:12 lr: 2.6472382051651905e-05 loss: 0.1122 (0.1197) time: 3.0066 data: 0.0073 max mem: 33300 Epoch: [20] [1150/4276] eta: 2:38:41 lr: 2.6469558448026365e-05 loss: 0.1172 (0.1197) time: 3.0042 data: 0.0072 max mem: 33300 Epoch: [20] [1160/4276] eta: 2:38:12 lr: 2.6466734810933313e-05 loss: 0.1143 (0.1197) time: 3.0509 data: 0.0082 max mem: 33300 Epoch: [20] [1170/4276] eta: 2:37:42 lr: 2.6463911140368385e-05 loss: 0.1090 (0.1197) time: 3.0860 data: 0.0090 max mem: 33300 Epoch: [20] [1180/4276] eta: 2:37:13 lr: 2.6461087436327214e-05 loss: 0.1090 (0.1196) time: 3.0806 data: 0.0088 max mem: 33300 Epoch: [20] [1190/4276] eta: 2:36:42 lr: 2.6458263698805447e-05 loss: 0.1147 (0.1196) time: 3.0522 data: 0.0086 max mem: 33300 Epoch: [20] [1200/4276] eta: 2:36:12 lr: 2.6455439927798702e-05 loss: 0.1160 (0.1195) time: 3.0465 data: 0.0086 max mem: 33300 Epoch: [20] [1210/4276] eta: 2:35:42 lr: 2.6452616123302615e-05 loss: 
0.1113 (0.1194) time: 3.0742 data: 0.0087 max mem: 33300 Epoch: [20] [1220/4276] eta: 2:35:13 lr: 2.644979228531282e-05 loss: 0.1123 (0.1195) time: 3.0841 data: 0.0082 max mem: 33300 Epoch: [20] [1230/4276] eta: 2:34:42 lr: 2.6446968413824945e-05 loss: 0.1201 (0.1195) time: 3.0690 data: 0.0079 max mem: 33300 Epoch: [20] [1240/4276] eta: 2:34:12 lr: 2.6444144508834613e-05 loss: 0.1143 (0.1195) time: 3.0626 data: 0.0085 max mem: 33300 Epoch: [20] [1250/4276] eta: 2:33:43 lr: 2.6441320570337453e-05 loss: 0.1144 (0.1195) time: 3.0782 data: 0.0084 max mem: 33300 Epoch: [20] [1260/4276] eta: 2:33:13 lr: 2.64384965983291e-05 loss: 0.1086 (0.1194) time: 3.0861 data: 0.0078 max mem: 33300 Epoch: [20] [1270/4276] eta: 2:32:44 lr: 2.643567259280517e-05 loss: 0.0958 (0.1193) time: 3.0856 data: 0.0077 max mem: 33300 Epoch: [20] [1280/4276] eta: 2:32:14 lr: 2.643284855376129e-05 loss: 0.1142 (0.1193) time: 3.0856 data: 0.0075 max mem: 33300 Epoch: [20] [1290/4276] eta: 2:31:44 lr: 2.6430024481193083e-05 loss: 0.1233 (0.1193) time: 3.0858 data: 0.0074 max mem: 33300 Epoch: [20] [1300/4276] eta: 2:31:15 lr: 2.6427200375096167e-05 loss: 0.0988 (0.1192) time: 3.0886 data: 0.0071 max mem: 33300 Epoch: [20] [1310/4276] eta: 2:30:45 lr: 2.6424376235466176e-05 loss: 0.0979 (0.1191) time: 3.0831 data: 0.0072 max mem: 33300 Epoch: [20] [1320/4276] eta: 2:30:15 lr: 2.642155206229871e-05 loss: 0.1041 (0.1192) time: 3.0645 data: 0.0074 max mem: 33300 Epoch: [20] [1330/4276] eta: 2:29:44 lr: 2.64187278555894e-05 loss: 0.1112 (0.1191) time: 3.0543 data: 0.0074 max mem: 33300 Epoch: [20] [1340/4276] eta: 2:29:15 lr: 2.6415903615333858e-05 loss: 0.1041 (0.1190) time: 3.0713 data: 0.0074 max mem: 33300 Epoch: [20] [1350/4276] eta: 2:28:45 lr: 2.6413079341527715e-05 loss: 0.1131 (0.1191) time: 3.0868 data: 0.0073 max mem: 33300 Epoch: [20] [1360/4276] eta: 2:28:15 lr: 2.641025503416656e-05 loss: 0.1164 (0.1191) time: 3.0863 data: 0.0074 max mem: 33300 Epoch: [20] [1370/4276] eta: 2:27:45 lr: 
2.6407430693246028e-05 loss: 0.1109 (0.1190) time: 3.0883 data: 0.0074 max mem: 33300 Epoch: [20] [1380/4276] eta: 2:27:15 lr: 2.640460631876172e-05 loss: 0.1145 (0.1191) time: 3.0720 data: 0.0074 max mem: 33300 Epoch: [20] [1390/4276] eta: 2:26:42 lr: 2.6401781910709266e-05 loss: 0.1293 (0.1192) time: 2.9984 data: 0.0071 max mem: 33300 Epoch: [20] [1400/4276] eta: 2:26:10 lr: 2.6398957469084256e-05 loss: 0.1267 (0.1193) time: 2.9425 data: 0.0071 max mem: 33300 Epoch: [20] [1410/4276] eta: 2:25:36 lr: 2.6396132993882304e-05 loss: 0.1172 (0.1192) time: 2.9194 data: 0.0079 max mem: 33300 Epoch: [20] [1420/4276] eta: 2:25:03 lr: 2.6393308485099034e-05 loss: 0.1077 (0.1192) time: 2.9112 data: 0.0083 max mem: 33300 Epoch: [20] [1430/4276] eta: 2:24:30 lr: 2.6390483942730036e-05 loss: 0.1113 (0.1192) time: 2.9250 data: 0.0085 max mem: 33300 Epoch: [20] [1440/4276] eta: 2:23:57 lr: 2.6387659366770924e-05 loss: 0.1148 (0.1192) time: 2.9316 data: 0.0086 max mem: 33300 Epoch: [20] [1450/4276] eta: 2:23:25 lr: 2.6384834757217302e-05 loss: 0.1148 (0.1192) time: 2.9399 data: 0.0086 max mem: 33300 Epoch: [20] [1460/4276] eta: 2:22:53 lr: 2.638201011406478e-05 loss: 0.1036 (0.1191) time: 2.9456 data: 0.0087 max mem: 33300 Epoch: [20] [1470/4276] eta: 2:22:20 lr: 2.637918543730895e-05 loss: 0.0993 (0.1190) time: 2.9448 data: 0.0083 max mem: 33300 Epoch: [20] [1480/4276] eta: 2:21:48 lr: 2.637636072694542e-05 loss: 0.1042 (0.1189) time: 2.9364 data: 0.0082 max mem: 33300 Epoch: [20] [1490/4276] eta: 2:21:15 lr: 2.6373535982969792e-05 loss: 0.1042 (0.1188) time: 2.9349 data: 0.0086 max mem: 33300 Epoch: [20] [1500/4276] eta: 2:20:43 lr: 2.6370711205377668e-05 loss: 0.1082 (0.1188) time: 2.9359 data: 0.0085 max mem: 33300 Epoch: [20] [1510/4276] eta: 2:20:11 lr: 2.636788639416464e-05 loss: 0.1048 (0.1188) time: 2.9421 data: 0.0084 max mem: 33300 Epoch: [20] [1520/4276] eta: 2:19:39 lr: 2.6365061549326314e-05 loss: 0.1078 (0.1188) time: 2.9441 data: 0.0081 max mem: 33300 Epoch: [20] 
[1530/4276] eta: 2:19:06 lr: 2.6362236670858275e-05 loss: 0.1008 (0.1187) time: 2.9418 data: 0.0084 max mem: 33300 Epoch: [20] [1540/4276] eta: 2:18:34 lr: 2.6359411758756136e-05 loss: 0.1005 (0.1187) time: 2.9465 data: 0.0090 max mem: 33300 Epoch: [20] [1550/4276] eta: 2:18:02 lr: 2.635658681301547e-05 loss: 0.1098 (0.1186) time: 2.9419 data: 0.0090 max mem: 33300 Epoch: [20] [1560/4276] eta: 2:17:30 lr: 2.635376183363189e-05 loss: 0.1086 (0.1185) time: 2.9367 data: 0.0086 max mem: 33300 Epoch: [20] [1570/4276] eta: 2:16:58 lr: 2.6350936820600973e-05 loss: 0.1095 (0.1185) time: 2.9406 data: 0.0083 max mem: 33300 Epoch: [20] [1580/4276] eta: 2:16:26 lr: 2.6348111773918326e-05 loss: 0.0995 (0.1185) time: 2.9433 data: 0.0083 max mem: 33300 Epoch: [20] [1590/4276] eta: 2:15:54 lr: 2.6345286693579524e-05 loss: 0.1016 (0.1185) time: 2.9440 data: 0.0084 max mem: 33300 Epoch: [20] [1600/4276] eta: 2:15:22 lr: 2.6342461579580157e-05 loss: 0.1209 (0.1185) time: 2.9318 data: 0.0087 max mem: 33300 Epoch: [20] [1610/4276] eta: 2:14:50 lr: 2.633963643191582e-05 loss: 0.1099 (0.1184) time: 2.9308 data: 0.0087 max mem: 33300 Epoch: [20] [1620/4276] eta: 2:14:18 lr: 2.6336811250582106e-05 loss: 0.1099 (0.1184) time: 2.9463 data: 0.0086 max mem: 33300 Epoch: [20] [1630/4276] eta: 2:13:46 lr: 2.633398603557458e-05 loss: 0.1106 (0.1184) time: 2.9491 data: 0.0082 max mem: 33300 Epoch: [20] [1640/4276] eta: 2:13:15 lr: 2.633116078688884e-05 loss: 0.1025 (0.1183) time: 2.9508 data: 0.0077 max mem: 33300 Epoch: [20] [1650/4276] eta: 2:12:43 lr: 2.632833550452047e-05 loss: 0.1026 (0.1182) time: 2.9504 data: 0.0077 max mem: 33300 Epoch: [20] [1660/4276] eta: 2:12:11 lr: 2.6325510188465052e-05 loss: 0.1060 (0.1182) time: 2.9454 data: 0.0077 max mem: 33300 Epoch: [20] [1670/4276] eta: 2:11:40 lr: 2.6322684838718164e-05 loss: 0.1059 (0.1181) time: 2.9443 data: 0.0075 max mem: 33300 Epoch: [20] [1680/4276] eta: 2:11:08 lr: 2.631985945527538e-05 loss: 0.1059 (0.1181) time: 2.9391 data: 0.0076 
max mem: 33300 Epoch: [20] [1690/4276] eta: 2:10:36 lr: 2.6317034038132294e-05 loss: 0.1013 (0.1180) time: 2.9364 data: 0.0078 max mem: 33300 Epoch: [20] [1700/4276] eta: 2:10:04 lr: 2.6314208587284478e-05 loss: 0.1038 (0.1180) time: 2.9270 data: 0.0077 max mem: 33300 Epoch: [20] [1710/4276] eta: 2:09:32 lr: 2.63113831027275e-05 loss: 0.1121 (0.1180) time: 2.9243 data: 0.0078 max mem: 33300 Epoch: [20] [1720/4276] eta: 2:09:01 lr: 2.6308557584456938e-05 loss: 0.1113 (0.1180) time: 2.9343 data: 0.0082 max mem: 33300 Epoch: [20] [1730/4276] eta: 2:08:29 lr: 2.6305732032468378e-05 loss: 0.1235 (0.1180) time: 2.9342 data: 0.0080 max mem: 33300 Epoch: [20] [1740/4276] eta: 2:07:57 lr: 2.6302906446757386e-05 loss: 0.1143 (0.1180) time: 2.9358 data: 0.0077 max mem: 33300 Epoch: [20] [1750/4276] eta: 2:07:26 lr: 2.6300080827319534e-05 loss: 0.1048 (0.1179) time: 2.9404 data: 0.0075 max mem: 33300 Epoch: [20] [1760/4276] eta: 2:06:54 lr: 2.6297255174150387e-05 loss: 0.0984 (0.1178) time: 2.9401 data: 0.0075 max mem: 33300 Epoch: [20] [1770/4276] eta: 2:06:23 lr: 2.629442948724553e-05 loss: 0.1025 (0.1177) time: 2.9428 data: 0.0077 max mem: 33300 Epoch: [20] [1780/4276] eta: 2:05:52 lr: 2.629160376660052e-05 loss: 0.1025 (0.1177) time: 2.9473 data: 0.0077 max mem: 33300 Epoch: [20] [1790/4276] eta: 2:05:20 lr: 2.6288778012210923e-05 loss: 0.0966 (0.1176) time: 2.9415 data: 0.0075 max mem: 33300 Epoch: [20] [1800/4276] eta: 2:04:49 lr: 2.6285952224072318e-05 loss: 0.1034 (0.1176) time: 2.9330 data: 0.0075 max mem: 33300 Epoch: [20] [1810/4276] eta: 2:04:17 lr: 2.6283126402180263e-05 loss: 0.1171 (0.1177) time: 2.9326 data: 0.0077 max mem: 33300 Epoch: [20] [1820/4276] eta: 2:03:46 lr: 2.6280300546530323e-05 loss: 0.1273 (0.1177) time: 2.9343 data: 0.0077 max mem: 33300 Epoch: [20] [1830/4276] eta: 2:03:14 lr: 2.6277474657118063e-05 loss: 0.1134 (0.1177) time: 2.9213 data: 0.0077 max mem: 33300 Epoch: [20] [1840/4276] eta: 2:02:42 lr: 2.6274648733939038e-05 loss: 0.1102 
(0.1176) time: 2.9219 data: 0.0081 max mem: 33300 Epoch: [20] [1850/4276] eta: 2:02:11 lr: 2.6271822776988823e-05 loss: 0.1120 (0.1176) time: 2.9340 data: 0.0083 max mem: 33300 Epoch: [20] [1860/4276] eta: 2:01:40 lr: 2.6268996786262963e-05 loss: 0.1110 (0.1176) time: 2.9330 data: 0.0081 max mem: 33300 Epoch: [20] [1870/4276] eta: 2:01:08 lr: 2.6266170761757025e-05 loss: 0.1064 (0.1176) time: 2.9354 data: 0.0078 max mem: 33300 Epoch: [20] [1880/4276] eta: 2:00:37 lr: 2.626334470346657e-05 loss: 0.1083 (0.1176) time: 2.9380 data: 0.0081 max mem: 33300 Epoch: [20] [1890/4276] eta: 2:00:06 lr: 2.626051861138715e-05 loss: 0.1103 (0.1176) time: 2.9377 data: 0.0082 max mem: 33300 Epoch: [20] [1900/4276] eta: 1:59:35 lr: 2.6257692485514317e-05 loss: 0.1048 (0.1176) time: 2.9346 data: 0.0080 max mem: 33300 Epoch: [20] [1910/4276] eta: 1:59:03 lr: 2.625486632584363e-05 loss: 0.1081 (0.1175) time: 2.9322 data: 0.0078 max mem: 33300 Epoch: [20] [1920/4276] eta: 1:58:32 lr: 2.6252040132370648e-05 loss: 0.1100 (0.1175) time: 2.9432 data: 0.0080 max mem: 33300 Epoch: [20] [1930/4276] eta: 1:58:01 lr: 2.6249213905090915e-05 loss: 0.1035 (0.1174) time: 2.9471 data: 0.0083 max mem: 33300 Epoch: [20] [1940/4276] eta: 1:57:30 lr: 2.624638764399998e-05 loss: 0.1035 (0.1174) time: 2.9387 data: 0.0080 max mem: 33300 Epoch: [20] [1950/4276] eta: 1:56:59 lr: 2.6243561349093398e-05 loss: 0.1222 (0.1175) time: 2.9389 data: 0.0082 max mem: 33300 Epoch: [20] [1960/4276] eta: 1:56:28 lr: 2.6240735020366718e-05 loss: 0.1234 (0.1174) time: 2.9329 data: 0.0095 max mem: 33300 Epoch: [20] [1970/4276] eta: 1:55:57 lr: 2.6237908657815497e-05 loss: 0.1025 (0.1174) time: 2.9383 data: 0.0096 max mem: 33300 Epoch: [20] [1980/4276] eta: 1:55:26 lr: 2.6235082261435262e-05 loss: 0.1047 (0.1173) time: 2.9477 data: 0.0087 max mem: 33300 Epoch: [20] [1990/4276] eta: 1:54:55 lr: 2.623225583122157e-05 loss: 0.1047 (0.1173) time: 2.9406 data: 0.0082 max mem: 33300 Epoch: [20] [2000/4276] eta: 1:54:24 lr: 
2.622942936716996e-05 loss: 0.1118 (0.1173) time: 2.9372 data: 0.0082 max mem: 33300 Epoch: [20] [2010/4276] eta: 1:53:52 lr: 2.622660286927599e-05 loss: 0.1144 (0.1172) time: 2.9381 data: 0.0083 max mem: 33300 Epoch: [20] [2020/4276] eta: 1:53:21 lr: 2.6223776337535185e-05 loss: 0.1164 (0.1172) time: 2.9397 data: 0.0080 max mem: 33300 Epoch: [20] [2030/4276] eta: 1:52:51 lr: 2.6220949771943086e-05 loss: 0.1058 (0.1172) time: 2.9439 data: 0.0082 max mem: 33300 Epoch: [20] [2040/4276] eta: 1:52:20 lr: 2.6218123172495246e-05 loss: 0.0929 (0.1170) time: 2.9381 data: 0.0088 max mem: 33300 Epoch: [20] [2050/4276] eta: 1:51:49 lr: 2.6215296539187197e-05 loss: 0.0967 (0.1170) time: 2.9377 data: 0.0090 max mem: 33300 Epoch: [20] [2060/4276] eta: 1:51:18 lr: 2.6212469872014472e-05 loss: 0.1014 (0.1170) time: 2.9424 data: 0.0084 max mem: 33300 Epoch: [20] [2070/4276] eta: 1:50:54 lr: 2.620964317097262e-05 loss: 0.1022 (0.1169) time: 3.2788 data: 0.0078 max mem: 33300 Epoch: [20] [2080/4276] eta: 1:50:29 lr: 2.6206816436057164e-05 loss: 0.1024 (0.1170) time: 3.5440 data: 0.0073 max mem: 33300 Epoch: [20] [2090/4276] eta: 1:49:58 lr: 2.6203989667263647e-05 loss: 0.1058 (0.1169) time: 3.2051 data: 0.0075 max mem: 33300 Epoch: [20] [2100/4276] eta: 1:49:27 lr: 2.6201162864587592e-05 loss: 0.1097 (0.1169) time: 2.9388 data: 0.0078 max mem: 33300 Epoch: [20] [2110/4276] eta: 1:48:56 lr: 2.619833602802454e-05 loss: 0.1206 (0.1169) time: 2.9381 data: 0.0079 max mem: 33300 Epoch: [20] [2120/4276] eta: 1:48:25 lr: 2.619550915757002e-05 loss: 0.0914 (0.1168) time: 2.9431 data: 0.0079 max mem: 33300 Epoch: [20] [2130/4276] eta: 1:47:54 lr: 2.619268225321956e-05 loss: 0.0856 (0.1167) time: 2.9465 data: 0.0077 max mem: 33300 Epoch: [20] [2140/4276] eta: 1:47:23 lr: 2.6189855314968688e-05 loss: 0.0997 (0.1166) time: 2.9482 data: 0.0075 max mem: 33300 Epoch: [20] [2150/4276] eta: 1:46:52 lr: 2.6187028342812935e-05 loss: 0.0997 (0.1166) time: 2.9512 data: 0.0073 max mem: 33300 Epoch: [20] 
[2160/4276] eta: 1:46:21 lr: 2.6184201336747833e-05 loss: 0.0987 (0.1165) time: 2.9530 data: 0.0073 max mem: 33300 Epoch: [20] [2170/4276] eta: 1:45:51 lr: 2.618137429676889e-05 loss: 0.1130 (0.1166) time: 2.9537 data: 0.0075 max mem: 33300 Epoch: [20] [2180/4276] eta: 1:45:20 lr: 2.617854722287164e-05 loss: 0.1134 (0.1166) time: 2.9489 data: 0.0073 max mem: 33300 Epoch: [20] [2190/4276] eta: 1:44:49 lr: 2.617572011505161e-05 loss: 0.1080 (0.1167) time: 2.9407 data: 0.0072 max mem: 33300 Epoch: [20] [2200/4276] eta: 1:44:18 lr: 2.6172892973304326e-05 loss: 0.1175 (0.1167) time: 2.9387 data: 0.0071 max mem: 33300 Epoch: [20] [2210/4276] eta: 1:43:47 lr: 2.6170065797625288e-05 loss: 0.1252 (0.1168) time: 2.9407 data: 0.0072 max mem: 33300 Epoch: [20] [2220/4276] eta: 1:43:16 lr: 2.6167238588010035e-05 loss: 0.1231 (0.1168) time: 2.9437 data: 0.0072 max mem: 33300 Epoch: [20] [2230/4276] eta: 1:42:46 lr: 2.6164411344454076e-05 loss: 0.1092 (0.1168) time: 2.9466 data: 0.0071 max mem: 33300 Epoch: [20] [2240/4276] eta: 1:42:15 lr: 2.6161584066952937e-05 loss: 0.1077 (0.1168) time: 2.9505 data: 0.0074 max mem: 33300 Epoch: [20] [2250/4276] eta: 1:41:44 lr: 2.6158756755502127e-05 loss: 0.1026 (0.1168) time: 2.9548 data: 0.0076 max mem: 33300 Epoch: [20] [2260/4276] eta: 1:41:14 lr: 2.615592941009716e-05 loss: 0.1027 (0.1168) time: 2.9547 data: 0.0076 max mem: 33300 Epoch: [20] [2270/4276] eta: 1:40:43 lr: 2.6153102030733555e-05 loss: 0.1085 (0.1167) time: 2.9507 data: 0.0076 max mem: 33300 Epoch: [20] [2280/4276] eta: 1:40:12 lr: 2.6150274617406826e-05 loss: 0.1117 (0.1168) time: 2.9528 data: 0.0077 max mem: 33300 Epoch: [20] [2290/4276] eta: 1:39:42 lr: 2.6147447170112478e-05 loss: 0.1153 (0.1168) time: 2.9555 data: 0.0076 max mem: 33300 Epoch: [20] [2300/4276] eta: 1:39:11 lr: 2.614461968884603e-05 loss: 0.1153 (0.1168) time: 2.9551 data: 0.0078 max mem: 33300 Epoch: [20] [2310/4276] eta: 1:38:40 lr: 2.614179217360298e-05 loss: 0.1180 (0.1169) time: 2.9545 data: 0.0079 
max mem: 33300 Epoch: [20] [2320/4276] eta: 1:38:10 lr: 2.6138964624378854e-05 loss: 0.1180 (0.1169) time: 2.9551 data: 0.0079 max mem: 33300 Epoch: [20] [2330/4276] eta: 1:37:39 lr: 2.613613704116914e-05 loss: 0.1185 (0.1169) time: 2.9581 data: 0.0079 max mem: 33300 Epoch: [20] [2340/4276] eta: 1:37:09 lr: 2.613330942396935e-05 loss: 0.1175 (0.1169) time: 2.9553 data: 0.0077 max mem: 33300 Epoch: [20] [2350/4276] eta: 1:36:38 lr: 2.6130481772774996e-05 loss: 0.1084 (0.1169) time: 2.9516 data: 0.0077 max mem: 33300 Epoch: [20] [2360/4276] eta: 1:36:08 lr: 2.6127654087581582e-05 loss: 0.1084 (0.1169) time: 2.9517 data: 0.0079 max mem: 33300 Epoch: [20] [2370/4276] eta: 1:35:37 lr: 2.6124826368384602e-05 loss: 0.1194 (0.1169) time: 2.9534 data: 0.0082 max mem: 33300 Epoch: [20] [2380/4276] eta: 1:35:06 lr: 2.612199861517956e-05 loss: 0.1194 (0.1170) time: 2.9548 data: 0.0084 max mem: 33300 Epoch: [20] [2390/4276] eta: 1:34:36 lr: 2.6119170827961963e-05 loss: 0.1167 (0.1170) time: 2.9535 data: 0.0082 max mem: 33300 Epoch: [20] [2400/4276] eta: 1:34:05 lr: 2.6116343006727302e-05 loss: 0.1232 (0.1171) time: 2.9383 data: 0.0080 max mem: 33300 Epoch: [20] [2410/4276] eta: 1:33:35 lr: 2.6113515151471074e-05 loss: 0.1210 (0.1171) time: 2.9334 data: 0.0082 max mem: 33300 Epoch: [20] [2420/4276] eta: 1:33:04 lr: 2.611068726218878e-05 loss: 0.1129 (0.1170) time: 2.9465 data: 0.0083 max mem: 33300 Epoch: [20] [2430/4276] eta: 1:32:34 lr: 2.610785933887593e-05 loss: 0.1129 (0.1171) time: 2.9652 data: 0.0082 max mem: 33300 Epoch: [20] [2440/4276] eta: 1:32:03 lr: 2.6105031381527993e-05 loss: 0.1094 (0.1171) time: 2.9869 data: 0.0074 max mem: 33300 Epoch: [20] [2450/4276] eta: 1:31:33 lr: 2.610220339014048e-05 loss: 0.1029 (0.1170) time: 2.9937 data: 0.0068 max mem: 33300 Epoch: [20] [2460/4276] eta: 1:31:03 lr: 2.609937536470887e-05 loss: 0.1289 (0.1171) time: 2.9554 data: 0.0069 max mem: 33300 Epoch: [20] [2470/4276] eta: 1:30:32 lr: 2.6096547305228673e-05 loss: 0.1365 (0.1172) 
time: 2.9145 data: 0.0077 max mem: 33300 Epoch: [20] [2480/4276] eta: 1:30:01 lr: 2.6093719211695366e-05 loss: 0.1250 (0.1172) time: 2.9102 data: 0.0084 max mem: 33300 Epoch: [20] [2490/4276] eta: 1:29:30 lr: 2.6090891084104436e-05 loss: 0.1221 (0.1173) time: 2.9229 data: 0.0081 max mem: 33300 Epoch: [20] [2500/4276] eta: 1:29:00 lr: 2.608806292245138e-05 loss: 0.1248 (0.1173) time: 2.9409 data: 0.0079 max mem: 33300 Epoch: [20] [2510/4276] eta: 1:28:30 lr: 2.6085234726731682e-05 loss: 0.1231 (0.1173) time: 2.9950 data: 0.0081 max mem: 33300 Epoch: [20] [2520/4276] eta: 1:28:00 lr: 2.608240649694082e-05 loss: 0.1005 (0.1172) time: 3.0628 data: 0.0082 max mem: 33300 Epoch: [20] [2530/4276] eta: 1:27:31 lr: 2.6079578233074285e-05 loss: 0.0966 (0.1172) time: 3.0840 data: 0.0082 max mem: 33300 Epoch: [20] [2540/4276] eta: 1:27:01 lr: 2.6076749935127563e-05 loss: 0.0976 (0.1172) time: 3.0870 data: 0.0086 max mem: 33300 Epoch: [20] [2550/4276] eta: 1:26:32 lr: 2.6073921603096134e-05 loss: 0.1046 (0.1171) time: 3.0847 data: 0.0087 max mem: 33300 Epoch: [20] [2560/4276] eta: 1:26:02 lr: 2.6071093236975474e-05 loss: 0.0953 (0.1171) time: 3.0588 data: 0.0086 max mem: 33300 Epoch: [20] [2570/4276] eta: 1:25:32 lr: 2.606826483676107e-05 loss: 0.0958 (0.1170) time: 3.0627 data: 0.0087 max mem: 33300 Epoch: [20] [2580/4276] eta: 1:25:03 lr: 2.6065436402448394e-05 loss: 0.1043 (0.1170) time: 3.0893 data: 0.0086 max mem: 33300 Epoch: [20] [2590/4276] eta: 1:24:33 lr: 2.6062607934032934e-05 loss: 0.1137 (0.1170) time: 3.0951 data: 0.0086 max mem: 33300 Epoch: [20] [2600/4276] eta: 1:24:04 lr: 2.605977943151015e-05 loss: 0.1030 (0.1169) time: 3.0946 data: 0.0087 max mem: 33300 Epoch: [20] [2610/4276] eta: 1:23:34 lr: 2.605695089487553e-05 loss: 0.1030 (0.1169) time: 3.0879 data: 0.0086 max mem: 33300 Epoch: [20] [2620/4276] eta: 1:23:05 lr: 2.6054122324124545e-05 loss: 0.1068 (0.1169) time: 3.0834 data: 0.0085 max mem: 33300 Epoch: [20] [2630/4276] eta: 1:22:35 lr: 
2.6051293719252674e-05 loss: 0.1047 (0.1168) time: 3.0576 data: 0.0086 max mem: 33300 Epoch: [20] [2640/4276] eta: 1:22:05 lr: 2.604846508025538e-05 loss: 0.1018 (0.1168) time: 3.0522 data: 0.0087 max mem: 33300 Epoch: [20] [2650/4276] eta: 1:21:35 lr: 2.6045636407128133e-05 loss: 0.1014 (0.1168) time: 3.0500 data: 0.0085 max mem: 33300 Epoch: [20] [2660/4276] eta: 1:21:05 lr: 2.604280769986641e-05 loss: 0.1014 (0.1168) time: 3.0296 data: 0.0087 max mem: 33300 Epoch: [20] [2670/4276] eta: 1:20:35 lr: 2.6039978958465676e-05 loss: 0.1180 (0.1168) time: 3.0378 data: 0.0089 max mem: 33300 Epoch: [20] [2680/4276] eta: 1:20:05 lr: 2.6037150182921395e-05 loss: 0.1213 (0.1168) time: 3.0681 data: 0.0085 max mem: 33300 Epoch: [20] [2690/4276] eta: 1:19:36 lr: 2.6034321373229035e-05 loss: 0.1132 (0.1168) time: 3.0887 data: 0.0085 max mem: 33300 Epoch: [20] [2700/4276] eta: 1:19:06 lr: 2.603149252938407e-05 loss: 0.1013 (0.1168) time: 3.0855 data: 0.0085 max mem: 33300 Epoch: [20] [2710/4276] eta: 1:18:36 lr: 2.6028663651381946e-05 loss: 0.1023 (0.1167) time: 3.0793 data: 0.0077 max mem: 33300 Epoch: [20] [2720/4276] eta: 1:18:06 lr: 2.6025834739218142e-05 loss: 0.0954 (0.1167) time: 3.0773 data: 0.0073 max mem: 33300 Epoch: [20] [2730/4276] eta: 1:17:37 lr: 2.602300579288811e-05 loss: 0.1100 (0.1167) time: 3.0721 data: 0.0077 max mem: 33300 Epoch: [20] [2740/4276] eta: 1:17:07 lr: 2.6020176812387314e-05 loss: 0.1105 (0.1167) time: 3.0450 data: 0.0083 max mem: 33300 Epoch: [20] [2750/4276] eta: 1:16:37 lr: 2.6017347797711216e-05 loss: 0.1137 (0.1168) time: 3.0246 data: 0.0086 max mem: 33300 Epoch: [20] [2760/4276] eta: 1:16:06 lr: 2.6014518748855265e-05 loss: 0.1117 (0.1168) time: 3.0261 data: 0.0088 max mem: 33300 Epoch: [20] [2770/4276] eta: 1:15:36 lr: 2.6011689665814924e-05 loss: 0.1071 (0.1168) time: 3.0256 data: 0.0090 max mem: 33300 Epoch: [20] [2780/4276] eta: 1:15:07 lr: 2.6008860548585657e-05 loss: 0.1058 (0.1168) time: 3.0368 data: 0.0091 max mem: 33300 Epoch: [20] 
[2790/4276] eta: 1:14:37 lr: 2.6006031397162906e-05 loss: 0.1111 (0.1168) time: 3.0620 data: 0.0092 max mem: 33300 Epoch: [20] [2800/4276] eta: 1:14:07 lr: 2.6003202211542122e-05 loss: 0.1087 (0.1167) time: 3.0851 data: 0.0090 max mem: 33300 Epoch: [20] [2810/4276] eta: 1:13:37 lr: 2.6000372991718768e-05 loss: 0.0868 (0.1166) time: 3.0728 data: 0.0085 max mem: 33300 Epoch: [20] [2820/4276] eta: 1:13:07 lr: 2.5997543737688296e-05 loss: 0.0946 (0.1166) time: 3.0710 data: 0.0090 max mem: 33300 Epoch: [20] [2830/4276] eta: 1:12:38 lr: 2.5994714449446146e-05 loss: 0.1014 (0.1166) time: 3.0954 data: 0.0098 max mem: 33300 Epoch: [20] [2840/4276] eta: 1:12:08 lr: 2.599188512698777e-05 loss: 0.1263 (0.1166) time: 3.0948 data: 0.0099 max mem: 33300 Epoch: [20] [2850/4276] eta: 1:11:38 lr: 2.5989055770308617e-05 loss: 0.1317 (0.1166) time: 3.0910 data: 0.0094 max mem: 33300 Epoch: [20] [2860/4276] eta: 1:11:08 lr: 2.5986226379404143e-05 loss: 0.1170 (0.1166) time: 3.0915 data: 0.0092 max mem: 33300 Epoch: [20] [2870/4276] eta: 1:10:39 lr: 2.598339695426978e-05 loss: 0.0989 (0.1166) time: 3.0894 data: 0.0099 max mem: 33300 Epoch: [20] [2880/4276] eta: 1:10:09 lr: 2.5980567494900974e-05 loss: 0.1071 (0.1166) time: 3.0913 data: 0.0102 max mem: 33300 Epoch: [20] [2890/4276] eta: 1:09:39 lr: 2.5977738001293173e-05 loss: 0.1078 (0.1165) time: 3.0989 data: 0.0100 max mem: 33300 Epoch: [20] [2900/4276] eta: 1:09:09 lr: 2.5974908473441818e-05 loss: 0.1104 (0.1165) time: 3.1014 data: 0.0097 max mem: 33300 Epoch: [20] [2910/4276] eta: 1:08:40 lr: 2.5972078911342346e-05 loss: 0.1160 (0.1165) time: 3.1002 data: 0.0098 max mem: 33300 Epoch: [20] [2920/4276] eta: 1:08:10 lr: 2.5969249314990197e-05 loss: 0.1168 (0.1165) time: 3.0892 data: 0.0096 max mem: 33300 Epoch: [20] [2930/4276] eta: 1:07:39 lr: 2.596641968438081e-05 loss: 0.1043 (0.1165) time: 3.0397 data: 0.0086 max mem: 33300 Epoch: [20] [2940/4276] eta: 1:07:09 lr: 2.596359001950963e-05 loss: 0.0967 (0.1165) time: 3.0105 data: 
0.0089 max mem: 33300 Epoch: [20] [2950/4276] eta: 1:06:39 lr: 2.5960760320372084e-05 loss: 0.0993 (0.1165) time: 3.0194 data: 0.0096 max mem: 33300 Epoch: [20] [2960/4276] eta: 1:06:09 lr: 2.5957930586963607e-05 loss: 0.1044 (0.1165) time: 2.9995 data: 0.0088 max mem: 33300 Epoch: [20] [2970/4276] eta: 1:05:39 lr: 2.5955100819279643e-05 loss: 0.1124 (0.1165) time: 2.9862 data: 0.0084 max mem: 33300 Epoch: [20] [2980/4276] eta: 1:05:08 lr: 2.595227101731561e-05 loss: 0.1192 (0.1165) time: 2.9846 data: 0.0085 max mem: 33300 Epoch: [20] [2990/4276] eta: 1:04:38 lr: 2.5949441181066948e-05 loss: 0.1082 (0.1165) time: 2.9586 data: 0.0081 max mem: 33300 Epoch: [20] [3000/4276] eta: 1:04:07 lr: 2.5946611310529085e-05 loss: 0.1039 (0.1165) time: 2.9378 data: 0.0084 max mem: 33300 Epoch: [20] [3010/4276] eta: 1:03:37 lr: 2.594378140569746e-05 loss: 0.1051 (0.1164) time: 2.9363 data: 0.0086 max mem: 33300 Epoch: [20] [3020/4276] eta: 1:03:06 lr: 2.5940951466567483e-05 loss: 0.1073 (0.1164) time: 2.9214 data: 0.0085 max mem: 33300 Epoch: [20] [3030/4276] eta: 1:02:36 lr: 2.5938121493134588e-05 loss: 0.1115 (0.1164) time: 2.9032 data: 0.0086 max mem: 33300 Epoch: [20] [3040/4276] eta: 1:02:05 lr: 2.5935291485394203e-05 loss: 0.1222 (0.1164) time: 2.9005 data: 0.0088 max mem: 33300 Epoch: [20] [3050/4276] eta: 1:01:34 lr: 2.593246144334176e-05 loss: 0.1208 (0.1164) time: 2.9168 data: 0.0089 max mem: 33300 Epoch: [20] [3060/4276] eta: 1:01:04 lr: 2.592963136697267e-05 loss: 0.1007 (0.1164) time: 2.9387 data: 0.0086 max mem: 33300 Epoch: [20] [3070/4276] eta: 1:00:34 lr: 2.592680125628235e-05 loss: 0.1011 (0.1164) time: 2.9440 data: 0.0085 max mem: 33300 Epoch: [20] [3080/4276] eta: 1:00:03 lr: 2.5923971111266236e-05 loss: 0.1252 (0.1164) time: 2.9459 data: 0.0087 max mem: 33300 Epoch: [20] [3090/4276] eta: 0:59:33 lr: 2.592114093191975e-05 loss: 0.1097 (0.1164) time: 2.9449 data: 0.0087 max mem: 33300 Epoch: [20] [3100/4276] eta: 0:59:02 lr: 2.5918310718238297e-05 loss: 0.1053 
(0.1163) time: 2.9396 data: 0.0085 max mem: 33300 Epoch: [20] [3110/4276] eta: 0:58:32 lr: 2.59154804702173e-05 loss: 0.1068 (0.1163) time: 2.9373 data: 0.0085 max mem: 33300 Epoch: [20] [3120/4276] eta: 0:58:02 lr: 2.5912650187852172e-05 loss: 0.1106 (0.1163) time: 2.9402 data: 0.0087 max mem: 33300 Epoch: [20] [3130/4276] eta: 0:57:31 lr: 2.5909819871138342e-05 loss: 0.1152 (0.1163) time: 2.9430 data: 0.0087 max mem: 33300 Epoch: [20] [3140/4276] eta: 0:57:01 lr: 2.5906989520071207e-05 loss: 0.1152 (0.1163) time: 2.9420 data: 0.0085 max mem: 33300 Epoch: [20] [3150/4276] eta: 0:56:31 lr: 2.5904159134646183e-05 loss: 0.1174 (0.1163) time: 2.9404 data: 0.0085 max mem: 33300 Epoch: [20] [3160/4276] eta: 0:56:00 lr: 2.5901328714858686e-05 loss: 0.1148 (0.1163) time: 2.9340 data: 0.0090 max mem: 33300 Epoch: [20] [3170/4276] eta: 0:55:30 lr: 2.5898498260704135e-05 loss: 0.1148 (0.1163) time: 2.9310 data: 0.0092 max mem: 33300 Epoch: [20] [3180/4276] eta: 0:54:59 lr: 2.589566777217792e-05 loss: 0.1153 (0.1163) time: 2.9338 data: 0.0085 max mem: 33300 Epoch: [20] [3190/4276] eta: 0:54:29 lr: 2.589283724927546e-05 loss: 0.1176 (0.1163) time: 2.9337 data: 0.0077 max mem: 33300 Epoch: [20] [3200/4276] eta: 0:53:59 lr: 2.5890006691992163e-05 loss: 0.1183 (0.1163) time: 2.9370 data: 0.0075 max mem: 33300 Epoch: [20] [3210/4276] eta: 0:53:28 lr: 2.5887176100323436e-05 loss: 0.1220 (0.1164) time: 2.9386 data: 0.0075 max mem: 33300 Epoch: [20] [3220/4276] eta: 0:52:58 lr: 2.5884345474264676e-05 loss: 0.1301 (0.1164) time: 2.9380 data: 0.0074 max mem: 33300 Epoch: [20] [3230/4276] eta: 0:52:28 lr: 2.5881514813811293e-05 loss: 0.1099 (0.1164) time: 2.9383 data: 0.0072 max mem: 33300 Epoch: [20] [3240/4276] eta: 0:51:57 lr: 2.587868411895868e-05 loss: 0.1128 (0.1164) time: 2.9369 data: 0.0072 max mem: 33300 Epoch: [20] [3250/4276] eta: 0:51:27 lr: 2.587585338970226e-05 loss: 0.1194 (0.1164) time: 2.9365 data: 0.0071 max mem: 33300 Epoch: [20] [3260/4276] eta: 0:50:57 lr: 
2.58730226260374e-05 loss: 0.1160 (0.1164) time: 2.9373 data: 0.0070 max mem: 33300 Epoch: [20] [3270/4276] eta: 0:50:26 lr: 2.5870191827959528e-05 loss: 0.1135 (0.1164) time: 2.9363 data: 0.0071 max mem: 33300 Epoch: [20] [3280/4276] eta: 0:49:56 lr: 2.5867360995464036e-05 loss: 0.1118 (0.1165) time: 2.9369 data: 0.0071 max mem: 33300 Epoch: [20] [3290/4276] eta: 0:49:26 lr: 2.58645301285463e-05 loss: 0.1201 (0.1165) time: 2.9393 data: 0.0071 max mem: 33300 Epoch: [20] [3300/4276] eta: 0:48:55 lr: 2.5861699227201735e-05 loss: 0.1223 (0.1165) time: 2.9382 data: 0.0071 max mem: 33300 Epoch: [20] [3310/4276] eta: 0:48:25 lr: 2.585886829142573e-05 loss: 0.1255 (0.1166) time: 2.9369 data: 0.0071 max mem: 33300 Epoch: [20] [3320/4276] eta: 0:47:55 lr: 2.5856037321213683e-05 loss: 0.1255 (0.1166) time: 2.9369 data: 0.0071 max mem: 33300 Epoch: [20] [3330/4276] eta: 0:47:25 lr: 2.585320631656098e-05 loss: 0.1130 (0.1166) time: 2.9389 data: 0.0071 max mem: 33300 Epoch: [20] [3340/4276] eta: 0:46:54 lr: 2.5850375277463007e-05 loss: 0.1054 (0.1166) time: 2.9382 data: 0.0071 max mem: 33300 Epoch: [20] [3350/4276] eta: 0:46:24 lr: 2.584754420391516e-05 loss: 0.1050 (0.1165) time: 2.9359 data: 0.0071 max mem: 33300 Epoch: [20] [3360/4276] eta: 0:45:54 lr: 2.5844713095912826e-05 loss: 0.1052 (0.1165) time: 2.9357 data: 0.0072 max mem: 33300 Epoch: [20] [3370/4276] eta: 0:45:24 lr: 2.584188195345139e-05 loss: 0.1208 (0.1165) time: 2.9366 data: 0.0074 max mem: 33300 Epoch: [20] [3380/4276] eta: 0:44:53 lr: 2.583905077652624e-05 loss: 0.1208 (0.1165) time: 2.9338 data: 0.0073 max mem: 33300 Epoch: [20] [3390/4276] eta: 0:44:23 lr: 2.5836219565132756e-05 loss: 0.1167 (0.1165) time: 2.9252 data: 0.0071 max mem: 33300 Epoch: [20] [3400/4276] eta: 0:43:53 lr: 2.5833388319266338e-05 loss: 0.1161 (0.1165) time: 2.9281 data: 0.0072 max mem: 33300 Epoch: [20] [3410/4276] eta: 0:43:22 lr: 2.583055703892235e-05 loss: 0.1177 (0.1166) time: 2.9320 data: 0.0074 max mem: 33300 Epoch: [20] 
[3420/4276] eta: 0:42:52 lr: 2.582772572409617e-05 loss: 0.1179 (0.1166) time: 2.9264 data: 0.0075 max mem: 33300 Epoch: [20] [3430/4276] eta: 0:42:23 lr: 2.5824894374783193e-05 loss: 0.1179 (0.1166) time: 3.1135 data: 0.0075 max mem: 33300 Epoch: [20] [3440/4276] eta: 0:41:54 lr: 2.5822062990978794e-05 loss: 0.1029 (0.1166) time: 3.2954 data: 0.0072 max mem: 33300 Epoch: [20] [3450/4276] eta: 0:41:23 lr: 2.5819231572678342e-05 loss: 0.1087 (0.1166) time: 3.1110 data: 0.0071 max mem: 33300 Epoch: [20] [3460/4276] eta: 0:40:53 lr: 2.581640011987722e-05 loss: 0.1178 (0.1166) time: 2.9339 data: 0.0074 max mem: 33300 Epoch: [20] [3470/4276] eta: 0:40:23 lr: 2.5813568632570807e-05 loss: 0.1070 (0.1166) time: 2.9376 data: 0.0074 max mem: 33300 Epoch: [20] [3480/4276] eta: 0:39:53 lr: 2.581073711075447e-05 loss: 0.1117 (0.1166) time: 2.9381 data: 0.0071 max mem: 33300 Epoch: [20] [3490/4276] eta: 0:39:22 lr: 2.5807905554423584e-05 loss: 0.1097 (0.1166) time: 2.9346 data: 0.0071 max mem: 33300 Epoch: [20] [3500/4276] eta: 0:38:52 lr: 2.5805073963573518e-05 loss: 0.1076 (0.1166) time: 2.9356 data: 0.0071 max mem: 33300 Epoch: [20] [3510/4276] eta: 0:38:22 lr: 2.5802242338199646e-05 loss: 0.1016 (0.1166) time: 2.9399 data: 0.0071 max mem: 33300 Epoch: [20] [3520/4276] eta: 0:37:52 lr: 2.5799410678297343e-05 loss: 0.1016 (0.1165) time: 2.9408 data: 0.0072 max mem: 33300 Epoch: [20] [3530/4276] eta: 0:37:22 lr: 2.5796578983861965e-05 loss: 0.1065 (0.1165) time: 2.9379 data: 0.0073 max mem: 33300 Epoch: [20] [3540/4276] eta: 0:36:51 lr: 2.579374725488888e-05 loss: 0.1168 (0.1165) time: 2.9367 data: 0.0072 max mem: 33300 Epoch: [20] [3550/4276] eta: 0:36:21 lr: 2.5790915491373457e-05 loss: 0.1090 (0.1165) time: 2.9401 data: 0.0071 max mem: 33300 Epoch: [20] [3560/4276] eta: 0:35:51 lr: 2.578808369331107e-05 loss: 0.1090 (0.1166) time: 3.0429 data: 0.0071 max mem: 33300 Epoch: [20] [3570/4276] eta: 0:35:21 lr: 2.5785251860697067e-05 loss: 0.1265 (0.1166) time: 3.0360 data: 0.0072 
max mem: 33300 Epoch: [20] [3580/4276] eta: 0:34:51 lr: 2.5782419993526826e-05 loss: 0.1110 (0.1166) time: 2.9286 data: 0.0071 max mem: 33300 Epoch: [20] [3590/4276] eta: 0:34:21 lr: 2.577958809179569e-05 loss: 0.1031 (0.1166) time: 2.9227 data: 0.0073 max mem: 33300 Epoch: [20] [3600/4276] eta: 0:33:51 lr: 2.5776756155499033e-05 loss: 0.1124 (0.1166) time: 2.9156 data: 0.0081 max mem: 33300 Epoch: [20] [3610/4276] eta: 0:33:20 lr: 2.5773924184632203e-05 loss: 0.1103 (0.1166) time: 2.9046 data: 0.0091 max mem: 33300 Epoch: [20] [3620/4276] eta: 0:32:50 lr: 2.5771092179190565e-05 loss: 0.1103 (0.1165) time: 2.9136 data: 0.0089 max mem: 33300 Epoch: [20] [3630/4276] eta: 0:32:20 lr: 2.5768260139169474e-05 loss: 0.1101 (0.1165) time: 2.9369 data: 0.0082 max mem: 33300 Epoch: [20] [3640/4276] eta: 0:31:50 lr: 2.576542806456428e-05 loss: 0.1086 (0.1165) time: 2.9469 data: 0.0083 max mem: 33300 Epoch: [20] [3650/4276] eta: 0:31:20 lr: 2.5762595955370338e-05 loss: 0.1086 (0.1165) time: 2.9425 data: 0.0087 max mem: 33300 Epoch: [20] [3660/4276] eta: 0:30:50 lr: 2.575976381158301e-05 loss: 0.1100 (0.1165) time: 2.9359 data: 0.0085 max mem: 33300 Epoch: [20] [3670/4276] eta: 0:30:19 lr: 2.575693163319764e-05 loss: 0.1024 (0.1164) time: 2.9354 data: 0.0085 max mem: 33300 Epoch: [20] [3680/4276] eta: 0:29:49 lr: 2.5754099420209578e-05 loss: 0.1015 (0.1164) time: 2.9354 data: 0.0087 max mem: 33300 Epoch: [20] [3690/4276] eta: 0:29:19 lr: 2.575126717261417e-05 loss: 0.1140 (0.1164) time: 2.9380 data: 0.0087 max mem: 33300 Epoch: [20] [3700/4276] eta: 0:28:49 lr: 2.5748434890406766e-05 loss: 0.1153 (0.1164) time: 2.9382 data: 0.0085 max mem: 33300 Epoch: [20] [3710/4276] eta: 0:28:19 lr: 2.5745602573582718e-05 loss: 0.1005 (0.1164) time: 2.9389 data: 0.0086 max mem: 33300 Epoch: [20] [3720/4276] eta: 0:27:49 lr: 2.5742770222137363e-05 loss: 0.0937 (0.1164) time: 2.9399 data: 0.0088 max mem: 33300 Epoch: [20] [3730/4276] eta: 0:27:19 lr: 2.573993783606605e-05 loss: 0.1164 (0.1164) 
time: 2.9405 data: 0.0087 max mem: 33300 Epoch: [20] [3740/4276] eta: 0:26:49 lr: 2.5737105415364125e-05 loss: 0.1158 (0.1164) time: 2.9398 data: 0.0085 max mem: 33300 Epoch: [20] [3750/4276] eta: 0:26:18 lr: 2.5734272960026934e-05 loss: 0.1129 (0.1164) time: 2.9382 data: 0.0085 max mem: 33300 Epoch: [20] [3760/4276] eta: 0:25:48 lr: 2.57314404700498e-05 loss: 0.1073 (0.1163) time: 2.9133 data: 0.0085 max mem: 33300 Epoch: [20] [3770/4276] eta: 0:25:18 lr: 2.5728607945428074e-05 loss: 0.0995 (0.1163) time: 2.8944 data: 0.0085 max mem: 33300 Epoch: [20] [3780/4276] eta: 0:24:48 lr: 2.572577538615709e-05 loss: 0.1061 (0.1163) time: 2.9189 data: 0.0082 max mem: 33300 Epoch: [20] [3790/4276] eta: 0:24:18 lr: 2.5722942792232198e-05 loss: 0.1055 (0.1163) time: 2.9408 data: 0.0080 max mem: 33300 Epoch: [20] [3800/4276] eta: 0:23:48 lr: 2.5720110163648716e-05 loss: 0.1034 (0.1163) time: 2.9394 data: 0.0080 max mem: 33300 Epoch: [20] [3810/4276] eta: 0:23:18 lr: 2.571727750040199e-05 loss: 0.1044 (0.1163) time: 2.9340 data: 0.0080 max mem: 33300 Epoch: [20] [3820/4276] eta: 0:22:48 lr: 2.571444480248734e-05 loss: 0.0975 (0.1162) time: 2.9334 data: 0.0083 max mem: 33300 Epoch: [20] [3830/4276] eta: 0:22:18 lr: 2.571161206990012e-05 loss: 0.0952 (0.1162) time: 2.9360 data: 0.0081 max mem: 33300 Epoch: [20] [3840/4276] eta: 0:21:48 lr: 2.570877930263565e-05 loss: 0.1076 (0.1162) time: 2.9361 data: 0.0079 max mem: 33300 Epoch: [20] [3850/4276] eta: 0:21:17 lr: 2.570594650068925e-05 loss: 0.0997 (0.1162) time: 2.9334 data: 0.0081 max mem: 33300 Epoch: [20] [3860/4276] eta: 0:20:47 lr: 2.570311366405626e-05 loss: 0.1060 (0.1162) time: 2.9346 data: 0.0083 max mem: 33300 Epoch: [20] [3870/4276] eta: 0:20:17 lr: 2.5700280792732014e-05 loss: 0.1091 (0.1161) time: 2.9360 data: 0.0081 max mem: 33300 Epoch: [20] [3880/4276] eta: 0:19:47 lr: 2.5697447886711824e-05 loss: 0.1033 (0.1161) time: 2.9341 data: 0.0080 max mem: 33300 Epoch: [20] [3890/4276] eta: 0:19:17 lr: 
2.5694614945991018e-05 loss: 0.0957 (0.1161) time: 2.9278 data: 0.0085 max mem: 33300 Epoch: [20] [3900/4276] eta: 0:18:47 lr: 2.5691781970564925e-05 loss: 0.1028 (0.1161) time: 2.9293 data: 0.0086 max mem: 33300 Epoch: [20] [3910/4276] eta: 0:18:17 lr: 2.5688948960428865e-05 loss: 0.0987 (0.1160) time: 2.9343 data: 0.0081 max mem: 33300 Epoch: [20] [3920/4276] eta: 0:17:47 lr: 2.5686115915578156e-05 loss: 0.0976 (0.1160) time: 2.9337 data: 0.0078 max mem: 33300 Epoch: [20] [3930/4276] eta: 0:17:17 lr: 2.568328283600813e-05 loss: 0.0942 (0.1160) time: 2.9343 data: 0.0083 max mem: 33300 Epoch: [20] [3940/4276] eta: 0:16:47 lr: 2.5680449721714094e-05 loss: 0.1070 (0.1160) time: 2.9343 data: 0.0087 max mem: 33300 Epoch: [20] [3950/4276] eta: 0:16:17 lr: 2.5677616572691366e-05 loss: 0.1075 (0.1160) time: 2.9340 data: 0.0085 max mem: 33300 Epoch: [20] [3960/4276] eta: 0:15:47 lr: 2.567478338893527e-05 loss: 0.1094 (0.1160) time: 2.9344 data: 0.0086 max mem: 33300 Epoch: [20] [3970/4276] eta: 0:15:17 lr: 2.5671950170441112e-05 loss: 0.1281 (0.1160) time: 2.9349 data: 0.0086 max mem: 33300 Epoch: [20] [3980/4276] eta: 0:14:47 lr: 2.566911691720422e-05 loss: 0.1134 (0.1160) time: 2.9349 data: 0.0084 max mem: 33300 Epoch: [20] [3990/4276] eta: 0:14:17 lr: 2.5666283629219894e-05 loss: 0.1076 (0.1160) time: 2.9315 data: 0.0083 max mem: 33300 Epoch: [20] [4000/4276] eta: 0:13:47 lr: 2.5663450306483448e-05 loss: 0.1023 (0.1160) time: 2.9322 data: 0.0083 max mem: 33300 Epoch: [20] [4010/4276] eta: 0:13:17 lr: 2.56606169489902e-05 loss: 0.1001 (0.1160) time: 2.9355 data: 0.0083 max mem: 33300 Epoch: [20] [4020/4276] eta: 0:12:47 lr: 2.565778355673546e-05 loss: 0.1030 (0.1160) time: 2.9364 data: 0.0083 max mem: 33300 Epoch: [20] [4030/4276] eta: 0:12:17 lr: 2.565495012971452e-05 loss: 0.1044 (0.1160) time: 2.9375 data: 0.0081 max mem: 33300 Epoch: [20] [4040/4276] eta: 0:11:47 lr: 2.5652116667922703e-05 loss: 0.1092 (0.1160) time: 2.9353 data: 0.0079 max mem: 33300 Epoch: [20] 
[4050/4276] eta: 0:11:17 lr: 2.564928317135531e-05 loss: 0.1134 (0.1160) time: 2.9347 data: 0.0081 max mem: 33300 Epoch: [20] [4060/4276] eta: 0:10:47 lr: 2.5646449640007646e-05 loss: 0.1093 (0.1160) time: 2.9350 data: 0.0083 max mem: 33300 Epoch: [20] [4070/4276] eta: 0:10:17 lr: 2.5643616073875015e-05 loss: 0.1177 (0.1160) time: 2.9359 data: 0.0081 max mem: 33300 Epoch: [20] [4080/4276] eta: 0:09:47 lr: 2.5640782472952712e-05 loss: 0.1215 (0.1161) time: 2.9384 data: 0.0081 max mem: 33300 Epoch: [20] [4090/4276] eta: 0:09:17 lr: 2.563794883723604e-05 loss: 0.1267 (0.1161) time: 2.9435 data: 0.0083 max mem: 33300 Epoch: [20] [4100/4276] eta: 0:08:47 lr: 2.5635115166720318e-05 loss: 0.1233 (0.1161) time: 2.9430 data: 0.0086 max mem: 33300 Epoch: [20] [4110/4276] eta: 0:08:17 lr: 2.5632281461400815e-05 loss: 0.1197 (0.1161) time: 2.9374 data: 0.0084 max mem: 33300 Epoch: [20] [4120/4276] eta: 0:07:47 lr: 2.562944772127285e-05 loss: 0.1142 (0.1161) time: 2.9364 data: 0.0084 max mem: 33300 Epoch: [20] [4130/4276] eta: 0:07:17 lr: 2.5626613946331702e-05 loss: 0.1091 (0.1161) time: 2.9429 data: 0.0088 max mem: 33300 Epoch: [20] [4140/4276] eta: 0:06:47 lr: 2.5623780136572684e-05 loss: 0.1070 (0.1161) time: 2.9436 data: 0.0087 max mem: 33300 Epoch: [20] [4150/4276] eta: 0:06:17 lr: 2.5620946291991078e-05 loss: 0.1076 (0.1161) time: 2.9364 data: 0.0085 max mem: 33300 Epoch: [20] [4160/4276] eta: 0:05:47 lr: 2.561811241258218e-05 loss: 0.1117 (0.1161) time: 2.9382 data: 0.0085 max mem: 33300 Epoch: [20] [4170/4276] eta: 0:05:17 lr: 2.5615278498341272e-05 loss: 0.1262 (0.1161) time: 2.9496 data: 0.0088 max mem: 33300 Epoch: [20] [4180/4276] eta: 0:04:47 lr: 2.5612444549263664e-05 loss: 0.1171 (0.1161) time: 2.9510 data: 0.0087 max mem: 33300 Epoch: [20] [4190/4276] eta: 0:04:17 lr: 2.5609610565344626e-05 loss: 0.1159 (0.1162) time: 2.9406 data: 0.0086 max mem: 33300 Epoch: [20] [4200/4276] eta: 0:03:47 lr: 2.5606776546579452e-05 loss: 0.1277 (0.1162) time: 2.9363 data: 
0.0088 max mem: 33300 Epoch: [20] [4210/4276] eta: 0:03:17 lr: 2.5603942492963433e-05 loss: 0.1289 (0.1162) time: 2.9354 data: 0.0089 max mem: 33300 Epoch: [20] [4220/4276] eta: 0:02:47 lr: 2.5601108404491853e-05 loss: 0.1317 (0.1163) time: 2.9384 data: 0.0088 max mem: 33300 Epoch: [20] [4230/4276] eta: 0:02:17 lr: 2.5598274281159984e-05 loss: 0.1302 (0.1163) time: 2.9371 data: 0.0086 max mem: 33300 Epoch: [20] [4240/4276] eta: 0:01:47 lr: 2.559544012296312e-05 loss: 0.1263 (0.1163) time: 2.9328 data: 0.0085 max mem: 33300 Epoch: [20] [4250/4276] eta: 0:01:17 lr: 2.5592605929896547e-05 loss: 0.1246 (0.1164) time: 2.9385 data: 0.0088 max mem: 33300 Epoch: [20] [4260/4276] eta: 0:00:47 lr: 2.5589771701955527e-05 loss: 0.1118 (0.1164) time: 2.9396 data: 0.0087 max mem: 33300 Epoch: [20] [4270/4276] eta: 0:00:17 lr: 2.5586937439135356e-05 loss: 0.1280 (0.1164) time: 2.9292 data: 0.0079 max mem: 33300 Epoch: [20] Total time: 3:33:20 Test: [ 0/21770] eta: 8:00:37 time: 1.3246 data: 1.2833 max mem: 33300 Test: [ 100/21770] eta: 0:18:15 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 200/21770] eta: 0:15:54 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 300/21770] eta: 0:15:04 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 400/21770] eta: 0:14:38 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 500/21770] eta: 0:14:21 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 600/21770] eta: 0:14:08 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 700/21770] eta: 0:13:58 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 800/21770] eta: 0:13:49 time: 0.0380 data: 0.0008 max mem: 33300 Test: [ 900/21770] eta: 0:13:42 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 1000/21770] eta: 0:13:35 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 1100/21770] eta: 0:13:29 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 1200/21770] eta: 0:13:23 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 1300/21770] eta: 0:13:18 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 1400/21770] eta: 0:13:13 
time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 1500/21770] eta: 0:13:08 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 1600/21770] eta: 0:13:03 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 1700/21770] eta: 0:12:58 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 1800/21770] eta: 0:12:53 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 1900/21770] eta: 0:12:49 time: 0.0382 data: 0.0008 max mem: 33300 Test: [ 2000/21770] eta: 0:12:44 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 2100/21770] eta: 0:12:40 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 2200/21770] eta: 0:12:36 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 2300/21770] eta: 0:12:32 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 2400/21770] eta: 0:12:28 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 2500/21770] eta: 0:12:24 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 2600/21770] eta: 0:12:19 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 2700/21770] eta: 0:12:15 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 2800/21770] eta: 0:12:11 time: 0.0381 data: 0.0009 max mem: 33300 Test: [ 2900/21770] eta: 0:12:07 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 3000/21770] eta: 0:12:03 time: 0.0386 data: 0.0008 max mem: 33300 Test: [ 3100/21770] eta: 0:11:59 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 3200/21770] eta: 0:11:56 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 3300/21770] eta: 0:11:52 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 3400/21770] eta: 0:11:49 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 3500/21770] eta: 0:11:45 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 3600/21770] eta: 0:11:42 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 3700/21770] eta: 0:11:38 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 3800/21770] eta: 0:11:35 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 3900/21770] eta: 0:11:31 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 4000/21770] eta: 0:11:27 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 4100/21770] eta: 0:11:24 
time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 4200/21770] eta: 0:11:20 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 4300/21770] eta: 0:11:16 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 4400/21770] eta: 0:11:12 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 4500/21770] eta: 0:11:08 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 4600/21770] eta: 0:11:05 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 4700/21770] eta: 0:11:01 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 4800/21770] eta: 0:10:57 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 4900/21770] eta: 0:10:53 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 5000/21770] eta: 0:10:49 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 5100/21770] eta: 0:10:45 time: 0.0387 data: 0.0008 max mem: 33300 Test: [ 5200/21770] eta: 0:10:41 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 5300/21770] eta: 0:10:38 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 5400/21770] eta: 0:10:34 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 5500/21770] eta: 0:10:30 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 5600/21770] eta: 0:10:26 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 5700/21770] eta: 0:10:22 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 5800/21770] eta: 0:10:18 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 5900/21770] eta: 0:10:14 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 6000/21770] eta: 0:10:10 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 6100/21770] eta: 0:10:07 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 6200/21770] eta: 0:10:03 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 6300/21770] eta: 0:09:59 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 6400/21770] eta: 0:09:55 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 6500/21770] eta: 0:09:51 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 6600/21770] eta: 0:09:47 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 6700/21770] eta: 0:09:43 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 6800/21770] eta: 0:09:39 
time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 6900/21770] eta: 0:09:36 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 7000/21770] eta: 0:09:32 time: 0.0386 data: 0.0009 max mem: 33300 Test: [ 7100/21770] eta: 0:09:28 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 7200/21770] eta: 0:09:24 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 7300/21770] eta: 0:09:20 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 7400/21770] eta: 0:09:16 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 7500/21770] eta: 0:09:13 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 7600/21770] eta: 0:09:09 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 7700/21770] eta: 0:09:05 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 7800/21770] eta: 0:09:01 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 7900/21770] eta: 0:08:57 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 8000/21770] eta: 0:08:53 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 8100/21770] eta: 0:08:50 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 8200/21770] eta: 0:08:46 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 8300/21770] eta: 0:08:42 time: 0.0383 data: 0.0009 max mem: 33300 Test: [ 8400/21770] eta: 0:08:38 time: 0.0384 data: 0.0008 max mem: 33300 Test: [ 8500/21770] eta: 0:08:34 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 8600/21770] eta: 0:08:30 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 8700/21770] eta: 0:08:26 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 8800/21770] eta: 0:08:22 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 8900/21770] eta: 0:08:18 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 9000/21770] eta: 0:08:15 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 9100/21770] eta: 0:08:11 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 9200/21770] eta: 0:08:07 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 9300/21770] eta: 0:08:03 time: 0.0392 data: 0.0009 max mem: 33300 Test: [ 9400/21770] eta: 0:07:59 time: 0.0391 data: 0.0009 max mem: 33300 Test: [ 9500/21770] eta: 0:07:55 
time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 9600/21770] eta: 0:07:51 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 9700/21770] eta: 0:07:48 time: 0.0389 data: 0.0009 max mem: 33300 Test: [ 9800/21770] eta: 0:07:44 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 9900/21770] eta: 0:07:40 time: 0.0389 data: 0.0009 max mem: 33300 Test: [10000/21770] eta: 0:07:36 time: 0.0389 data: 0.0009 max mem: 33300 Test: [10100/21770] eta: 0:07:32 time: 0.0389 data: 0.0008 max mem: 33300 Test: [10200/21770] eta: 0:07:28 time: 0.0388 data: 0.0009 max mem: 33300 Test: [10300/21770] eta: 0:07:24 time: 0.0390 data: 0.0009 max mem: 33300 Test: [10400/21770] eta: 0:07:20 time: 0.0387 data: 0.0009 max mem: 33300 Test: [10500/21770] eta: 0:07:17 time: 0.0390 data: 0.0009 max mem: 33300 Test: [10600/21770] eta: 0:07:13 time: 0.0388 data: 0.0009 max mem: 33300 Test: [10700/21770] eta: 0:07:09 time: 0.0388 data: 0.0009 max mem: 33300 Test: [10800/21770] eta: 0:07:05 time: 0.0387 data: 0.0009 max mem: 33300 Test: [10900/21770] eta: 0:07:01 time: 0.0388 data: 0.0009 max mem: 33300 Test: [11000/21770] eta: 0:06:57 time: 0.0384 data: 0.0009 max mem: 33300 Test: [11100/21770] eta: 0:06:53 time: 0.0385 data: 0.0009 max mem: 33300 Test: [11200/21770] eta: 0:06:49 time: 0.0384 data: 0.0009 max mem: 33300 Test: [11300/21770] eta: 0:06:45 time: 0.0385 data: 0.0009 max mem: 33300 Test: [11400/21770] eta: 0:06:42 time: 0.0385 data: 0.0009 max mem: 33300 Test: [11500/21770] eta: 0:06:38 time: 0.0386 data: 0.0009 max mem: 33300 Test: [11600/21770] eta: 0:06:34 time: 0.0386 data: 0.0009 max mem: 33300 Test: [11700/21770] eta: 0:06:30 time: 0.0385 data: 0.0009 max mem: 33300 Test: [11800/21770] eta: 0:06:26 time: 0.0384 data: 0.0009 max mem: 33300 Test: [11900/21770] eta: 0:06:22 time: 0.0383 data: 0.0009 max mem: 33300 Test: [12000/21770] eta: 0:06:18 time: 0.0384 data: 0.0009 max mem: 33300 Test: [12100/21770] eta: 0:06:14 time: 0.0388 data: 0.0009 max mem: 33300 Test: [12200/21770] eta: 0:06:10 
time: 0.0389 data: 0.0009 max mem: 33300 Test: [12300/21770] eta: 0:06:07 time: 0.0391 data: 0.0009 max mem: 33300 Test: [12400/21770] eta: 0:06:03 time: 0.0390 data: 0.0009 max mem: 33300 Test: [12500/21770] eta: 0:05:59 time: 0.0394 data: 0.0009 max mem: 33300 Test: [12600/21770] eta: 0:05:55 time: 0.0391 data: 0.0009 max mem: 33300 Test: [12700/21770] eta: 0:05:51 time: 0.0392 data: 0.0009 max mem: 33300 Test: [12800/21770] eta: 0:05:47 time: 0.0390 data: 0.0009 max mem: 33300 Test: [12900/21770] eta: 0:05:43 time: 0.0395 data: 0.0009 max mem: 33300 Test: [13000/21770] eta: 0:05:40 time: 0.0390 data: 0.0009 max mem: 33300 Test: [13100/21770] eta: 0:05:36 time: 0.0392 data: 0.0009 max mem: 33300 Test: [13200/21770] eta: 0:05:32 time: 0.0390 data: 0.0009 max mem: 33300 Test: [13300/21770] eta: 0:05:28 time: 0.0391 data: 0.0009 max mem: 33300 Test: [13400/21770] eta: 0:05:24 time: 0.0390 data: 0.0008 max mem: 33300 Test: [13500/21770] eta: 0:05:20 time: 0.0391 data: 0.0008 max mem: 33300 Test: [13600/21770] eta: 0:05:16 time: 0.0392 data: 0.0008 max mem: 33300 Test: [13700/21770] eta: 0:05:13 time: 0.0391 data: 0.0008 max mem: 33300 Test: [13800/21770] eta: 0:05:09 time: 0.0390 data: 0.0008 max mem: 33300 Test: [13900/21770] eta: 0:05:05 time: 0.0392 data: 0.0008 max mem: 33300 Test: [14000/21770] eta: 0:05:01 time: 0.0392 data: 0.0008 max mem: 33300 Test: [14100/21770] eta: 0:04:57 time: 0.0395 data: 0.0008 max mem: 33300 Test: [14200/21770] eta: 0:04:53 time: 0.0393 data: 0.0009 max mem: 33300 Test: [14300/21770] eta: 0:04:49 time: 0.0393 data: 0.0009 max mem: 33300 Test: [14400/21770] eta: 0:04:46 time: 0.0394 data: 0.0009 max mem: 33300 Test: [14500/21770] eta: 0:04:42 time: 0.0401 data: 0.0009 max mem: 33300 Test: [14600/21770] eta: 0:04:38 time: 0.0392 data: 0.0009 max mem: 33300 Test: [14700/21770] eta: 0:04:34 time: 0.0396 data: 0.0009 max mem: 33300 Test: [14800/21770] eta: 0:04:30 time: 0.0394 data: 0.0008 max mem: 33300 Test: [14900/21770] eta: 0:04:26 
time: 0.0397 data: 0.0008 max mem: 33300 Test: [15000/21770] eta: 0:04:22 time: 0.0390 data: 0.0008 max mem: 33300 Test: [15100/21770] eta: 0:04:19 time: 0.0401 data: 0.0008 max mem: 33300 Test: [15200/21770] eta: 0:04:15 time: 0.0398 data: 0.0008 max mem: 33300 Test: [15300/21770] eta: 0:04:11 time: 0.0401 data: 0.0008 max mem: 33300 Test: [15400/21770] eta: 0:04:07 time: 0.0392 data: 0.0008 max mem: 33300 Test: [15500/21770] eta: 0:04:03 time: 0.0396 data: 0.0008 max mem: 33300 Test: [15600/21770] eta: 0:03:59 time: 0.0395 data: 0.0009 max mem: 33300 Test: [15700/21770] eta: 0:03:56 time: 0.0401 data: 0.0008 max mem: 33300 Test: [15800/21770] eta: 0:03:52 time: 0.0398 data: 0.0008 max mem: 33300 Test: [15900/21770] eta: 0:03:48 time: 0.0401 data: 0.0008 max mem: 33300 Test: [16000/21770] eta: 0:03:44 time: 0.0399 data: 0.0008 max mem: 33300 Test: [16100/21770] eta: 0:03:40 time: 0.0401 data: 0.0008 max mem: 33300 Test: [16200/21770] eta: 0:03:36 time: 0.0390 data: 0.0008 max mem: 33300 Test: [16300/21770] eta: 0:03:32 time: 0.0396 data: 0.0008 max mem: 33300 Test: [16400/21770] eta: 0:03:28 time: 0.0396 data: 0.0008 max mem: 33300 Test: [16500/21770] eta: 0:03:25 time: 0.0399 data: 0.0008 max mem: 33300 Test: [16600/21770] eta: 0:03:21 time: 0.0398 data: 0.0008 max mem: 33300 Test: [16700/21770] eta: 0:03:17 time: 0.0395 data: 0.0008 max mem: 33300 Test: [16800/21770] eta: 0:03:13 time: 0.0398 data: 0.0008 max mem: 33300 Test: [16900/21770] eta: 0:03:09 time: 0.0396 data: 0.0008 max mem: 33300 Test: [17000/21770] eta: 0:03:05 time: 0.0398 data: 0.0008 max mem: 33300 Test: [17100/21770] eta: 0:03:01 time: 0.0401 data: 0.0008 max mem: 33300 Test: [17200/21770] eta: 0:02:58 time: 0.0395 data: 0.0008 max mem: 33300 Test: [17300/21770] eta: 0:02:54 time: 0.0401 data: 0.0008 max mem: 33300 Test: [17400/21770] eta: 0:02:50 time: 0.0393 data: 0.0008 max mem: 33300 Test: [17500/21770] eta: 0:02:46 time: 0.0380 data: 0.0009 max mem: 33300 Test: [17600/21770] eta: 0:02:42 
time: 0.0394 data: 0.0008 max mem: 33300 Test: [17700/21770] eta: 0:02:38 time: 0.0393 data: 0.0008 max mem: 33300 Test: [17800/21770] eta: 0:02:34 time: 0.0400 data: 0.0008 max mem: 33300 Test: [17900/21770] eta: 0:02:30 time: 0.0399 data: 0.0008 max mem: 33300 Test: [18000/21770] eta: 0:02:26 time: 0.0399 data: 0.0008 max mem: 33300 Test: [18100/21770] eta: 0:02:23 time: 0.0400 data: 0.0008 max mem: 33300 Test: [18200/21770] eta: 0:02:19 time: 0.0399 data: 0.0008 max mem: 33300 Test: [18300/21770] eta: 0:02:15 time: 0.0400 data: 0.0008 max mem: 33300 Test: [18400/21770] eta: 0:02:11 time: 0.0399 data: 0.0008 max mem: 33300 Test: [18500/21770] eta: 0:02:07 time: 0.0400 data: 0.0008 max mem: 33300 Test: [18600/21770] eta: 0:02:03 time: 0.0399 data: 0.0008 max mem: 33300 Test: [18700/21770] eta: 0:01:59 time: 0.0401 data: 0.0008 max mem: 33300 Test: [18800/21770] eta: 0:01:55 time: 0.0399 data: 0.0008 max mem: 33300 Test: [18900/21770] eta: 0:01:52 time: 0.0400 data: 0.0008 max mem: 33300 Test: [19000/21770] eta: 0:01:48 time: 0.0400 data: 0.0008 max mem: 33300 Test: [19100/21770] eta: 0:01:44 time: 0.0401 data: 0.0008 max mem: 33300 Test: [19200/21770] eta: 0:01:40 time: 0.0399 data: 0.0008 max mem: 33300 Test: [19300/21770] eta: 0:01:36 time: 0.0400 data: 0.0008 max mem: 33300 Test: [19400/21770] eta: 0:01:32 time: 0.0400 data: 0.0008 max mem: 33300 Test: [19500/21770] eta: 0:01:28 time: 0.0401 data: 0.0008 max mem: 33300 Test: [19600/21770] eta: 0:01:24 time: 0.0399 data: 0.0008 max mem: 33300 Test: [19700/21770] eta: 0:01:20 time: 0.0401 data: 0.0008 max mem: 33300 Test: [19800/21770] eta: 0:01:16 time: 0.0399 data: 0.0008 max mem: 33300 Test: [19900/21770] eta: 0:01:13 time: 0.0401 data: 0.0008 max mem: 33300 Test: [20000/21770] eta: 0:01:09 time: 0.0399 data: 0.0008 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0400 data: 0.0008 max mem: 33300 Test: [20200/21770] eta: 0:01:01 time: 0.0399 data: 0.0008 max mem: 33300 Test: [20300/21770] eta: 0:00:57 
time: 0.0401 data: 0.0008 max mem: 33300
Test: [20400/21770] eta: 0:00:53 time: 0.0399 data: 0.0008 max mem: 33300
Test: [20500/21770] eta: 0:00:49 time: 0.0400 data: 0.0008 max mem: 33300
Test: [20600/21770] eta: 0:00:45 time: 0.0399 data: 0.0008 max mem: 33300
Test: [20700/21770] eta: 0:00:41 time: 0.0401 data: 0.0008 max mem: 33300
Test: [20800/21770] eta: 0:00:37 time: 0.0399 data: 0.0008 max mem: 33300
Test: [20900/21770] eta: 0:00:34 time: 0.0402 data: 0.0008 max mem: 33300
Test: [21000/21770] eta: 0:00:30 time: 0.0399 data: 0.0008 max mem: 33300
Test: [21100/21770] eta: 0:00:26 time: 0.0402 data: 0.0008 max mem: 33300
Test: [21200/21770] eta: 0:00:22 time: 0.0399 data: 0.0008 max mem: 33300
Test: [21300/21770] eta: 0:00:18 time: 0.0401 data: 0.0008 max mem: 33300
Test: [21400/21770] eta: 0:00:14 time: 0.0400 data: 0.0008 max mem: 33300
Test: [21500/21770] eta: 0:00:10 time: 0.0400 data: 0.0008 max mem: 33300
Test: [21600/21770] eta: 0:00:06 time: 0.0399 data: 0.0008 max mem: 33300
Test: [21700/21770] eta: 0:00:02 time: 0.0400 data: 0.0008 max mem: 33300
Test: Total time: 0:14:12
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [21] [ 0/4276] eta: 6:08:33 lr: 2.5585236864699295e-05 loss: 0.1035 (0.1035) time: 5.1715 data: 2.0479 max mem: 33300
Epoch: [21] [ 10/4276] eta: 3:40:35 lr: 2.558240254606265e-05 loss: 0.1102 (0.1162) time: 3.1027 data: 0.1926 max mem: 33300
Epoch: [21] [ 20/4276] eta: 3:33:20 lr: 2.5579568192534574e-05 loss: 0.1061 (0.1200) time: 2.8993 data: 0.0074 max mem: 33300
Epoch: [21] [ 30/4276] eta: 3:30:22 lr: 2.557673380411033e-05 loss: 0.1061 (0.1201) time: 2.9015 data: 0.0078 max mem: 33300
Epoch: [21] [ 40/4276] eta: 3:29:43 lr: 2.55738993807852e-05 loss: 0.1140 (0.1187) time: 2.9316 data: 0.0084 max mem: 33300
Epoch: [21] [ 
50/4276] eta: 3:28:45 lr: 2.557106492255445e-05 loss: 0.1097 (0.1144) time: 2.9500 data: 0.0087 max mem: 33300 Epoch: [21] [ 60/4276] eta: 3:27:53 lr: 2.556823042941336e-05 loss: 0.0982 (0.1124) time: 2.9342 data: 0.0080 max mem: 33300 Epoch: [21] [ 70/4276] eta: 3:27:10 lr: 2.5565395901357186e-05 loss: 0.1031 (0.1118) time: 2.9339 data: 0.0075 max mem: 33300 Epoch: [21] [ 80/4276] eta: 3:26:29 lr: 2.5562561338381207e-05 loss: 0.1159 (0.1141) time: 2.9350 data: 0.0077 max mem: 33300 Epoch: [21] [ 90/4276] eta: 3:25:50 lr: 2.5559726740480688e-05 loss: 0.1159 (0.1136) time: 2.9326 data: 0.0078 max mem: 33300 Epoch: [21] [ 100/4276] eta: 3:25:15 lr: 2.5556892107650905e-05 loss: 0.1080 (0.1147) time: 2.9340 data: 0.0076 max mem: 33300 Epoch: [21] [ 110/4276] eta: 3:24:38 lr: 2.5554057439887102e-05 loss: 0.1169 (0.1151) time: 2.9335 data: 0.0075 max mem: 33300 Epoch: [21] [ 120/4276] eta: 3:24:03 lr: 2.5551222737184555e-05 loss: 0.1122 (0.1150) time: 2.9306 data: 0.0078 max mem: 33300 Epoch: [21] [ 130/4276] eta: 3:23:27 lr: 2.554838799953852e-05 loss: 0.1142 (0.1153) time: 2.9278 data: 0.0078 max mem: 33300 Epoch: [21] [ 140/4276] eta: 3:22:51 lr: 2.5545553226944273e-05 loss: 0.1080 (0.1147) time: 2.9238 data: 0.0075 max mem: 33300 Epoch: [21] [ 150/4276] eta: 3:22:15 lr: 2.554271841939706e-05 loss: 0.1001 (0.1144) time: 2.9208 data: 0.0075 max mem: 33300 Epoch: [21] [ 160/4276] eta: 3:21:37 lr: 2.5539883576892144e-05 loss: 0.1051 (0.1143) time: 2.9124 data: 0.0078 max mem: 33300 Epoch: [21] [ 170/4276] eta: 3:21:06 lr: 2.5537048699424782e-05 loss: 0.1119 (0.1142) time: 2.9190 data: 0.0080 max mem: 33300 Epoch: [21] [ 180/4276] eta: 3:20:34 lr: 2.553421378699023e-05 loss: 0.1132 (0.1144) time: 2.9299 data: 0.0077 max mem: 33300 Epoch: [21] [ 190/4276] eta: 3:20:03 lr: 2.5531378839583747e-05 loss: 0.1036 (0.1141) time: 2.9289 data: 0.0072 max mem: 33300 Epoch: [21] [ 200/4276] eta: 3:19:31 lr: 2.5528543857200577e-05 loss: 0.0971 (0.1142) time: 2.9289 data: 0.0072 max 
mem: 33300 Epoch: [21] [ 210/4276] eta: 3:19:00 lr: 2.5525708839835983e-05 loss: 0.1075 (0.1139) time: 2.9283 data: 0.0072 max mem: 33300 Epoch: [21] [ 220/4276] eta: 3:18:30 lr: 2.5522873787485212e-05 loss: 0.1028 (0.1136) time: 2.9290 data: 0.0072 max mem: 33300 Epoch: [21] [ 230/4276] eta: 3:17:59 lr: 2.5520038700143513e-05 loss: 0.0994 (0.1131) time: 2.9309 data: 0.0073 max mem: 33300 Epoch: [21] [ 240/4276] eta: 3:17:30 lr: 2.5517203577806132e-05 loss: 0.1073 (0.1135) time: 2.9336 data: 0.0074 max mem: 33300 Epoch: [21] [ 250/4276] eta: 3:17:00 lr: 2.5514368420468327e-05 loss: 0.1166 (0.1141) time: 2.9323 data: 0.0072 max mem: 33300 Epoch: [21] [ 260/4276] eta: 3:16:29 lr: 2.551153322812534e-05 loss: 0.1171 (0.1141) time: 2.9285 data: 0.0071 max mem: 33300 Epoch: [21] [ 270/4276] eta: 3:15:58 lr: 2.550869800077241e-05 loss: 0.1037 (0.1140) time: 2.9274 data: 0.0071 max mem: 33300 Epoch: [21] [ 280/4276] eta: 3:15:28 lr: 2.5505862738404785e-05 loss: 0.1013 (0.1138) time: 2.9290 data: 0.0074 max mem: 33300 Epoch: [21] [ 290/4276] eta: 3:14:58 lr: 2.5503027441017713e-05 loss: 0.1100 (0.1138) time: 2.9304 data: 0.0074 max mem: 33300 Epoch: [21] [ 300/4276] eta: 3:14:28 lr: 2.5500192108606424e-05 loss: 0.1082 (0.1136) time: 2.9305 data: 0.0076 max mem: 33300 Epoch: [21] [ 310/4276] eta: 3:13:58 lr: 2.549735674116617e-05 loss: 0.0987 (0.1133) time: 2.9301 data: 0.0076 max mem: 33300 Epoch: [21] [ 320/4276] eta: 3:13:29 lr: 2.5494521338692177e-05 loss: 0.1085 (0.1134) time: 2.9308 data: 0.0072 max mem: 33300 Epoch: [21] [ 330/4276] eta: 3:12:59 lr: 2.5491685901179702e-05 loss: 0.1169 (0.1138) time: 2.9302 data: 0.0072 max mem: 33300 Epoch: [21] [ 340/4276] eta: 3:12:29 lr: 2.5488850428623966e-05 loss: 0.1076 (0.1134) time: 2.9305 data: 0.0072 max mem: 33300 Epoch: [21] [ 350/4276] eta: 3:12:00 lr: 2.5486014921020202e-05 loss: 0.0943 (0.1133) time: 2.9346 data: 0.0072 max mem: 33300 Epoch: [21] [ 360/4276] eta: 3:11:31 lr: 2.5483179378363657e-05 loss: 0.1163 (0.1140) 
time: 2.9358 data: 0.0072 max mem: 33300 Epoch: [21] [ 370/4276] eta: 3:11:01 lr: 2.548034380064956e-05 loss: 0.1100 (0.1138) time: 2.9336 data: 0.0072 max mem: 33300 Epoch: [21] [ 380/4276] eta: 3:10:32 lr: 2.5477508187873134e-05 loss: 0.1017 (0.1140) time: 2.9329 data: 0.0072 max mem: 33300 Epoch: [21] [ 390/4276] eta: 3:10:04 lr: 2.5474672540029615e-05 loss: 0.1248 (0.1146) time: 2.9441 data: 0.0073 max mem: 33300 Epoch: [21] [ 400/4276] eta: 3:09:35 lr: 2.5471836857114233e-05 loss: 0.1290 (0.1152) time: 2.9445 data: 0.0073 max mem: 33300 Epoch: [21] [ 410/4276] eta: 3:09:05 lr: 2.5469001139122223e-05 loss: 0.1188 (0.1153) time: 2.9326 data: 0.0072 max mem: 33300 Epoch: [21] [ 420/4276] eta: 3:08:36 lr: 2.5466165386048796e-05 loss: 0.1091 (0.1153) time: 2.9340 data: 0.0072 max mem: 33300 Epoch: [21] [ 430/4276] eta: 3:08:07 lr: 2.5463329597889187e-05 loss: 0.1165 (0.1156) time: 2.9374 data: 0.0071 max mem: 33300 Epoch: [21] [ 440/4276] eta: 3:07:38 lr: 2.5460493774638615e-05 loss: 0.1097 (0.1154) time: 2.9377 data: 0.0071 max mem: 33300 Epoch: [21] [ 450/4276] eta: 3:07:09 lr: 2.5457657916292315e-05 loss: 0.1049 (0.1155) time: 2.9362 data: 0.0072 max mem: 33300 Epoch: [21] [ 460/4276] eta: 3:06:39 lr: 2.5454822022845494e-05 loss: 0.1098 (0.1153) time: 2.9343 data: 0.0072 max mem: 33300 Epoch: [21] [ 470/4276] eta: 3:06:10 lr: 2.5451986094293373e-05 loss: 0.0976 (0.1149) time: 2.9340 data: 0.0072 max mem: 33300 Epoch: [21] [ 480/4276] eta: 3:05:41 lr: 2.5449150130631183e-05 loss: 0.0976 (0.1147) time: 2.9413 data: 0.0072 max mem: 33300 Epoch: [21] [ 490/4276] eta: 3:05:12 lr: 2.544631413185414e-05 loss: 0.0979 (0.1143) time: 2.9408 data: 0.0072 max mem: 33300 Epoch: [21] [ 500/4276] eta: 3:04:42 lr: 2.544347809795745e-05 loss: 0.0967 (0.1140) time: 2.9336 data: 0.0073 max mem: 33300 Epoch: [21] [ 510/4276] eta: 3:04:13 lr: 2.5440642028936336e-05 loss: 0.1068 (0.1140) time: 2.9360 data: 0.0074 max mem: 33300 Epoch: [21] [ 520/4276] eta: 3:03:44 lr: 
2.543780592478601e-05 loss: 0.1225 (0.1142) time: 2.9347 data: 0.0074 max mem: 33300 Epoch: [21] [ 530/4276] eta: 3:03:14 lr: 2.5434969785501694e-05 loss: 0.1134 (0.1142) time: 2.9338 data: 0.0074 max mem: 33300 Epoch: [21] [ 540/4276] eta: 3:02:45 lr: 2.5432133611078584e-05 loss: 0.1049 (0.1140) time: 2.9343 data: 0.0074 max mem: 33300 Epoch: [21] [ 550/4276] eta: 3:02:16 lr: 2.54292974015119e-05 loss: 0.1049 (0.1142) time: 2.9398 data: 0.0071 max mem: 33300 Epoch: [21] [ 560/4276] eta: 3:01:47 lr: 2.5426461156796845e-05 loss: 0.1126 (0.1142) time: 2.9402 data: 0.0072 max mem: 33300 Epoch: [21] [ 570/4276] eta: 3:01:18 lr: 2.5423624876928637e-05 loss: 0.1264 (0.1143) time: 2.9349 data: 0.0072 max mem: 33300 Epoch: [21] [ 580/4276] eta: 3:00:49 lr: 2.5420788561902476e-05 loss: 0.1093 (0.1143) time: 2.9411 data: 0.0072 max mem: 33300 Epoch: [21] [ 590/4276] eta: 3:00:20 lr: 2.5417952211713563e-05 loss: 0.0984 (0.1140) time: 2.9418 data: 0.0072 max mem: 33300 Epoch: [21] [ 600/4276] eta: 2:59:50 lr: 2.541511582635712e-05 loss: 0.0941 (0.1138) time: 2.9336 data: 0.0073 max mem: 33300 Epoch: [21] [ 610/4276] eta: 2:59:20 lr: 2.5412279405828325e-05 loss: 0.0963 (0.1138) time: 2.9292 data: 0.0072 max mem: 33300 Epoch: [21] [ 620/4276] eta: 2:58:51 lr: 2.5409442950122403e-05 loss: 0.0952 (0.1136) time: 2.9319 data: 0.0072 max mem: 33300 Epoch: [21] [ 630/4276] eta: 2:58:22 lr: 2.5406606459234533e-05 loss: 0.1035 (0.1136) time: 2.9371 data: 0.0074 max mem: 33300 Epoch: [21] [ 640/4276] eta: 2:57:52 lr: 2.540376993315994e-05 loss: 0.1134 (0.1137) time: 2.9377 data: 0.0074 max mem: 33300 Epoch: [21] [ 650/4276] eta: 2:57:23 lr: 2.5400933371893793e-05 loss: 0.1076 (0.1137) time: 2.9386 data: 0.0074 max mem: 33300 Epoch: [21] [ 660/4276] eta: 2:56:54 lr: 2.5398096775431306e-05 loss: 0.1132 (0.1139) time: 2.9412 data: 0.0074 max mem: 33300 Epoch: [21] [ 670/4276] eta: 2:56:25 lr: 2.5395260143767672e-05 loss: 0.1152 (0.1139) time: 2.9396 data: 0.0072 max mem: 33300 Epoch: [21] [ 
680/4276] eta: 2:55:56 lr: 2.5392423476898097e-05 loss: 0.1140 (0.1140) time: 2.9374 data: 0.0073 max mem: 33300 Epoch: [21] [ 690/4276] eta: 2:55:27 lr: 2.5389586774817747e-05 loss: 0.1078 (0.1139) time: 2.9374 data: 0.0073 max mem: 33300 Epoch: [21] [ 700/4276] eta: 2:54:57 lr: 2.5386750037521834e-05 loss: 0.1020 (0.1137) time: 2.9377 data: 0.0072 max mem: 33300 Epoch: [21] [ 710/4276] eta: 2:54:28 lr: 2.5383913265005538e-05 loss: 0.1058 (0.1139) time: 2.9410 data: 0.0072 max mem: 33300 Epoch: [21] [ 720/4276] eta: 2:53:59 lr: 2.5381076457264063e-05 loss: 0.1115 (0.1139) time: 2.9376 data: 0.0072 max mem: 33300 Epoch: [21] [ 730/4276] eta: 2:53:27 lr: 2.5378239614292576e-05 loss: 0.1074 (0.1138) time: 2.9055 data: 0.0081 max mem: 33300 Epoch: [21] [ 740/4276] eta: 2:52:57 lr: 2.5375402736086278e-05 loss: 0.1027 (0.1138) time: 2.9071 data: 0.0092 max mem: 33300 Epoch: [21] [ 750/4276] eta: 2:52:28 lr: 2.5372565822640353e-05 loss: 0.1081 (0.1138) time: 2.9337 data: 0.0089 max mem: 33300 Epoch: [21] [ 760/4276] eta: 2:51:59 lr: 2.536972887394999e-05 loss: 0.1007 (0.1137) time: 2.9344 data: 0.0084 max mem: 33300 Epoch: [21] [ 770/4276] eta: 2:51:29 lr: 2.5366891890010357e-05 loss: 0.1024 (0.1137) time: 2.9346 data: 0.0082 max mem: 33300 Epoch: [21] [ 780/4276] eta: 2:51:00 lr: 2.5364054870816644e-05 loss: 0.1082 (0.1136) time: 2.9364 data: 0.0079 max mem: 33300 Epoch: [21] [ 790/4276] eta: 2:50:31 lr: 2.536121781636403e-05 loss: 0.1123 (0.1137) time: 2.9442 data: 0.0082 max mem: 33300 Epoch: [21] [ 800/4276] eta: 2:50:02 lr: 2.5358380726647708e-05 loss: 0.1025 (0.1136) time: 2.9429 data: 0.0084 max mem: 33300 Epoch: [21] [ 810/4276] eta: 2:49:33 lr: 2.535554360166283e-05 loss: 0.1009 (0.1138) time: 2.9357 data: 0.0082 max mem: 33300 Epoch: [21] [ 820/4276] eta: 2:49:04 lr: 2.5352706441404588e-05 loss: 0.1002 (0.1137) time: 2.9371 data: 0.0079 max mem: 33300 Epoch: [21] [ 830/4276] eta: 2:48:34 lr: 2.5349869245868157e-05 loss: 0.0978 (0.1138) time: 2.9397 data: 0.0081 
max mem: 33300 Epoch: [21] [ 840/4276] eta: 2:48:05 lr: 2.5347032015048717e-05 loss: 0.1131 (0.1139) time: 2.9406 data: 0.0084 max mem: 33300 Epoch: [21] [ 850/4276] eta: 2:47:36 lr: 2.5344194748941423e-05 loss: 0.1072 (0.1137) time: 2.9404 data: 0.0081 max mem: 33300 Epoch: [21] [ 860/4276] eta: 2:47:07 lr: 2.5341357447541465e-05 loss: 0.1030 (0.1138) time: 2.9387 data: 0.0079 max mem: 33300 Epoch: [21] [ 870/4276] eta: 2:46:37 lr: 2.5338520110843998e-05 loss: 0.1066 (0.1138) time: 2.9332 data: 0.0086 max mem: 33300 Epoch: [21] [ 880/4276] eta: 2:46:06 lr: 2.5335682738844207e-05 loss: 0.1098 (0.1139) time: 2.9066 data: 0.0088 max mem: 33300 Epoch: [21] [ 890/4276] eta: 2:45:37 lr: 2.5332845331537246e-05 loss: 0.1210 (0.1140) time: 2.9150 data: 0.0081 max mem: 33300 Epoch: [21] [ 900/4276] eta: 2:45:08 lr: 2.5330007888918288e-05 loss: 0.1115 (0.1139) time: 2.9403 data: 0.0085 max mem: 33300 Epoch: [21] [ 910/4276] eta: 2:44:38 lr: 2.5327170410982505e-05 loss: 0.1157 (0.1140) time: 2.9375 data: 0.0093 max mem: 33300 Epoch: [21] [ 920/4276] eta: 2:44:09 lr: 2.5324332897725045e-05 loss: 0.1166 (0.1140) time: 2.9386 data: 0.0089 max mem: 33300 Epoch: [21] [ 930/4276] eta: 2:43:40 lr: 2.5321495349141076e-05 loss: 0.1135 (0.1140) time: 2.9324 data: 0.0083 max mem: 33300 Epoch: [21] [ 940/4276] eta: 2:43:10 lr: 2.5318657765225768e-05 loss: 0.1131 (0.1140) time: 2.9320 data: 0.0083 max mem: 33300 Epoch: [21] [ 950/4276] eta: 2:42:41 lr: 2.5315820145974277e-05 loss: 0.1131 (0.1142) time: 2.9360 data: 0.0083 max mem: 33300 Epoch: [21] [ 960/4276] eta: 2:42:11 lr: 2.531298249138176e-05 loss: 0.1181 (0.1143) time: 2.9267 data: 0.0084 max mem: 33300 Epoch: [21] [ 970/4276] eta: 2:41:42 lr: 2.5310144801443375e-05 loss: 0.1148 (0.1143) time: 2.9279 data: 0.0082 max mem: 33300 Epoch: [21] [ 980/4276] eta: 2:41:13 lr: 2.530730707615428e-05 loss: 0.1083 (0.1142) time: 2.9379 data: 0.0080 max mem: 33300 Epoch: [21] [ 990/4276] eta: 2:40:43 lr: 2.5304469315509632e-05 loss: 0.1123 
(0.1142) time: 2.9393 data: 0.0081 max mem: 33300 Epoch: [21] [1000/4276] eta: 2:40:14 lr: 2.5301631519504575e-05 loss: 0.1123 (0.1143) time: 2.9353 data: 0.0083 max mem: 33300 Epoch: [21] [1010/4276] eta: 2:39:45 lr: 2.529879368813427e-05 loss: 0.1071 (0.1143) time: 2.9356 data: 0.0082 max mem: 33300 Epoch: [21] [1020/4276] eta: 2:39:16 lr: 2.529595582139387e-05 loss: 0.1080 (0.1143) time: 2.9426 data: 0.0080 max mem: 33300 Epoch: [21] [1030/4276] eta: 2:38:46 lr: 2.5293117919278525e-05 loss: 0.1080 (0.1144) time: 2.9395 data: 0.0082 max mem: 33300 Epoch: [21] [1040/4276] eta: 2:38:17 lr: 2.5290279981783377e-05 loss: 0.1117 (0.1143) time: 2.9356 data: 0.0084 max mem: 33300 Epoch: [21] [1050/4276] eta: 2:37:48 lr: 2.5287442008903573e-05 loss: 0.1086 (0.1144) time: 2.9350 data: 0.0082 max mem: 33300 Epoch: [21] [1060/4276] eta: 2:37:18 lr: 2.528460400063427e-05 loss: 0.1086 (0.1145) time: 2.9317 data: 0.0082 max mem: 33300 Epoch: [21] [1070/4276] eta: 2:36:49 lr: 2.528176595697061e-05 loss: 0.1155 (0.1147) time: 2.9336 data: 0.0084 max mem: 33300 Epoch: [21] [1080/4276] eta: 2:36:20 lr: 2.5278927877907725e-05 loss: 0.1148 (0.1148) time: 2.9369 data: 0.0084 max mem: 33300 Epoch: [21] [1090/4276] eta: 2:35:50 lr: 2.527608976344077e-05 loss: 0.1198 (0.1149) time: 2.9366 data: 0.0082 max mem: 33300 Epoch: [21] [1100/4276] eta: 2:35:21 lr: 2.527325161356488e-05 loss: 0.1178 (0.1150) time: 2.9324 data: 0.0079 max mem: 33300 Epoch: [21] [1110/4276] eta: 2:34:52 lr: 2.5270413428275206e-05 loss: 0.1129 (0.1150) time: 2.9379 data: 0.0079 max mem: 33300 Epoch: [21] [1120/4276] eta: 2:34:22 lr: 2.5267575207566874e-05 loss: 0.1102 (0.1151) time: 2.9401 data: 0.0078 max mem: 33300 Epoch: [21] [1130/4276] eta: 2:33:53 lr: 2.526473695143502e-05 loss: 0.1081 (0.1149) time: 2.9323 data: 0.0076 max mem: 33300 Epoch: [21] [1140/4276] eta: 2:33:23 lr: 2.5261898659874787e-05 loss: 0.1069 (0.1149) time: 2.9300 data: 0.0076 max mem: 33300 Epoch: [21] [1150/4276] eta: 2:32:54 lr: 
2.5259060332881313e-05 loss: 0.1128 (0.1149) time: 2.9318 data: 0.0078 max mem: 33300 Epoch: [21] [1160/4276] eta: 2:32:25 lr: 2.525622197044972e-05 loss: 0.1122 (0.1149) time: 2.9361 data: 0.0078 max mem: 33300 Epoch: [21] [1170/4276] eta: 2:31:55 lr: 2.525338357257515e-05 loss: 0.1122 (0.1150) time: 2.9331 data: 0.0078 max mem: 33300 Epoch: [21] [1180/4276] eta: 2:31:26 lr: 2.5250545139252736e-05 loss: 0.1116 (0.1149) time: 2.9313 data: 0.0078 max mem: 33300 Epoch: [21] [1190/4276] eta: 2:30:56 lr: 2.52477066704776e-05 loss: 0.0976 (0.1148) time: 2.9312 data: 0.0077 max mem: 33300 Epoch: [21] [1200/4276] eta: 2:30:27 lr: 2.524486816624487e-05 loss: 0.0976 (0.1148) time: 2.9308 data: 0.0077 max mem: 33300 Epoch: [21] [1210/4276] eta: 2:29:58 lr: 2.524202962654968e-05 loss: 0.1058 (0.1147) time: 2.9343 data: 0.0076 max mem: 33300 Epoch: [21] [1220/4276] eta: 2:29:28 lr: 2.523919105138715e-05 loss: 0.1058 (0.1147) time: 2.9350 data: 0.0076 max mem: 33300 Epoch: [21] [1230/4276] eta: 2:28:59 lr: 2.5236352440752408e-05 loss: 0.1118 (0.1148) time: 2.9322 data: 0.0078 max mem: 33300 Epoch: [21] [1240/4276] eta: 2:28:29 lr: 2.523351379464058e-05 loss: 0.1095 (0.1147) time: 2.9309 data: 0.0078 max mem: 33300 Epoch: [21] [1250/4276] eta: 2:28:00 lr: 2.5230675113046776e-05 loss: 0.1095 (0.1148) time: 2.9319 data: 0.0078 max mem: 33300 Epoch: [21] [1260/4276] eta: 2:27:31 lr: 2.5227836395966132e-05 loss: 0.1059 (0.1147) time: 2.9328 data: 0.0080 max mem: 33300 Epoch: [21] [1270/4276] eta: 2:27:01 lr: 2.522499764339376e-05 loss: 0.0993 (0.1146) time: 2.9318 data: 0.0081 max mem: 33300 Epoch: [21] [1280/4276] eta: 2:26:32 lr: 2.5222158855324772e-05 loss: 0.1059 (0.1146) time: 2.9322 data: 0.0083 max mem: 33300 Epoch: [21] [1290/4276] eta: 2:26:02 lr: 2.5219320031754295e-05 loss: 0.1126 (0.1146) time: 2.9331 data: 0.0081 max mem: 33300 Epoch: [21] [1300/4276] eta: 2:25:33 lr: 2.5216481172677447e-05 loss: 0.0981 (0.1145) time: 2.9325 data: 0.0079 max mem: 33300 Epoch: [21] 
[1310/4276] eta: 2:25:04 lr: 2.5213642278089333e-05 loss: 0.0950 (0.1144) time: 2.9331 data: 0.0081 max mem: 33300 Epoch: [21] [1320/4276] eta: 2:24:34 lr: 2.521080334798507e-05 loss: 0.1096 (0.1144) time: 2.9339 data: 0.0082 max mem: 33300 Epoch: [21] [1330/4276] eta: 2:24:05 lr: 2.5207964382359768e-05 loss: 0.1128 (0.1144) time: 2.9332 data: 0.0081 max mem: 33300 Epoch: [21] [1340/4276] eta: 2:23:35 lr: 2.5205125381208545e-05 loss: 0.1135 (0.1144) time: 2.9323 data: 0.0080 max mem: 33300 Epoch: [21] [1350/4276] eta: 2:23:06 lr: 2.5202286344526493e-05 loss: 0.1179 (0.1145) time: 2.9329 data: 0.0081 max mem: 33300 Epoch: [21] [1360/4276] eta: 2:22:37 lr: 2.519944727230874e-05 loss: 0.1217 (0.1145) time: 2.9343 data: 0.0083 max mem: 33300 Epoch: [21] [1370/4276] eta: 2:22:07 lr: 2.5196608164550377e-05 loss: 0.1108 (0.1145) time: 2.9340 data: 0.0082 max mem: 33300 Epoch: [21] [1380/4276] eta: 2:21:38 lr: 2.519376902124653e-05 loss: 0.1155 (0.1147) time: 2.9330 data: 0.0079 max mem: 33300 Epoch: [21] [1390/4276] eta: 2:21:09 lr: 2.5190929842392273e-05 loss: 0.1310 (0.1148) time: 2.9320 data: 0.0081 max mem: 33300 Epoch: [21] [1400/4276] eta: 2:20:39 lr: 2.5188090627982736e-05 loss: 0.1219 (0.1148) time: 2.9318 data: 0.0083 max mem: 33300 Epoch: [21] [1410/4276] eta: 2:20:10 lr: 2.5185251378013003e-05 loss: 0.1041 (0.1148) time: 2.9333 data: 0.0082 max mem: 33300 Epoch: [21] [1420/4276] eta: 2:19:40 lr: 2.5182412092478187e-05 loss: 0.1005 (0.1147) time: 2.9319 data: 0.0080 max mem: 33300 Epoch: [21] [1430/4276] eta: 2:19:11 lr: 2.5179572771373378e-05 loss: 0.1030 (0.1147) time: 2.9295 data: 0.0081 max mem: 33300 Epoch: [21] [1440/4276] eta: 2:18:41 lr: 2.5176733414693676e-05 loss: 0.1114 (0.1147) time: 2.9279 data: 0.0083 max mem: 33300 Epoch: [21] [1450/4276] eta: 2:18:12 lr: 2.5173894022434176e-05 loss: 0.1114 (0.1147) time: 2.9344 data: 0.0081 max mem: 33300 Epoch: [21] [1460/4276] eta: 2:17:43 lr: 2.517105459458998e-05 loss: 0.1031 (0.1148) time: 2.9322 data: 
0.0079 max mem: 33300 Epoch: [21] [1470/4276] eta: 2:17:13 lr: 2.516821513115617e-05 loss: 0.1006 (0.1147) time: 2.9206 data: 0.0081 max mem: 33300 Epoch: [21] [1480/4276] eta: 2:16:43 lr: 2.5165375632127854e-05 loss: 0.1009 (0.1147) time: 2.9211 data: 0.0083 max mem: 33300 Epoch: [21] [1490/4276] eta: 2:16:14 lr: 2.5162536097500106e-05 loss: 0.1009 (0.1146) time: 2.9269 data: 0.0081 max mem: 33300 Epoch: [21] [1500/4276] eta: 2:15:45 lr: 2.515969652726803e-05 loss: 0.1034 (0.1146) time: 2.9312 data: 0.0080 max mem: 33300 Epoch: [21] [1510/4276] eta: 2:15:16 lr: 2.5156856921426702e-05 loss: 0.1061 (0.1145) time: 2.9399 data: 0.0082 max mem: 33300 Epoch: [21] [1520/4276] eta: 2:14:46 lr: 2.515401727997122e-05 loss: 0.0995 (0.1145) time: 2.9399 data: 0.0083 max mem: 33300 Epoch: [21] [1530/4276] eta: 2:14:17 lr: 2.515117760289667e-05 loss: 0.1025 (0.1145) time: 2.9312 data: 0.0081 max mem: 33300 Epoch: [21] [1540/4276] eta: 2:13:47 lr: 2.5148337890198125e-05 loss: 0.1115 (0.1146) time: 2.9283 data: 0.0079 max mem: 33300 Epoch: [21] [1550/4276] eta: 2:13:18 lr: 2.5145498141870676e-05 loss: 0.1140 (0.1146) time: 2.9279 data: 0.0081 max mem: 33300 Epoch: [21] [1560/4276] eta: 2:12:49 lr: 2.514265835790941e-05 loss: 0.1080 (0.1145) time: 2.9316 data: 0.0083 max mem: 33300 Epoch: [21] [1570/4276] eta: 2:12:19 lr: 2.5139818538309406e-05 loss: 0.1067 (0.1145) time: 2.9299 data: 0.0082 max mem: 33300 Epoch: [21] [1580/4276] eta: 2:11:50 lr: 2.5136978683065736e-05 loss: 0.1024 (0.1143) time: 2.9283 data: 0.0080 max mem: 33300 Epoch: [21] [1590/4276] eta: 2:11:20 lr: 2.5134138792173484e-05 loss: 0.1048 (0.1144) time: 2.9297 data: 0.0081 max mem: 33300 Epoch: [21] [1600/4276] eta: 2:10:51 lr: 2.513129886562772e-05 loss: 0.1139 (0.1144) time: 2.9321 data: 0.0083 max mem: 33300 Epoch: [21] [1610/4276] eta: 2:10:22 lr: 2.512845890342354e-05 loss: 0.1087 (0.1144) time: 2.9345 data: 0.0081 max mem: 33300 Epoch: [21] [1620/4276] eta: 2:09:52 lr: 2.5125618905555993e-05 loss: 0.0986 
(0.1143) time: 2.9340 data: 0.0080 max mem: 33300 Epoch: [21] [1630/4276] eta: 2:09:23 lr: 2.5122778872020164e-05 loss: 0.1046 (0.1144) time: 2.9310 data: 0.0083 max mem: 33300 Epoch: [21] [1640/4276] eta: 2:08:53 lr: 2.5119938802811123e-05 loss: 0.1030 (0.1143) time: 2.9280 data: 0.0084 max mem: 33300 Epoch: [21] [1650/4276] eta: 2:08:24 lr: 2.5117098697923946e-05 loss: 0.0963 (0.1142) time: 2.9267 data: 0.0081 max mem: 33300 Epoch: [21] [1660/4276] eta: 2:07:54 lr: 2.5114258557353693e-05 loss: 0.0987 (0.1141) time: 2.9248 data: 0.0080 max mem: 33300 Epoch: [21] [1670/4276] eta: 2:07:25 lr: 2.511141838109543e-05 loss: 0.0915 (0.1140) time: 2.9206 data: 0.0081 max mem: 33300 Epoch: [21] [1680/4276] eta: 2:06:55 lr: 2.5108578169144238e-05 loss: 0.0960 (0.1141) time: 2.9208 data: 0.0083 max mem: 33300 Epoch: [21] [1690/4276] eta: 2:06:26 lr: 2.5105737921495177e-05 loss: 0.0978 (0.1140) time: 2.9279 data: 0.0081 max mem: 33300 Epoch: [21] [1700/4276] eta: 2:05:57 lr: 2.51028976381433e-05 loss: 0.0971 (0.1139) time: 2.9337 data: 0.0080 max mem: 33300 Epoch: [21] [1710/4276] eta: 2:05:27 lr: 2.510005731908368e-05 loss: 0.1088 (0.1139) time: 2.9299 data: 0.0086 max mem: 33300 Epoch: [21] [1720/4276] eta: 2:04:58 lr: 2.5097216964311372e-05 loss: 0.1061 (0.1138) time: 2.9292 data: 0.0087 max mem: 33300 Epoch: [21] [1730/4276] eta: 2:04:28 lr: 2.509437657382145e-05 loss: 0.1060 (0.1138) time: 2.9325 data: 0.0081 max mem: 33300 Epoch: [21] [1740/4276] eta: 2:03:59 lr: 2.5091536147608952e-05 loss: 0.1089 (0.1138) time: 2.9330 data: 0.0080 max mem: 33300 Epoch: [21] [1750/4276] eta: 2:03:30 lr: 2.508869568566895e-05 loss: 0.1116 (0.1138) time: 2.9337 data: 0.0083 max mem: 33300 Epoch: [21] [1760/4276] eta: 2:03:00 lr: 2.5085855187996493e-05 loss: 0.0956 (0.1137) time: 2.9319 data: 0.0085 max mem: 33300 Epoch: [21] [1770/4276] eta: 2:02:31 lr: 2.5083014654586647e-05 loss: 0.1025 (0.1137) time: 2.9350 data: 0.0082 max mem: 33300 Epoch: [21] [1780/4276] eta: 2:02:02 lr: 
2.508017408543445e-05 loss: 0.1019 (0.1136) time: 2.9362 data: 0.0080 max mem: 33300 Epoch: [21] [1790/4276] eta: 2:01:32 lr: 2.507733348053496e-05 loss: 0.0943 (0.1135) time: 2.9327 data: 0.0082 max mem: 33300 Epoch: [21] [1800/4276] eta: 2:01:03 lr: 2.507449283988323e-05 loss: 0.0970 (0.1135) time: 2.9334 data: 0.0083 max mem: 33300 Epoch: [21] [1810/4276] eta: 2:00:34 lr: 2.5071652163474314e-05 loss: 0.1023 (0.1135) time: 2.9327 data: 0.0081 max mem: 33300 Epoch: [21] [1820/4276] eta: 2:00:04 lr: 2.5068811451303247e-05 loss: 0.1114 (0.1135) time: 2.9159 data: 0.0082 max mem: 33300 Epoch: [21] [1830/4276] eta: 1:59:34 lr: 2.506597070336509e-05 loss: 0.1100 (0.1135) time: 2.8842 data: 0.0088 max mem: 33300 Epoch: [21] [1840/4276] eta: 1:59:04 lr: 2.506312991965489e-05 loss: 0.1009 (0.1134) time: 2.8940 data: 0.0092 max mem: 33300 Epoch: [21] [1850/4276] eta: 1:58:35 lr: 2.506028910016767e-05 loss: 0.0957 (0.1134) time: 2.9266 data: 0.0091 max mem: 33300 Epoch: [21] [1860/4276] eta: 1:58:06 lr: 2.5057448244898492e-05 loss: 0.1083 (0.1134) time: 2.9315 data: 0.0083 max mem: 33300 Epoch: [21] [1870/4276] eta: 1:57:36 lr: 2.505460735384239e-05 loss: 0.1083 (0.1134) time: 2.9305 data: 0.0081 max mem: 33300 Epoch: [21] [1880/4276] eta: 1:57:07 lr: 2.5051766426994418e-05 loss: 0.1074 (0.1134) time: 2.9310 data: 0.0083 max mem: 33300 Epoch: [21] [1890/4276] eta: 1:56:38 lr: 2.5048925464349593e-05 loss: 0.0996 (0.1134) time: 2.9326 data: 0.0081 max mem: 33300 Epoch: [21] [1900/4276] eta: 1:56:08 lr: 2.5046084465902968e-05 loss: 0.0988 (0.1134) time: 2.9308 data: 0.0078 max mem: 33300 Epoch: [21] [1910/4276] eta: 1:55:39 lr: 2.504324343164957e-05 loss: 0.0985 (0.1133) time: 2.9305 data: 0.0081 max mem: 33300 Epoch: [21] [1920/4276] eta: 1:55:09 lr: 2.5040402361584454e-05 loss: 0.0950 (0.1133) time: 2.9315 data: 0.0084 max mem: 33300 Epoch: [21] [1930/4276] eta: 1:54:40 lr: 2.5037561255702624e-05 loss: 0.0931 (0.1132) time: 2.9333 data: 0.0084 max mem: 33300 Epoch: [21] 
[1940/4276] eta: 1:54:11 lr: 2.5034720113999133e-05 loss: 0.0961 (0.1132) time: 2.9353 data: 0.0083 max mem: 33300 Epoch: [21] [1950/4276] eta: 1:53:42 lr: 2.503187893646901e-05 loss: 0.1052 (0.1132) time: 2.9334 data: 0.0083 max mem: 33300 Epoch: [21] [1960/4276] eta: 1:53:12 lr: 2.5029037723107285e-05 loss: 0.1020 (0.1131) time: 2.9327 data: 0.0083 max mem: 33300 Epoch: [21] [1970/4276] eta: 1:52:43 lr: 2.5026196473908974e-05 loss: 0.0921 (0.1130) time: 2.9319 data: 0.0081 max mem: 33300 Epoch: [21] [1980/4276] eta: 1:52:14 lr: 2.5023355188869118e-05 loss: 0.0940 (0.1130) time: 2.9325 data: 0.0079 max mem: 33300 Epoch: [21] [1990/4276] eta: 1:51:44 lr: 2.5020513867982736e-05 loss: 0.1104 (0.1131) time: 2.9310 data: 0.0081 max mem: 33300 Epoch: [21] [2000/4276] eta: 1:51:15 lr: 2.5017672511244865e-05 loss: 0.1104 (0.1131) time: 2.9329 data: 0.0084 max mem: 33300 Epoch: [21] [2010/4276] eta: 1:50:46 lr: 2.501483111865051e-05 loss: 0.1103 (0.1131) time: 2.9342 data: 0.0082 max mem: 33300 Epoch: [21] [2020/4276] eta: 1:50:16 lr: 2.5011989690194704e-05 loss: 0.1134 (0.1131) time: 2.9264 data: 0.0080 max mem: 33300 Epoch: [21] [2030/4276] eta: 1:49:46 lr: 2.5009148225872464e-05 loss: 0.0965 (0.1130) time: 2.9129 data: 0.0085 max mem: 33300 Epoch: [21] [2040/4276] eta: 1:49:17 lr: 2.500630672567882e-05 loss: 0.0904 (0.1130) time: 2.9191 data: 0.0092 max mem: 33300 Epoch: [21] [2050/4276] eta: 1:48:48 lr: 2.5003465189608765e-05 loss: 0.1029 (0.1130) time: 2.9329 data: 0.0091 max mem: 33300 Epoch: [21] [2060/4276] eta: 1:48:18 lr: 2.500062361765734e-05 loss: 0.1067 (0.1130) time: 2.9321 data: 0.0085 max mem: 33300 Epoch: [21] [2070/4276] eta: 1:47:49 lr: 2.499778200981955e-05 loss: 0.1054 (0.1130) time: 2.9197 data: 0.0086 max mem: 33300 Epoch: [21] [2080/4276] eta: 1:47:19 lr: 2.4994940366090418e-05 loss: 0.1054 (0.1131) time: 2.9199 data: 0.0088 max mem: 33300 Epoch: [21] [2090/4276] eta: 1:46:50 lr: 2.499209868646494e-05 loss: 0.1057 (0.1131) time: 2.9328 data: 0.0081 
max mem: 33300 Epoch: [21] [2100/4276] eta: 1:46:21 lr: 2.4989256970938137e-05 loss: 0.1128 (0.1131) time: 2.9328 data: 0.0076 max mem: 33300 Epoch: [21] [2110/4276] eta: 1:45:52 lr: 2.4986415219505027e-05 loss: 0.1086 (0.1131) time: 2.9325 data: 0.0077 max mem: 33300 Epoch: [21] [2120/4276] eta: 1:45:22 lr: 2.4983573432160603e-05 loss: 0.0980 (0.1130) time: 2.9317 data: 0.0077 max mem: 33300 Epoch: [21] [2130/4276] eta: 1:44:53 lr: 2.498073160889988e-05 loss: 0.0870 (0.1129) time: 2.9318 data: 0.0075 max mem: 33300 Epoch: [21] [2140/4276] eta: 1:44:24 lr: 2.4977889749717867e-05 loss: 0.1019 (0.1129) time: 2.9322 data: 0.0076 max mem: 33300 Epoch: [21] [2150/4276] eta: 1:43:54 lr: 2.497504785460957e-05 loss: 0.1071 (0.1128) time: 2.9377 data: 0.0078 max mem: 33300 Epoch: [21] [2160/4276] eta: 1:43:25 lr: 2.4972205923569985e-05 loss: 0.0940 (0.1128) time: 2.9377 data: 0.0077 max mem: 33300 Epoch: [21] [2170/4276] eta: 1:42:56 lr: 2.4969363956594115e-05 loss: 0.1067 (0.1128) time: 2.9323 data: 0.0075 max mem: 33300 Epoch: [21] [2180/4276] eta: 1:42:26 lr: 2.4966521953676967e-05 loss: 0.1067 (0.1127) time: 2.9323 data: 0.0073 max mem: 33300 Epoch: [21] [2190/4276] eta: 1:41:57 lr: 2.496367991481354e-05 loss: 0.1003 (0.1127) time: 2.9349 data: 0.0073 max mem: 33300 Epoch: [21] [2200/4276] eta: 1:41:28 lr: 2.4960837839998824e-05 loss: 0.1135 (0.1128) time: 2.9363 data: 0.0076 max mem: 33300 Epoch: [21] [2210/4276] eta: 1:40:58 lr: 2.4957995729227822e-05 loss: 0.1142 (0.1128) time: 2.9333 data: 0.0076 max mem: 33300 Epoch: [21] [2220/4276] eta: 1:40:29 lr: 2.4955153582495524e-05 loss: 0.1081 (0.1128) time: 2.9311 data: 0.0074 max mem: 33300 Epoch: [21] [2230/4276] eta: 1:40:00 lr: 2.4952311399796937e-05 loss: 0.1095 (0.1128) time: 2.9323 data: 0.0079 max mem: 33300 Epoch: [21] [2240/4276] eta: 1:39:30 lr: 2.4949469181127043e-05 loss: 0.0920 (0.1127) time: 2.9321 data: 0.0082 max mem: 33300 Epoch: [21] [2250/4276] eta: 1:39:01 lr: 2.4946626926480833e-05 loss: 0.0933 
(0.1127) time: 2.9311 data: 0.0077 max mem: 33300 Epoch: [21] [2260/4276] eta: 1:38:32 lr: 2.4943784635853298e-05 loss: 0.1110 (0.1127) time: 2.9332 data: 0.0076 max mem: 33300 Epoch: [21] [2270/4276] eta: 1:38:02 lr: 2.4940942309239435e-05 loss: 0.1023 (0.1127) time: 2.9342 data: 0.0078 max mem: 33300 Epoch: [21] [2280/4276] eta: 1:37:33 lr: 2.493809994663422e-05 loss: 0.1036 (0.1127) time: 2.9377 data: 0.0078 max mem: 33300 Epoch: [21] [2290/4276] eta: 1:37:04 lr: 2.493525754803265e-05 loss: 0.1050 (0.1127) time: 2.9364 data: 0.0075 max mem: 33300 Epoch: [21] [2300/4276] eta: 1:36:34 lr: 2.4932415113429695e-05 loss: 0.1036 (0.1126) time: 2.9314 data: 0.0075 max mem: 33300 Epoch: [21] [2310/4276] eta: 1:36:05 lr: 2.492957264282036e-05 loss: 0.1068 (0.1127) time: 2.9321 data: 0.0077 max mem: 33300 Epoch: [21] [2320/4276] eta: 1:35:36 lr: 2.4926730136199606e-05 loss: 0.1123 (0.1127) time: 2.9327 data: 0.0078 max mem: 33300 Epoch: [21] [2330/4276] eta: 1:35:07 lr: 2.492388759356243e-05 loss: 0.1215 (0.1128) time: 2.9341 data: 0.0077 max mem: 33300 Epoch: [21] [2340/4276] eta: 1:34:37 lr: 2.4921045014903796e-05 loss: 0.1145 (0.1128) time: 2.9343 data: 0.0075 max mem: 33300 Epoch: [21] [2350/4276] eta: 1:34:08 lr: 2.4918202400218697e-05 loss: 0.1113 (0.1128) time: 2.9345 data: 0.0077 max mem: 33300 Epoch: [21] [2360/4276] eta: 1:33:39 lr: 2.49153597495021e-05 loss: 0.1178 (0.1128) time: 2.9336 data: 0.0077 max mem: 33300 Epoch: [21] [2370/4276] eta: 1:33:09 lr: 2.4912517062748986e-05 loss: 0.1178 (0.1129) time: 2.9323 data: 0.0076 max mem: 33300 Epoch: [21] [2380/4276] eta: 1:32:40 lr: 2.4909674339954324e-05 loss: 0.1197 (0.1129) time: 2.9346 data: 0.0076 max mem: 33300 Epoch: [21] [2390/4276] eta: 1:32:11 lr: 2.4906831581113098e-05 loss: 0.1099 (0.1129) time: 2.9419 data: 0.0075 max mem: 33300 Epoch: [21] [2400/4276] eta: 1:31:41 lr: 2.4903988786220262e-05 loss: 0.1160 (0.1130) time: 2.9430 data: 0.0076 max mem: 33300 Epoch: [21] [2410/4276] eta: 1:31:12 lr: 
2.4901145955270796e-05 loss: 0.1124 (0.1129) time: 2.9347 data: 0.0076 max mem: 33300 Epoch: [21] [2420/4276] eta: 1:30:43 lr: 2.4898303088259674e-05 loss: 0.1016 (0.1129) time: 2.9317 data: 0.0077 max mem: 33300 Epoch: [21] [2430/4276] eta: 1:30:13 lr: 2.4895460185181853e-05 loss: 0.1112 (0.1130) time: 2.9307 data: 0.0079 max mem: 33300 Epoch: [21] [2440/4276] eta: 1:29:44 lr: 2.4892617246032303e-05 loss: 0.1136 (0.1129) time: 2.9304 data: 0.0077 max mem: 33300 Epoch: [21] [2450/4276] eta: 1:29:15 lr: 2.4889774270805994e-05 loss: 0.1062 (0.1129) time: 2.9325 data: 0.0075 max mem: 33300 Epoch: [21] [2460/4276] eta: 1:28:45 lr: 2.4886931259497883e-05 loss: 0.1174 (0.1130) time: 2.9348 data: 0.0076 max mem: 33300 Epoch: [21] [2470/4276] eta: 1:28:16 lr: 2.4884088212102933e-05 loss: 0.1203 (0.1130) time: 2.9387 data: 0.0078 max mem: 33300 Epoch: [21] [2480/4276] eta: 1:27:47 lr: 2.4881245128616103e-05 loss: 0.1203 (0.1131) time: 2.9368 data: 0.0079 max mem: 33300 Epoch: [21] [2490/4276] eta: 1:27:18 lr: 2.4878402009032357e-05 loss: 0.1066 (0.1130) time: 2.9309 data: 0.0076 max mem: 33300 Epoch: [21] [2500/4276] eta: 1:26:48 lr: 2.4875558853346658e-05 loss: 0.1057 (0.1131) time: 2.9207 data: 0.0074 max mem: 33300 Epoch: [21] [2510/4276] eta: 1:26:19 lr: 2.487271566155395e-05 loss: 0.1168 (0.1131) time: 2.9218 data: 0.0072 max mem: 33300 Epoch: [21] [2520/4276] eta: 1:25:49 lr: 2.486987243364919e-05 loss: 0.1026 (0.1130) time: 2.9322 data: 0.0072 max mem: 33300 Epoch: [21] [2530/4276] eta: 1:25:20 lr: 2.4867029169627343e-05 loss: 0.0880 (0.1130) time: 2.9318 data: 0.0072 max mem: 33300 Epoch: [21] [2540/4276] eta: 1:24:51 lr: 2.4864185869483356e-05 loss: 0.1028 (0.1130) time: 2.9333 data: 0.0072 max mem: 33300 Epoch: [21] [2550/4276] eta: 1:24:21 lr: 2.4861342533212174e-05 loss: 0.1001 (0.1129) time: 2.9343 data: 0.0072 max mem: 33300 Epoch: [21] [2560/4276] eta: 1:23:52 lr: 2.485849916080875e-05 loss: 0.0923 (0.1129) time: 2.9327 data: 0.0071 max mem: 33300 Epoch: [21] 
[2570/4276] eta: 1:23:23 lr: 2.485565575226804e-05 loss: 0.0879 (0.1128) time: 2.9328 data: 0.0072 max mem: 33300 Epoch: [21] [2580/4276] eta: 1:22:53 lr: 2.4852812307584988e-05 loss: 0.0906 (0.1128) time: 2.9346 data: 0.0072 max mem: 33300 Epoch: [21] [2590/4276] eta: 1:22:24 lr: 2.4849968826754535e-05 loss: 0.0980 (0.1127) time: 2.9329 data: 0.0072 max mem: 33300 Epoch: [21] [2600/4276] eta: 1:21:55 lr: 2.4847125309771627e-05 loss: 0.0980 (0.1127) time: 2.9378 data: 0.0072 max mem: 33300 Epoch: [21] [2610/4276] eta: 1:21:26 lr: 2.484428175663121e-05 loss: 0.0876 (0.1126) time: 2.9407 data: 0.0071 max mem: 33300 Epoch: [21] [2620/4276] eta: 1:20:56 lr: 2.4841438167328228e-05 loss: 0.0914 (0.1126) time: 2.9336 data: 0.0072 max mem: 33300 Epoch: [21] [2630/4276] eta: 1:20:27 lr: 2.4838594541857615e-05 loss: 0.0989 (0.1126) time: 2.9299 data: 0.0074 max mem: 33300 Epoch: [21] [2640/4276] eta: 1:19:58 lr: 2.483575088021431e-05 loss: 0.0991 (0.1125) time: 2.9313 data: 0.0073 max mem: 33300 Epoch: [21] [2650/4276] eta: 1:19:28 lr: 2.4832907182393258e-05 loss: 0.1004 (0.1125) time: 2.9379 data: 0.0072 max mem: 33300 Epoch: [21] [2660/4276] eta: 1:18:59 lr: 2.48300634483894e-05 loss: 0.1118 (0.1126) time: 2.9398 data: 0.0072 max mem: 33300 Epoch: [21] [2670/4276] eta: 1:18:30 lr: 2.482721967819765e-05 loss: 0.1215 (0.1126) time: 2.9363 data: 0.0075 max mem: 33300 Epoch: [21] [2680/4276] eta: 1:18:00 lr: 2.482437587181296e-05 loss: 0.1239 (0.1127) time: 2.9341 data: 0.0074 max mem: 33300 Epoch: [21] [2690/4276] eta: 1:17:31 lr: 2.4821532029230254e-05 loss: 0.1117 (0.1126) time: 2.9333 data: 0.0072 max mem: 33300 Epoch: [21] [2700/4276] eta: 1:17:02 lr: 2.4818688150444473e-05 loss: 0.1043 (0.1126) time: 2.9345 data: 0.0072 max mem: 33300 Epoch: [21] [2710/4276] eta: 1:16:32 lr: 2.4815844235450532e-05 loss: 0.1035 (0.1126) time: 2.9336 data: 0.0074 max mem: 33300 Epoch: [21] [2720/4276] eta: 1:16:03 lr: 2.481300028424337e-05 loss: 0.0917 (0.1126) time: 2.9325 data: 0.0077 
max mem: 33300 Epoch: [21] [2730/4276] eta: 1:15:34 lr: 2.4810156296817917e-05 loss: 0.1022 (0.1126) time: 2.9316 data: 0.0075 max mem: 33300 Epoch: [21] [2740/4276] eta: 1:15:04 lr: 2.4807312273169086e-05 loss: 0.1174 (0.1126) time: 2.9277 data: 0.0073 max mem: 33300 Epoch: [21] [2750/4276] eta: 1:14:35 lr: 2.480446821329181e-05 loss: 0.1186 (0.1127) time: 2.9303 data: 0.0075 max mem: 33300 Epoch: [21] [2760/4276] eta: 1:14:06 lr: 2.4801624117181007e-05 loss: 0.1124 (0.1127) time: 2.9355 data: 0.0077 max mem: 33300 Epoch: [21] [2770/4276] eta: 1:13:36 lr: 2.4798779984831608e-05 loss: 0.1098 (0.1127) time: 2.9351 data: 0.0078 max mem: 33300 Epoch: [21] [2780/4276] eta: 1:13:07 lr: 2.4795935816238522e-05 loss: 0.1092 (0.1127) time: 2.9360 data: 0.0076 max mem: 33300 Epoch: [21] [2790/4276] eta: 1:12:38 lr: 2.4793091611396672e-05 loss: 0.1187 (0.1128) time: 2.9352 data: 0.0076 max mem: 33300 Epoch: [21] [2800/4276] eta: 1:12:08 lr: 2.4790247370300975e-05 loss: 0.1110 (0.1127) time: 2.9334 data: 0.0077 max mem: 33300 Epoch: [21] [2810/4276] eta: 1:11:39 lr: 2.4787403092946352e-05 loss: 0.0913 (0.1126) time: 2.9360 data: 0.0077 max mem: 33300 Epoch: [21] [2820/4276] eta: 1:11:10 lr: 2.4784558779327715e-05 loss: 0.0913 (0.1126) time: 2.9377 data: 0.0076 max mem: 33300 Epoch: [21] [2830/4276] eta: 1:10:40 lr: 2.4781714429439973e-05 loss: 0.0993 (0.1126) time: 2.9372 data: 0.0076 max mem: 33300 Epoch: [21] [2840/4276] eta: 1:10:11 lr: 2.477887004327804e-05 loss: 0.1249 (0.1127) time: 2.9374 data: 0.0077 max mem: 33300 Epoch: [21] [2850/4276] eta: 1:09:42 lr: 2.4776025620836834e-05 loss: 0.1220 (0.1127) time: 2.9384 data: 0.0077 max mem: 33300 Epoch: [21] [2860/4276] eta: 1:09:13 lr: 2.4773181162111254e-05 loss: 0.1093 (0.1127) time: 2.9378 data: 0.0075 max mem: 33300 Epoch: [21] [2870/4276] eta: 1:08:43 lr: 2.477033666709621e-05 loss: 0.0986 (0.1127) time: 2.9351 data: 0.0075 max mem: 33300 Epoch: [21] [2880/4276] eta: 1:08:14 lr: 2.4767492135786613e-05 loss: 0.1033 
(0.1127) time: 2.9342 data: 0.0075 max mem: 33300 Epoch: [21] [2890/4276] eta: 1:07:45 lr: 2.476464756817737e-05 loss: 0.1033 (0.1127) time: 2.9357 data: 0.0075 max mem: 33300 Epoch: [21] [2900/4276] eta: 1:07:15 lr: 2.4761802964263377e-05 loss: 0.0967 (0.1126) time: 2.9355 data: 0.0078 max mem: 33300 Epoch: [21] [2910/4276] eta: 1:06:46 lr: 2.4758958324039538e-05 loss: 0.1027 (0.1126) time: 2.9358 data: 0.0078 max mem: 33300 Epoch: [21] [2920/4276] eta: 1:06:17 lr: 2.4756113647500757e-05 loss: 0.1052 (0.1126) time: 2.9352 data: 0.0077 max mem: 33300 Epoch: [21] [2930/4276] eta: 1:05:47 lr: 2.4753268934641943e-05 loss: 0.0987 (0.1126) time: 2.9335 data: 0.0077 max mem: 33300 Epoch: [21] [2940/4276] eta: 1:05:18 lr: 2.4750424185457977e-05 loss: 0.0956 (0.1126) time: 2.9315 data: 0.0076 max mem: 33300 Epoch: [21] [2950/4276] eta: 1:04:49 lr: 2.4747579399943763e-05 loss: 0.0991 (0.1126) time: 2.9264 data: 0.0077 max mem: 33300 Epoch: [21] [2960/4276] eta: 1:04:19 lr: 2.4744734578094196e-05 loss: 0.1093 (0.1126) time: 2.9297 data: 0.0079 max mem: 33300 Epoch: [21] [2970/4276] eta: 1:03:50 lr: 2.4741889719904175e-05 loss: 0.1116 (0.1126) time: 2.9378 data: 0.0077 max mem: 33300 Epoch: [21] [2980/4276] eta: 1:03:21 lr: 2.473904482536859e-05 loss: 0.1176 (0.1126) time: 2.9387 data: 0.0076 max mem: 33300 Epoch: [21] [2990/4276] eta: 1:02:51 lr: 2.473619989448233e-05 loss: 0.1087 (0.1126) time: 2.9327 data: 0.0075 max mem: 33300 Epoch: [21] [3000/4276] eta: 1:02:22 lr: 2.4733354927240286e-05 loss: 0.0955 (0.1126) time: 2.9289 data: 0.0077 max mem: 33300 Epoch: [21] [3010/4276] eta: 1:01:53 lr: 2.4730509923637354e-05 loss: 0.1061 (0.1126) time: 2.9313 data: 0.0078 max mem: 33300 Epoch: [21] [3020/4276] eta: 1:01:23 lr: 2.472766488366841e-05 loss: 0.1152 (0.1126) time: 2.9320 data: 0.0076 max mem: 33300 Epoch: [21] [3030/4276] eta: 1:00:54 lr: 2.4724819807328345e-05 loss: 0.1147 (0.1126) time: 2.9323 data: 0.0076 max mem: 33300 Epoch: [21] [3040/4276] eta: 1:00:25 lr: 
2.472197469461205e-05 loss: 0.1163 (0.1127) time: 2.9382 data: 0.0078 max mem: 33300 Epoch: [21] [3050/4276] eta: 0:59:55 lr: 2.4719129545514396e-05 loss: 0.1096 (0.1126) time: 2.9391 data: 0.0079 max mem: 33300 Epoch: [21] [3060/4276] eta: 0:59:26 lr: 2.4716284360030272e-05 loss: 0.0941 (0.1126) time: 2.9359 data: 0.0077 max mem: 33300 Epoch: [21] [3070/4276] eta: 0:58:57 lr: 2.471343913815456e-05 loss: 0.0989 (0.1126) time: 2.9370 data: 0.0075 max mem: 33300 Epoch: [21] [3080/4276] eta: 0:58:27 lr: 2.4710593879882136e-05 loss: 0.1075 (0.1125) time: 2.9370 data: 0.0077 max mem: 33300 Epoch: [21] [3090/4276] eta: 0:57:58 lr: 2.4707748585207882e-05 loss: 0.0986 (0.1125) time: 2.9338 data: 0.0078 max mem: 33300 Epoch: [21] [3100/4276] eta: 0:57:29 lr: 2.4704903254126665e-05 loss: 0.0974 (0.1124) time: 2.9327 data: 0.0076 max mem: 33300 Epoch: [21] [3110/4276] eta: 0:56:59 lr: 2.470205788663337e-05 loss: 0.0969 (0.1124) time: 2.9344 data: 0.0075 max mem: 33300 Epoch: [21] [3120/4276] eta: 0:56:30 lr: 2.4699212482722872e-05 loss: 0.1038 (0.1124) time: 2.9336 data: 0.0077 max mem: 33300 Epoch: [21] [3130/4276] eta: 0:56:01 lr: 2.469636704239003e-05 loss: 0.1096 (0.1124) time: 2.9322 data: 0.0078 max mem: 33300 Epoch: [21] [3140/4276] eta: 0:55:31 lr: 2.4693521565629728e-05 loss: 0.1108 (0.1124) time: 2.9324 data: 0.0076 max mem: 33300 Epoch: [21] [3150/4276] eta: 0:55:02 lr: 2.4690676052436826e-05 loss: 0.1149 (0.1124) time: 2.9326 data: 0.0076 max mem: 33300 Epoch: [21] [3160/4276] eta: 0:54:33 lr: 2.4687830502806204e-05 loss: 0.1134 (0.1124) time: 2.9323 data: 0.0078 max mem: 33300 Epoch: [21] [3170/4276] eta: 0:54:03 lr: 2.4684984916732718e-05 loss: 0.1065 (0.1124) time: 2.9371 data: 0.0077 max mem: 33300 Epoch: [21] [3180/4276] eta: 0:53:34 lr: 2.4682139294211235e-05 loss: 0.1066 (0.1124) time: 2.9373 data: 0.0075 max mem: 33300 Epoch: [21] [3190/4276] eta: 0:53:05 lr: 2.467929363523662e-05 loss: 0.1129 (0.1125) time: 2.9332 data: 0.0076 max mem: 33300 Epoch: [21] 
[3200/4276] eta: 0:52:35 lr: 2.4676447939803745e-05 loss: 0.1129 (0.1125) time: 2.9348 data: 0.0077 max mem: 33300 Epoch: [21] [3210/4276] eta: 0:52:06 lr: 2.4673602207907458e-05 loss: 0.1156 (0.1125) time: 2.9346 data: 0.0077 max mem: 33300 Epoch: [21] [3220/4276] eta: 0:51:37 lr: 2.4670756439542622e-05 loss: 0.1050 (0.1125) time: 2.9333 data: 0.0076 max mem: 33300 Epoch: [21] [3230/4276] eta: 0:51:08 lr: 2.46679106347041e-05 loss: 0.1105 (0.1125) time: 2.9330 data: 0.0076 max mem: 33300 Epoch: [21] [3240/4276] eta: 0:50:38 lr: 2.466506479338675e-05 loss: 0.1208 (0.1126) time: 2.9327 data: 0.0077 max mem: 33300 Epoch: [21] [3250/4276] eta: 0:50:09 lr: 2.4662218915585416e-05 loss: 0.1261 (0.1127) time: 2.9323 data: 0.0077 max mem: 33300 Epoch: [21] [3260/4276] eta: 0:49:39 lr: 2.4659373001294964e-05 loss: 0.1161 (0.1127) time: 2.9304 data: 0.0076 max mem: 33300 Epoch: [21] [3270/4276] eta: 0:49:10 lr: 2.4656527050510242e-05 loss: 0.1158 (0.1127) time: 2.9360 data: 0.0075 max mem: 33300 Epoch: [21] [3280/4276] eta: 0:48:41 lr: 2.465368106322611e-05 loss: 0.1137 (0.1127) time: 2.9481 data: 0.0077 max mem: 33300 Epoch: [21] [3290/4276] eta: 0:48:12 lr: 2.4650835039437404e-05 loss: 0.1082 (0.1127) time: 2.9528 data: 0.0076 max mem: 33300 Epoch: [21] [3300/4276] eta: 0:47:42 lr: 2.464798897913898e-05 loss: 0.1193 (0.1128) time: 2.9513 data: 0.0073 max mem: 33300 Epoch: [21] [3310/4276] eta: 0:47:13 lr: 2.4645142882325687e-05 loss: 0.1349 (0.1128) time: 2.9667 data: 0.0073 max mem: 33300 Epoch: [21] [3320/4276] eta: 0:46:44 lr: 2.4642296748992372e-05 loss: 0.1282 (0.1128) time: 2.9575 data: 0.0074 max mem: 33300 Epoch: [21] [3330/4276] eta: 0:46:15 lr: 2.4639450579133875e-05 loss: 0.1140 (0.1128) time: 2.9634 data: 0.0076 max mem: 33300 Epoch: [21] [3340/4276] eta: 0:45:46 lr: 2.4636604372745036e-05 loss: 0.1124 (0.1128) time: 3.0351 data: 0.0079 max mem: 33300 Epoch: [21] [3350/4276] eta: 0:45:17 lr: 2.4633758129820714e-05 loss: 0.1043 (0.1128) time: 3.0826 data: 0.0080 
max mem: 33300 Epoch: [21] [3360/4276] eta: 0:44:48 lr: 2.4630911850355727e-05 loss: 0.0987 (0.1128) time: 3.0520 data: 0.0079 max mem: 33300 Epoch: [21] [3370/4276] eta: 0:44:18 lr: 2.4628065534344927e-05 loss: 0.1052 (0.1128) time: 2.9776 data: 0.0078 max mem: 33300 Epoch: [21] [3380/4276] eta: 0:43:49 lr: 2.4625219181783148e-05 loss: 0.1029 (0.1128) time: 2.9367 data: 0.0075 max mem: 33300 Epoch: [21] [3390/4276] eta: 0:43:20 lr: 2.4622372792665236e-05 loss: 0.1030 (0.1128) time: 2.9328 data: 0.0075 max mem: 33300 Epoch: [21] [3400/4276] eta: 0:42:50 lr: 2.4619526366986008e-05 loss: 0.1126 (0.1128) time: 2.9331 data: 0.0077 max mem: 33300 Epoch: [21] [3410/4276] eta: 0:42:21 lr: 2.461667990474031e-05 loss: 0.1065 (0.1128) time: 2.9316 data: 0.0077 max mem: 33300 Epoch: [21] [3420/4276] eta: 0:41:52 lr: 2.4613833405922967e-05 loss: 0.1074 (0.1129) time: 2.9294 data: 0.0076 max mem: 33300 Epoch: [21] [3430/4276] eta: 0:41:22 lr: 2.461098687052882e-05 loss: 0.1155 (0.1129) time: 2.9297 data: 0.0077 max mem: 33300 Epoch: [21] [3440/4276] eta: 0:40:53 lr: 2.460814029855269e-05 loss: 0.1034 (0.1129) time: 2.9309 data: 0.0078 max mem: 33300 Epoch: [21] [3450/4276] eta: 0:40:24 lr: 2.4605293689989404e-05 loss: 0.1034 (0.1130) time: 2.9302 data: 0.0080 max mem: 33300 Epoch: [21] [3460/4276] eta: 0:39:54 lr: 2.460244704483379e-05 loss: 0.1250 (0.1130) time: 2.9241 data: 0.0079 max mem: 33300 Epoch: [21] [3470/4276] eta: 0:39:25 lr: 2.4599600363080684e-05 loss: 0.1051 (0.1130) time: 2.9234 data: 0.0075 max mem: 33300 Epoch: [21] [3480/4276] eta: 0:38:55 lr: 2.4596753644724894e-05 loss: 0.1028 (0.1130) time: 2.9300 data: 0.0076 max mem: 33300 Epoch: [21] [3490/4276] eta: 0:38:26 lr: 2.4593906889761245e-05 loss: 0.1117 (0.1130) time: 2.9705 data: 0.0080 max mem: 33300 Epoch: [21] [3500/4276] eta: 0:37:57 lr: 2.4591060098184562e-05 loss: 0.1036 (0.1130) time: 2.9670 data: 0.0080 max mem: 33300 Epoch: [21] [3510/4276] eta: 0:37:27 lr: 2.458821326998967e-05 loss: 0.0998 
(0.1130) time: 2.9184 data: 0.0077 max mem: 33300 Epoch: [21] [3520/4276] eta: 0:36:58 lr: 2.4585366405171377e-05 loss: 0.0998 (0.1129) time: 2.9155 data: 0.0078 max mem: 33300 Epoch: [21] [3530/4276] eta: 0:36:29 lr: 2.4582519503724506e-05 loss: 0.1028 (0.1129) time: 2.9236 data: 0.0078 max mem: 33300 Epoch: [21] [3540/4276] eta: 0:36:00 lr: 2.4579672565643867e-05 loss: 0.1216 (0.1130) time: 2.9651 data: 0.0075 max mem: 33300 Epoch: [21] [3550/4276] eta: 0:35:30 lr: 2.4576825590924282e-05 loss: 0.1145 (0.1129) time: 3.0419 data: 0.0075 max mem: 33300 Epoch: [21] [3560/4276] eta: 0:35:01 lr: 2.457397857956055e-05 loss: 0.0998 (0.1130) time: 3.0807 data: 0.0075 max mem: 33300 Epoch: [21] [3570/4276] eta: 0:34:32 lr: 2.4571131531547497e-05 loss: 0.1126 (0.1130) time: 3.0749 data: 0.0078 max mem: 33300 Epoch: [21] [3580/4276] eta: 0:34:03 lr: 2.4568284446879925e-05 loss: 0.0998 (0.1130) time: 3.0755 data: 0.0079 max mem: 33300 Epoch: [21] [3590/4276] eta: 0:33:34 lr: 2.456543732555265e-05 loss: 0.0998 (0.1130) time: 3.0807 data: 0.0081 max mem: 33300 Epoch: [21] [3600/4276] eta: 0:33:05 lr: 2.456259016756046e-05 loss: 0.1221 (0.1130) time: 3.0800 data: 0.0082 max mem: 33300 Epoch: [21] [3610/4276] eta: 0:32:36 lr: 2.455974297289818e-05 loss: 0.1110 (0.1130) time: 3.0791 data: 0.0080 max mem: 33300 Epoch: [21] [3620/4276] eta: 0:32:07 lr: 2.4556895741560606e-05 loss: 0.1098 (0.1130) time: 3.0782 data: 0.0079 max mem: 33300 Epoch: [21] [3630/4276] eta: 0:31:38 lr: 2.4554048473542545e-05 loss: 0.1049 (0.1130) time: 3.0796 data: 0.0081 max mem: 33300 Epoch: [21] [3640/4276] eta: 0:31:09 lr: 2.4551201168838795e-05 loss: 0.0980 (0.1129) time: 3.0790 data: 0.0081 max mem: 33300 Epoch: [21] [3650/4276] eta: 0:30:39 lr: 2.454835382744415e-05 loss: 0.0968 (0.1129) time: 3.0595 data: 0.0079 max mem: 33300 Epoch: [21] [3660/4276] eta: 0:30:10 lr: 2.4545506449353422e-05 loss: 0.1021 (0.1129) time: 3.0309 data: 0.0082 max mem: 33300 Epoch: [21] [3670/4276] eta: 0:29:41 lr: 
2.4542659034561396e-05 loss: 0.1070 (0.1129) time: 3.0448 data: 0.0085 max mem: 33300 Epoch: [21] [3680/4276] eta: 0:29:12 lr: 2.4539811583062873e-05 loss: 0.1128 (0.1129) time: 3.0755 data: 0.0084 max mem: 33300 Epoch: [21] [3690/4276] eta: 0:28:43 lr: 2.4536964094852644e-05 loss: 0.1238 (0.1129) time: 3.0823 data: 0.0081 max mem: 33300 Epoch: [21] [3700/4276] eta: 0:28:13 lr: 2.4534116569925512e-05 loss: 0.1199 (0.1129) time: 3.0753 data: 0.0080 max mem: 33300 Epoch: [21] [3710/4276] eta: 0:27:44 lr: 2.4531269008276252e-05 loss: 0.1002 (0.1129) time: 3.0768 data: 0.0083 max mem: 33300 Epoch: [21] [3720/4276] eta: 0:27:15 lr: 2.4528421409899665e-05 loss: 0.0980 (0.1129) time: 3.0839 data: 0.0082 max mem: 33300 Epoch: [21] [3730/4276] eta: 0:26:46 lr: 2.4525573774790538e-05 loss: 0.1114 (0.1129) time: 3.0817 data: 0.0081 max mem: 33300 Epoch: [21] [3740/4276] eta: 0:26:17 lr: 2.452272610294366e-05 loss: 0.1192 (0.1129) time: 3.0761 data: 0.0080 max mem: 33300 Epoch: [21] [3750/4276] eta: 0:25:47 lr: 2.4519878394353812e-05 loss: 0.1090 (0.1129) time: 3.0756 data: 0.0079 max mem: 33300 Epoch: [21] [3760/4276] eta: 0:25:18 lr: 2.4517030649015778e-05 loss: 0.0960 (0.1128) time: 3.0700 data: 0.0080 max mem: 33300 Epoch: [21] [3770/4276] eta: 0:24:49 lr: 2.4514182866924344e-05 loss: 0.1018 (0.1129) time: 3.0665 data: 0.0085 max mem: 33300 Epoch: [21] [3780/4276] eta: 0:24:20 lr: 2.45113350480743e-05 loss: 0.1018 (0.1128) time: 3.0799 data: 0.0089 max mem: 33300 Epoch: [21] [3790/4276] eta: 0:23:50 lr: 2.4508487192460413e-05 loss: 0.0976 (0.1128) time: 3.0694 data: 0.0088 max mem: 33300 Epoch: [21] [3800/4276] eta: 0:23:21 lr: 2.450563930007746e-05 loss: 0.1062 (0.1128) time: 3.0679 data: 0.0090 max mem: 33300 Epoch: [21] [3810/4276] eta: 0:22:52 lr: 2.4502791370920224e-05 loss: 0.1062 (0.1128) time: 3.0835 data: 0.0085 max mem: 33300 Epoch: [21] [3820/4276] eta: 0:22:22 lr: 2.449994340498349e-05 loss: 0.0985 (0.1128) time: 3.0644 data: 0.0082 max mem: 33300 Epoch: [21] 
[3830/4276] eta: 0:21:53 lr: 2.4497095402262017e-05 loss: 0.0996 (0.1128) time: 3.0648 data: 0.0085 max mem: 33300 Epoch: [21] [3840/4276] eta: 0:21:24 lr: 2.4494247362750586e-05 loss: 0.0975 (0.1127) time: 3.0833 data: 0.0085 max mem: 33300 Epoch: [21] [3850/4276] eta: 0:20:55 lr: 2.4491399286443966e-05 loss: 0.0955 (0.1127) time: 3.0842 data: 0.0085 max mem: 33300 Epoch: [21] [3860/4276] eta: 0:20:25 lr: 2.4488551173336935e-05 loss: 0.1185 (0.1127) time: 3.0859 data: 0.0085 max mem: 33300 Epoch: [21] [3870/4276] eta: 0:19:56 lr: 2.448570302342425e-05 loss: 0.1185 (0.1127) time: 3.0918 data: 0.0084 max mem: 33300 Epoch: [21] [3880/4276] eta: 0:19:27 lr: 2.4482854836700683e-05 loss: 0.1065 (0.1127) time: 3.0915 data: 0.0087 max mem: 33300 Epoch: [21] [3890/4276] eta: 0:18:57 lr: 2.4480006613161e-05 loss: 0.1089 (0.1127) time: 3.0930 data: 0.0085 max mem: 33300 Epoch: [21] [3900/4276] eta: 0:18:28 lr: 2.447715835279997e-05 loss: 0.1014 (0.1127) time: 3.0903 data: 0.0080 max mem: 33300 Epoch: [21] [3910/4276] eta: 0:17:59 lr: 2.4474310055612352e-05 loss: 0.0892 (0.1127) time: 3.0815 data: 0.0079 max mem: 33300 Epoch: [21] [3920/4276] eta: 0:17:29 lr: 2.44714617215929e-05 loss: 0.0898 (0.1126) time: 3.0838 data: 0.0080 max mem: 33300 Epoch: [21] [3930/4276] eta: 0:17:00 lr: 2.4468613350736388e-05 loss: 0.0927 (0.1126) time: 3.0853 data: 0.0079 max mem: 33300 Epoch: [21] [3940/4276] eta: 0:16:30 lr: 2.4465764943037573e-05 loss: 0.0970 (0.1126) time: 3.0841 data: 0.0077 max mem: 33300 Epoch: [21] [3950/4276] eta: 0:16:01 lr: 2.4462916498491205e-05 loss: 0.0970 (0.1126) time: 3.0763 data: 0.0077 max mem: 33300 Epoch: [21] [3960/4276] eta: 0:15:32 lr: 2.446006801709204e-05 loss: 0.1006 (0.1126) time: 3.0214 data: 0.0077 max mem: 33300 Epoch: [21] [3970/4276] eta: 0:15:02 lr: 2.4457219498834842e-05 loss: 0.1121 (0.1126) time: 2.9945 data: 0.0074 max mem: 33300 Epoch: [21] [3980/4276] eta: 0:14:33 lr: 2.4454370943714354e-05 loss: 0.0996 (0.1126) time: 3.0269 data: 0.0073 
max mem: 33300 Epoch: [21] [3990/4276] eta: 0:14:03 lr: 2.4451522351725327e-05 loss: 0.1018 (0.1126) time: 3.0133 data: 0.0075 max mem: 33300 Epoch: [21] [4000/4276] eta: 0:13:34 lr: 2.4448673722862523e-05 loss: 0.1021 (0.1126) time: 2.9329 data: 0.0083 max mem: 33300 Epoch: [21] [4010/4276] eta: 0:13:04 lr: 2.4445825057120684e-05 loss: 0.1021 (0.1126) time: 2.8987 data: 0.0086 max mem: 33300 Epoch: [21] [4020/4276] eta: 0:12:35 lr: 2.4442976354494555e-05 loss: 0.1099 (0.1126) time: 2.9253 data: 0.0079 max mem: 33300 Epoch: [21] [4030/4276] eta: 0:12:05 lr: 2.4440127614978883e-05 loss: 0.1028 (0.1125) time: 2.9293 data: 0.0075 max mem: 33300 Epoch: [21] [4040/4276] eta: 0:11:36 lr: 2.4437278838568415e-05 loss: 0.1073 (0.1126) time: 2.9356 data: 0.0080 max mem: 33300 Epoch: [21] [4050/4276] eta: 0:11:06 lr: 2.4434430025257895e-05 loss: 0.1077 (0.1125) time: 2.9360 data: 0.0081 max mem: 33300 Epoch: [21] [4060/4276] eta: 0:10:37 lr: 2.443158117504206e-05 loss: 0.0949 (0.1126) time: 2.9279 data: 0.0073 max mem: 33300 Epoch: [21] [4070/4276] eta: 0:10:07 lr: 2.4428732287915653e-05 loss: 0.1142 (0.1126) time: 2.9270 data: 0.0073 max mem: 33300 Epoch: [21] [4080/4276] eta: 0:09:38 lr: 2.442588336387341e-05 loss: 0.1142 (0.1126) time: 2.9285 data: 0.0075 max mem: 33300 Epoch: [21] [4090/4276] eta: 0:09:08 lr: 2.442303440291008e-05 loss: 0.1242 (0.1126) time: 2.9290 data: 0.0077 max mem: 33300 Epoch: [21] [4100/4276] eta: 0:08:39 lr: 2.4420185405020384e-05 loss: 0.1225 (0.1126) time: 2.9291 data: 0.0081 max mem: 33300 Epoch: [21] [4110/4276] eta: 0:08:09 lr: 2.4417336370199066e-05 loss: 0.1086 (0.1126) time: 2.9303 data: 0.0084 max mem: 33300 Epoch: [21] [4120/4276] eta: 0:07:40 lr: 2.4414487298440854e-05 loss: 0.1080 (0.1126) time: 2.9323 data: 0.0082 max mem: 33300 Epoch: [21] [4130/4276] eta: 0:07:10 lr: 2.4411638189740482e-05 loss: 0.1052 (0.1126) time: 2.9304 data: 0.0075 max mem: 33300 Epoch: [21] [4140/4276] eta: 0:06:41 lr: 2.440878904409268e-05 loss: 0.1027 
(0.1126) time: 2.9282 data: 0.0073 max mem: 33300 Epoch: [21] [4150/4276] eta: 0:06:11 lr: 2.4405939861492175e-05 loss: 0.0984 (0.1126) time: 2.9301 data: 0.0075 max mem: 33300 Epoch: [21] [4160/4276] eta: 0:05:42 lr: 2.4403090641933697e-05 loss: 0.1071 (0.1126) time: 2.9314 data: 0.0075 max mem: 33300 Epoch: [21] [4170/4276] eta: 0:05:12 lr: 2.4400241385411974e-05 loss: 0.1103 (0.1126) time: 2.9304 data: 0.0075 max mem: 33300 Epoch: [21] [4180/4276] eta: 0:04:43 lr: 2.4397392091921722e-05 loss: 0.1075 (0.1126) time: 2.9172 data: 0.0074 max mem: 33300 Epoch: [21] [4190/4276] eta: 0:04:13 lr: 2.439454276145767e-05 loss: 0.1052 (0.1126) time: 2.9154 data: 0.0076 max mem: 33300 Epoch: [21] [4200/4276] eta: 0:03:44 lr: 2.4391693394014543e-05 loss: 0.1052 (0.1127) time: 2.9271 data: 0.0079 max mem: 33300 Epoch: [21] [4210/4276] eta: 0:03:14 lr: 2.4388843989587058e-05 loss: 0.1180 (0.1127) time: 2.9297 data: 0.0078 max mem: 33300 Epoch: [21] [4220/4276] eta: 0:02:45 lr: 2.438599454816993e-05 loss: 0.1229 (0.1128) time: 2.9298 data: 0.0075 max mem: 33300 Epoch: [21] [4230/4276] eta: 0:02:15 lr: 2.438314506975788e-05 loss: 0.1219 (0.1128) time: 2.9378 data: 0.0077 max mem: 33300 Epoch: [21] [4240/4276] eta: 0:01:46 lr: 2.438029555434562e-05 loss: 0.1161 (0.1128) time: 2.9250 data: 0.0084 max mem: 33300 Epoch: [21] [4250/4276] eta: 0:01:16 lr: 2.4377446001927875e-05 loss: 0.1149 (0.1128) time: 2.9114 data: 0.0082 max mem: 33300 Epoch: [21] [4260/4276] eta: 0:00:47 lr: 2.4374596412499344e-05 loss: 0.1174 (0.1128) time: 2.9240 data: 0.0076 max mem: 33300 Epoch: [21] [4270/4276] eta: 0:00:17 lr: 2.4371746786054745e-05 loss: 0.1204 (0.1129) time: 2.9260 data: 0.0074 max mem: 33300 Epoch: [21] Total time: 3:30:08 Test: [ 0/21770] eta: 8:36:13 time: 1.4227 data: 1.3799 max mem: 33300 Test: [ 100/21770] eta: 0:18:51 time: 0.0387 data: 0.0009 max mem: 33300 Test: [ 200/21770] eta: 0:16:17 time: 0.0382 data: 0.0009 max mem: 33300 Test: [ 300/21770] eta: 0:15:22 time: 0.0381 data: 
0.0009 max mem: 33300 Test: [ 400/21770] eta: 0:14:53 time: 0.0381 data: 0.0008 max mem: 33300 Test: [ 500/21770] eta: 0:14:35 time: 0.0386 data: 0.0008 max mem: 33300 Test: [ 600/21770] eta: 0:14:21 time: 0.0384 data: 0.0009 max mem: 33300 Test: [ 700/21770] eta: 0:14:09 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 800/21770] eta: 0:13:59 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 900/21770] eta: 0:13:50 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 1000/21770] eta: 0:13:43 time: 0.0397 data: 0.0009 max mem: 33300 Test: [ 1100/21770] eta: 0:13:38 time: 0.0394 data: 0.0009 max mem: 33300 Test: [ 1200/21770] eta: 0:13:33 time: 0.0397 data: 0.0009 max mem: 33300 Test: [ 1300/21770] eta: 0:13:29 time: 0.0393 data: 0.0009 max mem: 33300 Test: [ 1400/21770] eta: 0:13:24 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 1500/21770] eta: 0:13:18 time: 0.0379 data: 0.0008 max mem: 33300 Test: [ 1600/21770] eta: 0:13:12 time: 0.0377 data: 0.0009 max mem: 33300 Test: [ 1700/21770] eta: 0:13:06 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 1800/21770] eta: 0:13:01 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 1900/21770] eta: 0:12:56 time: 0.0380 data: 0.0009 max mem: 33300 Test: [ 2000/21770] eta: 0:12:51 time: 0.0378 data: 0.0009 max mem: 33300 Test: [ 2100/21770] eta: 0:12:46 time: 0.0379 data: 0.0009 max mem: 33300 Test: [ 2200/21770] eta: 0:12:42 time: 0.0398 data: 0.0009 max mem: 33300 Test: [ 2300/21770] eta: 0:12:39 time: 0.0397 data: 0.0009 max mem: 33300 Test: [ 2400/21770] eta: 0:12:36 time: 0.0400 data: 0.0009 max mem: 33300 Test: [ 2500/21770] eta: 0:12:33 time: 0.0398 data: 0.0009 max mem: 33300 Test: [ 2600/21770] eta: 0:12:29 time: 0.0399 data: 0.0009 max mem: 33300 Test: [ 2700/21770] eta: 0:12:26 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 2800/21770] eta: 0:12:23 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 2900/21770] eta: 0:12:19 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3000/21770] eta: 0:12:16 time: 0.0401 data: 0.0008 
max mem: 33300 Test: [ 3100/21770] eta: 0:12:12 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3200/21770] eta: 0:12:09 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3300/21770] eta: 0:12:05 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 3400/21770] eta: 0:12:02 time: 0.0401 data: 0.0008 max mem: 33300 Test: [ 3500/21770] eta: 0:11:58 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 3600/21770] eta: 0:11:54 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 3700/21770] eta: 0:11:50 time: 0.0390 data: 0.0009 max mem: 33300 Test: [ 3800/21770] eta: 0:11:46 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 3900/21770] eta: 0:11:42 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 4000/21770] eta: 0:11:38 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 4100/21770] eta: 0:11:34 time: 0.0394 data: 0.0009 max mem: 33300 Test: [ 4200/21770] eta: 0:11:30 time: 0.0395 data: 0.0009 max mem: 33300 Test: [ 4300/21770] eta: 0:11:26 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 4400/21770] eta: 0:11:22 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 4500/21770] eta: 0:11:19 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 4600/21770] eta: 0:11:15 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 4700/21770] eta: 0:11:11 time: 0.0397 data: 0.0008 max mem: 33300 Test: [ 4800/21770] eta: 0:11:07 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 4900/21770] eta: 0:11:03 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 5000/21770] eta: 0:10:59 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 5100/21770] eta: 0:10:55 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 5200/21770] eta: 0:10:51 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 5300/21770] eta: 0:10:47 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 5400/21770] eta: 0:10:43 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 5500/21770] eta: 0:10:39 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 5600/21770] eta: 0:10:35 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 5700/21770] eta: 0:10:31 time: 0.0396 data: 0.0009 
max mem: 33300 Test: [ 5800/21770] eta: 0:10:28 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 5900/21770] eta: 0:10:24 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 6000/21770] eta: 0:10:20 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 6100/21770] eta: 0:10:16 time: 0.0396 data: 0.0008 max mem: 33300 Test: [ 6200/21770] eta: 0:10:12 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 6300/21770] eta: 0:10:09 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 6400/21770] eta: 0:10:05 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 6500/21770] eta: 0:10:01 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 6600/21770] eta: 0:09:57 time: 0.0393 data: 0.0008 max mem: 33300 Test: [ 6700/21770] eta: 0:09:53 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 6800/21770] eta: 0:09:49 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 6900/21770] eta: 0:09:45 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 7000/21770] eta: 0:09:42 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 7100/21770] eta: 0:09:38 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 7200/21770] eta: 0:09:34 time: 0.0399 data: 0.0008 max mem: 33300 Test: [ 7300/21770] eta: 0:09:30 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 7400/21770] eta: 0:09:26 time: 0.0400 data: 0.0008 max mem: 33300 Test: [ 7500/21770] eta: 0:09:22 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 7600/21770] eta: 0:09:18 time: 0.0395 data: 0.0008 max mem: 33300 Test: [ 7700/21770] eta: 0:09:14 time: 0.0398 data: 0.0008 max mem: 33300 Test: [ 7800/21770] eta: 0:09:10 time: 0.0397 data: 0.0008 max mem: 33300 Test: [ 7900/21770] eta: 0:09:06 time: 0.0394 data: 0.0008 max mem: 33300 Test: [ 8000/21770] eta: 0:09:02 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 8100/21770] eta: 0:08:59 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 8200/21770] eta: 0:08:55 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 8300/21770] eta: 0:08:51 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 8400/21770] eta: 0:08:47 time: 0.0393 data: 0.0008 
max mem: 33300 Test: [ 8500/21770] eta: 0:08:43 time: 0.0392 data: 0.0008 max mem: 33300 Test: [ 8600/21770] eta: 0:08:39 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 8700/21770] eta: 0:08:35 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 8800/21770] eta: 0:08:31 time: 0.0388 data: 0.0009 max mem: 33300 Test: [ 8900/21770] eta: 0:08:27 time: 0.0391 data: 0.0008 max mem: 33300 Test: [ 9000/21770] eta: 0:08:23 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 9100/21770] eta: 0:08:19 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 9200/21770] eta: 0:08:15 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 9300/21770] eta: 0:08:11 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 9400/21770] eta: 0:08:07 time: 0.0389 data: 0.0008 max mem: 33300 Test: [ 9500/21770] eta: 0:08:03 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 9600/21770] eta: 0:07:59 time: 0.0390 data: 0.0008 max mem: 33300 Test: [ 9700/21770] eta: 0:07:55 time: 0.0388 data: 0.0008 max mem: 33300 Test: [ 9800/21770] eta: 0:07:51 time: 0.0386 data: 0.0008 max mem: 33300 Test: [ 9900/21770] eta: 0:07:47 time: 0.0383 data: 0.0008 max mem: 33300 Test: [10000/21770] eta: 0:07:43 time: 0.0388 data: 0.0009 max mem: 33300 Test: [10100/21770] eta: 0:07:39 time: 0.0390 data: 0.0009 max mem: 33300 Test: [10200/21770] eta: 0:07:35 time: 0.0391 data: 0.0009 max mem: 33300 Test: [10300/21770] eta: 0:07:31 time: 0.0391 data: 0.0008 max mem: 33300 Test: [10400/21770] eta: 0:07:27 time: 0.0391 data: 0.0009 max mem: 33300 Test: [10500/21770] eta: 0:07:23 time: 0.0389 data: 0.0008 max mem: 33300 Test: [10600/21770] eta: 0:07:19 time: 0.0391 data: 0.0008 max mem: 33300 Test: [10700/21770] eta: 0:07:15 time: 0.0388 data: 0.0008 max mem: 33300 Test: [10800/21770] eta: 0:07:11 time: 0.0390 data: 0.0008 max mem: 33300 Test: [10900/21770] eta: 0:07:07 time: 0.0389 data: 0.0008 max mem: 33300 Test: [11000/21770] eta: 0:07:03 time: 0.0386 data: 0.0008 max mem: 33300 Test: [11100/21770] eta: 0:06:59 time: 0.0385 data: 0.0008 
max mem: 33300 Test: [11200/21770] eta: 0:06:55 time: 0.0390 data: 0.0008 max mem: 33300 Test: [11300/21770] eta: 0:06:51 time: 0.0391 data: 0.0008 max mem: 33300 Test: [11400/21770] eta: 0:06:47 time: 0.0390 data: 0.0008 max mem: 33300 Test: [11500/21770] eta: 0:06:43 time: 0.0391 data: 0.0008 max mem: 33300 Test: [11600/21770] eta: 0:06:39 time: 0.0387 data: 0.0008 max mem: 33300 Test: [11700/21770] eta: 0:06:35 time: 0.0385 data: 0.0008 max mem: 33300 Test: [11800/21770] eta: 0:06:31 time: 0.0387 data: 0.0008 max mem: 33300 Test: [11900/21770] eta: 0:06:27 time: 0.0386 data: 0.0008 max mem: 33300 Test: [12000/21770] eta: 0:06:23 time: 0.0389 data: 0.0009 max mem: 33300 Test: [12100/21770] eta: 0:06:19 time: 0.0387 data: 0.0008 max mem: 33300 Test: [12200/21770] eta: 0:06:15 time: 0.0385 data: 0.0008 max mem: 33300 Test: [12300/21770] eta: 0:06:11 time: 0.0380 data: 0.0008 max mem: 33300 Test: [12400/21770] eta: 0:06:07 time: 0.0384 data: 0.0008 max mem: 33300 Test: [12500/21770] eta: 0:06:03 time: 0.0382 data: 0.0008 max mem: 33300 Test: [12600/21770] eta: 0:05:59 time: 0.0383 data: 0.0008 max mem: 33300 Test: [12700/21770] eta: 0:05:55 time: 0.0382 data: 0.0008 max mem: 33300 Test: [12800/21770] eta: 0:05:51 time: 0.0384 data: 0.0008 max mem: 33300 Test: [12900/21770] eta: 0:05:47 time: 0.0380 data: 0.0008 max mem: 33300 Test: [13000/21770] eta: 0:05:43 time: 0.0387 data: 0.0008 max mem: 33300 Test: [13100/21770] eta: 0:05:39 time: 0.0385 data: 0.0008 max mem: 33300 Test: [13200/21770] eta: 0:05:35 time: 0.0393 data: 0.0008 max mem: 33300 Test: [13300/21770] eta: 0:05:31 time: 0.0392 data: 0.0008 max mem: 33300 Test: [13400/21770] eta: 0:05:27 time: 0.0393 data: 0.0008 max mem: 33300 Test: [13500/21770] eta: 0:05:24 time: 0.0391 data: 0.0008 max mem: 33300 Test: [13600/21770] eta: 0:05:20 time: 0.0388 data: 0.0009 max mem: 33300 Test: [13700/21770] eta: 0:05:16 time: 0.0386 data: 0.0009 max mem: 33300 Test: [13800/21770] eta: 0:05:12 time: 0.0390 data: 0.0008 
max mem: 33300 Test: [13900/21770] eta: 0:05:08 time: 0.0386 data: 0.0008 max mem: 33300 Test: [14000/21770] eta: 0:05:04 time: 0.0389 data: 0.0008 max mem: 33300 Test: [14100/21770] eta: 0:05:00 time: 0.0385 data: 0.0008 max mem: 33300 Test: [14200/21770] eta: 0:04:56 time: 0.0390 data: 0.0008 max mem: 33300 Test: [14300/21770] eta: 0:04:52 time: 0.0386 data: 0.0008 max mem: 33300 Test: [14400/21770] eta: 0:04:48 time: 0.0389 data: 0.0008 max mem: 33300 Test: [14500/21770] eta: 0:04:44 time: 0.0386 data: 0.0008 max mem: 33300 Test: [14600/21770] eta: 0:04:40 time: 0.0390 data: 0.0008 max mem: 33300 Test: [14700/21770] eta: 0:04:36 time: 0.0385 data: 0.0008 max mem: 33300 Test: [14800/21770] eta: 0:04:32 time: 0.0388 data: 0.0008 max mem: 33300 Test: [14900/21770] eta: 0:04:28 time: 0.0386 data: 0.0008 max mem: 33300 Test: [15000/21770] eta: 0:04:24 time: 0.0388 data: 0.0008 max mem: 33300 Test: [15100/21770] eta: 0:04:21 time: 0.0385 data: 0.0008 max mem: 33300 Test: [15200/21770] eta: 0:04:17 time: 0.0387 data: 0.0009 max mem: 33300 Test: [15300/21770] eta: 0:04:13 time: 0.0383 data: 0.0008 max mem: 33300 Test: [15400/21770] eta: 0:04:09 time: 0.0385 data: 0.0008 max mem: 33300 Test: [15500/21770] eta: 0:04:05 time: 0.0383 data: 0.0008 max mem: 33300 Test: [15600/21770] eta: 0:04:01 time: 0.0386 data: 0.0008 max mem: 33300 Test: [15700/21770] eta: 0:03:57 time: 0.0383 data: 0.0008 max mem: 33300 Test: [15800/21770] eta: 0:03:53 time: 0.0386 data: 0.0008 max mem: 33300 Test: [15900/21770] eta: 0:03:49 time: 0.0383 data: 0.0008 max mem: 33300 Test: [16000/21770] eta: 0:03:45 time: 0.0389 data: 0.0008 max mem: 33300 Test: [16100/21770] eta: 0:03:41 time: 0.0386 data: 0.0008 max mem: 33300 Test: [16200/21770] eta: 0:03:37 time: 0.0387 data: 0.0008 max mem: 33300 Test: [16300/21770] eta: 0:03:33 time: 0.0385 data: 0.0008 max mem: 33300 Test: [16400/21770] eta: 0:03:29 time: 0.0385 data: 0.0008 max mem: 33300 Test: [16500/21770] eta: 0:03:25 time: 0.0381 data: 0.0008 
max mem: 33300 Test: [16600/21770] eta: 0:03:22 time: 0.0384 data: 0.0008 max mem: 33300 Test: [16700/21770] eta: 0:03:18 time: 0.0383 data: 0.0008 max mem: 33300 Test: [16800/21770] eta: 0:03:14 time: 0.0388 data: 0.0008 max mem: 33300 Test: [16900/21770] eta: 0:03:10 time: 0.0384 data: 0.0008 max mem: 33300 Test: [17000/21770] eta: 0:03:06 time: 0.0387 data: 0.0008 max mem: 33300 Test: [17100/21770] eta: 0:03:02 time: 0.0384 data: 0.0008 max mem: 33300 Test: [17200/21770] eta: 0:02:58 time: 0.0389 data: 0.0008 max mem: 33300 Test: [17300/21770] eta: 0:02:54 time: 0.0385 data: 0.0008 max mem: 33300 Test: [17400/21770] eta: 0:02:50 time: 0.0386 data: 0.0008 max mem: 33300 Test: [17500/21770] eta: 0:02:46 time: 0.0379 data: 0.0008 max mem: 33300 Test: [17600/21770] eta: 0:02:42 time: 0.0385 data: 0.0008 max mem: 33300 Test: [17700/21770] eta: 0:02:38 time: 0.0384 data: 0.0009 max mem: 33300 Test: [17800/21770] eta: 0:02:34 time: 0.0391 data: 0.0010 max mem: 33300 Test: [17900/21770] eta: 0:02:31 time: 0.0386 data: 0.0010 max mem: 33300 Test: [18000/21770] eta: 0:02:27 time: 0.0390 data: 0.0010 max mem: 33300 Test: [18100/21770] eta: 0:02:23 time: 0.0387 data: 0.0010 max mem: 33300 Test: [18200/21770] eta: 0:02:19 time: 0.0390 data: 0.0010 max mem: 33300 Test: [18300/21770] eta: 0:02:15 time: 0.0386 data: 0.0010 max mem: 33300 Test: [18400/21770] eta: 0:02:11 time: 0.0394 data: 0.0010 max mem: 33300 Test: [18500/21770] eta: 0:02:07 time: 0.0386 data: 0.0010 max mem: 33300 Test: [18600/21770] eta: 0:02:03 time: 0.0394 data: 0.0009 max mem: 33300 Test: [18700/21770] eta: 0:01:59 time: 0.0391 data: 0.0009 max mem: 33300 Test: [18800/21770] eta: 0:01:55 time: 0.0394 data: 0.0009 max mem: 33300 Test: [18900/21770] eta: 0:01:52 time: 0.0382 data: 0.0009 max mem: 33300 Test: [19000/21770] eta: 0:01:48 time: 0.0384 data: 0.0009 max mem: 33300 Test: [19100/21770] eta: 0:01:44 time: 0.0385 data: 0.0009 max mem: 33300 Test: [19200/21770] eta: 0:01:40 time: 0.0381 data: 0.0009 
max mem: 33300 Test: [19300/21770] eta: 0:01:36 time: 0.0381 data: 0.0009 max mem: 33300 Test: [19400/21770] eta: 0:01:32 time: 0.0386 data: 0.0009 max mem: 33300 Test: [19500/21770] eta: 0:01:28 time: 0.0384 data: 0.0009 max mem: 33300 Test: [19600/21770] eta: 0:01:24 time: 0.0386 data: 0.0009 max mem: 33300 Test: [19700/21770] eta: 0:01:20 time: 0.0380 data: 0.0009 max mem: 33300 Test: [19800/21770] eta: 0:01:16 time: 0.0382 data: 0.0009 max mem: 33300 Test: [19900/21770] eta: 0:01:12 time: 0.0381 data: 0.0009 max mem: 33300 Test: [20000/21770] eta: 0:01:09 time: 0.0383 data: 0.0009 max mem: 33300 Test: [20100/21770] eta: 0:01:05 time: 0.0380 data: 0.0008 max mem: 33300 Test: [20200/21770] eta: 0:01:01 time: 0.0382 data: 0.0009 max mem: 33300 Test: [20300/21770] eta: 0:00:57 time: 0.0381 data: 0.0009 max mem: 33300 Test: [20400/21770] eta: 0:00:53 time: 0.0382 data: 0.0009 max mem: 33300 Test: [20500/21770] eta: 0:00:49 time: 0.0381 data: 0.0009 max mem: 33300 Test: [20600/21770] eta: 0:00:45 time: 0.0382 data: 0.0009 max mem: 33300 Test: [20700/21770] eta: 0:00:41 time: 0.0381 data: 0.0009 max mem: 33300 Test: [20800/21770] eta: 0:00:37 time: 0.0382 data: 0.0009 max mem: 33300 Test: [20900/21770] eta: 0:00:33 time: 0.0381 data: 0.0009 max mem: 33300 Test: [21000/21770] eta: 0:00:29 time: 0.0383 data: 0.0009 max mem: 33300 Test: [21100/21770] eta: 0:00:26 time: 0.0382 data: 0.0009 max mem: 33300 Test: [21200/21770] eta: 0:00:22 time: 0.0384 data: 0.0009 max mem: 33300 Test: [21300/21770] eta: 0:00:18 time: 0.0382 data: 0.0009 max mem: 33300 Test: [21400/21770] eta: 0:00:14 time: 0.0383 data: 0.0009 max mem: 33300 Test: [21500/21770] eta: 0:00:10 time: 0.0382 data: 0.0009 max mem: 33300 Test: [21600/21770] eta: 0:00:06 time: 0.0384 data: 0.0009 max mem: 33300 Test: [21700/21770] eta: 0:00:02 time: 0.0381 data: 0.0009 max mem: 33300 Test: Total time: 0:14:07 Final results: Mean IoU is 0.00 precision@0.5 = 0.00 precision@0.6 = 0.00 precision@0.7 = 0.00 precision@0.8 
= 0.00 precision@0.9 = 0.00 overall IoU = 0.00 mean IoU = 0.00 Mean accuracy for one-to-zero sample is 100.00 Average object IoU 0.0 Overall IoU 0.0 Epoch: [22] [ 0/4276] eta: 5:40:01 lr: 2.4370036992418076e-05 loss: 0.0903 (0.0903) time: 4.7713 data: 1.7489 max mem: 33300 Epoch: [22] [ 10/4276] eta: 3:40:47 lr: 2.436718730673677e-05 loss: 0.1216 (0.1233) time: 3.1054 data: 0.1662 max mem: 33300 Epoch: [22] [ 20/4276] eta: 3:34:17 lr: 2.436433758402564e-05 loss: 0.1146 (0.1171) time: 2.9335 data: 0.0079 max mem: 33300 Epoch: [22] [ 30/4276] eta: 3:32:06 lr: 2.4361487824279387e-05 loss: 0.1031 (0.1162) time: 2.9377 data: 0.0084 max mem: 33300 Epoch: [22] [ 40/4276] eta: 3:30:28 lr: 2.4358638027492722e-05 loss: 0.1064 (0.1144) time: 2.9394 data: 0.0086 max mem: 33300 Epoch: [22] [ 50/4276] eta: 3:29:13 lr: 2.4355788193660353e-05 loss: 0.1064 (0.1122) time: 2.9294 data: 0.0081 max mem: 33300 Epoch: [22] [ 60/4276] eta: 3:28:17 lr: 2.435293832277697e-05 loss: 0.1002 (0.1115) time: 2.9295 data: 0.0081 max mem: 33300 Epoch: [22] [ 70/4276] eta: 3:27:24 lr: 2.435008841483728e-05 loss: 0.0964 (0.1090) time: 2.9284 data: 0.0081 max mem: 33300 Epoch: [22] [ 80/4276] eta: 3:26:39 lr: 2.4347238469835983e-05 loss: 0.0964 (0.1099) time: 2.9270 data: 0.0079 max mem: 33300 Epoch: [22] [ 90/4276] eta: 3:25:55 lr: 2.434438848776778e-05 loss: 0.1078 (0.1098) time: 2.9268 data: 0.0079 max mem: 33300 Epoch: [22] [ 100/4276] eta: 3:25:15 lr: 2.4341538468627356e-05 loss: 0.1081 (0.1104) time: 2.9257 data: 0.0082 max mem: 33300 Epoch: [22] [ 110/4276] eta: 3:24:37 lr: 2.4338688412409414e-05 loss: 0.1159 (0.1112) time: 2.9261 data: 0.0081 max mem: 33300 Epoch: [22] [ 120/4276] eta: 3:24:00 lr: 2.4335838319108644e-05 loss: 0.1060 (0.1107) time: 2.9255 data: 0.0085 max mem: 33300 Epoch: [22] [ 130/4276] eta: 3:23:26 lr: 2.4332988188719746e-05 loss: 0.1060 (0.1112) time: 2.9284 data: 0.0088 max mem: 33300 Epoch: [22] [ 140/4276] eta: 3:22:52 lr: 2.43301380212374e-05 loss: 0.1093 (0.1105) 
time: 2.9302 data: 0.0086 max mem: 33300 Epoch: [22] [ 150/4276] eta: 3:22:19 lr: 2.43272878166563e-05 loss: 0.1042 (0.1103) time: 2.9294 data: 0.0083 max mem: 33300 Epoch: [22] [ 160/4276] eta: 3:21:46 lr: 2.4324437574971133e-05 loss: 0.1071 (0.1102) time: 2.9293 data: 0.0079 max mem: 33300 Epoch: [22] [ 170/4276] eta: 3:21:15 lr: 2.432158729617659e-05 loss: 0.1027 (0.1096) time: 2.9300 data: 0.0079 max mem: 33300 Epoch: [22] [ 180/4276] eta: 3:20:43 lr: 2.431873698026734e-05 loss: 0.0990 (0.1097) time: 2.9310 data: 0.0081 max mem: 33300 Epoch: [22] [ 190/4276] eta: 3:20:11 lr: 2.431588662723808e-05 loss: 0.1016 (0.1094) time: 2.9307 data: 0.0082 max mem: 33300 Epoch: [22] [ 200/4276] eta: 3:19:40 lr: 2.4313036237083486e-05 loss: 0.0948 (0.1093) time: 2.9298 data: 0.0080 max mem: 33300 Epoch: [22] [ 210/4276] eta: 3:19:09 lr: 2.431018580979825e-05 loss: 0.0983 (0.1092) time: 2.9299 data: 0.0080 max mem: 33300 Epoch: [22] [ 220/4276] eta: 3:18:37 lr: 2.4307335345377033e-05 loss: 0.0983 (0.1097) time: 2.9273 data: 0.0080 max mem: 33300 Epoch: [22] [ 230/4276] eta: 3:17:58 lr: 2.430448484381452e-05 loss: 0.0875 (0.1090) time: 2.9031 data: 0.0086 max mem: 33300 Epoch: [22] [ 240/4276] eta: 3:17:18 lr: 2.430163430510539e-05 loss: 0.0918 (0.1088) time: 2.8790 data: 0.0088 max mem: 33300 Epoch: [22] [ 250/4276] eta: 3:16:40 lr: 2.4298783729244317e-05 loss: 0.1093 (0.1095) time: 2.8765 data: 0.0080 max mem: 33300 Epoch: [22] [ 260/4276] eta: 3:16:02 lr: 2.429593311622597e-05 loss: 0.1049 (0.1094) time: 2.8761 data: 0.0076 max mem: 33300 Epoch: [22] [ 270/4276] eta: 3:15:25 lr: 2.429308246604502e-05 loss: 0.0948 (0.1095) time: 2.8760 data: 0.0074 max mem: 33300 Epoch: [22] [ 280/4276] eta: 3:14:48 lr: 2.429023177869614e-05 loss: 0.1052 (0.1094) time: 2.8761 data: 0.0076 max mem: 33300 Epoch: [22] [ 290/4276] eta: 3:14:18 lr: 2.4287381054174003e-05 loss: 0.1001 (0.1089) time: 2.8953 data: 0.0081 max mem: 33300 Epoch: [22] [ 300/4276] eta: 3:13:45 lr: 2.4284530292473266e-05 
loss: 0.0999 (0.1089) time: 2.9073 data: 0.0091 max mem: 33300 Epoch: [22] [ 310/4276] eta: 3:13:17 lr: 2.4281679493588603e-05 loss: 0.1047 (0.1092) time: 2.9155 data: 0.0092 max mem: 33300 Epoch: [22] [ 320/4276] eta: 3:12:50 lr: 2.427882865751467e-05 loss: 0.1027 (0.1095) time: 2.9365 data: 0.0092 max mem: 33300 Epoch: [22] [ 330/4276] eta: 3:12:21 lr: 2.4275977784246142e-05 loss: 0.1254 (0.1103) time: 2.9370 data: 0.0090 max mem: 33300 Epoch: [22] [ 340/4276] eta: 3:11:54 lr: 2.4273126873777668e-05 loss: 0.1151 (0.1103) time: 2.9366 data: 0.0082 max mem: 33300 Epoch: [22] [ 350/4276] eta: 3:11:26 lr: 2.4270275926103912e-05 loss: 0.1094 (0.1104) time: 2.9407 data: 0.0084 max mem: 33300 Epoch: [22] [ 360/4276] eta: 3:11:02 lr: 2.426742494121954e-05 loss: 0.1205 (0.1114) time: 2.9561 data: 0.0081 max mem: 33300 Epoch: [22] [ 370/4276] eta: 3:10:29 lr: 2.4264573919119196e-05 loss: 0.1146 (0.1113) time: 2.9291 data: 0.0080 max mem: 33300 Epoch: [22] [ 380/4276] eta: 3:10:21 lr: 2.4261722859797543e-05 loss: 0.1046 (0.1113) time: 3.0105 data: 0.0084 max mem: 33300 Epoch: [22] [ 390/4276] eta: 3:10:22 lr: 2.425887176324923e-05 loss: 0.1085 (0.1117) time: 3.1884 data: 0.0090 max mem: 33300 Epoch: [22] [ 400/4276] eta: 3:10:24 lr: 2.425602062946892e-05 loss: 0.1236 (0.1120) time: 3.2534 data: 0.0099 max mem: 33300 Epoch: [22] [ 410/4276] eta: 3:10:25 lr: 2.4253169458451252e-05 loss: 0.1309 (0.1124) time: 3.2682 data: 0.0098 max mem: 33300 Epoch: [22] [ 420/4276] eta: 3:10:25 lr: 2.4250318250190883e-05 loss: 0.1205 (0.1124) time: 3.2730 data: 0.0098 max mem: 33300 Epoch: [22] [ 430/4276] eta: 3:10:22 lr: 2.4247467004682453e-05 loss: 0.1195 (0.1126) time: 3.2702 data: 0.0095 max mem: 33300 Epoch: [22] [ 440/4276] eta: 3:10:19 lr: 2.4244615721920624e-05 loss: 0.1116 (0.1125) time: 3.2680 data: 0.0088 max mem: 33300 Epoch: [22] [ 450/4276] eta: 3:10:13 lr: 2.4241764401900024e-05 loss: 0.1086 (0.1124) time: 3.2660 data: 0.0097 max mem: 33300 Epoch: [22] [ 460/4276] eta: 
3:10:04 lr: 2.4238913044615304e-05 loss: 0.1085 (0.1123) time: 3.2486 data: 0.0103 max mem: 33300 Epoch: [22] [ 470/4276] eta: 3:09:56 lr: 2.4236061650061107e-05 loss: 0.0971 (0.1122) time: 3.2445 data: 0.0100 max mem: 33300 Epoch: [22] [ 480/4276] eta: 3:09:47 lr: 2.423321021823208e-05 loss: 0.0981 (0.1121) time: 3.2612 data: 0.0095 max mem: 33300 Epoch: [22] [ 490/4276] eta: 3:09:38 lr: 2.4230358749122846e-05 loss: 0.0991 (0.1117) time: 3.2656 data: 0.0088 max mem: 33300 Epoch: [22] [ 500/4276] eta: 3:09:27 lr: 2.4227507242728056e-05 loss: 0.0991 (0.1115) time: 3.2597 data: 0.0088 max mem: 33300 Epoch: [22] [ 510/4276] eta: 3:09:15 lr: 2.422465569904234e-05 loss: 0.1022 (0.1114) time: 3.2567 data: 0.0090 max mem: 33300 Epoch: [22] [ 520/4276] eta: 3:09:00 lr: 2.4221804118060347e-05 loss: 0.1043 (0.1113) time: 3.2469 data: 0.0091 max mem: 33300 Epoch: [22] [ 530/4276] eta: 3:08:43 lr: 2.4218952499776685e-05 loss: 0.1070 (0.1113) time: 3.2176 data: 0.0097 max mem: 33300 Epoch: [22] [ 540/4276] eta: 3:08:25 lr: 2.4216100844186005e-05 loss: 0.1026 (0.1111) time: 3.1967 data: 0.0093 max mem: 33300 Epoch: [22] [ 550/4276] eta: 3:08:06 lr: 2.4213249151282928e-05 loss: 0.1012 (0.1111) time: 3.1948 data: 0.0086 max mem: 33300 Epoch: [22] [ 560/4276] eta: 3:07:51 lr: 2.4210397421062094e-05 loss: 0.1106 (0.1112) time: 3.2290 data: 0.0095 max mem: 33300 Epoch: [22] [ 570/4276] eta: 3:07:36 lr: 2.4207545653518118e-05 loss: 0.1106 (0.1111) time: 3.2702 data: 0.0106 max mem: 33300 Epoch: [22] [ 580/4276] eta: 3:07:20 lr: 2.4204693848645633e-05 loss: 0.1053 (0.1111) time: 3.2704 data: 0.0102 max mem: 33300 Epoch: [22] [ 590/4276] eta: 3:07:04 lr: 2.4201842006439264e-05 loss: 0.1034 (0.1110) time: 3.2616 data: 0.0090 max mem: 33300 Epoch: [22] [ 600/4276] eta: 3:06:47 lr: 2.4198990126893632e-05 loss: 0.0979 (0.1109) time: 3.2675 data: 0.0097 max mem: 33300 Epoch: [22] [ 610/4276] eta: 3:06:30 lr: 2.4196138210003355e-05 loss: 0.0959 (0.1108) time: 3.2717 data: 0.0105 max mem: 
33300 Epoch: [22] [ 620/4276] eta: 3:05:59 lr: 2.4193286255763058e-05 loss: 0.0959 (0.1107) time: 3.1555 data: 0.0095 max mem: 33300 Epoch: [22] [ 630/4276] eta: 3:05:27 lr: 2.4190434264167358e-05 loss: 0.1046 (0.1109) time: 3.0355 data: 0.0090 max mem: 33300 Epoch: [22] [ 640/4276] eta: 3:04:55 lr: 2.4187582235210878e-05 loss: 0.1031 (0.1107) time: 3.0295 data: 0.0091 max mem: 33300 Epoch: [22] [ 650/4276] eta: 3:04:18 lr: 2.4184730168888223e-05 loss: 0.0911 (0.1106) time: 2.9818 data: 0.0088 max mem: 33300 Epoch: [22] [ 660/4276] eta: 3:03:41 lr: 2.418187806519401e-05 loss: 0.1063 (0.1108) time: 2.9334 data: 0.0085 max mem: 33300 Epoch: [22] [ 670/4276] eta: 3:03:04 lr: 2.417902592412286e-05 loss: 0.1086 (0.1107) time: 2.9330 data: 0.0085 max mem: 33300 Epoch: [22] [ 680/4276] eta: 3:02:28 lr: 2.4176173745669375e-05 loss: 0.1028 (0.1107) time: 2.9322 data: 0.0085 max mem: 33300 Epoch: [22] [ 690/4276] eta: 3:01:52 lr: 2.4173321529828167e-05 loss: 0.1028 (0.1108) time: 2.9336 data: 0.0085 max mem: 33300 Epoch: [22] [ 700/4276] eta: 3:01:16 lr: 2.4170469276593845e-05 loss: 0.1011 (0.1107) time: 2.9322 data: 0.0085 max mem: 33300 Epoch: [22] [ 710/4276] eta: 3:00:40 lr: 2.4167616985961024e-05 loss: 0.1043 (0.1109) time: 2.9318 data: 0.0086 max mem: 33300 Epoch: [22] [ 720/4276] eta: 3:00:04 lr: 2.416476465792429e-05 loss: 0.1050 (0.1108) time: 2.9318 data: 0.0087 max mem: 33300 Epoch: [22] [ 730/4276] eta: 2:59:28 lr: 2.416191229247826e-05 loss: 0.0997 (0.1106) time: 2.9290 data: 0.0085 max mem: 33300 Epoch: [22] [ 740/4276] eta: 2:58:53 lr: 2.4159059889617537e-05 loss: 0.0950 (0.1105) time: 2.9284 data: 0.0083 max mem: 33300 Epoch: [22] [ 750/4276] eta: 2:58:17 lr: 2.415620744933672e-05 loss: 0.0982 (0.1105) time: 2.9283 data: 0.0083 max mem: 33300 Epoch: [22] [ 760/4276] eta: 2:57:42 lr: 2.415335497163041e-05 loss: 0.1011 (0.1105) time: 2.9288 data: 0.0083 max mem: 33300 Epoch: [22] [ 770/4276] eta: 2:57:07 lr: 2.415050245649319e-05 loss: 0.1086 (0.1105) time: 
2.9279 data: 0.0084 max mem: 33300 Epoch: [22] [ 780/4276] eta: 2:56:32 lr: 2.4147649903919676e-05 loss: 0.1086 (0.1105) time: 2.9269 data: 0.0084 max mem: 33300 Epoch: [22] [ 790/4276] eta: 2:55:57 lr: 2.4144797313904462e-05 loss: 0.1013 (0.1106) time: 2.9295 data: 0.0085 max mem: 33300 Epoch: [22] [ 800/4276] eta: 2:55:23 lr: 2.4141944686442128e-05 loss: 0.1016 (0.1105) time: 2.9333 data: 0.0085 max mem: 33300 Epoch: [22] [ 810/4276] eta: 2:54:49 lr: 2.413909202152727e-05 loss: 0.1041 (0.1108) time: 2.9350 data: 0.0085 max mem: 33300 Epoch: [22] [ 820/4276] eta: 2:54:13 lr: 2.4136239319154484e-05 loss: 0.1054 (0.1106) time: 2.9135 data: 0.0086 max mem: 33300 Epoch: [22] [ 830/4276] eta: 2:53:37 lr: 2.413338657931836e-05 loss: 0.1054 (0.1109) time: 2.8862 data: 0.0088 max mem: 33300 Epoch: [22] [ 840/4276] eta: 2:53:01 lr: 2.4130533802013477e-05 loss: 0.1123 (0.1110) time: 2.8788 data: 0.0083 max mem: 33300 Epoch: [22] [ 850/4276] eta: 2:52:24 lr: 2.4127680987234425e-05 loss: 0.1064 (0.1109) time: 2.8758 data: 0.0078 max mem: 33300 Epoch: [22] [ 860/4276] eta: 2:51:48 lr: 2.4124828134975784e-05 loss: 0.1046 (0.1109) time: 2.8743 data: 0.0078 max mem: 33300 Epoch: [22] [ 870/4276] eta: 2:51:13 lr: 2.4121975245232154e-05 loss: 0.1046 (0.1109) time: 2.8743 data: 0.0080 max mem: 33300 Epoch: [22] [ 880/4276] eta: 2:50:37 lr: 2.4119122317998094e-05 loss: 0.1081 (0.1110) time: 2.8751 data: 0.0082 max mem: 33300 Epoch: [22] [ 890/4276] eta: 2:50:02 lr: 2.4116269353268196e-05 loss: 0.1145 (0.1111) time: 2.8749 data: 0.0080 max mem: 33300 Epoch: [22] [ 900/4276] eta: 2:49:26 lr: 2.4113416351037036e-05 loss: 0.1164 (0.1112) time: 2.8744 data: 0.0078 max mem: 33300 Epoch: [22] [ 910/4276] eta: 2:48:52 lr: 2.4110563311299195e-05 loss: 0.1104 (0.1113) time: 2.8871 data: 0.0084 max mem: 33300 Epoch: [22] [ 920/4276] eta: 2:48:18 lr: 2.4107710234049237e-05 loss: 0.1112 (0.1113) time: 2.9074 data: 0.0094 max mem: 33300 Epoch: [22] [ 930/4276] eta: 2:47:46 lr: 
2.4104857119281747e-05 loss: 0.1140 (0.1114) time: 2.9252 data: 0.0098 max mem: 33300 Epoch: [22] [ 940/4276] eta: 2:47:13 lr: 2.4102003966991297e-05 loss: 0.1082 (0.1114) time: 2.9379 data: 0.0096 max mem: 33300 Epoch: [22] [ 950/4276] eta: 2:46:41 lr: 2.4099150777172456e-05 loss: 0.1138 (0.1115) time: 2.9402 data: 0.0091 max mem: 33300 Epoch: [22] [ 960/4276] eta: 2:46:08 lr: 2.409629754981979e-05 loss: 0.1133 (0.1115) time: 2.9317 data: 0.0090 max mem: 33300 Epoch: [22] [ 970/4276] eta: 2:45:36 lr: 2.4093444284927862e-05 loss: 0.1117 (0.1115) time: 2.9323 data: 0.0088 max mem: 33300 Epoch: [22] [ 980/4276] eta: 2:45:03 lr: 2.409059098249126e-05 loss: 0.1113 (0.1115) time: 2.9436 data: 0.0084 max mem: 33300 Epoch: [22] [ 990/4276] eta: 2:44:31 lr: 2.4087737642504524e-05 loss: 0.1045 (0.1114) time: 2.9360 data: 0.0083 max mem: 33300 Epoch: [22] [1000/4276] eta: 2:43:58 lr: 2.408488426496223e-05 loss: 0.0991 (0.1114) time: 2.9199 data: 0.0083 max mem: 33300 Epoch: [22] [1010/4276] eta: 2:43:24 lr: 2.408203084985894e-05 loss: 0.1043 (0.1114) time: 2.9051 data: 0.0090 max mem: 33300 Epoch: [22] [1020/4276] eta: 2:42:52 lr: 2.4079177397189214e-05 loss: 0.1043 (0.1114) time: 2.9118 data: 0.0094 max mem: 33300 Epoch: [22] [1030/4276] eta: 2:42:20 lr: 2.4076323906947607e-05 loss: 0.1050 (0.1114) time: 2.9280 data: 0.0084 max mem: 33300 Epoch: [22] [1040/4276] eta: 2:41:47 lr: 2.4073470379128678e-05 loss: 0.1162 (0.1113) time: 2.9292 data: 0.0075 max mem: 33300 Epoch: [22] [1050/4276] eta: 2:41:15 lr: 2.407061681372698e-05 loss: 0.0964 (0.1115) time: 2.9283 data: 0.0076 max mem: 33300 Epoch: [22] [1060/4276] eta: 2:40:43 lr: 2.4067763210737083e-05 loss: 0.1268 (0.1116) time: 2.9280 data: 0.0075 max mem: 33300 Epoch: [22] [1070/4276] eta: 2:40:11 lr: 2.406490957015352e-05 loss: 0.1229 (0.1118) time: 2.9287 data: 0.0075 max mem: 33300 Epoch: [22] [1080/4276] eta: 2:39:39 lr: 2.406205589197085e-05 loss: 0.1148 (0.1118) time: 2.9275 data: 0.0076 max mem: 33300 Epoch: [22] 
[1090/4276] eta: 2:39:07 lr: 2.405920217618362e-05 loss: 0.1232 (0.1119) time: 2.9283 data: 0.0075 max mem: 33300 Epoch: [22] [1100/4276] eta: 2:38:35 lr: 2.405634842278639e-05 loss: 0.1138 (0.1120) time: 2.9287 data: 0.0074 max mem: 33300 Epoch: [22] [1110/4276] eta: 2:38:03 lr: 2.4053494631773692e-05 loss: 0.1101 (0.1121) time: 2.9280 data: 0.0074 max mem: 33300 Epoch: [22] [1120/4276] eta: 2:37:31 lr: 2.4050640803140082e-05 loss: 0.1112 (0.1121) time: 2.9282 data: 0.0075 max mem: 33300 Epoch: [22] [1130/4276] eta: 2:36:59 lr: 2.4047786936880096e-05 loss: 0.1039 (0.1120) time: 2.9256 data: 0.0074 max mem: 33300 Epoch: [22] [1140/4276] eta: 2:36:27 lr: 2.404493303298829e-05 loss: 0.1027 (0.1120) time: 2.9242 data: 0.0074 max mem: 33300 Epoch: [22] [1150/4276] eta: 2:35:56 lr: 2.4042079091459183e-05 loss: 0.1037 (0.1119) time: 2.9288 data: 0.0075 max mem: 33300 Epoch: [22] [1160/4276] eta: 2:35:24 lr: 2.403922511228733e-05 loss: 0.1071 (0.1119) time: 2.9315 data: 0.0077 max mem: 33300 Epoch: [22] [1170/4276] eta: 2:34:53 lr: 2.4036371095467264e-05 loss: 0.1156 (0.1120) time: 2.9301 data: 0.0077 max mem: 33300 Epoch: [22] [1180/4276] eta: 2:34:21 lr: 2.4033517040993528e-05 loss: 0.1156 (0.1121) time: 2.9276 data: 0.0075 max mem: 33300 Epoch: [22] [1190/4276] eta: 2:33:49 lr: 2.4030662948860646e-05 loss: 0.0953 (0.1120) time: 2.9274 data: 0.0075 max mem: 33300 Epoch: [22] [1200/4276] eta: 2:33:18 lr: 2.4027808819063156e-05 loss: 0.0925 (0.1119) time: 2.9287 data: 0.0075 max mem: 33300 Epoch: [22] [1210/4276] eta: 2:32:46 lr: 2.402495465159559e-05 loss: 0.0925 (0.1118) time: 2.9282 data: 0.0075 max mem: 33300 Epoch: [22] [1220/4276] eta: 2:32:15 lr: 2.4022100446452484e-05 loss: 0.1000 (0.1118) time: 2.9266 data: 0.0075 max mem: 33300 Epoch: [22] [1230/4276] eta: 2:31:44 lr: 2.4019246203628358e-05 loss: 0.1065 (0.1118) time: 2.9266 data: 0.0075 max mem: 33300 Epoch: [22] [1240/4276] eta: 2:31:12 lr: 2.4016391923117737e-05 loss: 0.1031 (0.1118) time: 2.9277 data: 0.0075 
max mem: 33300 Epoch: [22] [1250/4276] eta: 2:30:41 lr: 2.4013537604915157e-05 loss: 0.1055 (0.1119) time: 2.9283 data: 0.0075 max mem: 33300 Epoch: [22] [1260/4276] eta: 2:30:10 lr: 2.401068324901514e-05 loss: 0.1055 (0.1118) time: 2.9285 data: 0.0075 max mem: 33301 Epoch: [22] [1270/4276] eta: 2:29:39 lr: 2.4007828855412198e-05 loss: 0.0933 (0.1117) time: 2.9377 data: 0.0075 max mem: 33301 Epoch: [22] [1280/4276] eta: 2:29:07 lr: 2.4004974424100862e-05 loss: 0.1078 (0.1118) time: 2.9376 data: 0.0075 max mem: 33301 Epoch: [22] [1290/4276] eta: 2:28:36 lr: 2.4002119955075656e-05 loss: 0.1134 (0.1117) time: 2.9280 data: 0.0075 max mem: 33301 Epoch: [22] [1300/4276] eta: 2:28:05 lr: 2.3999265448331083e-05 loss: 0.0934 (0.1116) time: 2.9353 data: 0.0075 max mem: 33301 Epoch: [22] [1310/4276] eta: 2:27:34 lr: 2.3996410903861675e-05 loss: 0.0899 (0.1115) time: 2.9357 data: 0.0075 max mem: 33301 Epoch: [22] [1320/4276] eta: 2:27:03 lr: 2.3993556321661934e-05 loss: 0.1086 (0.1115) time: 2.9299 data: 0.0075 max mem: 33301 Epoch: [22] [1330/4276] eta: 2:26:32 lr: 2.3990701701726386e-05 loss: 0.1104 (0.1115) time: 2.9314 data: 0.0076 max mem: 33301 Epoch: [22] [1340/4276] eta: 2:26:01 lr: 2.398784704404953e-05 loss: 0.1038 (0.1114) time: 2.9331 data: 0.0075 max mem: 33301 Epoch: [22] [1350/4276] eta: 2:25:30 lr: 2.3984992348625886e-05 loss: 0.1023 (0.1114) time: 2.9326 data: 0.0076 max mem: 33301 Epoch: [22] [1360/4276] eta: 2:24:59 lr: 2.398213761544996e-05 loss: 0.1025 (0.1114) time: 2.9307 data: 0.0075 max mem: 33301 Epoch: [22] [1370/4276] eta: 2:24:28 lr: 2.3979282844516263e-05 loss: 0.1076 (0.1114) time: 2.9292 data: 0.0075 max mem: 33301 Epoch: [22] [1380/4276] eta: 2:23:57 lr: 2.397642803581929e-05 loss: 0.1151 (0.1115) time: 2.9245 data: 0.0076 max mem: 33301 Epoch: [22] [1390/4276] eta: 2:23:26 lr: 2.3973573189353558e-05 loss: 0.1151 (0.1115) time: 2.9156 data: 0.0080 max mem: 33301 Epoch: [22] [1400/4276] eta: 2:22:55 lr: 2.397071830511356e-05 loss: 0.1068 
(0.1115) time: 2.9170 data: 0.0080 max mem: 33301 Epoch: [22] [1410/4276] eta: 2:22:24 lr: 2.396786338309381e-05 loss: 0.1068 (0.1116) time: 2.9264 data: 0.0081 max mem: 33301 Epoch: [22] [1420/4276] eta: 2:21:53 lr: 2.396500842328879e-05 loss: 0.1046 (0.1116) time: 2.9296 data: 0.0079 max mem: 33301 Epoch: [22] [1430/4276] eta: 2:21:22 lr: 2.3962153425693013e-05 loss: 0.0997 (0.1115) time: 2.9293 data: 0.0075 max mem: 33301 Epoch: [22] [1440/4276] eta: 2:20:51 lr: 2.3959298390300965e-05 loss: 0.1053 (0.1115) time: 2.9295 data: 0.0076 max mem: 33301 Epoch: [22] [1450/4276] eta: 2:20:21 lr: 2.3956443317107157e-05 loss: 0.1059 (0.1114) time: 2.9309 data: 0.0076 max mem: 33301 Epoch: [22] [1460/4276] eta: 2:19:50 lr: 2.3953588206106065e-05 loss: 0.0887 (0.1114) time: 2.9361 data: 0.0077 max mem: 33301 Epoch: [22] [1470/4276] eta: 2:19:19 lr: 2.3950733057292188e-05 loss: 0.1063 (0.1113) time: 2.9352 data: 0.0079 max mem: 33301 Epoch: [22] [1480/4276] eta: 2:18:48 lr: 2.394787787066002e-05 loss: 0.1007 (0.1113) time: 2.9274 data: 0.0082 max mem: 33301 Epoch: [22] [1490/4276] eta: 2:18:18 lr: 2.394502264620405e-05 loss: 0.0905 (0.1112) time: 2.9292 data: 0.0080 max mem: 33301 Epoch: [22] [1500/4276] eta: 2:17:46 lr: 2.3942167383918756e-05 loss: 0.0949 (0.1111) time: 2.9043 data: 0.0080 max mem: 33301 Epoch: [22] [1510/4276] eta: 2:17:15 lr: 2.3939312083798636e-05 loss: 0.0968 (0.1110) time: 2.8769 data: 0.0083 max mem: 33301 Epoch: [22] [1520/4276] eta: 2:16:43 lr: 2.3936456745838165e-05 loss: 0.0977 (0.1110) time: 2.8781 data: 0.0082 max mem: 33301 Epoch: [22] [1530/4276] eta: 2:16:12 lr: 2.3933601370031835e-05 loss: 0.0935 (0.1109) time: 2.8982 data: 0.0087 max mem: 33301 Epoch: [22] [1540/4276] eta: 2:15:42 lr: 2.393074595637412e-05 loss: 0.1026 (0.1109) time: 2.9245 data: 0.0093 max mem: 33301 Epoch: [22] [1550/4276] eta: 2:15:11 lr: 2.39278905048595e-05 loss: 0.1131 (0.1110) time: 2.9359 data: 0.0091 max mem: 33301 Epoch: [22] [1560/4276] eta: 2:14:40 lr: 
2.3925035015482462e-05 loss: 0.0964 (0.1109) time: 2.9091 data: 0.0092 max mem: 33301 Epoch: [22] [1570/4276] eta: 2:14:08 lr: 2.3922179488237474e-05 loss: 0.0979 (0.1108) time: 2.8777 data: 0.0090 max mem: 33301 Epoch: [22] [1580/4276] eta: 2:13:38 lr: 2.391932392311901e-05 loss: 0.0996 (0.1107) time: 2.8970 data: 0.0086 max mem: 33301 Epoch: [22] [1590/4276] eta: 2:13:07 lr: 2.3916468320121547e-05 loss: 0.1030 (0.1108) time: 2.9236 data: 0.0096 max mem: 33301 Epoch: [22] [1600/4276] eta: 2:12:37 lr: 2.3913612679239563e-05 loss: 0.1043 (0.1108) time: 2.9330 data: 0.0108 max mem: 33301 Epoch: [22] [1610/4276] eta: 2:12:06 lr: 2.3910757000467522e-05 loss: 0.0963 (0.1108) time: 2.9348 data: 0.0105 max mem: 33301 Epoch: [22] [1620/4276] eta: 2:11:36 lr: 2.3907901283799888e-05 loss: 0.1015 (0.1107) time: 2.9327 data: 0.0094 max mem: 33301 Epoch: [22] [1630/4276] eta: 2:11:06 lr: 2.3905045529231138e-05 loss: 0.1033 (0.1108) time: 2.9310 data: 0.0085 max mem: 33301 Epoch: [22] [1640/4276] eta: 2:10:35 lr: 2.3902189736755742e-05 loss: 0.0985 (0.1106) time: 2.9300 data: 0.0083 max mem: 33301 Epoch: [22] [1650/4276] eta: 2:10:05 lr: 2.3899333906368148e-05 loss: 0.0973 (0.1106) time: 2.9315 data: 0.0085 max mem: 33301 Epoch: [22] [1660/4276] eta: 2:09:34 lr: 2.3896478038062832e-05 loss: 0.0973 (0.1105) time: 2.9320 data: 0.0087 max mem: 33301 Epoch: [22] [1670/4276] eta: 2:09:04 lr: 2.389362213183425e-05 loss: 0.0911 (0.1104) time: 2.9317 data: 0.0085 max mem: 33301 Epoch: [22] [1680/4276] eta: 2:08:34 lr: 2.389076618767687e-05 loss: 0.0911 (0.1104) time: 2.9324 data: 0.0083 max mem: 33301 Epoch: [22] [1690/4276] eta: 2:08:03 lr: 2.3887910205585135e-05 loss: 0.0967 (0.1103) time: 2.9199 data: 0.0086 max mem: 33301 Epoch: [22] [1700/4276] eta: 2:07:33 lr: 2.388505418555351e-05 loss: 0.1007 (0.1103) time: 2.9154 data: 0.0091 max mem: 33301 Epoch: [22] [1710/4276] eta: 2:07:02 lr: 2.3882198127576455e-05 loss: 0.1012 (0.1103) time: 2.9223 data: 0.0090 max mem: 33301 Epoch: [22] 
[1720/4276] eta: 2:06:32 lr: 2.3879342031648422e-05 loss: 0.1009 (0.1102) time: 2.9268 data: 0.0087 max mem: 33301 Epoch: [22] [1730/4276] eta: 2:06:02 lr: 2.3876485897763858e-05 loss: 0.0978 (0.1102) time: 2.9313 data: 0.0089 max mem: 33301 Epoch: [22] [1740/4276] eta: 2:05:31 lr: 2.387362972591721e-05 loss: 0.0978 (0.1102) time: 2.9322 data: 0.0093 max mem: 33301 Epoch: [22] [1750/4276] eta: 2:05:01 lr: 2.3870773516102936e-05 loss: 0.1074 (0.1102) time: 2.9312 data: 0.0092 max mem: 33301 Epoch: [22] [1760/4276] eta: 2:04:31 lr: 2.3867917268315486e-05 loss: 0.1035 (0.1101) time: 2.9315 data: 0.0088 max mem: 33301 Epoch: [22] [1770/4276] eta: 2:04:01 lr: 2.3865060982549295e-05 loss: 0.1011 (0.1101) time: 2.9332 data: 0.0089 max mem: 33301 Epoch: [22] [1780/4276] eta: 2:03:30 lr: 2.3862204658798812e-05 loss: 0.0991 (0.1101) time: 2.9348 data: 0.0093 max mem: 33301 Epoch: [22] [1790/4276] eta: 2:03:00 lr: 2.385934829705848e-05 loss: 0.0991 (0.1100) time: 2.9340 data: 0.0091 max mem: 33301 Epoch: [22] [1800/4276] eta: 2:02:30 lr: 2.3856491897322747e-05 loss: 0.1009 (0.1100) time: 2.9313 data: 0.0087 max mem: 33301 Epoch: [22] [1810/4276] eta: 2:02:00 lr: 2.385363545958604e-05 loss: 0.1042 (0.1100) time: 2.9315 data: 0.0089 max mem: 33301 Epoch: [22] [1820/4276] eta: 2:01:30 lr: 2.3850778983842804e-05 loss: 0.1081 (0.1101) time: 2.9326 data: 0.0093 max mem: 33301 Epoch: [22] [1830/4276] eta: 2:00:59 lr: 2.3847922470087476e-05 loss: 0.0993 (0.1101) time: 2.9324 data: 0.0090 max mem: 33301 Epoch: [22] [1840/4276] eta: 2:00:30 lr: 2.384506591831449e-05 loss: 0.0911 (0.1100) time: 2.9385 data: 0.0088 max mem: 33301 Epoch: [22] [1850/4276] eta: 1:59:59 lr: 2.3842209328518278e-05 loss: 0.0975 (0.1100) time: 2.9381 data: 0.0090 max mem: 33301 Epoch: [22] [1860/4276] eta: 1:59:29 lr: 2.3839352700693274e-05 loss: 0.1026 (0.1100) time: 2.9315 data: 0.0096 max mem: 33301 Epoch: [22] [1870/4276] eta: 1:58:59 lr: 2.3836496034833912e-05 loss: 0.1036 (0.1101) time: 2.9306 data: 
0.0093 max mem: 33301 Epoch: [22] [1880/4276] eta: 1:58:29 lr: 2.3833639330934612e-05 loss: 0.1088 (0.1101) time: 2.9292 data: 0.0088 max mem: 33301 Epoch: [22] [1890/4276] eta: 1:57:59 lr: 2.3830782588989804e-05 loss: 0.0996 (0.1101) time: 2.9302 data: 0.0091 max mem: 33301 Epoch: [22] [1900/4276] eta: 1:57:29 lr: 2.382792580899392e-05 loss: 0.1020 (0.1101) time: 2.9312 data: 0.0096 max mem: 33301 Epoch: [22] [1910/4276] eta: 1:56:59 lr: 2.3825068990941378e-05 loss: 0.1060 (0.1101) time: 2.9316 data: 0.0093 max mem: 33301 Epoch: [22] [1920/4276] eta: 1:56:28 lr: 2.3822212134826598e-05 loss: 0.0990 (0.1100) time: 2.9317 data: 0.0087 max mem: 33301 Epoch: [22] [1930/4276] eta: 1:55:58 lr: 2.381935524064401e-05 loss: 0.0936 (0.1099) time: 2.9309 data: 0.0089 max mem: 33301 Epoch: [22] [1940/4276] eta: 1:55:28 lr: 2.3816498308388026e-05 loss: 0.0961 (0.1099) time: 2.9317 data: 0.0093 max mem: 33301 Epoch: [22] [1950/4276] eta: 1:54:58 lr: 2.381364133805307e-05 loss: 0.1079 (0.1100) time: 2.9104 data: 0.0088 max mem: 33301 Epoch: [22] [1960/4276] eta: 1:54:27 lr: 2.381078432963355e-05 loss: 0.1078 (0.1099) time: 2.8803 data: 0.0082 max mem: 33301 Epoch: [22] [1970/4276] eta: 1:53:56 lr: 2.3807927283123883e-05 loss: 0.0842 (0.1098) time: 2.8747 data: 0.0080 max mem: 33301 Epoch: [22] [1980/4276] eta: 1:53:26 lr: 2.380507019851849e-05 loss: 0.0943 (0.1097) time: 2.8776 data: 0.0078 max mem: 33301 Epoch: [22] [1990/4276] eta: 1:52:55 lr: 2.380221307581178e-05 loss: 0.0946 (0.1097) time: 2.8807 data: 0.0080 max mem: 33301 Epoch: [22] [2000/4276] eta: 1:52:24 lr: 2.3799355914998157e-05 loss: 0.1037 (0.1098) time: 2.8832 data: 0.0080 max mem: 33301 Epoch: [22] [2010/4276] eta: 1:51:54 lr: 2.379649871607203e-05 loss: 0.1037 (0.1098) time: 2.8952 data: 0.0080 max mem: 33301 Epoch: [22] [2020/4276] eta: 1:51:24 lr: 2.379364147902781e-05 loss: 0.0951 (0.1098) time: 2.9199 data: 0.0091 max mem: 33301 Epoch: [22] [2030/4276] eta: 1:50:54 lr: 2.3790784203859904e-05 loss: 0.0938 
(0.1097) time: 2.9343 data: 0.0095 max mem: 33301 Epoch: [22] [2040/4276] eta: 1:50:24 lr: 2.378792689056271e-05 loss: 0.0878 (0.1097) time: 2.9357 data: 0.0091 max mem: 33301 Epoch: [22] [2050/4276] eta: 1:49:54 lr: 2.3785069539130632e-05 loss: 0.1059 (0.1097) time: 2.9357 data: 0.0095 max mem: 33301 Epoch: [22] [2060/4276] eta: 1:49:24 lr: 2.3782212149558073e-05 loss: 0.1084 (0.1097) time: 2.9346 data: 0.0100 max mem: 33301 Epoch: [22] [2070/4276] eta: 1:48:55 lr: 2.3779354721839437e-05 loss: 0.0958 (0.1097) time: 2.9365 data: 0.0097 max mem: 33301 Epoch: [22] [2080/4276] eta: 1:48:25 lr: 2.3776497255969108e-05 loss: 0.1025 (0.1098) time: 2.9336 data: 0.0092 max mem: 33301 Epoch: [22] [2090/4276] eta: 1:47:55 lr: 2.3773639751941487e-05 loss: 0.1084 (0.1098) time: 2.9291 data: 0.0084 max mem: 33301 Epoch: [22] [2100/4276] eta: 1:47:24 lr: 2.3770782209750976e-05 loss: 0.1026 (0.1097) time: 2.9164 data: 0.0083 max mem: 33301 Epoch: [22] [2110/4276] eta: 1:46:54 lr: 2.3767924629391964e-05 loss: 0.0992 (0.1097) time: 2.8911 data: 0.0092 max mem: 33301 Epoch: [22] [2120/4276] eta: 1:46:24 lr: 2.3765067010858834e-05 loss: 0.0825 (0.1095) time: 2.8825 data: 0.0090 max mem: 33301 Epoch: [22] [2130/4276] eta: 1:45:53 lr: 2.3762209354145985e-05 loss: 0.0898 (0.1095) time: 2.8848 data: 0.0083 max mem: 33301 Epoch: [22] [2140/4276] eta: 1:45:23 lr: 2.3759351659247804e-05 loss: 0.0928 (0.1094) time: 2.8846 data: 0.0083 max mem: 33301 Epoch: [22] [2150/4276] eta: 1:44:52 lr: 2.3756493926158675e-05 loss: 0.1021 (0.1094) time: 2.8857 data: 0.0091 max mem: 33301 Epoch: [22] [2160/4276] eta: 1:44:22 lr: 2.3753636154872986e-05 loss: 0.1063 (0.1094) time: 2.8840 data: 0.0090 max mem: 33301 Epoch: [22] [2170/4276] eta: 1:43:52 lr: 2.3750778345385114e-05 loss: 0.0974 (0.1094) time: 2.8855 data: 0.0090 max mem: 33301 Epoch: [22] [2180/4276] eta: 1:43:22 lr: 2.3747920497689453e-05 loss: 0.1019 (0.1094) time: 2.9071 data: 0.0098 max mem: 33301 Epoch: [22] [2190/4276] eta: 1:42:52 lr: 
2.3745062611780368e-05 loss: 0.0996 (0.1094) time: 2.9300 data: 0.0102 max mem: 33301 Epoch: [22] [2200/4276] eta: 1:42:22 lr: 2.374220468765224e-05 loss: 0.1159 (0.1094) time: 2.9355 data: 0.0104 max mem: 33301 Epoch: [22] [2210/4276] eta: 1:41:53 lr: 2.3739346725299456e-05 loss: 0.1180 (0.1094) time: 2.9372 data: 0.0104 max mem: 33301 Epoch: [22] [2220/4276] eta: 1:41:23 lr: 2.373648872471639e-05 loss: 0.1107 (0.1095) time: 2.9373 data: 0.0104 max mem: 33301 Epoch: [22] [2230/4276] eta: 1:40:53 lr: 2.373363068589741e-05 loss: 0.1107 (0.1094) time: 2.9425 data: 0.0100 max mem: 33301 Epoch: [22] [2240/4276] eta: 1:40:23 lr: 2.3730772608836884e-05 loss: 0.1012 (0.1094) time: 2.9373 data: 0.0096 max mem: 33301 Epoch: [22] [2250/4276] eta: 1:39:53 lr: 2.3727914493529193e-05 loss: 0.0924 (0.1094) time: 2.9308 data: 0.0095 max mem: 33301 Epoch: [22] [2260/4276] eta: 1:39:24 lr: 2.372505633996871e-05 loss: 0.0963 (0.1093) time: 2.9347 data: 0.0097 max mem: 33301 Epoch: [22] [2270/4276] eta: 1:38:54 lr: 2.3722198148149785e-05 loss: 0.0987 (0.1093) time: 2.9228 data: 0.0097 max mem: 33301 Epoch: [22] [2280/4276] eta: 1:38:23 lr: 2.3719339918066796e-05 loss: 0.1040 (0.1093) time: 2.8965 data: 0.0087 max mem: 33301 Epoch: [22] [2290/4276] eta: 1:37:53 lr: 2.3716481649714102e-05 loss: 0.1082 (0.1093) time: 2.8803 data: 0.0079 max mem: 33301 Epoch: [22] [2300/4276] eta: 1:37:23 lr: 2.3713623343086076e-05 loss: 0.1081 (0.1093) time: 2.8790 data: 0.0076 max mem: 33301 Epoch: [22] [2310/4276] eta: 1:36:53 lr: 2.371076499817707e-05 loss: 0.1081 (0.1094) time: 2.8783 data: 0.0074 max mem: 33301 Epoch: [22] [2320/4276] eta: 1:36:22 lr: 2.370790661498144e-05 loss: 0.1172 (0.1094) time: 2.8776 data: 0.0074 max mem: 33301 Epoch: [22] [2330/4276] eta: 1:35:52 lr: 2.3705048193493556e-05 loss: 0.1169 (0.1094) time: 2.8778 data: 0.0073 max mem: 33301 Epoch: [22] [2340/4276] eta: 1:35:22 lr: 2.370218973370777e-05 loss: 0.1139 (0.1095) time: 2.8940 data: 0.0082 max mem: 33301 Epoch: [22] 
[2350/4276] eta: 1:34:52 lr: 2.369933123561843e-05 loss: 0.1021 (0.1095) time: 2.9152 data: 0.0089 max mem: 33301 Epoch: [22] [2360/4276] eta: 1:34:23 lr: 2.3696472699219897e-05 loss: 0.0980 (0.1094) time: 2.9339 data: 0.0089 max mem: 33301 Epoch: [22] [2370/4276] eta: 1:33:53 lr: 2.3693614124506516e-05 loss: 0.1020 (0.1095) time: 2.9369 data: 0.0095 max mem: 33301 Epoch: [22] [2380/4276] eta: 1:33:23 lr: 2.369075551147265e-05 loss: 0.1158 (0.1095) time: 2.9294 data: 0.0093 max mem: 33301 Epoch: [22] [2390/4276] eta: 1:32:53 lr: 2.368789686011263e-05 loss: 0.1151 (0.1095) time: 2.9334 data: 0.0087 max mem: 33301 Epoch: [22] [2400/4276] eta: 1:32:24 lr: 2.368503817042081e-05 loss: 0.1027 (0.1095) time: 2.9339 data: 0.0087 max mem: 33301 Epoch: [22] [2410/4276] eta: 1:31:54 lr: 2.3682179442391543e-05 loss: 0.1061 (0.1095) time: 2.9346 data: 0.0089 max mem: 33301 Epoch: [22] [2420/4276] eta: 1:31:24 lr: 2.3679320676019164e-05 loss: 0.1061 (0.1095) time: 2.9333 data: 0.0089 max mem: 33301 Epoch: [22] [2430/4276] eta: 1:30:54 lr: 2.3676461871298018e-05 loss: 0.1078 (0.1095) time: 2.9287 data: 0.0085 max mem: 33301 Epoch: [22] [2440/4276] eta: 1:30:25 lr: 2.3673603028222448e-05 loss: 0.1142 (0.1095) time: 2.9275 data: 0.0081 max mem: 33301 Epoch: [22] [2450/4276] eta: 1:29:55 lr: 2.3670744146786782e-05 loss: 0.1059 (0.1095) time: 2.9244 data: 0.0088 max mem: 33301 Epoch: [22] [2460/4276] eta: 1:29:25 lr: 2.3667885226985375e-05 loss: 0.1082 (0.1095) time: 2.9265 data: 0.0088 max mem: 33301 Epoch: [22] [2470/4276] eta: 1:28:56 lr: 2.366502626881255e-05 loss: 0.1169 (0.1096) time: 2.9328 data: 0.0080 max mem: 33301 Epoch: [22] [2480/4276] eta: 1:28:26 lr: 2.3662167272262642e-05 loss: 0.1202 (0.1096) time: 2.9331 data: 0.0079 max mem: 33301 Epoch: [22] [2490/4276] eta: 1:27:56 lr: 2.365930823732999e-05 loss: 0.1105 (0.1096) time: 2.9327 data: 0.0078 max mem: 33301 Epoch: [22] [2500/4276] eta: 1:27:26 lr: 2.365644916400892e-05 loss: 0.1123 (0.1096) time: 2.9327 data: 0.0076 
max mem: 33301 Epoch: [22] [2510/4276] eta: 1:26:57 lr: 2.365359005229376e-05 loss: 0.1197 (0.1096) time: 2.9324 data: 0.0079 max mem: 33301 Epoch: [22] [2520/4276] eta: 1:26:27 lr: 2.365073090217884e-05 loss: 0.1008 (0.1096) time: 2.9282 data: 0.0078 max mem: 33301 Epoch: [22] [2530/4276] eta: 1:25:57 lr: 2.364787171365849e-05 loss: 0.0882 (0.1095) time: 2.9279 data: 0.0074 max mem: 33301 Epoch: [22] [2540/4276] eta: 1:25:28 lr: 2.3645012486727027e-05 loss: 0.0975 (0.1095) time: 2.9396 data: 0.0075 max mem: 33301 Epoch: [22] [2550/4276] eta: 1:24:58 lr: 2.3642153221378776e-05 loss: 0.1017 (0.1095) time: 2.9415 data: 0.0074 max mem: 33301 Epoch: [22] [2560/4276] eta: 1:24:28 lr: 2.3639293917608064e-05 loss: 0.0941 (0.1094) time: 2.9351 data: 0.0075 max mem: 33301 Epoch: [22] [2570/4276] eta: 1:23:59 lr: 2.363643457540921e-05 loss: 0.0956 (0.1094) time: 2.9343 data: 0.0075 max mem: 33301 Epoch: [22] [2580/4276] eta: 1:23:29 lr: 2.3633575194776526e-05 loss: 0.1017 (0.1094) time: 2.9268 data: 0.0087 max mem: 33301 Epoch: [22] [2590/4276] eta: 1:22:59 lr: 2.3630715775704325e-05 loss: 0.0910 (0.1093) time: 2.9269 data: 0.0087 max mem: 33301 Epoch: [22] [2600/4276] eta: 1:22:30 lr: 2.3627856318186935e-05 loss: 0.0905 (0.1093) time: 2.9307 data: 0.0074 max mem: 33301 Epoch: [22] [2610/4276] eta: 1:22:00 lr: 2.3624996822218664e-05 loss: 0.0931 (0.1092) time: 2.9219 data: 0.0081 max mem: 33301 Epoch: [22] [2620/4276] eta: 1:21:30 lr: 2.3622137287793823e-05 loss: 0.1074 (0.1093) time: 2.9239 data: 0.0082 max mem: 33301 Epoch: [22] [2630/4276] eta: 1:21:00 lr: 2.3619277714906714e-05 loss: 0.1019 (0.1092) time: 2.9312 data: 0.0075 max mem: 33301 Epoch: [22] [2640/4276] eta: 1:20:31 lr: 2.361641810355166e-05 loss: 0.0974 (0.1092) time: 2.9320 data: 0.0075 max mem: 33301 Epoch: [22] [2650/4276] eta: 1:20:01 lr: 2.3613558453722965e-05 loss: 0.0945 (0.1092) time: 2.9324 data: 0.0075 max mem: 33301 Epoch: [22] [2660/4276] eta: 1:19:31 lr: 2.361069876541492e-05 loss: 0.1013 (0.1092) 
time: 2.9318 data: 0.0076 max mem: 33301 Epoch: [22] [2670/4276] eta: 1:19:02 lr: 2.3607839038621844e-05 loss: 0.1070 (0.1092) time: 2.9323 data: 0.0075 max mem: 33301 Epoch: [22] [2680/4276] eta: 1:18:32 lr: 2.3604979273338035e-05 loss: 0.1078 (0.1092) time: 2.9320 data: 0.0075 max mem: 33301 Epoch: [22] [2690/4276] eta: 1:18:02 lr: 2.3602119469557797e-05 loss: 0.1057 (0.1092) time: 2.9313 data: 0.0075 max mem: 33301 Epoch: [22] [2700/4276] eta: 1:17:33 lr: 2.3599259627275417e-05 loss: 0.0987 (0.1092) time: 2.9306 data: 0.0080 max mem: 33301 Epoch: [22] [2710/4276] eta: 1:17:03 lr: 2.35963997464852e-05 loss: 0.1046 (0.1092) time: 2.9305 data: 0.0080 max mem: 33301 Epoch: [22] [2720/4276] eta: 1:16:34 lr: 2.3593539827181443e-05 loss: 0.1012 (0.1091) time: 2.9380 data: 0.0075 max mem: 33301 Epoch: [22] [2730/4276] eta: 1:16:04 lr: 2.3590679869358446e-05 loss: 0.1037 (0.1092) time: 2.9385 data: 0.0075 max mem: 33301 Epoch: [22] [2740/4276] eta: 1:15:34 lr: 2.3587819873010482e-05 loss: 0.1092 (0.1092) time: 2.9336 data: 0.0078 max mem: 33301 Epoch: [22] [2750/4276] eta: 1:15:05 lr: 2.358495983813186e-05 loss: 0.1095 (0.1092) time: 2.9342 data: 0.0077 max mem: 33301 Epoch: [22] [2760/4276] eta: 1:14:35 lr: 2.3582099764716862e-05 loss: 0.0990 (0.1092) time: 2.9280 data: 0.0074 max mem: 33301 Epoch: [22] [2770/4276] eta: 1:14:06 lr: 2.357923965275978e-05 loss: 0.0990 (0.1092) time: 2.9430 data: 0.0074 max mem: 33301 Epoch: [22] [2780/4276] eta: 1:13:36 lr: 2.357637950225489e-05 loss: 0.1003 (0.1092) time: 2.9486 data: 0.0077 max mem: 33301 Epoch: [22] [2790/4276] eta: 1:13:06 lr: 2.3573519313196486e-05 loss: 0.1049 (0.1092) time: 2.9319 data: 0.0077 max mem: 33301 Epoch: [22] [2800/4276] eta: 1:12:37 lr: 2.357065908557885e-05 loss: 0.0999 (0.1092) time: 2.9294 data: 0.0075 max mem: 33301 Epoch: [22] [2810/4276] eta: 1:12:07 lr: 2.3567798819396257e-05 loss: 0.0857 (0.1091) time: 2.9297 data: 0.0075 max mem: 33301 Epoch: [22] [2820/4276] eta: 1:11:37 lr: 
2.3564938514642988e-05 loss: 0.0843 (0.1090) time: 2.9325 data: 0.0075 max mem: 33301 Epoch: [22] [2830/4276] eta: 1:11:08 lr: 2.356207817131332e-05 loss: 0.0944 (0.1090) time: 2.9377 data: 0.0075 max mem: 33301 Epoch: [22] [2840/4276] eta: 1:10:38 lr: 2.355921778940154e-05 loss: 0.1085 (0.1090) time: 2.9390 data: 0.0076 max mem: 33301 Epoch: [22] [2850/4276] eta: 1:10:09 lr: 2.3556357368901913e-05 loss: 0.1110 (0.1091) time: 2.9358 data: 0.0076 max mem: 33301 Epoch: [22] [2860/4276] eta: 1:09:39 lr: 2.355349690980871e-05 loss: 0.1082 (0.1091) time: 2.9348 data: 0.0075 max mem: 33301 Epoch: [22] [2870/4276] eta: 1:09:09 lr: 2.3550636412116206e-05 loss: 0.1026 (0.1091) time: 2.9364 data: 0.0077 max mem: 33301 Epoch: [22] [2880/4276] eta: 1:08:40 lr: 2.3547775875818678e-05 loss: 0.1117 (0.1091) time: 2.9362 data: 0.0080 max mem: 33301 Epoch: [22] [2890/4276] eta: 1:08:10 lr: 2.3544915300910382e-05 loss: 0.1117 (0.1091) time: 2.9336 data: 0.0082 max mem: 33301 Epoch: [22] [2900/4276] eta: 1:07:41 lr: 2.3542054687385584e-05 loss: 0.0958 (0.1091) time: 2.9316 data: 0.0082 max mem: 33301 Epoch: [22] [2910/4276] eta: 1:07:11 lr: 2.353919403523856e-05 loss: 0.1039 (0.1091) time: 2.9323 data: 0.0079 max mem: 33301 Epoch: [22] [2920/4276] eta: 1:06:42 lr: 2.3536333344463567e-05 loss: 0.1112 (0.1091) time: 2.9417 data: 0.0082 max mem: 33301 Epoch: [22] [2930/4276] eta: 1:06:12 lr: 2.3533472615054866e-05 loss: 0.1001 (0.1091) time: 2.9413 data: 0.0083 max mem: 33301 Epoch: [22] [2940/4276] eta: 1:05:42 lr: 2.3530611847006718e-05 loss: 0.1001 (0.1091) time: 2.9486 data: 0.0080 max mem: 33301 Epoch: [22] [2950/4276] eta: 1:05:13 lr: 2.3527751040313382e-05 loss: 0.1057 (0.1091) time: 2.9517 data: 0.0079 max mem: 33301 Epoch: [22] [2960/4276] eta: 1:04:43 lr: 2.3524890194969114e-05 loss: 0.0979 (0.1090) time: 2.9387 data: 0.0082 max mem: 33301 Epoch: [22] [2970/4276] eta: 1:04:14 lr: 2.352202931096817e-05 loss: 0.1126 (0.1091) time: 2.9445 data: 0.0082 max mem: 33301 Epoch: [22] 
[2980/4276] eta: 1:03:44 lr: 2.3519168388304798e-05 loss: 0.1126 (0.1091) time: 2.9306 data: 0.0085 max mem: 33301 Epoch: [22] [2990/4276] eta: 1:03:15 lr: 2.351630742697326e-05 loss: 0.1054 (0.1090) time: 2.9233 data: 0.0086 max mem: 33301 Epoch: [22] [3000/4276] eta: 1:02:45 lr: 2.3513446426967804e-05 loss: 0.1019 (0.1090) time: 2.9343 data: 0.0084 max mem: 33301 Epoch: [22] [3010/4276] eta: 1:02:15 lr: 2.3510585388282665e-05 loss: 0.1044 (0.1090) time: 2.9335 data: 0.0080 max mem: 33301 Epoch: [22] [3020/4276] eta: 1:01:46 lr: 2.3507724310912104e-05 loss: 0.1044 (0.1090) time: 2.9329 data: 0.0076 max mem: 33301 Epoch: [22] [3030/4276] eta: 1:01:16 lr: 2.3504863194850363e-05 loss: 0.0999 (0.1090) time: 2.9301 data: 0.0078 max mem: 33301 Epoch: [22] [3040/4276] eta: 1:00:47 lr: 2.350200204009169e-05 loss: 0.1058 (0.1090) time: 2.9264 data: 0.0083 max mem: 33301 Epoch: [22] [3050/4276] eta: 1:00:17 lr: 2.3499140846630318e-05 loss: 0.1061 (0.1090) time: 2.9280 data: 0.0080 max mem: 33301 Epoch: [22] [3060/4276] eta: 0:59:47 lr: 2.349627961446049e-05 loss: 0.0893 (0.1090) time: 2.9320 data: 0.0075 max mem: 33301 Epoch: [22] [3070/4276] eta: 0:59:18 lr: 2.3493418343576445e-05 loss: 0.0949 (0.1090) time: 2.9307 data: 0.0078 max mem: 33301 Epoch: [22] [3080/4276] eta: 0:58:48 lr: 2.349055703397243e-05 loss: 0.1007 (0.1090) time: 2.9275 data: 0.0079 max mem: 33301 Epoch: [22] [3090/4276] eta: 0:58:19 lr: 2.3487695685642666e-05 loss: 0.0920 (0.1089) time: 2.9295 data: 0.0076 max mem: 33301 Epoch: [22] [3100/4276] eta: 0:57:49 lr: 2.348483429858139e-05 loss: 0.0873 (0.1089) time: 2.9306 data: 0.0075 max mem: 33301 Epoch: [22] [3110/4276] eta: 0:57:19 lr: 2.3481972872782847e-05 loss: 0.0954 (0.1088) time: 2.9280 data: 0.0076 max mem: 33301 Epoch: [22] [3120/4276] eta: 0:56:50 lr: 2.3479111408241248e-05 loss: 0.0910 (0.1088) time: 2.9313 data: 0.0075 max mem: 33301 Epoch: [22] [3130/4276] eta: 0:56:20 lr: 2.3476249904950836e-05 loss: 0.0924 (0.1088) time: 2.9307 data: 0.0075 
max mem: 33301 Epoch: [22] [3140/4276] eta: 0:55:51 lr: 2.3473388362905827e-05 loss: 0.0965 (0.1088) time: 2.9269 data: 0.0075 max mem: 33301 Epoch: [22] [3150/4276] eta: 0:55:21 lr: 2.3470526782100465e-05 loss: 0.1001 (0.1087) time: 2.9261 data: 0.0075 max mem: 33301 Epoch: [22] [3160/4276] eta: 0:54:52 lr: 2.3467665162528953e-05 loss: 0.0976 (0.1087) time: 2.9242 data: 0.0074 max mem: 33301 Epoch: [22] [3170/4276] eta: 0:54:22 lr: 2.3464803504185524e-05 loss: 0.1021 (0.1087) time: 2.9219 data: 0.0073 max mem: 33301 Epoch: [22] [3180/4276] eta: 0:53:52 lr: 2.34619418070644e-05 loss: 0.1033 (0.1087) time: 2.9209 data: 0.0073 max mem: 33301 Epoch: [22] [3190/4276] eta: 0:53:23 lr: 2.34590800711598e-05 loss: 0.0990 (0.1087) time: 2.9231 data: 0.0074 max mem: 33301 Epoch: [22] [3200/4276] eta: 0:52:53 lr: 2.3456218296465933e-05 loss: 0.0999 (0.1087) time: 2.9270 data: 0.0075 max mem: 33301 Epoch: [22] [3210/4276] eta: 0:52:24 lr: 2.3453356482977022e-05 loss: 0.1045 (0.1087) time: 2.9476 data: 0.0075 max mem: 33301 Epoch: [22] [3220/4276] eta: 0:51:54 lr: 2.345049463068728e-05 loss: 0.1086 (0.1087) time: 2.9481 data: 0.0075 max mem: 33301 Epoch: [22] [3230/4276] eta: 0:51:25 lr: 2.3447632739590927e-05 loss: 0.1084 (0.1087) time: 2.9281 data: 0.0075 max mem: 33301 Epoch: [22] [3240/4276] eta: 0:50:55 lr: 2.344477080968216e-05 loss: 0.1115 (0.1088) time: 2.9274 data: 0.0075 max mem: 33301 Epoch: [22] [3250/4276] eta: 0:50:26 lr: 2.3441908840955193e-05 loss: 0.1076 (0.1088) time: 2.9282 data: 0.0074 max mem: 33301 Epoch: [22] [3260/4276] eta: 0:49:56 lr: 2.3439046833404234e-05 loss: 0.1070 (0.1089) time: 2.9272 data: 0.0075 max mem: 33301 Epoch: [22] [3270/4276] eta: 0:49:26 lr: 2.3436184787023496e-05 loss: 0.1073 (0.1089) time: 2.9256 data: 0.0075 max mem: 33301 Epoch: [22] [3280/4276] eta: 0:48:57 lr: 2.343332270180717e-05 loss: 0.1031 (0.1089) time: 2.9227 data: 0.0078 max mem: 33301 Epoch: [22] [3290/4276] eta: 0:48:27 lr: 2.3430460577749473e-05 loss: 0.1031 (0.1089) 
time: 2.9236 data: 0.0078 max mem: 33301 Epoch: [22] [3300/4276] eta: 0:47:58 lr: 2.342759841484459e-05 loss: 0.1142 (0.1089) time: 2.9271 data: 0.0076 max mem: 33301 Epoch: [22] [3310/4276] eta: 0:47:28 lr: 2.3424736213086738e-05 loss: 0.1123 (0.1089) time: 2.9283 data: 0.0075 max mem: 33301 Epoch: [22] [3320/4276] eta: 0:46:59 lr: 2.34218739724701e-05 loss: 0.1114 (0.1090) time: 2.9296 data: 0.0075 max mem: 33301 Epoch: [22] [3330/4276] eta: 0:46:29 lr: 2.3419011692988876e-05 loss: 0.1096 (0.1090) time: 2.9263 data: 0.0079 max mem: 33301 Epoch: [22] [3340/4276] eta: 0:46:00 lr: 2.3416149374637263e-05 loss: 0.0980 (0.1090) time: 2.9300 data: 0.0079 max mem: 33301 Epoch: [22] [3350/4276] eta: 0:45:30 lr: 2.341328701740946e-05 loss: 0.0967 (0.1090) time: 2.9373 data: 0.0075 max mem: 33301 Epoch: [22] [3360/4276] eta: 0:45:00 lr: 2.3410424621299644e-05 loss: 0.1015 (0.1090) time: 2.9368 data: 0.0075 max mem: 33301 Epoch: [22] [3370/4276] eta: 0:44:31 lr: 2.3407562186302005e-05 loss: 0.1073 (0.1090) time: 2.9366 data: 0.0075 max mem: 33301 Epoch: [22] [3380/4276] eta: 0:44:01 lr: 2.3404699712410745e-05 loss: 0.1213 (0.1091) time: 2.9374 data: 0.0075 max mem: 33301 Epoch: [22] [3390/4276] eta: 0:43:32 lr: 2.3401837199620043e-05 loss: 0.1147 (0.1091) time: 2.9405 data: 0.0075 max mem: 33301 Epoch: [22] [3400/4276] eta: 0:43:02 lr: 2.3398974647924073e-05 loss: 0.1099 (0.1091) time: 2.9423 data: 0.0075 max mem: 33301 Epoch: [22] [3410/4276] eta: 0:42:33 lr: 2.3396112057317033e-05 loss: 0.1030 (0.1091) time: 2.9423 data: 0.0074 max mem: 33301 Epoch: [22] [3420/4276] eta: 0:42:03 lr: 2.3393249427793098e-05 loss: 0.1055 (0.1091) time: 2.9445 data: 0.0077 max mem: 33301 Epoch: [22] [3430/4276] eta: 0:41:34 lr: 2.3390386759346444e-05 loss: 0.1047 (0.1091) time: 2.9444 data: 0.0081 max mem: 33301 Epoch: [22] [3440/4276] eta: 0:41:04 lr: 2.3387524051971253e-05 loss: 0.1018 (0.1091) time: 2.9405 data: 0.0083 max mem: 33301 Epoch: [22] [3450/4276] eta: 0:40:35 lr: 
2.33846613056617e-05 loss: 0.1001 (0.1091) time: 2.9368 data: 0.0082 max mem: 33301 Epoch: [22] [3460/4276] eta: 0:40:05 lr: 2.338179852041196e-05 loss: 0.1178 (0.1091) time: 2.9366 data: 0.0080 max mem: 33301 Epoch: [22] [3470/4276] eta: 0:39:36 lr: 2.33789356962162e-05 loss: 0.1101 (0.1091) time: 2.9388 data: 0.0080 max mem: 33301 Epoch: [22] [3480/4276] eta: 0:39:06 lr: 2.3376072833068598e-05 loss: 0.1112 (0.1092) time: 2.9230 data: 0.0082 max mem: 33301 Epoch: [22] [3490/4276] eta: 0:38:37 lr: 2.337320993096332e-05 loss: 0.1103 (0.1092) time: 2.8987 data: 0.0085 max mem: 33301 Epoch: [22] [3500/4276] eta: 0:38:07 lr: 2.3370346989894545e-05 loss: 0.1041 (0.1092) time: 2.8864 data: 0.0081 max mem: 33301 Epoch: [22] [3510/4276] eta: 0:37:37 lr: 2.3367484009856423e-05 loss: 0.1022 (0.1091) time: 2.8813 data: 0.0088 max mem: 33301 Epoch: [22] [3520/4276] eta: 0:37:08 lr: 2.336462099084312e-05 loss: 0.1089 (0.1091) time: 2.8861 data: 0.0096 max mem: 33301 Epoch: [22] [3530/4276] eta: 0:36:38 lr: 2.3361757932848806e-05 loss: 0.1089 (0.1091) time: 2.9150 data: 0.0090 max mem: 33301 Epoch: [22] [3540/4276] eta: 0:36:09 lr: 2.3358894835867645e-05 loss: 0.1076 (0.1092) time: 2.9414 data: 0.0081 max mem: 33301 Epoch: [22] [3550/4276] eta: 0:35:39 lr: 2.3356031699893783e-05 loss: 0.0964 (0.1091) time: 2.9426 data: 0.0075 max mem: 33301 Epoch: [22] [3560/4276] eta: 0:35:10 lr: 2.3353168524921386e-05 loss: 0.1024 (0.1092) time: 2.9378 data: 0.0075 max mem: 33301 Epoch: [22] [3570/4276] eta: 0:34:40 lr: 2.3350305310944612e-05 loss: 0.1213 (0.1092) time: 2.9326 data: 0.0075 max mem: 33301 Epoch: [22] [3580/4276] eta: 0:34:11 lr: 2.3347442057957617e-05 loss: 0.1062 (0.1092) time: 2.9270 data: 0.0077 max mem: 33301 Epoch: [22] [3590/4276] eta: 0:33:41 lr: 2.3344578765954543e-05 loss: 0.0960 (0.1091) time: 2.9303 data: 0.0085 max mem: 33301 Epoch: [22] [3600/4276] eta: 0:33:12 lr: 2.334171543492955e-05 loss: 0.0973 (0.1092) time: 2.9350 data: 0.0088 max mem: 33301 Epoch: [22] 
[3610/4276] eta: 0:32:42 lr: 2.3338852064876784e-05 loss: 0.0996 (0.1091) time: 2.9348 data: 0.0089 max mem: 33301 Epoch: [22] [3620/4276] eta: 0:32:13 lr: 2.33359886557904e-05 loss: 0.0986 (0.1091) time: 2.9369 data: 0.0088 max mem: 33301 Epoch: [22] [3630/4276] eta: 0:31:43 lr: 2.333312520766453e-05 loss: 0.0998 (0.1091) time: 2.9368 data: 0.0083 max mem: 33301 Epoch: [22] [3640/4276] eta: 0:31:14 lr: 2.3330261720493326e-05 loss: 0.0960 (0.1091) time: 2.9384 data: 0.0084 max mem: 33301 Epoch: [22] [3650/4276] eta: 0:30:44 lr: 2.3327398194270934e-05 loss: 0.0860 (0.1090) time: 2.9379 data: 0.0081 max mem: 33301 Epoch: [22] [3660/4276] eta: 0:30:15 lr: 2.3324534628991494e-05 loss: 0.0883 (0.1090) time: 2.9349 data: 0.0079 max mem: 33301 Epoch: [22] [3670/4276] eta: 0:29:45 lr: 2.3321671024649138e-05 loss: 0.0934 (0.1090) time: 2.9281 data: 0.0081 max mem: 33301 Epoch: [22] [3680/4276] eta: 0:29:16 lr: 2.331880738123801e-05 loss: 0.0908 (0.1089) time: 2.9183 data: 0.0080 max mem: 33301 Epoch: [22] [3690/4276] eta: 0:28:46 lr: 2.3315943698752243e-05 loss: 0.1076 (0.1089) time: 2.8946 data: 0.0084 max mem: 33301 Epoch: [22] [3700/4276] eta: 0:28:17 lr: 2.331307997718598e-05 loss: 0.1047 (0.1089) time: 2.8851 data: 0.0084 max mem: 33301 Epoch: [22] [3710/4276] eta: 0:27:47 lr: 2.331021621653334e-05 loss: 0.1014 (0.1089) time: 2.9151 data: 0.0084 max mem: 33301 Epoch: [22] [3720/4276] eta: 0:27:18 lr: 2.330735241678846e-05 loss: 0.1032 (0.1089) time: 2.9329 data: 0.0081 max mem: 33301 Epoch: [22] [3730/4276] eta: 0:26:48 lr: 2.3304488577945475e-05 loss: 0.1086 (0.1089) time: 2.9312 data: 0.0074 max mem: 33301 Epoch: [22] [3740/4276] eta: 0:26:19 lr: 2.33016246999985e-05 loss: 0.0995 (0.1089) time: 2.9307 data: 0.0074 max mem: 33301 Epoch: [22] [3750/4276] eta: 0:25:49 lr: 2.329876078294167e-05 loss: 0.0995 (0.1089) time: 2.9325 data: 0.0075 max mem: 33301 Epoch: [22] [3760/4276] eta: 0:25:20 lr: 2.3295896826769108e-05 loss: 0.0920 (0.1089) time: 2.9257 data: 0.0082 max 
mem: 33301 Epoch: [22] [3770/4276] eta: 0:24:50 lr: 2.3293032831474937e-05 loss: 0.1007 (0.1089) time: 2.9110 data: 0.0093 max mem: 33301 Epoch: [22] [3780/4276] eta: 0:24:21 lr: 2.3290168797053275e-05 loss: 0.1066 (0.1089) time: 2.9176 data: 0.0091 max mem: 33301 Epoch: [22] [3790/4276] eta: 0:23:51 lr: 2.3287304723498238e-05 loss: 0.0973 (0.1089) time: 2.9324 data: 0.0084 max mem: 33301 Epoch: [22] [3800/4276] eta: 0:23:22 lr: 2.3284440610803952e-05 loss: 0.1037 (0.1089) time: 2.9347 data: 0.0083 max mem: 33301 Epoch: [22] [3810/4276] eta: 0:22:52 lr: 2.328157645896453e-05 loss: 0.1093 (0.1089) time: 2.9353 data: 0.0079 max mem: 33301 Epoch: [22] [3820/4276] eta: 0:22:23 lr: 2.3278712267974083e-05 loss: 0.0967 (0.1089) time: 2.9366 data: 0.0079 max mem: 33301 Epoch: [22] [3830/4276] eta: 0:21:53 lr: 2.3275848037826722e-05 loss: 0.0932 (0.1089) time: 2.9377 data: 0.0081 max mem: 33301 Epoch: [22] [3840/4276] eta: 0:21:24 lr: 2.3272983768516563e-05 loss: 0.0974 (0.1088) time: 2.9363 data: 0.0079 max mem: 33301 Epoch: [22] [3850/4276] eta: 0:20:55 lr: 2.3270119460037713e-05 loss: 0.0918 (0.1088) time: 2.9363 data: 0.0079 max mem: 33301 Epoch: [22] [3860/4276] eta: 0:20:25 lr: 2.3267255112384277e-05 loss: 0.1031 (0.1088) time: 2.9366 data: 0.0077 max mem: 33301 Epoch: [22] [3870/4276] eta: 0:19:56 lr: 2.3264390725550364e-05 loss: 0.1079 (0.1088) time: 2.9360 data: 0.0075 max mem: 33301 Epoch: [22] [3880/4276] eta: 0:19:26 lr: 2.3261526299530073e-05 loss: 0.0993 (0.1088) time: 2.9346 data: 0.0075 max mem: 33301 Epoch: [22] [3890/4276] eta: 0:18:57 lr: 2.3258661834317517e-05 loss: 0.1000 (0.1088) time: 2.9329 data: 0.0074 max mem: 33301 Epoch: [22] [3900/4276] eta: 0:18:27 lr: 2.3255797329906784e-05 loss: 0.1014 (0.1088) time: 2.9319 data: 0.0074 max mem: 33301 Epoch: [22] [3910/4276] eta: 0:17:58 lr: 2.3252932786291974e-05 loss: 0.0944 (0.1087) time: 2.9304 data: 0.0075 max mem: 33301 Epoch: [22] [3920/4276] eta: 0:17:28 lr: 2.325006820346719e-05 loss: 0.0917 (0.1087) 
time: 2.9302 data: 0.0075 max mem: 33301 Epoch: [22] [3930/4276] eta: 0:16:59 lr: 2.3247203581426534e-05 loss: 0.0954 (0.1087) time: 2.9307 data: 0.0075 max mem: 33301 Epoch: [22] [3940/4276] eta: 0:16:29 lr: 2.324433892016408e-05 loss: 0.0927 (0.1087) time: 2.9317 data: 0.0075 max mem: 33301 Epoch: [22] [3950/4276] eta: 0:16:00 lr: 2.3241474219673932e-05 loss: 0.0980 (0.1087) time: 2.9365 data: 0.0078 max mem: 33301 Epoch: [22] [3960/4276] eta: 0:15:30 lr: 2.3238609479950182e-05 loss: 0.0980 (0.1087) time: 2.9297 data: 0.0081 max mem: 33301 Epoch: [22] [3970/4276] eta: 0:15:01 lr: 2.323574470098692e-05 loss: 0.0966 (0.1087) time: 2.9248 data: 0.0080 max mem: 33301 Epoch: [22] [3980/4276] eta: 0:14:31 lr: 2.3232879882778224e-05 loss: 0.1012 (0.1087) time: 2.9304 data: 0.0077 max mem: 33301 Epoch: [22] [3990/4276] eta: 0:14:02 lr: 2.3230015025318185e-05 loss: 0.0999 (0.1087) time: 2.9293 data: 0.0075 max mem: 33301 Epoch: [22] [4000/4276] eta: 0:13:32 lr: 2.3227150128600882e-05 loss: 0.0917 (0.1086) time: 2.9289 data: 0.0075 max mem: 33301 Epoch: [22] [4010/4276] eta: 0:13:03 lr: 2.3224285192620408e-05 loss: 0.0918 (0.1086) time: 2.9298 data: 0.0075 max mem: 33301 Epoch: [22] [4020/4276] eta: 0:12:34 lr: 2.322142021737083e-05 loss: 0.0987 (0.1086) time: 2.9259 data: 0.0078 max mem: 33301 Epoch: [22] [4030/4276] eta: 0:12:04 lr: 2.3218555202846232e-05 loss: 0.1032 (0.1086) time: 2.9210 data: 0.0086 max mem: 33301 Epoch: [22] [4040/4276] eta: 0:11:35 lr: 2.3215690149040693e-05 loss: 0.1070 (0.1087) time: 2.9241 data: 0.0085 max mem: 33301 Epoch: [22] [4050/4276] eta: 0:11:05 lr: 2.3212825055948285e-05 loss: 0.1031 (0.1087) time: 2.9285 data: 0.0077 max mem: 33301 Epoch: [22] [4060/4276] eta: 0:10:36 lr: 2.3209959923563076e-05 loss: 0.0987 (0.1088) time: 2.9305 data: 0.0075 max mem: 33301 Epoch: [22] [4070/4276] eta: 0:10:06 lr: 2.320709475187915e-05 loss: 0.1112 (0.1088) time: 2.9294 data: 0.0075 max mem: 33301 Epoch: [22] [4080/4276] eta: 0:09:37 lr: 
2.320422954089057e-05 loss: 0.1144 (0.1088) time: 2.9272 data: 0.0075 max mem: 33301 Epoch: [22] [4090/4276] eta: 0:09:07 lr: 2.3201364290591403e-05 loss: 0.1129 (0.1088) time: 2.9423 data: 0.0077 max mem: 33301 Epoch: [22] [4100/4276] eta: 0:08:38 lr: 2.3198499000975713e-05 loss: 0.1129 (0.1088) time: 2.9435 data: 0.0077 max mem: 33301 Epoch: [22] [4110/4276] eta: 0:08:08 lr: 2.319563367203757e-05 loss: 0.1088 (0.1088) time: 2.9284 data: 0.0075 max mem: 33301 Epoch: [22] [4120/4276] eta: 0:07:39 lr: 2.319276830377104e-05 loss: 0.1014 (0.1088) time: 2.9283 data: 0.0074 max mem: 33301 Epoch: [22] [4130/4276] eta: 0:07:09 lr: 2.318990289617018e-05 loss: 0.0929 (0.1088) time: 2.9290 data: 0.0075 max mem: 33301 Epoch: [22] [4140/4276] eta: 0:06:40 lr: 2.3187037449229044e-05 loss: 0.0922 (0.1088) time: 2.9280 data: 0.0075 max mem: 33301 Epoch: [22] [4150/4276] eta: 0:06:11 lr: 2.3184171962941696e-05 loss: 0.1037 (0.1088) time: 2.9293 data: 0.0077 max mem: 33301 Epoch: [22] [4160/4276] eta: 0:05:41 lr: 2.31813064373022e-05 loss: 0.1093 (0.1088) time: 2.9322 data: 0.0080 max mem: 33301 Epoch: [22] [4170/4276] eta: 0:05:12 lr: 2.3178440872304593e-05 loss: 0.1117 (0.1088) time: 2.9329 data: 0.0077 max mem: 33301 Epoch: [22] [4180/4276] eta: 0:04:42 lr: 2.317557526794294e-05 loss: 0.1068 (0.1088) time: 2.9213 data: 0.0081 max mem: 33301 Epoch: [22] [4190/4276] eta: 0:04:13 lr: 2.317270962421129e-05 loss: 0.1047 (0.1088) time: 2.9126 data: 0.0089 max mem: 33301 Epoch: [22] [4200/4276] eta: 0:03:43 lr: 2.3169843941103692e-05 loss: 0.1158 (0.1088) time: 2.8980 data: 0.0091 max mem: 33301 Epoch: [22] [4210/4276] eta: 0:03:14 lr: 2.316697821861419e-05 loss: 0.1095 (0.1088) time: 2.8804 data: 0.0087 max mem: 33301 Epoch: [22] [4220/4276] eta: 0:02:44 lr: 2.316411245673683e-05 loss: 0.1117 (0.1089) time: 2.8796 data: 0.0087 max mem: 33301 Epoch: [22] [4230/4276] eta: 0:02:15 lr: 2.3161246655465662e-05 loss: 0.1125 (0.1089) time: 2.8935 data: 0.0091 max mem: 33301 Epoch: [22] 
[4240/4276] eta: 0:01:45 lr: 2.315838081479473e-05 loss: 0.1125 (0.1089) time: 2.8946 data: 0.0090 max mem: 33301 Epoch: [22] [4250/4276] eta: 0:01:16 lr: 2.3155514934718066e-05 loss: 0.1186 (0.1089) time: 2.8827 data: 0.0086 max mem: 33301 Epoch: [22] [4260/4276] eta: 0:00:47 lr: 2.3152649015229707e-05 loss: 0.1186 (0.1089) time: 2.8810 data: 0.0088 max mem: 33301 Epoch: [22] [4270/4276] eta: 0:00:17 lr: 2.31497830563237e-05 loss: 0.1130 (0.1090) time: 2.8744 data: 0.0082 max mem: 33301 Epoch: [22] Total time: 3:29:46 Test: [ 0/21770] eta: 8:53:34 time: 1.4706 data: 1.4283 max mem: 33301 Test: [ 100/21770] eta: 0:18:47 time: 0.0378 data: 0.0009 max mem: 33301 Test: [ 200/21770] eta: 0:16:11 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 300/21770] eta: 0:15:15 time: 0.0377 data: 0.0009 max mem: 33301 Test: [ 400/21770] eta: 0:14:46 time: 0.0378 data: 0.0009 max mem: 33301 Test: [ 500/21770] eta: 0:14:26 time: 0.0376 data: 0.0008 max mem: 33301 Test: [ 600/21770] eta: 0:14:12 time: 0.0378 data: 0.0009 max mem: 33301 Test: [ 700/21770] eta: 0:14:00 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 800/21770] eta: 0:13:51 time: 0.0377 data: 0.0009 max mem: 33301 Test: [ 900/21770] eta: 0:13:42 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 1000/21770] eta: 0:13:35 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 1100/21770] eta: 0:13:29 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 1200/21770] eta: 0:13:25 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 1300/21770] eta: 0:13:20 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 1400/21770] eta: 0:13:16 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 1500/21770] eta: 0:13:12 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 1600/21770] eta: 0:13:07 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 1700/21770] eta: 0:13:03 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 1800/21770] eta: 0:12:59 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 1900/21770] eta: 0:12:54 time: 0.0387 data: 0.0009 max mem: 33301 Test: 
[ 2000/21770] eta: 0:12:50 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 2100/21770] eta: 0:12:46 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 2200/21770] eta: 0:12:42 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 2300/21770] eta: 0:12:37 time: 0.0380 data: 0.0008 max mem: 33301 Test: [ 2400/21770] eta: 0:12:33 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 2500/21770] eta: 0:12:29 time: 0.0392 data: 0.0010 max mem: 33301 Test: [ 2600/21770] eta: 0:12:26 time: 0.0393 data: 0.0010 max mem: 33301 Test: [ 2700/21770] eta: 0:12:22 time: 0.0396 data: 0.0010 max mem: 33301 Test: [ 2800/21770] eta: 0:12:19 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 2900/21770] eta: 0:12:16 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 3000/21770] eta: 0:12:12 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 3100/21770] eta: 0:12:08 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 3200/21770] eta: 0:12:04 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 3300/21770] eta: 0:12:00 time: 0.0394 data: 0.0009 max mem: 33301 Test: [ 3400/21770] eta: 0:11:56 time: 0.0384 data: 0.0008 max mem: 33301 Test: [ 3500/21770] eta: 0:11:52 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 3600/21770] eta: 0:11:48 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 3700/21770] eta: 0:11:44 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 3800/21770] eta: 0:11:40 time: 0.0386 data: 0.0008 max mem: 33301 Test: [ 3900/21770] eta: 0:11:36 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 4000/21770] eta: 0:11:32 time: 0.0384 data: 0.0008 max mem: 33301 Test: [ 4100/21770] eta: 0:11:28 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 4200/21770] eta: 0:11:24 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 4300/21770] eta: 0:11:19 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 4400/21770] eta: 0:11:15 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 4500/21770] eta: 0:11:11 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 4600/21770] eta: 0:11:07 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 
4700/21770] eta: 0:11:03 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 4800/21770] eta: 0:10:59 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 4900/21770] eta: 0:10:55 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 5000/21770] eta: 0:10:51 time: 0.0401 data: 0.0009 max mem: 33301 Test: [ 5100/21770] eta: 0:10:47 time: 0.0399 data: 0.0009 max mem: 33301 Test: [ 5200/21770] eta: 0:10:44 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 5300/21770] eta: 0:10:40 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 5400/21770] eta: 0:10:36 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 5500/21770] eta: 0:10:32 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 5600/21770] eta: 0:10:28 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 5700/21770] eta: 0:10:23 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 5800/21770] eta: 0:10:19 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 5900/21770] eta: 0:10:15 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 6000/21770] eta: 0:10:11 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 6100/21770] eta: 0:10:07 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 6200/21770] eta: 0:10:03 time: 0.0382 data: 0.0008 max mem: 33301 Test: [ 6300/21770] eta: 0:09:59 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 6400/21770] eta: 0:09:55 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 6500/21770] eta: 0:09:51 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 6600/21770] eta: 0:09:47 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 6700/21770] eta: 0:09:43 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 6800/21770] eta: 0:09:39 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 6900/21770] eta: 0:09:35 time: 0.0387 data: 0.0008 max mem: 33301 Test: [ 7000/21770] eta: 0:09:31 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 7100/21770] eta: 0:09:28 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 7200/21770] eta: 0:09:24 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 7300/21770] eta: 0:09:20 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 
7400/21770] eta: 0:09:16 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7500/21770] eta: 0:09:12 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 7600/21770] eta: 0:09:08 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7700/21770] eta: 0:09:04 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7800/21770] eta: 0:09:00 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 7900/21770] eta: 0:08:56 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 8000/21770] eta: 0:08:52 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 8100/21770] eta: 0:08:48 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 8200/21770] eta: 0:08:44 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 8300/21770] eta: 0:08:40 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 8400/21770] eta: 0:08:36 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 8500/21770] eta: 0:08:32 time: 0.0382 data: 0.0008 max mem: 33301 Test: [ 8600/21770] eta: 0:08:28 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 8700/21770] eta: 0:08:25 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 8800/21770] eta: 0:08:21 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 8900/21770] eta: 0:08:17 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 9000/21770] eta: 0:08:13 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 9100/21770] eta: 0:08:09 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 9200/21770] eta: 0:08:05 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 9300/21770] eta: 0:08:01 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 9400/21770] eta: 0:07:57 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 9500/21770] eta: 0:07:53 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 9600/21770] eta: 0:07:49 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 9700/21770] eta: 0:07:45 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 9800/21770] eta: 0:07:42 time: 0.0396 data: 0.0009 max mem: 33301 Test: [ 9900/21770] eta: 0:07:38 time: 0.0400 data: 0.0009 max mem: 33301 Test: [10000/21770] eta: 0:07:34 time: 0.0394 data: 0.0009 max mem: 33301 Test: 
[10100/21770] eta: 0:07:30 time: 0.0394 data: 0.0009 max mem: 33301 Test: [10200/21770] eta: 0:07:27 time: 0.0397 data: 0.0009 max mem: 33301 Test: [10300/21770] eta: 0:07:23 time: 0.0401 data: 0.0009 max mem: 33301 Test: [10400/21770] eta: 0:07:19 time: 0.0391 data: 0.0009 max mem: 33301 Test: [10500/21770] eta: 0:07:15 time: 0.0392 data: 0.0009 max mem: 33301 Test: [10600/21770] eta: 0:07:11 time: 0.0392 data: 0.0009 max mem: 33301 Test: [10700/21770] eta: 0:07:08 time: 0.0389 data: 0.0009 max mem: 33301 Test: [10800/21770] eta: 0:07:04 time: 0.0387 data: 0.0009 max mem: 33301 Test: [10900/21770] eta: 0:07:00 time: 0.0390 data: 0.0009 max mem: 33301 Test: [11000/21770] eta: 0:06:56 time: 0.0388 data: 0.0009 max mem: 33301 Test: [11100/21770] eta: 0:06:52 time: 0.0390 data: 0.0009 max mem: 33301 Test: [11200/21770] eta: 0:06:48 time: 0.0388 data: 0.0009 max mem: 33301 Test: [11300/21770] eta: 0:06:45 time: 0.0387 data: 0.0009 max mem: 33301 Test: [11400/21770] eta: 0:06:41 time: 0.0388 data: 0.0009 max mem: 33301 Test: [11500/21770] eta: 0:06:37 time: 0.0389 data: 0.0009 max mem: 33301 Test: [11600/21770] eta: 0:06:33 time: 0.0387 data: 0.0009 max mem: 33301 Test: [11700/21770] eta: 0:06:29 time: 0.0381 data: 0.0009 max mem: 33301 Test: [11800/21770] eta: 0:06:25 time: 0.0381 data: 0.0009 max mem: 33301 Test: [11900/21770] eta: 0:06:21 time: 0.0396 data: 0.0009 max mem: 33301 Test: [12000/21770] eta: 0:06:18 time: 0.0400 data: 0.0009 max mem: 33301 Test: [12100/21770] eta: 0:06:14 time: 0.0398 data: 0.0009 max mem: 33301 Test: [12200/21770] eta: 0:06:10 time: 0.0400 data: 0.0009 max mem: 33301 Test: [12300/21770] eta: 0:06:06 time: 0.0399 data: 0.0009 max mem: 33301 Test: [12400/21770] eta: 0:06:02 time: 0.0393 data: 0.0009 max mem: 33301 Test: [12500/21770] eta: 0:05:59 time: 0.0400 data: 0.0009 max mem: 33301 Test: [12600/21770] eta: 0:05:55 time: 0.0391 data: 0.0009 max mem: 33301 Test: [12700/21770] eta: 0:05:51 time: 0.0393 data: 0.0009 max mem: 33301 Test: 
[12800/21770] eta: 0:05:47 time: 0.0397 data: 0.0009 max mem: 33301 Test: [12900/21770] eta: 0:05:43 time: 0.0392 data: 0.0009 max mem: 33301 Test: [13000/21770] eta: 0:05:40 time: 0.0392 data: 0.0009 max mem: 33301 Test: [13100/21770] eta: 0:05:36 time: 0.0399 data: 0.0009 max mem: 33301 Test: [13200/21770] eta: 0:05:32 time: 0.0400 data: 0.0009 max mem: 33301 Test: [13300/21770] eta: 0:05:28 time: 0.0398 data: 0.0009 max mem: 33301 Test: [13400/21770] eta: 0:05:24 time: 0.0393 data: 0.0009 max mem: 33301 Test: [13500/21770] eta: 0:05:20 time: 0.0399 data: 0.0009 max mem: 33301 Test: [13600/21770] eta: 0:05:17 time: 0.0391 data: 0.0009 max mem: 33301 Test: [13700/21770] eta: 0:05:13 time: 0.0397 data: 0.0008 max mem: 33301 Test: [13800/21770] eta: 0:05:09 time: 0.0392 data: 0.0008 max mem: 33301 Test: [13900/21770] eta: 0:05:05 time: 0.0388 data: 0.0009 max mem: 33301 Test: [14000/21770] eta: 0:05:01 time: 0.0382 data: 0.0009 max mem: 33301 Test: [14100/21770] eta: 0:04:57 time: 0.0382 data: 0.0009 max mem: 33301 Test: [14200/21770] eta: 0:04:53 time: 0.0383 data: 0.0009 max mem: 33301 Test: [14300/21770] eta: 0:04:49 time: 0.0384 data: 0.0009 max mem: 33301 Test: [14400/21770] eta: 0:04:45 time: 0.0386 data: 0.0009 max mem: 33301 Test: [14500/21770] eta: 0:04:42 time: 0.0388 data: 0.0009 max mem: 33301 Test: [14600/21770] eta: 0:04:38 time: 0.0386 data: 0.0009 max mem: 33301 Test: [14700/21770] eta: 0:04:34 time: 0.0386 data: 0.0009 max mem: 33301 Test: [14800/21770] eta: 0:04:30 time: 0.0387 data: 0.0009 max mem: 33301 Test: [14900/21770] eta: 0:04:26 time: 0.0387 data: 0.0009 max mem: 33301 Test: [15000/21770] eta: 0:04:22 time: 0.0390 data: 0.0009 max mem: 33301 Test: [15100/21770] eta: 0:04:18 time: 0.0386 data: 0.0009 max mem: 33301 Test: [15200/21770] eta: 0:04:14 time: 0.0390 data: 0.0009 max mem: 33301 Test: [15300/21770] eta: 0:04:11 time: 0.0385 data: 0.0009 max mem: 33301 Test: [15400/21770] eta: 0:04:07 time: 0.0392 data: 0.0009 max mem: 33301 Test: 
[15500/21770] eta: 0:04:03 time: 0.0382 data: 0.0009 max mem: 33301 Test: [15600/21770] eta: 0:03:59 time: 0.0388 data: 0.0009 max mem: 33301 Test: [15700/21770] eta: 0:03:55 time: 0.0385 data: 0.0009 max mem: 33301 Test: [15800/21770] eta: 0:03:51 time: 0.0388 data: 0.0009 max mem: 33301 Test: [15900/21770] eta: 0:03:47 time: 0.0385 data: 0.0009 max mem: 33301 Test: [16000/21770] eta: 0:03:43 time: 0.0390 data: 0.0009 max mem: 33301 Test: [16100/21770] eta: 0:03:39 time: 0.0390 data: 0.0009 max mem: 33301 Test: [16200/21770] eta: 0:03:36 time: 0.0387 data: 0.0009 max mem: 33301 Test: [16300/21770] eta: 0:03:32 time: 0.0390 data: 0.0009 max mem: 33301 Test: [16400/21770] eta: 0:03:28 time: 0.0385 data: 0.0009 max mem: 33301 Test: [16500/21770] eta: 0:03:24 time: 0.0389 data: 0.0009 max mem: 33301 Test: [16600/21770] eta: 0:03:20 time: 0.0388 data: 0.0009 max mem: 33301 Test: [16700/21770] eta: 0:03:16 time: 0.0386 data: 0.0009 max mem: 33301 Test: [16800/21770] eta: 0:03:12 time: 0.0386 data: 0.0009 max mem: 33301 Test: [16900/21770] eta: 0:03:08 time: 0.0389 data: 0.0009 max mem: 33301 Test: [17000/21770] eta: 0:03:05 time: 0.0387 data: 0.0009 max mem: 33301 Test: [17100/21770] eta: 0:03:01 time: 0.0386 data: 0.0009 max mem: 33301 Test: [17200/21770] eta: 0:02:57 time: 0.0386 data: 0.0009 max mem: 33301 Test: [17300/21770] eta: 0:02:53 time: 0.0389 data: 0.0009 max mem: 33301 Test: [17400/21770] eta: 0:02:49 time: 0.0387 data: 0.0009 max mem: 33301 Test: [17500/21770] eta: 0:02:45 time: 0.0387 data: 0.0009 max mem: 33301 Test: [17600/21770] eta: 0:02:41 time: 0.0385 data: 0.0009 max mem: 33301 Test: [17700/21770] eta: 0:02:37 time: 0.0381 data: 0.0009 max mem: 33301 Test: [17800/21770] eta: 0:02:33 time: 0.0382 data: 0.0009 max mem: 33301 Test: [17900/21770] eta: 0:02:30 time: 0.0380 data: 0.0009 max mem: 33301 Test: [18000/21770] eta: 0:02:26 time: 0.0383 data: 0.0009 max mem: 33301 Test: [18100/21770] eta: 0:02:22 time: 0.0381 data: 0.0009 max mem: 33301 Test: 
[18200/21770] eta: 0:02:18 time: 0.0382 data: 0.0009 max mem: 33301 Test: [18300/21770] eta: 0:02:14 time: 0.0383 data: 0.0008 max mem: 33301 Test: [18400/21770] eta: 0:02:10 time: 0.0382 data: 0.0009 max mem: 33301 Test: [18500/21770] eta: 0:02:06 time: 0.0381 data: 0.0009 max mem: 33301 Test: [18600/21770] eta: 0:02:02 time: 0.0382 data: 0.0009 max mem: 33301 Test: [18700/21770] eta: 0:01:58 time: 0.0381 data: 0.0009 max mem: 33301 Test: [18800/21770] eta: 0:01:55 time: 0.0382 data: 0.0009 max mem: 33301 Test: [18900/21770] eta: 0:01:51 time: 0.0381 data: 0.0009 max mem: 33301 Test: [19000/21770] eta: 0:01:47 time: 0.0382 data: 0.0009 max mem: 33301 Test: [19100/21770] eta: 0:01:43 time: 0.0382 data: 0.0008 max mem: 33301 Test: [19200/21770] eta: 0:01:39 time: 0.0385 data: 0.0009 max mem: 33301 Test: [19300/21770] eta: 0:01:35 time: 0.0384 data: 0.0009 max mem: 33301 Test: [19400/21770] eta: 0:01:31 time: 0.0386 data: 0.0009 max mem: 33301 Test: [19500/21770] eta: 0:01:27 time: 0.0383 data: 0.0009 max mem: 33301 Test: [19600/21770] eta: 0:01:24 time: 0.0386 data: 0.0009 max mem: 33301 Test: [19700/21770] eta: 0:01:20 time: 0.0385 data: 0.0009 max mem: 33301 Test: [19800/21770] eta: 0:01:16 time: 0.0391 data: 0.0008 max mem: 33301 Test: [19900/21770] eta: 0:01:12 time: 0.0383 data: 0.0009 max mem: 33301 Test: [20000/21770] eta: 0:01:08 time: 0.0385 data: 0.0009 max mem: 33301 Test: [20100/21770] eta: 0:01:04 time: 0.0383 data: 0.0009 max mem: 33301 Test: [20200/21770] eta: 0:01:00 time: 0.0387 data: 0.0009 max mem: 33301 Test: [20300/21770] eta: 0:00:56 time: 0.0381 data: 0.0009 max mem: 33301 Test: [20400/21770] eta: 0:00:53 time: 0.0382 data: 0.0008 max mem: 33301 Test: [20500/21770] eta: 0:00:49 time: 0.0381 data: 0.0009 max mem: 33301 Test: [20600/21770] eta: 0:00:45 time: 0.0382 data: 0.0009 max mem: 33301 Test: [20700/21770] eta: 0:00:41 time: 0.0380 data: 0.0009 max mem: 33301 Test: [20800/21770] eta: 0:00:37 time: 0.0384 data: 0.0009 max mem: 33301 Test: 
[20900/21770] eta: 0:00:33 time: 0.0385 data: 0.0009 max mem: 33301
Test: [21000/21770] eta: 0:00:29 time: 0.0385 data: 0.0009 max mem: 33301
Test: [21100/21770] eta: 0:00:25 time: 0.0384 data: 0.0009 max mem: 33301
Test: [21200/21770] eta: 0:00:22 time: 0.0387 data: 0.0009 max mem: 33301
Test: [21300/21770] eta: 0:00:18 time: 0.0386 data: 0.0009 max mem: 33301
Test: [21400/21770] eta: 0:00:14 time: 0.0395 data: 0.0010 max mem: 33301
Test: [21500/21770] eta: 0:00:10 time: 0.0393 data: 0.0010 max mem: 33301
Test: [21600/21770] eta: 0:00:06 time: 0.0396 data: 0.0010 max mem: 33301
Test: [21700/21770] eta: 0:00:02 time: 0.0385 data: 0.0009 max mem: 33301
Test: Total time: 0:14:03
Final results:
Mean IoU is 15.77
precision@0.5 = 2.83
precision@0.6 = 1.25
precision@0.7 = 0.44
precision@0.8 = 0.10
precision@0.9 = 0.00
overall IoU = 15.84
mean IoU = 15.77
Mean accuracy for one-to-zero sample is 0.00
Average object IoU 0.15767904766376536
Overall IoU 15.836902618408203
Better epoch: 22
Epoch: [23] [ 0/4276] eta: 6:06:48 lr: 2.3148063462057145e-05 loss: 0.0895 (0.0895) time: 5.1470 data: 2.0617 max mem: 33301
Epoch: [23] [ 10/4276] eta: 3:43:54 lr: 2.3145197440070494e-05 loss: 0.1044 (0.1073) time: 3.1492 data: 0.1931 max mem: 33301
Epoch: [23] [ 20/4276] eta: 3:36:31 lr: 2.3142331378650675e-05 loss: 0.1016 (0.1068) time: 2.9479 data: 0.0063 max mem: 33301
Epoch: [23] [ 30/4276] eta: 3:33:35 lr: 2.313946527779172e-05 loss: 0.1016 (0.1075) time: 2.9463 data: 0.0068 max mem: 33301
Epoch: [23] [ 40/4276] eta: 3:31:49 lr: 2.313659913748767e-05 loss: 0.1043 (0.1074) time: 2.9455 data: 0.0071 max mem: 33301
Epoch: [23] [ 50/4276] eta: 3:30:27 lr: 2.3133732957732546e-05 loss: 0.1020 (0.1067) time: 2.9413 data: 0.0068 max mem: 33301
Epoch: [23] [ 60/4276] eta: 3:29:19 lr: 2.313086673852037e-05 loss: 0.0923 (0.1047) time: 2.9353 data: 0.0068 max mem: 33301
Epoch: [23] [ 70/4276] eta: 3:28:23 lr: 2.3128000479845167e-05 loss: 0.0919 (0.1029) time: 2.9339 data: 0.0070 max mem: 33301
Epoch: [23] [ 80/4276] eta: 3:27:33 lr: 2.3125134181700964e-05 loss: 0.0957 (0.1044) time: 2.9346 data: 0.0074 max mem: 33301 Epoch: [23] [ 90/4276] eta: 3:26:48 lr: 2.3122267844081783e-05 loss: 0.1009 (0.1041) time: 2.9334 data: 0.0071 max mem: 33301 Epoch: [23] [ 100/4276] eta: 3:26:04 lr: 2.3119401466981634e-05 loss: 0.1024 (0.1049) time: 2.9311 data: 0.0065 max mem: 33301 Epoch: [23] [ 110/4276] eta: 3:25:23 lr: 2.3116535050394537e-05 loss: 0.1177 (0.1063) time: 2.9310 data: 0.0067 max mem: 33301 Epoch: [23] [ 120/4276] eta: 3:24:44 lr: 2.311366859431452e-05 loss: 0.0887 (0.1053) time: 2.9308 data: 0.0069 max mem: 33301 Epoch: [23] [ 130/4276] eta: 3:24:06 lr: 2.311080209873558e-05 loss: 0.0893 (0.1059) time: 2.9292 data: 0.0066 max mem: 33301 Epoch: [23] [ 140/4276] eta: 3:23:30 lr: 2.3107935563651737e-05 loss: 0.0982 (0.1054) time: 2.9305 data: 0.0065 max mem: 33301 Epoch: [23] [ 150/4276] eta: 3:22:51 lr: 2.3105068989056997e-05 loss: 0.0894 (0.1053) time: 2.9258 data: 0.0068 max mem: 33301 Epoch: [23] [ 160/4276] eta: 3:22:17 lr: 2.310220237494538e-05 loss: 0.0978 (0.1051) time: 2.9255 data: 0.0070 max mem: 33301 Epoch: [23] [ 170/4276] eta: 3:21:43 lr: 2.309933572131088e-05 loss: 0.1035 (0.1052) time: 2.9313 data: 0.0067 max mem: 33301 Epoch: [23] [ 180/4276] eta: 3:21:11 lr: 2.3096469028147505e-05 loss: 0.1053 (0.1060) time: 2.9324 data: 0.0067 max mem: 33301 Epoch: [23] [ 190/4276] eta: 3:20:39 lr: 2.309360229544926e-05 loss: 0.1096 (0.1062) time: 2.9347 data: 0.0070 max mem: 33301 Epoch: [23] [ 200/4276] eta: 3:20:07 lr: 2.3090735523210156e-05 loss: 0.1016 (0.1057) time: 2.9347 data: 0.0070 max mem: 33301 Epoch: [23] [ 210/4276] eta: 3:19:27 lr: 2.3087868711424174e-05 loss: 0.1009 (0.1057) time: 2.9138 data: 0.0065 max mem: 33301 Epoch: [23] [ 220/4276] eta: 3:18:53 lr: 2.3085001860085325e-05 loss: 0.0967 (0.1056) time: 2.9052 data: 0.0060 max mem: 33301 Epoch: [23] [ 230/4276] eta: 3:18:22 lr: 2.30821349691876e-05 loss: 0.0873 (0.1049) time: 2.9239 
data: 0.0067 max mem: 33301 Epoch: [23] [ 240/4276] eta: 3:17:47 lr: 2.3079268038725e-05 loss: 0.0925 (0.1048) time: 2.9222 data: 0.0068 max mem: 33301 Epoch: [23] [ 250/4276] eta: 3:17:16 lr: 2.3076401068691513e-05 loss: 0.1048 (0.1050) time: 2.9217 data: 0.0062 max mem: 33301 Epoch: [23] [ 260/4276] eta: 3:16:45 lr: 2.307353405908113e-05 loss: 0.1048 (0.1047) time: 2.9306 data: 0.0061 max mem: 33301 Epoch: [23] [ 270/4276] eta: 3:16:15 lr: 2.3070667009887838e-05 loss: 0.0841 (0.1046) time: 2.9306 data: 0.0061 max mem: 33301 Epoch: [23] [ 280/4276] eta: 3:15:43 lr: 2.306779992110564e-05 loss: 0.0933 (0.1044) time: 2.9284 data: 0.0064 max mem: 33301 Epoch: [23] [ 290/4276] eta: 3:15:11 lr: 2.3064932792728504e-05 loss: 0.0933 (0.1040) time: 2.9226 data: 0.0064 max mem: 33301 Epoch: [23] [ 300/4276] eta: 3:14:38 lr: 2.3062065624750416e-05 loss: 0.0963 (0.1038) time: 2.9128 data: 0.0061 max mem: 33301 Epoch: [23] [ 310/4276] eta: 3:14:07 lr: 2.3059198417165365e-05 loss: 0.0973 (0.1035) time: 2.9178 data: 0.0063 max mem: 33301 Epoch: [23] [ 320/4276] eta: 3:13:37 lr: 2.305633116996734e-05 loss: 0.0973 (0.1036) time: 2.9309 data: 0.0065 max mem: 33301 Epoch: [23] [ 330/4276] eta: 3:13:06 lr: 2.3053463883150297e-05 loss: 0.1070 (0.1039) time: 2.9273 data: 0.0065 max mem: 33301 Epoch: [23] [ 340/4276] eta: 3:12:31 lr: 2.3050596556708232e-05 loss: 0.1091 (0.1039) time: 2.9022 data: 0.0067 max mem: 33301 Epoch: [23] [ 350/4276] eta: 3:11:55 lr: 2.304772919063511e-05 loss: 0.1096 (0.1042) time: 2.8812 data: 0.0069 max mem: 33301 Epoch: [23] [ 360/4276] eta: 3:11:24 lr: 2.3044861784924917e-05 loss: 0.1182 (0.1049) time: 2.8978 data: 0.0069 max mem: 33301 Epoch: [23] [ 370/4276] eta: 3:10:53 lr: 2.3041994339571613e-05 loss: 0.0996 (0.1048) time: 2.9158 data: 0.0068 max mem: 33301 Epoch: [23] [ 380/4276] eta: 3:10:23 lr: 2.3039126854569174e-05 loss: 0.0996 (0.1050) time: 2.9229 data: 0.0064 max mem: 33301 Epoch: [23] [ 390/4276] eta: 3:09:54 lr: 2.3036259329911563e-05 loss: 
0.1098 (0.1053) time: 2.9310 data: 0.0062 max mem: 33301 Epoch: [23] [ 400/4276] eta: 3:09:25 lr: 2.303339176559276e-05 loss: 0.1194 (0.1056) time: 2.9327 data: 0.0061 max mem: 33301 Epoch: [23] [ 410/4276] eta: 3:08:56 lr: 2.3030524161606715e-05 loss: 0.1142 (0.1057) time: 2.9350 data: 0.0062 max mem: 33301 Epoch: [23] [ 420/4276] eta: 3:08:26 lr: 2.3027656517947396e-05 loss: 0.0979 (0.1057) time: 2.9327 data: 0.0069 max mem: 33301 Epoch: [23] [ 430/4276] eta: 3:07:57 lr: 2.302478883460877e-05 loss: 0.1018 (0.1060) time: 2.9294 data: 0.0068 max mem: 33301 Epoch: [23] [ 440/4276] eta: 3:07:27 lr: 2.302192111158479e-05 loss: 0.1061 (0.1058) time: 2.9304 data: 0.0062 max mem: 33301 Epoch: [23] [ 450/4276] eta: 3:06:58 lr: 2.3019053348869414e-05 loss: 0.0954 (0.1059) time: 2.9310 data: 0.0061 max mem: 33301 Epoch: [23] [ 460/4276] eta: 3:06:29 lr: 2.30161855464566e-05 loss: 0.0929 (0.1056) time: 2.9326 data: 0.0062 max mem: 33301 Epoch: [23] [ 470/4276] eta: 3:05:59 lr: 2.3013317704340312e-05 loss: 0.0866 (0.1052) time: 2.9319 data: 0.0065 max mem: 33301 Epoch: [23] [ 480/4276] eta: 3:05:29 lr: 2.3010449822514484e-05 loss: 0.0910 (0.1051) time: 2.9275 data: 0.0063 max mem: 33301 Epoch: [23] [ 490/4276] eta: 3:04:59 lr: 2.300758190097308e-05 loss: 0.0865 (0.1047) time: 2.9215 data: 0.0059 max mem: 33301 Epoch: [23] [ 500/4276] eta: 3:04:30 lr: 2.3004713939710047e-05 loss: 0.0862 (0.1044) time: 2.9252 data: 0.0059 max mem: 33301 Epoch: [23] [ 510/4276] eta: 3:04:00 lr: 2.300184593871933e-05 loss: 0.0928 (0.1045) time: 2.9316 data: 0.0062 max mem: 33301 Epoch: [23] [ 520/4276] eta: 3:03:32 lr: 2.299897789799488e-05 loss: 0.0898 (0.1044) time: 2.9366 data: 0.0062 max mem: 33301 Epoch: [23] [ 530/4276] eta: 3:03:02 lr: 2.299610981753063e-05 loss: 0.0898 (0.1043) time: 2.9355 data: 0.0060 max mem: 33301 Epoch: [23] [ 540/4276] eta: 3:02:33 lr: 2.2993241697320532e-05 loss: 0.0934 (0.1041) time: 2.9293 data: 0.0060 max mem: 33301 Epoch: [23] [ 550/4276] eta: 3:02:04 lr: 
2.2990373537358527e-05 loss: 0.1002 (0.1043) time: 2.9366 data: 0.0062 max mem: 33301 Epoch: [23] [ 560/4276] eta: 3:01:35 lr: 2.2987505337638548e-05 loss: 0.1111 (0.1043) time: 2.9362 data: 0.0062 max mem: 33301 Epoch: [23] [ 570/4276] eta: 3:01:05 lr: 2.298463709815453e-05 loss: 0.0946 (0.1043) time: 2.9288 data: 0.0061 max mem: 33301 Epoch: [23] [ 580/4276] eta: 3:00:36 lr: 2.2981768818900418e-05 loss: 0.1020 (0.1044) time: 2.9337 data: 0.0061 max mem: 33301 Epoch: [23] [ 590/4276] eta: 3:00:07 lr: 2.297890049987014e-05 loss: 0.0898 (0.1042) time: 2.9341 data: 0.0063 max mem: 33301 Epoch: [23] [ 600/4276] eta: 2:59:37 lr: 2.2976032141057622e-05 loss: 0.0876 (0.1041) time: 2.9283 data: 0.0065 max mem: 33301 Epoch: [23] [ 610/4276] eta: 2:59:08 lr: 2.29731637424568e-05 loss: 0.0890 (0.1042) time: 2.9290 data: 0.0063 max mem: 33301 Epoch: [23] [ 620/4276] eta: 2:58:38 lr: 2.2970295304061598e-05 loss: 0.0937 (0.1041) time: 2.9306 data: 0.0062 max mem: 33301 Epoch: [23] [ 630/4276] eta: 2:58:10 lr: 2.2967426825865954e-05 loss: 0.1048 (0.1043) time: 2.9359 data: 0.0063 max mem: 33301 Epoch: [23] [ 640/4276] eta: 2:57:40 lr: 2.2964558307863776e-05 loss: 0.1048 (0.1044) time: 2.9375 data: 0.0065 max mem: 33301 Epoch: [23] [ 650/4276] eta: 2:57:11 lr: 2.2961689750048997e-05 loss: 0.1018 (0.1045) time: 2.9314 data: 0.0064 max mem: 33301 Epoch: [23] [ 660/4276] eta: 2:56:42 lr: 2.295882115241553e-05 loss: 0.1121 (0.1048) time: 2.9312 data: 0.0064 max mem: 33301 Epoch: [23] [ 670/4276] eta: 2:56:12 lr: 2.2955952514957306e-05 loss: 0.1020 (0.1047) time: 2.9329 data: 0.0066 max mem: 33301 Epoch: [23] [ 680/4276] eta: 2:55:43 lr: 2.295308383766823e-05 loss: 0.1020 (0.1047) time: 2.9341 data: 0.0065 max mem: 33301 Epoch: [23] [ 690/4276] eta: 2:55:15 lr: 2.2950215120542225e-05 loss: 0.1033 (0.1047) time: 2.9407 data: 0.0063 max mem: 33301 Epoch: [23] [ 700/4276] eta: 2:54:46 lr: 2.2947346363573204e-05 loss: 0.1019 (0.1046) time: 2.9455 data: 0.0063 max mem: 33301 Epoch: [23] [ 
710/4276] eta: 2:54:17 lr: 2.2944477566755085e-05 loss: 0.0952 (0.1047) time: 2.9396 data: 0.0064 max mem: 33301 Epoch: [23] [ 720/4276] eta: 2:53:47 lr: 2.2941608730081762e-05 loss: 0.0924 (0.1045) time: 2.9334 data: 0.0064 max mem: 33301 Epoch: [23] [ 730/4276] eta: 2:53:18 lr: 2.2938739853547157e-05 loss: 0.0985 (0.1045) time: 2.9343 data: 0.0063 max mem: 33301 Epoch: [23] [ 740/4276] eta: 2:52:49 lr: 2.2935870937145174e-05 loss: 0.0973 (0.1044) time: 2.9347 data: 0.0065 max mem: 33301 Epoch: [23] [ 750/4276] eta: 2:52:20 lr: 2.2933001980869714e-05 loss: 0.0973 (0.1044) time: 2.9342 data: 0.0066 max mem: 33301 Epoch: [23] [ 760/4276] eta: 2:51:51 lr: 2.293013298471468e-05 loss: 0.0926 (0.1044) time: 2.9384 data: 0.0065 max mem: 33301 Epoch: [23] [ 770/4276] eta: 2:51:22 lr: 2.292726394867398e-05 loss: 0.0958 (0.1044) time: 2.9371 data: 0.0064 max mem: 33301 Epoch: [23] [ 780/4276] eta: 2:50:52 lr: 2.2924394872741516e-05 loss: 0.1021 (0.1045) time: 2.9324 data: 0.0064 max mem: 33301 Epoch: [23] [ 790/4276] eta: 2:50:23 lr: 2.2921525756911168e-05 loss: 0.1104 (0.1046) time: 2.9332 data: 0.0066 max mem: 33301 Epoch: [23] [ 800/4276] eta: 2:49:54 lr: 2.2918656601176848e-05 loss: 0.1044 (0.1047) time: 2.9347 data: 0.0067 max mem: 33301 Epoch: [23] [ 810/4276] eta: 2:49:24 lr: 2.2915787405532446e-05 loss: 0.1086 (0.1050) time: 2.9338 data: 0.0064 max mem: 33301 Epoch: [23] [ 820/4276] eta: 2:48:55 lr: 2.291291816997186e-05 loss: 0.0992 (0.1049) time: 2.9325 data: 0.0063 max mem: 33301 Epoch: [23] [ 830/4276] eta: 2:48:26 lr: 2.2910048894488968e-05 loss: 0.0932 (0.1050) time: 2.9329 data: 0.0066 max mem: 33301 Epoch: [23] [ 840/4276] eta: 2:47:56 lr: 2.290717957907767e-05 loss: 0.1014 (0.1051) time: 2.9336 data: 0.0067 max mem: 33301 Epoch: [23] [ 850/4276] eta: 2:47:27 lr: 2.2904310223731848e-05 loss: 0.1014 (0.1052) time: 2.9336 data: 0.0066 max mem: 33301 Epoch: [23] [ 860/4276] eta: 2:46:58 lr: 2.2901440828445395e-05 loss: 0.1043 (0.1053) time: 2.9367 data: 0.0065 
max mem: 33301 Epoch: [23] [ 870/4276] eta: 2:46:29 lr: 2.2898571393212178e-05 loss: 0.1043 (0.1052) time: 2.9356 data: 0.0067 max mem: 33301 Epoch: [23] [ 880/4276] eta: 2:45:59 lr: 2.2895701918026093e-05 loss: 0.0997 (0.1053) time: 2.9309 data: 0.0067 max mem: 33301 Epoch: [23] [ 890/4276] eta: 2:45:30 lr: 2.2892832402881016e-05 loss: 0.1073 (0.1054) time: 2.9316 data: 0.0065 max mem: 33301 Epoch: [23] [ 900/4276] eta: 2:45:01 lr: 2.288996284777083e-05 loss: 0.1069 (0.1055) time: 2.9312 data: 0.0065 max mem: 33301 Epoch: [23] [ 910/4276] eta: 2:44:31 lr: 2.28870932526894e-05 loss: 0.1081 (0.1057) time: 2.9314 data: 0.0067 max mem: 33301 Epoch: [23] [ 920/4276] eta: 2:44:02 lr: 2.288422361763061e-05 loss: 0.1139 (0.1057) time: 2.9316 data: 0.0067 max mem: 33301 Epoch: [23] [ 930/4276] eta: 2:43:32 lr: 2.2881353942588325e-05 loss: 0.1110 (0.1057) time: 2.9314 data: 0.0066 max mem: 33301 Epoch: [23] [ 940/4276] eta: 2:43:03 lr: 2.2878484227556432e-05 loss: 0.1037 (0.1058) time: 2.9337 data: 0.0066 max mem: 33301 Epoch: [23] [ 950/4276] eta: 2:42:34 lr: 2.287561447252878e-05 loss: 0.1037 (0.1059) time: 2.9352 data: 0.0068 max mem: 33301 Epoch: [23] [ 960/4276] eta: 2:42:05 lr: 2.2872744677499248e-05 loss: 0.1039 (0.1059) time: 2.9370 data: 0.0069 max mem: 33301 Epoch: [23] [ 970/4276] eta: 2:41:36 lr: 2.28698748424617e-05 loss: 0.1025 (0.1059) time: 2.9385 data: 0.0066 max mem: 33301 Epoch: [23] [ 980/4276] eta: 2:41:05 lr: 2.286700496741e-05 loss: 0.1102 (0.1060) time: 2.9107 data: 0.0064 max mem: 33301 Epoch: [23] [ 990/4276] eta: 2:40:34 lr: 2.286413505233801e-05 loss: 0.1014 (0.1058) time: 2.8896 data: 0.0067 max mem: 33301 Epoch: [23] [1000/4276] eta: 2:40:05 lr: 2.2861265097239584e-05 loss: 0.0966 (0.1058) time: 2.9170 data: 0.0075 max mem: 33301 Epoch: [23] [1010/4276] eta: 2:39:36 lr: 2.2858395102108596e-05 loss: 0.1044 (0.1058) time: 2.9395 data: 0.0079 max mem: 33301 Epoch: [23] [1020/4276] eta: 2:39:07 lr: 2.2855525066938886e-05 loss: 0.1010 (0.1058) time: 
2.9400 data: 0.0079 max mem: 33301 Epoch: [23] [1030/4276] eta: 2:38:38 lr: 2.285265499172431e-05 loss: 0.1094 (0.1059) time: 2.9401 data: 0.0083 max mem: 33301 Epoch: [23] [1040/4276] eta: 2:38:09 lr: 2.2849784876458732e-05 loss: 0.0978 (0.1058) time: 2.9421 data: 0.0083 max mem: 33301 Epoch: [23] [1050/4276] eta: 2:37:40 lr: 2.2846914721136002e-05 loss: 0.0971 (0.1060) time: 2.9426 data: 0.0082 max mem: 33301 Epoch: [23] [1060/4276] eta: 2:37:11 lr: 2.284404452574996e-05 loss: 0.1114 (0.1061) time: 2.9377 data: 0.0083 max mem: 33301 Epoch: [23] [1070/4276] eta: 2:36:41 lr: 2.284117429029446e-05 loss: 0.1114 (0.1062) time: 2.9359 data: 0.0081 max mem: 33301 Epoch: [23] [1080/4276] eta: 2:36:12 lr: 2.2838304014763346e-05 loss: 0.1064 (0.1062) time: 2.9376 data: 0.0078 max mem: 33301 Epoch: [23] [1090/4276] eta: 2:35:43 lr: 2.2835433699150466e-05 loss: 0.1064 (0.1064) time: 2.9355 data: 0.0079 max mem: 33301 Epoch: [23] [1100/4276] eta: 2:35:14 lr: 2.2832563343449652e-05 loss: 0.1098 (0.1064) time: 2.9337 data: 0.0079 max mem: 33301 Epoch: [23] [1110/4276] eta: 2:34:44 lr: 2.2829692947654757e-05 loss: 0.1176 (0.1065) time: 2.9302 data: 0.0078 max mem: 33301 Epoch: [23] [1120/4276] eta: 2:34:15 lr: 2.282682251175961e-05 loss: 0.1097 (0.1065) time: 2.9299 data: 0.0078 max mem: 33301 Epoch: [23] [1130/4276] eta: 2:33:45 lr: 2.2823952035758057e-05 loss: 0.1023 (0.1065) time: 2.9333 data: 0.0078 max mem: 33301 Epoch: [23] [1140/4276] eta: 2:33:16 lr: 2.2821081519643924e-05 loss: 0.1084 (0.1066) time: 2.9332 data: 0.0078 max mem: 33301 Epoch: [23] [1150/4276] eta: 2:32:47 lr: 2.2818210963411048e-05 loss: 0.1145 (0.1067) time: 2.9347 data: 0.0077 max mem: 33301 Epoch: [23] [1160/4276] eta: 2:32:18 lr: 2.2815340367053262e-05 loss: 0.1226 (0.1067) time: 2.9388 data: 0.0079 max mem: 33301 Epoch: [23] [1170/4276] eta: 2:31:49 lr: 2.2812469730564396e-05 loss: 0.1008 (0.1067) time: 2.9375 data: 0.0081 max mem: 33301 Epoch: [23] [1180/4276] eta: 2:31:19 lr: 2.2809599053938272e-05 
loss: 0.1008 (0.1067) time: 2.9340 data: 0.0078 max mem: 33301 Epoch: [23] [1190/4276] eta: 2:30:50 lr: 2.280672833716872e-05 loss: 0.1019 (0.1066) time: 2.9353 data: 0.0079 max mem: 33301 Epoch: [23] [1200/4276] eta: 2:30:21 lr: 2.2803857580249562e-05 loss: 0.0949 (0.1066) time: 2.9369 data: 0.0082 max mem: 33301 Epoch: [23] [1210/4276] eta: 2:29:52 lr: 2.280098678317463e-05 loss: 0.0958 (0.1065) time: 2.9382 data: 0.0083 max mem: 33301 Epoch: [23] [1220/4276] eta: 2:29:23 lr: 2.2798115945937727e-05 loss: 0.0960 (0.1066) time: 2.9460 data: 0.0081 max mem: 33301 Epoch: [23] [1230/4276] eta: 2:28:54 lr: 2.2795245068532685e-05 loss: 0.1016 (0.1066) time: 2.9473 data: 0.0077 max mem: 33301 Epoch: [23] [1240/4276] eta: 2:28:25 lr: 2.2792374150953317e-05 loss: 0.1048 (0.1067) time: 2.9400 data: 0.0080 max mem: 33301 Epoch: [23] [1250/4276] eta: 2:27:55 lr: 2.278950319319344e-05 loss: 0.1083 (0.1067) time: 2.9374 data: 0.0081 max mem: 33301 Epoch: [23] [1260/4276] eta: 2:27:26 lr: 2.278663219524686e-05 loss: 0.0972 (0.1066) time: 2.9362 data: 0.0078 max mem: 33301 Epoch: [23] [1270/4276] eta: 2:26:57 lr: 2.27837611571074e-05 loss: 0.0953 (0.1065) time: 2.9371 data: 0.0079 max mem: 33301 Epoch: [23] [1280/4276] eta: 2:26:27 lr: 2.278089007876886e-05 loss: 0.0967 (0.1065) time: 2.9355 data: 0.0081 max mem: 33301 Epoch: [23] [1290/4276] eta: 2:25:58 lr: 2.2778018960225057e-05 loss: 0.0996 (0.1065) time: 2.9364 data: 0.0082 max mem: 33301 Epoch: [23] [1300/4276] eta: 2:25:29 lr: 2.2775147801469785e-05 loss: 0.0911 (0.1065) time: 2.9415 data: 0.0080 max mem: 33301 Epoch: [23] [1310/4276] eta: 2:25:00 lr: 2.2772276602496853e-05 loss: 0.0870 (0.1063) time: 2.9389 data: 0.0079 max mem: 33301 Epoch: [23] [1320/4276] eta: 2:24:31 lr: 2.276940536330007e-05 loss: 0.0959 (0.1063) time: 2.9345 data: 0.0080 max mem: 33301 Epoch: [23] [1330/4276] eta: 2:24:01 lr: 2.276653408387323e-05 loss: 0.1064 (0.1063) time: 2.9351 data: 0.0082 max mem: 33301 Epoch: [23] [1340/4276] eta: 2:23:32 lr: 
2.276366276421013e-05 loss: 0.0967 (0.1063) time: 2.9406 data: 0.0080 max mem: 33301 Epoch: [23] [1350/4276] eta: 2:23:03 lr: 2.276079140430457e-05 loss: 0.0993 (0.1064) time: 2.9396 data: 0.0078 max mem: 33301 Epoch: [23] [1360/4276] eta: 2:22:34 lr: 2.2757920004150345e-05 loss: 0.1019 (0.1063) time: 2.9367 data: 0.0080 max mem: 33301 Epoch: [23] [1370/4276] eta: 2:22:04 lr: 2.275504856374125e-05 loss: 0.1019 (0.1064) time: 2.9379 data: 0.0082 max mem: 33301 Epoch: [23] [1380/4276] eta: 2:21:35 lr: 2.2752177083071068e-05 loss: 0.1218 (0.1065) time: 2.9349 data: 0.0079 max mem: 33301 Epoch: [23] [1390/4276] eta: 2:21:06 lr: 2.2749305562133598e-05 loss: 0.1218 (0.1066) time: 2.9347 data: 0.0078 max mem: 33301 Epoch: [23] [1400/4276] eta: 2:20:36 lr: 2.2746434000922628e-05 loss: 0.1181 (0.1066) time: 2.9358 data: 0.0080 max mem: 33301 Epoch: [23] [1410/4276] eta: 2:20:07 lr: 2.2743562399431935e-05 loss: 0.1022 (0.1066) time: 2.9343 data: 0.0082 max mem: 33301 Epoch: [23] [1420/4276] eta: 2:19:38 lr: 2.274069075765531e-05 loss: 0.0953 (0.1066) time: 2.9330 data: 0.0079 max mem: 33301 Epoch: [23] [1430/4276] eta: 2:19:08 lr: 2.2737819075586533e-05 loss: 0.1008 (0.1066) time: 2.9334 data: 0.0077 max mem: 33301 Epoch: [23] [1440/4276] eta: 2:18:39 lr: 2.273494735321939e-05 loss: 0.1073 (0.1066) time: 2.9338 data: 0.0079 max mem: 33301 Epoch: [23] [1450/4276] eta: 2:18:10 lr: 2.273207559054765e-05 loss: 0.1026 (0.1066) time: 2.9351 data: 0.0082 max mem: 33301 Epoch: [23] [1460/4276] eta: 2:17:40 lr: 2.2729203787565092e-05 loss: 0.1026 (0.1065) time: 2.9347 data: 0.0081 max mem: 33301 Epoch: [23] [1470/4276] eta: 2:17:11 lr: 2.2726331944265493e-05 loss: 0.1035 (0.1065) time: 2.9334 data: 0.0078 max mem: 33301 Epoch: [23] [1480/4276] eta: 2:16:42 lr: 2.2723460060642633e-05 loss: 0.1098 (0.1065) time: 2.9335 data: 0.0081 max mem: 33301 Epoch: [23] [1490/4276] eta: 2:16:12 lr: 2.272058813669027e-05 loss: 0.0995 (0.1065) time: 2.9340 data: 0.0080 max mem: 33301 Epoch: [23] 
[1500/4276] eta: 2:15:43 lr: 2.271771617240218e-05 loss: 0.0883 (0.1065) time: 2.9344 data: 0.0078 max mem: 33301 Epoch: [23] [1510/4276] eta: 2:15:14 lr: 2.271484416777213e-05 loss: 0.0957 (0.1065) time: 2.9332 data: 0.0078 max mem: 33301 Epoch: [23] [1520/4276] eta: 2:14:45 lr: 2.271197212279389e-05 loss: 0.1006 (0.1065) time: 2.9339 data: 0.0078 max mem: 33301 Epoch: [23] [1530/4276] eta: 2:14:15 lr: 2.2709100037461214e-05 loss: 0.0944 (0.1064) time: 2.9361 data: 0.0079 max mem: 33301 Epoch: [23] [1540/4276] eta: 2:13:45 lr: 2.270622791176787e-05 loss: 0.0944 (0.1065) time: 2.9157 data: 0.0075 max mem: 33301 Epoch: [23] [1550/4276] eta: 2:13:16 lr: 2.270335574570762e-05 loss: 0.1014 (0.1065) time: 2.9212 data: 0.0073 max mem: 33301 Epoch: [23] [1560/4276] eta: 2:12:47 lr: 2.2700483539274226e-05 loss: 0.0986 (0.1064) time: 2.9431 data: 0.0078 max mem: 33301 Epoch: [23] [1570/4276] eta: 2:12:18 lr: 2.269761129246143e-05 loss: 0.0979 (0.1064) time: 2.9377 data: 0.0081 max mem: 33301 Epoch: [23] [1580/4276] eta: 2:11:48 lr: 2.2694739005262997e-05 loss: 0.0936 (0.1063) time: 2.9376 data: 0.0080 max mem: 33301 Epoch: [23] [1590/4276] eta: 2:11:19 lr: 2.2691866677672676e-05 loss: 0.0945 (0.1063) time: 2.9373 data: 0.0078 max mem: 33301 Epoch: [23] [1600/4276] eta: 2:10:50 lr: 2.2688994309684227e-05 loss: 0.1038 (0.1063) time: 2.9346 data: 0.0077 max mem: 33301 Epoch: [23] [1610/4276] eta: 2:10:20 lr: 2.2686121901291388e-05 loss: 0.0968 (0.1062) time: 2.9278 data: 0.0075 max mem: 33301 Epoch: [23] [1620/4276] eta: 2:09:51 lr: 2.2683249452487905e-05 loss: 0.0950 (0.1062) time: 2.9275 data: 0.0073 max mem: 33301 Epoch: [23] [1630/4276] eta: 2:09:21 lr: 2.268037696326754e-05 loss: 0.0952 (0.1062) time: 2.9327 data: 0.0073 max mem: 33301 Epoch: [23] [1640/4276] eta: 2:08:52 lr: 2.2677504433624018e-05 loss: 0.0836 (0.1060) time: 2.9331 data: 0.0076 max mem: 33301 Epoch: [23] [1650/4276] eta: 2:08:23 lr: 2.267463186355109e-05 loss: 0.0859 (0.1060) time: 2.9327 data: 0.0076 
max mem: 33301 Epoch: [23] [1660/4276] eta: 2:07:53 lr: 2.267175925304249e-05 loss: 0.0930 (0.1059) time: 2.9317 data: 0.0073 max mem: 33301 Epoch: [23] [1670/4276] eta: 2:07:24 lr: 2.2668886602091965e-05 loss: 0.1002 (0.1059) time: 2.9321 data: 0.0073 max mem: 33301 Epoch: [23] [1680/4276] eta: 2:06:55 lr: 2.266601391069324e-05 loss: 0.0996 (0.1059) time: 2.9336 data: 0.0076 max mem: 33301 Epoch: [23] [1690/4276] eta: 2:06:25 lr: 2.2663141178840062e-05 loss: 0.0912 (0.1058) time: 2.9336 data: 0.0078 max mem: 33301 Epoch: [23] [1700/4276] eta: 2:05:56 lr: 2.266026840652615e-05 loss: 0.0877 (0.1058) time: 2.9337 data: 0.0076 max mem: 33301 Epoch: [23] [1710/4276] eta: 2:05:27 lr: 2.265739559374525e-05 loss: 0.0866 (0.1057) time: 2.9340 data: 0.0074 max mem: 33301 Epoch: [23] [1720/4276] eta: 2:04:57 lr: 2.2654522740491078e-05 loss: 0.0876 (0.1057) time: 2.9324 data: 0.0074 max mem: 33301 Epoch: [23] [1730/4276] eta: 2:04:28 lr: 2.2651649846757363e-05 loss: 0.0916 (0.1057) time: 2.9142 data: 0.0071 max mem: 33301 Epoch: [23] [1740/4276] eta: 2:03:58 lr: 2.264877691253783e-05 loss: 0.0843 (0.1056) time: 2.8924 data: 0.0068 max mem: 33301 Epoch: [23] [1750/4276] eta: 2:03:28 lr: 2.2645903937826213e-05 loss: 0.0865 (0.1055) time: 2.9128 data: 0.0073 max mem: 33301 Epoch: [23] [1760/4276] eta: 2:02:59 lr: 2.2643030922616216e-05 loss: 0.0894 (0.1054) time: 2.9366 data: 0.0080 max mem: 33301 Epoch: [23] [1770/4276] eta: 2:02:30 lr: 2.264015786690157e-05 loss: 0.0855 (0.1054) time: 2.9345 data: 0.0079 max mem: 33301 Epoch: [23] [1780/4276] eta: 2:02:00 lr: 2.263728477067599e-05 loss: 0.0929 (0.1054) time: 2.9326 data: 0.0076 max mem: 33301 Epoch: [23] [1790/4276] eta: 2:01:31 lr: 2.2634411633933195e-05 loss: 0.0939 (0.1053) time: 2.9327 data: 0.0074 max mem: 33301 Epoch: [23] [1800/4276] eta: 2:01:02 lr: 2.263153845666689e-05 loss: 0.0939 (0.1053) time: 2.9338 data: 0.0076 max mem: 33301 Epoch: [23] [1810/4276] eta: 2:00:32 lr: 2.2628665238870792e-05 loss: 0.1085 (0.1054) 
time: 2.9257 data: 0.0075 max mem: 33301 Epoch: [23] [1820/4276] eta: 2:00:03 lr: 2.262579198053861e-05 loss: 0.1210 (0.1054) time: 2.9197 data: 0.0076 max mem: 33301 Epoch: [23] [1830/4276] eta: 1:59:33 lr: 2.262291868166406e-05 loss: 0.1070 (0.1054) time: 2.9266 data: 0.0076 max mem: 33301 Epoch: [23] [1840/4276] eta: 1:59:04 lr: 2.262004534224084e-05 loss: 0.0939 (0.1053) time: 2.9262 data: 0.0072 max mem: 33301 Epoch: [23] [1850/4276] eta: 1:58:34 lr: 2.261717196226265e-05 loss: 0.0953 (0.1054) time: 2.8998 data: 0.0072 max mem: 33301 Epoch: [23] [1860/4276] eta: 1:58:04 lr: 2.2614298541723207e-05 loss: 0.0982 (0.1054) time: 2.9005 data: 0.0075 max mem: 33301 Epoch: [23] [1870/4276] eta: 1:57:35 lr: 2.26114250806162e-05 loss: 0.0997 (0.1054) time: 2.9270 data: 0.0079 max mem: 33301 Epoch: [23] [1880/4276] eta: 1:57:06 lr: 2.2608551578935335e-05 loss: 0.0937 (0.1053) time: 2.9352 data: 0.0077 max mem: 33301 Epoch: [23] [1890/4276] eta: 1:56:36 lr: 2.2605678036674303e-05 loss: 0.1004 (0.1053) time: 2.9367 data: 0.0075 max mem: 33301 Epoch: [23] [1900/4276] eta: 1:56:07 lr: 2.2602804453826803e-05 loss: 0.1004 (0.1053) time: 2.9342 data: 0.0075 max mem: 33301 Epoch: [23] [1910/4276] eta: 1:55:38 lr: 2.2599930830386533e-05 loss: 0.0971 (0.1054) time: 2.9272 data: 0.0076 max mem: 33301 Epoch: [23] [1920/4276] eta: 1:55:08 lr: 2.2597057166347174e-05 loss: 0.0923 (0.1053) time: 2.9269 data: 0.0074 max mem: 33301 Epoch: [23] [1930/4276] eta: 1:54:39 lr: 2.259418346170242e-05 loss: 0.0879 (0.1053) time: 2.9313 data: 0.0073 max mem: 33301 Epoch: [23] [1940/4276] eta: 1:54:10 lr: 2.259130971644597e-05 loss: 0.0970 (0.1053) time: 2.9330 data: 0.0073 max mem: 33301 Epoch: [23] [1950/4276] eta: 1:53:40 lr: 2.2588435930571485e-05 loss: 0.0998 (0.1053) time: 2.9352 data: 0.0072 max mem: 33301 Epoch: [23] [1960/4276] eta: 1:53:11 lr: 2.258556210407267e-05 loss: 0.0887 (0.1053) time: 2.9331 data: 0.0070 max mem: 33301 Epoch: [23] [1970/4276] eta: 1:52:42 lr: 
2.2582688236943196e-05 loss: 0.0848 (0.1052) time: 2.9352 data: 0.0072 max mem: 33301 Epoch: [23] [1980/4276] eta: 1:52:13 lr: 2.2579814329176758e-05 loss: 0.0848 (0.1051) time: 2.9375 data: 0.0073 max mem: 33301 Epoch: [23] [1990/4276] eta: 1:51:43 lr: 2.2576940380767016e-05 loss: 0.0953 (0.1051) time: 2.9357 data: 0.0072 max mem: 33301 Epoch: [23] [2000/4276] eta: 1:51:14 lr: 2.2574066391707654e-05 loss: 0.1018 (0.1052) time: 2.9337 data: 0.0072 max mem: 33301 Epoch: [23] [2010/4276] eta: 1:50:45 lr: 2.257119236199235e-05 loss: 0.0989 (0.1051) time: 2.9368 data: 0.0071 max mem: 33301 Epoch: [23] [2020/4276] eta: 1:50:15 lr: 2.2568318291614776e-05 loss: 0.0979 (0.1052) time: 2.9383 data: 0.0071 max mem: 33301 Epoch: [23] [2030/4276] eta: 1:49:46 lr: 2.2565444180568596e-05 loss: 0.0979 (0.1051) time: 2.9370 data: 0.0071 max mem: 33301 Epoch: [23] [2040/4276] eta: 1:49:17 lr: 2.2562570028847483e-05 loss: 0.0961 (0.1051) time: 2.9371 data: 0.0071 max mem: 33301 Epoch: [23] [2050/4276] eta: 1:48:48 lr: 2.2559695836445106e-05 loss: 0.0991 (0.1051) time: 2.9340 data: 0.0074 max mem: 33301 Epoch: [23] [2060/4276] eta: 1:48:18 lr: 2.2556821603355136e-05 loss: 0.1045 (0.1052) time: 2.9335 data: 0.0075 max mem: 33301 Epoch: [23] [2070/4276] eta: 1:47:49 lr: 2.255394732957122e-05 loss: 0.0927 (0.1051) time: 2.9294 data: 0.0074 max mem: 33301 Epoch: [23] [2080/4276] eta: 1:47:19 lr: 2.2551073015087033e-05 loss: 0.1001 (0.1052) time: 2.9297 data: 0.0071 max mem: 33301 Epoch: [23] [2090/4276] eta: 1:46:50 lr: 2.2548198659896227e-05 loss: 0.1045 (0.1052) time: 2.9341 data: 0.0070 max mem: 33301 Epoch: [23] [2100/4276] eta: 1:46:21 lr: 2.254532426399247e-05 loss: 0.1069 (0.1052) time: 2.9333 data: 0.0069 max mem: 33301 Epoch: [23] [2110/4276] eta: 1:45:52 lr: 2.2542449827369406e-05 loss: 0.1063 (0.1052) time: 2.9338 data: 0.0068 max mem: 33301 Epoch: [23] [2120/4276] eta: 1:45:22 lr: 2.253957535002069e-05 loss: 0.0842 (0.1050) time: 2.9342 data: 0.0067 max mem: 33301 Epoch: [23] 
[2130/4276] eta: 1:44:53 lr: 2.253670083193998e-05 loss: 0.0810 (0.1050) time: 2.9412 data: 0.0067 max mem: 33301 Epoch: [23] [2140/4276] eta: 1:44:24 lr: 2.2533826273120928e-05 loss: 0.0863 (0.1049) time: 2.9403 data: 0.0068 max mem: 33301 Epoch: [23] [2150/4276] eta: 1:43:54 lr: 2.2530951673557175e-05 loss: 0.0886 (0.1049) time: 2.9337 data: 0.0066 max mem: 33301 Epoch: [23] [2160/4276] eta: 1:43:25 lr: 2.2528077033242366e-05 loss: 0.0910 (0.1049) time: 2.9355 data: 0.0067 max mem: 33301 Epoch: [23] [2170/4276] eta: 1:42:56 lr: 2.252520235217015e-05 loss: 0.0975 (0.1049) time: 2.9367 data: 0.0070 max mem: 33301 Epoch: [23] [2180/4276] eta: 1:42:27 lr: 2.2522327630334177e-05 loss: 0.1084 (0.1049) time: 2.9353 data: 0.0069 max mem: 33301 Epoch: [23] [2190/4276] eta: 1:41:57 lr: 2.2519452867728075e-05 loss: 0.1109 (0.1049) time: 2.9069 data: 0.0067 max mem: 33301 Epoch: [23] [2200/4276] eta: 1:41:27 lr: 2.2516578064345485e-05 loss: 0.1119 (0.1050) time: 2.8807 data: 0.0069 max mem: 33301 Epoch: [23] [2210/4276] eta: 1:40:57 lr: 2.2513703220180047e-05 loss: 0.1013 (0.1049) time: 2.8799 data: 0.0070 max mem: 33301 Epoch: [23] [2220/4276] eta: 1:40:27 lr: 2.25108283352254e-05 loss: 0.1048 (0.1050) time: 2.8844 data: 0.0073 max mem: 33301 Epoch: [23] [2230/4276] eta: 1:39:58 lr: 2.2507953409475168e-05 loss: 0.0998 (0.1049) time: 2.8968 data: 0.0080 max mem: 33301 Epoch: [23] [2240/4276] eta: 1:39:29 lr: 2.250507844292299e-05 loss: 0.0889 (0.1049) time: 2.9229 data: 0.0082 max mem: 33301 Epoch: [23] [2250/4276] eta: 1:38:59 lr: 2.2502203435562493e-05 loss: 0.0940 (0.1048) time: 2.9394 data: 0.0078 max mem: 33301 Epoch: [23] [2260/4276] eta: 1:38:30 lr: 2.2499328387387296e-05 loss: 0.1022 (0.1049) time: 2.9350 data: 0.0075 max mem: 33301 Epoch: [23] [2270/4276] eta: 1:38:01 lr: 2.2496453298391034e-05 loss: 0.1062 (0.1049) time: 2.9331 data: 0.0074 max mem: 33301 Epoch: [23] [2280/4276] eta: 1:37:31 lr: 2.249357816856733e-05 loss: 0.0969 (0.1048) time: 2.9332 data: 0.0074 
max mem: 33301 Epoch: [23] [2290/4276] eta: 1:37:02 lr: 2.2490702997909807e-05 loss: 0.0969 (0.1048) time: 2.9327 data: 0.0076 max mem: 33301 Epoch: [23] [2300/4276] eta: 1:36:33 lr: 2.248782778641208e-05 loss: 0.1007 (0.1049) time: 2.9205 data: 0.0075 max mem: 33301 Epoch: [23] [2310/4276] eta: 1:36:03 lr: 2.2484952534067762e-05 loss: 0.1052 (0.1049) time: 2.9212 data: 0.0069 max mem: 33301 Epoch: [23] [2320/4276] eta: 1:35:34 lr: 2.2482077240870478e-05 loss: 0.1095 (0.1049) time: 2.9335 data: 0.0067 max mem: 33301 Epoch: [23] [2330/4276] eta: 1:35:05 lr: 2.2479201906813845e-05 loss: 0.1207 (0.1050) time: 2.9345 data: 0.0067 max mem: 33301 Epoch: [23] [2340/4276] eta: 1:34:35 lr: 2.2476326531891468e-05 loss: 0.1168 (0.1050) time: 2.9277 data: 0.0071 max mem: 33301 Epoch: [23] [2350/4276] eta: 1:34:06 lr: 2.2473451116096955e-05 loss: 0.1055 (0.1050) time: 2.9254 data: 0.0071 max mem: 33301 Epoch: [23] [2360/4276] eta: 1:33:37 lr: 2.2470575659423917e-05 loss: 0.1036 (0.1050) time: 2.9330 data: 0.0067 max mem: 33301 Epoch: [23] [2370/4276] eta: 1:33:07 lr: 2.2467700161865972e-05 loss: 0.1050 (0.1050) time: 2.9352 data: 0.0068 max mem: 33301 Epoch: [23] [2380/4276] eta: 1:32:38 lr: 2.24648246234167e-05 loss: 0.1101 (0.1051) time: 2.9351 data: 0.0069 max mem: 33301 Epoch: [23] [2390/4276] eta: 1:32:08 lr: 2.2461949044069728e-05 loss: 0.1101 (0.1051) time: 2.9137 data: 0.0068 max mem: 33301 Epoch: [23] [2400/4276] eta: 1:31:39 lr: 2.245907342381864e-05 loss: 0.1140 (0.1051) time: 2.8846 data: 0.0070 max mem: 33301 Epoch: [23] [2410/4276] eta: 1:31:09 lr: 2.245619776265705e-05 loss: 0.1109 (0.1051) time: 2.9021 data: 0.0076 max mem: 33301 Epoch: [23] [2420/4276] eta: 1:30:40 lr: 2.2453322060578537e-05 loss: 0.1010 (0.1051) time: 2.9322 data: 0.0078 max mem: 33301 Epoch: [23] [2430/4276] eta: 1:30:11 lr: 2.2450446317576708e-05 loss: 0.1068 (0.1052) time: 2.9370 data: 0.0076 max mem: 33301 Epoch: [23] [2440/4276] eta: 1:29:42 lr: 2.2447570533645153e-05 loss: 0.1077 
(0.1051) time: 2.9383 data: 0.0075 max mem: 33301 Epoch: [23] [2450/4276] eta: 1:29:12 lr: 2.2444694708777467e-05 loss: 0.1052 (0.1052) time: 2.9397 data: 0.0074 max mem: 33301 Epoch: [23] [2460/4276] eta: 1:28:43 lr: 2.244181884296723e-05 loss: 0.1096 (0.1052) time: 2.9391 data: 0.0075 max mem: 33301 Epoch: [23] [2470/4276] eta: 1:28:14 lr: 2.2438942936208035e-05 loss: 0.1061 (0.1052) time: 2.9373 data: 0.0075 max mem: 33301 Epoch: [23] [2480/4276] eta: 1:27:44 lr: 2.2436066988493464e-05 loss: 0.1061 (0.1053) time: 2.9378 data: 0.0075 max mem: 33301 Epoch: [23] [2490/4276] eta: 1:27:15 lr: 2.2433190999817115e-05 loss: 0.1029 (0.1053) time: 2.9374 data: 0.0075 max mem: 33301 Epoch: [23] [2500/4276] eta: 1:26:46 lr: 2.2430314970172547e-05 loss: 0.1029 (0.1053) time: 2.9360 data: 0.0074 max mem: 33301 Epoch: [23] [2510/4276] eta: 1:26:16 lr: 2.2427438899553356e-05 loss: 0.1089 (0.1053) time: 2.9260 data: 0.0074 max mem: 33301 Epoch: [23] [2520/4276] eta: 1:25:47 lr: 2.242456278795311e-05 loss: 0.0956 (0.1052) time: 2.9281 data: 0.0074 max mem: 33301 Epoch: [23] [2530/4276] eta: 1:25:18 lr: 2.2421686635365398e-05 loss: 0.0831 (0.1052) time: 2.9387 data: 0.0076 max mem: 33301 Epoch: [23] [2540/4276] eta: 1:24:49 lr: 2.2418810441783776e-05 loss: 0.0881 (0.1052) time: 2.9359 data: 0.0074 max mem: 33301 Epoch: [23] [2550/4276] eta: 1:24:19 lr: 2.2415934207201826e-05 loss: 0.0985 (0.1051) time: 2.9365 data: 0.0074 max mem: 33301 Epoch: [23] [2560/4276] eta: 1:23:50 lr: 2.241305793161312e-05 loss: 0.0818 (0.1051) time: 2.9370 data: 0.0076 max mem: 33301 Epoch: [23] [2570/4276] eta: 1:23:21 lr: 2.241018161501122e-05 loss: 0.0884 (0.1051) time: 2.9351 data: 0.0075 max mem: 33301 Epoch: [23] [2580/4276] eta: 1:22:52 lr: 2.2407305257389696e-05 loss: 0.0962 (0.1051) time: 2.9347 data: 0.0074 max mem: 33301 Epoch: [23] [2590/4276] eta: 1:22:22 lr: 2.240442885874211e-05 loss: 0.0962 (0.1051) time: 2.9348 data: 0.0075 max mem: 33301 Epoch: [23] [2600/4276] eta: 1:21:53 lr: 
2.240155241906203e-05 loss: 0.0984 (0.1050) time: 2.9235 data: 0.0075 max mem: 33301 Epoch: [23] [2610/4276] eta: 1:21:23 lr: 2.239867593834301e-05 loss: 0.0952 (0.1050) time: 2.9211 data: 0.0074 max mem: 33301 Epoch: [23] [2620/4276] eta: 1:20:54 lr: 2.2395799416578607e-05 loss: 0.1062 (0.1050) time: 2.9317 data: 0.0074 max mem: 33301 Epoch: [23] [2630/4276] eta: 1:20:25 lr: 2.2392922853762384e-05 loss: 0.0968 (0.1050) time: 2.9493 data: 0.0073 max mem: 33301 Epoch: [23] [2640/4276] eta: 1:19:56 lr: 2.2390046249887896e-05 loss: 0.0900 (0.1050) time: 2.9499 data: 0.0073 max mem: 33301 Epoch: [23] [2650/4276] eta: 1:19:26 lr: 2.238716960494868e-05 loss: 0.0960 (0.1050) time: 2.9362 data: 0.0073 max mem: 33301 Epoch: [23] [2660/4276] eta: 1:18:57 lr: 2.2384292918938306e-05 loss: 0.1006 (0.1050) time: 2.9375 data: 0.0072 max mem: 33301 Epoch: [23] [2670/4276] eta: 1:18:28 lr: 2.2381416191850316e-05 loss: 0.1046 (0.1050) time: 2.9374 data: 0.0071 max mem: 33301 Epoch: [23] [2680/4276] eta: 1:17:59 lr: 2.237853942367826e-05 loss: 0.1084 (0.1050) time: 2.9376 data: 0.0070 max mem: 33301 Epoch: [23] [2690/4276] eta: 1:17:29 lr: 2.2375662614415673e-05 loss: 0.1058 (0.1050) time: 2.9352 data: 0.0072 max mem: 33301 Epoch: [23] [2700/4276] eta: 1:17:00 lr: 2.2372785764056108e-05 loss: 0.0962 (0.1049) time: 2.9278 data: 0.0072 max mem: 33301 Epoch: [23] [2710/4276] eta: 1:16:31 lr: 2.2369908872593097e-05 loss: 0.0953 (0.1049) time: 2.9296 data: 0.0070 max mem: 33301 Epoch: [23] [2720/4276] eta: 1:16:01 lr: 2.2367031940020193e-05 loss: 0.0888 (0.1048) time: 2.9364 data: 0.0073 max mem: 33301 Epoch: [23] [2730/4276] eta: 1:15:32 lr: 2.236415496633092e-05 loss: 0.0935 (0.1049) time: 2.9367 data: 0.0074 max mem: 33301 Epoch: [23] [2740/4276] eta: 1:15:03 lr: 2.2361277951518818e-05 loss: 0.1048 (0.1049) time: 2.9350 data: 0.0072 max mem: 33301 Epoch: [23] [2750/4276] eta: 1:14:33 lr: 2.2358400895577418e-05 loss: 0.1081 (0.1049) time: 2.9333 data: 0.0070 max mem: 33301 Epoch: [23] 
[2760/4276] eta: 1:14:04 lr: 2.235552379850026e-05 loss: 0.1081 (0.1049) time: 2.9339 data: 0.0070 max mem: 33301 Epoch: [23] [2770/4276] eta: 1:13:35 lr: 2.2352646660280857e-05 loss: 0.1048 (0.1049) time: 2.9348 data: 0.0072 max mem: 33301 Epoch: [23] [2780/4276] eta: 1:13:06 lr: 2.234976948091275e-05 loss: 0.0960 (0.1049) time: 2.9339 data: 0.0072 max mem: 33301 Epoch: [23] [2790/4276] eta: 1:12:36 lr: 2.234689226038946e-05 loss: 0.0972 (0.1049) time: 2.9302 data: 0.0070 max mem: 33301 Epoch: [23] [2800/4276] eta: 1:12:07 lr: 2.2344014998704517e-05 loss: 0.0942 (0.1049) time: 2.9314 data: 0.0072 max mem: 33301 Epoch: [23] [2810/4276] eta: 1:11:38 lr: 2.234113769585143e-05 loss: 0.0800 (0.1048) time: 2.9351 data: 0.0074 max mem: 33301 Epoch: [23] [2820/4276] eta: 1:11:08 lr: 2.2338260351823727e-05 loss: 0.0861 (0.1047) time: 2.9356 data: 0.0073 max mem: 33301 Epoch: [23] [2830/4276] eta: 1:10:39 lr: 2.233538296661492e-05 loss: 0.0925 (0.1047) time: 2.9351 data: 0.0071 max mem: 33301 Epoch: [23] [2840/4276] eta: 1:10:10 lr: 2.233250554021854e-05 loss: 0.1036 (0.1047) time: 2.9354 data: 0.0073 max mem: 33301 Epoch: [23] [2850/4276] eta: 1:09:40 lr: 2.232962807262808e-05 loss: 0.1001 (0.1047) time: 2.9389 data: 0.0075 max mem: 33301 Epoch: [23] [2860/4276] eta: 1:09:11 lr: 2.2326750563837064e-05 loss: 0.0980 (0.1047) time: 2.9413 data: 0.0072 max mem: 33301 Epoch: [23] [2870/4276] eta: 1:08:42 lr: 2.2323873013839005e-05 loss: 0.0980 (0.1047) time: 2.9416 data: 0.0071 max mem: 33301 Epoch: [23] [2880/4276] eta: 1:08:13 lr: 2.23209954226274e-05 loss: 0.0966 (0.1047) time: 2.9404 data: 0.0073 max mem: 33301 Epoch: [23] [2890/4276] eta: 1:07:43 lr: 2.2318117790195758e-05 loss: 0.0926 (0.1047) time: 2.9389 data: 0.0075 max mem: 33301 Epoch: [23] [2900/4276] eta: 1:07:14 lr: 2.231524011653759e-05 loss: 0.0911 (0.1047) time: 2.9355 data: 0.0073 max mem: 33301 Epoch: [23] [2910/4276] eta: 1:06:45 lr: 2.2312362401646394e-05 loss: 0.0911 (0.1047) time: 2.9351 data: 0.0071 max 
mem: 33301 Epoch: [23] [2920/4276] eta: 1:06:15 lr: 2.2309484645515667e-05 loss: 0.1022 (0.1047) time: 2.9363 data: 0.0073 max mem: 33301 Epoch: [23] [2930/4276] eta: 1:05:46 lr: 2.2306606848138912e-05 loss: 0.0911 (0.1046) time: 2.9367 data: 0.0077 max mem: 33301 Epoch: [23] [2940/4276] eta: 1:05:17 lr: 2.230372900950962e-05 loss: 0.0910 (0.1046) time: 2.9370 data: 0.0077 max mem: 33301 Epoch: [23] [2950/4276] eta: 1:04:47 lr: 2.2300851129621294e-05 loss: 0.1017 (0.1046) time: 2.9349 data: 0.0073 max mem: 33301 Epoch: [23] [2960/4276] eta: 1:04:18 lr: 2.2297973208467414e-05 loss: 0.0974 (0.1046) time: 2.9348 data: 0.0073 max mem: 33301 Epoch: [23] [2970/4276] eta: 1:03:49 lr: 2.229509524604148e-05 loss: 0.0994 (0.1046) time: 2.9421 data: 0.0075 max mem: 33301 Epoch: [23] [2980/4276] eta: 1:03:20 lr: 2.2292217242336976e-05 loss: 0.1035 (0.1047) time: 2.9422 data: 0.0073 max mem: 33301 Epoch: [23] [2990/4276] eta: 1:02:50 lr: 2.2289339197347398e-05 loss: 0.0924 (0.1046) time: 2.9495 data: 0.0072 max mem: 33301 Epoch: [23] [3000/4276] eta: 1:02:21 lr: 2.2286461111066213e-05 loss: 0.0929 (0.1046) time: 2.9501 data: 0.0075 max mem: 33301 Epoch: [23] [3010/4276] eta: 1:01:52 lr: 2.2283582983486918e-05 loss: 0.1027 (0.1046) time: 2.9362 data: 0.0077 max mem: 33301 Epoch: [23] [3020/4276] eta: 1:01:23 lr: 2.2280704814602982e-05 loss: 0.1029 (0.1046) time: 2.9425 data: 0.0073 max mem: 33301 Epoch: [23] [3030/4276] eta: 1:00:53 lr: 2.22778266044079e-05 loss: 0.0973 (0.1046) time: 2.9430 data: 0.0071 max mem: 33301 Epoch: [23] [3040/4276] eta: 1:00:24 lr: 2.227494835289513e-05 loss: 0.0992 (0.1046) time: 2.9398 data: 0.0073 max mem: 33301 Epoch: [23] [3050/4276] eta: 0:59:55 lr: 2.2272070060058157e-05 loss: 0.0968 (0.1046) time: 2.9374 data: 0.0074 max mem: 33301 Epoch: [23] [3060/4276] eta: 0:59:25 lr: 2.2269191725890453e-05 loss: 0.0892 (0.1045) time: 2.9323 data: 0.0073 max mem: 33301 Epoch: [23] [3070/4276] eta: 0:58:56 lr: 2.226631335038549e-05 loss: 0.0916 (0.1045) 
time: 2.9351 data: 0.0071 max mem: 33301 Epoch: [23] [3080/4276] eta: 0:58:27 lr: 2.2263434933536733e-05 loss: 0.0962 (0.1045) time: 2.9394 data: 0.0072 max mem: 33301 Epoch: [23] [3090/4276] eta: 0:57:57 lr: 2.226055647533765e-05 loss: 0.0928 (0.1045) time: 2.9402 data: 0.0074 max mem: 33301 Epoch: [23] [3100/4276] eta: 0:57:28 lr: 2.22576779757817e-05 loss: 0.0930 (0.1045) time: 2.9409 data: 0.0072 max mem: 33301 Epoch: [23] [3110/4276] eta: 0:56:59 lr: 2.2254799434862364e-05 loss: 0.0926 (0.1044) time: 2.9408 data: 0.0071 max mem: 33301 Epoch: [23] [3120/4276] eta: 0:56:30 lr: 2.2251920852573082e-05 loss: 0.0871 (0.1044) time: 2.9393 data: 0.0073 max mem: 33301 Epoch: [23] [3130/4276] eta: 0:56:00 lr: 2.224904222890732e-05 loss: 0.0939 (0.1044) time: 2.9390 data: 0.0073 max mem: 33301 Epoch: [23] [3140/4276] eta: 0:55:31 lr: 2.224616356385854e-05 loss: 0.1021 (0.1044) time: 2.9361 data: 0.0071 max mem: 33301 Epoch: [23] [3150/4276] eta: 0:55:02 lr: 2.2243284857420198e-05 loss: 0.1052 (0.1043) time: 2.9324 data: 0.0070 max mem: 33301 Epoch: [23] [3160/4276] eta: 0:54:32 lr: 2.2240406109585736e-05 loss: 0.0992 (0.1043) time: 2.9314 data: 0.0071 max mem: 33301 Epoch: [23] [3170/4276] eta: 0:54:03 lr: 2.223752732034861e-05 loss: 0.0992 (0.1043) time: 2.9326 data: 0.0073 max mem: 33301 Epoch: [23] [3180/4276] eta: 0:53:34 lr: 2.2234648489702276e-05 loss: 0.0984 (0.1043) time: 2.9333 data: 0.0072 max mem: 33301 Epoch: [23] [3190/4276] eta: 0:53:04 lr: 2.223176961764017e-05 loss: 0.0984 (0.1043) time: 2.9321 data: 0.0070 max mem: 33301 Epoch: [23] [3200/4276] eta: 0:52:35 lr: 2.222889070415574e-05 loss: 0.1008 (0.1043) time: 2.9327 data: 0.0072 max mem: 33301 Epoch: [23] [3210/4276] eta: 0:52:06 lr: 2.2226011749242438e-05 loss: 0.0983 (0.1044) time: 2.9366 data: 0.0075 max mem: 33301 Epoch: [23] [3220/4276] eta: 0:51:36 lr: 2.2223132752893697e-05 loss: 0.1006 (0.1044) time: 2.9360 data: 0.0073 max mem: 33301 Epoch: [23] [3230/4276] eta: 0:51:07 lr: 
2.2220253715102954e-05 loss: 0.0961 (0.1043) time: 2.9344 data: 0.0071 max mem: 33301 Epoch: [23] [3240/4276] eta: 0:50:38 lr: 2.2217374635863653e-05 loss: 0.1080 (0.1044) time: 2.9352 data: 0.0073 max mem: 33301 Epoch: [23] [3250/4276] eta: 0:50:08 lr: 2.221449551516922e-05 loss: 0.1153 (0.1044) time: 2.9345 data: 0.0075 max mem: 33301 Epoch: [23] [3260/4276] eta: 0:49:39 lr: 2.2211616353013103e-05 loss: 0.0994 (0.1044) time: 2.9365 data: 0.0074 max mem: 33301 Epoch: [23] [3270/4276] eta: 0:49:10 lr: 2.2208737149388718e-05 loss: 0.1027 (0.1044) time: 2.9343 data: 0.0070 max mem: 33301 Epoch: [23] [3280/4276] eta: 0:48:40 lr: 2.22058579042895e-05 loss: 0.1042 (0.1044) time: 2.9302 data: 0.0070 max mem: 33301 Epoch: [23] [3290/4276] eta: 0:48:11 lr: 2.2202978617708877e-05 loss: 0.1092 (0.1044) time: 2.9326 data: 0.0073 max mem: 33301 Epoch: [23] [3300/4276] eta: 0:47:42 lr: 2.220009928964028e-05 loss: 0.1113 (0.1045) time: 2.9346 data: 0.0072 max mem: 33301 Epoch: [23] [3310/4276] eta: 0:47:12 lr: 2.2197219920077118e-05 loss: 0.1221 (0.1045) time: 2.9355 data: 0.0071 max mem: 33301 Epoch: [23] [3320/4276] eta: 0:46:43 lr: 2.2194340509012816e-05 loss: 0.1169 (0.1045) time: 2.9351 data: 0.0072 max mem: 33301 Epoch: [23] [3330/4276] eta: 0:46:14 lr: 2.2191461056440803e-05 loss: 0.0991 (0.1045) time: 2.9343 data: 0.0074 max mem: 33301 Epoch: [23] [3340/4276] eta: 0:45:44 lr: 2.2188581562354495e-05 loss: 0.0991 (0.1045) time: 2.9344 data: 0.0073 max mem: 33301 Epoch: [23] [3350/4276] eta: 0:45:15 lr: 2.2185702026747297e-05 loss: 0.1064 (0.1045) time: 2.9342 data: 0.0071 max mem: 33301 Epoch: [23] [3360/4276] eta: 0:44:46 lr: 2.2182822449612624e-05 loss: 0.0997 (0.1045) time: 2.9334 data: 0.0072 max mem: 33301 Epoch: [23] [3370/4276] eta: 0:44:17 lr: 2.2179942830943898e-05 loss: 0.1048 (0.1046) time: 2.9327 data: 0.0074 max mem: 33301 Epoch: [23] [3380/4276] eta: 0:43:47 lr: 2.217706317073452e-05 loss: 0.1050 (0.1046) time: 2.9323 data: 0.0073 max mem: 33301 Epoch: [23] 
[3390/4276] eta: 0:43:18 lr: 2.2174183468977896e-05 loss: 0.1043 (0.1046) time: 2.9317 data: 0.0070 max mem: 33301 Epoch: [23] [3400/4276] eta: 0:42:49 lr: 2.2171303725667435e-05 loss: 0.1058 (0.1046) time: 2.9337 data: 0.0071 max mem: 33301 Epoch: [23] [3410/4276] eta: 0:42:19 lr: 2.2168423940796536e-05 loss: 0.1015 (0.1046) time: 2.9360 data: 0.0074 max mem: 33301 Epoch: [23] [3420/4276] eta: 0:41:50 lr: 2.216554411435861e-05 loss: 0.1108 (0.1046) time: 2.9428 data: 0.0072 max mem: 33301 Epoch: [23] [3430/4276] eta: 0:41:21 lr: 2.2162664246347046e-05 loss: 0.1188 (0.1047) time: 2.9433 data: 0.0072 max mem: 33301 Epoch: [23] [3440/4276] eta: 0:40:51 lr: 2.215978433675524e-05 loss: 0.1072 (0.1047) time: 2.9364 data: 0.0074 max mem: 33301 Epoch: [23] [3450/4276] eta: 0:40:22 lr: 2.21569043855766e-05 loss: 0.0998 (0.1047) time: 2.9362 data: 0.0075 max mem: 33301 Epoch: [23] [3460/4276] eta: 0:39:53 lr: 2.215402439280451e-05 loss: 0.1060 (0.1047) time: 2.9339 data: 0.0074 max mem: 33301 Epoch: [23] [3470/4276] eta: 0:39:23 lr: 2.2151144358432362e-05 loss: 0.0985 (0.1047) time: 2.9327 data: 0.0070 max mem: 33301 Epoch: [23] [3480/4276] eta: 0:38:54 lr: 2.214826428245354e-05 loss: 0.0950 (0.1047) time: 2.9356 data: 0.0071 max mem: 33301 Epoch: [23] [3490/4276] eta: 0:38:25 lr: 2.2145384164861444e-05 loss: 0.0950 (0.1047) time: 2.9391 data: 0.0076 max mem: 33301 Epoch: [23] [3500/4276] eta: 0:37:55 lr: 2.2142504005649447e-05 loss: 0.0932 (0.1047) time: 2.9301 data: 0.0074 max mem: 33301 Epoch: [23] [3510/4276] eta: 0:37:26 lr: 2.2139623804810937e-05 loss: 0.0919 (0.1047) time: 2.9119 data: 0.0073 max mem: 33301 Epoch: [23] [3520/4276] eta: 0:36:57 lr: 2.2136743562339297e-05 loss: 0.0909 (0.1047) time: 2.9180 data: 0.0071 max mem: 33301 Epoch: [23] [3530/4276] eta: 0:36:27 lr: 2.213386327822791e-05 loss: 0.0962 (0.1047) time: 2.9330 data: 0.0069 max mem: 33301 Epoch: [23] [3540/4276] eta: 0:35:58 lr: 2.2130982952470144e-05 loss: 0.1051 (0.1047) time: 2.9351 data: 0.0070 
max mem: 33301 Epoch: [23] [3550/4276] eta: 0:35:29 lr: 2.2128102585059376e-05 loss: 0.1051 (0.1047) time: 2.9345 data: 0.0069 max mem: 33301 Epoch: [23] [3560/4276] eta: 0:34:59 lr: 2.2125222175988984e-05 loss: 0.1049 (0.1047) time: 2.9340 data: 0.0069 max mem: 33301 Epoch: [23] [3570/4276] eta: 0:34:30 lr: 2.212234172525234e-05 loss: 0.1213 (0.1048) time: 2.9347 data: 0.0070 max mem: 33301 Epoch: [23] [3580/4276] eta: 0:34:01 lr: 2.2119461232842802e-05 loss: 0.0914 (0.1047) time: 2.9350 data: 0.0070 max mem: 33301 Epoch: [23] [3590/4276] eta: 0:33:31 lr: 2.2116580698753744e-05 loss: 0.0909 (0.1047) time: 2.9349 data: 0.0068 max mem: 33301 Epoch: [23] [3600/4276] eta: 0:33:02 lr: 2.2113700122978536e-05 loss: 0.1015 (0.1047) time: 2.9336 data: 0.0066 max mem: 33301 Epoch: [23] [3610/4276] eta: 0:32:33 lr: 2.211081950551054e-05 loss: 0.0985 (0.1048) time: 2.9292 data: 0.0068 max mem: 33301 Epoch: [23] [3620/4276] eta: 0:32:03 lr: 2.210793884634311e-05 loss: 0.0917 (0.1047) time: 2.9260 data: 0.0070 max mem: 33301 Epoch: [23] [3630/4276] eta: 0:31:34 lr: 2.2105058145469606e-05 loss: 0.1005 (0.1047) time: 2.9307 data: 0.0068 max mem: 33301 Epoch: [23] [3640/4276] eta: 0:31:05 lr: 2.2102177402883392e-05 loss: 0.1025 (0.1047) time: 2.9346 data: 0.0068 max mem: 33301 Epoch: [23] [3650/4276] eta: 0:30:35 lr: 2.2099296618577824e-05 loss: 0.0982 (0.1047) time: 2.9342 data: 0.0069 max mem: 33301 Epoch: [23] [3660/4276] eta: 0:30:06 lr: 2.209641579254624e-05 loss: 0.0951 (0.1047) time: 2.9336 data: 0.0068 max mem: 33301 Epoch: [23] [3670/4276] eta: 0:29:37 lr: 2.2093534924782004e-05 loss: 0.0989 (0.1047) time: 2.9335 data: 0.0067 max mem: 33301 Epoch: [23] [3680/4276] eta: 0:29:07 lr: 2.209065401527846e-05 loss: 0.1079 (0.1047) time: 2.9365 data: 0.0068 max mem: 33301 Epoch: [23] [3690/4276] eta: 0:28:38 lr: 2.2087773064028967e-05 loss: 0.1117 (0.1047) time: 2.9366 data: 0.0070 max mem: 33301 Epoch: [23] [3700/4276] eta: 0:28:09 lr: 2.208489207102685e-05 loss: 0.1117 (0.1047) 
time: 2.9351 data: 0.0072 max mem: 33301 Epoch: [23] [3710/4276] eta: 0:27:39 lr: 2.2082011036265465e-05 loss: 0.0959 (0.1046) time: 2.9366 data: 0.0071 max mem: 33301 Epoch: [23] [3720/4276] eta: 0:27:10 lr: 2.2079129959738144e-05 loss: 0.0927 (0.1046) time: 2.9350 data: 0.0072 max mem: 33301 Epoch: [23] [3730/4276] eta: 0:26:41 lr: 2.2076248841438242e-05 loss: 0.1029 (0.1046) time: 2.9349 data: 0.0072 max mem: 33301 Epoch: [23] [3740/4276] eta: 0:26:11 lr: 2.2073367681359077e-05 loss: 0.0956 (0.1046) time: 2.9344 data: 0.0072 max mem: 33301 Epoch: [23] [3750/4276] eta: 0:25:42 lr: 2.2070486479493994e-05 loss: 0.0992 (0.1046) time: 2.9331 data: 0.0073 max mem: 33301 Epoch: [23] [3760/4276] eta: 0:25:13 lr: 2.2067605235836326e-05 loss: 0.0992 (0.1046) time: 2.9336 data: 0.0073 max mem: 33301 Epoch: [23] [3770/4276] eta: 0:24:43 lr: 2.20647239503794e-05 loss: 0.0953 (0.1046) time: 2.9346 data: 0.0072 max mem: 33301 Epoch: [23] [3780/4276] eta: 0:24:14 lr: 2.2061842623116543e-05 loss: 0.0945 (0.1046) time: 2.9354 data: 0.0072 max mem: 33301 Epoch: [23] [3790/4276] eta: 0:23:45 lr: 2.2058961254041085e-05 loss: 0.0945 (0.1046) time: 2.9367 data: 0.0073 max mem: 33301 Epoch: [23] [3800/4276] eta: 0:23:16 lr: 2.2056079843146353e-05 loss: 0.0996 (0.1046) time: 2.9371 data: 0.0073 max mem: 33301 Epoch: [23] [3810/4276] eta: 0:22:46 lr: 2.2053198390425667e-05 loss: 0.0995 (0.1046) time: 2.9358 data: 0.0073 max mem: 33301 Epoch: [23] [3820/4276] eta: 0:22:17 lr: 2.2050316895872344e-05 loss: 0.0860 (0.1045) time: 2.9356 data: 0.0073 max mem: 33301 Epoch: [23] [3830/4276] eta: 0:21:48 lr: 2.2047435359479703e-05 loss: 0.0873 (0.1045) time: 2.9374 data: 0.0071 max mem: 33301 Epoch: [23] [3840/4276] eta: 0:21:18 lr: 2.2044553781241073e-05 loss: 0.1005 (0.1045) time: 2.9384 data: 0.0072 max mem: 33301 Epoch: [23] [3850/4276] eta: 0:20:49 lr: 2.2041672161149752e-05 loss: 0.0842 (0.1045) time: 2.9386 data: 0.0074 max mem: 33301 Epoch: [23] [3860/4276] eta: 0:20:20 lr: 
2.2038790499199056e-05 loss: 0.0946 (0.1045) time: 2.9384 data: 0.0074 max mem: 33301 Epoch: [23] [3870/4276] eta: 0:19:50 lr: 2.20359087953823e-05 loss: 0.1051 (0.1045) time: 2.9328 data: 0.0076 max mem: 33301 Epoch: [23] [3880/4276] eta: 0:19:21 lr: 2.2033027049692802e-05 loss: 0.1008 (0.1045) time: 2.9265 data: 0.0078 max mem: 33301 Epoch: [23] [3890/4276] eta: 0:18:52 lr: 2.2030145262123843e-05 loss: 0.0972 (0.1045) time: 2.9261 data: 0.0075 max mem: 33301 Epoch: [23] [3900/4276] eta: 0:18:22 lr: 2.2027263432668743e-05 loss: 0.0969 (0.1045) time: 2.9268 data: 0.0073 max mem: 33301 Epoch: [23] [3910/4276] eta: 0:17:53 lr: 2.2024381561320803e-05 loss: 0.0872 (0.1044) time: 2.9313 data: 0.0074 max mem: 33301 Epoch: [23] [3920/4276] eta: 0:17:24 lr: 2.202149964807333e-05 loss: 0.0850 (0.1044) time: 2.9359 data: 0.0075 max mem: 33301 Epoch: [23] [3930/4276] eta: 0:16:54 lr: 2.2018617692919608e-05 loss: 0.0976 (0.1044) time: 2.9361 data: 0.0074 max mem: 33301 Epoch: [23] [3940/4276] eta: 0:16:25 lr: 2.2015735695852937e-05 loss: 0.1059 (0.1044) time: 2.9363 data: 0.0073 max mem: 33301 Epoch: [23] [3950/4276] eta: 0:15:56 lr: 2.2012853656866612e-05 loss: 0.0988 (0.1044) time: 2.9350 data: 0.0073 max mem: 33301 Epoch: [23] [3960/4276] eta: 0:15:26 lr: 2.2009971575953935e-05 loss: 0.0937 (0.1044) time: 2.9354 data: 0.0073 max mem: 33301 Epoch: [23] [3970/4276] eta: 0:14:57 lr: 2.200708945310818e-05 loss: 0.0925 (0.1044) time: 2.9357 data: 0.0073 max mem: 33301 Epoch: [23] [3980/4276] eta: 0:14:28 lr: 2.2004207288322644e-05 loss: 0.0986 (0.1044) time: 2.9352 data: 0.0072 max mem: 33301 Epoch: [23] [3990/4276] eta: 0:13:58 lr: 2.2001325081590612e-05 loss: 0.0986 (0.1044) time: 2.9351 data: 0.0072 max mem: 33301 Epoch: [23] [4000/4276] eta: 0:13:29 lr: 2.199844283290537e-05 loss: 0.0949 (0.1044) time: 2.9359 data: 0.0072 max mem: 33301 Epoch: [23] [4010/4276] eta: 0:13:00 lr: 2.199556054226019e-05 loss: 0.0952 (0.1044) time: 2.9338 data: 0.0074 max mem: 33301 Epoch: [23] 
[4020/4276] eta: 0:12:30 lr: 2.199267820964836e-05 loss: 0.1042 (0.1044) time: 2.9347 data: 0.0076 max mem: 33301 Epoch: [23] [4030/4276] eta: 0:12:01 lr: 2.1989795835063152e-05 loss: 0.0994 (0.1044) time: 2.9376 data: 0.0075 max mem: 33301 Epoch: [23] [4040/4276] eta: 0:11:32 lr: 2.1986913418497854e-05 loss: 0.1085 (0.1045) time: 2.9372 data: 0.0073 max mem: 33301 Epoch: [23] [4050/4276] eta: 0:11:02 lr: 2.1984030959945723e-05 loss: 0.1202 (0.1045) time: 2.9353 data: 0.0073 max mem: 33301 Epoch: [23] [4060/4276] eta: 0:10:33 lr: 2.1981148459400035e-05 loss: 0.1014 (0.1045) time: 2.9416 data: 0.0073 max mem: 33301 Epoch: [23] [4070/4276] eta: 0:10:04 lr: 2.1978265916854073e-05 loss: 0.1036 (0.1045) time: 2.9427 data: 0.0075 max mem: 33301 Epoch: [23] [4080/4276] eta: 0:09:34 lr: 2.1975383332301084e-05 loss: 0.1075 (0.1045) time: 2.9363 data: 0.0075 max mem: 33301 Epoch: [23] [4090/4276] eta: 0:09:05 lr: 2.1972500705734343e-05 loss: 0.1187 (0.1046) time: 2.9372 data: 0.0073 max mem: 33301 Epoch: [23] [4100/4276] eta: 0:08:36 lr: 2.1969618037147118e-05 loss: 0.1129 (0.1046) time: 2.9366 data: 0.0073 max mem: 33301 Epoch: [23] [4110/4276] eta: 0:08:06 lr: 2.196673532653266e-05 loss: 0.1156 (0.1046) time: 2.9359 data: 0.0073 max mem: 33301 Epoch: [23] [4120/4276] eta: 0:07:37 lr: 2.1963852573884236e-05 loss: 0.1164 (0.1046) time: 2.9351 data: 0.0072 max mem: 33301 Epoch: [23] [4130/4276] eta: 0:07:08 lr: 2.1960969779195095e-05 loss: 0.1009 (0.1046) time: 2.9306 data: 0.0072 max mem: 33301 Epoch: [23] [4140/4276] eta: 0:06:38 lr: 2.19580869424585e-05 loss: 0.1009 (0.1046) time: 2.9344 data: 0.0073 max mem: 33301 Epoch: [23] [4150/4276] eta: 0:06:09 lr: 2.1955204063667703e-05 loss: 0.1156 (0.1047) time: 2.9351 data: 0.0073 max mem: 33301 Epoch: [23] [4160/4276] eta: 0:05:40 lr: 2.195232114281595e-05 loss: 0.1008 (0.1047) time: 2.9253 data: 0.0074 max mem: 33301 Epoch: [23] [4170/4276] eta: 0:05:10 lr: 2.194943817989649e-05 loss: 0.1034 (0.1047) time: 2.9324 data: 0.0079 
max mem: 33301 Epoch: [23] [4180/4276] eta: 0:04:41 lr: 2.194655517490257e-05 loss: 0.1034 (0.1047) time: 2.9386 data: 0.0078 max mem: 33301 Epoch: [23] [4190/4276] eta: 0:04:12 lr: 2.1943672127827444e-05 loss: 0.1055 (0.1047) time: 2.9341 data: 0.0071 max mem: 33301 Epoch: [23] [4200/4276] eta: 0:03:42 lr: 2.194078903866434e-05 loss: 0.1108 (0.1047) time: 2.9285 data: 0.0072 max mem: 33301 Epoch: [23] [4210/4276] eta: 0:03:13 lr: 2.1937905907406507e-05 loss: 0.1092 (0.1048) time: 2.9280 data: 0.0069 max mem: 33301 Epoch: [23] [4220/4276] eta: 0:02:44 lr: 2.1935022734047178e-05 loss: 0.1154 (0.1048) time: 2.9319 data: 0.0065 max mem: 33301 Epoch: [23] [4230/4276] eta: 0:02:14 lr: 2.19321395185796e-05 loss: 0.1186 (0.1048) time: 2.9325 data: 0.0065 max mem: 33301 Epoch: [23] [4240/4276] eta: 0:01:45 lr: 2.1929256260996994e-05 loss: 0.1092 (0.1048) time: 2.9349 data: 0.0065 max mem: 33301 Epoch: [23] [4250/4276] eta: 0:01:16 lr: 2.1926372961292597e-05 loss: 0.1092 (0.1049) time: 2.9328 data: 0.0065 max mem: 33301 Epoch: [23] [4260/4276] eta: 0:00:46 lr: 2.192348961945964e-05 loss: 0.1146 (0.1049) time: 2.9307 data: 0.0067 max mem: 33301 Epoch: [23] [4270/4276] eta: 0:00:17 lr: 2.192060623549136e-05 loss: 0.1146 (0.1050) time: 2.9309 data: 0.0066 max mem: 33301 Epoch: [23] Total time: 3:29:01 Test: [ 0/21770] eta: 8:09:22 time: 1.3487 data: 1.2866 max mem: 33301 Test: [ 100/21770] eta: 0:18:53 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 200/21770] eta: 0:16:26 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 300/21770] eta: 0:15:35 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 400/21770] eta: 0:15:08 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 500/21770] eta: 0:14:50 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 600/21770] eta: 0:14:37 time: 0.0394 data: 0.0008 max mem: 33301 Test: [ 700/21770] eta: 0:14:26 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 800/21770] eta: 0:14:17 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 900/21770] eta: 0:14:09 
time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 1000/21770] eta: 0:14:02 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 1100/21770] eta: 0:13:55 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 1200/21770] eta: 0:13:49 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 1300/21770] eta: 0:13:43 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 1400/21770] eta: 0:13:38 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 1500/21770] eta: 0:13:33 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 1600/21770] eta: 0:13:28 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 1700/21770] eta: 0:13:23 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 1800/21770] eta: 0:13:18 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 1900/21770] eta: 0:13:13 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 2000/21770] eta: 0:13:08 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 2100/21770] eta: 0:13:03 time: 0.0389 data: 0.0008 max mem: 33301 Test: [ 2200/21770] eta: 0:12:59 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 2300/21770] eta: 0:12:54 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 2400/21770] eta: 0:12:50 time: 0.0397 data: 0.0008 max mem: 33301 Test: [ 2500/21770] eta: 0:12:46 time: 0.0393 data: 0.0009 max mem: 33301 Test: [ 2600/21770] eta: 0:12:41 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 2700/21770] eta: 0:12:37 time: 0.0394 data: 0.0008 max mem: 33301 Test: [ 2800/21770] eta: 0:12:33 time: 0.0394 data: 0.0008 max mem: 33301 Test: [ 2900/21770] eta: 0:12:29 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 3000/21770] eta: 0:12:24 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 3100/21770] eta: 0:12:20 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 3200/21770] eta: 0:12:15 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 3300/21770] eta: 0:12:11 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 3400/21770] eta: 0:12:06 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 3500/21770] eta: 0:12:02 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 3600/21770] eta: 0:11:58 
time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 3700/21770] eta: 0:11:53 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 3800/21770] eta: 0:11:49 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 3900/21770] eta: 0:11:44 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 4000/21770] eta: 0:11:40 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 4100/21770] eta: 0:11:35 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 4200/21770] eta: 0:11:31 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 4300/21770] eta: 0:11:26 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 4400/21770] eta: 0:11:22 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 4500/21770] eta: 0:11:17 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 4600/21770] eta: 0:11:13 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 4700/21770] eta: 0:11:09 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 4800/21770] eta: 0:11:04 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 4900/21770] eta: 0:11:00 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 5000/21770] eta: 0:10:56 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 5100/21770] eta: 0:10:52 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 5200/21770] eta: 0:10:48 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 5300/21770] eta: 0:10:43 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 5400/21770] eta: 0:10:39 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 5500/21770] eta: 0:10:35 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 5600/21770] eta: 0:10:31 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 5700/21770] eta: 0:10:27 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 5800/21770] eta: 0:10:23 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 5900/21770] eta: 0:10:19 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 6000/21770] eta: 0:10:15 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 6100/21770] eta: 0:10:11 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 6200/21770] eta: 0:10:07 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 6300/21770] eta: 0:10:03 
time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 6400/21770] eta: 0:09:59 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 6500/21770] eta: 0:09:55 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 6600/21770] eta: 0:09:51 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 6700/21770] eta: 0:09:47 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 6800/21770] eta: 0:09:43 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 6900/21770] eta: 0:09:39 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 7000/21770] eta: 0:09:35 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7100/21770] eta: 0:09:31 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 7200/21770] eta: 0:09:27 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7300/21770] eta: 0:09:23 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 7400/21770] eta: 0:09:19 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7500/21770] eta: 0:09:15 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 7600/21770] eta: 0:09:11 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 7700/21770] eta: 0:09:07 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 7800/21770] eta: 0:09:03 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 7900/21770] eta: 0:08:59 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 8000/21770] eta: 0:08:55 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 8100/21770] eta: 0:08:51 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 8200/21770] eta: 0:08:47 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 8300/21770] eta: 0:08:43 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 8400/21770] eta: 0:08:40 time: 0.0393 data: 0.0009 max mem: 33301 Test: [ 8500/21770] eta: 0:08:36 time: 0.0399 data: 0.0009 max mem: 33301 Test: [ 8600/21770] eta: 0:08:32 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 8700/21770] eta: 0:08:28 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 8800/21770] eta: 0:08:24 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 8900/21770] eta: 0:08:20 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 9000/21770] eta: 0:08:16 
time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 9100/21770] eta: 0:08:12 time: 0.0394 data: 0.0009 max mem: 33301 Test: [ 9200/21770] eta: 0:08:09 time: 0.0387 data: 0.0008 max mem: 33301 Test: [ 9300/21770] eta: 0:08:05 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 9400/21770] eta: 0:08:01 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 9500/21770] eta: 0:07:57 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 9600/21770] eta: 0:07:53 time: 0.0384 data: 0.0008 max mem: 33301 Test: [ 9700/21770] eta: 0:07:49 time: 0.0383 data: 0.0008 max mem: 33301 Test: [ 9800/21770] eta: 0:07:45 time: 0.0385 data: 0.0008 max mem: 33301 Test: [ 9900/21770] eta: 0:07:41 time: 0.0385 data: 0.0009 max mem: 33301 Test: [10000/21770] eta: 0:07:37 time: 0.0394 data: 0.0008 max mem: 33301 Test: [10100/21770] eta: 0:07:34 time: 0.0393 data: 0.0009 max mem: 33301 Test: [10200/21770] eta: 0:07:30 time: 0.0393 data: 0.0008 max mem: 33301 Test: [10300/21770] eta: 0:07:26 time: 0.0391 data: 0.0008 max mem: 33301 Test: [10400/21770] eta: 0:07:22 time: 0.0391 data: 0.0008 max mem: 33301 Test: [10500/21770] eta: 0:07:18 time: 0.0388 data: 0.0008 max mem: 33301 Test: [10600/21770] eta: 0:07:14 time: 0.0390 data: 0.0008 max mem: 33301 Test: [10700/21770] eta: 0:07:10 time: 0.0385 data: 0.0009 max mem: 33301 Test: [10800/21770] eta: 0:07:06 time: 0.0388 data: 0.0013 max mem: 33301 Test: [10900/21770] eta: 0:07:03 time: 0.0386 data: 0.0009 max mem: 33301 Test: [11000/21770] eta: 0:06:59 time: 0.0398 data: 0.0008 max mem: 33301 Test: [11100/21770] eta: 0:06:55 time: 0.0400 data: 0.0008 max mem: 33301 Test: [11200/21770] eta: 0:06:51 time: 0.0400 data: 0.0008 max mem: 33301 Test: [11300/21770] eta: 0:06:47 time: 0.0401 data: 0.0008 max mem: 33301 Test: [11400/21770] eta: 0:06:44 time: 0.0398 data: 0.0008 max mem: 33301 Test: [11500/21770] eta: 0:06:40 time: 0.0399 data: 0.0008 max mem: 33301 Test: [11600/21770] eta: 0:06:36 time: 0.0389 data: 0.0008 max mem: 33301 Test: [11700/21770] eta: 0:06:32 
time: 0.0389 data: 0.0008 max mem: 33301 Test: [11800/21770] eta: 0:06:28 time: 0.0392 data: 0.0008 max mem: 33301 Test: [11900/21770] eta: 0:06:24 time: 0.0391 data: 0.0008 max mem: 33301 Test: [12000/21770] eta: 0:06:20 time: 0.0393 data: 0.0008 max mem: 33301 Test: [12100/21770] eta: 0:06:16 time: 0.0391 data: 0.0008 max mem: 33301 Test: [12200/21770] eta: 0:06:13 time: 0.0390 data: 0.0008 max mem: 33301 Test: [12300/21770] eta: 0:06:09 time: 0.0388 data: 0.0008 max mem: 33301 Test: [12400/21770] eta: 0:06:05 time: 0.0388 data: 0.0008 max mem: 33301 Test: [12500/21770] eta: 0:06:01 time: 0.0382 data: 0.0008 max mem: 33301 Test: [12600/21770] eta: 0:05:57 time: 0.0381 data: 0.0009 max mem: 33301 Test: [12700/21770] eta: 0:05:53 time: 0.0380 data: 0.0009 max mem: 33301 Test: [12800/21770] eta: 0:05:49 time: 0.0380 data: 0.0008 max mem: 33301 Test: [12900/21770] eta: 0:05:45 time: 0.0380 data: 0.0007 max mem: 33301 Test: [13000/21770] eta: 0:05:41 time: 0.0390 data: 0.0009 max mem: 33301 Test: [13100/21770] eta: 0:05:37 time: 0.0381 data: 0.0009 max mem: 33301 Test: [13200/21770] eta: 0:05:33 time: 0.0382 data: 0.0010 max mem: 33301 Test: [13300/21770] eta: 0:05:29 time: 0.0381 data: 0.0009 max mem: 33301 Test: [13400/21770] eta: 0:05:25 time: 0.0380 data: 0.0009 max mem: 33301 Test: [13500/21770] eta: 0:05:21 time: 0.0381 data: 0.0009 max mem: 33301 Test: [13600/21770] eta: 0:05:17 time: 0.0379 data: 0.0009 max mem: 33301 Test: [13700/21770] eta: 0:05:14 time: 0.0401 data: 0.0008 max mem: 33301 Test: [13800/21770] eta: 0:05:10 time: 0.0399 data: 0.0008 max mem: 33301 Test: [13900/21770] eta: 0:05:06 time: 0.0377 data: 0.0009 max mem: 33301 Test: [14000/21770] eta: 0:05:02 time: 0.0377 data: 0.0009 max mem: 33301 Test: [14100/21770] eta: 0:04:58 time: 0.0378 data: 0.0009 max mem: 33301 Test: [14200/21770] eta: 0:04:54 time: 0.0378 data: 0.0009 max mem: 33301 Test: [14300/21770] eta: 0:04:50 time: 0.0378 data: 0.0009 max mem: 33301 Test: [14400/21770] eta: 0:04:46 
time: 0.0377 data: 0.0009 max mem: 33301 Test: [14500/21770] eta: 0:04:42 time: 0.0378 data: 0.0009 max mem: 33301 Test: [14600/21770] eta: 0:04:38 time: 0.0377 data: 0.0009 max mem: 33301 Test: [14700/21770] eta: 0:04:34 time: 0.0380 data: 0.0009 max mem: 33301 Test: [14800/21770] eta: 0:04:30 time: 0.0381 data: 0.0009 max mem: 33301 Test: [14900/21770] eta: 0:04:26 time: 0.0381 data: 0.0009 max mem: 33301 Test: [15000/21770] eta: 0:04:22 time: 0.0381 data: 0.0009 max mem: 33301 Test: [15100/21770] eta: 0:04:19 time: 0.0380 data: 0.0009 max mem: 33301 Test: [15200/21770] eta: 0:04:15 time: 0.0382 data: 0.0009 max mem: 33301 Test: [15300/21770] eta: 0:04:11 time: 0.0380 data: 0.0009 max mem: 33301 Test: [15400/21770] eta: 0:04:07 time: 0.0382 data: 0.0009 max mem: 33301 Test: [15500/21770] eta: 0:04:03 time: 0.0383 data: 0.0009 max mem: 33301 Test: [15600/21770] eta: 0:03:59 time: 0.0386 data: 0.0009 max mem: 33301 Test: [15700/21770] eta: 0:03:55 time: 0.0383 data: 0.0009 max mem: 33301 Test: [15800/21770] eta: 0:03:51 time: 0.0388 data: 0.0009 max mem: 33301 Test: [15900/21770] eta: 0:03:47 time: 0.0386 data: 0.0009 max mem: 33301 Test: [16000/21770] eta: 0:03:43 time: 0.0389 data: 0.0009 max mem: 33301 Test: [16100/21770] eta: 0:03:40 time: 0.0385 data: 0.0009 max mem: 33301 Test: [16200/21770] eta: 0:03:36 time: 0.0387 data: 0.0009 max mem: 33301 Test: [16300/21770] eta: 0:03:32 time: 0.0386 data: 0.0009 max mem: 33301 Test: [16400/21770] eta: 0:03:28 time: 0.0389 data: 0.0009 max mem: 33301 Test: [16500/21770] eta: 0:03:24 time: 0.0384 data: 0.0009 max mem: 33301 Test: [16600/21770] eta: 0:03:20 time: 0.0390 data: 0.0009 max mem: 33301 Test: [16700/21770] eta: 0:03:16 time: 0.0383 data: 0.0009 max mem: 33301 Test: [16800/21770] eta: 0:03:12 time: 0.0386 data: 0.0009 max mem: 33301 Test: [16900/21770] eta: 0:03:08 time: 0.0383 data: 0.0009 max mem: 33301 Test: [17000/21770] eta: 0:03:05 time: 0.0384 data: 0.0009 max mem: 33301 Test: [17100/21770] eta: 0:03:01 
time: 0.0381 data: 0.0009 max mem: 33301 Test: [17200/21770] eta: 0:02:57 time: 0.0384 data: 0.0009 max mem: 33301 Test: [17300/21770] eta: 0:02:53 time: 0.0380 data: 0.0009 max mem: 33301 Test: [17400/21770] eta: 0:02:49 time: 0.0384 data: 0.0009 max mem: 33301 Test: [17500/21770] eta: 0:02:45 time: 0.0381 data: 0.0009 max mem: 33301 Test: [17600/21770] eta: 0:02:41 time: 0.0384 data: 0.0009 max mem: 33301 Test: [17700/21770] eta: 0:02:37 time: 0.0383 data: 0.0009 max mem: 33301 Test: [17800/21770] eta: 0:02:33 time: 0.0385 data: 0.0009 max mem: 33301 Test: [17900/21770] eta: 0:02:30 time: 0.0381 data: 0.0009 max mem: 33301 Test: [18000/21770] eta: 0:02:26 time: 0.0385 data: 0.0009 max mem: 33301 Test: [18100/21770] eta: 0:02:22 time: 0.0381 data: 0.0009 max mem: 33301 Test: [18200/21770] eta: 0:02:18 time: 0.0384 data: 0.0009 max mem: 33301 Test: [18300/21770] eta: 0:02:14 time: 0.0380 data: 0.0009 max mem: 33301 Test: [18400/21770] eta: 0:02:10 time: 0.0384 data: 0.0009 max mem: 33301 Test: [18500/21770] eta: 0:02:06 time: 0.0383 data: 0.0009 max mem: 33301 Test: [18600/21770] eta: 0:02:02 time: 0.0386 data: 0.0009 max mem: 33301 Test: [18700/21770] eta: 0:01:58 time: 0.0381 data: 0.0009 max mem: 33301 Test: [18800/21770] eta: 0:01:55 time: 0.0385 data: 0.0009 max mem: 33301 Test: [18900/21770] eta: 0:01:51 time: 0.0381 data: 0.0009 max mem: 33301 Test: [19000/21770] eta: 0:01:47 time: 0.0385 data: 0.0009 max mem: 33301 Test: [19100/21770] eta: 0:01:43 time: 0.0381 data: 0.0009 max mem: 33301 Test: [19200/21770] eta: 0:01:39 time: 0.0384 data: 0.0009 max mem: 33301 Test: [19300/21770] eta: 0:01:35 time: 0.0381 data: 0.0009 max mem: 33301 Test: [19400/21770] eta: 0:01:31 time: 0.0384 data: 0.0009 max mem: 33301 Test: [19500/21770] eta: 0:01:27 time: 0.0382 data: 0.0009 max mem: 33301 Test: [19600/21770] eta: 0:01:24 time: 0.0385 data: 0.0009 max mem: 33301 Test: [19700/21770] eta: 0:01:20 time: 0.0382 data: 0.0009 max mem: 33301 Test: [19800/21770] eta: 0:01:16 
time: 0.0385 data: 0.0009 max mem: 33301
Test: [19900/21770] eta: 0:01:12 time: 0.0382 data: 0.0009 max mem: 33301
Test: [20000/21770] eta: 0:01:08 time: 0.0384 data: 0.0009 max mem: 33301
Test: [20100/21770] eta: 0:01:04 time: 0.0380 data: 0.0009 max mem: 33301
Test: [20200/21770] eta: 0:01:00 time: 0.0381 data: 0.0009 max mem: 33301
Test: [20300/21770] eta: 0:00:56 time: 0.0380 data: 0.0009 max mem: 33301
Test: [20400/21770] eta: 0:00:53 time: 0.0381 data: 0.0009 max mem: 33301
Test: [20500/21770] eta: 0:00:49 time: 0.0380 data: 0.0009 max mem: 33301
Test: [20600/21770] eta: 0:00:45 time: 0.0381 data: 0.0009 max mem: 33301
Test: [20700/21770] eta: 0:00:41 time: 0.0382 data: 0.0009 max mem: 33301
Test: [20800/21770] eta: 0:00:37 time: 0.0384 data: 0.0009 max mem: 33301
Test: [20900/21770] eta: 0:00:33 time: 0.0382 data: 0.0009 max mem: 33301
Test: [21000/21770] eta: 0:00:29 time: 0.0387 data: 0.0009 max mem: 33301
Test: [21100/21770] eta: 0:00:25 time: 0.0382 data: 0.0009 max mem: 33301
Test: [21200/21770] eta: 0:00:22 time: 0.0384 data: 0.0009 max mem: 33301
Test: [21300/21770] eta: 0:00:18 time: 0.0380 data: 0.0009 max mem: 33301
Test: [21400/21770] eta: 0:00:14 time: 0.0381 data: 0.0009 max mem: 33301
Test: [21500/21770] eta: 0:00:10 time: 0.0391 data: 0.0009 max mem: 33301
Test: [21600/21770] eta: 0:00:06 time: 0.0398 data: 0.0009 max mem: 33301
Test: [21700/21770] eta: 0:00:02 time: 0.0398 data: 0.0009 max mem: 33301
Test: Total time: 0:14:02
Final results:
Mean IoU is 0.68
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.67
mean IoU = 0.68
Mean accuracy for one-to-zero sample is 0.00
Average object IoU 0.006830634277886781
Overall IoU 0.6675364971160889
Epoch: [24] [ 0/4276] eta: 6:03:37 lr: 2.191887618488261e-05 loss: 0.0987 (0.0987) time: 5.1023 data: 2.0087 max mem: 33301
Epoch: [24] [ 10/4276] eta: 3:42:19 lr: 2.1915992733483698e-05 loss: 0.1112 (0.1150) time: 3.1269 data: 0.1895 max
mem: 33301 Epoch: [24] [ 20/4276] eta: 3:35:21 lr: 2.191310923993184e-05 loss: 0.1089 (0.1145) time: 2.9327 data: 0.0071 max mem: 33301 Epoch: [24] [ 30/4276] eta: 3:32:44 lr: 2.191022570422025e-05 loss: 0.1046 (0.1125) time: 2.9397 data: 0.0072 max mem: 33301 Epoch: [24] [ 40/4276] eta: 3:31:01 lr: 2.1907342126342155e-05 loss: 0.1005 (0.1113) time: 2.9397 data: 0.0076 max mem: 33301 Epoch: [24] [ 50/4276] eta: 3:29:45 lr: 2.1904458506290765e-05 loss: 0.1030 (0.1083) time: 2.9347 data: 0.0072 max mem: 33301 Epoch: [24] [ 60/4276] eta: 3:28:44 lr: 2.1901574844059292e-05 loss: 0.1005 (0.1057) time: 2.9330 data: 0.0069 max mem: 33301 Epoch: [24] [ 70/4276] eta: 3:27:52 lr: 2.189869113964095e-05 loss: 0.0833 (0.1034) time: 2.9331 data: 0.0068 max mem: 33301 Epoch: [24] [ 80/4276] eta: 3:27:06 lr: 2.189580739302896e-05 loss: 0.0904 (0.1027) time: 2.9341 data: 0.0069 max mem: 33301 Epoch: [24] [ 90/4276] eta: 3:26:26 lr: 2.189292360421651e-05 loss: 0.0946 (0.1025) time: 2.9360 data: 0.0068 max mem: 33301 Epoch: [24] [ 100/4276] eta: 3:25:46 lr: 2.1890039773196823e-05 loss: 0.0946 (0.1034) time: 2.9358 data: 0.0067 max mem: 33301 Epoch: [24] [ 110/4276] eta: 3:25:08 lr: 2.188715589996309e-05 loss: 0.1121 (0.1055) time: 2.9340 data: 0.0067 max mem: 33301 Epoch: [24] [ 120/4276] eta: 3:24:34 lr: 2.1884271984508526e-05 loss: 0.1083 (0.1053) time: 2.9380 data: 0.0068 max mem: 33301 Epoch: [24] [ 130/4276] eta: 3:23:58 lr: 2.188138802682632e-05 loss: 0.1056 (0.1051) time: 2.9371 data: 0.0068 max mem: 33301 Epoch: [24] [ 140/4276] eta: 3:23:24 lr: 2.187850402690967e-05 loss: 0.0917 (0.1044) time: 2.9339 data: 0.0066 max mem: 33301 Epoch: [24] [ 150/4276] eta: 3:22:50 lr: 2.1875619984751773e-05 loss: 0.0917 (0.1038) time: 2.9365 data: 0.0067 max mem: 33301 Epoch: [24] [ 160/4276] eta: 3:22:18 lr: 2.1872735900345835e-05 loss: 0.0954 (0.1034) time: 2.9373 data: 0.0067 max mem: 33301 Epoch: [24] [ 170/4276] eta: 3:21:44 lr: 2.1869851773685025e-05 loss: 0.0980 (0.1034) time: 2.9355 
data: 0.0066 max mem: 33301 Epoch: [24] [ 180/4276] eta: 3:21:12 lr: 2.1866967604762546e-05 loss: 0.0980 (0.1036) time: 2.9345 data: 0.0066 max mem: 33301 Epoch: [24] [ 190/4276] eta: 3:20:39 lr: 2.1864083393571585e-05 loss: 0.0916 (0.1033) time: 2.9340 data: 0.0066 max mem: 33301 Epoch: [24] [ 200/4276] eta: 3:19:58 lr: 2.186119914010532e-05 loss: 0.0883 (0.1024) time: 2.9089 data: 0.0075 max mem: 33301 Epoch: [24] [ 210/4276] eta: 3:19:17 lr: 2.185831484435694e-05 loss: 0.0962 (0.1024) time: 2.8851 data: 0.0082 max mem: 33301 Epoch: [24] [ 220/4276] eta: 3:18:45 lr: 2.1855430506319622e-05 loss: 0.0931 (0.1018) time: 2.9055 data: 0.0087 max mem: 33301 Epoch: [24] [ 230/4276] eta: 3:18:12 lr: 2.185254612598655e-05 loss: 0.0863 (0.1011) time: 2.9250 data: 0.0085 max mem: 33301 Epoch: [24] [ 240/4276] eta: 3:17:37 lr: 2.1849661703350893e-05 loss: 0.0874 (0.1009) time: 2.9135 data: 0.0073 max mem: 33301 Epoch: [24] [ 250/4276] eta: 3:17:07 lr: 2.1846777238405832e-05 loss: 0.0984 (0.1016) time: 2.9199 data: 0.0072 max mem: 33301 Epoch: [24] [ 260/4276] eta: 3:16:38 lr: 2.1843892731144533e-05 loss: 0.0984 (0.1015) time: 2.9388 data: 0.0070 max mem: 33301 Epoch: [24] [ 270/4276] eta: 3:16:09 lr: 2.1841008181560176e-05 loss: 0.0930 (0.1015) time: 2.9386 data: 0.0068 max mem: 33301 Epoch: [24] [ 280/4276] eta: 3:15:39 lr: 2.1838123589645916e-05 loss: 0.0839 (0.1013) time: 2.9369 data: 0.0070 max mem: 33301 Epoch: [24] [ 290/4276] eta: 3:15:09 lr: 2.183523895539493e-05 loss: 0.0906 (0.1015) time: 2.9355 data: 0.0069 max mem: 33301 Epoch: [24] [ 300/4276] eta: 3:14:40 lr: 2.1832354278800375e-05 loss: 0.0974 (0.1015) time: 2.9355 data: 0.0071 max mem: 33301 Epoch: [24] [ 310/4276] eta: 3:14:10 lr: 2.1829469559855422e-05 loss: 0.0951 (0.1013) time: 2.9366 data: 0.0071 max mem: 33301 Epoch: [24] [ 320/4276] eta: 3:13:38 lr: 2.182658479855322e-05 loss: 0.0946 (0.1017) time: 2.9265 data: 0.0071 max mem: 33301 Epoch: [24] [ 330/4276] eta: 3:13:09 lr: 2.182369999488693e-05 loss: 
0.1089 (0.1020) time: 2.9263 data: 0.0070 max mem: 33301 Epoch: [24] [ 340/4276] eta: 3:12:39 lr: 2.1820815148849707e-05 loss: 0.0991 (0.1018) time: 2.9356 data: 0.0066 max mem: 33301 Epoch: [24] [ 350/4276] eta: 3:12:10 lr: 2.1817930260434715e-05 loss: 0.0991 (0.1020) time: 2.9371 data: 0.0065 max mem: 33301 Epoch: [24] [ 360/4276] eta: 3:11:39 lr: 2.181504532963509e-05 loss: 0.1013 (0.1028) time: 2.9281 data: 0.0066 max mem: 33301 Epoch: [24] [ 370/4276] eta: 3:11:05 lr: 2.1812160356443985e-05 loss: 0.0991 (0.1028) time: 2.9075 data: 0.0075 max mem: 33301 Epoch: [24] [ 380/4276] eta: 3:10:36 lr: 2.180927534085455e-05 loss: 0.0991 (0.1028) time: 2.9176 data: 0.0074 max mem: 33301 Epoch: [24] [ 390/4276] eta: 3:10:07 lr: 2.180639028285994e-05 loss: 0.1025 (0.1028) time: 2.9357 data: 0.0064 max mem: 33301 Epoch: [24] [ 400/4276] eta: 3:09:37 lr: 2.1803505182453278e-05 loss: 0.1010 (0.1029) time: 2.9351 data: 0.0060 max mem: 33301 Epoch: [24] [ 410/4276] eta: 3:09:10 lr: 2.1800620039627712e-05 loss: 0.1041 (0.1030) time: 2.9469 data: 0.0060 max mem: 33301 Epoch: [24] [ 420/4276] eta: 3:08:40 lr: 2.1797734854376383e-05 loss: 0.1041 (0.1032) time: 2.9428 data: 0.0063 max mem: 33301 Epoch: [24] [ 430/4276] eta: 3:08:07 lr: 2.1794849626692435e-05 loss: 0.1049 (0.1035) time: 2.9122 data: 0.0071 max mem: 33301 Epoch: [24] [ 440/4276] eta: 3:07:38 lr: 2.1791964356568987e-05 loss: 0.1005 (0.1034) time: 2.9160 data: 0.0076 max mem: 33301 Epoch: [24] [ 450/4276] eta: 3:07:09 lr: 2.1789079043999178e-05 loss: 0.0956 (0.1034) time: 2.9358 data: 0.0071 max mem: 33301 Epoch: [24] [ 460/4276] eta: 3:06:39 lr: 2.1786193688976144e-05 loss: 0.0956 (0.1031) time: 2.9359 data: 0.0066 max mem: 33301 Epoch: [24] [ 470/4276] eta: 3:06:10 lr: 2.1783308291493003e-05 loss: 0.0834 (0.1028) time: 2.9374 data: 0.0065 max mem: 33301 Epoch: [24] [ 480/4276] eta: 3:05:41 lr: 2.1780422851542885e-05 loss: 0.0917 (0.1029) time: 2.9390 data: 0.0065 max mem: 33301 Epoch: [24] [ 490/4276] eta: 3:05:12 lr: 
2.1777537369118917e-05 loss: 0.0984 (0.1028) time: 2.9385 data: 0.0063 max mem: 33301 Epoch: [24] [ 500/4276] eta: 3:04:43 lr: 2.177465184421422e-05 loss: 0.0925 (0.1027) time: 2.9371 data: 0.0060 max mem: 33301 Epoch: [24] [ 510/4276] eta: 3:04:13 lr: 2.1771766276821905e-05 loss: 0.0904 (0.1028) time: 2.9328 data: 0.0060 max mem: 33301 Epoch: [24] [ 520/4276] eta: 3:03:43 lr: 2.1768880666935097e-05 loss: 0.0922 (0.1028) time: 2.9298 data: 0.0060 max mem: 33301 Epoch: [24] [ 530/4276] eta: 3:03:14 lr: 2.1765995014546908e-05 loss: 0.1004 (0.1027) time: 2.9324 data: 0.0061 max mem: 33301 Epoch: [24] [ 540/4276] eta: 3:02:44 lr: 2.176310931965046e-05 loss: 0.0950 (0.1026) time: 2.9335 data: 0.0061 max mem: 33301 Epoch: [24] [ 550/4276] eta: 3:02:15 lr: 2.1760223582238852e-05 loss: 0.0981 (0.1029) time: 2.9306 data: 0.0061 max mem: 33301 Epoch: [24] [ 560/4276] eta: 3:01:45 lr: 2.1757337802305197e-05 loss: 0.1133 (0.1029) time: 2.9310 data: 0.0061 max mem: 33301 Epoch: [24] [ 570/4276] eta: 3:01:16 lr: 2.17544519798426e-05 loss: 0.1037 (0.1029) time: 2.9350 data: 0.0061 max mem: 33301 Epoch: [24] [ 580/4276] eta: 3:00:45 lr: 2.1751566114844173e-05 loss: 0.0959 (0.1030) time: 2.9191 data: 0.0066 max mem: 33301 Epoch: [24] [ 590/4276] eta: 3:00:16 lr: 2.1748680207303012e-05 loss: 0.0855 (0.1028) time: 2.9230 data: 0.0076 max mem: 33301 Epoch: [24] [ 600/4276] eta: 2:59:46 lr: 2.1745794257212215e-05 loss: 0.0862 (0.1026) time: 2.9385 data: 0.0073 max mem: 33301 Epoch: [24] [ 610/4276] eta: 2:59:17 lr: 2.1742908264564883e-05 loss: 0.1005 (0.1026) time: 2.9343 data: 0.0068 max mem: 33301 Epoch: [24] [ 620/4276] eta: 2:58:48 lr: 2.1740022229354122e-05 loss: 0.0974 (0.1025) time: 2.9338 data: 0.0065 max mem: 33301 Epoch: [24] [ 630/4276] eta: 2:58:18 lr: 2.173713615157301e-05 loss: 0.0993 (0.1027) time: 2.9321 data: 0.0061 max mem: 33301 Epoch: [24] [ 640/4276] eta: 2:57:49 lr: 2.173425003121464e-05 loss: 0.1014 (0.1026) time: 2.9329 data: 0.0061 max mem: 33301 Epoch: [24] [ 
650/4276] eta: 2:57:19 lr: 2.173136386827211e-05 loss: 0.0973 (0.1027) time: 2.9337 data: 0.0061 max mem: 33301 Epoch: [24] [ 660/4276] eta: 2:56:50 lr: 2.1728477662738515e-05 loss: 0.1071 (0.1028) time: 2.9333 data: 0.0061 max mem: 33301 Epoch: [24] [ 670/4276] eta: 2:56:21 lr: 2.1725591414606916e-05 loss: 0.1087 (0.1029) time: 2.9352 data: 0.0062 max mem: 33301 Epoch: [24] [ 680/4276] eta: 2:55:51 lr: 2.1722705123870415e-05 loss: 0.1056 (0.1029) time: 2.9360 data: 0.0063 max mem: 33301 Epoch: [24] [ 690/4276] eta: 2:55:22 lr: 2.1719818790522088e-05 loss: 0.0914 (0.1029) time: 2.9331 data: 0.0061 max mem: 33301 Epoch: [24] [ 700/4276] eta: 2:54:52 lr: 2.1716932414555014e-05 loss: 0.0914 (0.1029) time: 2.9319 data: 0.0062 max mem: 33301 Epoch: [24] [ 710/4276] eta: 2:54:23 lr: 2.1714045995962267e-05 loss: 0.0978 (0.1030) time: 2.9327 data: 0.0062 max mem: 33301 Epoch: [24] [ 720/4276] eta: 2:53:54 lr: 2.1711159534736925e-05 loss: 0.0918 (0.1028) time: 2.9327 data: 0.0062 max mem: 33301 Epoch: [24] [ 730/4276] eta: 2:53:24 lr: 2.1708273030872057e-05 loss: 0.0953 (0.1027) time: 2.9323 data: 0.0062 max mem: 33301 Epoch: [24] [ 740/4276] eta: 2:52:55 lr: 2.170538648436074e-05 loss: 0.1002 (0.1027) time: 2.9335 data: 0.0061 max mem: 33301 Epoch: [24] [ 750/4276] eta: 2:52:26 lr: 2.1702499895196035e-05 loss: 0.0995 (0.1028) time: 2.9340 data: 0.0062 max mem: 33301 Epoch: [24] [ 760/4276] eta: 2:51:56 lr: 2.169961326337101e-05 loss: 0.0952 (0.1027) time: 2.9326 data: 0.0061 max mem: 33301 Epoch: [24] [ 770/4276] eta: 2:51:27 lr: 2.1696726588878734e-05 loss: 0.0922 (0.1026) time: 2.9329 data: 0.0062 max mem: 33301 Epoch: [24] [ 780/4276] eta: 2:50:57 lr: 2.169383987171226e-05 loss: 0.0943 (0.1025) time: 2.9326 data: 0.0061 max mem: 33301 Epoch: [24] [ 790/4276] eta: 2:50:28 lr: 2.1690953111864654e-05 loss: 0.1003 (0.1030) time: 2.9324 data: 0.0061 max mem: 33301 Epoch: [24] [ 800/4276] eta: 2:49:58 lr: 2.1688066309328965e-05 loss: 0.1046 (0.1030) time: 2.9323 data: 0.0062 
max mem: 33301 Epoch: [24] [ 810/4276] eta: 2:49:29 lr: 2.1685179464098266e-05 loss: 0.1059 (0.1031) time: 2.9324 data: 0.0062 max mem: 33301 Epoch: [24] [ 820/4276] eta: 2:49:00 lr: 2.168229257616559e-05 loss: 0.0996 (0.1030) time: 2.9336 data: 0.0061 max mem: 33301 Epoch: [24] [ 830/4276] eta: 2:48:30 lr: 2.1679405645523994e-05 loss: 0.0999 (0.1030) time: 2.9337 data: 0.0061 max mem: 33301 Epoch: [24] [ 840/4276] eta: 2:48:01 lr: 2.1676518672166533e-05 loss: 0.1012 (0.1030) time: 2.9332 data: 0.0061 max mem: 33301 Epoch: [24] [ 850/4276] eta: 2:47:32 lr: 2.1673631656086252e-05 loss: 0.0917 (0.1028) time: 2.9326 data: 0.0062 max mem: 33301 Epoch: [24] [ 860/4276] eta: 2:47:02 lr: 2.167074459727619e-05 loss: 0.0929 (0.1030) time: 2.9265 data: 0.0062 max mem: 33301 Epoch: [24] [ 870/4276] eta: 2:46:32 lr: 2.1667857495729392e-05 loss: 0.1004 (0.1029) time: 2.9230 data: 0.0062 max mem: 33301 Epoch: [24] [ 880/4276] eta: 2:46:03 lr: 2.1664970351438894e-05 loss: 0.1002 (0.1031) time: 2.9295 data: 0.0061 max mem: 33301 Epoch: [24] [ 890/4276] eta: 2:45:33 lr: 2.166208316439775e-05 loss: 0.1178 (0.1034) time: 2.9327 data: 0.0061 max mem: 33301 Epoch: [24] [ 900/4276] eta: 2:45:04 lr: 2.1659195934598973e-05 loss: 0.1068 (0.1034) time: 2.9315 data: 0.0061 max mem: 33301 Epoch: [24] [ 910/4276] eta: 2:44:34 lr: 2.165630866203561e-05 loss: 0.1068 (0.1034) time: 2.9314 data: 0.0061 max mem: 33301 Epoch: [24] [ 920/4276] eta: 2:44:05 lr: 2.1653421346700688e-05 loss: 0.1081 (0.1035) time: 2.9314 data: 0.0063 max mem: 33301 Epoch: [24] [ 930/4276] eta: 2:43:36 lr: 2.1650533988587246e-05 loss: 0.1022 (0.1035) time: 2.9378 data: 0.0062 max mem: 33301 Epoch: [24] [ 940/4276] eta: 2:43:07 lr: 2.1647646587688296e-05 loss: 0.0947 (0.1034) time: 2.9384 data: 0.0063 max mem: 33301 Epoch: [24] [ 950/4276] eta: 2:42:38 lr: 2.164475914399687e-05 loss: 0.0970 (0.1036) time: 2.9393 data: 0.0064 max mem: 33301 Epoch: [24] [ 960/4276] eta: 2:42:08 lr: 2.1641871657505992e-05 loss: 0.1079 
(0.1036) time: 2.9388 data: 0.0062 max mem: 33301 Epoch: [24] [ 970/4276] eta: 2:41:37 lr: 2.163898412820869e-05 loss: 0.1027 (0.1036) time: 2.9091 data: 0.0071 max mem: 33301 Epoch: [24] [ 980/4276] eta: 2:41:08 lr: 2.1636096556097964e-05 loss: 0.1027 (0.1036) time: 2.9096 data: 0.0074 max mem: 33301 Epoch: [24] [ 990/4276] eta: 2:40:37 lr: 2.1633208941166843e-05 loss: 0.1022 (0.1036) time: 2.9138 data: 0.0070 max mem: 33301 Epoch: [24] [1000/4276] eta: 2:40:07 lr: 2.1630321283408333e-05 loss: 0.0953 (0.1036) time: 2.9037 data: 0.0072 max mem: 33301 Epoch: [24] [1010/4276] eta: 2:39:37 lr: 2.1627433582815462e-05 loss: 0.0941 (0.1035) time: 2.9049 data: 0.0078 max mem: 33301 Epoch: [24] [1020/4276] eta: 2:39:08 lr: 2.162454583938122e-05 loss: 0.0941 (0.1035) time: 2.9195 data: 0.0080 max mem: 33301 Epoch: [24] [1030/4276] eta: 2:38:39 lr: 2.1621658053098627e-05 loss: 0.1051 (0.1036) time: 2.9372 data: 0.0071 max mem: 33301 Epoch: [24] [1040/4276] eta: 2:38:09 lr: 2.161877022396068e-05 loss: 0.0970 (0.1035) time: 2.9340 data: 0.0067 max mem: 33301 Epoch: [24] [1050/4276] eta: 2:37:40 lr: 2.1615882351960396e-05 loss: 0.0954 (0.1036) time: 2.9340 data: 0.0067 max mem: 33301 Epoch: [24] [1060/4276] eta: 2:37:11 lr: 2.1612994437090758e-05 loss: 0.0977 (0.1036) time: 2.9355 data: 0.0065 max mem: 33301 Epoch: [24] [1070/4276] eta: 2:36:42 lr: 2.1610106479344776e-05 loss: 0.1112 (0.1038) time: 2.9377 data: 0.0065 max mem: 33301 Epoch: [24] [1080/4276] eta: 2:36:12 lr: 2.160721847871545e-05 loss: 0.1083 (0.1039) time: 2.9369 data: 0.0068 max mem: 33301 Epoch: [24] [1090/4276] eta: 2:35:43 lr: 2.160433043519576e-05 loss: 0.1105 (0.1040) time: 2.9339 data: 0.0068 max mem: 33301 Epoch: [24] [1100/4276] eta: 2:35:14 lr: 2.160144234877871e-05 loss: 0.1093 (0.1041) time: 2.9327 data: 0.0065 max mem: 33301 Epoch: [24] [1110/4276] eta: 2:34:45 lr: 2.1598554219457283e-05 loss: 0.1081 (0.1042) time: 2.9370 data: 0.0065 max mem: 33301 Epoch: [24] [1120/4276] eta: 2:34:15 lr: 
2.159566604722448e-05 loss: 0.1004 (0.1042) time: 2.9354 data: 0.0066 max mem: 33301 Epoch: [24] [1130/4276] eta: 2:33:46 lr: 2.159277783207327e-05 loss: 0.0988 (0.1042) time: 2.9318 data: 0.0066 max mem: 33301 Epoch: [24] [1140/4276] eta: 2:33:17 lr: 2.1589889573996645e-05 loss: 0.1038 (0.1043) time: 2.9311 data: 0.0064 max mem: 33301 Epoch: [24] [1150/4276] eta: 2:32:47 lr: 2.1587001272987583e-05 loss: 0.1005 (0.1043) time: 2.9295 data: 0.0063 max mem: 33301 Epoch: [24] [1160/4276] eta: 2:32:18 lr: 2.1584112929039074e-05 loss: 0.0961 (0.1044) time: 2.9314 data: 0.0062 max mem: 33301 Epoch: [24] [1170/4276] eta: 2:31:48 lr: 2.1581224542144077e-05 loss: 0.1085 (0.1044) time: 2.9168 data: 0.0062 max mem: 33301 Epoch: [24] [1180/4276] eta: 2:31:17 lr: 2.1578336112295578e-05 loss: 0.1047 (0.1044) time: 2.8914 data: 0.0072 max mem: 33301 Epoch: [24] [1190/4276] eta: 2:30:48 lr: 2.157544763948655e-05 loss: 0.0883 (0.1043) time: 2.9098 data: 0.0075 max mem: 33301 Epoch: [24] [1200/4276] eta: 2:30:18 lr: 2.1572559123709964e-05 loss: 0.0994 (0.1044) time: 2.9346 data: 0.0068 max mem: 33301 Epoch: [24] [1210/4276] eta: 2:29:49 lr: 2.1569670564958778e-05 loss: 0.0994 (0.1043) time: 2.9303 data: 0.0069 max mem: 33301 Epoch: [24] [1220/4276] eta: 2:29:20 lr: 2.1566781963225962e-05 loss: 0.1038 (0.1043) time: 2.9296 data: 0.0068 max mem: 33301 Epoch: [24] [1230/4276] eta: 2:28:50 lr: 2.156389331850449e-05 loss: 0.1043 (0.1044) time: 2.9307 data: 0.0068 max mem: 33301 Epoch: [24] [1240/4276] eta: 2:28:21 lr: 2.1561004630787316e-05 loss: 0.1004 (0.1043) time: 2.9318 data: 0.0069 max mem: 33301 Epoch: [24] [1250/4276] eta: 2:27:51 lr: 2.1558115900067395e-05 loss: 0.0950 (0.1044) time: 2.9127 data: 0.0068 max mem: 33301 Epoch: [24] [1260/4276] eta: 2:27:22 lr: 2.1555227126337692e-05 loss: 0.0949 (0.1043) time: 2.9147 data: 0.0068 max mem: 33301 Epoch: [24] [1270/4276] eta: 2:26:52 lr: 2.1552338309591156e-05 loss: 0.0948 (0.1042) time: 2.9350 data: 0.0070 max mem: 33301 Epoch: [24] 
[1280/4276] eta: 2:26:23 lr: 2.1549449449820748e-05 loss: 0.0949 (0.1043) time: 2.9330 data: 0.0072 max mem: 33301 Epoch: [24] [1290/4276] eta: 2:25:53 lr: 2.1546560547019405e-05 loss: 0.0987 (0.1042) time: 2.9206 data: 0.0069 max mem: 33301 Epoch: [24] [1300/4276] eta: 2:25:23 lr: 2.1543671601180085e-05 loss: 0.0951 (0.1042) time: 2.9100 data: 0.0068 max mem: 33301 Epoch: [24] [1310/4276] eta: 2:24:53 lr: 2.1540782612295733e-05 loss: 0.0822 (0.1040) time: 2.9045 data: 0.0073 max mem: 33301 Epoch: [24] [1320/4276] eta: 2:24:23 lr: 2.1537893580359298e-05 loss: 0.0838 (0.1040) time: 2.8887 data: 0.0072 max mem: 33301 Epoch: [24] [1330/4276] eta: 2:23:53 lr: 2.153500450536371e-05 loss: 0.0985 (0.1040) time: 2.8931 data: 0.0070 max mem: 33301 Epoch: [24] [1340/4276] eta: 2:23:24 lr: 2.1532115387301913e-05 loss: 0.0934 (0.1040) time: 2.9314 data: 0.0071 max mem: 33301 Epoch: [24] [1350/4276] eta: 2:22:55 lr: 2.1529226226166845e-05 loss: 0.1002 (0.1040) time: 2.9460 data: 0.0071 max mem: 33301 Epoch: [24] [1360/4276] eta: 2:22:26 lr: 2.152633702195145e-05 loss: 0.1077 (0.1040) time: 2.9360 data: 0.0068 max mem: 33301 Epoch: [24] [1370/4276] eta: 2:21:57 lr: 2.1523447774648645e-05 loss: 0.0962 (0.1039) time: 2.9402 data: 0.0065 max mem: 33301 Epoch: [24] [1380/4276] eta: 2:21:28 lr: 2.152055848425137e-05 loss: 0.0962 (0.1040) time: 2.9400 data: 0.0064 max mem: 33301 Epoch: [24] [1390/4276] eta: 2:20:59 lr: 2.1517669150752557e-05 loss: 0.1071 (0.1041) time: 2.9382 data: 0.0062 max mem: 33301 Epoch: [24] [1400/4276] eta: 2:20:29 lr: 2.151477977414512e-05 loss: 0.1017 (0.1041) time: 2.9385 data: 0.0065 max mem: 33301 Epoch: [24] [1410/4276] eta: 2:20:00 lr: 2.151189035442199e-05 loss: 0.1010 (0.1041) time: 2.9390 data: 0.0066 max mem: 33301 Epoch: [24] [1420/4276] eta: 2:19:31 lr: 2.150900089157609e-05 loss: 0.0947 (0.1041) time: 2.9383 data: 0.0064 max mem: 33301 Epoch: [24] [1430/4276] eta: 2:19:02 lr: 2.1506111385600342e-05 loss: 0.0947 (0.1041) time: 2.9356 data: 0.0065 
max mem: 33301 Epoch: [24] [1440/4276] eta: 2:18:32 lr: 2.1503221836487653e-05 loss: 0.0944 (0.1040) time: 2.9299 data: 0.0065 max mem: 33301 Epoch: [24] [1450/4276] eta: 2:18:03 lr: 2.1500332244230948e-05 loss: 0.0881 (0.1039) time: 2.9192 data: 0.0068 max mem: 33301 Epoch: [24] [1460/4276] eta: 2:17:33 lr: 2.1497442608823135e-05 loss: 0.0991 (0.1039) time: 2.9238 data: 0.0065 max mem: 33301 Epoch: [24] [1470/4276] eta: 2:17:04 lr: 2.149455293025713e-05 loss: 0.1075 (0.1039) time: 2.9340 data: 0.0062 max mem: 33301 Epoch: [24] [1480/4276] eta: 2:16:35 lr: 2.1491663208525832e-05 loss: 0.1112 (0.1040) time: 2.9348 data: 0.0062 max mem: 33301 Epoch: [24] [1490/4276] eta: 2:16:06 lr: 2.1488773443622155e-05 loss: 0.0979 (0.1039) time: 2.9357 data: 0.0061 max mem: 33301 Epoch: [24] [1500/4276] eta: 2:15:37 lr: 2.1485883635539002e-05 loss: 0.0928 (0.1039) time: 2.9369 data: 0.0064 max mem: 33301 Epoch: [24] [1510/4276] eta: 2:15:07 lr: 2.1482993784269278e-05 loss: 0.1043 (0.1039) time: 2.9364 data: 0.0066 max mem: 33301 Epoch: [24] [1520/4276] eta: 2:14:38 lr: 2.148010388980587e-05 loss: 0.0942 (0.1038) time: 2.9256 data: 0.0069 max mem: 33301 Epoch: [24] [1530/4276] eta: 2:14:09 lr: 2.147721395214169e-05 loss: 0.0942 (0.1038) time: 2.9270 data: 0.0068 max mem: 33301 Epoch: [24] [1540/4276] eta: 2:13:39 lr: 2.1474323971269623e-05 loss: 0.0956 (0.1037) time: 2.9374 data: 0.0065 max mem: 33301 Epoch: [24] [1550/4276] eta: 2:13:10 lr: 2.147143394718257e-05 loss: 0.0939 (0.1037) time: 2.9360 data: 0.0064 max mem: 33301 Epoch: [24] [1560/4276] eta: 2:12:41 lr: 2.146854387987341e-05 loss: 0.0925 (0.1036) time: 2.9354 data: 0.0062 max mem: 33301 Epoch: [24] [1570/4276] eta: 2:12:12 lr: 2.1465653769335044e-05 loss: 0.0908 (0.1036) time: 2.9362 data: 0.0062 max mem: 33301 Epoch: [24] [1580/4276] eta: 2:11:42 lr: 2.1462763615560347e-05 loss: 0.0881 (0.1035) time: 2.9365 data: 0.0062 max mem: 33301 Epoch: [24] [1590/4276] eta: 2:11:13 lr: 2.145987341854222e-05 loss: 0.0892 (0.1035) 
time: 2.9342 data: 0.0062 max mem: 33301 Epoch: [24] [1600/4276] eta: 2:10:44 lr: 2.145698317827352e-05 loss: 0.1092 (0.1035) time: 2.9318 data: 0.0062 max mem: 33301 Epoch: [24] [1610/4276] eta: 2:10:14 lr: 2.1454092894747147e-05 loss: 0.1044 (0.1035) time: 2.9319 data: 0.0062 max mem: 33301 Epoch: [24] [1620/4276] eta: 2:09:45 lr: 2.1451202567955965e-05 loss: 0.0921 (0.1034) time: 2.9321 data: 0.0062 max mem: 33301 Epoch: [24] [1630/4276] eta: 2:09:16 lr: 2.1448312197892865e-05 loss: 0.0924 (0.1035) time: 2.9330 data: 0.0062 max mem: 33301 Epoch: [24] [1640/4276] eta: 2:08:46 lr: 2.1445421784550702e-05 loss: 0.1026 (0.1034) time: 2.9304 data: 0.0062 max mem: 33301 Epoch: [24] [1650/4276] eta: 2:08:17 lr: 2.1442531327922356e-05 loss: 0.0914 (0.1033) time: 2.9301 data: 0.0062 max mem: 33301 Epoch: [24] [1660/4276] eta: 2:07:48 lr: 2.143964082800069e-05 loss: 0.0914 (0.1033) time: 2.9328 data: 0.0061 max mem: 33301 Epoch: [24] [1670/4276] eta: 2:07:19 lr: 2.143675028477858e-05 loss: 0.0924 (0.1032) time: 2.9332 data: 0.0062 max mem: 33301 Epoch: [24] [1680/4276] eta: 2:06:49 lr: 2.143385969824888e-05 loss: 0.0948 (0.1033) time: 2.9322 data: 0.0062 max mem: 33301 Epoch: [24] [1690/4276] eta: 2:06:20 lr: 2.1430969068404453e-05 loss: 0.0832 (0.1032) time: 2.9318 data: 0.0062 max mem: 33301 Epoch: [24] [1700/4276] eta: 2:05:51 lr: 2.1428078395238162e-05 loss: 0.0832 (0.1031) time: 2.9482 data: 0.0065 max mem: 33301 Epoch: [24] [1710/4276] eta: 2:05:22 lr: 2.142518767874286e-05 loss: 0.1021 (0.1031) time: 2.9481 data: 0.0066 max mem: 33301 Epoch: [24] [1720/4276] eta: 2:04:53 lr: 2.1422296918911407e-05 loss: 0.0994 (0.1031) time: 2.9330 data: 0.0064 max mem: 33301 Epoch: [24] [1730/4276] eta: 2:04:23 lr: 2.1419406115736644e-05 loss: 0.0928 (0.1030) time: 2.9325 data: 0.0062 max mem: 33301 Epoch: [24] [1740/4276] eta: 2:03:54 lr: 2.1416515269211443e-05 loss: 0.0965 (0.1031) time: 2.9328 data: 0.0062 max mem: 33301 Epoch: [24] [1750/4276] eta: 2:03:25 lr: 
2.141362437932863e-05 loss: 0.0890 (0.1030) time: 2.9391 data: 0.0062 max mem: 33301 Epoch: [24] [1760/4276] eta: 2:02:56 lr: 2.1410733446081057e-05 loss: 0.0850 (0.1029) time: 2.9385 data: 0.0063 max mem: 33301 Epoch: [24] [1770/4276] eta: 2:02:26 lr: 2.1407842469461574e-05 loss: 0.0850 (0.1028) time: 2.9318 data: 0.0062 max mem: 33301 Epoch: [24] [1780/4276] eta: 2:01:57 lr: 2.140495144946302e-05 loss: 0.0890 (0.1028) time: 2.9237 data: 0.0062 max mem: 33301 Epoch: [24] [1790/4276] eta: 2:01:27 lr: 2.1402060386078227e-05 loss: 0.0898 (0.1028) time: 2.9241 data: 0.0066 max mem: 33301 Epoch: [24] [1800/4276] eta: 2:00:58 lr: 2.1399169279300036e-05 loss: 0.0948 (0.1029) time: 2.9324 data: 0.0066 max mem: 33301 Epoch: [24] [1810/4276] eta: 2:00:29 lr: 2.1396278129121283e-05 loss: 0.1024 (0.1029) time: 2.9315 data: 0.0063 max mem: 33301 Epoch: [24] [1820/4276] eta: 2:00:00 lr: 2.1393386935534808e-05 loss: 0.1035 (0.1029) time: 2.9331 data: 0.0063 max mem: 33301 Epoch: [24] [1830/4276] eta: 1:59:30 lr: 2.1390495698533422e-05 loss: 0.0955 (0.1029) time: 2.9339 data: 0.0062 max mem: 33301 Epoch: [24] [1840/4276] eta: 1:59:01 lr: 2.1387604418109967e-05 loss: 0.0872 (0.1028) time: 2.9324 data: 0.0062 max mem: 33301 Epoch: [24] [1850/4276] eta: 1:58:32 lr: 2.138471309425726e-05 loss: 0.0892 (0.1028) time: 2.9313 data: 0.0062 max mem: 33301 Epoch: [24] [1860/4276] eta: 1:58:02 lr: 2.1381821726968138e-05 loss: 0.0989 (0.1027) time: 2.9310 data: 0.0062 max mem: 33301 Epoch: [24] [1870/4276] eta: 1:57:32 lr: 2.1378930316235406e-05 loss: 0.1013 (0.1027) time: 2.9114 data: 0.0070 max mem: 33301 Epoch: [24] [1880/4276] eta: 1:57:03 lr: 2.137603886205189e-05 loss: 0.0975 (0.1027) time: 2.9182 data: 0.0073 max mem: 33301 Epoch: [24] [1890/4276] eta: 1:56:34 lr: 2.1373147364410405e-05 loss: 0.0916 (0.1027) time: 2.9414 data: 0.0068 max mem: 33301 Epoch: [24] [1900/4276] eta: 1:56:05 lr: 2.137025582330377e-05 loss: 0.0893 (0.1026) time: 2.9327 data: 0.0066 max mem: 33301 Epoch: [24] 
[1910/4276] eta: 1:55:35 lr: 2.136736423872479e-05 loss: 0.0968 (0.1026) time: 2.9316 data: 0.0068 max mem: 33301 Epoch: [24] [1920/4276] eta: 1:55:06 lr: 2.1364472610666274e-05 loss: 0.0986 (0.1025) time: 2.9323 data: 0.0072 max mem: 33301 Epoch: [24] [1930/4276] eta: 1:54:37 lr: 2.1361580939121035e-05 loss: 0.0825 (0.1025) time: 2.9293 data: 0.0071 max mem: 33301 Epoch: [24] [1940/4276] eta: 1:54:07 lr: 2.1358689224081878e-05 loss: 0.0893 (0.1025) time: 2.9304 data: 0.0070 max mem: 33301 Epoch: [24] [1950/4276] eta: 1:53:38 lr: 2.13557974655416e-05 loss: 0.0931 (0.1025) time: 2.9306 data: 0.0068 max mem: 33301 Epoch: [24] [1960/4276] eta: 1:53:09 lr: 2.1352905663493007e-05 loss: 0.0930 (0.1024) time: 2.9321 data: 0.0068 max mem: 33301 Epoch: [24] [1970/4276] eta: 1:52:40 lr: 2.1350013817928895e-05 loss: 0.0889 (0.1024) time: 2.9340 data: 0.0069 max mem: 33301 Epoch: [24] [1980/4276] eta: 1:52:10 lr: 2.134712192884206e-05 loss: 0.0969 (0.1024) time: 2.9361 data: 0.0070 max mem: 33301 Epoch: [24] [1990/4276] eta: 1:51:41 lr: 2.1344229996225293e-05 loss: 0.0969 (0.1024) time: 2.9356 data: 0.0070 max mem: 33301 Epoch: [24] [2000/4276] eta: 1:51:12 lr: 2.134133802007139e-05 loss: 0.1036 (0.1024) time: 2.9339 data: 0.0070 max mem: 33301 Epoch: [24] [2010/4276] eta: 1:50:42 lr: 2.133844600037314e-05 loss: 0.1162 (0.1025) time: 2.9358 data: 0.0070 max mem: 33301 Epoch: [24] [2020/4276] eta: 1:50:13 lr: 2.133555393712332e-05 loss: 0.1188 (0.1025) time: 2.9337 data: 0.0070 max mem: 33301 Epoch: [24] [2030/4276] eta: 1:49:44 lr: 2.1332661830314733e-05 loss: 0.0906 (0.1025) time: 2.9337 data: 0.0071 max mem: 33301 Epoch: [24] [2040/4276] eta: 1:49:15 lr: 2.1329769679940142e-05 loss: 0.0826 (0.1024) time: 2.9358 data: 0.0070 max mem: 33301 Epoch: [24] [2050/4276] eta: 1:48:45 lr: 2.132687748599234e-05 loss: 0.0985 (0.1025) time: 2.9367 data: 0.0070 max mem: 33301 Epoch: [24] [2060/4276] eta: 1:48:16 lr: 2.13239852484641e-05 loss: 0.0912 (0.1024) time: 2.9363 data: 0.0069 max 
mem: 33301 Epoch: [24] [2070/4276] eta: 1:47:47 lr: 2.1321092967348197e-05 loss: 0.0879 (0.1024) time: 2.9345 data: 0.0069 max mem: 33301 Epoch: [24] [2080/4276] eta: 1:47:17 lr: 2.13182006426374e-05 loss: 0.0934 (0.1025) time: 2.9346 data: 0.0070 max mem: 33301 Epoch: [24] [2090/4276] eta: 1:46:48 lr: 2.1315308274324496e-05 loss: 0.1084 (0.1025) time: 2.9354 data: 0.0069 max mem: 33301 Epoch: [24] [2100/4276] eta: 1:46:19 lr: 2.1312415862402232e-05 loss: 0.0913 (0.1025) time: 2.9368 data: 0.0069 max mem: 33301 Epoch: [24] [2110/4276] eta: 1:45:50 lr: 2.130952340686339e-05 loss: 0.0913 (0.1025) time: 2.9378 data: 0.0069 max mem: 33301 Epoch: [24] [2120/4276] eta: 1:45:21 lr: 2.1306630907700722e-05 loss: 0.0840 (0.1023) time: 2.9459 data: 0.0070 max mem: 33301 Epoch: [24] [2130/4276] eta: 1:44:51 lr: 2.1303738364907006e-05 loss: 0.0821 (0.1023) time: 2.9448 data: 0.0071 max mem: 33301 Epoch: [24] [2140/4276] eta: 1:44:22 lr: 2.1300845778474982e-05 loss: 0.0865 (0.1022) time: 2.9365 data: 0.0071 max mem: 33301 Epoch: [24] [2150/4276] eta: 1:43:53 lr: 2.1297953148397418e-05 loss: 0.0916 (0.1022) time: 2.9348 data: 0.0069 max mem: 33301 Epoch: [24] [2160/4276] eta: 1:43:23 lr: 2.1295060474667068e-05 loss: 0.0925 (0.1022) time: 2.9353 data: 0.0069 max mem: 33301 Epoch: [24] [2170/4276] eta: 1:42:54 lr: 2.129216775727669e-05 loss: 0.0986 (0.1022) time: 2.9371 data: 0.0069 max mem: 33301 Epoch: [24] [2180/4276] eta: 1:42:25 lr: 2.1289274996219022e-05 loss: 0.1009 (0.1022) time: 2.9353 data: 0.0070 max mem: 33301 Epoch: [24] [2190/4276] eta: 1:41:56 lr: 2.1286382191486818e-05 loss: 0.1048 (0.1022) time: 2.9347 data: 0.0070 max mem: 33301 Epoch: [24] [2200/4276] eta: 1:41:26 lr: 2.128348934307282e-05 loss: 0.1090 (0.1022) time: 2.9387 data: 0.0069 max mem: 33301 Epoch: [24] [2210/4276] eta: 1:40:57 lr: 2.1280596450969784e-05 loss: 0.1119 (0.1022) time: 2.9375 data: 0.0069 max mem: 33301 Epoch: [24] [2220/4276] eta: 1:40:28 lr: 2.1277703515170432e-05 loss: 0.1095 (0.1023) 
time: 2.9339 data: 0.0070 max mem: 33301 Epoch: [24] [2230/4276] eta: 1:39:59 lr: 2.1274810535667518e-05 loss: 0.1090 (0.1023) time: 2.9348 data: 0.0070 max mem: 33301 Epoch: [24] [2240/4276] eta: 1:39:29 lr: 2.127191751245377e-05 loss: 0.0891 (0.1022) time: 2.9353 data: 0.0070 max mem: 33301 Epoch: [24] [2250/4276] eta: 1:39:00 lr: 2.126902444552193e-05 loss: 0.0891 (0.1022) time: 2.9340 data: 0.0071 max mem: 33301 Epoch: [24] [2260/4276] eta: 1:38:31 lr: 2.126613133486472e-05 loss: 0.0991 (0.1022) time: 2.9333 data: 0.0068 max mem: 33301 Epoch: [24] [2270/4276] eta: 1:38:01 lr: 2.126323818047487e-05 loss: 0.0931 (0.1022) time: 2.9328 data: 0.0068 max mem: 33301 Epoch: [24] [2280/4276] eta: 1:37:32 lr: 2.1260344982345116e-05 loss: 0.0931 (0.1022) time: 2.9314 data: 0.0069 max mem: 33301 Epoch: [24] [2290/4276] eta: 1:37:03 lr: 2.1257451740468183e-05 loss: 0.1025 (0.1022) time: 2.9314 data: 0.0070 max mem: 33301 Epoch: [24] [2300/4276] eta: 1:36:33 lr: 2.125455845483678e-05 loss: 0.0889 (0.1022) time: 2.9337 data: 0.0070 max mem: 33301 Epoch: [24] [2310/4276] eta: 1:36:04 lr: 2.1251665125443638e-05 loss: 0.1071 (0.1023) time: 2.9351 data: 0.0070 max mem: 33301 Epoch: [24] [2320/4276] eta: 1:35:35 lr: 2.1248771752281476e-05 loss: 0.1091 (0.1023) time: 2.9274 data: 0.0073 max mem: 33301 Epoch: [24] [2330/4276] eta: 1:35:05 lr: 2.1245878335343003e-05 loss: 0.1048 (0.1023) time: 2.9272 data: 0.0073 max mem: 33301 Epoch: [24] [2340/4276] eta: 1:34:36 lr: 2.1242984874620935e-05 loss: 0.1137 (0.1024) time: 2.9334 data: 0.0069 max mem: 33301 Epoch: [24] [2350/4276] eta: 1:34:07 lr: 2.1240091370107986e-05 loss: 0.1107 (0.1024) time: 2.9322 data: 0.0070 max mem: 33301 Epoch: [24] [2360/4276] eta: 1:33:37 lr: 2.1237197821796863e-05 loss: 0.1000 (0.1024) time: 2.9340 data: 0.0072 max mem: 33301 Epoch: [24] [2370/4276] eta: 1:33:08 lr: 2.123430422968027e-05 loss: 0.1046 (0.1025) time: 2.9352 data: 0.0072 max mem: 33301 Epoch: [24] [2380/4276] eta: 1:32:39 lr: 
2.1231410593750908e-05 loss: 0.1088 (0.1025) time: 2.9328 data: 0.0069 max mem: 33301 Epoch: [24] [2390/4276] eta: 1:32:09 lr: 2.1228516914001485e-05 loss: 0.0996 (0.1025) time: 2.9322 data: 0.0069 max mem: 33301 Epoch: [24] [2400/4276] eta: 1:31:40 lr: 2.1225623190424708e-05 loss: 0.1068 (0.1026) time: 2.9333 data: 0.0070 max mem: 33301 Epoch: [24] [2410/4276] eta: 1:31:11 lr: 2.1222729423013253e-05 loss: 0.1172 (0.1026) time: 2.9353 data: 0.0069 max mem: 33301 Epoch: [24] [2420/4276] eta: 1:30:42 lr: 2.1219835611759828e-05 loss: 0.0974 (0.1026) time: 2.9358 data: 0.0069 max mem: 33301 Epoch: [24] [2430/4276] eta: 1:30:12 lr: 2.1216941756657126e-05 loss: 0.1028 (0.1027) time: 2.9351 data: 0.0069 max mem: 33301 Epoch: [24] [2440/4276] eta: 1:29:43 lr: 2.121404785769784e-05 loss: 0.1077 (0.1027) time: 2.9341 data: 0.0069 max mem: 33301 Epoch: [24] [2450/4276] eta: 1:29:14 lr: 2.1211153914874644e-05 loss: 0.1014 (0.1027) time: 2.9325 data: 0.0069 max mem: 33301 Epoch: [24] [2460/4276] eta: 1:28:44 lr: 2.120825992818023e-05 loss: 0.1008 (0.1027) time: 2.9311 data: 0.0070 max mem: 33301 Epoch: [24] [2470/4276] eta: 1:28:15 lr: 2.1205365897607292e-05 loss: 0.1008 (0.1027) time: 2.9319 data: 0.0070 max mem: 33301 Epoch: [24] [2480/4276] eta: 1:27:46 lr: 2.12024718231485e-05 loss: 0.1113 (0.1028) time: 2.9323 data: 0.0070 max mem: 33301 Epoch: [24] [2490/4276] eta: 1:27:16 lr: 2.1199577704796532e-05 loss: 0.1071 (0.1028) time: 2.9324 data: 0.0070 max mem: 33301 Epoch: [24] [2500/4276] eta: 1:26:47 lr: 2.1196683542544064e-05 loss: 0.0987 (0.1028) time: 2.9316 data: 0.0069 max mem: 33301 Epoch: [24] [2510/4276] eta: 1:26:18 lr: 2.119378933638377e-05 loss: 0.1000 (0.1028) time: 2.9314 data: 0.0069 max mem: 33301 Epoch: [24] [2520/4276] eta: 1:25:48 lr: 2.119089508630833e-05 loss: 0.1000 (0.1028) time: 2.9320 data: 0.0069 max mem: 33301 Epoch: [24] [2530/4276] eta: 1:25:19 lr: 2.1188000792310405e-05 loss: 0.0887 (0.1027) time: 2.9342 data: 0.0069 max mem: 33301 Epoch: [24] 
[2540/4276] eta: 1:24:50 lr: 2.118510645438266e-05 loss: 0.0897 (0.1028) time: 2.9350 data: 0.0070 max mem: 33301 Epoch: [24] [2550/4276] eta: 1:24:20 lr: 2.1182212072517765e-05 loss: 0.0973 (0.1027) time: 2.9325 data: 0.0070 max mem: 33301 Epoch: [24] [2560/4276] eta: 1:23:51 lr: 2.1179317646708387e-05 loss: 0.0872 (0.1027) time: 2.9315 data: 0.0069 max mem: 33301 Epoch: [24] [2570/4276] eta: 1:23:22 lr: 2.117642317694717e-05 loss: 0.0872 (0.1027) time: 2.9307 data: 0.0068 max mem: 33301 Epoch: [24] [2580/4276] eta: 1:22:52 lr: 2.1173528663226777e-05 loss: 0.0895 (0.1026) time: 2.9313 data: 0.0069 max mem: 33301 Epoch: [24] [2590/4276] eta: 1:22:23 lr: 2.1170634105539868e-05 loss: 0.0889 (0.1026) time: 2.9329 data: 0.0070 max mem: 33301 Epoch: [24] [2600/4276] eta: 1:21:54 lr: 2.11677395038791e-05 loss: 0.0870 (0.1025) time: 2.9341 data: 0.0070 max mem: 33301 Epoch: [24] [2610/4276] eta: 1:21:25 lr: 2.116484485823711e-05 loss: 0.0869 (0.1025) time: 2.9359 data: 0.0070 max mem: 33301 Epoch: [24] [2620/4276] eta: 1:20:55 lr: 2.1161950168606558e-05 loss: 0.0869 (0.1025) time: 2.9421 data: 0.0071 max mem: 33301 Epoch: [24] [2630/4276] eta: 1:20:26 lr: 2.1159055434980088e-05 loss: 0.0905 (0.1024) time: 2.9421 data: 0.0071 max mem: 33301 Epoch: [24] [2640/4276] eta: 1:19:57 lr: 2.1156160657350334e-05 loss: 0.0848 (0.1024) time: 2.9416 data: 0.0070 max mem: 33301 Epoch: [24] [2650/4276] eta: 1:19:27 lr: 2.115326583570994e-05 loss: 0.0905 (0.1024) time: 2.9406 data: 0.0070 max mem: 33301 Epoch: [24] [2660/4276] eta: 1:18:58 lr: 2.115037097005155e-05 loss: 0.0918 (0.1024) time: 2.9361 data: 0.0070 max mem: 33301 Epoch: [24] [2670/4276] eta: 1:18:29 lr: 2.1147476060367806e-05 loss: 0.0978 (0.1024) time: 2.9349 data: 0.0069 max mem: 33301 Epoch: [24] [2680/4276] eta: 1:18:00 lr: 2.1144581106651323e-05 loss: 0.1022 (0.1024) time: 2.9344 data: 0.0069 max mem: 33301 Epoch: [24] [2690/4276] eta: 1:17:30 lr: 2.114168610889475e-05 loss: 0.0997 (0.1024) time: 2.9352 data: 0.0070 
max mem: 33301 Epoch: [24] [2700/4276] eta: 1:17:01 lr: 2.1138791067090702e-05 loss: 0.0933 (0.1024) time: 2.9367 data: 0.0070 max mem: 33301 Epoch: [24] [2710/4276] eta: 1:16:32 lr: 2.1135895981231825e-05 loss: 0.0923 (0.1024) time: 2.9380 data: 0.0071 max mem: 33301 Epoch: [24] [2720/4276] eta: 1:16:02 lr: 2.1133000851310724e-05 loss: 0.0915 (0.1024) time: 2.9385 data: 0.0071 max mem: 33301 Epoch: [24] [2730/4276] eta: 1:15:33 lr: 2.1130105677320027e-05 loss: 0.0948 (0.1024) time: 2.9365 data: 0.0071 max mem: 33301 Epoch: [24] [2740/4276] eta: 1:15:04 lr: 2.112721045925236e-05 loss: 0.1056 (0.1024) time: 2.9341 data: 0.0070 max mem: 33301 Epoch: [24] [2750/4276] eta: 1:14:34 lr: 2.1124315197100343e-05 loss: 0.1056 (0.1024) time: 2.9225 data: 0.0069 max mem: 33301 Epoch: [24] [2760/4276] eta: 1:14:05 lr: 2.1121419890856575e-05 loss: 0.0982 (0.1023) time: 2.9235 data: 0.0070 max mem: 33301 Epoch: [24] [2770/4276] eta: 1:13:36 lr: 2.111852454051368e-05 loss: 0.0924 (0.1023) time: 2.9354 data: 0.0070 max mem: 33301 Epoch: [24] [2780/4276] eta: 1:13:06 lr: 2.1115629146064263e-05 loss: 0.0924 (0.1023) time: 2.9353 data: 0.0070 max mem: 33301 Epoch: [24] [2790/4276] eta: 1:12:37 lr: 2.1112733707500944e-05 loss: 0.1053 (0.1024) time: 2.9334 data: 0.0070 max mem: 33301 Epoch: [24] [2800/4276] eta: 1:12:08 lr: 2.1109838224816317e-05 loss: 0.0937 (0.1023) time: 2.9326 data: 0.0069 max mem: 33301 Epoch: [24] [2810/4276] eta: 1:11:38 lr: 2.1106942698002987e-05 loss: 0.0764 (0.1022) time: 2.9142 data: 0.0070 max mem: 33301 Epoch: [24] [2820/4276] eta: 1:11:09 lr: 2.110404712705355e-05 loss: 0.0753 (0.1022) time: 2.9143 data: 0.0073 max mem: 33301 Epoch: [24] [2830/4276] eta: 1:10:40 lr: 2.1101151511960625e-05 loss: 0.0898 (0.1022) time: 2.9336 data: 0.0072 max mem: 33301 Epoch: [24] [2840/4276] eta: 1:10:10 lr: 2.1098255852716783e-05 loss: 0.1013 (0.1022) time: 2.9299 data: 0.0065 max mem: 33301 Epoch: [24] [2850/4276] eta: 1:09:41 lr: 2.109536014931463e-05 loss: 0.1218 
(0.1023) time: 2.9280 data: 0.0065 max mem: 33301 Epoch: [24] [2860/4276] eta: 1:09:12 lr: 2.1092464401746758e-05 loss: 0.1013 (0.1023) time: 2.9310 data: 0.0068 max mem: 33301 Epoch: [24] [2870/4276] eta: 1:08:42 lr: 2.1089568610005757e-05 loss: 0.0927 (0.1023) time: 2.9297 data: 0.0067 max mem: 33301 Epoch: [24] [2880/4276] eta: 1:08:13 lr: 2.1086672774084206e-05 loss: 0.0988 (0.1023) time: 2.9253 data: 0.0066 max mem: 33301 Epoch: [24] [2890/4276] eta: 1:07:43 lr: 2.1083776893974694e-05 loss: 0.0986 (0.1023) time: 2.9236 data: 0.0065 max mem: 33301 Epoch: [24] [2900/4276] eta: 1:07:14 lr: 2.10808809696698e-05 loss: 0.0956 (0.1023) time: 2.9222 data: 0.0066 max mem: 33301 Epoch: [24] [2910/4276] eta: 1:06:45 lr: 2.107798500116212e-05 loss: 0.0978 (0.1023) time: 2.9202 data: 0.0066 max mem: 33301 Epoch: [24] [2920/4276] eta: 1:06:15 lr: 2.1075088988444205e-05 loss: 0.0914 (0.1023) time: 2.9221 data: 0.0064 max mem: 33301 Epoch: [24] [2930/4276] eta: 1:05:46 lr: 2.1072192931508644e-05 loss: 0.0989 (0.1023) time: 2.9253 data: 0.0065 max mem: 33301 Epoch: [24] [2940/4276] eta: 1:05:17 lr: 2.1069296830348012e-05 loss: 0.1024 (0.1023) time: 2.9211 data: 0.0069 max mem: 33301 Epoch: [24] [2950/4276] eta: 1:04:47 lr: 2.1066400684954872e-05 loss: 0.1029 (0.1023) time: 2.9216 data: 0.0069 max mem: 33301 Epoch: [24] [2960/4276] eta: 1:04:18 lr: 2.106350449532179e-05 loss: 0.0931 (0.1023) time: 2.9109 data: 0.0072 max mem: 33301 Epoch: [24] [2970/4276] eta: 1:03:48 lr: 2.106060826144134e-05 loss: 0.0939 (0.1024) time: 2.8991 data: 0.0075 max mem: 33301 Epoch: [24] [2980/4276] eta: 1:03:19 lr: 2.1057711983306085e-05 loss: 0.1021 (0.1024) time: 2.9130 data: 0.0070 max mem: 33301 Epoch: [24] [2990/4276] eta: 1:02:50 lr: 2.105481566090857e-05 loss: 0.0979 (0.1023) time: 2.9244 data: 0.0067 max mem: 33301 Epoch: [24] [3000/4276] eta: 1:02:20 lr: 2.105191929424137e-05 loss: 0.0893 (0.1023) time: 2.9150 data: 0.0070 max mem: 33301 Epoch: [24] [3010/4276] eta: 1:01:51 lr: 
2.104902288329703e-05 loss: 0.0905 (0.1023) time: 2.8973 data: 0.0074 max mem: 33301 Epoch: [24] [3020/4276] eta: 1:01:21 lr: 2.1046126428068116e-05 loss: 0.0972 (0.1022) time: 2.9074 data: 0.0075 max mem: 33301 Epoch: [24] [3030/4276] eta: 1:00:52 lr: 2.1043229928547168e-05 loss: 0.1073 (0.1023) time: 2.9239 data: 0.0070 max mem: 33301 Epoch: [24] [3040/4276] eta: 1:00:23 lr: 2.104033338472673e-05 loss: 0.1091 (0.1023) time: 2.9221 data: 0.0066 max mem: 33301 Epoch: [24] [3050/4276] eta: 0:59:53 lr: 2.103743679659936e-05 loss: 0.0989 (0.1023) time: 2.9148 data: 0.0072 max mem: 33301 Epoch: [24] [3060/4276] eta: 0:59:24 lr: 2.1034540164157604e-05 loss: 0.0817 (0.1022) time: 2.9159 data: 0.0074 max mem: 33301 Epoch: [24] [3070/4276] eta: 0:58:55 lr: 2.1031643487393987e-05 loss: 0.0817 (0.1022) time: 2.9231 data: 0.0073 max mem: 33301 Epoch: [24] [3080/4276] eta: 0:58:25 lr: 2.102874676630106e-05 loss: 0.0917 (0.1022) time: 2.9148 data: 0.0071 max mem: 33301 Epoch: [24] [3090/4276] eta: 0:57:56 lr: 2.1025850000871354e-05 loss: 0.0870 (0.1022) time: 2.9155 data: 0.0068 max mem: 33301 Epoch: [24] [3100/4276] eta: 0:57:27 lr: 2.1022953191097412e-05 loss: 0.0870 (0.1021) time: 2.9249 data: 0.0068 max mem: 33301 Epoch: [24] [3110/4276] eta: 0:56:57 lr: 2.1020056336971758e-05 loss: 0.0832 (0.1021) time: 2.9152 data: 0.0073 max mem: 33301 Epoch: [24] [3120/4276] eta: 0:56:28 lr: 2.101715943848692e-05 loss: 0.0822 (0.1020) time: 2.9171 data: 0.0074 max mem: 33301 Epoch: [24] [3130/4276] eta: 0:55:58 lr: 2.101426249563543e-05 loss: 0.0922 (0.1020) time: 2.9314 data: 0.0068 max mem: 33301 Epoch: [24] [3140/4276] eta: 0:55:29 lr: 2.1011365508409813e-05 loss: 0.0937 (0.1020) time: 2.9293 data: 0.0067 max mem: 33301 Epoch: [24] [3150/4276] eta: 0:55:00 lr: 2.1008468476802584e-05 loss: 0.0909 (0.1020) time: 2.9259 data: 0.0068 max mem: 33301 Epoch: [24] [3160/4276] eta: 0:54:31 lr: 2.100557140080627e-05 loss: 0.0910 (0.1020) time: 2.9271 data: 0.0068 max mem: 33301 Epoch: [24] 
[3170/4276] eta: 0:54:01 lr: 2.100267428041338e-05 loss: 0.0919 (0.1020) time: 2.9261 data: 0.0068 max mem: 33301 Epoch: [24] [3180/4276] eta: 0:53:32 lr: 2.0999777115616442e-05 loss: 0.0925 (0.1019) time: 2.9215 data: 0.0068 max mem: 33301 Epoch: [24] [3190/4276] eta: 0:53:02 lr: 2.0996879906407956e-05 loss: 0.1004 (0.1019) time: 2.9220 data: 0.0068 max mem: 33301 Epoch: [24] [3200/4276] eta: 0:52:33 lr: 2.099398265278044e-05 loss: 0.1000 (0.1019) time: 2.9239 data: 0.0065 max mem: 33301 Epoch: [24] [3210/4276] eta: 0:52:04 lr: 2.09910853547264e-05 loss: 0.0946 (0.1020) time: 2.9219 data: 0.0065 max mem: 33301 Epoch: [24] [3220/4276] eta: 0:51:34 lr: 2.0988188012238336e-05 loss: 0.0999 (0.1020) time: 2.9224 data: 0.0067 max mem: 33301 Epoch: [24] [3230/4276] eta: 0:51:05 lr: 2.0985290625308754e-05 loss: 0.0917 (0.1019) time: 2.9226 data: 0.0067 max mem: 33301 Epoch: [24] [3240/4276] eta: 0:50:36 lr: 2.0982393193930157e-05 loss: 0.0931 (0.1020) time: 2.9233 data: 0.0066 max mem: 33301 Epoch: [24] [3250/4276] eta: 0:50:06 lr: 2.0979495718095045e-05 loss: 0.1088 (0.1020) time: 2.9245 data: 0.0067 max mem: 33301 Epoch: [24] [3260/4276] eta: 0:49:37 lr: 2.09765981977959e-05 loss: 0.0996 (0.1019) time: 2.9261 data: 0.0067 max mem: 33301 Epoch: [24] [3270/4276] eta: 0:49:08 lr: 2.097370063302523e-05 loss: 0.0996 (0.1020) time: 2.9286 data: 0.0068 max mem: 33301 Epoch: [24] [3280/4276] eta: 0:48:39 lr: 2.0970803023775524e-05 loss: 0.1072 (0.1020) time: 2.9240 data: 0.0068 max mem: 33301 Epoch: [24] [3290/4276] eta: 0:48:09 lr: 2.096790537003927e-05 loss: 0.1072 (0.1020) time: 2.9230 data: 0.0067 max mem: 33301 Epoch: [24] [3300/4276] eta: 0:47:40 lr: 2.0965007671808938e-05 loss: 0.1080 (0.1021) time: 2.9270 data: 0.0067 max mem: 33301 Epoch: [24] [3310/4276] eta: 0:47:11 lr: 2.096210992907703e-05 loss: 0.1109 (0.1021) time: 2.9279 data: 0.0067 max mem: 33301 Epoch: [24] [3320/4276] eta: 0:46:41 lr: 2.0959212141836023e-05 loss: 0.1093 (0.1022) time: 2.9282 data: 0.0066 max 
mem: 33301 Epoch: [24] [3330/4276] eta: 0:46:12 lr: 2.09563143100784e-05 loss: 0.0982 (0.1021) time: 2.9265 data: 0.0066 max mem: 33301 Epoch: [24] [3340/4276] eta: 0:45:43 lr: 2.095341643379662e-05 loss: 0.0917 (0.1021) time: 2.9260 data: 0.0067 max mem: 33301 Epoch: [24] [3350/4276] eta: 0:45:13 lr: 2.095051851298317e-05 loss: 0.0848 (0.1021) time: 2.9224 data: 0.0068 max mem: 33301 Epoch: [24] [3360/4276] eta: 0:44:44 lr: 2.094762054763052e-05 loss: 0.0923 (0.1021) time: 2.9231 data: 0.0068 max mem: 33301 Epoch: [24] [3370/4276] eta: 0:44:15 lr: 2.0944722537731145e-05 loss: 0.1049 (0.1021) time: 2.9272 data: 0.0067 max mem: 33301 Epoch: [24] [3380/4276] eta: 0:43:45 lr: 2.09418244832775e-05 loss: 0.1037 (0.1021) time: 2.9195 data: 0.0066 max mem: 33301 Epoch: [24] [3390/4276] eta: 0:43:16 lr: 2.093892638426205e-05 loss: 0.0939 (0.1021) time: 2.9120 data: 0.0067 max mem: 33301 Epoch: [24] [3400/4276] eta: 0:42:47 lr: 2.0936028240677264e-05 loss: 0.0999 (0.1022) time: 2.9202 data: 0.0066 max mem: 33301 Epoch: [24] [3410/4276] eta: 0:42:17 lr: 2.0933130052515602e-05 loss: 0.1022 (0.1022) time: 2.9271 data: 0.0066 max mem: 33301 Epoch: [24] [3420/4276] eta: 0:41:48 lr: 2.0930231819769508e-05 loss: 0.1007 (0.1022) time: 2.9264 data: 0.0068 max mem: 33301 Epoch: [24] [3430/4276] eta: 0:41:19 lr: 2.092733354243145e-05 loss: 0.1024 (0.1022) time: 2.9273 data: 0.0067 max mem: 33301 Epoch: [24] [3440/4276] eta: 0:40:49 lr: 2.092443522049387e-05 loss: 0.0986 (0.1022) time: 2.9289 data: 0.0066 max mem: 33301 Epoch: [24] [3450/4276] eta: 0:40:20 lr: 2.0921536853949227e-05 loss: 0.1046 (0.1022) time: 2.9077 data: 0.0070 max mem: 33301 Epoch: [24] [3460/4276] eta: 0:39:51 lr: 2.091863844278996e-05 loss: 0.1085 (0.1023) time: 2.9053 data: 0.0076 max mem: 33301 Epoch: [24] [3470/4276] eta: 0:39:21 lr: 2.0915739987008513e-05 loss: 0.0954 (0.1023) time: 2.9253 data: 0.0072 max mem: 33301 Epoch: [24] [3480/4276] eta: 0:38:52 lr: 2.0912841486597334e-05 loss: 0.0962 (0.1023) time: 
2.9249 data: 0.0071 max mem: 33301 Epoch: [24] [3490/4276] eta: 0:38:23 lr: 2.0909942941548867e-05 loss: 0.1035 (0.1023) time: 2.9268 data: 0.0072 max mem: 33301 Epoch: [24] [3500/4276] eta: 0:37:53 lr: 2.0907044351855533e-05 loss: 0.1031 (0.1023) time: 2.9341 data: 0.0070 max mem: 33301 Epoch: [24] [3510/4276] eta: 0:37:24 lr: 2.090414571750978e-05 loss: 0.0960 (0.1023) time: 2.9337 data: 0.0071 max mem: 33301 Epoch: [24] [3520/4276] eta: 0:36:55 lr: 2.0901247038504042e-05 loss: 0.0929 (0.1023) time: 2.9285 data: 0.0070 max mem: 33301 Epoch: [24] [3530/4276] eta: 0:36:25 lr: 2.0898348314830736e-05 loss: 0.0929 (0.1023) time: 2.9290 data: 0.0071 max mem: 33301 Epoch: [24] [3540/4276] eta: 0:35:56 lr: 2.0895449546482294e-05 loss: 0.0975 (0.1023) time: 2.9290 data: 0.0071 max mem: 33301 Epoch: [24] [3550/4276] eta: 0:35:27 lr: 2.0892550733451145e-05 loss: 0.0962 (0.1022) time: 2.9291 data: 0.0071 max mem: 33301 Epoch: [24] [3560/4276] eta: 0:34:58 lr: 2.088965187572972e-05 loss: 0.0947 (0.1023) time: 2.9279 data: 0.0072 max mem: 33301 Epoch: [24] [3570/4276] eta: 0:34:28 lr: 2.0886752973310418e-05 loss: 0.1110 (0.1023) time: 2.9332 data: 0.0071 max mem: 33301 Epoch: [24] [3580/4276] eta: 0:33:59 lr: 2.0883854026185674e-05 loss: 0.0909 (0.1023) time: 2.9350 data: 0.0070 max mem: 33301 Epoch: [24] [3590/4276] eta: 0:33:30 lr: 2.0880955034347893e-05 loss: 0.0894 (0.1023) time: 2.9280 data: 0.0070 max mem: 33301 Epoch: [24] [3600/4276] eta: 0:33:00 lr: 2.0878055997789497e-05 loss: 0.0985 (0.1023) time: 2.9261 data: 0.0071 max mem: 33301 Epoch: [24] [3610/4276] eta: 0:32:31 lr: 2.0875156916502886e-05 loss: 0.0938 (0.1023) time: 2.9268 data: 0.0071 max mem: 33301 Epoch: [24] [3620/4276] eta: 0:32:02 lr: 2.0872257790480473e-05 loss: 0.0976 (0.1023) time: 2.9275 data: 0.0070 max mem: 33301 Epoch: [24] [3630/4276] eta: 0:31:32 lr: 2.0869358619714663e-05 loss: 0.1001 (0.1023) time: 2.9270 data: 0.0071 max mem: 33301 Epoch: [24] [3640/4276] eta: 0:31:03 lr: 
2.0866459404197867e-05 loss: 0.0952 (0.1023) time: 2.9254 data: 0.0070 max mem: 33301 Epoch: [24] [3650/4276] eta: 0:30:34 lr: 2.0863560143922467e-05 loss: 0.0910 (0.1022) time: 2.9117 data: 0.0071 max mem: 33301 Epoch: [24] [3660/4276] eta: 0:30:04 lr: 2.0860660838880874e-05 loss: 0.0946 (0.1022) time: 2.9063 data: 0.0072 max mem: 33301 Epoch: [24] [3670/4276] eta: 0:29:35 lr: 2.085776148906548e-05 loss: 0.0968 (0.1023) time: 2.9045 data: 0.0072 max mem: 33301 Epoch: [24] [3680/4276] eta: 0:29:06 lr: 2.0854862094468685e-05 loss: 0.0940 (0.1022) time: 2.9109 data: 0.0070 max mem: 33301 Epoch: [24] [3690/4276] eta: 0:28:36 lr: 2.0851962655082866e-05 loss: 0.0966 (0.1023) time: 2.9277 data: 0.0068 max mem: 33301 Epoch: [24] [3700/4276] eta: 0:28:07 lr: 2.0849063170900422e-05 loss: 0.0966 (0.1022) time: 2.9278 data: 0.0067 max mem: 33301 Epoch: [24] [3710/4276] eta: 0:27:38 lr: 2.084616364191373e-05 loss: 0.0956 (0.1022) time: 2.9275 data: 0.0066 max mem: 33301 Epoch: [24] [3720/4276] eta: 0:27:09 lr: 2.0843264068115187e-05 loss: 0.0956 (0.1022) time: 2.9286 data: 0.0065 max mem: 33301 Epoch: [24] [3730/4276] eta: 0:26:39 lr: 2.084036444949716e-05 loss: 0.0981 (0.1022) time: 2.9230 data: 0.0067 max mem: 33301 Epoch: [24] [3740/4276] eta: 0:26:10 lr: 2.0837464786052023e-05 loss: 0.0981 (0.1022) time: 2.9199 data: 0.0071 max mem: 33301 Epoch: [24] [3750/4276] eta: 0:25:41 lr: 2.083456507777217e-05 loss: 0.0974 (0.1022) time: 2.9250 data: 0.0070 max mem: 33301 Epoch: [24] [3760/4276] eta: 0:25:11 lr: 2.0831665324649965e-05 loss: 0.0909 (0.1022) time: 2.9201 data: 0.0065 max mem: 33301 Epoch: [24] [3770/4276] eta: 0:24:42 lr: 2.0828765526677775e-05 loss: 0.0927 (0.1022) time: 2.9195 data: 0.0065 max mem: 33301 Epoch: [24] [3780/4276] eta: 0:24:13 lr: 2.082586568384797e-05 loss: 0.0951 (0.1022) time: 2.9259 data: 0.0065 max mem: 33301 Epoch: [24] [3790/4276] eta: 0:23:43 lr: 2.0822965796152923e-05 loss: 0.0873 (0.1022) time: 2.9041 data: 0.0070 max mem: 33301 Epoch: [24] 
[3800/4276] eta: 0:23:14 lr: 2.0820065863584993e-05 loss: 0.0934 (0.1022) time: 2.8855 data: 0.0072 max mem: 33301 Epoch: [24] [3810/4276] eta: 0:22:45 lr: 2.0817165886136534e-05 loss: 0.0918 (0.1022) time: 2.9124 data: 0.0074 max mem: 33301 Epoch: [24] [3820/4276] eta: 0:22:15 lr: 2.0814265863799913e-05 loss: 0.0903 (0.1021) time: 2.9317 data: 0.0075 max mem: 33301 Epoch: [24] [3830/4276] eta: 0:21:46 lr: 2.0811365796567483e-05 loss: 0.0910 (0.1022) time: 2.9274 data: 0.0067 max mem: 33301 Epoch: [24] [3840/4276] eta: 0:21:17 lr: 2.0808465684431593e-05 loss: 0.0879 (0.1021) time: 2.9261 data: 0.0066 max mem: 33301 Epoch: [24] [3850/4276] eta: 0:20:47 lr: 2.08055655273846e-05 loss: 0.0833 (0.1021) time: 2.9197 data: 0.0066 max mem: 33301 Epoch: [24] [3860/4276] eta: 0:20:18 lr: 2.080266532541885e-05 loss: 0.0985 (0.1021) time: 2.9185 data: 0.0068 max mem: 33301 Epoch: [24] [3870/4276] eta: 0:19:49 lr: 2.0799765078526694e-05 loss: 0.0965 (0.1021) time: 2.9229 data: 0.0069 max mem: 33301 Epoch: [24] [3880/4276] eta: 0:19:20 lr: 2.0796864786700466e-05 loss: 0.0897 (0.1021) time: 2.9078 data: 0.0067 max mem: 33301 Epoch: [24] [3890/4276] eta: 0:18:50 lr: 2.0793964449932513e-05 loss: 0.0872 (0.1021) time: 2.8941 data: 0.0071 max mem: 33301 Epoch: [24] [3900/4276] eta: 0:18:21 lr: 2.079106406821517e-05 loss: 0.0944 (0.1021) time: 2.8963 data: 0.0071 max mem: 33301 Epoch: [24] [3910/4276] eta: 0:17:52 lr: 2.078816364154078e-05 loss: 0.0893 (0.1020) time: 2.9249 data: 0.0073 max mem: 33301 Epoch: [24] [3920/4276] eta: 0:17:22 lr: 2.078526316990167e-05 loss: 0.0889 (0.1020) time: 2.9393 data: 0.0075 max mem: 33301 Epoch: [24] [3930/4276] eta: 0:16:53 lr: 2.0782362653290168e-05 loss: 0.0902 (0.1020) time: 2.9242 data: 0.0069 max mem: 33301 Epoch: [24] [3940/4276] eta: 0:16:24 lr: 2.077946209169861e-05 loss: 0.0911 (0.1020) time: 2.9234 data: 0.0069 max mem: 33301 Epoch: [24] [3950/4276] eta: 0:15:54 lr: 2.0776561485119323e-05 loss: 0.0911 (0.1020) time: 2.9288 data: 0.0068 
max mem: 33301 Epoch: [24] [3960/4276] eta: 0:15:25 lr: 2.077366083354462e-05 loss: 0.0975 (0.1020) time: 2.9305 data: 0.0069 max mem: 33301 Epoch: [24] [3970/4276] eta: 0:14:56 lr: 2.077076013696683e-05 loss: 0.1070 (0.1020) time: 2.9333 data: 0.0069 max mem: 33301 Epoch: [24] [3980/4276] eta: 0:14:27 lr: 2.0767859395378265e-05 loss: 0.0931 (0.1020) time: 2.9336 data: 0.0068 max mem: 33301 Epoch: [24] [3990/4276] eta: 0:13:57 lr: 2.0764958608771253e-05 loss: 0.0931 (0.1020) time: 2.9271 data: 0.0069 max mem: 33301 Epoch: [24] [4000/4276] eta: 0:13:28 lr: 2.0762057777138097e-05 loss: 0.0953 (0.1020) time: 2.9272 data: 0.0068 max mem: 33301 Epoch: [24] [4010/4276] eta: 0:12:59 lr: 2.075915690047111e-05 loss: 0.0891 (0.1019) time: 2.9280 data: 0.0070 max mem: 33301 Epoch: [24] [4020/4276] eta: 0:12:29 lr: 2.07562559787626e-05 loss: 0.0939 (0.1020) time: 2.9274 data: 0.0069 max mem: 33301 Epoch: [24] [4030/4276] eta: 0:12:00 lr: 2.0753355012004883e-05 loss: 0.0971 (0.1019) time: 2.9271 data: 0.0070 max mem: 33301 Epoch: [24] [4040/4276] eta: 0:11:31 lr: 2.0750454000190245e-05 loss: 0.0921 (0.1020) time: 2.9267 data: 0.0071 max mem: 33301 Epoch: [24] [4050/4276] eta: 0:11:02 lr: 2.0747552943310998e-05 loss: 0.0925 (0.1019) time: 2.9251 data: 0.0069 max mem: 33301 Epoch: [24] [4060/4276] eta: 0:10:32 lr: 2.0744651841359437e-05 loss: 0.0978 (0.1019) time: 2.9265 data: 0.0070 max mem: 33301 Epoch: [24] [4070/4276] eta: 0:10:03 lr: 2.0741750694327864e-05 loss: 0.1056 (0.1020) time: 2.9287 data: 0.0069 max mem: 33301 Epoch: [24] [4080/4276] eta: 0:09:34 lr: 2.073884950220856e-05 loss: 0.0996 (0.1020) time: 2.9231 data: 0.0071 max mem: 33301 Epoch: [24] [4090/4276] eta: 0:09:04 lr: 2.0735948264993828e-05 loss: 0.1070 (0.1020) time: 2.9134 data: 0.0071 max mem: 33301 Epoch: [24] [4100/4276] eta: 0:08:35 lr: 2.0733046982675945e-05 loss: 0.1075 (0.1020) time: 2.9185 data: 0.0068 max mem: 33301 Epoch: [24] [4110/4276] eta: 0:08:06 lr: 2.0730145655247212e-05 loss: 0.1101 (0.1020) 
time: 2.9295 data: 0.0071 max mem: 33301 Epoch: [24] [4120/4276] eta: 0:07:36 lr: 2.07272442826999e-05 loss: 0.1035 (0.1020) time: 2.9315 data: 0.0070 max mem: 33301 Epoch: [24] [4130/4276] eta: 0:07:07 lr: 2.072434286502629e-05 loss: 0.0958 (0.1020) time: 2.9304 data: 0.0070 max mem: 33301 Epoch: [24] [4140/4276] eta: 0:06:38 lr: 2.072144140221867e-05 loss: 0.0958 (0.1020) time: 2.9241 data: 0.0068 max mem: 33301 Epoch: [24] [4150/4276] eta: 0:06:09 lr: 2.0718539894269303e-05 loss: 0.0992 (0.1020) time: 2.9136 data: 0.0076 max mem: 33301 Epoch: [24] [4160/4276] eta: 0:05:39 lr: 2.071563834117047e-05 loss: 0.1033 (0.1020) time: 2.9131 data: 0.0080 max mem: 33301 Epoch: [24] [4170/4276] eta: 0:05:10 lr: 2.0712736742914442e-05 loss: 0.1077 (0.1021) time: 2.9177 data: 0.0073 max mem: 33301 Epoch: [24] [4180/4276] eta: 0:04:41 lr: 2.0709835099493485e-05 loss: 0.0996 (0.1020) time: 2.9226 data: 0.0069 max mem: 33301 Epoch: [24] [4190/4276] eta: 0:04:11 lr: 2.070693341089986e-05 loss: 0.0933 (0.1021) time: 2.9281 data: 0.0065 max mem: 33301 Epoch: [24] [4200/4276] eta: 0:03:42 lr: 2.070403167712584e-05 loss: 0.1026 (0.1021) time: 2.9292 data: 0.0067 max mem: 33301 Epoch: [24] [4210/4276] eta: 0:03:13 lr: 2.0701129898163678e-05 loss: 0.1026 (0.1021) time: 2.9252 data: 0.0068 max mem: 33301 Epoch: [24] [4220/4276] eta: 0:02:44 lr: 2.069822807400564e-05 loss: 0.1105 (0.1021) time: 2.9261 data: 0.0069 max mem: 33301 Epoch: [24] [4230/4276] eta: 0:02:14 lr: 2.069532620464397e-05 loss: 0.1082 (0.1022) time: 2.9301 data: 0.0070 max mem: 33301 Epoch: [24] [4240/4276] eta: 0:01:45 lr: 2.0692424290070926e-05 loss: 0.1048 (0.1022) time: 2.9302 data: 0.0068 max mem: 33301 Epoch: [24] [4250/4276] eta: 0:01:16 lr: 2.068952233027876e-05 loss: 0.1084 (0.1022) time: 2.9261 data: 0.0067 max mem: 33301 Epoch: [24] [4260/4276] eta: 0:00:46 lr: 2.0686620325259724e-05 loss: 0.1141 (0.1022) time: 2.9204 data: 0.0065 max mem: 33301 Epoch: [24] [4270/4276] eta: 0:00:17 lr: 2.0683718275006057e-05 
loss: 0.1103 (0.1022) time: 2.9249 data: 0.0064 max mem: 33301 Epoch: [24] Total time: 3:28:45 Test: [ 0/21770] eta: 8:20:48 time: 1.3803 data: 1.3384 max mem: 33301 Test: [ 100/21770] eta: 0:18:57 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 200/21770] eta: 0:16:27 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 300/21770] eta: 0:15:34 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 400/21770] eta: 0:15:02 time: 0.0377 data: 0.0009 max mem: 33301 Test: [ 500/21770] eta: 0:14:40 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 600/21770] eta: 0:14:24 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 700/21770] eta: 0:14:11 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 800/21770] eta: 0:14:01 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 900/21770] eta: 0:13:52 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 1000/21770] eta: 0:13:44 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 1100/21770] eta: 0:13:37 time: 0.0380 data: 0.0008 max mem: 33301 Test: [ 1200/21770] eta: 0:13:31 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 1300/21770] eta: 0:13:25 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 1400/21770] eta: 0:13:20 time: 0.0383 data: 0.0008 max mem: 33301 Test: [ 1500/21770] eta: 0:13:14 time: 0.0384 data: 0.0008 max mem: 33301 Test: [ 1600/21770] eta: 0:13:09 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 1700/21770] eta: 0:13:04 time: 0.0380 data: 0.0008 max mem: 33301 Test: [ 1800/21770] eta: 0:12:59 time: 0.0381 data: 0.0008 max mem: 33301 Test: [ 1900/21770] eta: 0:12:54 time: 0.0381 data: 0.0008 max mem: 33301 Test: [ 2000/21770] eta: 0:12:49 time: 0.0380 data: 0.0008 max mem: 33301 Test: [ 2100/21770] eta: 0:12:44 time: 0.0380 data: 0.0008 max mem: 33301 Test: [ 2200/21770] eta: 0:12:40 time: 0.0381 data: 0.0008 max mem: 33301 Test: [ 2300/21770] eta: 0:12:35 time: 0.0380 data: 0.0008 max mem: 33301 Test: [ 2400/21770] eta: 0:12:31 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 2500/21770] eta: 0:12:27 time: 0.0383 data: 0.0009 max 
mem: 33301 Test: [ 2600/21770] eta: 0:12:23 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 2700/21770] eta: 0:12:18 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 2800/21770] eta: 0:12:14 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 2900/21770] eta: 0:12:10 time: 0.0383 data: 0.0008 max mem: 33301 Test: [ 3000/21770] eta: 0:12:06 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 3100/21770] eta: 0:12:02 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 3200/21770] eta: 0:11:58 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 3300/21770] eta: 0:11:54 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 3400/21770] eta: 0:11:50 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 3500/21770] eta: 0:11:47 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 3600/21770] eta: 0:11:43 time: 0.0394 data: 0.0009 max mem: 33301 Test: [ 3700/21770] eta: 0:11:39 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 3800/21770] eta: 0:11:36 time: 0.0393 data: 0.0009 max mem: 33301 Test: [ 3900/21770] eta: 0:11:32 time: 0.0394 data: 0.0009 max mem: 33301 Test: [ 4000/21770] eta: 0:11:29 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 4100/21770] eta: 0:11:25 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 4200/21770] eta: 0:11:21 time: 0.0394 data: 0.0009 max mem: 33301 Test: [ 4300/21770] eta: 0:11:18 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 4400/21770] eta: 0:11:14 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 4500/21770] eta: 0:11:10 time: 0.0401 data: 0.0008 max mem: 33301 Test: [ 4600/21770] eta: 0:11:07 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 4700/21770] eta: 0:11:03 time: 0.0402 data: 0.0008 max mem: 33301 Test: [ 4800/21770] eta: 0:11:00 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 4900/21770] eta: 0:10:56 time: 0.0400 data: 0.0008 max mem: 33301 Test: [ 5000/21770] eta: 0:10:52 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 5100/21770] eta: 0:10:49 time: 0.0395 data: 0.0008 max mem: 33301 Test: [ 5200/21770] eta: 0:10:45 time: 0.0399 data: 0.0008 max 
mem: 33301 Test: [ 5300/21770] eta: 0:10:42 time: 0.0399 data: 0.0008 max mem: 33301 Test: [ 5400/21770] eta: 0:10:38 time: 0.0398 data: 0.0008 max mem: 33301 Test: [ 5500/21770] eta: 0:10:34 time: 0.0399 data: 0.0008 max mem: 33301 Test: [ 5600/21770] eta: 0:10:31 time: 0.0399 data: 0.0008 max mem: 33301 Test: [ 5700/21770] eta: 0:10:27 time: 0.0401 data: 0.0008 max mem: 33301 Test: [ 5800/21770] eta: 0:10:23 time: 0.0400 data: 0.0008 max mem: 33301 Test: [ 5900/21770] eta: 0:10:20 time: 0.0399 data: 0.0008 max mem: 33301 Test: [ 6000/21770] eta: 0:10:16 time: 0.0399 data: 0.0008 max mem: 33301 Test: [ 6100/21770] eta: 0:10:12 time: 0.0395 data: 0.0008 max mem: 33301 Test: [ 6200/21770] eta: 0:10:08 time: 0.0395 data: 0.0008 max mem: 33301 Test: [ 6300/21770] eta: 0:10:05 time: 0.0394 data: 0.0008 max mem: 33301 Test: [ 6400/21770] eta: 0:10:01 time: 0.0399 data: 0.0008 max mem: 33301 Test: [ 6500/21770] eta: 0:09:57 time: 0.0394 data: 0.0008 max mem: 33301 Test: [ 6600/21770] eta: 0:09:53 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 6700/21770] eta: 0:09:49 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 6800/21770] eta: 0:09:45 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 6900/21770] eta: 0:09:42 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 7000/21770] eta: 0:09:38 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 7100/21770] eta: 0:09:34 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 7200/21770] eta: 0:09:30 time: 0.0394 data: 0.0008 max mem: 33301 Test: [ 7300/21770] eta: 0:09:26 time: 0.0377 data: 0.0009 max mem: 33301 Test: [ 7400/21770] eta: 0:09:22 time: 0.0377 data: 0.0009 max mem: 33301 Test: [ 7500/21770] eta: 0:09:18 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 7600/21770] eta: 0:09:14 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 7700/21770] eta: 0:09:10 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 7800/21770] eta: 0:09:06 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 7900/21770] eta: 0:09:02 time: 0.0391 data: 0.0008 max 
mem: 33301 Test: [ 8000/21770] eta: 0:08:58 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 8100/21770] eta: 0:08:54 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 8200/21770] eta: 0:08:50 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 8300/21770] eta: 0:08:46 time: 0.0389 data: 0.0008 max mem: 33301 Test: [ 8400/21770] eta: 0:08:42 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 8500/21770] eta: 0:08:38 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 8600/21770] eta: 0:08:34 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 8700/21770] eta: 0:08:30 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 8800/21770] eta: 0:08:26 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 8900/21770] eta: 0:08:23 time: 0.0394 data: 0.0008 max mem: 33301 Test: [ 9000/21770] eta: 0:08:19 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 9100/21770] eta: 0:08:15 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 9200/21770] eta: 0:08:11 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 9300/21770] eta: 0:08:07 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 9400/21770] eta: 0:08:03 time: 0.0398 data: 0.0009 max mem: 33301 Test: [ 9500/21770] eta: 0:07:59 time: 0.0398 data: 0.0008 max mem: 33301 Test: [ 9600/21770] eta: 0:07:55 time: 0.0377 data: 0.0009 max mem: 33301 Test: [ 9700/21770] eta: 0:07:51 time: 0.0399 data: 0.0008 max mem: 33301 Test: [ 9800/21770] eta: 0:07:48 time: 0.0399 data: 0.0008 max mem: 33301 Test: [ 9900/21770] eta: 0:07:44 time: 0.0399 data: 0.0008 max mem: 33301 Test: [10000/21770] eta: 0:07:40 time: 0.0400 data: 0.0008 max mem: 33301 Test: [10100/21770] eta: 0:07:36 time: 0.0398 data: 0.0008 max mem: 33301 Test: [10200/21770] eta: 0:07:32 time: 0.0401 data: 0.0008 max mem: 33301 Test: [10300/21770] eta: 0:07:29 time: 0.0399 data: 0.0008 max mem: 33301 Test: [10400/21770] eta: 0:07:25 time: 0.0400 data: 0.0008 max mem: 33301 Test: [10500/21770] eta: 0:07:21 time: 0.0398 data: 0.0008 max mem: 33301 Test: [10600/21770] eta: 0:07:17 time: 0.0401 data: 0.0008 max 
mem: 33301 Test: [10700/21770] eta: 0:07:13 time: 0.0399 data: 0.0008 max mem: 33301 Test: [10800/21770] eta: 0:07:09 time: 0.0401 data: 0.0008 max mem: 33301 Test: [10900/21770] eta: 0:07:06 time: 0.0400 data: 0.0008 max mem: 33301 Test: [11000/21770] eta: 0:07:02 time: 0.0401 data: 0.0008 max mem: 33301 Test: [11100/21770] eta: 0:06:58 time: 0.0398 data: 0.0008 max mem: 33301 Test: [11200/21770] eta: 0:06:54 time: 0.0387 data: 0.0008 max mem: 33301 Test: [11300/21770] eta: 0:06:50 time: 0.0388 data: 0.0008 max mem: 33301 Test: [11400/21770] eta: 0:06:46 time: 0.0385 data: 0.0008 max mem: 33301 Test: [11500/21770] eta: 0:06:42 time: 0.0387 data: 0.0008 max mem: 33301 Test: [11600/21770] eta: 0:06:38 time: 0.0386 data: 0.0008 max mem: 33301 Test: [11700/21770] eta: 0:06:34 time: 0.0386 data: 0.0008 max mem: 33301 Test: [11800/21770] eta: 0:06:30 time: 0.0386 data: 0.0008 max mem: 33301 Test: [11900/21770] eta: 0:06:26 time: 0.0386 data: 0.0008 max mem: 33301 Test: [12000/21770] eta: 0:06:22 time: 0.0386 data: 0.0008 max mem: 33301 Test: [12100/21770] eta: 0:06:18 time: 0.0388 data: 0.0008 max mem: 33301 Test: [12200/21770] eta: 0:06:14 time: 0.0387 data: 0.0008 max mem: 33301 Test: [12300/21770] eta: 0:06:10 time: 0.0389 data: 0.0008 max mem: 33301 Test: [12400/21770] eta: 0:06:06 time: 0.0391 data: 0.0008 max mem: 33301 Test: [12500/21770] eta: 0:06:02 time: 0.0389 data: 0.0008 max mem: 33301 Test: [12600/21770] eta: 0:05:59 time: 0.0389 data: 0.0008 max mem: 33301 Test: [12700/21770] eta: 0:05:55 time: 0.0389 data: 0.0008 max mem: 33301 Test: [12800/21770] eta: 0:05:51 time: 0.0385 data: 0.0008 max mem: 33301 Test: [12900/21770] eta: 0:05:47 time: 0.0388 data: 0.0008 max mem: 33301 Test: [13000/21770] eta: 0:05:43 time: 0.0385 data: 0.0008 max mem: 33301 Test: [13100/21770] eta: 0:05:39 time: 0.0387 data: 0.0008 max mem: 33301 Test: [13200/21770] eta: 0:05:35 time: 0.0387 data: 0.0009 max mem: 33301 Test: [13300/21770] eta: 0:05:31 time: 0.0386 data: 0.0008 max 
mem: 33301 Test: [13400/21770] eta: 0:05:27 time: 0.0385 data: 0.0008 max mem: 33301 Test: [13500/21770] eta: 0:05:23 time: 0.0387 data: 0.0008 max mem: 33301 Test: [13600/21770] eta: 0:05:19 time: 0.0390 data: 0.0008 max mem: 33301 Test: [13700/21770] eta: 0:05:15 time: 0.0390 data: 0.0008 max mem: 33301 Test: [13800/21770] eta: 0:05:11 time: 0.0390 data: 0.0008 max mem: 33301 Test: [13900/21770] eta: 0:05:07 time: 0.0390 data: 0.0008 max mem: 33301 Test: [14000/21770] eta: 0:05:03 time: 0.0391 data: 0.0008 max mem: 33301 Test: [14100/21770] eta: 0:04:59 time: 0.0389 data: 0.0008 max mem: 33301 Test: [14200/21770] eta: 0:04:56 time: 0.0389 data: 0.0008 max mem: 33301 Test: [14300/21770] eta: 0:04:52 time: 0.0388 data: 0.0008 max mem: 33301 Test: [14400/21770] eta: 0:04:48 time: 0.0388 data: 0.0009 max mem: 33301 Test: [14500/21770] eta: 0:04:44 time: 0.0391 data: 0.0008 max mem: 33301 Test: [14600/21770] eta: 0:04:40 time: 0.0390 data: 0.0008 max mem: 33301 Test: [14700/21770] eta: 0:04:36 time: 0.0391 data: 0.0008 max mem: 33301 Test: [14800/21770] eta: 0:04:32 time: 0.0393 data: 0.0009 max mem: 33301 Test: [14900/21770] eta: 0:04:28 time: 0.0392 data: 0.0008 max mem: 33301 Test: [15000/21770] eta: 0:04:24 time: 0.0394 data: 0.0008 max mem: 33301 Test: [15100/21770] eta: 0:04:20 time: 0.0392 data: 0.0008 max mem: 33301 Test: [15200/21770] eta: 0:04:16 time: 0.0395 data: 0.0008 max mem: 33301 Test: [15300/21770] eta: 0:04:13 time: 0.0391 data: 0.0008 max mem: 33301 Test: [15400/21770] eta: 0:04:09 time: 0.0395 data: 0.0008 max mem: 33301 Test: [15500/21770] eta: 0:04:05 time: 0.0394 data: 0.0008 max mem: 33301 Test: [15600/21770] eta: 0:04:01 time: 0.0393 data: 0.0008 max mem: 33301 Test: [15700/21770] eta: 0:03:57 time: 0.0394 data: 0.0008 max mem: 33301 Test: [15800/21770] eta: 0:03:53 time: 0.0391 data: 0.0008 max mem: 33301 Test: [15900/21770] eta: 0:03:49 time: 0.0394 data: 0.0008 max mem: 33301 Test: [16000/21770] eta: 0:03:45 time: 0.0393 data: 0.0008 max 
mem: 33301 Test: [16100/21770] eta: 0:03:41 time: 0.0392 data: 0.0008 max mem: 33301 Test: [16200/21770] eta: 0:03:37 time: 0.0392 data: 0.0008 max mem: 33301 Test: [16300/21770] eta: 0:03:33 time: 0.0397 data: 0.0008 max mem: 33301 Test: [16400/21770] eta: 0:03:30 time: 0.0392 data: 0.0008 max mem: 33301 Test: [16500/21770] eta: 0:03:26 time: 0.0390 data: 0.0008 max mem: 33301 Test: [16600/21770] eta: 0:03:22 time: 0.0395 data: 0.0008 max mem: 33301 Test: [16700/21770] eta: 0:03:18 time: 0.0392 data: 0.0008 max mem: 33301 Test: [16800/21770] eta: 0:03:14 time: 0.0394 data: 0.0008 max mem: 33301 Test: [16900/21770] eta: 0:03:10 time: 0.0390 data: 0.0008 max mem: 33301 Test: [17000/21770] eta: 0:03:06 time: 0.0392 data: 0.0008 max mem: 33301 Test: [17100/21770] eta: 0:03:02 time: 0.0390 data: 0.0008 max mem: 33301 Test: [17200/21770] eta: 0:02:58 time: 0.0391 data: 0.0008 max mem: 33301 Test: [17300/21770] eta: 0:02:54 time: 0.0391 data: 0.0008 max mem: 33301 Test: [17400/21770] eta: 0:02:50 time: 0.0391 data: 0.0008 max mem: 33301 Test: [17500/21770] eta: 0:02:47 time: 0.0391 data: 0.0008 max mem: 33301 Test: [17600/21770] eta: 0:02:43 time: 0.0391 data: 0.0008 max mem: 33301 Test: [17700/21770] eta: 0:02:39 time: 0.0391 data: 0.0008 max mem: 33301 Test: [17800/21770] eta: 0:02:35 time: 0.0393 data: 0.0008 max mem: 33301 Test: [17900/21770] eta: 0:02:31 time: 0.0392 data: 0.0008 max mem: 33301 Test: [18000/21770] eta: 0:02:27 time: 0.0391 data: 0.0008 max mem: 33301 Test: [18100/21770] eta: 0:02:23 time: 0.0392 data: 0.0008 max mem: 33301 Test: [18200/21770] eta: 0:02:19 time: 0.0391 data: 0.0008 max mem: 33301 Test: [18300/21770] eta: 0:02:15 time: 0.0391 data: 0.0008 max mem: 33301 Test: [18400/21770] eta: 0:02:11 time: 0.0392 data: 0.0008 max mem: 33301 Test: [18500/21770] eta: 0:02:07 time: 0.0390 data: 0.0008 max mem: 33301 Test: [18600/21770] eta: 0:02:04 time: 0.0393 data: 0.0008 max mem: 33301 Test: [18700/21770] eta: 0:02:00 time: 0.0392 data: 0.0008 max 
mem: 33301 Test: [18800/21770] eta: 0:01:56 time: 0.0393 data: 0.0008 max mem: 33301 Test: [18900/21770] eta: 0:01:52 time: 0.0392 data: 0.0008 max mem: 33301 Test: [19000/21770] eta: 0:01:48 time: 0.0393 data: 0.0008 max mem: 33301 Test: [19100/21770] eta: 0:01:44 time: 0.0391 data: 0.0008 max mem: 33301 Test: [19200/21770] eta: 0:01:40 time: 0.0391 data: 0.0008 max mem: 33301 Test: [19300/21770] eta: 0:01:36 time: 0.0390 data: 0.0008 max mem: 33301 Test: [19400/21770] eta: 0:01:32 time: 0.0393 data: 0.0008 max mem: 33301 Test: [19500/21770] eta: 0:01:28 time: 0.0393 data: 0.0008 max mem: 33301 Test: [19600/21770] eta: 0:01:24 time: 0.0394 data: 0.0008 max mem: 33301 Test: [19700/21770] eta: 0:01:21 time: 0.0391 data: 0.0008 max mem: 33301 Test: [19800/21770] eta: 0:01:17 time: 0.0391 data: 0.0008 max mem: 33301 Test: [19900/21770] eta: 0:01:13 time: 0.0391 data: 0.0008 max mem: 33301 Test: [20000/21770] eta: 0:01:09 time: 0.0392 data: 0.0008 max mem: 33301 Test: [20100/21770] eta: 0:01:05 time: 0.0390 data: 0.0008 max mem: 33301 Test: [20200/21770] eta: 0:01:01 time: 0.0389 data: 0.0008 max mem: 33301 Test: [20300/21770] eta: 0:00:57 time: 0.0389 data: 0.0008 max mem: 33301 Test: [20400/21770] eta: 0:00:53 time: 0.0388 data: 0.0008 max mem: 33301 Test: [20500/21770] eta: 0:00:49 time: 0.0388 data: 0.0008 max mem: 33301 Test: [20600/21770] eta: 0:00:45 time: 0.0388 data: 0.0008 max mem: 33301 Test: [20700/21770] eta: 0:00:41 time: 0.0386 data: 0.0008 max mem: 33301 Test: [20800/21770] eta: 0:00:37 time: 0.0388 data: 0.0008 max mem: 33301 Test: [20900/21770] eta: 0:00:34 time: 0.0389 data: 0.0008 max mem: 33301 Test: [21000/21770] eta: 0:00:30 time: 0.0388 data: 0.0008 max mem: 33301 Test: [21100/21770] eta: 0:00:26 time: 0.0393 data: 0.0009 max mem: 33301 Test: [21200/21770] eta: 0:00:22 time: 0.0389 data: 0.0008 max mem: 33301 Test: [21300/21770] eta: 0:00:18 time: 0.0389 data: 0.0008 max mem: 33301 Test: [21400/21770] eta: 0:00:14 time: 0.0389 data: 0.0009 max 
mem: 33301 Test: [21500/21770] eta: 0:00:10 time: 0.0387 data: 0.0008 max mem: 33301 Test: [21600/21770] eta: 0:00:06 time: 0.0387 data: 0.0008 max mem: 33301 Test: [21700/21770] eta: 0:00:02 time: 0.0390 data: 0.0008 max mem: 33301
Test: Total time: 0:14:11
Final results:
Mean IoU is 16.31
precision@0.5 = 2.99
precision@0.6 = 1.34
precision@0.7 = 0.38
precision@0.8 = 0.03
precision@0.9 = 0.00
overall IoU = 16.43
mean IoU = 16.31
Mean accuracy for one-to-zero sample is 0.00
Average object IoU 0.1631097177556786
Overall IoU 16.432435989379883
Better epoch: 24
Epoch: [25] [ 0/4276] eta: 6:08:13 lr: 2.068197702313801e-05 loss: 0.1300 (0.1300) time: 5.1669 data: 2.0707 max mem: 33301 Epoch: [25] [ 10/4276] eta: 3:43:13 lr: 2.0679074900492794e-05 loss: 0.0951 (0.1006) time: 3.1395 data: 0.1958 max mem: 33301 Epoch: [25] [ 20/4276] eta: 3:36:05 lr: 2.0676172732592776e-05 loss: 0.0964 (0.1037) time: 2.9404 data: 0.0079 max mem: 33301 Epoch: [25] [ 30/4276] eta: 3:33:15 lr: 2.06732705194302e-05 loss: 0.0983 (0.1051) time: 2.9443 data: 0.0078 max mem: 33301 Epoch: [25] [ 40/4276] eta: 3:31:28 lr: 2.0670368260997282e-05 loss: 0.0983 (0.1042) time: 2.9418 data: 0.0083 max mem: 33301 Epoch: [25] [ 50/4276] eta: 3:30:05 lr: 2.0667465957286265e-05 loss: 0.0930 (0.1026) time: 2.9355 data: 0.0084 max mem: 33301 Epoch: [25] [ 60/4276] eta: 3:28:59 lr: 2.066456360828937e-05 loss: 0.0894 (0.1071) time: 2.9311 data: 0.0082 max mem: 33301 Epoch: [25] [ 70/4276] eta: 3:28:06 lr: 2.0661661213998834e-05 loss: 0.0882 (0.1045) time: 2.9326 data: 0.0078 max mem: 33301 Epoch: [25] [ 80/4276] eta: 3:27:19 lr: 2.065875877440687e-05 loss: 0.0976 (0.1067) time: 2.9353 data: 0.0079 max mem: 33301 Epoch: [25] [ 90/4276] eta: 3:26:35 lr: 2.06558562895057e-05 loss: 0.1090 (0.1073) time: 2.9345 data: 0.0080 max mem: 33301 Epoch: [25] [ 100/4276] eta: 3:25:55 lr: 2.065295375928755e-05 loss: 0.1111 (0.1076) time: 2.9340 data: 0.0079 max mem: 33301 Epoch: [25] [ 110/4276] eta: 3:25:15 lr: 
2.065005118374463e-05 loss: 0.1032 (0.1074) time: 2.9337 data: 0.0078 max mem: 33301 Epoch: [25] [ 120/4276] eta: 3:24:40 lr: 2.0647148562869154e-05 loss: 0.1006 (0.1068) time: 2.9361 data: 0.0078 max mem: 33301 Epoch: [25] [ 130/4276] eta: 3:24:06 lr: 2.064424589665333e-05 loss: 0.0943 (0.1070) time: 2.9401 data: 0.0082 max mem: 33301 Epoch: [25] [ 140/4276] eta: 3:23:32 lr: 2.0641343185089367e-05 loss: 0.1001 (0.1061) time: 2.9391 data: 0.0079 max mem: 33301 Epoch: [25] [ 150/4276] eta: 3:22:58 lr: 2.063844042816948e-05 loss: 0.1026 (0.1060) time: 2.9375 data: 0.0077 max mem: 33301 Epoch: [25] [ 160/4276] eta: 3:22:28 lr: 2.0635537625885862e-05 loss: 0.1007 (0.1054) time: 2.9437 data: 0.0078 max mem: 33301 Epoch: [25] [ 170/4276] eta: 3:21:56 lr: 2.0632634778230713e-05 loss: 0.0967 (0.1046) time: 2.9451 data: 0.0077 max mem: 33301 Epoch: [25] [ 180/4276] eta: 3:21:24 lr: 2.0629731885196235e-05 loss: 0.1010 (0.1053) time: 2.9401 data: 0.0076 max mem: 33301 Epoch: [25] [ 190/4276] eta: 3:20:52 lr: 2.0626828946774625e-05 loss: 0.1038 (0.1049) time: 2.9398 data: 0.0074 max mem: 33301 Epoch: [25] [ 200/4276] eta: 3:20:21 lr: 2.0623925962958076e-05 loss: 0.0899 (0.1043) time: 2.9404 data: 0.0073 max mem: 33301 Epoch: [25] [ 210/4276] eta: 3:19:45 lr: 2.0621022933738768e-05 loss: 0.0899 (0.1042) time: 2.9287 data: 0.0071 max mem: 33301 Epoch: [25] [ 220/4276] eta: 3:19:04 lr: 2.0618119859108903e-05 loss: 0.0861 (0.1036) time: 2.9000 data: 0.0068 max mem: 33301 Epoch: [25] [ 230/4276] eta: 3:18:23 lr: 2.0615216739060654e-05 loss: 0.0826 (0.1033) time: 2.8823 data: 0.0071 max mem: 33301 Epoch: [25] [ 240/4276] eta: 3:17:42 lr: 2.061231357358621e-05 loss: 0.0949 (0.1031) time: 2.8758 data: 0.0071 max mem: 33301 Epoch: [25] [ 250/4276] eta: 3:17:01 lr: 2.0609410362677746e-05 loss: 0.1002 (0.1034) time: 2.8700 data: 0.0068 max mem: 33301 Epoch: [25] [ 260/4276] eta: 3:16:23 lr: 2.0606507106327452e-05 loss: 0.1068 (0.1033) time: 2.8731 data: 0.0067 max mem: 33301 Epoch: [25] 
[ 270/4276] eta: 3:15:51 lr: 2.0603603804527487e-05 loss: 0.0988 (0.1034) time: 2.8962 data: 0.0070 max mem: 33301 Epoch: [25] [ 280/4276] eta: 3:15:15 lr: 2.0600700457270033e-05 loss: 0.0895 (0.1028) time: 2.9027 data: 0.0071 max mem: 33301 Epoch: [25] [ 290/4276] eta: 3:14:39 lr: 2.0597797064547254e-05 loss: 0.0895 (0.1025) time: 2.8876 data: 0.0067 max mem: 33301 Epoch: [25] [ 300/4276] eta: 3:14:06 lr: 2.0594893626351326e-05 loss: 0.0963 (0.1025) time: 2.8907 data: 0.0069 max mem: 33301 Epoch: [25] [ 310/4276] eta: 3:13:38 lr: 2.0591990142674404e-05 loss: 0.0977 (0.1022) time: 2.9190 data: 0.0071 max mem: 33301 Epoch: [25] [ 320/4276] eta: 3:13:10 lr: 2.058908661350865e-05 loss: 0.0994 (0.1026) time: 2.9413 data: 0.0072 max mem: 33301 Epoch: [25] [ 330/4276] eta: 3:12:42 lr: 2.058618303884623e-05 loss: 0.1023 (0.1028) time: 2.9419 data: 0.0074 max mem: 33301 Epoch: [25] [ 340/4276] eta: 3:12:13 lr: 2.0583279418679305e-05 loss: 0.0990 (0.1027) time: 2.9389 data: 0.0075 max mem: 33301 Epoch: [25] [ 350/4276] eta: 3:11:39 lr: 2.058037575300001e-05 loss: 0.0972 (0.1026) time: 2.9094 data: 0.0073 max mem: 33301 Epoch: [25] [ 360/4276] eta: 3:11:05 lr: 2.0577472041800514e-05 loss: 0.0975 (0.1031) time: 2.8845 data: 0.0070 max mem: 33301 Epoch: [25] [ 370/4276] eta: 3:10:31 lr: 2.057456828507296e-05 loss: 0.0967 (0.1028) time: 2.8866 data: 0.0068 max mem: 33301 Epoch: [25] [ 380/4276] eta: 3:09:59 lr: 2.0571664482809498e-05 loss: 0.0894 (0.1029) time: 2.8935 data: 0.0071 max mem: 33301 Epoch: [25] [ 390/4276] eta: 3:09:27 lr: 2.0568760635002267e-05 loss: 0.0978 (0.1030) time: 2.8945 data: 0.0071 max mem: 33301 Epoch: [25] [ 400/4276] eta: 3:08:57 lr: 2.0565856741643406e-05 loss: 0.1141 (0.1035) time: 2.9047 data: 0.0070 max mem: 33301 Epoch: [25] [ 410/4276] eta: 3:08:25 lr: 2.056295280272506e-05 loss: 0.1160 (0.1036) time: 2.9075 data: 0.0073 max mem: 33301 Epoch: [25] [ 420/4276] eta: 3:07:52 lr: 2.0560048818239368e-05 loss: 0.0982 (0.1036) time: 2.8898 data: 0.0069 
max mem: 33301 Epoch: [25] [ 430/4276] eta: 3:07:19 lr: 2.055714478817845e-05 loss: 0.1023 (0.1037) time: 2.8832 data: 0.0067 max mem: 33301 Epoch: [25] [ 440/4276] eta: 3:06:51 lr: 2.055424071253445e-05 loss: 0.0978 (0.1035) time: 2.9103 data: 0.0073 max mem: 33301 Epoch: [25] [ 450/4276] eta: 3:06:23 lr: 2.055133659129949e-05 loss: 0.0942 (0.1034) time: 2.9389 data: 0.0074 max mem: 33301 Epoch: [25] [ 460/4276] eta: 3:05:55 lr: 2.05484324244657e-05 loss: 0.0945 (0.1030) time: 2.9352 data: 0.0073 max mem: 33301 Epoch: [25] [ 470/4276] eta: 3:05:24 lr: 2.0545528212025203e-05 loss: 0.0790 (0.1026) time: 2.9210 data: 0.0071 max mem: 33301 Epoch: [25] [ 480/4276] eta: 3:04:56 lr: 2.054262395397011e-05 loss: 0.0790 (0.1024) time: 2.9244 data: 0.0068 max mem: 33301 Epoch: [25] [ 490/4276] eta: 3:04:29 lr: 2.0539719650292548e-05 loss: 0.0888 (0.1022) time: 2.9410 data: 0.0069 max mem: 33301 Epoch: [25] [ 500/4276] eta: 3:04:00 lr: 2.0536815300984637e-05 loss: 0.0901 (0.1020) time: 2.9405 data: 0.0068 max mem: 33301 Epoch: [25] [ 510/4276] eta: 3:03:32 lr: 2.053391090603848e-05 loss: 0.0938 (0.1019) time: 2.9367 data: 0.0064 max mem: 33301 Epoch: [25] [ 520/4276] eta: 3:03:04 lr: 2.0531006465446184e-05 loss: 0.0998 (0.1020) time: 2.9388 data: 0.0064 max mem: 33301 Epoch: [25] [ 530/4276] eta: 3:02:35 lr: 2.0528101979199872e-05 loss: 0.1009 (0.1019) time: 2.9366 data: 0.0067 max mem: 33301 Epoch: [25] [ 540/4276] eta: 3:02:07 lr: 2.0525197447291634e-05 loss: 0.0972 (0.1018) time: 2.9319 data: 0.0071 max mem: 33301 Epoch: [25] [ 550/4276] eta: 3:01:37 lr: 2.052229286971358e-05 loss: 0.0970 (0.1017) time: 2.9285 data: 0.0065 max mem: 33301 Epoch: [25] [ 560/4276] eta: 3:01:09 lr: 2.05193882464578e-05 loss: 0.0971 (0.1016) time: 2.9331 data: 0.0062 max mem: 33301 Epoch: [25] [ 570/4276] eta: 3:00:41 lr: 2.051648357751641e-05 loss: 0.1013 (0.1016) time: 2.9403 data: 0.0063 max mem: 33301 Epoch: [25] [ 580/4276] eta: 3:00:13 lr: 2.051357886288149e-05 loss: 0.1013 (0.1015) time: 
2.9400 data: 0.0062 max mem: 33301 Epoch: [25] [ 590/4276] eta: 2:59:44 lr: 2.051067410254513e-05 loss: 0.0843 (0.1014) time: 2.9389 data: 0.0062 max mem: 33301 Epoch: [25] [ 600/4276] eta: 2:59:16 lr: 2.0507769296499427e-05 loss: 0.0951 (0.1013) time: 2.9381 data: 0.0062 max mem: 33301 Epoch: [25] [ 610/4276] eta: 2:58:47 lr: 2.050486444473647e-05 loss: 0.0947 (0.1011) time: 2.9390 data: 0.0065 max mem: 33301 Epoch: [25] [ 620/4276] eta: 2:58:19 lr: 2.0501959547248335e-05 loss: 0.0897 (0.1010) time: 2.9387 data: 0.0065 max mem: 33301 Epoch: [25] [ 630/4276] eta: 2:57:50 lr: 2.0499054604027102e-05 loss: 0.0958 (0.1013) time: 2.9377 data: 0.0062 max mem: 33301 Epoch: [25] [ 640/4276] eta: 2:57:21 lr: 2.049614961506486e-05 loss: 0.0986 (0.1011) time: 2.9388 data: 0.0065 max mem: 33301 Epoch: [25] [ 650/4276] eta: 2:56:53 lr: 2.049324458035368e-05 loss: 0.0904 (0.1012) time: 2.9401 data: 0.0069 max mem: 33301 Epoch: [25] [ 660/4276] eta: 2:56:24 lr: 2.049033949988563e-05 loss: 0.1016 (0.1014) time: 2.9405 data: 0.0069 max mem: 33301 Epoch: [25] [ 670/4276] eta: 2:55:55 lr: 2.0487434373652785e-05 loss: 0.1028 (0.1014) time: 2.9366 data: 0.0066 max mem: 33301 Epoch: [25] [ 680/4276] eta: 2:55:27 lr: 2.0484529201647218e-05 loss: 0.1040 (0.1014) time: 2.9368 data: 0.0067 max mem: 33301 Epoch: [25] [ 690/4276] eta: 2:54:58 lr: 2.0481623983860996e-05 loss: 0.0914 (0.1013) time: 2.9419 data: 0.0066 max mem: 33301 Epoch: [25] [ 700/4276] eta: 2:54:30 lr: 2.047871872028617e-05 loss: 0.0880 (0.1011) time: 2.9417 data: 0.0064 max mem: 33301 Epoch: [25] [ 710/4276] eta: 2:54:01 lr: 2.0475813410914806e-05 loss: 0.0915 (0.1011) time: 2.9405 data: 0.0063 max mem: 33301 Epoch: [25] [ 720/4276] eta: 2:53:32 lr: 2.0472908055738966e-05 loss: 0.0955 (0.1011) time: 2.9401 data: 0.0065 max mem: 33301 Epoch: [25] [ 730/4276] eta: 2:53:04 lr: 2.0470002654750706e-05 loss: 0.0947 (0.1011) time: 2.9397 data: 0.0067 max mem: 33301 Epoch: [25] [ 740/4276] eta: 2:52:35 lr: 2.0467097207942066e-05 
loss: 0.0881 (0.1009) time: 2.9383 data: 0.0065 max mem: 33301 Epoch: [25] [ 750/4276] eta: 2:52:06 lr: 2.0464191715305112e-05 loss: 0.0892 (0.1008) time: 2.9381 data: 0.0063 max mem: 33301 Epoch: [25] [ 760/4276] eta: 2:51:37 lr: 2.0461286176831882e-05 loss: 0.0903 (0.1008) time: 2.9394 data: 0.0063 max mem: 33301 Epoch: [25] [ 770/4276] eta: 2:51:08 lr: 2.0458380592514426e-05 loss: 0.0903 (0.1007) time: 2.9387 data: 0.0064 max mem: 33301 Epoch: [25] [ 780/4276] eta: 2:50:39 lr: 2.0455474962344777e-05 loss: 0.0874 (0.1006) time: 2.9377 data: 0.0066 max mem: 33301 Epoch: [25] [ 790/4276] eta: 2:50:11 lr: 2.045256928631498e-05 loss: 0.0945 (0.1007) time: 2.9452 data: 0.0065 max mem: 33301 Epoch: [25] [ 800/4276] eta: 2:49:41 lr: 2.0449663564417074e-05 loss: 0.1011 (0.1007) time: 2.9336 data: 0.0065 max mem: 33301 Epoch: [25] [ 810/4276] eta: 2:49:12 lr: 2.0446757796643097e-05 loss: 0.1060 (0.1009) time: 2.9163 data: 0.0069 max mem: 33301 Epoch: [25] [ 820/4276] eta: 2:48:41 lr: 2.0443851982985064e-05 loss: 0.0999 (0.1008) time: 2.9135 data: 0.0069 max mem: 33301 Epoch: [25] [ 830/4276] eta: 2:48:12 lr: 2.044094612343502e-05 loss: 0.0942 (0.1009) time: 2.9186 data: 0.0068 max mem: 33301 Epoch: [25] [ 840/4276] eta: 2:47:43 lr: 2.043804021798499e-05 loss: 0.0991 (0.1012) time: 2.9323 data: 0.0066 max mem: 33301 Epoch: [25] [ 850/4276] eta: 2:47:14 lr: 2.0435134266626985e-05 loss: 0.0958 (0.1010) time: 2.9374 data: 0.0067 max mem: 33301 Epoch: [25] [ 860/4276] eta: 2:46:45 lr: 2.0432228269353032e-05 loss: 0.0888 (0.1011) time: 2.9380 data: 0.0067 max mem: 33301 Epoch: [25] [ 870/4276] eta: 2:46:16 lr: 2.0429322226155152e-05 loss: 0.0982 (0.1010) time: 2.9371 data: 0.0067 max mem: 33301 Epoch: [25] [ 880/4276] eta: 2:45:47 lr: 2.0426416137025365e-05 loss: 0.0982 (0.1011) time: 2.9379 data: 0.0068 max mem: 33301 Epoch: [25] [ 890/4276] eta: 2:45:18 lr: 2.0423510001955676e-05 loss: 0.1085 (0.1013) time: 2.9392 data: 0.0065 max mem: 33301 Epoch: [25] [ 900/4276] eta: 
2:44:49 lr: 2.0420603820938095e-05 loss: 0.1039 (0.1013) time: 2.9386 data: 0.0064 max mem: 33301 Epoch: [25] [ 910/4276] eta: 2:44:20 lr: 2.0417697593964634e-05 loss: 0.0989 (0.1013) time: 2.9341 data: 0.0065 max mem: 33301 Epoch: [25] [ 920/4276] eta: 2:43:51 lr: 2.0414791321027298e-05 loss: 0.0969 (0.1014) time: 2.9343 data: 0.0065 max mem: 33301 Epoch: [25] [ 930/4276] eta: 2:43:22 lr: 2.0411885002118086e-05 loss: 0.0969 (0.1013) time: 2.9381 data: 0.0063 max mem: 33301 Epoch: [25] [ 940/4276] eta: 2:42:53 lr: 2.0408978637229e-05 loss: 0.0920 (0.1012) time: 2.9379 data: 0.0063 max mem: 33301 Epoch: [25] [ 950/4276] eta: 2:42:24 lr: 2.040607222635203e-05 loss: 0.0922 (0.1013) time: 2.9391 data: 0.0066 max mem: 33301 Epoch: [25] [ 960/4276] eta: 2:41:55 lr: 2.0403165769479186e-05 loss: 0.1023 (0.1013) time: 2.9380 data: 0.0066 max mem: 33301 Epoch: [25] [ 970/4276] eta: 2:41:26 lr: 2.0400259266602444e-05 loss: 0.0994 (0.1013) time: 2.9410 data: 0.0063 max mem: 33301 Epoch: [25] [ 980/4276] eta: 2:40:57 lr: 2.03973527177138e-05 loss: 0.1025 (0.1013) time: 2.9432 data: 0.0065 max mem: 33301 Epoch: [25] [ 990/4276] eta: 2:40:28 lr: 2.0394446122805237e-05 loss: 0.1029 (0.1013) time: 2.9398 data: 0.0068 max mem: 33301 Epoch: [25] [1000/4276] eta: 2:39:59 lr: 2.039153948186875e-05 loss: 0.0954 (0.1012) time: 2.9393 data: 0.0071 max mem: 33301 Epoch: [25] [1010/4276] eta: 2:39:30 lr: 2.03886327948963e-05 loss: 0.0988 (0.1013) time: 2.9387 data: 0.0068 max mem: 33301 Epoch: [25] [1020/4276] eta: 2:39:01 lr: 2.038572606187988e-05 loss: 0.0984 (0.1013) time: 2.9387 data: 0.0066 max mem: 33301 Epoch: [25] [1030/4276] eta: 2:38:32 lr: 2.0382819282811463e-05 loss: 0.1002 (0.1014) time: 2.9395 data: 0.0067 max mem: 33301 Epoch: [25] [1040/4276] eta: 2:38:04 lr: 2.0379912457683025e-05 loss: 0.0974 (0.1013) time: 2.9458 data: 0.0068 max mem: 33301 Epoch: [25] [1050/4276] eta: 2:37:35 lr: 2.037700558648653e-05 loss: 0.0959 (0.1014) time: 2.9446 data: 0.0068 max mem: 33301 Epoch: 
[25] [1060/4276] eta: 2:37:06 lr: 2.0374098669213945e-05 loss: 0.0994 (0.1015) time: 2.9377 data: 0.0067 max mem: 33301 Epoch: [25] [1070/4276] eta: 2:36:36 lr: 2.037119170585724e-05 loss: 0.1148 (0.1017) time: 2.9345 data: 0.0069 max mem: 33301 Epoch: [25] [1080/4276] eta: 2:36:07 lr: 2.036828469640838e-05 loss: 0.1077 (0.1018) time: 2.9350 data: 0.0069 max mem: 33301 Epoch: [25] [1090/4276] eta: 2:35:38 lr: 2.036537764085931e-05 loss: 0.1157 (0.1019) time: 2.9379 data: 0.0067 max mem: 33301 Epoch: [25] [1100/4276] eta: 2:35:09 lr: 2.0362470539202006e-05 loss: 0.1178 (0.1020) time: 2.9369 data: 0.0065 max mem: 33301 Epoch: [25] [1110/4276] eta: 2:34:40 lr: 2.035956339142841e-05 loss: 0.1175 (0.1022) time: 2.9391 data: 0.0067 max mem: 33301 Epoch: [25] [1120/4276] eta: 2:34:11 lr: 2.0356656197530476e-05 loss: 0.1075 (0.1021) time: 2.9410 data: 0.0071 max mem: 33301 Epoch: [25] [1130/4276] eta: 2:33:42 lr: 2.0353748957500154e-05 loss: 0.0838 (0.1020) time: 2.9384 data: 0.0070 max mem: 33301 Epoch: [25] [1140/4276] eta: 2:33:13 lr: 2.035084167132939e-05 loss: 0.0872 (0.1020) time: 2.9379 data: 0.0066 max mem: 33301 Epoch: [25] [1150/4276] eta: 2:32:43 lr: 2.0347934339010133e-05 loss: 0.0966 (0.1021) time: 2.9362 data: 0.0067 max mem: 33301 Epoch: [25] [1160/4276] eta: 2:32:14 lr: 2.0345026960534312e-05 loss: 0.1138 (0.1021) time: 2.9348 data: 0.0071 max mem: 33301 Epoch: [25] [1170/4276] eta: 2:31:45 lr: 2.0342119535893868e-05 loss: 0.0962 (0.1021) time: 2.9370 data: 0.0069 max mem: 33301 Epoch: [25] [1180/4276] eta: 2:31:16 lr: 2.0339212065080746e-05 loss: 0.0962 (0.1020) time: 2.9359 data: 0.0065 max mem: 33301 Epoch: [25] [1190/4276] eta: 2:30:46 lr: 2.0336304548086873e-05 loss: 0.0818 (0.1019) time: 2.9352 data: 0.0066 max mem: 33301 Epoch: [25] [1200/4276] eta: 2:30:17 lr: 2.0333396984904178e-05 loss: 0.0818 (0.1019) time: 2.9365 data: 0.0068 max mem: 33301 Epoch: [25] [1210/4276] eta: 2:29:48 lr: 2.0330489375524588e-05 loss: 0.0830 (0.1018) time: 2.9371 data: 
0.0067 max mem: 33301 Epoch: [25] [1220/4276] eta: 2:29:19 lr: 2.0327581719940024e-05 loss: 0.0930 (0.1018) time: 2.9419 data: 0.0064 max mem: 33301 Epoch: [25] [1230/4276] eta: 2:28:50 lr: 2.0324674018142427e-05 loss: 0.0957 (0.1018) time: 2.9398 data: 0.0063 max mem: 33301 Epoch: [25] [1240/4276] eta: 2:28:21 lr: 2.0321766270123692e-05 loss: 0.0961 (0.1018) time: 2.9330 data: 0.0065 max mem: 33301 Epoch: [25] [1250/4276] eta: 2:27:51 lr: 2.0318858475875745e-05 loss: 0.0948 (0.1018) time: 2.9337 data: 0.0065 max mem: 33301 Epoch: [25] [1260/4276] eta: 2:27:22 lr: 2.0315950635390503e-05 loss: 0.0948 (0.1018) time: 2.9340 data: 0.0063 max mem: 33301 Epoch: [25] [1270/4276] eta: 2:26:53 lr: 2.031304274865988e-05 loss: 0.0974 (0.1018) time: 2.9353 data: 0.0063 max mem: 33301 Epoch: [25] [1280/4276] eta: 2:26:24 lr: 2.0310134815675774e-05 loss: 0.1072 (0.1018) time: 2.9422 data: 0.0064 max mem: 33301 Epoch: [25] [1290/4276] eta: 2:25:55 lr: 2.0307226836430096e-05 loss: 0.1037 (0.1019) time: 2.9473 data: 0.0065 max mem: 33301 Epoch: [25] [1300/4276] eta: 2:25:26 lr: 2.0304318810914748e-05 loss: 0.0925 (0.1018) time: 2.9431 data: 0.0063 max mem: 33301 Epoch: [25] [1310/4276] eta: 2:24:57 lr: 2.030141073912164e-05 loss: 0.0903 (0.1018) time: 2.9384 data: 0.0063 max mem: 33301 Epoch: [25] [1320/4276] eta: 2:24:27 lr: 2.0298502621042657e-05 loss: 0.1018 (0.1019) time: 2.9358 data: 0.0065 max mem: 33301 Epoch: [25] [1330/4276] eta: 2:23:58 lr: 2.02955944566697e-05 loss: 0.0892 (0.1018) time: 2.9349 data: 0.0064 max mem: 33301 Epoch: [25] [1340/4276] eta: 2:23:29 lr: 2.0292686245994656e-05 loss: 0.0871 (0.1018) time: 2.9461 data: 0.0064 max mem: 33301 Epoch: [25] [1350/4276] eta: 2:23:00 lr: 2.0289777989009422e-05 loss: 0.0891 (0.1018) time: 2.9427 data: 0.0063 max mem: 33301 Epoch: [25] [1360/4276] eta: 2:22:30 lr: 2.028686968570588e-05 loss: 0.0939 (0.1017) time: 2.9161 data: 0.0068 max mem: 33301 Epoch: [25] [1370/4276] eta: 2:22:01 lr: 2.028396133607591e-05 loss: 0.0955 
(0.1018) time: 2.9200 data: 0.0069 max mem: 33301 Epoch: [25] [1380/4276] eta: 2:21:31 lr: 2.0281052940111406e-05 loss: 0.0958 (0.1018) time: 2.9353 data: 0.0063 max mem: 33301 Epoch: [25] [1390/4276] eta: 2:21:02 lr: 2.0278144497804243e-05 loss: 0.1000 (0.1019) time: 2.9374 data: 0.0060 max mem: 33301 Epoch: [25] [1400/4276] eta: 2:20:33 lr: 2.0275236009146286e-05 loss: 0.1126 (0.1020) time: 2.9385 data: 0.0061 max mem: 33301 Epoch: [25] [1410/4276] eta: 2:20:04 lr: 2.027232747412942e-05 loss: 0.1018 (0.1020) time: 2.9367 data: 0.0062 max mem: 33301 Epoch: [25] [1420/4276] eta: 2:19:35 lr: 2.0269418892745508e-05 loss: 0.0985 (0.1020) time: 2.9379 data: 0.0060 max mem: 33301 Epoch: [25] [1430/4276] eta: 2:19:05 lr: 2.0266510264986428e-05 loss: 0.1116 (0.1021) time: 2.9369 data: 0.0061 max mem: 33301 Epoch: [25] [1440/4276] eta: 2:18:36 lr: 2.026360159084403e-05 loss: 0.0963 (0.1021) time: 2.9341 data: 0.0060 max mem: 33301 Epoch: [25] [1450/4276] eta: 2:18:07 lr: 2.0260692870310187e-05 loss: 0.0932 (0.1021) time: 2.9416 data: 0.0060 max mem: 33301 Epoch: [25] [1460/4276] eta: 2:17:38 lr: 2.0257784103376764e-05 loss: 0.0871 (0.1020) time: 2.9423 data: 0.0061 max mem: 33301 Epoch: [25] [1470/4276] eta: 2:17:08 lr: 2.02548752900356e-05 loss: 0.0964 (0.1020) time: 2.9333 data: 0.0061 max mem: 33301 Epoch: [25] [1480/4276] eta: 2:16:38 lr: 2.0251966430278567e-05 loss: 0.0964 (0.1020) time: 2.9155 data: 0.0063 max mem: 33301 Epoch: [25] [1490/4276] eta: 2:16:08 lr: 2.0249057524097504e-05 loss: 0.0897 (0.1019) time: 2.8883 data: 0.0064 max mem: 33301 Epoch: [25] [1500/4276] eta: 2:15:38 lr: 2.024614857148427e-05 loss: 0.0956 (0.1020) time: 2.8781 data: 0.0064 max mem: 33301 Epoch: [25] [1510/4276] eta: 2:15:07 lr: 2.02432395724307e-05 loss: 0.0956 (0.1019) time: 2.8778 data: 0.0066 max mem: 33301 Epoch: [25] [1520/4276] eta: 2:14:37 lr: 2.0240330526928645e-05 loss: 0.0931 (0.1019) time: 2.8860 data: 0.0067 max mem: 33301 Epoch: [25] [1530/4276] eta: 2:14:08 lr: 
2.0237421434969943e-05 loss: 0.0876 (0.1018) time: 2.9053 data: 0.0069 max mem: 33301 Epoch: [25] [1540/4276] eta: 2:13:39 lr: 2.023451229654644e-05 loss: 0.0901 (0.1018) time: 2.9281 data: 0.0074 max mem: 33301 Epoch: [25] [1550/4276] eta: 2:13:10 lr: 2.0231603111649953e-05 loss: 0.0951 (0.1017) time: 2.9387 data: 0.0073 max mem: 33301 Epoch: [25] [1560/4276] eta: 2:12:40 lr: 2.0228693880272325e-05 loss: 0.0972 (0.1018) time: 2.9374 data: 0.0070 max mem: 33301 Epoch: [25] [1570/4276] eta: 2:12:11 lr: 2.022578460240539e-05 loss: 0.0986 (0.1018) time: 2.9372 data: 0.0073 max mem: 33301 Epoch: [25] [1580/4276] eta: 2:11:42 lr: 2.022287527804097e-05 loss: 0.1009 (0.1018) time: 2.9371 data: 0.0075 max mem: 33301 Epoch: [25] [1590/4276] eta: 2:11:13 lr: 2.0219965907170888e-05 loss: 0.0995 (0.1018) time: 2.9362 data: 0.0073 max mem: 33301 Epoch: [25] [1600/4276] eta: 2:10:43 lr: 2.0217056489786966e-05 loss: 0.0995 (0.1018) time: 2.9354 data: 0.0071 max mem: 33301 Epoch: [25] [1610/4276] eta: 2:10:14 lr: 2.0214147025881023e-05 loss: 0.1007 (0.1018) time: 2.9340 data: 0.0074 max mem: 33301 Epoch: [25] [1620/4276] eta: 2:09:45 lr: 2.021123751544488e-05 loss: 0.0972 (0.1018) time: 2.9252 data: 0.0077 max mem: 33301 Epoch: [25] [1630/4276] eta: 2:09:15 lr: 2.0208327958470338e-05 loss: 0.0967 (0.1018) time: 2.9198 data: 0.0073 max mem: 33301 Epoch: [25] [1640/4276] eta: 2:08:46 lr: 2.0205418354949218e-05 loss: 0.0911 (0.1016) time: 2.9229 data: 0.0072 max mem: 33301 Epoch: [25] [1650/4276] eta: 2:08:16 lr: 2.020250870487332e-05 loss: 0.0880 (0.1016) time: 2.9231 data: 0.0075 max mem: 33301 Epoch: [25] [1660/4276] eta: 2:07:46 lr: 2.0199599008234464e-05 loss: 0.0902 (0.1016) time: 2.9017 data: 0.0072 max mem: 33301 Epoch: [25] [1670/4276] eta: 2:07:16 lr: 2.019668926502443e-05 loss: 0.0902 (0.1015) time: 2.8744 data: 0.0067 max mem: 33301 Epoch: [25] [1680/4276] eta: 2:06:46 lr: 2.0193779475235034e-05 loss: 0.0941 (0.1015) time: 2.8658 data: 0.0066 max mem: 33301 Epoch: [25] 
[1690/4276] eta: 2:06:16 lr: 2.0190869638858065e-05 loss: 0.0941 (0.1014) time: 2.8727 data: 0.0068 max mem: 33301 Epoch: [25] [1700/4276] eta: 2:05:45 lr: 2.0187959755885323e-05 loss: 0.0927 (0.1014) time: 2.8731 data: 0.0066 max mem: 33301 Epoch: [25] [1710/4276] eta: 2:05:15 lr: 2.018504982630859e-05 loss: 0.0886 (0.1013) time: 2.8654 data: 0.0062 max mem: 33301 Epoch: [25] [1720/4276] eta: 2:04:45 lr: 2.018213985011966e-05 loss: 0.0886 (0.1013) time: 2.8666 data: 0.0064 max mem: 33301 Epoch: [25] [1730/4276] eta: 2:04:15 lr: 2.0179229827310314e-05 loss: 0.0887 (0.1012) time: 2.8811 data: 0.0069 max mem: 33301 Epoch: [25] [1740/4276] eta: 2:03:46 lr: 2.017631975787235e-05 loss: 0.0870 (0.1012) time: 2.9086 data: 0.0074 max mem: 33301 Epoch: [25] [1750/4276] eta: 2:03:16 lr: 2.0173409641797525e-05 loss: 0.0933 (0.1011) time: 2.9225 data: 0.0074 max mem: 33301 Epoch: [25] [1760/4276] eta: 2:02:47 lr: 2.017049947907763e-05 loss: 0.0894 (0.1011) time: 2.9242 data: 0.0075 max mem: 33301 Epoch: [25] [1770/4276] eta: 2:02:18 lr: 2.0167589269704445e-05 loss: 0.0890 (0.1010) time: 2.9223 data: 0.0079 max mem: 33301 Epoch: [25] [1780/4276] eta: 2:01:48 lr: 2.0164679013669725e-05 loss: 0.0890 (0.1011) time: 2.9231 data: 0.0079 max mem: 33301 Epoch: [25] [1790/4276] eta: 2:01:19 lr: 2.0161768710965253e-05 loss: 0.0862 (0.1010) time: 2.9258 data: 0.0076 max mem: 33301 Epoch: [25] [1800/4276] eta: 2:00:50 lr: 2.0158858361582786e-05 loss: 0.0840 (0.1009) time: 2.9216 data: 0.0072 max mem: 33301 Epoch: [25] [1810/4276] eta: 2:00:20 lr: 2.0155947965514096e-05 loss: 0.0921 (0.1010) time: 2.9181 data: 0.0073 max mem: 33301 Epoch: [25] [1820/4276] eta: 1:59:51 lr: 2.0153037522750937e-05 loss: 0.0989 (0.1009) time: 2.9284 data: 0.0075 max mem: 33301 Epoch: [25] [1830/4276] eta: 1:59:22 lr: 2.0150127033285067e-05 loss: 0.0904 (0.1009) time: 2.9307 data: 0.0074 max mem: 33301 Epoch: [25] [1840/4276] eta: 1:58:52 lr: 2.0147216497108244e-05 loss: 0.0863 (0.1009) time: 2.9232 data: 
0.0074 max mem: 33301 Epoch: [25] [1850/4276] eta: 1:58:23 lr: 2.014430591421222e-05 loss: 0.0873 (0.1009) time: 2.9243 data: 0.0075 max mem: 33301 Epoch: [25] [1860/4276] eta: 1:57:54 lr: 2.0141395284588745e-05 loss: 0.0984 (0.1009) time: 2.9238 data: 0.0076 max mem: 33301 Epoch: [25] [1870/4276] eta: 1:57:24 lr: 2.013848460822956e-05 loss: 0.0993 (0.1010) time: 2.9229 data: 0.0073 max mem: 33301 Epoch: [25] [1880/4276] eta: 1:56:55 lr: 2.0135573885126418e-05 loss: 0.0898 (0.1009) time: 2.9234 data: 0.0071 max mem: 33301 Epoch: [25] [1890/4276] eta: 1:56:26 lr: 2.0132663115271057e-05 loss: 0.0918 (0.1009) time: 2.9240 data: 0.0074 max mem: 33301 Epoch: [25] [1900/4276] eta: 1:55:56 lr: 2.012975229865521e-05 loss: 0.0948 (0.1009) time: 2.9251 data: 0.0076 max mem: 33301 Epoch: [25] [1910/4276] eta: 1:55:27 lr: 2.0126841435270615e-05 loss: 0.0912 (0.1009) time: 2.9245 data: 0.0071 max mem: 33301 Epoch: [25] [1920/4276] eta: 1:54:58 lr: 2.0123930525109007e-05 loss: 0.0892 (0.1009) time: 2.9243 data: 0.0068 max mem: 33301 Epoch: [25] [1930/4276] eta: 1:54:28 lr: 2.0121019568162123e-05 loss: 0.0874 (0.1008) time: 2.9190 data: 0.0071 max mem: 33301 Epoch: [25] [1940/4276] eta: 1:53:59 lr: 2.0118108564421677e-05 loss: 0.0950 (0.1008) time: 2.9181 data: 0.0073 max mem: 33301 Epoch: [25] [1950/4276] eta: 1:53:30 lr: 2.0115197513879397e-05 loss: 0.1042 (0.1008) time: 2.9230 data: 0.0071 max mem: 33301 Epoch: [25] [1960/4276] eta: 1:53:00 lr: 2.0112286416527004e-05 loss: 0.1042 (0.1008) time: 2.9232 data: 0.0068 max mem: 33301 Epoch: [25] [1970/4276] eta: 1:52:31 lr: 2.010937527235623e-05 loss: 0.0826 (0.1007) time: 2.9252 data: 0.0070 max mem: 33301 Epoch: [25] [1980/4276] eta: 1:52:02 lr: 2.010646408135877e-05 loss: 0.0844 (0.1007) time: 2.9273 data: 0.0071 max mem: 33301 Epoch: [25] [1990/4276] eta: 1:51:32 lr: 2.0103552843526353e-05 loss: 0.0977 (0.1007) time: 2.9262 data: 0.0069 max mem: 33301 Epoch: [25] [2000/4276] eta: 1:51:03 lr: 2.010064155885068e-05 loss: 0.0981 
(0.1007) time: 2.9206 data: 0.0068 max mem: 33301 Epoch: [25] [2010/4276] eta: 1:50:34 lr: 2.0097730227323468e-05 loss: 0.0928 (0.1007) time: 2.9224 data: 0.0070 max mem: 33301 Epoch: [25] [2020/4276] eta: 1:50:04 lr: 2.0094818848936412e-05 loss: 0.1065 (0.1007) time: 2.9270 data: 0.0070 max mem: 33301 Epoch: [25] [2030/4276] eta: 1:49:35 lr: 2.009190742368122e-05 loss: 0.0971 (0.1007) time: 2.9249 data: 0.0066 max mem: 33301 Epoch: [25] [2040/4276] eta: 1:49:06 lr: 2.008899595154959e-05 loss: 0.0872 (0.1007) time: 2.9235 data: 0.0066 max mem: 33301 Epoch: [25] [2050/4276] eta: 1:48:36 lr: 2.008608443253322e-05 loss: 0.1059 (0.1007) time: 2.9246 data: 0.0068 max mem: 33301 Epoch: [25] [2060/4276] eta: 1:48:07 lr: 2.0083172866623797e-05 loss: 0.1001 (0.1007) time: 2.9279 data: 0.0069 max mem: 33301 Epoch: [25] [2070/4276] eta: 1:47:38 lr: 2.0080261253813018e-05 loss: 0.0881 (0.1007) time: 2.9225 data: 0.0067 max mem: 33301 Epoch: [25] [2080/4276] eta: 1:47:08 lr: 2.0077349594092574e-05 loss: 0.0939 (0.1007) time: 2.9203 data: 0.0067 max mem: 33301 Epoch: [25] [2090/4276] eta: 1:46:39 lr: 2.0074437887454142e-05 loss: 0.1026 (0.1008) time: 2.9294 data: 0.0069 max mem: 33301 Epoch: [25] [2100/4276] eta: 1:46:10 lr: 2.0071526133889407e-05 loss: 0.1026 (0.1008) time: 2.9281 data: 0.0068 max mem: 33301 Epoch: [25] [2110/4276] eta: 1:45:41 lr: 2.006861433339005e-05 loss: 0.0940 (0.1007) time: 2.9234 data: 0.0066 max mem: 33301 Epoch: [25] [2120/4276] eta: 1:45:11 lr: 2.0065702485947747e-05 loss: 0.0881 (0.1006) time: 2.9246 data: 0.0065 max mem: 33301 Epoch: [25] [2130/4276] eta: 1:44:42 lr: 2.0062790591554173e-05 loss: 0.0827 (0.1006) time: 2.9246 data: 0.0066 max mem: 33301 Epoch: [25] [2140/4276] eta: 1:44:13 lr: 2.0059878650200997e-05 loss: 0.0884 (0.1005) time: 2.9237 data: 0.0068 max mem: 33301 Epoch: [25] [2150/4276] eta: 1:43:43 lr: 2.0056966661879888e-05 loss: 0.0884 (0.1005) time: 2.9257 data: 0.0066 max mem: 33301 Epoch: [25] [2160/4276] eta: 1:43:14 lr: 
2.005405462658252e-05 loss: 0.0910 (0.1004) time: 2.9284 data: 0.0065 max mem: 33301 Epoch: [25] [2170/4276] eta: 1:42:45 lr: 2.005114254430054e-05 loss: 0.0910 (0.1004) time: 2.9297 data: 0.0068 max mem: 33301 Epoch: [25] [2180/4276] eta: 1:42:16 lr: 2.0048230415025613e-05 loss: 0.0883 (0.1004) time: 2.9281 data: 0.0069 max mem: 33301 Epoch: [25] [2190/4276] eta: 1:41:46 lr: 2.00453182387494e-05 loss: 0.0878 (0.1003) time: 2.9268 data: 0.0068 max mem: 33301 Epoch: [25] [2200/4276] eta: 1:41:17 lr: 2.004240601546356e-05 loss: 0.0973 (0.1004) time: 2.9274 data: 0.0069 max mem: 33301 Epoch: [25] [2210/4276] eta: 1:40:48 lr: 2.003949374515973e-05 loss: 0.1092 (0.1004) time: 2.9274 data: 0.0071 max mem: 33301 Epoch: [25] [2220/4276] eta: 1:40:18 lr: 2.0036581427829567e-05 loss: 0.1029 (0.1004) time: 2.9273 data: 0.0071 max mem: 33301 Epoch: [25] [2230/4276] eta: 1:39:49 lr: 2.0033669063464716e-05 loss: 0.0902 (0.1004) time: 2.9273 data: 0.0070 max mem: 33301 Epoch: [25] [2240/4276] eta: 1:39:20 lr: 2.0030756652056826e-05 loss: 0.0876 (0.1004) time: 2.9268 data: 0.0070 max mem: 33301 Epoch: [25] [2250/4276] eta: 1:38:50 lr: 2.0027844193597524e-05 loss: 0.0824 (0.1003) time: 2.9066 data: 0.0073 max mem: 33301 Epoch: [25] [2260/4276] eta: 1:38:21 lr: 2.0024931688078453e-05 loss: 0.0889 (0.1003) time: 2.8982 data: 0.0070 max mem: 33301 Epoch: [25] [2270/4276] eta: 1:37:52 lr: 2.002201913549125e-05 loss: 0.0895 (0.1003) time: 2.9198 data: 0.0065 max mem: 33301 Epoch: [25] [2280/4276] eta: 1:37:22 lr: 2.0019106535827546e-05 loss: 0.0934 (0.1003) time: 2.9295 data: 0.0063 max mem: 33301 Epoch: [25] [2290/4276] eta: 1:36:53 lr: 2.0016193889078966e-05 loss: 0.0923 (0.1003) time: 2.9272 data: 0.0060 max mem: 33301 Epoch: [25] [2300/4276] eta: 1:36:24 lr: 2.0013281195237135e-05 loss: 0.0884 (0.1003) time: 2.9272 data: 0.0060 max mem: 33301 Epoch: [25] [2310/4276] eta: 1:35:55 lr: 2.0010368454293683e-05 loss: 0.0928 (0.1003) time: 2.9287 data: 0.0060 max mem: 33301 Epoch: [25] 
[2320/4276] eta: 1:35:25 lr: 2.0007455666240227e-05 loss: 0.1042 (0.1003) time: 2.9278 data: 0.0060 max mem: 33301 Epoch: [25] [2330/4276] eta: 1:34:56 lr: 2.0004542831068378e-05 loss: 0.1093 (0.1004) time: 2.9253 data: 0.0060 max mem: 33301 Epoch: [25] [2340/4276] eta: 1:34:27 lr: 2.000162994876976e-05 loss: 0.1077 (0.1004) time: 2.9231 data: 0.0062 max mem: 33301 Epoch: [25] [2350/4276] eta: 1:33:57 lr: 1.9998717019335975e-05 loss: 0.0882 (0.1004) time: 2.9238 data: 0.0062 max mem: 33301 Epoch: [25] [2360/4276] eta: 1:33:28 lr: 1.9995804042758647e-05 loss: 0.0829 (0.1003) time: 2.9267 data: 0.0060 max mem: 33301 Epoch: [25] [2370/4276] eta: 1:32:59 lr: 1.9992891019029363e-05 loss: 0.0938 (0.1004) time: 2.9274 data: 0.0060 max mem: 33301 Epoch: [25] [2380/4276] eta: 1:32:30 lr: 1.9989977948139733e-05 loss: 0.1071 (0.1004) time: 2.9278 data: 0.0060 max mem: 33301 Epoch: [25] [2390/4276] eta: 1:32:00 lr: 1.9987064830081364e-05 loss: 0.1034 (0.1004) time: 2.9224 data: 0.0062 max mem: 33301 Epoch: [25] [2400/4276] eta: 1:31:31 lr: 1.9984151664845844e-05 loss: 0.1164 (0.1005) time: 2.9246 data: 0.0063 max mem: 33301 Epoch: [25] [2410/4276] eta: 1:31:02 lr: 1.998123845242477e-05 loss: 0.1057 (0.1005) time: 2.9298 data: 0.0060 max mem: 33301 Epoch: [25] [2420/4276] eta: 1:30:32 lr: 1.9978325192809737e-05 loss: 0.0933 (0.1005) time: 2.9283 data: 0.0060 max mem: 33301 Epoch: [25] [2430/4276] eta: 1:30:03 lr: 1.9975411885992335e-05 loss: 0.0988 (0.1005) time: 2.9283 data: 0.0061 max mem: 33301 Epoch: [25] [2440/4276] eta: 1:29:34 lr: 1.997249853196414e-05 loss: 0.1082 (0.1005) time: 2.9250 data: 0.0062 max mem: 33301 Epoch: [25] [2450/4276] eta: 1:29:05 lr: 1.9969585130716745e-05 loss: 0.0903 (0.1005) time: 2.9252 data: 0.0062 max mem: 33301 Epoch: [25] [2460/4276] eta: 1:28:35 lr: 1.9966671682241726e-05 loss: 0.0944 (0.1005) time: 2.9242 data: 0.0060 max mem: 33301 Epoch: [25] [2470/4276] eta: 1:28:06 lr: 1.9963758186530664e-05 loss: 0.1005 (0.1005) time: 2.8967 data: 
0.0067 max mem: 33301 Epoch: [25] [2480/4276] eta: 1:27:36 lr: 1.9960844643575126e-05 loss: 0.1012 (0.1005) time: 2.8920 data: 0.0075 max mem: 33301 Epoch: [25] [2490/4276] eta: 1:27:07 lr: 1.9957931053366683e-05 loss: 0.1029 (0.1006) time: 2.9039 data: 0.0079 max mem: 33301 Epoch: [25] [2500/4276] eta: 1:26:37 lr: 1.9955017415896912e-05 loss: 0.1017 (0.1006) time: 2.9121 data: 0.0072 max mem: 33301 Epoch: [25] [2510/4276] eta: 1:26:08 lr: 1.995210373115738e-05 loss: 0.0969 (0.1005) time: 2.9278 data: 0.0064 max mem: 33301 Epoch: [25] [2520/4276] eta: 1:25:39 lr: 1.994918999913964e-05 loss: 0.0910 (0.1005) time: 2.9290 data: 0.0065 max mem: 33301 Epoch: [25] [2530/4276] eta: 1:25:10 lr: 1.9946276219835252e-05 loss: 0.0837 (0.1004) time: 2.9210 data: 0.0064 max mem: 33301 Epoch: [25] [2540/4276] eta: 1:24:40 lr: 1.9943362393235783e-05 loss: 0.0890 (0.1004) time: 2.9199 data: 0.0065 max mem: 33301 Epoch: [25] [2550/4276] eta: 1:24:11 lr: 1.9940448519332784e-05 loss: 0.0923 (0.1003) time: 2.9272 data: 0.0065 max mem: 33301 Epoch: [25] [2560/4276] eta: 1:23:42 lr: 1.99375345981178e-05 loss: 0.0789 (0.1003) time: 2.9268 data: 0.0063 max mem: 33301 Epoch: [25] [2570/4276] eta: 1:23:13 lr: 1.9934620629582387e-05 loss: 0.0790 (0.1003) time: 2.9280 data: 0.0065 max mem: 33301 Epoch: [25] [2580/4276] eta: 1:22:43 lr: 1.9931706613718085e-05 loss: 0.0837 (0.1002) time: 2.9269 data: 0.0064 max mem: 33301 Epoch: [25] [2590/4276] eta: 1:22:14 lr: 1.9928792550516446e-05 loss: 0.0886 (0.1002) time: 2.9257 data: 0.0063 max mem: 33301 Epoch: [25] [2600/4276] eta: 1:21:45 lr: 1.9925878439968996e-05 loss: 0.0886 (0.1002) time: 2.9270 data: 0.0064 max mem: 33301 Epoch: [25] [2610/4276] eta: 1:21:15 lr: 1.9922964282067278e-05 loss: 0.0899 (0.1001) time: 2.9289 data: 0.0065 max mem: 33301 Epoch: [25] [2620/4276] eta: 1:20:46 lr: 1.9920050076802828e-05 loss: 0.0899 (0.1001) time: 2.9285 data: 0.0064 max mem: 33301 Epoch: [25] [2630/4276] eta: 1:20:17 lr: 1.9917135824167183e-05 loss: 0.0905 
(0.1001) time: 2.9291 data: 0.0063 max mem: 33301 Epoch: [25] [2640/4276] eta: 1:19:48 lr: 1.991422152415186e-05 loss: 0.0853 (0.1000) time: 2.9295 data: 0.0063 max mem: 33301 Epoch: [25] [2650/4276] eta: 1:19:18 lr: 1.9911307176748382e-05 loss: 0.0825 (0.1000) time: 2.9297 data: 0.0064 max mem: 33301 Epoch: [25] [2660/4276] eta: 1:18:49 lr: 1.990839278194829e-05 loss: 0.0850 (0.1000) time: 2.9325 data: 0.0064 max mem: 33301 Epoch: [25] [2670/4276] eta: 1:18:20 lr: 1.9905478339743086e-05 loss: 0.1092 (0.1000) time: 2.9296 data: 0.0062 max mem: 33301 Epoch: [25] [2680/4276] eta: 1:17:51 lr: 1.9902563850124293e-05 loss: 0.1035 (0.1000) time: 2.9268 data: 0.0063 max mem: 33301 Epoch: [25] [2690/4276] eta: 1:17:21 lr: 1.989964931308342e-05 loss: 0.0937 (0.1000) time: 2.9288 data: 0.0065 max mem: 33301 Epoch: [25] [2700/4276] eta: 1:16:52 lr: 1.9896734728611993e-05 loss: 0.0886 (0.1000) time: 2.9298 data: 0.0066 max mem: 33301 Epoch: [25] [2710/4276] eta: 1:16:23 lr: 1.98938200967015e-05 loss: 0.0845 (0.0999) time: 2.9285 data: 0.0066 max mem: 33301 Epoch: [25] [2720/4276] eta: 1:15:54 lr: 1.9890905417343457e-05 loss: 0.0778 (0.0998) time: 2.9272 data: 0.0065 max mem: 33301 Epoch: [25] [2730/4276] eta: 1:15:24 lr: 1.988799069052937e-05 loss: 0.1015 (0.0999) time: 2.9283 data: 0.0066 max mem: 33301 Epoch: [25] [2740/4276] eta: 1:14:55 lr: 1.988507591625073e-05 loss: 0.1078 (0.0999) time: 2.9261 data: 0.0067 max mem: 33301 Epoch: [25] [2750/4276] eta: 1:14:26 lr: 1.988216109449903e-05 loss: 0.0906 (0.0999) time: 2.8977 data: 0.0065 max mem: 33301 Epoch: [25] [2760/4276] eta: 1:13:56 lr: 1.9879246225265777e-05 loss: 0.0906 (0.0999) time: 2.9019 data: 0.0073 max mem: 33301 Epoch: [25] [2770/4276] eta: 1:13:27 lr: 1.987633130854245e-05 loss: 0.0963 (0.0999) time: 2.9303 data: 0.0081 max mem: 33301 Epoch: [25] [2780/4276] eta: 1:12:58 lr: 1.987341634432055e-05 loss: 0.1006 (0.0999) time: 2.9234 data: 0.0079 max mem: 33301 Epoch: [25] [2790/4276] eta: 1:12:29 lr: 
1.9870501332591544e-05 loss: 0.1036 (0.0999) time: 2.9244 data: 0.0076 max mem: 33301 Epoch: [25] [2800/4276] eta: 1:11:59 lr: 1.986758627334692e-05 loss: 0.0918 (0.0999) time: 2.9206 data: 0.0074 max mem: 33301 Epoch: [25] [2810/4276] eta: 1:11:30 lr: 1.9864671166578164e-05 loss: 0.0799 (0.0998) time: 2.9104 data: 0.0075 max mem: 33301 Epoch: [25] [2820/4276] eta: 1:11:01 lr: 1.986175601227675e-05 loss: 0.0789 (0.0997) time: 2.9208 data: 0.0075 max mem: 33301 Epoch: [25] [2830/4276] eta: 1:10:31 lr: 1.9858840810434144e-05 loss: 0.0884 (0.0997) time: 2.9305 data: 0.0073 max mem: 33301 Epoch: [25] [2840/4276] eta: 1:10:02 lr: 1.985592556104182e-05 loss: 0.0927 (0.0997) time: 2.9303 data: 0.0072 max mem: 33301 Epoch: [25] [2850/4276] eta: 1:09:33 lr: 1.9853010264091244e-05 loss: 0.1021 (0.0998) time: 2.9303 data: 0.0074 max mem: 33301 Epoch: [25] [2860/4276] eta: 1:09:04 lr: 1.985009491957389e-05 loss: 0.0952 (0.0998) time: 2.9308 data: 0.0074 max mem: 33301 Epoch: [25] [2870/4276] eta: 1:08:34 lr: 1.9847179527481204e-05 loss: 0.0998 (0.0998) time: 2.9305 data: 0.0072 max mem: 33301 Epoch: [25] [2880/4276] eta: 1:08:05 lr: 1.9844264087804656e-05 loss: 0.0949 (0.0998) time: 2.9314 data: 0.0073 max mem: 33301 Epoch: [25] [2890/4276] eta: 1:07:36 lr: 1.9841348600535694e-05 loss: 0.0868 (0.0998) time: 2.9331 data: 0.0076 max mem: 33301 Epoch: [25] [2900/4276] eta: 1:07:07 lr: 1.983843306566578e-05 loss: 0.0829 (0.0997) time: 2.9313 data: 0.0075 max mem: 33301 Epoch: [25] [2910/4276] eta: 1:06:37 lr: 1.983551748318635e-05 loss: 0.0873 (0.0997) time: 2.9289 data: 0.0072 max mem: 33301 Epoch: [25] [2920/4276] eta: 1:06:08 lr: 1.983260185308886e-05 loss: 0.0960 (0.0997) time: 2.9310 data: 0.0073 max mem: 33301 Epoch: [25] [2930/4276] eta: 1:05:39 lr: 1.9829686175364754e-05 loss: 0.0930 (0.0997) time: 2.9328 data: 0.0076 max mem: 33301 Epoch: [25] [2940/4276] eta: 1:05:10 lr: 1.9826770450005473e-05 loss: 0.0921 (0.0997) time: 2.9334 data: 0.0076 max mem: 33301 Epoch: [25] 
[2950/4276] eta: 1:04:40 lr: 1.982385467700245e-05 loss: 0.0937 (0.0997) time: 2.9343 data: 0.0073 max mem: 33301 Epoch: [25] [2960/4276] eta: 1:04:11 lr: 1.982093885634712e-05 loss: 0.0929 (0.0997) time: 2.9354 data: 0.0073 max mem: 33301 Epoch: [25] [2970/4276] eta: 1:03:42 lr: 1.9818022988030925e-05 loss: 0.0949 (0.0997) time: 2.9335 data: 0.0075 max mem: 33301 Epoch: [25] [2980/4276] eta: 1:03:13 lr: 1.9815107072045284e-05 loss: 0.1007 (0.0997) time: 2.9289 data: 0.0075 max mem: 33301 Epoch: [25] [2990/4276] eta: 1:02:43 lr: 1.981219110838162e-05 loss: 0.1007 (0.0997) time: 2.9289 data: 0.0073 max mem: 33301 Epoch: [25] [3000/4276] eta: 1:02:14 lr: 1.9809275097031366e-05 loss: 0.0956 (0.0997) time: 2.9280 data: 0.0071 max mem: 33301 Epoch: [25] [3010/4276] eta: 1:01:45 lr: 1.9806359037985943e-05 loss: 0.0956 (0.0997) time: 2.9293 data: 0.0070 max mem: 33301 Epoch: [25] [3020/4276] eta: 1:01:16 lr: 1.980344293123676e-05 loss: 0.1102 (0.0997) time: 2.9246 data: 0.0068 max mem: 33301 Epoch: [25] [3030/4276] eta: 1:00:46 lr: 1.9800526776775236e-05 loss: 0.0928 (0.0997) time: 2.9251 data: 0.0069 max mem: 33301 Epoch: [25] [3040/4276] eta: 1:00:17 lr: 1.9797610574592778e-05 loss: 0.1048 (0.0997) time: 2.9326 data: 0.0067 max mem: 33301 Epoch: [25] [3050/4276] eta: 0:59:48 lr: 1.9794694324680806e-05 loss: 0.0969 (0.0997) time: 2.9244 data: 0.0064 max mem: 33301 Epoch: [25] [3060/4276] eta: 0:59:18 lr: 1.979177802703071e-05 loss: 0.0762 (0.0996) time: 2.9025 data: 0.0063 max mem: 33301 Epoch: [25] [3070/4276] eta: 0:58:49 lr: 1.9788861681633904e-05 loss: 0.0840 (0.0996) time: 2.9036 data: 0.0067 max mem: 33301 Epoch: [25] [3080/4276] eta: 0:58:20 lr: 1.978594528848178e-05 loss: 0.0874 (0.0996) time: 2.9228 data: 0.0070 max mem: 33301 Epoch: [25] [3090/4276] eta: 0:57:51 lr: 1.9783028847565745e-05 loss: 0.0822 (0.0996) time: 2.9275 data: 0.0067 max mem: 33301 Epoch: [25] [3100/4276] eta: 0:57:21 lr: 1.978011235887718e-05 loss: 0.0912 (0.0996) time: 2.9280 data: 0.0066 
max mem: 33301 Epoch: [25] [3110/4276] eta: 0:56:52 lr: 1.977719582240748e-05 loss: 0.0831 (0.0995) time: 2.9272 data: 0.0066 max mem: 33301 Epoch: [25] [3120/4276] eta: 0:56:23 lr: 1.977427923814804e-05 loss: 0.0798 (0.0995) time: 2.9275 data: 0.0067 max mem: 33301 Epoch: [25] [3130/4276] eta: 0:55:53 lr: 1.9771362606090242e-05 loss: 0.0814 (0.0995) time: 2.9291 data: 0.0069 max mem: 33301 Epoch: [25] [3140/4276] eta: 0:55:24 lr: 1.9768445926225458e-05 loss: 0.0866 (0.0994) time: 2.9290 data: 0.0068 max mem: 33301 Epoch: [25] [3150/4276] eta: 0:54:55 lr: 1.9765529198545073e-05 loss: 0.0918 (0.0995) time: 2.9268 data: 0.0066 max mem: 33301 Epoch: [25] [3160/4276] eta: 0:54:26 lr: 1.976261242304047e-05 loss: 0.0935 (0.0995) time: 2.9268 data: 0.0068 max mem: 33301 Epoch: [25] [3170/4276] eta: 0:53:56 lr: 1.9759695599703016e-05 loss: 0.0943 (0.0995) time: 2.9280 data: 0.0070 max mem: 33301 Epoch: [25] [3180/4276] eta: 0:53:27 lr: 1.9756778728524076e-05 loss: 0.1026 (0.0995) time: 2.9289 data: 0.0068 max mem: 33301 Epoch: [25] [3190/4276] eta: 0:52:58 lr: 1.9753861809495023e-05 loss: 0.1025 (0.0995) time: 2.9289 data: 0.0066 max mem: 33301 Epoch: [25] [3200/4276] eta: 0:52:29 lr: 1.975094484260722e-05 loss: 0.0936 (0.0995) time: 2.9270 data: 0.0066 max mem: 33301 Epoch: [25] [3210/4276] eta: 0:51:59 lr: 1.974802782785203e-05 loss: 0.0937 (0.0995) time: 2.9263 data: 0.0066 max mem: 33301 Epoch: [25] [3220/4276] eta: 0:51:30 lr: 1.9745110765220805e-05 loss: 0.1036 (0.0995) time: 2.9274 data: 0.0064 max mem: 33301 Epoch: [25] [3230/4276] eta: 0:51:01 lr: 1.9742193654704904e-05 loss: 0.0952 (0.0995) time: 2.9267 data: 0.0064 max mem: 33301 Epoch: [25] [3240/4276] eta: 0:50:32 lr: 1.973927649629568e-05 loss: 0.1041 (0.0996) time: 2.9161 data: 0.0067 max mem: 33301 Epoch: [25] [3250/4276] eta: 0:50:02 lr: 1.9736359289984485e-05 loss: 0.1025 (0.0995) time: 2.9111 data: 0.0069 max mem: 33301 Epoch: [25] [3260/4276] eta: 0:49:33 lr: 1.9733442035762657e-05 loss: 0.0933 (0.0995) 
time: 2.9214 data: 0.0066 max mem: 33301 Epoch: [25] [3270/4276] eta: 0:49:04 lr: 1.9730524733621544e-05 loss: 0.1055 (0.0995) time: 2.9234 data: 0.0064 max mem: 33301 Epoch: [25] [3280/4276] eta: 0:48:34 lr: 1.9727607383552487e-05 loss: 0.1055 (0.0995) time: 2.9258 data: 0.0067 max mem: 33301 Epoch: [25] [3290/4276] eta: 0:48:05 lr: 1.972468998554682e-05 loss: 0.0938 (0.0995) time: 2.9300 data: 0.0068 max mem: 33301 Epoch: [25] [3300/4276] eta: 0:47:36 lr: 1.972177253959588e-05 loss: 0.1030 (0.0996) time: 2.9266 data: 0.0065 max mem: 33301 Epoch: [25] [3310/4276] eta: 0:47:07 lr: 1.9718855045690992e-05 loss: 0.1086 (0.0996) time: 2.9224 data: 0.0063 max mem: 33301 Epoch: [25] [3320/4276] eta: 0:46:37 lr: 1.9715937503823498e-05 loss: 0.1053 (0.0996) time: 2.9273 data: 0.0062 max mem: 33301 Epoch: [25] [3330/4276] eta: 0:46:08 lr: 1.9713019913984706e-05 loss: 0.1033 (0.0996) time: 2.9378 data: 0.0064 max mem: 33301 Epoch: [25] [3340/4276] eta: 0:45:39 lr: 1.971010227616595e-05 loss: 0.0999 (0.0996) time: 2.9401 data: 0.0063 max mem: 33301 Epoch: [25] [3350/4276] eta: 0:45:10 lr: 1.9707184590358545e-05 loss: 0.0942 (0.0996) time: 2.9379 data: 0.0062 max mem: 33301 Epoch: [25] [3360/4276] eta: 0:44:40 lr: 1.9704266856553812e-05 loss: 0.0873 (0.0996) time: 2.9380 data: 0.0063 max mem: 33301 Epoch: [25] [3370/4276] eta: 0:44:11 lr: 1.9701349074743056e-05 loss: 0.1062 (0.0997) time: 2.9402 data: 0.0064 max mem: 33301 Epoch: [25] [3380/4276] eta: 0:43:42 lr: 1.969843124491759e-05 loss: 0.1044 (0.0997) time: 2.9396 data: 0.0066 max mem: 33301 Epoch: [25] [3390/4276] eta: 0:43:13 lr: 1.969551336706872e-05 loss: 0.0992 (0.0997) time: 2.9356 data: 0.0063 max mem: 33301 Epoch: [25] [3400/4276] eta: 0:42:43 lr: 1.9692595441187764e-05 loss: 0.1029 (0.0998) time: 2.9337 data: 0.0062 max mem: 33301 Epoch: [25] [3410/4276] eta: 0:42:14 lr: 1.9689677467266002e-05 loss: 0.1003 (0.0998) time: 2.9356 data: 0.0063 max mem: 33301 Epoch: [25] [3420/4276] eta: 0:41:45 lr: 
1.9686759445294745e-05 loss: 0.1009 (0.0998) time: 2.9356 data: 0.0062 max mem: 33301 Epoch: [25] [3430/4276] eta: 0:41:16 lr: 1.9683841375265284e-05 loss: 0.1027 (0.0998) time: 2.9364 data: 0.0061 max mem: 33301 Epoch: [25] [3440/4276] eta: 0:40:46 lr: 1.968092325716892e-05 loss: 0.0948 (0.0998) time: 2.9376 data: 0.0061 max mem: 33301 Epoch: [25] [3450/4276] eta: 0:40:17 lr: 1.9678005090996923e-05 loss: 0.0929 (0.0998) time: 2.9146 data: 0.0066 max mem: 33301 Epoch: [25] [3460/4276] eta: 0:39:48 lr: 1.9675086876740594e-05 loss: 0.1144 (0.0999) time: 2.9075 data: 0.0073 max mem: 33301 Epoch: [25] [3470/4276] eta: 0:39:19 lr: 1.9672168614391215e-05 loss: 0.0915 (0.0998) time: 2.9320 data: 0.0075 max mem: 33301 Epoch: [25] [3480/4276] eta: 0:38:49 lr: 1.9669250303940068e-05 loss: 0.0860 (0.0998) time: 2.9407 data: 0.0070 max mem: 33301 Epoch: [25] [3490/4276] eta: 0:38:20 lr: 1.966633194537842e-05 loss: 0.0987 (0.0998) time: 2.9409 data: 0.0067 max mem: 33301 Epoch: [25] [3500/4276] eta: 0:37:51 lr: 1.966341353869755e-05 loss: 0.0851 (0.0998) time: 2.9461 data: 0.0072 max mem: 33301 Epoch: [25] [3510/4276] eta: 0:37:22 lr: 1.966049508388873e-05 loss: 0.0851 (0.0998) time: 2.9443 data: 0.0070 max mem: 33301 Epoch: [25] [3520/4276] eta: 0:36:52 lr: 1.9657576580943236e-05 loss: 0.0882 (0.0998) time: 2.9222 data: 0.0071 max mem: 33301 Epoch: [25] [3530/4276] eta: 0:36:23 lr: 1.9654658029852316e-05 loss: 0.0931 (0.0997) time: 2.9222 data: 0.0073 max mem: 33301 Epoch: [25] [3540/4276] eta: 0:35:54 lr: 1.9651739430607244e-05 loss: 0.0954 (0.0998) time: 2.9391 data: 0.0070 max mem: 33301 Epoch: [25] [3550/4276] eta: 0:35:25 lr: 1.9648820783199272e-05 loss: 0.0914 (0.0997) time: 2.9396 data: 0.0069 max mem: 33301 Epoch: [25] [3560/4276] eta: 0:34:55 lr: 1.9645902087619673e-05 loss: 0.0875 (0.0998) time: 2.9387 data: 0.0068 max mem: 33301 Epoch: [25] [3570/4276] eta: 0:34:26 lr: 1.9642983343859674e-05 loss: 0.0955 (0.0998) time: 2.9390 data: 0.0069 max mem: 33301 Epoch: [25] 
[3580/4276] eta: 0:33:57 lr: 1.9640064551910538e-05 loss: 0.0935 (0.0998) time: 2.9388 data: 0.0068 max mem: 33301 Epoch: [25] [3590/4276] eta: 0:33:28 lr: 1.9637145711763522e-05 loss: 0.0935 (0.0998) time: 2.9370 data: 0.0069 max mem: 33301 Epoch: [25] [3600/4276] eta: 0:32:58 lr: 1.963422682340985e-05 loss: 0.0909 (0.0998) time: 2.9361 data: 0.0069 max mem: 33301 Epoch: [25] [3610/4276] eta: 0:32:29 lr: 1.963130788684077e-05 loss: 0.0883 (0.0998) time: 2.9377 data: 0.0069 max mem: 33301 Epoch: [25] [3620/4276] eta: 0:32:00 lr: 1.9628388902047524e-05 loss: 0.0929 (0.0998) time: 2.9395 data: 0.0070 max mem: 33301 Epoch: [25] [3630/4276] eta: 0:31:31 lr: 1.9625469869021352e-05 loss: 0.0964 (0.0998) time: 2.9384 data: 0.0069 max mem: 33301 Epoch: [25] [3640/4276] eta: 0:31:01 lr: 1.962255078775347e-05 loss: 0.0873 (0.0997) time: 2.9386 data: 0.0069 max mem: 33301 Epoch: [25] [3650/4276] eta: 0:30:32 lr: 1.961963165823511e-05 loss: 0.0819 (0.0997) time: 2.9386 data: 0.0069 max mem: 33301 Epoch: [25] [3660/4276] eta: 0:30:03 lr: 1.961671248045751e-05 loss: 0.0879 (0.0997) time: 2.9376 data: 0.0069 max mem: 33301 Epoch: [25] [3670/4276] eta: 0:29:34 lr: 1.9613793254411885e-05 loss: 0.0961 (0.0997) time: 2.9406 data: 0.0069 max mem: 33301 Epoch: [25] [3680/4276] eta: 0:29:04 lr: 1.9610873980089448e-05 loss: 0.1076 (0.0997) time: 2.9408 data: 0.0068 max mem: 33301 Epoch: [25] [3690/4276] eta: 0:28:35 lr: 1.960795465748142e-05 loss: 0.1017 (0.0997) time: 2.9381 data: 0.0069 max mem: 33301 Epoch: [25] [3700/4276] eta: 0:28:06 lr: 1.9605035286579017e-05 loss: 0.0886 (0.0997) time: 2.9367 data: 0.0068 max mem: 33301 Epoch: [25] [3710/4276] eta: 0:27:37 lr: 1.9602115867373455e-05 loss: 0.0886 (0.0997) time: 2.9380 data: 0.0069 max mem: 33301 Epoch: [25] [3720/4276] eta: 0:27:07 lr: 1.9599196399855926e-05 loss: 0.0919 (0.0997) time: 2.9418 data: 0.0073 max mem: 33301 Epoch: [25] [3730/4276] eta: 0:26:38 lr: 1.9596276884017643e-05 loss: 0.0938 (0.0997) time: 2.9301 data: 0.0074 
max mem: 33301 Epoch: [25] [3740/4276] eta: 0:26:09 lr: 1.9593357319849805e-05 loss: 0.0959 (0.0997) time: 2.8999 data: 0.0073 max mem: 33301 Epoch: [25] [3750/4276] eta: 0:25:39 lr: 1.9590437707343617e-05 loss: 0.1015 (0.0997) time: 2.8831 data: 0.0072 max mem: 33301 Epoch: [25] [3760/4276] eta: 0:25:10 lr: 1.9587518046490265e-05 loss: 0.0884 (0.0997) time: 2.8838 data: 0.0071 max mem: 33301 Epoch: [25] [3770/4276] eta: 0:24:41 lr: 1.9584598337280944e-05 loss: 0.0885 (0.0997) time: 2.9060 data: 0.0072 max mem: 33301 Epoch: [25] [3780/4276] eta: 0:24:11 lr: 1.958167857970684e-05 loss: 0.0808 (0.0997) time: 2.9353 data: 0.0078 max mem: 33301 Epoch: [25] [3790/4276] eta: 0:23:42 lr: 1.9578758773759147e-05 loss: 0.0859 (0.0997) time: 2.9416 data: 0.0083 max mem: 33301 Epoch: [25] [3800/4276] eta: 0:23:13 lr: 1.957583891942904e-05 loss: 0.0964 (0.0997) time: 2.9409 data: 0.0083 max mem: 33301 Epoch: [25] [3810/4276] eta: 0:22:44 lr: 1.9572919016707703e-05 loss: 0.0910 (0.0997) time: 2.9403 data: 0.0085 max mem: 33301 Epoch: [25] [3820/4276] eta: 0:22:14 lr: 1.9569999065586304e-05 loss: 0.0872 (0.0996) time: 2.9458 data: 0.0084 max mem: 33301 Epoch: [25] [3830/4276] eta: 0:21:45 lr: 1.9567079066056034e-05 loss: 0.0857 (0.0996) time: 2.9464 data: 0.0083 max mem: 33301 Epoch: [25] [3840/4276] eta: 0:21:16 lr: 1.9564159018108046e-05 loss: 0.0838 (0.0996) time: 2.9295 data: 0.0082 max mem: 33301 Epoch: [25] [3850/4276] eta: 0:20:47 lr: 1.9561238921733514e-05 loss: 0.0794 (0.0995) time: 2.9172 data: 0.0079 max mem: 33301 Epoch: [25] [3860/4276] eta: 0:20:17 lr: 1.95583187769236e-05 loss: 0.0825 (0.0995) time: 2.9221 data: 0.0078 max mem: 33301 Epoch: [25] [3870/4276] eta: 0:19:48 lr: 1.9555398583669476e-05 loss: 0.0939 (0.0995) time: 2.9335 data: 0.0074 max mem: 33301 Epoch: [25] [3880/4276] eta: 0:19:19 lr: 1.9552478341962284e-05 loss: 0.0842 (0.0995) time: 2.9384 data: 0.0071 max mem: 33301 Epoch: [25] [3890/4276] eta: 0:18:50 lr: 1.954955805179319e-05 loss: 0.0943 
(0.0995) time: 2.9390 data: 0.0074 max mem: 33301 Epoch: [25] [3900/4276] eta: 0:18:20 lr: 1.9546637713153347e-05 loss: 0.1021 (0.0995) time: 2.9399 data: 0.0073 max mem: 33301 Epoch: [25] [3910/4276] eta: 0:17:51 lr: 1.954371732603389e-05 loss: 0.0821 (0.0995) time: 2.9211 data: 0.0070 max mem: 33301 Epoch: [25] [3920/4276] eta: 0:17:22 lr: 1.9540796890425978e-05 loss: 0.0808 (0.0994) time: 2.9083 data: 0.0074 max mem: 33301 Epoch: [25] [3930/4276] eta: 0:16:52 lr: 1.953787640632075e-05 loss: 0.0861 (0.0994) time: 2.9276 data: 0.0080 max mem: 33301 Epoch: [25] [3940/4276] eta: 0:16:23 lr: 1.953495587370935e-05 loss: 0.0932 (0.0994) time: 2.9405 data: 0.0079 max mem: 33301 Epoch: [25] [3950/4276] eta: 0:15:54 lr: 1.9532035292582908e-05 loss: 0.0962 (0.0995) time: 2.9381 data: 0.0078 max mem: 33301 Epoch: [25] [3960/4276] eta: 0:15:25 lr: 1.9529114662932553e-05 loss: 0.1009 (0.0995) time: 2.9401 data: 0.0079 max mem: 33301 Epoch: [25] [3970/4276] eta: 0:14:55 lr: 1.952619398474943e-05 loss: 0.1102 (0.0995) time: 2.9409 data: 0.0082 max mem: 33301 Epoch: [25] [3980/4276] eta: 0:14:26 lr: 1.9523273258024656e-05 loss: 0.0919 (0.0995) time: 2.9127 data: 0.0081 max mem: 33301 Epoch: [25] [3990/4276] eta: 0:13:57 lr: 1.9520352482749354e-05 loss: 0.0899 (0.0995) time: 2.8834 data: 0.0073 max mem: 33301 Epoch: [25] [4000/4276] eta: 0:13:27 lr: 1.951743165891465e-05 loss: 0.0884 (0.0994) time: 2.8803 data: 0.0070 max mem: 33301 Epoch: [25] [4010/4276] eta: 0:12:58 lr: 1.9514510786511662e-05 loss: 0.0940 (0.0995) time: 2.9088 data: 0.0075 max mem: 33301 Epoch: [25] [4020/4276] eta: 0:12:29 lr: 1.9511589865531504e-05 loss: 0.1007 (0.0994) time: 2.9380 data: 0.0077 max mem: 33301 Epoch: [25] [4030/4276] eta: 0:12:00 lr: 1.9508668895965286e-05 loss: 0.0972 (0.0994) time: 2.9375 data: 0.0075 max mem: 33301 Epoch: [25] [4040/4276] eta: 0:11:30 lr: 1.9505747877804113e-05 loss: 0.0924 (0.0994) time: 2.9368 data: 0.0074 max mem: 33301 Epoch: [25] [4050/4276] eta: 0:11:01 lr: 
1.95028268110391e-05 loss: 0.0888 (0.0994) time: 2.9397 data: 0.0076 max mem: 33301 Epoch: [25] [4060/4276] eta: 0:10:32 lr: 1.9499905695661346e-05 loss: 0.0878 (0.0994) time: 2.9432 data: 0.0076 max mem: 33301 Epoch: [25] [4070/4276] eta: 0:10:03 lr: 1.9496984531661947e-05 loss: 0.0968 (0.0994) time: 2.9384 data: 0.0074 max mem: 33301 Epoch: [25] [4080/4276] eta: 0:09:33 lr: 1.9494063319031997e-05 loss: 0.0929 (0.0994) time: 2.9388 data: 0.0069 max mem: 33301 Epoch: [25] [4090/4276] eta: 0:09:04 lr: 1.9491142057762597e-05 loss: 0.0998 (0.0995) time: 2.9421 data: 0.0068 max mem: 33301 Epoch: [25] [4100/4276] eta: 0:08:35 lr: 1.9488220747844836e-05 loss: 0.1060 (0.0995) time: 2.9378 data: 0.0071 max mem: 33301 Epoch: [25] [4110/4276] eta: 0:08:05 lr: 1.9485299389269788e-05 loss: 0.1001 (0.0995) time: 2.9361 data: 0.0069 max mem: 33301 Epoch: [25] [4120/4276] eta: 0:07:36 lr: 1.9482377982028554e-05 loss: 0.0997 (0.0995) time: 2.9362 data: 0.0073 max mem: 33301 Epoch: [25] [4130/4276] eta: 0:07:07 lr: 1.9479456526112206e-05 loss: 0.0962 (0.0995) time: 2.9349 data: 0.0073 max mem: 33301 Epoch: [25] [4140/4276] eta: 0:06:38 lr: 1.9476535021511823e-05 loss: 0.0904 (0.0995) time: 2.9242 data: 0.0070 max mem: 33301 Epoch: [25] [4150/4276] eta: 0:06:08 lr: 1.947361346821848e-05 loss: 0.0910 (0.0995) time: 2.9201 data: 0.0074 max mem: 33301 Epoch: [25] [4160/4276] eta: 0:05:39 lr: 1.9470691866223246e-05 loss: 0.0919 (0.0995) time: 2.9318 data: 0.0070 max mem: 33301 Epoch: [25] [4170/4276] eta: 0:05:10 lr: 1.9467770215517188e-05 loss: 0.1101 (0.0995) time: 2.9313 data: 0.0066 max mem: 33301 Epoch: [25] [4180/4276] eta: 0:04:41 lr: 1.946484851609138e-05 loss: 0.1020 (0.0995) time: 2.9289 data: 0.0066 max mem: 33301 Epoch: [25] [4190/4276] eta: 0:04:11 lr: 1.9461926767936874e-05 loss: 0.0949 (0.0995) time: 2.9352 data: 0.0065 max mem: 33301 Epoch: [25] [4200/4276] eta: 0:03:42 lr: 1.945900497104473e-05 loss: 0.1009 (0.0995) time: 2.9374 data: 0.0065 max mem: 33301 Epoch: [25] 
[4210/4276] eta: 0:03:13 lr: 1.945608312540601e-05 loss: 0.1010 (0.0995) time: 2.9386 data: 0.0065 max mem: 33301 Epoch: [25] [4220/4276] eta: 0:02:43 lr: 1.945316123101176e-05 loss: 0.1018 (0.0996) time: 2.9392 data: 0.0064 max mem: 33301 Epoch: [25] [4230/4276] eta: 0:02:14 lr: 1.9450239287853026e-05 loss: 0.1072 (0.0997) time: 2.9387 data: 0.0064 max mem: 33301 Epoch: [25] [4240/4276] eta: 0:01:45 lr: 1.9447317295920863e-05 loss: 0.1072 (0.0997) time: 2.9321 data: 0.0065 max mem: 33301 Epoch: [25] [4250/4276] eta: 0:01:16 lr: 1.9444395255206317e-05 loss: 0.1166 (0.0997) time: 2.9338 data: 0.0065 max mem: 33301 Epoch: [25] [4260/4276] eta: 0:00:46 lr: 1.9441473165700413e-05 loss: 0.1102 (0.0998) time: 2.9457 data: 0.0064 max mem: 33301 Epoch: [25] [4270/4276] eta: 0:00:17 lr: 1.94385510273942e-05 loss: 0.1000 (0.0998) time: 2.9441 data: 0.0064 max mem: 33301 Epoch: [25] Total time: 3:28:40 Test: [ 0/21770] eta: 7:38:22 time: 1.2633 data: 1.2250 max mem: 33301 Test: [ 100/21770] eta: 0:18:11 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 200/21770] eta: 0:16:00 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 300/21770] eta: 0:15:14 time: 0.0385 data: 0.0008 max mem: 33301 Test: [ 400/21770] eta: 0:14:50 time: 0.0389 data: 0.0008 max mem: 33301 Test: [ 500/21770] eta: 0:14:35 time: 0.0389 data: 0.0008 max mem: 33301 Test: [ 600/21770] eta: 0:14:23 time: 0.0387 data: 0.0008 max mem: 33301 Test: [ 700/21770] eta: 0:14:13 time: 0.0387 data: 0.0008 max mem: 33301 Test: [ 800/21770] eta: 0:14:05 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 900/21770] eta: 0:13:58 time: 0.0387 data: 0.0008 max mem: 33301 Test: [ 1000/21770] eta: 0:13:51 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 1100/21770] eta: 0:13:45 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 1200/21770] eta: 0:13:39 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 1300/21770] eta: 0:13:34 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 1400/21770] eta: 0:13:27 time: 0.0379 data: 0.0009 max mem: 
33301 Test: [ 1500/21770] eta: 0:13:21 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 1600/21770] eta: 0:13:15 time: 0.0378 data: 0.0009 max mem: 33301 Test: [ 1700/21770] eta: 0:13:09 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 1800/21770] eta: 0:13:04 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 1900/21770] eta: 0:12:59 time: 0.0380 data: 0.0008 max mem: 33301 Test: [ 2000/21770] eta: 0:12:54 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 2100/21770] eta: 0:12:49 time: 0.0379 data: 0.0008 max mem: 33301 Test: [ 2200/21770] eta: 0:12:44 time: 0.0380 data: 0.0008 max mem: 33301 Test: [ 2300/21770] eta: 0:12:39 time: 0.0380 data: 0.0008 max mem: 33301 Test: [ 2400/21770] eta: 0:12:35 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 2500/21770] eta: 0:12:30 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 2600/21770] eta: 0:12:26 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 2700/21770] eta: 0:12:21 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 2800/21770] eta: 0:12:17 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 2900/21770] eta: 0:12:12 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 3000/21770] eta: 0:12:08 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 3100/21770] eta: 0:12:05 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 3200/21770] eta: 0:12:01 time: 0.0389 data: 0.0008 max mem: 33301 Test: [ 3300/21770] eta: 0:11:57 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 3400/21770] eta: 0:11:53 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 3500/21770] eta: 0:11:50 time: 0.0394 data: 0.0008 max mem: 33301 Test: [ 3600/21770] eta: 0:11:46 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 3700/21770] eta: 0:11:43 time: 0.0396 data: 0.0009 max mem: 33301 Test: [ 3800/21770] eta: 0:11:39 time: 0.0394 data: 0.0009 max mem: 33301 Test: [ 3900/21770] eta: 0:11:35 time: 0.0396 data: 0.0008 max mem: 33301 Test: [ 4000/21770] eta: 0:11:32 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 4100/21770] eta: 0:11:28 time: 0.0392 data: 0.0008 max mem: 
33301 Test: [ 4200/21770] eta: 0:11:24 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 4300/21770] eta: 0:11:20 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 4400/21770] eta: 0:11:17 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 4500/21770] eta: 0:11:13 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 4600/21770] eta: 0:11:09 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 4700/21770] eta: 0:11:05 time: 0.0399 data: 0.0008 max mem: 33301 Test: [ 4800/21770] eta: 0:11:01 time: 0.0389 data: 0.0008 max mem: 33301 Test: [ 4900/21770] eta: 0:10:57 time: 0.0394 data: 0.0008 max mem: 33301 Test: [ 5000/21770] eta: 0:10:54 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 5100/21770] eta: 0:10:50 time: 0.0397 data: 0.0009 max mem: 33301 Test: [ 5200/21770] eta: 0:10:46 time: 0.0393 data: 0.0009 max mem: 33301 Test: [ 5300/21770] eta: 0:10:42 time: 0.0395 data: 0.0009 max mem: 33301 Test: [ 5400/21770] eta: 0:10:38 time: 0.0395 data: 0.0008 max mem: 33301 Test: [ 5500/21770] eta: 0:10:35 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 5600/21770] eta: 0:10:31 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 5700/21770] eta: 0:10:26 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 5800/21770] eta: 0:10:22 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 5900/21770] eta: 0:10:18 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 6000/21770] eta: 0:10:14 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 6100/21770] eta: 0:10:10 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 6200/21770] eta: 0:10:06 time: 0.0380 data: 0.0008 max mem: 33301 Test: [ 6300/21770] eta: 0:10:02 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 6400/21770] eta: 0:09:58 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 6500/21770] eta: 0:09:54 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 6600/21770] eta: 0:09:49 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 6700/21770] eta: 0:09:45 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 6800/21770] eta: 0:09:41 time: 0.0380 data: 0.0009 max mem: 
33301 Test: [ 6900/21770] eta: 0:09:37 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7000/21770] eta: 0:09:33 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 7100/21770] eta: 0:09:29 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7200/21770] eta: 0:09:25 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 7300/21770] eta: 0:09:21 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7400/21770] eta: 0:09:17 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 7500/21770] eta: 0:09:13 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7600/21770] eta: 0:09:09 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 7700/21770] eta: 0:09:05 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7800/21770] eta: 0:09:01 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 7900/21770] eta: 0:08:57 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 8000/21770] eta: 0:08:53 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 8100/21770] eta: 0:08:49 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 8200/21770] eta: 0:08:45 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 8300/21770] eta: 0:08:41 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 8400/21770] eta: 0:08:37 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 8500/21770] eta: 0:08:34 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 8600/21770] eta: 0:08:30 time: 0.0384 data: 0.0008 max mem: 33301 Test: [ 8700/21770] eta: 0:08:26 time: 0.0386 data: 0.0008 max mem: 33301 Test: [ 8800/21770] eta: 0:08:22 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 8900/21770] eta: 0:08:18 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 9000/21770] eta: 0:08:14 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 9100/21770] eta: 0:08:10 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 9200/21770] eta: 0:08:06 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 9300/21770] eta: 0:08:02 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 9400/21770] eta: 0:07:58 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 9500/21770] eta: 0:07:54 time: 0.0382 data: 0.0009 max mem: 
33301 Test: [ 9600/21770] eta: 0:07:50 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 9700/21770] eta: 0:07:47 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 9800/21770] eta: 0:07:43 time: 0.0394 data: 0.0008 max mem: 33301 Test: [ 9900/21770] eta: 0:07:39 time: 0.0392 data: 0.0008 max mem: 33301 Test: [10000/21770] eta: 0:07:35 time: 0.0395 data: 0.0008 max mem: 33301 Test: [10100/21770] eta: 0:07:31 time: 0.0395 data: 0.0008 max mem: 33301 Test: [10200/21770] eta: 0:07:28 time: 0.0391 data: 0.0008 max mem: 33301 Test: [10300/21770] eta: 0:07:24 time: 0.0402 data: 0.0008 max mem: 33301 Test: [10400/21770] eta: 0:07:20 time: 0.0396 data: 0.0008 max mem: 33301 Test: [10500/21770] eta: 0:07:16 time: 0.0400 data: 0.0008 max mem: 33301 Test: [10600/21770] eta: 0:07:13 time: 0.0398 data: 0.0008 max mem: 33301 Test: [10700/21770] eta: 0:07:09 time: 0.0392 data: 0.0008 max mem: 33301 Test: [10800/21770] eta: 0:07:05 time: 0.0391 data: 0.0008 max mem: 33301 Test: [10900/21770] eta: 0:07:01 time: 0.0401 data: 0.0008 max mem: 33301 Test: [11000/21770] eta: 0:06:57 time: 0.0394 data: 0.0008 max mem: 33301 Test: [11100/21770] eta: 0:06:53 time: 0.0395 data: 0.0008 max mem: 33301 Test: [11200/21770] eta: 0:06:50 time: 0.0391 data: 0.0008 max mem: 33301 Test: [11300/21770] eta: 0:06:46 time: 0.0399 data: 0.0008 max mem: 33301 Test: [11400/21770] eta: 0:06:42 time: 0.0377 data: 0.0009 max mem: 33301 Test: [11500/21770] eta: 0:06:38 time: 0.0378 data: 0.0009 max mem: 33301 Test: [11600/21770] eta: 0:06:34 time: 0.0378 data: 0.0009 max mem: 33301 Test: [11700/21770] eta: 0:06:30 time: 0.0379 data: 0.0009 max mem: 33301 Test: [11800/21770] eta: 0:06:26 time: 0.0378 data: 0.0009 max mem: 33301 Test: [11900/21770] eta: 0:06:22 time: 0.0385 data: 0.0009 max mem: 33301 Test: [12000/21770] eta: 0:06:18 time: 0.0386 data: 0.0009 max mem: 33301 Test: [12100/21770] eta: 0:06:14 time: 0.0387 data: 0.0009 max mem: 33301 Test: [12200/21770] eta: 0:06:11 time: 0.0386 data: 0.0009 max mem: 
33301 Test: [12300/21770] eta: 0:06:07 time: 0.0387 data: 0.0009 max mem: 33301 Test: [12400/21770] eta: 0:06:03 time: 0.0387 data: 0.0009 max mem: 33301 Test: [12500/21770] eta: 0:05:59 time: 0.0389 data: 0.0009 max mem: 33301 Test: [12600/21770] eta: 0:05:55 time: 0.0386 data: 0.0009 max mem: 33301 Test: [12700/21770] eta: 0:05:51 time: 0.0388 data: 0.0009 max mem: 33301 Test: [12800/21770] eta: 0:05:47 time: 0.0384 data: 0.0009 max mem: 33301 Test: [12900/21770] eta: 0:05:43 time: 0.0394 data: 0.0008 max mem: 33301 Test: [13000/21770] eta: 0:05:40 time: 0.0393 data: 0.0008 max mem: 33301 Test: [13100/21770] eta: 0:05:36 time: 0.0399 data: 0.0008 max mem: 33301 Test: [13200/21770] eta: 0:05:32 time: 0.0399 data: 0.0008 max mem: 33301 Test: [13300/21770] eta: 0:05:28 time: 0.0399 data: 0.0008 max mem: 33301 Test: [13400/21770] eta: 0:05:24 time: 0.0378 data: 0.0009 max mem: 33301 Test: [13500/21770] eta: 0:05:20 time: 0.0378 data: 0.0009 max mem: 33301 Test: [13600/21770] eta: 0:05:16 time: 0.0379 data: 0.0009 max mem: 33301 Test: [13700/21770] eta: 0:05:12 time: 0.0377 data: 0.0008 max mem: 33301 Test: [13800/21770] eta: 0:05:08 time: 0.0379 data: 0.0009 max mem: 33301 Test: [13900/21770] eta: 0:05:05 time: 0.0379 data: 0.0009 max mem: 33301 Test: [14000/21770] eta: 0:05:01 time: 0.0378 data: 0.0009 max mem: 33301 Test: [14100/21770] eta: 0:04:57 time: 0.0379 data: 0.0009 max mem: 33301 Test: [14200/21770] eta: 0:04:53 time: 0.0378 data: 0.0009 max mem: 33301 Test: [14300/21770] eta: 0:04:49 time: 0.0379 data: 0.0009 max mem: 33301 Test: [14400/21770] eta: 0:04:45 time: 0.0379 data: 0.0009 max mem: 33301 Test: [14500/21770] eta: 0:04:41 time: 0.0383 data: 0.0009 max mem: 33301 Test: [14600/21770] eta: 0:04:37 time: 0.0382 data: 0.0009 max mem: 33301 Test: [14700/21770] eta: 0:04:33 time: 0.0381 data: 0.0009 max mem: 33301 Test: [14800/21770] eta: 0:04:29 time: 0.0382 data: 0.0009 max mem: 33301 Test: [14900/21770] eta: 0:04:25 time: 0.0383 data: 0.0009 max mem: 
33301 Test: [15000/21770] eta: 0:04:22 time: 0.0382 data: 0.0009 max mem: 33301 Test: [15100/21770] eta: 0:04:18 time: 0.0382 data: 0.0009 max mem: 33301 Test: [15200/21770] eta: 0:04:14 time: 0.0381 data: 0.0009 max mem: 33301 Test: [15300/21770] eta: 0:04:10 time: 0.0382 data: 0.0009 max mem: 33301 Test: [15400/21770] eta: 0:04:06 time: 0.0381 data: 0.0009 max mem: 33301 Test: [15500/21770] eta: 0:04:02 time: 0.0383 data: 0.0009 max mem: 33301 Test: [15600/21770] eta: 0:03:58 time: 0.0385 data: 0.0009 max mem: 33301 Test: [15700/21770] eta: 0:03:54 time: 0.0388 data: 0.0009 max mem: 33301 Test: [15800/21770] eta: 0:03:50 time: 0.0384 data: 0.0009 max mem: 33301 Test: [15900/21770] eta: 0:03:47 time: 0.0386 data: 0.0009 max mem: 33301 Test: [16000/21770] eta: 0:03:43 time: 0.0387 data: 0.0009 max mem: 33301 Test: [16100/21770] eta: 0:03:39 time: 0.0390 data: 0.0009 max mem: 33301 Test: [16200/21770] eta: 0:03:35 time: 0.0386 data: 0.0009 max mem: 33301 Test: [16300/21770] eta: 0:03:31 time: 0.0389 data: 0.0009 max mem: 33301 Test: [16400/21770] eta: 0:03:27 time: 0.0386 data: 0.0009 max mem: 33301 Test: [16500/21770] eta: 0:03:23 time: 0.0389 data: 0.0009 max mem: 33301 Test: [16600/21770] eta: 0:03:20 time: 0.0387 data: 0.0009 max mem: 33301 Test: [16700/21770] eta: 0:03:16 time: 0.0389 data: 0.0009 max mem: 33301 Test: [16800/21770] eta: 0:03:12 time: 0.0384 data: 0.0009 max mem: 33301 Test: [16900/21770] eta: 0:03:08 time: 0.0387 data: 0.0009 max mem: 33301 Test: [17000/21770] eta: 0:03:04 time: 0.0379 data: 0.0009 max mem: 33301 Test: [17100/21770] eta: 0:03:00 time: 0.0381 data: 0.0009 max mem: 33301 Test: [17200/21770] eta: 0:02:56 time: 0.0381 data: 0.0009 max mem: 33301 Test: [17300/21770] eta: 0:02:52 time: 0.0383 data: 0.0009 max mem: 33301 Test: [17400/21770] eta: 0:02:49 time: 0.0381 data: 0.0009 max mem: 33301 Test: [17500/21770] eta: 0:02:45 time: 0.0383 data: 0.0009 max mem: 33301 Test: [17600/21770] eta: 0:02:41 time: 0.0381 data: 0.0009 max mem: 
33301 Test: [17700/21770] eta: 0:02:37 time: 0.0382 data: 0.0009 max mem: 33301 Test: [17800/21770] eta: 0:02:33 time: 0.0381 data: 0.0009 max mem: 33301 Test: [17900/21770] eta: 0:02:29 time: 0.0383 data: 0.0009 max mem: 33301 Test: [18000/21770] eta: 0:02:25 time: 0.0381 data: 0.0009 max mem: 33301 Test: [18100/21770] eta: 0:02:21 time: 0.0387 data: 0.0009 max mem: 33301 Test: [18200/21770] eta: 0:02:18 time: 0.0387 data: 0.0009 max mem: 33301 Test: [18300/21770] eta: 0:02:14 time: 0.0383 data: 0.0009 max mem: 33301 Test: [18400/21770] eta: 0:02:10 time: 0.0386 data: 0.0009 max mem: 33301 Test: [18500/21770] eta: 0:02:06 time: 0.0385 data: 0.0009 max mem: 33301 Test: [18600/21770] eta: 0:02:02 time: 0.0385 data: 0.0009 max mem: 33301 Test: [18700/21770] eta: 0:01:58 time: 0.0385 data: 0.0009 max mem: 33301 Test: [18800/21770] eta: 0:01:54 time: 0.0385 data: 0.0009 max mem: 33301 Test: [18900/21770] eta: 0:01:50 time: 0.0384 data: 0.0009 max mem: 33301 Test: [19000/21770] eta: 0:01:47 time: 0.0385 data: 0.0009 max mem: 33301 Test: [19100/21770] eta: 0:01:43 time: 0.0384 data: 0.0009 max mem: 33301 Test: [19200/21770] eta: 0:01:39 time: 0.0385 data: 0.0009 max mem: 33301 Test: [19300/21770] eta: 0:01:35 time: 0.0386 data: 0.0009 max mem: 33301 Test: [19400/21770] eta: 0:01:31 time: 0.0384 data: 0.0009 max mem: 33301 Test: [19500/21770] eta: 0:01:27 time: 0.0385 data: 0.0009 max mem: 33301 Test: [19600/21770] eta: 0:01:23 time: 0.0381 data: 0.0009 max mem: 33301 Test: [19700/21770] eta: 0:01:19 time: 0.0383 data: 0.0009 max mem: 33301 Test: [19800/21770] eta: 0:01:16 time: 0.0378 data: 0.0009 max mem: 33301 Test: [19900/21770] eta: 0:01:12 time: 0.0379 data: 0.0009 max mem: 33301 Test: [20000/21770] eta: 0:01:08 time: 0.0378 data: 0.0009 max mem: 33301 Test: [20100/21770] eta: 0:01:04 time: 0.0379 data: 0.0009 max mem: 33301 Test: [20200/21770] eta: 0:01:00 time: 0.0378 data: 0.0009 max mem: 33301 Test: [20300/21770] eta: 0:00:56 time: 0.0378 data: 0.0009 max mem: 
33301
Test: [20400/21770] eta: 0:00:52 time: 0.0378 data: 0.0009 max mem: 33301
Test: [20500/21770] eta: 0:00:49 time: 0.0378 data: 0.0009 max mem: 33301
Test: [20600/21770] eta: 0:00:45 time: 0.0378 data: 0.0009 max mem: 33301
Test: [20700/21770] eta: 0:00:41 time: 0.0378 data: 0.0009 max mem: 33301
Test: [20800/21770] eta: 0:00:37 time: 0.0379 data: 0.0009 max mem: 33301
Test: [20900/21770] eta: 0:00:33 time: 0.0379 data: 0.0009 max mem: 33301
Test: [21000/21770] eta: 0:00:29 time: 0.0378 data: 0.0009 max mem: 33301
Test: [21100/21770] eta: 0:00:25 time: 0.0378 data: 0.0009 max mem: 33301
Test: [21200/21770] eta: 0:00:21 time: 0.0379 data: 0.0009 max mem: 33301
Test: [21300/21770] eta: 0:00:18 time: 0.0378 data: 0.0009 max mem: 33301
Test: [21400/21770] eta: 0:00:14 time: 0.0378 data: 0.0009 max mem: 33301
Test: [21500/21770] eta: 0:00:10 time: 0.0378 data: 0.0009 max mem: 33301
Test: [21600/21770] eta: 0:00:06 time: 0.0379 data: 0.0009 max mem: 33301
Test: [21700/21770] eta: 0:00:02 time: 0.0378 data: 0.0009 max mem: 33301
Test: Total time: 0:13:59
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [26] [ 0/4276] eta: 5:59:27 lr: 1.943679772098259e-05 loss: 0.0954 (0.0954) time: 5.0438 data: 2.0314 max mem: 33301
Epoch: [26] [ 10/4276] eta: 3:42:51 lr: 1.9433875504577227e-05 loss: 0.0954 (0.0972) time: 3.1344 data: 0.1916 max mem: 33301
Epoch: [26] [ 20/4276] eta: 3:35:46 lr: 1.9430953239348233e-05 loss: 0.0918 (0.0969) time: 2.9418 data: 0.0071 max mem: 33301
Epoch: [26] [ 30/4276] eta: 3:32:35 lr: 1.942803092528663e-05 loss: 0.0967 (0.0989) time: 2.9327 data: 0.0076 max mem: 33301
Epoch: [26] [ 40/4276] eta: 3:31:04 lr: 1.9425108562383443e-05 loss: 0.0970 (0.0978) time: 2.9349 data: 0.0076 max mem: 33301
Epoch: [26] [ 50/4276] eta: 3:29:55 lr:
1.942218615062969e-05 loss: 0.0939 (0.0957) time: 2.9437 data: 0.0068 max mem: 33301 Epoch: [26] [ 60/4276] eta: 3:29:01 lr: 1.9419263690016394e-05 loss: 0.0845 (0.0941) time: 2.9437 data: 0.0070 max mem: 33301 Epoch: [26] [ 70/4276] eta: 3:28:12 lr: 1.9416341180534554e-05 loss: 0.0839 (0.0927) time: 2.9436 data: 0.0069 max mem: 33301 Epoch: [26] [ 80/4276] eta: 3:27:28 lr: 1.941341862217519e-05 loss: 0.0839 (0.0934) time: 2.9430 data: 0.0068 max mem: 33301 Epoch: [26] [ 90/4276] eta: 3:26:50 lr: 1.9410496014929304e-05 loss: 0.0912 (0.0929) time: 2.9456 data: 0.0068 max mem: 33301 Epoch: [26] [ 100/4276] eta: 3:26:10 lr: 1.9407573358787913e-05 loss: 0.0902 (0.0936) time: 2.9435 data: 0.0068 max mem: 33301 Epoch: [26] [ 110/4276] eta: 3:25:31 lr: 1.9404650653741994e-05 loss: 0.0915 (0.0949) time: 2.9383 data: 0.0067 max mem: 33301 Epoch: [26] [ 120/4276] eta: 3:24:54 lr: 1.9401727899782556e-05 loss: 0.1094 (0.0957) time: 2.9388 data: 0.0066 max mem: 33301 Epoch: [26] [ 130/4276] eta: 3:24:18 lr: 1.9398805096900598e-05 loss: 0.0982 (0.0960) time: 2.9394 data: 0.0066 max mem: 33301 Epoch: [26] [ 140/4276] eta: 3:23:42 lr: 1.9395882245087105e-05 loss: 0.0905 (0.0957) time: 2.9357 data: 0.0066 max mem: 33301 Epoch: [26] [ 150/4276] eta: 3:23:06 lr: 1.9392959344333062e-05 loss: 0.0905 (0.0957) time: 2.9331 data: 0.0066 max mem: 33301 Epoch: [26] [ 160/4276] eta: 3:22:33 lr: 1.9390036394629456e-05 loss: 0.0869 (0.0953) time: 2.9353 data: 0.0067 max mem: 33301 Epoch: [26] [ 170/4276] eta: 3:21:58 lr: 1.9387113395967264e-05 loss: 0.0839 (0.0951) time: 2.9338 data: 0.0068 max mem: 33301 Epoch: [26] [ 180/4276] eta: 3:21:26 lr: 1.938419034833748e-05 loss: 0.0900 (0.0957) time: 2.9351 data: 0.0067 max mem: 33301 Epoch: [26] [ 190/4276] eta: 3:20:53 lr: 1.938126725173106e-05 loss: 0.0900 (0.0958) time: 2.9389 data: 0.0065 max mem: 33301 Epoch: [26] [ 200/4276] eta: 3:20:19 lr: 1.9378344106138982e-05 loss: 0.0814 (0.0958) time: 2.9329 data: 0.0064 max mem: 33301 Epoch: [26] [ 
210/4276] eta: 3:19:45 lr: 1.9375420911552218e-05 loss: 0.0916 (0.0958) time: 2.9254 data: 0.0075 max mem: 33301 Epoch: [26] [ 220/4276] eta: 3:19:14 lr: 1.9372497667961735e-05 loss: 0.0965 (0.0958) time: 2.9316 data: 0.0079 max mem: 33301 Epoch: [26] [ 230/4276] eta: 3:18:39 lr: 1.9369574375358484e-05 loss: 0.0877 (0.0955) time: 2.9287 data: 0.0074 max mem: 33301 Epoch: [26] [ 240/4276] eta: 3:18:08 lr: 1.936665103373343e-05 loss: 0.0877 (0.0955) time: 2.9270 data: 0.0072 max mem: 33301 Epoch: [26] [ 250/4276] eta: 3:17:37 lr: 1.9363727643077533e-05 loss: 0.1046 (0.0970) time: 2.9360 data: 0.0071 max mem: 33301 Epoch: [26] [ 260/4276] eta: 3:17:06 lr: 1.9360804203381745e-05 loss: 0.1046 (0.0971) time: 2.9364 data: 0.0070 max mem: 33301 Epoch: [26] [ 270/4276] eta: 3:16:34 lr: 1.9357880714637007e-05 loss: 0.0924 (0.0975) time: 2.9294 data: 0.0071 max mem: 33301 Epoch: [26] [ 280/4276] eta: 3:16:04 lr: 1.9354957176834275e-05 loss: 0.1028 (0.0976) time: 2.9307 data: 0.0072 max mem: 33301 Epoch: [26] [ 290/4276] eta: 3:15:33 lr: 1.935203358996449e-05 loss: 0.0917 (0.0976) time: 2.9392 data: 0.0071 max mem: 33301 Epoch: [26] [ 300/4276] eta: 3:15:03 lr: 1.934910995401858e-05 loss: 0.0887 (0.0974) time: 2.9390 data: 0.0072 max mem: 33301 Epoch: [26] [ 310/4276] eta: 3:14:33 lr: 1.93461862689875e-05 loss: 0.0892 (0.0973) time: 2.9397 data: 0.0070 max mem: 33301 Epoch: [26] [ 320/4276] eta: 3:14:03 lr: 1.9343262534862168e-05 loss: 0.0951 (0.0975) time: 2.9391 data: 0.0069 max mem: 33301 Epoch: [26] [ 330/4276] eta: 3:13:33 lr: 1.9340338751633526e-05 loss: 0.1005 (0.0978) time: 2.9385 data: 0.0072 max mem: 33301 Epoch: [26] [ 340/4276] eta: 3:13:04 lr: 1.9337414919292495e-05 loss: 0.0993 (0.0979) time: 2.9392 data: 0.0071 max mem: 33301 Epoch: [26] [ 350/4276] eta: 3:12:36 lr: 1.9334491037829997e-05 loss: 0.0993 (0.0980) time: 2.9480 data: 0.0072 max mem: 33301 Epoch: [26] [ 360/4276] eta: 3:12:06 lr: 1.9331567107236956e-05 loss: 0.1054 (0.0985) time: 2.9482 data: 0.0069 
max mem: 33301 Epoch: [26] [ 370/4276] eta: 3:11:36 lr: 1.9328643127504294e-05 loss: 0.1007 (0.0986) time: 2.9383 data: 0.0066 max mem: 33301 Epoch: [26] [ 380/4276] eta: 3:11:06 lr: 1.9325719098622915e-05 loss: 0.0984 (0.0987) time: 2.9387 data: 0.0065 max mem: 33301 Epoch: [26] [ 390/4276] eta: 3:10:35 lr: 1.9322795020583738e-05 loss: 0.1019 (0.0988) time: 2.9347 data: 0.0064 max mem: 33301 Epoch: [26] [ 400/4276] eta: 3:10:02 lr: 1.9319870893377663e-05 loss: 0.1044 (0.0989) time: 2.9192 data: 0.0071 max mem: 33301 Epoch: [26] [ 410/4276] eta: 3:09:33 lr: 1.9316946716995606e-05 loss: 0.1044 (0.0992) time: 2.9242 data: 0.0073 max mem: 33301 Epoch: [26] [ 420/4276] eta: 3:09:03 lr: 1.9314022491428457e-05 loss: 0.1076 (0.0995) time: 2.9378 data: 0.0069 max mem: 33301 Epoch: [26] [ 430/4276] eta: 3:08:33 lr: 1.931109821666712e-05 loss: 0.1076 (0.0997) time: 2.9351 data: 0.0066 max mem: 33301 Epoch: [26] [ 440/4276] eta: 3:08:04 lr: 1.9308173892702486e-05 loss: 0.0968 (0.0997) time: 2.9393 data: 0.0071 max mem: 33301 Epoch: [26] [ 450/4276] eta: 3:07:34 lr: 1.930524951952546e-05 loss: 0.0932 (0.0997) time: 2.9411 data: 0.0072 max mem: 33301 Epoch: [26] [ 460/4276] eta: 3:07:04 lr: 1.930232509712691e-05 loss: 0.0888 (0.0993) time: 2.9343 data: 0.0067 max mem: 33301 Epoch: [26] [ 470/4276] eta: 3:06:34 lr: 1.929940062549773e-05 loss: 0.0880 (0.0992) time: 2.9341 data: 0.0067 max mem: 33301 Epoch: [26] [ 480/4276] eta: 3:06:04 lr: 1.9296476104628808e-05 loss: 0.0843 (0.0988) time: 2.9355 data: 0.0067 max mem: 33301 Epoch: [26] [ 490/4276] eta: 3:05:34 lr: 1.9293551534511024e-05 loss: 0.0815 (0.0985) time: 2.9343 data: 0.0066 max mem: 33301 Epoch: [26] [ 500/4276] eta: 3:05:04 lr: 1.929062691513524e-05 loss: 0.0843 (0.0982) time: 2.9358 data: 0.0066 max mem: 33301 Epoch: [26] [ 510/4276] eta: 3:04:34 lr: 1.9287702246492338e-05 loss: 0.0899 (0.0984) time: 2.9345 data: 0.0067 max mem: 33301 Epoch: [26] [ 520/4276] eta: 3:04:04 lr: 1.9284777528573183e-05 loss: 0.0994 
(0.0984) time: 2.9344 data: 0.0072 max mem: 33301 Epoch: [26] [ 530/4276] eta: 3:03:34 lr: 1.928185276136865e-05 loss: 0.0947 (0.0983) time: 2.9342 data: 0.0072 max mem: 33301 Epoch: [26] [ 540/4276] eta: 3:03:05 lr: 1.9278927944869593e-05 loss: 0.0879 (0.0982) time: 2.9346 data: 0.0070 max mem: 33301 Epoch: [26] [ 550/4276] eta: 3:02:35 lr: 1.927600307906687e-05 loss: 0.0979 (0.0984) time: 2.9371 data: 0.0067 max mem: 33301 Epoch: [26] [ 560/4276] eta: 3:02:06 lr: 1.927307816395134e-05 loss: 0.1048 (0.0984) time: 2.9405 data: 0.0067 max mem: 33301 Epoch: [26] [ 570/4276] eta: 3:01:37 lr: 1.9270153199513866e-05 loss: 0.0974 (0.0983) time: 2.9434 data: 0.0068 max mem: 33301 Epoch: [26] [ 580/4276] eta: 3:01:07 lr: 1.926722818574528e-05 loss: 0.0974 (0.0986) time: 2.9409 data: 0.0066 max mem: 33301 Epoch: [26] [ 590/4276] eta: 3:00:36 lr: 1.926430312263644e-05 loss: 0.0908 (0.0985) time: 2.9256 data: 0.0074 max mem: 33301 Epoch: [26] [ 600/4276] eta: 3:00:03 lr: 1.926137801017819e-05 loss: 0.0889 (0.0984) time: 2.8966 data: 0.0079 max mem: 33301 Epoch: [26] [ 610/4276] eta: 2:59:32 lr: 1.925845284836136e-05 loss: 0.0879 (0.0982) time: 2.8949 data: 0.0077 max mem: 33301 Epoch: [26] [ 620/4276] eta: 2:59:02 lr: 1.9255527637176793e-05 loss: 0.0847 (0.0981) time: 2.9232 data: 0.0077 max mem: 33301 Epoch: [26] [ 630/4276] eta: 2:58:32 lr: 1.9252602376615326e-05 loss: 0.0926 (0.0983) time: 2.9288 data: 0.0076 max mem: 33301 Epoch: [26] [ 640/4276] eta: 2:58:03 lr: 1.9249677066667788e-05 loss: 0.1036 (0.0983) time: 2.9294 data: 0.0078 max mem: 33301 Epoch: [26] [ 650/4276] eta: 2:57:33 lr: 1.9246751707324996e-05 loss: 0.0900 (0.0983) time: 2.9371 data: 0.0078 max mem: 33301 Epoch: [26] [ 660/4276] eta: 2:57:04 lr: 1.9243826298577787e-05 loss: 0.1021 (0.0986) time: 2.9371 data: 0.0078 max mem: 33301 Epoch: [26] [ 670/4276] eta: 2:56:34 lr: 1.9240900840416974e-05 loss: 0.0964 (0.0985) time: 2.9385 data: 0.0076 max mem: 33301 Epoch: [26] [ 680/4276] eta: 2:56:05 lr: 
1.9237975332833383e-05 loss: 0.0909 (0.0984) time: 2.9373 data: 0.0078 max mem: 33301 Epoch: [26] [ 690/4276] eta: 2:55:35 lr: 1.9235049775817814e-05 loss: 0.0975 (0.0985) time: 2.9361 data: 0.0079 max mem: 33301 Epoch: [26] [ 700/4276] eta: 2:55:06 lr: 1.9232124169361084e-05 loss: 0.0929 (0.0984) time: 2.9365 data: 0.0076 max mem: 33301 Epoch: [26] [ 710/4276] eta: 2:54:36 lr: 1.9229198513454005e-05 loss: 0.0916 (0.0984) time: 2.9368 data: 0.0075 max mem: 33301 Epoch: [26] [ 720/4276] eta: 2:54:07 lr: 1.9226272808087385e-05 loss: 0.0943 (0.0983) time: 2.9366 data: 0.0077 max mem: 33301 Epoch: [26] [ 730/4276] eta: 2:53:36 lr: 1.9223347053252007e-05 loss: 0.0889 (0.0982) time: 2.9238 data: 0.0076 max mem: 33301 Epoch: [26] [ 740/4276] eta: 2:53:07 lr: 1.9220421248938684e-05 loss: 0.0898 (0.0981) time: 2.9208 data: 0.0077 max mem: 33301 Epoch: [26] [ 750/4276] eta: 2:52:37 lr: 1.9217495395138203e-05 loss: 0.0900 (0.0981) time: 2.9274 data: 0.0076 max mem: 33301 Epoch: [26] [ 760/4276] eta: 2:52:07 lr: 1.9214569491841362e-05 loss: 0.0955 (0.0980) time: 2.9290 data: 0.0074 max mem: 33301 Epoch: [26] [ 770/4276] eta: 2:51:38 lr: 1.9211643539038942e-05 loss: 0.0895 (0.0980) time: 2.9357 data: 0.0076 max mem: 33301 Epoch: [26] [ 780/4276] eta: 2:51:08 lr: 1.9208717536721734e-05 loss: 0.0892 (0.0979) time: 2.9294 data: 0.0073 max mem: 33301 Epoch: [26] [ 790/4276] eta: 2:50:38 lr: 1.9205791484880508e-05 loss: 0.0892 (0.0980) time: 2.9245 data: 0.0068 max mem: 33301 Epoch: [26] [ 800/4276] eta: 2:50:08 lr: 1.9202865383506062e-05 loss: 0.0938 (0.0980) time: 2.9225 data: 0.0067 max mem: 33301 Epoch: [26] [ 810/4276] eta: 2:49:37 lr: 1.9199939232589148e-05 loss: 0.0938 (0.0980) time: 2.9172 data: 0.0074 max mem: 33301 Epoch: [26] [ 820/4276] eta: 2:49:07 lr: 1.919701303212055e-05 loss: 0.1042 (0.0982) time: 2.9134 data: 0.0079 max mem: 33301 Epoch: [26] [ 830/4276] eta: 2:48:38 lr: 1.9194086782091038e-05 loss: 0.1045 (0.0982) time: 2.9256 data: 0.0077 max mem: 33301 Epoch: 
[26] [ 840/4276] eta: 2:48:09 lr: 1.9191160482491374e-05 loss: 0.1047 (0.0984) time: 2.9425 data: 0.0072 max mem: 33301 Epoch: [26] [ 850/4276] eta: 2:47:40 lr: 1.9188234133312313e-05 loss: 0.0955 (0.0983) time: 2.9427 data: 0.0069 max mem: 33301 Epoch: [26] [ 860/4276] eta: 2:47:10 lr: 1.918530773454462e-05 loss: 0.0978 (0.0984) time: 2.9378 data: 0.0068 max mem: 33301 Epoch: [26] [ 870/4276] eta: 2:46:41 lr: 1.9182381286179052e-05 loss: 0.0931 (0.0983) time: 2.9345 data: 0.0065 max mem: 33301 Epoch: [26] [ 880/4276] eta: 2:46:12 lr: 1.9179454788206362e-05 loss: 0.0897 (0.0983) time: 2.9394 data: 0.0067 max mem: 33301 Epoch: [26] [ 890/4276] eta: 2:45:42 lr: 1.9176528240617288e-05 loss: 0.1066 (0.0986) time: 2.9396 data: 0.0069 max mem: 33301 Epoch: [26] [ 900/4276] eta: 2:45:13 lr: 1.9173601643402583e-05 loss: 0.1066 (0.0985) time: 2.9342 data: 0.0069 max mem: 33301 Epoch: [26] [ 910/4276] eta: 2:44:43 lr: 1.9170674996552994e-05 loss: 0.0948 (0.0987) time: 2.9362 data: 0.0068 max mem: 33301 Epoch: [26] [ 920/4276] eta: 2:44:14 lr: 1.9167748300059247e-05 loss: 0.1058 (0.0987) time: 2.9360 data: 0.0071 max mem: 33301 Epoch: [26] [ 930/4276] eta: 2:43:45 lr: 1.9164821553912084e-05 loss: 0.0947 (0.0987) time: 2.9342 data: 0.0073 max mem: 33301 Epoch: [26] [ 940/4276] eta: 2:43:14 lr: 1.9161894758102234e-05 loss: 0.0904 (0.0987) time: 2.9114 data: 0.0073 max mem: 33301 Epoch: [26] [ 950/4276] eta: 2:42:42 lr: 1.9158967912620433e-05 loss: 0.1003 (0.0988) time: 2.8851 data: 0.0072 max mem: 33301 Epoch: [26] [ 960/4276] eta: 2:42:11 lr: 1.9156041017457396e-05 loss: 0.1027 (0.0989) time: 2.8780 data: 0.0072 max mem: 33301 Epoch: [26] [ 970/4276] eta: 2:41:41 lr: 1.9153114072603852e-05 loss: 0.1014 (0.0989) time: 2.8989 data: 0.0075 max mem: 33301 Epoch: [26] [ 980/4276] eta: 2:41:12 lr: 1.9150187078050518e-05 loss: 0.1011 (0.0989) time: 2.9310 data: 0.0079 max mem: 33301 Epoch: [26] [ 990/4276] eta: 2:40:43 lr: 1.9147260033788117e-05 loss: 0.0945 (0.0989) time: 2.9378 
data: 0.0080 max mem: 33301 Epoch: [26] [1000/4276] eta: 2:40:13 lr: 1.9144332939807348e-05 loss: 0.0944 (0.0989) time: 2.9374 data: 0.0083 max mem: 33301 Epoch: [26] [1010/4276] eta: 2:39:44 lr: 1.914140579609892e-05 loss: 0.0967 (0.0988) time: 2.9379 data: 0.0081 max mem: 33301 Epoch: [26] [1020/4276] eta: 2:39:15 lr: 1.913847860265355e-05 loss: 0.0963 (0.0989) time: 2.9388 data: 0.0079 max mem: 33301 Epoch: [26] [1030/4276] eta: 2:38:46 lr: 1.913555135946194e-05 loss: 0.1050 (0.0990) time: 2.9384 data: 0.0082 max mem: 33301 Epoch: [26] [1040/4276] eta: 2:38:15 lr: 1.9132624066514776e-05 loss: 0.1050 (0.0990) time: 2.9087 data: 0.0078 max mem: 33301 Epoch: [26] [1050/4276] eta: 2:37:44 lr: 1.912969672380276e-05 loss: 0.0928 (0.0991) time: 2.8931 data: 0.0074 max mem: 33301 Epoch: [26] [1060/4276] eta: 2:37:15 lr: 1.9126769331316585e-05 loss: 0.0983 (0.0991) time: 2.9208 data: 0.0075 max mem: 33301 Epoch: [26] [1070/4276] eta: 2:36:46 lr: 1.9123841889046944e-05 loss: 0.1027 (0.0992) time: 2.9347 data: 0.0071 max mem: 33301 Epoch: [26] [1080/4276] eta: 2:36:17 lr: 1.912091439698452e-05 loss: 0.1094 (0.0992) time: 2.9388 data: 0.0069 max mem: 33301 Epoch: [26] [1090/4276] eta: 2:35:46 lr: 1.9117986855119985e-05 loss: 0.1017 (0.0994) time: 2.9206 data: 0.0074 max mem: 33301 Epoch: [26] [1100/4276] eta: 2:35:17 lr: 1.9115059263444032e-05 loss: 0.0985 (0.0995) time: 2.9136 data: 0.0075 max mem: 33301 Epoch: [26] [1110/4276] eta: 2:34:48 lr: 1.9112131621947336e-05 loss: 0.1041 (0.0996) time: 2.9334 data: 0.0069 max mem: 33301 Epoch: [26] [1120/4276] eta: 2:34:18 lr: 1.9109203930620555e-05 loss: 0.1022 (0.0996) time: 2.9363 data: 0.0066 max mem: 33301 Epoch: [26] [1130/4276] eta: 2:33:49 lr: 1.9106276189454373e-05 loss: 0.0846 (0.0995) time: 2.9361 data: 0.0068 max mem: 33301 Epoch: [26] [1140/4276] eta: 2:33:20 lr: 1.910334839843945e-05 loss: 0.0898 (0.0995) time: 2.9362 data: 0.0067 max mem: 33301 Epoch: [26] [1150/4276] eta: 2:32:50 lr: 1.9100420557566447e-05 loss: 
0.1004 (0.0995) time: 2.9174 data: 0.0070 max mem: 33301 Epoch: [26] [1160/4276] eta: 2:32:20 lr: 1.9097492666826027e-05 loss: 0.0924 (0.0995) time: 2.9085 data: 0.0074 max mem: 33301 Epoch: [26] [1170/4276] eta: 2:31:51 lr: 1.9094564726208837e-05 loss: 0.0903 (0.0995) time: 2.9269 data: 0.0072 max mem: 33301 Epoch: [26] [1180/4276] eta: 2:31:21 lr: 1.9091636735705537e-05 loss: 0.0902 (0.0995) time: 2.9246 data: 0.0072 max mem: 33301 Epoch: [26] [1190/4276] eta: 2:30:51 lr: 1.908870869530678e-05 loss: 0.0828 (0.0994) time: 2.9177 data: 0.0070 max mem: 33301 Epoch: [26] [1200/4276] eta: 2:30:21 lr: 1.9085780605003198e-05 loss: 0.0828 (0.0993) time: 2.9209 data: 0.0072 max mem: 33301 Epoch: [26] [1210/4276] eta: 2:29:52 lr: 1.908285246478544e-05 loss: 0.0822 (0.0992) time: 2.9268 data: 0.0072 max mem: 33301 Epoch: [26] [1220/4276] eta: 2:29:23 lr: 1.907992427464415e-05 loss: 0.0953 (0.0993) time: 2.9418 data: 0.0066 max mem: 33301 Epoch: [26] [1230/4276] eta: 2:28:53 lr: 1.907699603456996e-05 loss: 0.0956 (0.0993) time: 2.9222 data: 0.0072 max mem: 33301 Epoch: [26] [1240/4276] eta: 2:28:23 lr: 1.9074067744553494e-05 loss: 0.0868 (0.0992) time: 2.9057 data: 0.0078 max mem: 33301 Epoch: [26] [1250/4276] eta: 2:27:53 lr: 1.907113940458539e-05 loss: 0.0860 (0.0993) time: 2.9027 data: 0.0080 max mem: 33301 Epoch: [26] [1260/4276] eta: 2:27:23 lr: 1.9068211014656277e-05 loss: 0.0932 (0.0993) time: 2.8914 data: 0.0085 max mem: 33301 Epoch: [26] [1270/4276] eta: 2:26:53 lr: 1.9065282574756764e-05 loss: 0.0932 (0.0992) time: 2.9022 data: 0.0085 max mem: 33301 Epoch: [26] [1280/4276] eta: 2:26:23 lr: 1.906235408487748e-05 loss: 0.0923 (0.0992) time: 2.9232 data: 0.0076 max mem: 33301 Epoch: [26] [1290/4276] eta: 2:25:54 lr: 1.9059425545009036e-05 loss: 0.1055 (0.0992) time: 2.9362 data: 0.0072 max mem: 33301 Epoch: [26] [1300/4276] eta: 2:25:25 lr: 1.905649695514205e-05 loss: 0.0889 (0.0992) time: 2.9383 data: 0.0070 max mem: 33301 Epoch: [26] [1310/4276] eta: 2:24:56 lr: 
1.905356831526712e-05 loss: 0.0805 (0.0991) time: 2.9458 data: 0.0070 max mem: 33301 Epoch: [26] [1320/4276] eta: 2:24:27 lr: 1.9050639625374857e-05 loss: 0.0843 (0.0991) time: 2.9539 data: 0.0074 max mem: 33301 Epoch: [26] [1330/4276] eta: 2:23:58 lr: 1.9047710885455867e-05 loss: 0.0981 (0.0990) time: 2.9514 data: 0.0075 max mem: 33301 Epoch: [26] [1340/4276] eta: 2:23:29 lr: 1.9044782095500747e-05 loss: 0.0816 (0.0990) time: 2.9441 data: 0.0072 max mem: 33301 Epoch: [26] [1350/4276] eta: 2:23:00 lr: 1.904185325550008e-05 loss: 0.0897 (0.0990) time: 2.9424 data: 0.0068 max mem: 33301 Epoch: [26] [1360/4276] eta: 2:22:31 lr: 1.903892436544447e-05 loss: 0.0980 (0.0990) time: 2.9417 data: 0.0068 max mem: 33301 Epoch: [26] [1370/4276] eta: 2:22:02 lr: 1.9035995425324504e-05 loss: 0.0847 (0.0989) time: 2.9410 data: 0.0071 max mem: 33301 Epoch: [26] [1380/4276] eta: 2:21:33 lr: 1.9033066435130772e-05 loss: 0.0948 (0.0990) time: 2.9409 data: 0.0070 max mem: 33301 Epoch: [26] [1390/4276] eta: 2:21:04 lr: 1.903013739485384e-05 loss: 0.1004 (0.0991) time: 2.9392 data: 0.0069 max mem: 33301 Epoch: [26] [1400/4276] eta: 2:20:34 lr: 1.9027208304484297e-05 loss: 0.1069 (0.0991) time: 2.9337 data: 0.0070 max mem: 33301 Epoch: [26] [1410/4276] eta: 2:20:05 lr: 1.9024279164012717e-05 loss: 0.0944 (0.0990) time: 2.9354 data: 0.0076 max mem: 33301 Epoch: [26] [1420/4276] eta: 2:19:36 lr: 1.9021349973429675e-05 loss: 0.0869 (0.0990) time: 2.9413 data: 0.0076 max mem: 33301 Epoch: [26] [1430/4276] eta: 2:19:07 lr: 1.901842073272573e-05 loss: 0.0869 (0.0990) time: 2.9410 data: 0.0072 max mem: 33301 Epoch: [26] [1440/4276] eta: 2:18:37 lr: 1.9015491441891454e-05 loss: 0.0955 (0.0989) time: 2.9401 data: 0.0070 max mem: 33301 Epoch: [26] [1450/4276] eta: 2:18:08 lr: 1.9012562100917404e-05 loss: 0.0955 (0.0989) time: 2.9388 data: 0.0071 max mem: 33301 Epoch: [26] [1460/4276] eta: 2:17:39 lr: 1.900963270979414e-05 loss: 0.0906 (0.0988) time: 2.9390 data: 0.0073 max mem: 33301 Epoch: [26] 
[1470/4276] eta: 2:17:10 lr: 1.9006703268512216e-05 loss: 0.0827 (0.0987) time: 2.9396 data: 0.0073 max mem: 33301 Epoch: [26] [1480/4276] eta: 2:16:41 lr: 1.900377377706218e-05 loss: 0.0895 (0.0987) time: 2.9406 data: 0.0072 max mem: 33301 Epoch: [26] [1490/4276] eta: 2:16:11 lr: 1.9000844235434586e-05 loss: 0.0910 (0.0986) time: 2.9418 data: 0.0074 max mem: 33301 Epoch: [26] [1500/4276] eta: 2:15:42 lr: 1.899791464361998e-05 loss: 0.0875 (0.0986) time: 2.9426 data: 0.0077 max mem: 33301 Epoch: [26] [1510/4276] eta: 2:15:13 lr: 1.899498500160889e-05 loss: 0.0799 (0.0985) time: 2.9414 data: 0.0077 max mem: 33301 Epoch: [26] [1520/4276] eta: 2:14:43 lr: 1.8992055309391863e-05 loss: 0.0813 (0.0984) time: 2.9245 data: 0.0074 max mem: 33301 Epoch: [26] [1530/4276] eta: 2:14:14 lr: 1.8989125566959436e-05 loss: 0.0884 (0.0984) time: 2.9182 data: 0.0076 max mem: 33301 Epoch: [26] [1540/4276] eta: 2:13:44 lr: 1.8986195774302132e-05 loss: 0.0901 (0.0983) time: 2.9130 data: 0.0081 max mem: 33301 Epoch: [26] [1550/4276] eta: 2:13:14 lr: 1.8983265931410483e-05 loss: 0.0901 (0.0983) time: 2.9072 data: 0.0078 max mem: 33301 Epoch: [26] [1560/4276] eta: 2:12:45 lr: 1.8980336038275007e-05 loss: 0.0847 (0.0982) time: 2.9288 data: 0.0076 max mem: 33301 Epoch: [26] [1570/4276] eta: 2:12:16 lr: 1.8977406094886235e-05 loss: 0.0881 (0.0982) time: 2.9409 data: 0.0072 max mem: 33301 Epoch: [26] [1580/4276] eta: 2:11:47 lr: 1.8974476101234675e-05 loss: 0.0825 (0.0981) time: 2.9404 data: 0.0068 max mem: 33301 Epoch: [26] [1590/4276] eta: 2:11:18 lr: 1.897154605731084e-05 loss: 0.0876 (0.0981) time: 2.9400 data: 0.0068 max mem: 33301 Epoch: [26] [1600/4276] eta: 2:10:48 lr: 1.8968615963105248e-05 loss: 0.1005 (0.0981) time: 2.9400 data: 0.0069 max mem: 33301 Epoch: [26] [1610/4276] eta: 2:10:19 lr: 1.8965685818608405e-05 loss: 0.0870 (0.0981) time: 2.9404 data: 0.0069 max mem: 33301 Epoch: [26] [1620/4276] eta: 2:09:50 lr: 1.8962755623810805e-05 loss: 0.0858 (0.0981) time: 2.9329 data: 
0.0071 max mem: 33301 Epoch: [26] [1630/4276] eta: 2:09:20 lr: 1.895982537870295e-05 loss: 0.0862 (0.0981) time: 2.9079 data: 0.0079 max mem: 33301 Epoch: [26] [1640/4276] eta: 2:08:50 lr: 1.8956895083275344e-05 loss: 0.0862 (0.0980) time: 2.8951 data: 0.0077 max mem: 33301 Epoch: [26] [1650/4276] eta: 2:08:21 lr: 1.8953964737518482e-05 loss: 0.0787 (0.0979) time: 2.9275 data: 0.0078 max mem: 33301 Epoch: [26] [1660/4276] eta: 2:07:51 lr: 1.8951034341422845e-05 loss: 0.0837 (0.0979) time: 2.9322 data: 0.0085 max mem: 33301 Epoch: [26] [1670/4276] eta: 2:07:22 lr: 1.8948103894978916e-05 loss: 0.0851 (0.0978) time: 2.9233 data: 0.0081 max mem: 33301 Epoch: [26] [1680/4276] eta: 2:06:53 lr: 1.8945173398177187e-05 loss: 0.0860 (0.0978) time: 2.9382 data: 0.0075 max mem: 33301 Epoch: [26] [1690/4276] eta: 2:06:23 lr: 1.894224285100814e-05 loss: 0.0860 (0.0978) time: 2.9391 data: 0.0072 max mem: 33301 Epoch: [26] [1700/4276] eta: 2:05:54 lr: 1.893931225346224e-05 loss: 0.1008 (0.0978) time: 2.9397 data: 0.0074 max mem: 33301 Epoch: [26] [1710/4276] eta: 2:05:25 lr: 1.893638160552996e-05 loss: 0.0978 (0.0978) time: 2.9281 data: 0.0077 max mem: 33301 Epoch: [26] [1720/4276] eta: 2:04:55 lr: 1.893345090720178e-05 loss: 0.0894 (0.0978) time: 2.9025 data: 0.0078 max mem: 33301 Epoch: [26] [1730/4276] eta: 2:04:25 lr: 1.893052015846816e-05 loss: 0.0849 (0.0977) time: 2.9109 data: 0.0080 max mem: 33301 Epoch: [26] [1740/4276] eta: 2:03:56 lr: 1.892758935931956e-05 loss: 0.0798 (0.0976) time: 2.9353 data: 0.0079 max mem: 33301 Epoch: [26] [1750/4276] eta: 2:03:27 lr: 1.8924658509746437e-05 loss: 0.0806 (0.0976) time: 2.9404 data: 0.0079 max mem: 33301 Epoch: [26] [1760/4276] eta: 2:02:58 lr: 1.8921727609739248e-05 loss: 0.0812 (0.0976) time: 2.9411 data: 0.0078 max mem: 33301 Epoch: [26] [1770/4276] eta: 2:02:29 lr: 1.8918796659288455e-05 loss: 0.0824 (0.0976) time: 2.9417 data: 0.0080 max mem: 33301 Epoch: [26] [1780/4276] eta: 2:01:59 lr: 1.891586565838449e-05 loss: 0.0894 
(0.0976) time: 2.9422 data: 0.0079 max mem: 33301 Epoch: [26] [1790/4276] eta: 2:01:30 lr: 1.8912934607017803e-05 loss: 0.0875 (0.0976) time: 2.9420 data: 0.0075 max mem: 33301 Epoch: [26] [1800/4276] eta: 2:01:01 lr: 1.891000350517884e-05 loss: 0.0875 (0.0975) time: 2.9416 data: 0.0075 max mem: 33301 Epoch: [26] [1810/4276] eta: 2:00:32 lr: 1.8907072352858043e-05 loss: 0.0960 (0.0976) time: 2.9417 data: 0.0076 max mem: 33301 Epoch: [26] [1820/4276] eta: 2:00:03 lr: 1.890414115004584e-05 loss: 0.0947 (0.0976) time: 2.9444 data: 0.0077 max mem: 33301 Epoch: [26] [1830/4276] eta: 1:59:33 lr: 1.8901209896732652e-05 loss: 0.0896 (0.0975) time: 2.9434 data: 0.0078 max mem: 33301 Epoch: [26] [1840/4276] eta: 1:59:04 lr: 1.8898278592908926e-05 loss: 0.0818 (0.0975) time: 2.9409 data: 0.0076 max mem: 33301 Epoch: [26] [1850/4276] eta: 1:58:35 lr: 1.889534723856507e-05 loss: 0.0855 (0.0975) time: 2.9411 data: 0.0076 max mem: 33301 Epoch: [26] [1860/4276] eta: 1:58:06 lr: 1.8892415833691518e-05 loss: 0.0905 (0.0975) time: 2.9402 data: 0.0075 max mem: 33301 Epoch: [26] [1870/4276] eta: 1:57:36 lr: 1.8889484378278676e-05 loss: 0.0886 (0.0976) time: 2.9403 data: 0.0075 max mem: 33301 Epoch: [26] [1880/4276] eta: 1:57:07 lr: 1.8886552872316967e-05 loss: 0.0835 (0.0975) time: 2.9404 data: 0.0075 max mem: 33301 Epoch: [26] [1890/4276] eta: 1:56:38 lr: 1.8883621315796794e-05 loss: 0.0852 (0.0975) time: 2.9407 data: 0.0077 max mem: 33301 Epoch: [26] [1900/4276] eta: 1:56:09 lr: 1.8880689708708564e-05 loss: 0.0852 (0.0975) time: 2.9427 data: 0.0078 max mem: 33301 Epoch: [26] [1910/4276] eta: 1:55:40 lr: 1.8877758051042686e-05 loss: 0.0851 (0.0975) time: 2.9434 data: 0.0075 max mem: 33301 Epoch: [26] [1920/4276] eta: 1:55:10 lr: 1.887482634278956e-05 loss: 0.0829 (0.0975) time: 2.9433 data: 0.0075 max mem: 33301 Epoch: [26] [1930/4276] eta: 1:54:41 lr: 1.8871894583939576e-05 loss: 0.0822 (0.0975) time: 2.9424 data: 0.0076 max mem: 33301 Epoch: [26] [1940/4276] eta: 1:54:12 lr: 
1.8868962774483126e-05 loss: 0.0907 (0.0975) time: 2.9403 data: 0.0079 max mem: 33301 Epoch: [26] [1950/4276] eta: 1:53:43 lr: 1.8866030914410607e-05 loss: 0.1054 (0.0976) time: 2.9396 data: 0.0078 max mem: 33301 Epoch: [26] [1960/4276] eta: 1:53:13 lr: 1.8863099003712403e-05 loss: 0.1027 (0.0976) time: 2.9399 data: 0.0075 max mem: 33301 Epoch: [26] [1970/4276] eta: 1:52:44 lr: 1.8860167042378894e-05 loss: 0.0834 (0.0976) time: 2.9417 data: 0.0077 max mem: 33301 Epoch: [26] [1980/4276] eta: 1:52:15 lr: 1.8857235030400455e-05 loss: 0.0823 (0.0975) time: 2.9448 data: 0.0077 max mem: 33301 Epoch: [26] [1990/4276] eta: 1:51:46 lr: 1.8854302967767467e-05 loss: 0.0853 (0.0975) time: 2.9437 data: 0.0076 max mem: 33301 Epoch: [26] [2000/4276] eta: 1:51:17 lr: 1.8851370854470308e-05 loss: 0.0931 (0.0975) time: 2.9440 data: 0.0080 max mem: 33301 Epoch: [26] [2010/4276] eta: 1:50:47 lr: 1.8848438690499332e-05 loss: 0.0943 (0.0975) time: 2.9304 data: 0.0080 max mem: 33301 Epoch: [26] [2020/4276] eta: 1:50:18 lr: 1.8845506475844913e-05 loss: 0.1014 (0.0975) time: 2.9296 data: 0.0081 max mem: 33301 Epoch: [26] [2030/4276] eta: 1:49:49 lr: 1.8842574210497408e-05 loss: 0.0863 (0.0974) time: 2.9437 data: 0.0080 max mem: 33301 Epoch: [26] [2040/4276] eta: 1:49:19 lr: 1.8839641894447186e-05 loss: 0.0837 (0.0974) time: 2.9445 data: 0.0079 max mem: 33301 Epoch: [26] [2050/4276] eta: 1:48:50 lr: 1.8836709527684586e-05 loss: 0.0990 (0.0975) time: 2.9437 data: 0.0081 max mem: 33301 Epoch: [26] [2060/4276] eta: 1:48:21 lr: 1.8833777110199965e-05 loss: 0.0990 (0.0975) time: 2.9387 data: 0.0080 max mem: 33301 Epoch: [26] [2070/4276] eta: 1:47:52 lr: 1.8830844641983675e-05 loss: 0.0853 (0.0975) time: 2.9405 data: 0.0079 max mem: 33301 Epoch: [26] [2080/4276] eta: 1:47:22 lr: 1.882791212302606e-05 loss: 0.0895 (0.0975) time: 2.9420 data: 0.0077 max mem: 33301 Epoch: [26] [2090/4276] eta: 1:46:53 lr: 1.8824979553317452e-05 loss: 0.0934 (0.0975) time: 2.9371 data: 0.0074 max mem: 33301 Epoch: 
[26] [2100/4276] eta: 1:46:24 lr: 1.8822046932848192e-05 loss: 0.0844 (0.0975) time: 2.9378 data: 0.0070 max mem: 33301 Epoch: [26] [2110/4276] eta: 1:45:54 lr: 1.881911426160862e-05 loss: 0.0844 (0.0975) time: 2.9428 data: 0.0069 max mem: 33301 Epoch: [26] [2120/4276] eta: 1:45:25 lr: 1.8816181539589053e-05 loss: 0.0795 (0.0974) time: 2.9422 data: 0.0070 max mem: 33301 Epoch: [26] [2130/4276] eta: 1:44:56 lr: 1.8813248766779824e-05 loss: 0.0798 (0.0973) time: 2.9372 data: 0.0069 max mem: 33301 Epoch: [26] [2140/4276] eta: 1:44:27 lr: 1.8810315943171257e-05 loss: 0.0901 (0.0973) time: 2.9385 data: 0.0068 max mem: 33301 Epoch: [26] [2150/4276] eta: 1:43:57 lr: 1.8807383068753677e-05 loss: 0.0849 (0.0972) time: 2.9426 data: 0.0067 max mem: 33301 Epoch: [26] [2160/4276] eta: 1:43:28 lr: 1.8804450143517393e-05 loss: 0.0784 (0.0972) time: 2.9416 data: 0.0071 max mem: 33301 Epoch: [26] [2170/4276] eta: 1:42:59 lr: 1.880151716745271e-05 loss: 0.0879 (0.0972) time: 2.9412 data: 0.0071 max mem: 33301 Epoch: [26] [2180/4276] eta: 1:42:29 lr: 1.8798584140549945e-05 loss: 0.1002 (0.0972) time: 2.9399 data: 0.0068 max mem: 33301 Epoch: [26] [2190/4276] eta: 1:42:00 lr: 1.8795651062799412e-05 loss: 0.0949 (0.0972) time: 2.9381 data: 0.0069 max mem: 33301 Epoch: [26] [2200/4276] eta: 1:41:31 lr: 1.8792717934191394e-05 loss: 0.0918 (0.0972) time: 2.9376 data: 0.0069 max mem: 33301 Epoch: [26] [2210/4276] eta: 1:41:02 lr: 1.8789784754716202e-05 loss: 0.0926 (0.0972) time: 2.9393 data: 0.0069 max mem: 33301 Epoch: [26] [2220/4276] eta: 1:40:32 lr: 1.8786851524364125e-05 loss: 0.0926 (0.0972) time: 2.9409 data: 0.0066 max mem: 33301 Epoch: [26] [2230/4276] eta: 1:40:03 lr: 1.8783918243125463e-05 loss: 0.0865 (0.0972) time: 2.9413 data: 0.0066 max mem: 33301 Epoch: [26] [2240/4276] eta: 1:39:34 lr: 1.878098491099049e-05 loss: 0.0778 (0.0971) time: 2.9495 data: 0.0066 max mem: 33301 Epoch: [26] [2250/4276] eta: 1:39:04 lr: 1.87780515279495e-05 loss: 0.0837 (0.0971) time: 2.9387 data: 
0.0068 max mem: 33301 Epoch: [26] [2260/4276] eta: 1:38:35 lr: 1.8775118093992768e-05 loss: 0.0955 (0.0971) time: 2.9339 data: 0.0072 max mem: 33301 Epoch: [26] [2270/4276] eta: 1:38:06 lr: 1.8772184609110573e-05 loss: 0.0966 (0.0971) time: 2.9427 data: 0.0070 max mem: 33301 Epoch: [26] [2280/4276] eta: 1:37:37 lr: 1.876925107329319e-05 loss: 0.0966 (0.0971) time: 2.9396 data: 0.0067 max mem: 33301 Epoch: [26] [2290/4276] eta: 1:37:07 lr: 1.8766317486530883e-05 loss: 0.0943 (0.0971) time: 2.9392 data: 0.0067 max mem: 33301 Epoch: [26] [2300/4276] eta: 1:36:38 lr: 1.876338384881393e-05 loss: 0.0914 (0.0971) time: 2.9385 data: 0.0067 max mem: 33301 Epoch: [26] [2310/4276] eta: 1:36:09 lr: 1.8760450160132586e-05 loss: 0.0973 (0.0971) time: 2.9411 data: 0.0067 max mem: 33301 Epoch: [26] [2320/4276] eta: 1:35:39 lr: 1.8757516420477104e-05 loss: 0.0986 (0.0971) time: 2.9377 data: 0.0066 max mem: 33301 Epoch: [26] [2330/4276] eta: 1:35:10 lr: 1.875458262983775e-05 loss: 0.1056 (0.0971) time: 2.9341 data: 0.0066 max mem: 33301 Epoch: [26] [2340/4276] eta: 1:34:41 lr: 1.875164878820477e-05 loss: 0.1012 (0.0971) time: 2.9381 data: 0.0067 max mem: 33301 Epoch: [26] [2350/4276] eta: 1:34:11 lr: 1.8748714895568424e-05 loss: 0.0821 (0.0971) time: 2.9200 data: 0.0070 max mem: 33301 Epoch: [26] [2360/4276] eta: 1:33:41 lr: 1.874578095191894e-05 loss: 0.0838 (0.0971) time: 2.9010 data: 0.0077 max mem: 33301 Epoch: [26] [2370/4276] eta: 1:33:12 lr: 1.8742846957246566e-05 loss: 0.0898 (0.0971) time: 2.9105 data: 0.0078 max mem: 33301 Epoch: [26] [2380/4276] eta: 1:32:43 lr: 1.873991291154154e-05 loss: 0.0970 (0.0971) time: 2.9248 data: 0.0074 max mem: 33301 Epoch: [26] [2390/4276] eta: 1:32:13 lr: 1.8736978814794104e-05 loss: 0.0970 (0.0971) time: 2.9333 data: 0.0071 max mem: 33301 Epoch: [26] [2400/4276] eta: 1:31:44 lr: 1.8734044666994476e-05 loss: 0.0978 (0.0972) time: 2.9323 data: 0.0070 max mem: 33301 Epoch: [26] [2410/4276] eta: 1:31:14 lr: 1.8731110468132892e-05 loss: 0.0973 
(0.0972) time: 2.9191 data: 0.0072 max mem: 33301 Epoch: [26] [2420/4276] eta: 1:30:45 lr: 1.8728176218199576e-05 loss: 0.0973 (0.0972) time: 2.9253 data: 0.0071 max mem: 33301 Epoch: [26] [2430/4276] eta: 1:30:16 lr: 1.8725241917184738e-05 loss: 0.1017 (0.0972) time: 2.9408 data: 0.0067 max mem: 33301 Epoch: [26] [2440/4276] eta: 1:29:46 lr: 1.8722307565078604e-05 loss: 0.1007 (0.0972) time: 2.9387 data: 0.0066 max mem: 33301 Epoch: [26] [2450/4276] eta: 1:29:17 lr: 1.8719373161871383e-05 loss: 0.0927 (0.0972) time: 2.9389 data: 0.0066 max mem: 33301 Epoch: [26] [2460/4276] eta: 1:28:48 lr: 1.871643870755329e-05 loss: 0.0875 (0.0972) time: 2.9403 data: 0.0065 max mem: 33301 Epoch: [26] [2470/4276] eta: 1:28:19 lr: 1.8713504202114517e-05 loss: 0.0853 (0.0972) time: 2.9378 data: 0.0066 max mem: 33301 Epoch: [26] [2480/4276] eta: 1:27:49 lr: 1.8710569645545278e-05 loss: 0.1080 (0.0973) time: 2.9284 data: 0.0068 max mem: 33301 Epoch: [26] [2490/4276] eta: 1:27:19 lr: 1.870763503783577e-05 loss: 0.0991 (0.0973) time: 2.9028 data: 0.0074 max mem: 33301 Epoch: [26] [2500/4276] eta: 1:26:50 lr: 1.8704700378976185e-05 loss: 0.1051 (0.0973) time: 2.8836 data: 0.0078 max mem: 33301 Epoch: [26] [2510/4276] eta: 1:26:20 lr: 1.8701765668956717e-05 loss: 0.1120 (0.0974) time: 2.8796 data: 0.0076 max mem: 33301 Epoch: [26] [2520/4276] eta: 1:25:50 lr: 1.869883090776755e-05 loss: 0.0866 (0.0973) time: 2.8788 data: 0.0077 max mem: 33301 Epoch: [26] [2530/4276] eta: 1:25:21 lr: 1.8695896095398872e-05 loss: 0.0761 (0.0972) time: 2.8795 data: 0.0075 max mem: 33301 Epoch: [26] [2540/4276] eta: 1:24:51 lr: 1.8692961231840865e-05 loss: 0.0840 (0.0973) time: 2.8815 data: 0.0075 max mem: 33301 Epoch: [26] [2550/4276] eta: 1:24:22 lr: 1.8690026317083694e-05 loss: 0.0892 (0.0972) time: 2.9089 data: 0.0073 max mem: 33301 Epoch: [26] [2560/4276] eta: 1:23:52 lr: 1.8687091351117547e-05 loss: 0.0830 (0.0972) time: 2.9345 data: 0.0070 max mem: 33301 Epoch: [26] [2570/4276] eta: 1:23:23 lr: 
1.8684156333932585e-05 loss: 0.0901 (0.0972) time: 2.9384 data: 0.0074 max mem: 33301 Epoch: [26] [2580/4276] eta: 1:22:54 lr: 1.868122126551898e-05 loss: 0.0901 (0.0972) time: 2.9382 data: 0.0076 max mem: 33301 Epoch: [26] [2590/4276] eta: 1:22:24 lr: 1.8678286145866895e-05 loss: 0.0912 (0.0972) time: 2.9113 data: 0.0075 max mem: 33301 Epoch: [26] [2600/4276] eta: 1:21:55 lr: 1.8675350974966477e-05 loss: 0.0902 (0.0972) time: 2.9070 data: 0.0076 max mem: 33301 Epoch: [26] [2610/4276] eta: 1:21:25 lr: 1.8672415752807893e-05 loss: 0.0805 (0.0971) time: 2.9214 data: 0.0076 max mem: 33301 Epoch: [26] [2620/4276] eta: 1:20:56 lr: 1.8669480479381295e-05 loss: 0.0882 (0.0972) time: 2.9184 data: 0.0075 max mem: 33301 Epoch: [26] [2630/4276] eta: 1:20:27 lr: 1.8666545154676827e-05 loss: 0.0994 (0.0972) time: 2.9275 data: 0.0079 max mem: 33301 Epoch: [26] [2640/4276] eta: 1:19:57 lr: 1.866360977868463e-05 loss: 0.0924 (0.0972) time: 2.9364 data: 0.0075 max mem: 33301 Epoch: [26] [2650/4276] eta: 1:19:28 lr: 1.866067435139485e-05 loss: 0.0900 (0.0971) time: 2.9423 data: 0.0068 max mem: 33301 Epoch: [26] [2660/4276] eta: 1:18:59 lr: 1.8657738872797626e-05 loss: 0.0898 (0.0971) time: 2.9599 data: 0.0066 max mem: 33301 Epoch: [26] [2670/4276] eta: 1:18:30 lr: 1.8654803342883085e-05 loss: 0.0921 (0.0971) time: 2.9634 data: 0.0063 max mem: 33301 Epoch: [26] [2680/4276] eta: 1:18:01 lr: 1.865186776164136e-05 loss: 0.1034 (0.0971) time: 2.9582 data: 0.0063 max mem: 33301 Epoch: [26] [2690/4276] eta: 1:17:31 lr: 1.864893212906258e-05 loss: 0.0910 (0.0971) time: 2.9573 data: 0.0065 max mem: 33301 Epoch: [26] [2700/4276] eta: 1:17:02 lr: 1.864599644513687e-05 loss: 0.0808 (0.0970) time: 2.9502 data: 0.0065 max mem: 33301 Epoch: [26] [2710/4276] eta: 1:16:33 lr: 1.864306070985434e-05 loss: 0.0798 (0.0970) time: 2.9527 data: 0.0064 max mem: 33301 Epoch: [26] [2720/4276] eta: 1:16:04 lr: 1.864012492320511e-05 loss: 0.0798 (0.0970) time: 2.9559 data: 0.0064 max mem: 33301 Epoch: [26] 
[2730/4276] eta: 1:15:35 lr: 1.86371890851793e-05 loss: 0.0843 (0.0970) time: 2.9587 data: 0.0065 max mem: 33301 Epoch: [26] [2740/4276] eta: 1:15:05 lr: 1.8634253195767e-05 loss: 0.1040 (0.0970) time: 2.9614 data: 0.0065 max mem: 33301 Epoch: [26] [2750/4276] eta: 1:14:36 lr: 1.863131725495833e-05 loss: 0.1108 (0.0971) time: 2.9577 data: 0.0064 max mem: 33301 Epoch: [26] [2760/4276] eta: 1:14:07 lr: 1.8628381262743383e-05 loss: 0.1006 (0.0971) time: 2.9529 data: 0.0066 max mem: 33301 Epoch: [26] [2770/4276] eta: 1:13:38 lr: 1.8625445219112268e-05 loss: 0.0995 (0.0971) time: 2.9554 data: 0.0065 max mem: 33301 Epoch: [26] [2780/4276] eta: 1:13:08 lr: 1.862250912405506e-05 loss: 0.0912 (0.0971) time: 2.9569 data: 0.0065 max mem: 33301 Epoch: [26] [2790/4276] eta: 1:12:39 lr: 1.861957297756186e-05 loss: 0.0917 (0.0971) time: 2.9503 data: 0.0068 max mem: 33301 Epoch: [26] [2800/4276] eta: 1:12:10 lr: 1.8616636779622758e-05 loss: 0.0916 (0.0971) time: 2.9494 data: 0.0065 max mem: 33301 Epoch: [26] [2810/4276] eta: 1:11:41 lr: 1.861370053022783e-05 loss: 0.0819 (0.0971) time: 2.9538 data: 0.0063 max mem: 33301 Epoch: [26] [2820/4276] eta: 1:11:12 lr: 1.8610764229367157e-05 loss: 0.0809 (0.0970) time: 2.9540 data: 0.0063 max mem: 33301 Epoch: [26] [2830/4276] eta: 1:10:42 lr: 1.860782787703081e-05 loss: 0.0905 (0.0970) time: 2.9522 data: 0.0062 max mem: 33301 Epoch: [26] [2840/4276] eta: 1:10:13 lr: 1.8604891473208872e-05 loss: 0.0967 (0.0970) time: 2.9555 data: 0.0062 max mem: 33301 Epoch: [26] [2850/4276] eta: 1:09:44 lr: 1.8601955017891405e-05 loss: 0.0967 (0.0970) time: 2.9574 data: 0.0062 max mem: 33301 Epoch: [26] [2860/4276] eta: 1:09:15 lr: 1.859901851106847e-05 loss: 0.0781 (0.0970) time: 2.9573 data: 0.0062 max mem: 33301 Epoch: [26] [2870/4276] eta: 1:08:45 lr: 1.859608195273013e-05 loss: 0.0891 (0.0970) time: 2.9595 data: 0.0062 max mem: 33301 Epoch: [26] [2880/4276] eta: 1:08:16 lr: 1.859314534286644e-05 loss: 0.0925 (0.0970) time: 2.9628 data: 0.0064 max 
mem: 33301 Epoch: [26] [2890/4276] eta: 1:07:47 lr: 1.8590208681467465e-05 loss: 0.0928 (0.0971) time: 2.9542 data: 0.0066 max mem: 33301 Epoch: [26] [2900/4276] eta: 1:07:18 lr: 1.8587271968523237e-05 loss: 0.0882 (0.0970) time: 2.9415 data: 0.0068 max mem: 33301 Epoch: [26] [2910/4276] eta: 1:06:48 lr: 1.8584335204023813e-05 loss: 0.0979 (0.0971) time: 2.9393 data: 0.0065 max mem: 33301 Epoch: [26] [2920/4276] eta: 1:06:19 lr: 1.8581398387959232e-05 loss: 0.1096 (0.0971) time: 2.9390 data: 0.0062 max mem: 33301 Epoch: [26] [2930/4276] eta: 1:05:50 lr: 1.8578461520319544e-05 loss: 0.1052 (0.0972) time: 2.9407 data: 0.0064 max mem: 33301 Epoch: [26] [2940/4276] eta: 1:05:20 lr: 1.8575524601094767e-05 loss: 0.1007 (0.0972) time: 2.9396 data: 0.0065 max mem: 33301 Epoch: [26] [2950/4276] eta: 1:04:51 lr: 1.857258763027494e-05 loss: 0.1008 (0.0972) time: 2.9373 data: 0.0065 max mem: 33301 Epoch: [26] [2960/4276] eta: 1:04:22 lr: 1.856965060785009e-05 loss: 0.0993 (0.0972) time: 2.9364 data: 0.0066 max mem: 33301 Epoch: [26] [2970/4276] eta: 1:03:52 lr: 1.856671353381025e-05 loss: 0.0973 (0.0972) time: 2.9376 data: 0.0066 max mem: 33301 Epoch: [26] [2980/4276] eta: 1:03:23 lr: 1.8563776408145422e-05 loss: 0.0973 (0.0972) time: 2.9421 data: 0.0067 max mem: 33301 Epoch: [26] [2990/4276] eta: 1:02:54 lr: 1.8560839230845633e-05 loss: 0.0915 (0.0972) time: 2.9422 data: 0.0065 max mem: 33301 Epoch: [26] [3000/4276] eta: 1:02:24 lr: 1.8557902001900897e-05 loss: 0.0886 (0.0971) time: 2.9442 data: 0.0066 max mem: 33301 Epoch: [26] [3010/4276] eta: 1:01:55 lr: 1.855496472130123e-05 loss: 0.0934 (0.0972) time: 2.9446 data: 0.0066 max mem: 33301 Epoch: [26] [3020/4276] eta: 1:01:26 lr: 1.855202738903662e-05 loss: 0.0940 (0.0971) time: 2.9415 data: 0.0066 max mem: 33301 Epoch: [26] [3030/4276] eta: 1:00:56 lr: 1.8549090005097083e-05 loss: 0.0895 (0.0971) time: 2.9413 data: 0.0066 max mem: 33301 Epoch: [26] [3040/4276] eta: 1:00:27 lr: 1.8546152569472618e-05 loss: 0.0920 (0.0971) 
time: 2.9405 data: 0.0066 max mem: 33301 Epoch: [26] [3050/4276] eta: 0:59:58 lr: 1.8543215082153207e-05 loss: 0.0960 (0.0971) time: 2.9386 data: 0.0066 max mem: 33301 Epoch: [26] [3060/4276] eta: 0:59:28 lr: 1.854027754312885e-05 loss: 0.0907 (0.0971) time: 2.9437 data: 0.0066 max mem: 33301 Epoch: [26] [3070/4276] eta: 0:58:59 lr: 1.853733995238953e-05 loss: 0.0883 (0.0971) time: 2.9454 data: 0.0067 max mem: 33301 Epoch: [26] [3080/4276] eta: 0:58:30 lr: 1.8534402309925238e-05 loss: 0.0839 (0.0971) time: 2.9374 data: 0.0069 max mem: 33301 Epoch: [26] [3090/4276] eta: 0:58:00 lr: 1.8531464615725947e-05 loss: 0.0829 (0.0970) time: 2.9377 data: 0.0067 max mem: 33301 Epoch: [26] [3100/4276] eta: 0:57:31 lr: 1.8528526869781635e-05 loss: 0.0865 (0.0970) time: 2.9459 data: 0.0066 max mem: 33301 Epoch: [26] [3110/4276] eta: 0:57:02 lr: 1.852558907208227e-05 loss: 0.0984 (0.0970) time: 2.9432 data: 0.0067 max mem: 33301 Epoch: [26] [3120/4276] eta: 0:56:32 lr: 1.8522651222617833e-05 loss: 0.1001 (0.0970) time: 2.9377 data: 0.0068 max mem: 33301 Epoch: [26] [3130/4276] eta: 0:56:03 lr: 1.8519713321378277e-05 loss: 0.0803 (0.0970) time: 2.9388 data: 0.0066 max mem: 33301 Epoch: [26] [3140/4276] eta: 0:55:34 lr: 1.8516775368353565e-05 loss: 0.0815 (0.0970) time: 2.9361 data: 0.0065 max mem: 33301 Epoch: [26] [3150/4276] eta: 0:55:04 lr: 1.8513837363533653e-05 loss: 0.0882 (0.0970) time: 2.9380 data: 0.0066 max mem: 33301 Epoch: [26] [3160/4276] eta: 0:54:35 lr: 1.8510899306908507e-05 loss: 0.0887 (0.0969) time: 2.9223 data: 0.0070 max mem: 33301 Epoch: [26] [3170/4276] eta: 0:54:06 lr: 1.8507961198468063e-05 loss: 0.0841 (0.0969) time: 2.9215 data: 0.0083 max mem: 33301 Epoch: [26] [3180/4276] eta: 0:53:36 lr: 1.8505023038202273e-05 loss: 0.0841 (0.0969) time: 2.9415 data: 0.0078 max mem: 33301 Epoch: [26] [3190/4276] eta: 0:53:07 lr: 1.850208482610108e-05 loss: 0.0906 (0.0969) time: 2.9473 data: 0.0068 max mem: 33301 Epoch: [26] [3200/4276] eta: 0:52:38 lr: 
1.8499146562154425e-05 loss: 0.0906 (0.0969) time: 2.9381 data: 0.0079 max mem: 33301 Epoch: [26] [3210/4276] eta: 0:52:08 lr: 1.8496208246352234e-05 loss: 0.0898 (0.0969) time: 2.9387 data: 0.0080 max mem: 33301 Epoch: [26] [3220/4276] eta: 0:51:39 lr: 1.8493269878684447e-05 loss: 0.0900 (0.0969) time: 2.9466 data: 0.0074 max mem: 33301 Epoch: [26] [3230/4276] eta: 0:51:10 lr: 1.8490331459140985e-05 loss: 0.0842 (0.0969) time: 2.9403 data: 0.0074 max mem: 33301 Epoch: [26] [3240/4276] eta: 0:50:40 lr: 1.8487392987711784e-05 loss: 0.0943 (0.0969) time: 2.9401 data: 0.0076 max mem: 33301 Epoch: [26] [3250/4276] eta: 0:50:11 lr: 1.848445446438675e-05 loss: 0.0945 (0.0969) time: 2.9411 data: 0.0074 max mem: 33301 Epoch: [26] [3260/4276] eta: 0:49:42 lr: 1.8481515889155808e-05 loss: 0.0964 (0.0970) time: 2.9235 data: 0.0078 max mem: 33301 Epoch: [26] [3270/4276] eta: 0:49:12 lr: 1.847857726200887e-05 loss: 0.0964 (0.0970) time: 2.9227 data: 0.0078 max mem: 33301 Epoch: [26] [3280/4276] eta: 0:48:43 lr: 1.8475638582935847e-05 loss: 0.0995 (0.0970) time: 2.9393 data: 0.0069 max mem: 33301 Epoch: [26] [3290/4276] eta: 0:48:14 lr: 1.8472699851926633e-05 loss: 0.1083 (0.0970) time: 2.9384 data: 0.0067 max mem: 33301 Epoch: [26] [3300/4276] eta: 0:47:44 lr: 1.846976106897114e-05 loss: 0.0984 (0.0971) time: 2.9423 data: 0.0067 max mem: 33301 Epoch: [26] [3310/4276] eta: 0:47:15 lr: 1.8466822234059263e-05 loss: 0.0992 (0.0971) time: 2.9410 data: 0.0068 max mem: 33301 Epoch: [26] [3320/4276] eta: 0:46:46 lr: 1.84638833471809e-05 loss: 0.1196 (0.0972) time: 2.9363 data: 0.0068 max mem: 33301 Epoch: [26] [3330/4276] eta: 0:46:16 lr: 1.8460944408325932e-05 loss: 0.0955 (0.0972) time: 2.9385 data: 0.0067 max mem: 33301 Epoch: [26] [3340/4276] eta: 0:45:47 lr: 1.845800541748425e-05 loss: 0.0847 (0.0972) time: 2.9426 data: 0.0068 max mem: 33301 Epoch: [26] [3350/4276] eta: 0:45:18 lr: 1.8455066374645745e-05 loss: 0.0947 (0.0972) time: 2.9420 data: 0.0068 max mem: 33301 Epoch: [26] 
[3360/4276] eta: 0:44:48 lr: 1.845212727980028e-05 loss: 0.0892 (0.0971) time: 2.9378 data: 0.0067 max mem: 33301 Epoch: [26] [3370/4276] eta: 0:44:19 lr: 1.8449188132937742e-05 loss: 0.0873 (0.0972) time: 2.9306 data: 0.0070 max mem: 33301 Epoch: [26] [3380/4276] eta: 0:43:49 lr: 1.8446248934047996e-05 loss: 0.1082 (0.0972) time: 2.9214 data: 0.0078 max mem: 33301 Epoch: [26] [3390/4276] eta: 0:43:20 lr: 1.844330968312092e-05 loss: 0.0853 (0.0972) time: 2.9273 data: 0.0077 max mem: 33301 Epoch: [26] [3400/4276] eta: 0:42:51 lr: 1.8440370380146366e-05 loss: 0.0948 (0.0972) time: 2.9318 data: 0.0070 max mem: 33301 Epoch: [26] [3410/4276] eta: 0:42:21 lr: 1.8437431025114197e-05 loss: 0.1004 (0.0972) time: 2.9319 data: 0.0070 max mem: 33301 Epoch: [26] [3420/4276] eta: 0:41:52 lr: 1.8434491618014273e-05 loss: 0.0950 (0.0973) time: 2.9368 data: 0.0069 max mem: 33301 Epoch: [26] [3430/4276] eta: 0:41:23 lr: 1.8431552158836445e-05 loss: 0.1008 (0.0973) time: 2.9374 data: 0.0067 max mem: 33301 Epoch: [26] [3440/4276] eta: 0:40:53 lr: 1.842861264757056e-05 loss: 0.0937 (0.0973) time: 2.9386 data: 0.0068 max mem: 33301 Epoch: [26] [3450/4276] eta: 0:40:24 lr: 1.8425673084206464e-05 loss: 0.0882 (0.0973) time: 2.9373 data: 0.0067 max mem: 33301 Epoch: [26] [3460/4276] eta: 0:39:55 lr: 1.8422733468734e-05 loss: 0.1025 (0.0974) time: 2.9400 data: 0.0068 max mem: 33301 Epoch: [26] [3470/4276] eta: 0:39:25 lr: 1.8419793801143006e-05 loss: 0.0918 (0.0974) time: 2.9416 data: 0.0067 max mem: 33301 Epoch: [26] [3480/4276] eta: 0:38:56 lr: 1.8416854081423313e-05 loss: 0.0916 (0.0974) time: 2.9400 data: 0.0067 max mem: 33301 Epoch: [26] [3490/4276] eta: 0:38:27 lr: 1.841391430956475e-05 loss: 0.0939 (0.0974) time: 2.9424 data: 0.0069 max mem: 33301 Epoch: [26] [3500/4276] eta: 0:37:57 lr: 1.8410974485557144e-05 loss: 0.0939 (0.0974) time: 2.9407 data: 0.0069 max mem: 33301 Epoch: [26] [3510/4276] eta: 0:37:28 lr: 1.8408034609390322e-05 loss: 0.0905 (0.0974) time: 2.9364 data: 0.0068 
max mem: 33301 Epoch: [26] [3520/4276] eta: 0:36:59 lr: 1.8405094681054098e-05 loss: 0.0936 (0.0974) time: 2.9348 data: 0.0068 max mem: 33301 Epoch: [26] [3530/4276] eta: 0:36:29 lr: 1.8402154700538286e-05 loss: 0.0995 (0.0974) time: 2.9356 data: 0.0071 max mem: 33301 Epoch: [26] [3540/4276] eta: 0:36:00 lr: 1.8399214667832695e-05 loss: 0.0982 (0.0974) time: 2.9371 data: 0.0071 max mem: 33301 Epoch: [26] [3550/4276] eta: 0:35:30 lr: 1.8396274582927143e-05 loss: 0.0936 (0.0974) time: 2.9354 data: 0.0067 max mem: 33301 Epoch: [26] [3560/4276] eta: 0:35:01 lr: 1.8393334445811423e-05 loss: 0.0932 (0.0975) time: 2.9321 data: 0.0070 max mem: 33301 Epoch: [26] [3570/4276] eta: 0:34:32 lr: 1.8390394256475335e-05 loss: 0.0980 (0.0975) time: 2.9338 data: 0.0071 max mem: 33301 Epoch: [26] [3580/4276] eta: 0:34:02 lr: 1.8387454014908678e-05 loss: 0.0871 (0.0975) time: 2.9463 data: 0.0073 max mem: 33301 Epoch: [26] [3590/4276] eta: 0:33:33 lr: 1.8384513721101248e-05 loss: 0.0871 (0.0975) time: 2.9510 data: 0.0073 max mem: 33301 Epoch: [26] [3600/4276] eta: 0:33:04 lr: 1.8381573375042824e-05 loss: 0.0974 (0.0975) time: 2.9438 data: 0.0073 max mem: 33301 Epoch: [26] [3610/4276] eta: 0:32:34 lr: 1.8378632976723193e-05 loss: 0.0893 (0.0975) time: 2.9390 data: 0.0075 max mem: 33301 Epoch: [26] [3620/4276] eta: 0:32:05 lr: 1.8375692526132136e-05 loss: 0.0895 (0.0974) time: 2.9395 data: 0.0074 max mem: 33301 Epoch: [26] [3630/4276] eta: 0:31:36 lr: 1.8372752023259437e-05 loss: 0.0913 (0.0975) time: 2.9266 data: 0.0075 max mem: 33301 Epoch: [26] [3640/4276] eta: 0:31:06 lr: 1.8369811468094857e-05 loss: 0.0943 (0.0974) time: 2.9290 data: 0.0079 max mem: 33301 Epoch: [26] [3650/4276] eta: 0:30:37 lr: 1.836687086062817e-05 loss: 0.0806 (0.0974) time: 2.9357 data: 0.0080 max mem: 33301 Epoch: [26] [3660/4276] eta: 0:30:08 lr: 1.836393020084915e-05 loss: 0.0869 (0.0974) time: 2.9343 data: 0.0073 max mem: 33301 Epoch: [26] [3670/4276] eta: 0:29:38 lr: 1.8360989488747544e-05 loss: 0.1006 
(0.0974) time: 2.9415 data: 0.0067 max mem: 33301 Epoch: [26] [3680/4276] eta: 0:29:09 lr: 1.8358048724313114e-05 loss: 0.1066 (0.0974) time: 2.9411 data: 0.0068 max mem: 33301 Epoch: [26] [3690/4276] eta: 0:28:40 lr: 1.8355107907535615e-05 loss: 0.0955 (0.0974) time: 2.9435 data: 0.0067 max mem: 33301 Epoch: [26] [3700/4276] eta: 0:28:10 lr: 1.8352167038404806e-05 loss: 0.0906 (0.0974) time: 2.9410 data: 0.0068 max mem: 33301 Epoch: [26] [3710/4276] eta: 0:27:41 lr: 1.8349226116910413e-05 loss: 0.0914 (0.0974) time: 2.9393 data: 0.0067 max mem: 33301 Epoch: [26] [3720/4276] eta: 0:27:12 lr: 1.8346285143042197e-05 loss: 0.1008 (0.0974) time: 2.9394 data: 0.0068 max mem: 33301 Epoch: [26] [3730/4276] eta: 0:26:42 lr: 1.8343344116789883e-05 loss: 0.0922 (0.0974) time: 2.9397 data: 0.0067 max mem: 33301 Epoch: [26] [3740/4276] eta: 0:26:13 lr: 1.8340403038143218e-05 loss: 0.0903 (0.0974) time: 2.9412 data: 0.0068 max mem: 33301 Epoch: [26] [3750/4276] eta: 0:25:44 lr: 1.8337461907091925e-05 loss: 0.0940 (0.0974) time: 2.9419 data: 0.0068 max mem: 33301 Epoch: [26] [3760/4276] eta: 0:25:14 lr: 1.833452072362573e-05 loss: 0.0899 (0.0974) time: 2.9416 data: 0.0067 max mem: 33301 Epoch: [26] [3770/4276] eta: 0:24:45 lr: 1.8331579487734356e-05 loss: 0.0871 (0.0974) time: 2.9423 data: 0.0067 max mem: 33301 Epoch: [26] [3780/4276] eta: 0:24:16 lr: 1.832863819940753e-05 loss: 0.0831 (0.0974) time: 2.9402 data: 0.0067 max mem: 33301 Epoch: [26] [3790/4276] eta: 0:23:46 lr: 1.832569685863496e-05 loss: 0.0772 (0.0973) time: 2.9375 data: 0.0067 max mem: 33301 Epoch: [26] [3800/4276] eta: 0:23:17 lr: 1.8322755465406356e-05 loss: 0.0787 (0.0973) time: 2.9382 data: 0.0068 max mem: 33301 Epoch: [26] [3810/4276] eta: 0:22:47 lr: 1.8319814019711428e-05 loss: 0.0829 (0.0973) time: 2.9456 data: 0.0068 max mem: 33301 Epoch: [26] [3820/4276] eta: 0:22:18 lr: 1.8316872521539887e-05 loss: 0.0852 (0.0972) time: 2.9464 data: 0.0068 max mem: 33301 Epoch: [26] [3830/4276] eta: 0:21:49 lr: 
1.8313930970881425e-05 loss: 0.0883 (0.0972) time: 2.9378 data: 0.0069 max mem: 33301 Epoch: [26] [3840/4276] eta: 0:21:19 lr: 1.8310989367725735e-05 loss: 0.0846 (0.0972) time: 2.9432 data: 0.0070 max mem: 33301 Epoch: [26] [3850/4276] eta: 0:20:50 lr: 1.8308047712062516e-05 loss: 0.0782 (0.0972) time: 2.9441 data: 0.0070 max mem: 33301 Epoch: [26] [3860/4276] eta: 0:20:21 lr: 1.8305106003881456e-05 loss: 0.0875 (0.0972) time: 2.9393 data: 0.0070 max mem: 33301 Epoch: [26] [3870/4276] eta: 0:19:51 lr: 1.8302164243172233e-05 loss: 0.0961 (0.0972) time: 2.9408 data: 0.0068 max mem: 33301 Epoch: [26] [3880/4276] eta: 0:19:22 lr: 1.8299222429924535e-05 loss: 0.0858 (0.0972) time: 2.9381 data: 0.0067 max mem: 33301 Epoch: [26] [3890/4276] eta: 0:18:53 lr: 1.8296280564128034e-05 loss: 0.0845 (0.0972) time: 2.9424 data: 0.0068 max mem: 33301 Epoch: [26] [3900/4276] eta: 0:18:23 lr: 1.829333864577241e-05 loss: 0.0843 (0.0972) time: 2.9382 data: 0.0072 max mem: 33301 Epoch: [26] [3910/4276] eta: 0:17:54 lr: 1.829039667484732e-05 loss: 0.0841 (0.0971) time: 2.9347 data: 0.0071 max mem: 33301 Epoch: [26] [3920/4276] eta: 0:17:25 lr: 1.828745465134244e-05 loss: 0.0800 (0.0971) time: 2.9424 data: 0.0070 max mem: 33301 Epoch: [26] [3930/4276] eta: 0:16:55 lr: 1.8284512575247424e-05 loss: 0.0843 (0.0972) time: 2.9380 data: 0.0073 max mem: 33301 Epoch: [26] [3940/4276] eta: 0:16:26 lr: 1.8281570446551934e-05 loss: 0.0992 (0.0972) time: 2.9399 data: 0.0071 max mem: 33301 Epoch: [26] [3950/4276] eta: 0:15:57 lr: 1.8278628265245624e-05 loss: 0.0880 (0.0971) time: 2.9472 data: 0.0067 max mem: 33301 Epoch: [26] [3960/4276] eta: 0:15:27 lr: 1.8275686031318135e-05 loss: 0.0821 (0.0971) time: 2.9464 data: 0.0068 max mem: 33301 Epoch: [26] [3970/4276] eta: 0:14:58 lr: 1.8272743744759126e-05 loss: 0.0971 (0.0971) time: 2.9430 data: 0.0068 max mem: 33301 Epoch: [26] [3980/4276] eta: 0:14:28 lr: 1.8269801405558225e-05 loss: 0.1042 (0.0971) time: 2.9360 data: 0.0069 max mem: 33301 Epoch: [26] 
[3990/4276] eta: 0:13:59 lr: 1.8266859013705082e-05 loss: 0.0915 (0.0971) time: 2.9369 data: 0.0071 max mem: 33301 Epoch: [26] [4000/4276] eta: 0:13:30 lr: 1.8263916569189322e-05 loss: 0.0896 (0.0971) time: 2.9545 data: 0.0069 max mem: 33301 Epoch: [26] [4010/4276] eta: 0:13:00 lr: 1.826097407200058e-05 loss: 0.0948 (0.0971) time: 2.9508 data: 0.0068 max mem: 33301 Epoch: [26] [4020/4276] eta: 0:12:31 lr: 1.8258031522128478e-05 loss: 0.0948 (0.0971) time: 2.9289 data: 0.0076 max mem: 33301 Epoch: [26] [4030/4276] eta: 0:12:02 lr: 1.8255088919562645e-05 loss: 0.0891 (0.0971) time: 2.9313 data: 0.0075 max mem: 33301 Epoch: [26] [4040/4276] eta: 0:11:32 lr: 1.8252146264292694e-05 loss: 0.0910 (0.0971) time: 2.9390 data: 0.0068 max mem: 33301 Epoch: [26] [4050/4276] eta: 0:11:03 lr: 1.8249203556308244e-05 loss: 0.0870 (0.0971) time: 2.9394 data: 0.0070 max mem: 33301 Epoch: [26] [4060/4276] eta: 0:10:34 lr: 1.82462607955989e-05 loss: 0.0820 (0.0971) time: 2.9399 data: 0.0070 max mem: 33301 Epoch: [26] [4070/4276] eta: 0:10:04 lr: 1.824331798215427e-05 loss: 0.0885 (0.0971) time: 2.9396 data: 0.0068 max mem: 33301 Epoch: [26] [4080/4276] eta: 0:09:35 lr: 1.8240375115963958e-05 loss: 0.0928 (0.0971) time: 2.9404 data: 0.0068 max mem: 33301 Epoch: [26] [4090/4276] eta: 0:09:06 lr: 1.8237432197017566e-05 loss: 0.0970 (0.0971) time: 2.9414 data: 0.0068 max mem: 33301 Epoch: [26] [4100/4276] eta: 0:08:36 lr: 1.8234489225304685e-05 loss: 0.0981 (0.0971) time: 2.9409 data: 0.0070 max mem: 33301 Epoch: [26] [4110/4276] eta: 0:08:07 lr: 1.8231546200814902e-05 loss: 0.1055 (0.0972) time: 2.9386 data: 0.0073 max mem: 33301 Epoch: [26] [4120/4276] eta: 0:07:38 lr: 1.8228603123537812e-05 loss: 0.0972 (0.0972) time: 2.9391 data: 0.0071 max mem: 33301 Epoch: [26] [4130/4276] eta: 0:07:08 lr: 1.8225659993463e-05 loss: 0.0844 (0.0971) time: 2.9349 data: 0.0073 max mem: 33301 Epoch: [26] [4140/4276] eta: 0:06:39 lr: 1.8222716810580035e-05 loss: 0.0844 (0.0971) time: 2.9335 data: 0.0073 
max mem: 33301 Epoch: [26] [4150/4276] eta: 0:06:09 lr: 1.8219773574878497e-05 loss: 0.0901 (0.0971) time: 2.9445 data: 0.0068 max mem: 33301 Epoch: [26] [4160/4276] eta: 0:05:40 lr: 1.821683028634796e-05 loss: 0.0933 (0.0972) time: 2.9427 data: 0.0068 max mem: 33301 Epoch: [26] [4170/4276] eta: 0:05:11 lr: 1.8213886944977993e-05 loss: 0.0953 (0.0972) time: 2.9316 data: 0.0071 max mem: 33301 Epoch: [26] [4180/4276] eta: 0:04:41 lr: 1.821094355075815e-05 loss: 0.0921 (0.0972) time: 2.9309 data: 0.0071 max mem: 33301 Epoch: [26] [4190/4276] eta: 0:04:12 lr: 1.8208000103677997e-05 loss: 0.0929 (0.0972) time: 2.9339 data: 0.0067 max mem: 33301 Epoch: [26] [4200/4276] eta: 0:03:43 lr: 1.820505660372709e-05 loss: 0.1111 (0.0972) time: 2.9326 data: 0.0069 max mem: 33301 Epoch: [26] [4210/4276] eta: 0:03:13 lr: 1.8202113050894983e-05 loss: 0.1111 (0.0972) time: 2.9333 data: 0.0070 max mem: 33301 Epoch: [26] [4220/4276] eta: 0:02:44 lr: 1.819916944517122e-05 loss: 0.1029 (0.0973) time: 2.9361 data: 0.0069 max mem: 33301 Epoch: [26] [4230/4276] eta: 0:02:15 lr: 1.819622578654534e-05 loss: 0.1031 (0.0973) time: 2.9376 data: 0.0069 max mem: 33301 Epoch: [26] [4240/4276] eta: 0:01:45 lr: 1.819328207500689e-05 loss: 0.1034 (0.0973) time: 2.9291 data: 0.0074 max mem: 33301 Epoch: [26] [4250/4276] eta: 0:01:16 lr: 1.8190338310545414e-05 loss: 0.1076 (0.0974) time: 2.9337 data: 0.0075 max mem: 33301 Epoch: [26] [4260/4276] eta: 0:00:46 lr: 1.8187394493150424e-05 loss: 0.1010 (0.0974) time: 2.9435 data: 0.0070 max mem: 33301 Epoch: [26] [4270/4276] eta: 0:00:17 lr: 1.818445062281146e-05 loss: 0.0957 (0.0974) time: 2.9342 data: 0.0069 max mem: 33301 Epoch: [26] Total time: 3:29:14 Test: [ 0/21770] eta: 8:52:49 time: 1.4685 data: 1.4299 max mem: 33301 Test: [ 100/21770] eta: 0:19:01 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 200/21770] eta: 0:16:24 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 300/21770] eta: 0:15:30 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 400/21770] 
eta: 0:15:01 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 500/21770] eta: 0:14:42 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 600/21770] eta: 0:14:28 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 700/21770] eta: 0:14:17 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 800/21770] eta: 0:14:08 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 900/21770] eta: 0:14:01 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 1000/21770] eta: 0:13:53 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 1100/21770] eta: 0:13:45 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 1200/21770] eta: 0:13:37 time: 0.0374 data: 0.0009 max mem: 33301 Test: [ 1300/21770] eta: 0:13:30 time: 0.0376 data: 0.0009 max mem: 33301 Test: [ 1400/21770] eta: 0:13:23 time: 0.0376 data: 0.0009 max mem: 33301 Test: [ 1500/21770] eta: 0:13:16 time: 0.0375 data: 0.0009 max mem: 33301 Test: [ 1600/21770] eta: 0:13:11 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 1700/21770] eta: 0:13:07 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 1800/21770] eta: 0:13:02 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 1900/21770] eta: 0:12:58 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 2000/21770] eta: 0:12:54 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 2100/21770] eta: 0:12:50 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 2200/21770] eta: 0:12:46 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 2300/21770] eta: 0:12:42 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 2400/21770] eta: 0:12:38 time: 0.0399 data: 0.0009 max mem: 33301 Test: [ 2500/21770] eta: 0:12:35 time: 0.0400 data: 0.0008 max mem: 33301 Test: [ 2600/21770] eta: 0:12:32 time: 0.0399 data: 0.0008 max mem: 33301 Test: [ 2700/21770] eta: 0:12:28 time: 0.0397 data: 0.0009 max mem: 33301 Test: [ 2800/21770] eta: 0:12:25 time: 0.0397 data: 0.0008 max mem: 33301 Test: [ 2900/21770] eta: 0:12:21 time: 0.0395 data: 0.0008 max mem: 33301 Test: [ 3000/21770] eta: 0:12:17 time: 0.0395 data: 0.0008 max mem: 33301 Test: [ 3100/21770] eta: 
0:12:13 time: 0.0396 data: 0.0008 max mem: 33301 Test: [ 3200/21770] eta: 0:12:10 time: 0.0397 data: 0.0008 max mem: 33301 Test: [ 3300/21770] eta: 0:12:06 time: 0.0396 data: 0.0008 max mem: 33301 Test: [ 3400/21770] eta: 0:12:02 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 3500/21770] eta: 0:11:58 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 3600/21770] eta: 0:11:54 time: 0.0394 data: 0.0008 max mem: 33301 Test: [ 3700/21770] eta: 0:11:50 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 3800/21770] eta: 0:11:46 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 3900/21770] eta: 0:11:42 time: 0.0396 data: 0.0009 max mem: 33301 Test: [ 4000/21770] eta: 0:11:38 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 4100/21770] eta: 0:11:34 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 4200/21770] eta: 0:11:30 time: 0.0394 data: 0.0009 max mem: 33301 Test: [ 4300/21770] eta: 0:11:27 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 4400/21770] eta: 0:11:23 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 4500/21770] eta: 0:11:19 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 4600/21770] eta: 0:11:15 time: 0.0387 data: 0.0008 max mem: 33301 Test: [ 4700/21770] eta: 0:11:11 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 4800/21770] eta: 0:11:06 time: 0.0387 data: 0.0008 max mem: 33301 Test: [ 4900/21770] eta: 0:11:02 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 5000/21770] eta: 0:10:58 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 5100/21770] eta: 0:10:54 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 5200/21770] eta: 0:10:50 time: 0.0389 data: 0.0008 max mem: 33301 Test: [ 5300/21770] eta: 0:10:46 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 5400/21770] eta: 0:10:42 time: 0.0389 data: 0.0008 max mem: 33301 Test: [ 5500/21770] eta: 0:10:38 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 5600/21770] eta: 0:10:34 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 5700/21770] eta: 0:10:30 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 5800/21770] eta: 
0:10:26 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 5900/21770] eta: 0:10:22 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 6000/21770] eta: 0:10:18 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 6100/21770] eta: 0:10:14 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 6200/21770] eta: 0:10:10 time: 0.0387 data: 0.0008 max mem: 33301 Test: [ 6300/21770] eta: 0:10:06 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 6400/21770] eta: 0:10:02 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 6500/21770] eta: 0:09:58 time: 0.0389 data: 0.0008 max mem: 33301 Test: [ 6600/21770] eta: 0:09:54 time: 0.0380 data: 0.0008 max mem: 33301 Test: [ 6700/21770] eta: 0:09:50 time: 0.0385 data: 0.0008 max mem: 33301 Test: [ 6800/21770] eta: 0:09:46 time: 0.0381 data: 0.0008 max mem: 33301 Test: [ 6900/21770] eta: 0:09:42 time: 0.0384 data: 0.0008 max mem: 33301 Test: [ 7000/21770] eta: 0:09:38 time: 0.0382 data: 0.0008 max mem: 33301 Test: [ 7100/21770] eta: 0:09:34 time: 0.0385 data: 0.0008 max mem: 33301 Test: [ 7200/21770] eta: 0:09:30 time: 0.0382 data: 0.0008 max mem: 33301 Test: [ 7300/21770] eta: 0:09:26 time: 0.0384 data: 0.0008 max mem: 33301 Test: [ 7400/21770] eta: 0:09:22 time: 0.0382 data: 0.0008 max mem: 33301 Test: [ 7500/21770] eta: 0:09:17 time: 0.0385 data: 0.0008 max mem: 33301 Test: [ 7600/21770] eta: 0:09:13 time: 0.0382 data: 0.0008 max mem: 33301 Test: [ 7700/21770] eta: 0:09:09 time: 0.0384 data: 0.0008 max mem: 33301 Test: [ 7800/21770] eta: 0:09:05 time: 0.0381 data: 0.0008 max mem: 33301 Test: [ 7900/21770] eta: 0:09:01 time: 0.0385 data: 0.0008 max mem: 33301 Test: [ 8000/21770] eta: 0:08:57 time: 0.0385 data: 0.0008 max mem: 33301 Test: [ 8100/21770] eta: 0:08:53 time: 0.0389 data: 0.0008 max mem: 33301 Test: [ 8200/21770] eta: 0:08:49 time: 0.0385 data: 0.0008 max mem: 33301 Test: [ 8300/21770] eta: 0:08:45 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 8400/21770] eta: 0:08:42 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 8500/21770] eta: 
0:08:38 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 8600/21770] eta: 0:08:34 time: 0.0389 data: 0.0008 max mem: 33301 Test: [ 8700/21770] eta: 0:08:30 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 8800/21770] eta: 0:08:26 time: 0.0383 data: 0.0008 max mem: 33301 Test: [ 8900/21770] eta: 0:08:22 time: 0.0387 data: 0.0008 max mem: 33301 Test: [ 9000/21770] eta: 0:08:18 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 9100/21770] eta: 0:08:14 time: 0.0393 data: 0.0008 max mem: 33301 Test: [ 9200/21770] eta: 0:08:10 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 9300/21770] eta: 0:08:06 time: 0.0375 data: 0.0009 max mem: 33301 Test: [ 9400/21770] eta: 0:08:02 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 9500/21770] eta: 0:07:58 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 9600/21770] eta: 0:07:54 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 9700/21770] eta: 0:07:50 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 9800/21770] eta: 0:07:46 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 9900/21770] eta: 0:07:42 time: 0.0392 data: 0.0008 max mem: 33301 Test: [10000/21770] eta: 0:07:39 time: 0.0389 data: 0.0008 max mem: 33301 Test: [10100/21770] eta: 0:07:35 time: 0.0393 data: 0.0008 max mem: 33301 Test: [10200/21770] eta: 0:07:31 time: 0.0389 data: 0.0008 max mem: 33301 Test: [10300/21770] eta: 0:07:27 time: 0.0391 data: 0.0008 max mem: 33301 Test: [10400/21770] eta: 0:07:23 time: 0.0387 data: 0.0008 max mem: 33301 Test: [10500/21770] eta: 0:07:19 time: 0.0390 data: 0.0008 max mem: 33301 Test: [10600/21770] eta: 0:07:15 time: 0.0389 data: 0.0008 max mem: 33301 Test: [10700/21770] eta: 0:07:11 time: 0.0391 data: 0.0008 max mem: 33301 Test: [10800/21770] eta: 0:07:07 time: 0.0388 data: 0.0008 max mem: 33301 Test: [10900/21770] eta: 0:07:03 time: 0.0391 data: 0.0008 max mem: 33301 Test: [11000/21770] eta: 0:07:00 time: 0.0388 data: 0.0008 max mem: 33301 Test: [11100/21770] eta: 0:06:56 time: 0.0390 data: 0.0008 max mem: 33301 Test: [11200/21770] eta: 
0:06:52 time: 0.0389 data: 0.0008 max mem: 33301 Test: [11300/21770] eta: 0:06:48 time: 0.0392 data: 0.0008 max mem: 33301 Test: [11400/21770] eta: 0:06:44 time: 0.0389 data: 0.0008 max mem: 33301 Test: [11500/21770] eta: 0:06:40 time: 0.0392 data: 0.0008 max mem: 33301 Test: [11600/21770] eta: 0:06:36 time: 0.0389 data: 0.0008 max mem: 33301 Test: [11700/21770] eta: 0:06:32 time: 0.0391 data: 0.0008 max mem: 33301 Test: [11800/21770] eta: 0:06:28 time: 0.0388 data: 0.0008 max mem: 33301 Test: [11900/21770] eta: 0:06:24 time: 0.0391 data: 0.0008 max mem: 33301 Test: [12000/21770] eta: 0:06:21 time: 0.0389 data: 0.0008 max mem: 33301 Test: [12100/21770] eta: 0:06:17 time: 0.0392 data: 0.0008 max mem: 33301 Test: [12200/21770] eta: 0:06:13 time: 0.0389 data: 0.0008 max mem: 33301 Test: [12300/21770] eta: 0:06:09 time: 0.0390 data: 0.0008 max mem: 33301 Test: [12400/21770] eta: 0:06:05 time: 0.0389 data: 0.0008 max mem: 33301 Test: [12500/21770] eta: 0:06:01 time: 0.0391 data: 0.0008 max mem: 33301 Test: [12600/21770] eta: 0:05:57 time: 0.0389 data: 0.0008 max mem: 33301 Test: [12700/21770] eta: 0:05:53 time: 0.0392 data: 0.0008 max mem: 33301 Test: [12800/21770] eta: 0:05:49 time: 0.0389 data: 0.0008 max mem: 33301 Test: [12900/21770] eta: 0:05:45 time: 0.0390 data: 0.0008 max mem: 33301 Test: [13000/21770] eta: 0:05:42 time: 0.0388 data: 0.0008 max mem: 33301 Test: [13100/21770] eta: 0:05:38 time: 0.0391 data: 0.0008 max mem: 33301 Test: [13200/21770] eta: 0:05:34 time: 0.0388 data: 0.0008 max mem: 33301 Test: [13300/21770] eta: 0:05:30 time: 0.0390 data: 0.0008 max mem: 33301 Test: [13400/21770] eta: 0:05:26 time: 0.0388 data: 0.0008 max mem: 33301 Test: [13500/21770] eta: 0:05:22 time: 0.0391 data: 0.0008 max mem: 33301 Test: [13600/21770] eta: 0:05:18 time: 0.0388 data: 0.0008 max mem: 33301 Test: [13700/21770] eta: 0:05:14 time: 0.0393 data: 0.0009 max mem: 33301 Test: [13800/21770] eta: 0:05:10 time: 0.0389 data: 0.0008 max mem: 33301 Test: [13900/21770] eta: 
0:05:06 time: 0.0391 data: 0.0008 max mem: 33301 Test: [14000/21770] eta: 0:05:02 time: 0.0389 data: 0.0008 max mem: 33301 Test: [14100/21770] eta: 0:04:59 time: 0.0391 data: 0.0008 max mem: 33301 Test: [14200/21770] eta: 0:04:55 time: 0.0388 data: 0.0008 max mem: 33301 Test: [14300/21770] eta: 0:04:51 time: 0.0392 data: 0.0008 max mem: 33301 Test: [14400/21770] eta: 0:04:47 time: 0.0387 data: 0.0008 max mem: 33301 Test: [14500/21770] eta: 0:04:43 time: 0.0392 data: 0.0008 max mem: 33301 Test: [14600/21770] eta: 0:04:39 time: 0.0389 data: 0.0008 max mem: 33301 Test: [14700/21770] eta: 0:04:35 time: 0.0393 data: 0.0008 max mem: 33301 Test: [14800/21770] eta: 0:04:31 time: 0.0388 data: 0.0008 max mem: 33301 Test: [14900/21770] eta: 0:04:27 time: 0.0392 data: 0.0008 max mem: 33301 Test: [15000/21770] eta: 0:04:23 time: 0.0388 data: 0.0008 max mem: 33301 Test: [15100/21770] eta: 0:04:20 time: 0.0392 data: 0.0008 max mem: 33301 Test: [15200/21770] eta: 0:04:16 time: 0.0388 data: 0.0008 max mem: 33301 Test: [15300/21770] eta: 0:04:12 time: 0.0394 data: 0.0009 max mem: 33301 Test: [15400/21770] eta: 0:04:08 time: 0.0387 data: 0.0008 max mem: 33301 Test: [15500/21770] eta: 0:04:04 time: 0.0391 data: 0.0008 max mem: 33301 Test: [15600/21770] eta: 0:04:00 time: 0.0388 data: 0.0008 max mem: 33301 Test: [15700/21770] eta: 0:03:56 time: 0.0386 data: 0.0009 max mem: 33301 Test: [15800/21770] eta: 0:03:52 time: 0.0386 data: 0.0008 max mem: 33301 Test: [15900/21770] eta: 0:03:49 time: 0.0388 data: 0.0008 max mem: 33301 Test: [16000/21770] eta: 0:03:45 time: 0.0387 data: 0.0008 max mem: 33301 Test: [16100/21770] eta: 0:03:41 time: 0.0387 data: 0.0008 max mem: 33301 Test: [16200/21770] eta: 0:03:37 time: 0.0391 data: 0.0008 max mem: 33301 Test: [16300/21770] eta: 0:03:33 time: 0.0387 data: 0.0008 max mem: 33301 Test: [16400/21770] eta: 0:03:29 time: 0.0389 data: 0.0008 max mem: 33301 Test: [16500/21770] eta: 0:03:25 time: 0.0388 data: 0.0008 max mem: 33301 Test: [16600/21770] eta: 
0:03:21 time: 0.0390 data: 0.0008 max mem: 33301 Test: [16700/21770] eta: 0:03:17 time: 0.0392 data: 0.0008 max mem: 33301 Test: [16800/21770] eta: 0:03:13 time: 0.0390 data: 0.0008 max mem: 33301 Test: [16900/21770] eta: 0:03:09 time: 0.0392 data: 0.0008 max mem: 33301 Test: [17000/21770] eta: 0:03:06 time: 0.0390 data: 0.0008 max mem: 33301 Test: [17100/21770] eta: 0:03:02 time: 0.0389 data: 0.0008 max mem: 33301 Test: [17200/21770] eta: 0:02:58 time: 0.0386 data: 0.0008 max mem: 33301 Test: [17300/21770] eta: 0:02:54 time: 0.0389 data: 0.0008 max mem: 33301 Test: [17400/21770] eta: 0:02:50 time: 0.0387 data: 0.0008 max mem: 33301 Test: [17500/21770] eta: 0:02:46 time: 0.0389 data: 0.0008 max mem: 33301 Test: [17600/21770] eta: 0:02:42 time: 0.0386 data: 0.0008 max mem: 33301 Test: [17700/21770] eta: 0:02:38 time: 0.0390 data: 0.0008 max mem: 33301 Test: [17800/21770] eta: 0:02:34 time: 0.0387 data: 0.0008 max mem: 33301 Test: [17900/21770] eta: 0:02:30 time: 0.0391 data: 0.0008 max mem: 33301 Test: [18000/21770] eta: 0:02:27 time: 0.0386 data: 0.0008 max mem: 33301 Test: [18100/21770] eta: 0:02:23 time: 0.0390 data: 0.0008 max mem: 33301 Test: [18200/21770] eta: 0:02:19 time: 0.0387 data: 0.0008 max mem: 33301 Test: [18300/21770] eta: 0:02:15 time: 0.0390 data: 0.0008 max mem: 33301 Test: [18400/21770] eta: 0:02:11 time: 0.0388 data: 0.0008 max mem: 33301 Test: [18500/21770] eta: 0:02:07 time: 0.0391 data: 0.0008 max mem: 33301 Test: [18600/21770] eta: 0:02:03 time: 0.0389 data: 0.0008 max mem: 33301 Test: [18700/21770] eta: 0:01:59 time: 0.0390 data: 0.0008 max mem: 33301 Test: [18800/21770] eta: 0:01:55 time: 0.0386 data: 0.0008 max mem: 33301 Test: [18900/21770] eta: 0:01:51 time: 0.0388 data: 0.0008 max mem: 33301 Test: [19000/21770] eta: 0:01:47 time: 0.0379 data: 0.0009 max mem: 33301 Test: [19100/21770] eta: 0:01:44 time: 0.0389 data: 0.0009 max mem: 33301 Test: [19200/21770] eta: 0:01:40 time: 0.0386 data: 0.0009 max mem: 33301 Test: [19300/21770] eta: 
0:01:36 time: 0.0387 data: 0.0008 max mem: 33301
Test: [19400/21770] eta: 0:01:32 time: 0.0384 data: 0.0008 max mem: 33301
Test: [19500/21770] eta: 0:01:28 time: 0.0387 data: 0.0008 max mem: 33301
Test: [19600/21770] eta: 0:01:24 time: 0.0386 data: 0.0008 max mem: 33301
Test: [19700/21770] eta: 0:01:20 time: 0.0391 data: 0.0008 max mem: 33301
Test: [19800/21770] eta: 0:01:16 time: 0.0386 data: 0.0008 max mem: 33301
Test: [19900/21770] eta: 0:01:12 time: 0.0393 data: 0.0008 max mem: 33301
Test: [20000/21770] eta: 0:01:08 time: 0.0390 data: 0.0008 max mem: 33301
Test: [20100/21770] eta: 0:01:05 time: 0.0394 data: 0.0008 max mem: 33301
Test: [20200/21770] eta: 0:01:01 time: 0.0389 data: 0.0008 max mem: 33301
Test: [20300/21770] eta: 0:00:57 time: 0.0393 data: 0.0008 max mem: 33301
Test: [20400/21770] eta: 0:00:53 time: 0.0388 data: 0.0008 max mem: 33301
Test: [20500/21770] eta: 0:00:49 time: 0.0393 data: 0.0008 max mem: 33301
Test: [20600/21770] eta: 0:00:45 time: 0.0389 data: 0.0008 max mem: 33301
Test: [20700/21770] eta: 0:00:41 time: 0.0393 data: 0.0008 max mem: 33301
Test: [20800/21770] eta: 0:00:37 time: 0.0389 data: 0.0008 max mem: 33301
Test: [20900/21770] eta: 0:00:33 time: 0.0389 data: 0.0008 max mem: 33301
Test: [21000/21770] eta: 0:00:30 time: 0.0386 data: 0.0008 max mem: 33301
Test: [21100/21770] eta: 0:00:26 time: 0.0390 data: 0.0008 max mem: 33301
Test: [21200/21770] eta: 0:00:22 time: 0.0387 data: 0.0008 max mem: 33301
Test: [21300/21770] eta: 0:00:18 time: 0.0393 data: 0.0008 max mem: 33301
Test: [21400/21770] eta: 0:00:14 time: 0.0391 data: 0.0008 max mem: 33301
Test: [21500/21770] eta: 0:00:10 time: 0.0393 data: 0.0008 max mem: 33301
Test: [21600/21770] eta: 0:00:06 time: 0.0380 data: 0.0009 max mem: 33301
Test: [21700/21770] eta: 0:00:02 time: 0.0382 data: 0.0009 max mem: 33301
Test: Total time: 0:14:08
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 100.00
Average object IoU 0.0
Overall IoU 0.0
Epoch: [27] [ 0/4276] eta: 5:49:03 lr: 1.8182684275190614e-05 loss: 0.1300 (0.1300) time: 4.8980 data: 1.8027 max mem: 33301
Epoch: [27] [ 10/4276] eta: 3:39:24 lr: 1.8179740320119498e-05 loss: 0.0953 (0.0950) time: 3.0858 data: 0.1716 max mem: 33301
Epoch: [27] [ 20/4276] eta: 3:32:26 lr: 1.8176796312077164e-05 loss: 0.0855 (0.0967) time: 2.8998 data: 0.0078 max mem: 33301
Epoch: [27] [ 30/4276] eta: 3:29:40 lr: 1.817385225105312e-05 loss: 0.0898 (0.0963) time: 2.8953 data: 0.0078 max mem: 33301
Epoch: [27] [ 40/4276] eta: 3:28:04 lr: 1.8170908137036874e-05 loss: 0.0925 (0.0955) time: 2.8970 data: 0.0079 max mem: 33301
Epoch: [27] [ 50/4276] eta: 3:26:42 lr: 1.8167963970017942e-05 loss: 0.0906 (0.0941) time: 2.8910 data: 0.0076 max mem: 33301
Epoch: [27] [ 60/4276] eta: 3:25:48 lr: 1.8165019749985816e-05 loss: 0.0856 (0.0925) time: 2.8915 data: 0.0081 max mem: 33301
Epoch: [27] [ 70/4276] eta: 3:25:10 lr: 1.816207547693e-05 loss: 0.0827 (0.0912) time: 2.9067 data: 0.0078 max mem: 33301
Epoch: [27] [ 80/4276] eta: 3:24:45 lr: 1.815913115083998e-05 loss: 0.0835 (0.0926) time: 2.9252 data: 0.0078 max mem: 33301
Epoch: [27] [ 90/4276] eta: 3:24:20 lr: 1.815618677170526e-05 loss: 0.0894 (0.0921) time: 2.9361 data: 0.0074 max mem: 33301
Epoch: [27] [ 100/4276] eta: 3:23:47 lr: 1.8153242339515314e-05 loss: 0.0832 (0.0935) time: 2.9288 data: 0.0075 max mem: 33301
Epoch: [27] [ 110/4276] eta: 3:23:22 lr: 1.8150297854259625e-05 loss: 0.0898 (0.0940) time: 2.9304 data: 0.0077 max mem: 33301
Epoch: [27] [ 120/4276] eta: 3:22:56 lr: 1.8147353315927677e-05 loss: 0.0872 (0.0935) time: 2.9381 data: 0.0071 max mem: 33301
Epoch: [27] [ 130/4276] eta: 3:22:29 lr: 1.814440872450894e-05 loss: 0.0858 (0.0940) time: 2.9373 data: 0.0071 max mem: 33301
Epoch: [27] [ 140/4276] eta: 3:22:01 lr: 1.8141464079992884e-05 loss: 0.0899 (0.0944) time: 2.9361 data: 0.0073 max mem: 33301
Epoch: [27] [ 150/4276] eta: 3:21:33 lr: 1.8138519382368973e-05 loss: 0.1006 (0.0950) time: 2.9360 data: 0.0070 max mem: 33301 Epoch: [27] [ 160/4276] eta: 3:21:06 lr: 1.8135574631626674e-05 loss: 0.0990 (0.0945) time: 2.9388 data: 0.0069 max mem: 33301 Epoch: [27] [ 170/4276] eta: 3:20:37 lr: 1.813262982775545e-05 loss: 0.0867 (0.0943) time: 2.9365 data: 0.0069 max mem: 33301 Epoch: [27] [ 180/4276] eta: 3:20:13 lr: 1.8129684970744738e-05 loss: 0.0910 (0.0949) time: 2.9426 data: 0.0072 max mem: 33301 Epoch: [27] [ 190/4276] eta: 3:19:44 lr: 1.8126740060584e-05 loss: 0.0859 (0.0945) time: 2.9453 data: 0.0073 max mem: 33301 Epoch: [27] [ 200/4276] eta: 3:19:17 lr: 1.8123795097262676e-05 loss: 0.0859 (0.0947) time: 2.9402 data: 0.0070 max mem: 33301 Epoch: [27] [ 210/4276] eta: 3:18:49 lr: 1.8120850080770224e-05 loss: 0.0897 (0.0949) time: 2.9407 data: 0.0069 max mem: 33301 Epoch: [27] [ 220/4276] eta: 3:18:20 lr: 1.8117905011096062e-05 loss: 0.0855 (0.0947) time: 2.9383 data: 0.0070 max mem: 33301 Epoch: [27] [ 230/4276] eta: 3:17:51 lr: 1.811495988822963e-05 loss: 0.0855 (0.0944) time: 2.9363 data: 0.0071 max mem: 33301 Epoch: [27] [ 240/4276] eta: 3:17:22 lr: 1.8112014712160366e-05 loss: 0.0904 (0.0942) time: 2.9349 data: 0.0068 max mem: 33301 Epoch: [27] [ 250/4276] eta: 3:16:53 lr: 1.8109069482877692e-05 loss: 0.0924 (0.0961) time: 2.9359 data: 0.0069 max mem: 33301 Epoch: [27] [ 260/4276] eta: 3:16:24 lr: 1.8106124200371026e-05 loss: 0.0886 (0.0957) time: 2.9372 data: 0.0072 max mem: 33301 Epoch: [27] [ 270/4276] eta: 3:15:55 lr: 1.8103178864629782e-05 loss: 0.0878 (0.0957) time: 2.9379 data: 0.0072 max mem: 33301 Epoch: [27] [ 280/4276] eta: 3:15:27 lr: 1.8100233475643386e-05 loss: 0.0840 (0.0956) time: 2.9395 data: 0.0071 max mem: 33301 Epoch: [27] [ 290/4276] eta: 3:14:58 lr: 1.8097288033401248e-05 loss: 0.0840 (0.0954) time: 2.9402 data: 0.0068 max mem: 33301 Epoch: [27] [ 300/4276] eta: 3:14:30 lr: 1.809434253789276e-05 loss: 0.0908 (0.0953) time: 2.9407 
data: 0.0071 max mem: 33301 Epoch: [27] [ 310/4276] eta: 3:14:01 lr: 1.8091396989107335e-05 loss: 0.0854 (0.0949) time: 2.9406 data: 0.0075 max mem: 33301 Epoch: [27] [ 320/4276] eta: 3:13:32 lr: 1.8088451387034365e-05 loss: 0.0865 (0.0952) time: 2.9390 data: 0.0071 max mem: 33301 Epoch: [27] [ 330/4276] eta: 3:13:03 lr: 1.8085505731663254e-05 loss: 0.0994 (0.0955) time: 2.9374 data: 0.0066 max mem: 33301 Epoch: [27] [ 340/4276] eta: 3:12:34 lr: 1.808256002298338e-05 loss: 0.0991 (0.0956) time: 2.9377 data: 0.0068 max mem: 33301 Epoch: [27] [ 350/4276] eta: 3:12:05 lr: 1.8079614260984133e-05 loss: 0.0810 (0.0954) time: 2.9388 data: 0.0066 max mem: 33301 Epoch: [27] [ 360/4276] eta: 3:11:36 lr: 1.8076668445654905e-05 loss: 0.0926 (0.0958) time: 2.9384 data: 0.0062 max mem: 33301 Epoch: [27] [ 370/4276] eta: 3:11:03 lr: 1.8073722576985054e-05 loss: 0.0891 (0.0956) time: 2.9239 data: 0.0062 max mem: 33301 Epoch: [27] [ 380/4276] eta: 3:10:31 lr: 1.807077665496397e-05 loss: 0.0863 (0.0957) time: 2.9054 data: 0.0069 max mem: 33301 Epoch: [27] [ 390/4276] eta: 3:09:59 lr: 1.8067830679581007e-05 loss: 0.0994 (0.0960) time: 2.9037 data: 0.0074 max mem: 33301 Epoch: [27] [ 400/4276] eta: 3:09:28 lr: 1.8064884650825554e-05 loss: 0.0957 (0.0961) time: 2.9111 data: 0.0072 max mem: 33301 Epoch: [27] [ 410/4276] eta: 3:08:59 lr: 1.8061938568686948e-05 loss: 0.0930 (0.0960) time: 2.9313 data: 0.0071 max mem: 33301 Epoch: [27] [ 420/4276] eta: 3:08:30 lr: 1.8058992433154563e-05 loss: 0.0918 (0.0961) time: 2.9409 data: 0.0071 max mem: 33301 Epoch: [27] [ 430/4276] eta: 3:08:01 lr: 1.8056046244217743e-05 loss: 0.0922 (0.0963) time: 2.9353 data: 0.0073 max mem: 33301 Epoch: [27] [ 440/4276] eta: 3:07:33 lr: 1.8053100001865846e-05 loss: 0.0846 (0.0960) time: 2.9384 data: 0.0075 max mem: 33301 Epoch: [27] [ 450/4276] eta: 3:07:00 lr: 1.8050153706088207e-05 loss: 0.0871 (0.0964) time: 2.9156 data: 0.0074 max mem: 33301 Epoch: [27] [ 460/4276] eta: 3:06:30 lr: 1.8047207356874176e-05 
loss: 0.0934 (0.0959) time: 2.9087 data: 0.0076 max mem: 33301 Epoch: [27] [ 470/4276] eta: 3:06:01 lr: 1.8044260954213084e-05 loss: 0.0762 (0.0957) time: 2.9318 data: 0.0075 max mem: 33301 Epoch: [27] [ 480/4276] eta: 3:05:32 lr: 1.8041314498094277e-05 loss: 0.0829 (0.0958) time: 2.9386 data: 0.0073 max mem: 33301 Epoch: [27] [ 490/4276] eta: 3:05:01 lr: 1.803836798850707e-05 loss: 0.0829 (0.0955) time: 2.9233 data: 0.0075 max mem: 33301 Epoch: [27] [ 500/4276] eta: 3:04:28 lr: 1.8035421425440788e-05 loss: 0.0858 (0.0954) time: 2.8937 data: 0.0075 max mem: 33301 Epoch: [27] [ 510/4276] eta: 3:03:56 lr: 1.8032474808884755e-05 loss: 0.0906 (0.0954) time: 2.8930 data: 0.0070 max mem: 33301 Epoch: [27] [ 520/4276] eta: 3:03:28 lr: 1.80295281388283e-05 loss: 0.0938 (0.0955) time: 2.9229 data: 0.0070 max mem: 33301 Epoch: [27] [ 530/4276] eta: 3:02:59 lr: 1.8026581415260717e-05 loss: 0.0922 (0.0956) time: 2.9377 data: 0.0075 max mem: 33301 Epoch: [27] [ 540/4276] eta: 3:02:26 lr: 1.8023634638171324e-05 loss: 0.0804 (0.0954) time: 2.9066 data: 0.0073 max mem: 33301 Epoch: [27] [ 550/4276] eta: 3:01:54 lr: 1.8020687807549424e-05 loss: 0.0942 (0.0958) time: 2.8841 data: 0.0076 max mem: 33301 Epoch: [27] [ 560/4276] eta: 3:01:21 lr: 1.8017740923384326e-05 loss: 0.0958 (0.0959) time: 2.8869 data: 0.0074 max mem: 33301 Epoch: [27] [ 570/4276] eta: 3:00:50 lr: 1.801479398566531e-05 loss: 0.0922 (0.0958) time: 2.8867 data: 0.0070 max mem: 33301 Epoch: [27] [ 580/4276] eta: 3:00:18 lr: 1.801184699438168e-05 loss: 0.0922 (0.0959) time: 2.8904 data: 0.0072 max mem: 33301 Epoch: [27] [ 590/4276] eta: 2:59:49 lr: 1.800889994952272e-05 loss: 0.0813 (0.0956) time: 2.9089 data: 0.0075 max mem: 33301 Epoch: [27] [ 600/4276] eta: 2:59:18 lr: 1.800595285107772e-05 loss: 0.0785 (0.0954) time: 2.9148 data: 0.0077 max mem: 33301 Epoch: [27] [ 610/4276] eta: 2:58:47 lr: 1.8003005699035953e-05 loss: 0.0836 (0.0953) time: 2.9004 data: 0.0074 max mem: 33301 Epoch: [27] [ 620/4276] eta: 2:58:16 
lr: 1.80000584933867e-05 loss: 0.0867 (0.0952) time: 2.8973 data: 0.0071 max mem: 33301 Epoch: [27] [ 630/4276] eta: 2:57:45 lr: 1.799711123411923e-05 loss: 0.0963 (0.0955) time: 2.9001 data: 0.0074 max mem: 33301 Epoch: [27] [ 640/4276] eta: 2:57:14 lr: 1.7994163921222815e-05 loss: 0.1028 (0.0956) time: 2.8959 data: 0.0073 max mem: 33301 Epoch: [27] [ 650/4276] eta: 2:56:43 lr: 1.799121655468671e-05 loss: 0.0907 (0.0956) time: 2.8890 data: 0.0071 max mem: 33301 Epoch: [27] [ 660/4276] eta: 2:56:15 lr: 1.7988269134500183e-05 loss: 0.0961 (0.0957) time: 2.9219 data: 0.0073 max mem: 33301 Epoch: [27] [ 670/4276] eta: 2:55:46 lr: 1.798532166065249e-05 loss: 0.0978 (0.0956) time: 2.9371 data: 0.0078 max mem: 33301 Epoch: [27] [ 680/4276] eta: 2:55:19 lr: 1.7982374133132876e-05 loss: 0.0871 (0.0956) time: 2.9468 data: 0.0073 max mem: 33301 Epoch: [27] [ 690/4276] eta: 2:54:48 lr: 1.797942655193059e-05 loss: 0.0871 (0.0956) time: 2.9335 data: 0.0065 max mem: 33301 Epoch: [27] [ 700/4276] eta: 2:54:17 lr: 1.7976478917034877e-05 loss: 0.0850 (0.0954) time: 2.8892 data: 0.0067 max mem: 33301 Epoch: [27] [ 710/4276] eta: 2:53:46 lr: 1.7973531228434985e-05 loss: 0.0803 (0.0953) time: 2.8858 data: 0.0066 max mem: 33301 Epoch: [27] [ 720/4276] eta: 2:53:16 lr: 1.797058348612013e-05 loss: 0.0868 (0.0952) time: 2.8963 data: 0.0069 max mem: 33301 Epoch: [27] [ 730/4276] eta: 2:52:45 lr: 1.7967635690079554e-05 loss: 0.0883 (0.0953) time: 2.9007 data: 0.0075 max mem: 33301 Epoch: [27] [ 740/4276] eta: 2:52:16 lr: 1.7964687840302483e-05 loss: 0.0894 (0.0954) time: 2.9098 data: 0.0075 max mem: 33301 Epoch: [27] [ 750/4276] eta: 2:51:49 lr: 1.7961739936778146e-05 loss: 0.0896 (0.0954) time: 2.9451 data: 0.0075 max mem: 33301 Epoch: [27] [ 760/4276] eta: 2:51:21 lr: 1.7958791979495748e-05 loss: 0.0896 (0.0954) time: 2.9673 data: 0.0076 max mem: 33301 Epoch: [27] [ 770/4276] eta: 2:50:51 lr: 1.795584396844451e-05 loss: 0.0928 (0.0954) time: 2.9341 data: 0.0087 max mem: 33301 Epoch: [27] 
[ 780/4276] eta: 2:50:20 lr: 1.7952895903613642e-05 loss: 0.0840 (0.0953) time: 2.8950 data: 0.0097 max mem: 33301 Epoch: [27] [ 790/4276] eta: 2:49:50 lr: 1.794994778499236e-05 loss: 0.0954 (0.0953) time: 2.8913 data: 0.0098 max mem: 33301 Epoch: [27] [ 800/4276] eta: 2:49:19 lr: 1.7946999612569847e-05 loss: 0.0974 (0.0955) time: 2.8912 data: 0.0098 max mem: 33301 Epoch: [27] [ 810/4276] eta: 2:48:48 lr: 1.7944051386335308e-05 loss: 0.0951 (0.0955) time: 2.8904 data: 0.0098 max mem: 33301 Epoch: [27] [ 820/4276] eta: 2:48:18 lr: 1.7941103106277944e-05 loss: 0.0871 (0.0954) time: 2.8905 data: 0.0100 max mem: 33301 Epoch: [27] [ 830/4276] eta: 2:47:48 lr: 1.7938154772386944e-05 loss: 0.0865 (0.0954) time: 2.8940 data: 0.0101 max mem: 33301 Epoch: [27] [ 840/4276] eta: 2:47:17 lr: 1.793520638465148e-05 loss: 0.1011 (0.0955) time: 2.8968 data: 0.0099 max mem: 33301 Epoch: [27] [ 850/4276] eta: 2:46:47 lr: 1.7932257943060744e-05 loss: 0.0960 (0.0955) time: 2.8948 data: 0.0098 max mem: 33301 Epoch: [27] [ 860/4276] eta: 2:46:17 lr: 1.7929309447603908e-05 loss: 0.0946 (0.0956) time: 2.9019 data: 0.0103 max mem: 33301 Epoch: [27] [ 870/4276] eta: 2:45:49 lr: 1.792636089827016e-05 loss: 0.0855 (0.0954) time: 2.9275 data: 0.0099 max mem: 33301 Epoch: [27] [ 880/4276] eta: 2:45:20 lr: 1.7923412295048648e-05 loss: 0.0755 (0.0954) time: 2.9353 data: 0.0095 max mem: 33301 Epoch: [27] [ 890/4276] eta: 2:44:52 lr: 1.7920463637928544e-05 loss: 0.0998 (0.0956) time: 2.9340 data: 0.0086 max mem: 33301 Epoch: [27] [ 900/4276] eta: 2:44:23 lr: 1.7917514926899014e-05 loss: 0.0959 (0.0956) time: 2.9420 data: 0.0080 max mem: 33301 Epoch: [27] [ 910/4276] eta: 2:43:55 lr: 1.7914566161949213e-05 loss: 0.0914 (0.0957) time: 2.9415 data: 0.0081 max mem: 33301 Epoch: [27] [ 920/4276] eta: 2:43:26 lr: 1.7911617343068286e-05 loss: 0.0951 (0.0959) time: 2.9363 data: 0.0085 max mem: 33301 Epoch: [27] [ 930/4276] eta: 2:42:56 lr: 1.7908668470245387e-05 loss: 0.1000 (0.0958) time: 2.9241 data: 
0.0090 max mem: 33301 Epoch: [27] [ 940/4276] eta: 2:42:28 lr: 1.7905719543469653e-05 loss: 0.0897 (0.0958) time: 2.9291 data: 0.0089 max mem: 33301 Epoch: [27] [ 950/4276] eta: 2:41:59 lr: 1.790277056273024e-05 loss: 0.1007 (0.0960) time: 2.9332 data: 0.0094 max mem: 33301 Epoch: [27] [ 960/4276] eta: 2:41:30 lr: 1.7899821528016265e-05 loss: 0.0968 (0.0959) time: 2.9292 data: 0.0095 max mem: 33301 Epoch: [27] [ 970/4276] eta: 2:41:01 lr: 1.7896872439316865e-05 loss: 0.0935 (0.0960) time: 2.9359 data: 0.0083 max mem: 33301 Epoch: [27] [ 980/4276] eta: 2:40:33 lr: 1.7893923296621175e-05 loss: 0.0978 (0.0961) time: 2.9369 data: 0.0080 max mem: 33301 Epoch: [27] [ 990/4276] eta: 2:40:03 lr: 1.7890974099918308e-05 loss: 0.0875 (0.0959) time: 2.9252 data: 0.0083 max mem: 33301 Epoch: [27] [1000/4276] eta: 2:39:33 lr: 1.7888024849197385e-05 loss: 0.0807 (0.0959) time: 2.9065 data: 0.0084 max mem: 33301 Epoch: [27] [1010/4276] eta: 2:39:04 lr: 1.788507554444752e-05 loss: 0.0963 (0.0960) time: 2.9127 data: 0.0088 max mem: 33301 Epoch: [27] [1020/4276] eta: 2:38:35 lr: 1.7882126185657832e-05 loss: 0.0950 (0.0959) time: 2.9286 data: 0.0083 max mem: 33301 Epoch: [27] [1030/4276] eta: 2:38:06 lr: 1.7879176772817415e-05 loss: 0.0950 (0.0960) time: 2.9310 data: 0.0080 max mem: 33301 Epoch: [27] [1040/4276] eta: 2:37:37 lr: 1.787622730591538e-05 loss: 0.1045 (0.0961) time: 2.9311 data: 0.0080 max mem: 33301 Epoch: [27] [1050/4276] eta: 2:37:08 lr: 1.787327778494082e-05 loss: 0.0954 (0.0962) time: 2.9334 data: 0.0078 max mem: 33301 Epoch: [27] [1060/4276] eta: 2:36:39 lr: 1.7870328209882832e-05 loss: 0.0992 (0.0964) time: 2.9333 data: 0.0078 max mem: 33301 Epoch: [27] [1070/4276] eta: 2:36:11 lr: 1.78673785807305e-05 loss: 0.1001 (0.0965) time: 2.9357 data: 0.0076 max mem: 33301 Epoch: [27] [1080/4276] eta: 2:35:42 lr: 1.786442889747291e-05 loss: 0.0992 (0.0965) time: 2.9399 data: 0.0074 max mem: 33301 Epoch: [27] [1090/4276] eta: 2:35:13 lr: 1.7861479160099145e-05 loss: 0.1021 
(0.0966) time: 2.9409 data: 0.0075 max mem: 33301 Epoch: [27] [1100/4276] eta: 2:34:44 lr: 1.785852936859829e-05 loss: 0.1021 (0.0967) time: 2.9393 data: 0.0075 max mem: 33301 Epoch: [27] [1110/4276] eta: 2:34:16 lr: 1.7855579522959404e-05 loss: 0.1012 (0.0968) time: 2.9397 data: 0.0074 max mem: 33301 Epoch: [27] [1120/4276] eta: 2:33:47 lr: 1.785262962317156e-05 loss: 0.0974 (0.0969) time: 2.9454 data: 0.0074 max mem: 33301 Epoch: [27] [1130/4276] eta: 2:33:19 lr: 1.7849679669223824e-05 loss: 0.0885 (0.0968) time: 2.9493 data: 0.0077 max mem: 33301 Epoch: [27] [1140/4276] eta: 2:32:50 lr: 1.784672966110526e-05 loss: 0.0905 (0.0969) time: 2.9454 data: 0.0077 max mem: 33301 Epoch: [27] [1150/4276] eta: 2:32:21 lr: 1.7843779598804913e-05 loss: 0.0969 (0.0969) time: 2.9415 data: 0.0074 max mem: 33301 Epoch: [27] [1160/4276] eta: 2:31:52 lr: 1.7840829482311842e-05 loss: 0.1021 (0.0970) time: 2.9437 data: 0.0075 max mem: 33301 Epoch: [27] [1170/4276] eta: 2:31:24 lr: 1.783787931161509e-05 loss: 0.1086 (0.0970) time: 2.9445 data: 0.0075 max mem: 33301 Epoch: [27] [1180/4276] eta: 2:30:55 lr: 1.783492908670371e-05 loss: 0.0919 (0.0970) time: 2.9403 data: 0.0075 max mem: 33301 Epoch: [27] [1190/4276] eta: 2:30:26 lr: 1.7831978807566726e-05 loss: 0.0882 (0.0969) time: 2.9386 data: 0.0075 max mem: 33301 Epoch: [27] [1200/4276] eta: 2:29:57 lr: 1.782902847419318e-05 loss: 0.0844 (0.0969) time: 2.9414 data: 0.0074 max mem: 33301 Epoch: [27] [1210/4276] eta: 2:29:28 lr: 1.7826078086572105e-05 loss: 0.0825 (0.0967) time: 2.9432 data: 0.0075 max mem: 33301 Epoch: [27] [1220/4276] eta: 2:28:59 lr: 1.7823127644692524e-05 loss: 0.0845 (0.0968) time: 2.9441 data: 0.0075 max mem: 33301 Epoch: [27] [1230/4276] eta: 2:28:31 lr: 1.782017714854346e-05 loss: 0.0887 (0.0967) time: 2.9523 data: 0.0077 max mem: 33301 Epoch: [27] [1240/4276] eta: 2:28:03 lr: 1.7817226598113925e-05 loss: 0.0867 (0.0966) time: 2.9595 data: 0.0087 max mem: 33301 Epoch: [27] [1250/4276] eta: 2:27:34 lr: 
1.781427599339294e-05 loss: 0.0867 (0.0967) time: 2.9513 data: 0.0089 max mem: 33301 Epoch: [27] [1260/4276] eta: 2:27:04 lr: 1.7811325334369514e-05 loss: 0.0854 (0.0966) time: 2.9289 data: 0.0084 max mem: 33301 Epoch: [27] [1270/4276] eta: 2:26:34 lr: 1.7808374621032644e-05 loss: 0.0785 (0.0965) time: 2.9018 data: 0.0088 max mem: 33301 Epoch: [27] [1280/4276] eta: 2:26:04 lr: 1.780542385337134e-05 loss: 0.0915 (0.0965) time: 2.8892 data: 0.0088 max mem: 33301 Epoch: [27] [1290/4276] eta: 2:25:34 lr: 1.7802473031374596e-05 loss: 0.1072 (0.0966) time: 2.8915 data: 0.0086 max mem: 33301 Epoch: [27] [1300/4276] eta: 2:25:04 lr: 1.7799522155031395e-05 loss: 0.0837 (0.0966) time: 2.8944 data: 0.0088 max mem: 33301 Epoch: [27] [1310/4276] eta: 2:24:34 lr: 1.7796571224330736e-05 loss: 0.0815 (0.0965) time: 2.8906 data: 0.0086 max mem: 33301 Epoch: [27] [1320/4276] eta: 2:24:04 lr: 1.7793620239261596e-05 loss: 0.0948 (0.0965) time: 2.8886 data: 0.0084 max mem: 33301 Epoch: [27] [1330/4276] eta: 2:23:34 lr: 1.7790669199812962e-05 loss: 0.0905 (0.0964) time: 2.8896 data: 0.0089 max mem: 33301 Epoch: [27] [1340/4276] eta: 2:23:04 lr: 1.7787718105973798e-05 loss: 0.0864 (0.0964) time: 2.8918 data: 0.0091 max mem: 33301 Epoch: [27] [1350/4276] eta: 2:22:34 lr: 1.7784766957733086e-05 loss: 0.0887 (0.0964) time: 2.8916 data: 0.0086 max mem: 33301 Epoch: [27] [1360/4276] eta: 2:22:04 lr: 1.778181575507978e-05 loss: 0.0917 (0.0965) time: 2.8910 data: 0.0082 max mem: 33301 Epoch: [27] [1370/4276] eta: 2:21:35 lr: 1.7778864498002862e-05 loss: 0.1007 (0.0966) time: 2.9086 data: 0.0085 max mem: 33301 Epoch: [27] [1380/4276] eta: 2:21:05 lr: 1.7775913186491267e-05 loss: 0.0955 (0.0967) time: 2.9168 data: 0.0090 max mem: 33301 Epoch: [27] [1390/4276] eta: 2:20:35 lr: 1.7772961820533963e-05 loss: 0.1046 (0.0967) time: 2.8933 data: 0.0087 max mem: 33301 Epoch: [27] [1400/4276] eta: 2:20:05 lr: 1.777001040011989e-05 loss: 0.1017 (0.0968) time: 2.8752 data: 0.0090 max mem: 33301 Epoch: [27] 
[1410/4276] eta: 2:19:35 lr: 1.7767058925238008e-05 loss: 0.0916 (0.0968) time: 2.8760 data: 0.0093 max mem: 33301 Epoch: [27] [1420/4276] eta: 2:19:05 lr: 1.7764107395877244e-05 loss: 0.0877 (0.0967) time: 2.8783 data: 0.0089 max mem: 33301 Epoch: [27] [1430/4276] eta: 2:18:35 lr: 1.7761155812026538e-05 loss: 0.0930 (0.0968) time: 2.8773 data: 0.0089 max mem: 33301 Epoch: [27] [1440/4276] eta: 2:18:05 lr: 1.7758204173674826e-05 loss: 0.0932 (0.0967) time: 2.8793 data: 0.0089 max mem: 33301 Epoch: [27] [1450/4276] eta: 2:17:35 lr: 1.7755252480811037e-05 loss: 0.0880 (0.0967) time: 2.8811 data: 0.0093 max mem: 33301 Epoch: [27] [1460/4276] eta: 2:17:05 lr: 1.7752300733424083e-05 loss: 0.0844 (0.0967) time: 2.8829 data: 0.0095 max mem: 33301 Epoch: [27] [1470/4276] eta: 2:16:36 lr: 1.7749348931502892e-05 loss: 0.0848 (0.0966) time: 2.9043 data: 0.0096 max mem: 33301 Epoch: [27] [1480/4276] eta: 2:16:06 lr: 1.7746397075036384e-05 loss: 0.0917 (0.0966) time: 2.9192 data: 0.0097 max mem: 33301 Epoch: [27] [1490/4276] eta: 2:15:37 lr: 1.7743445164013463e-05 loss: 0.0862 (0.0965) time: 2.9113 data: 0.0092 max mem: 33301 Epoch: [27] [1500/4276] eta: 2:15:08 lr: 1.7740493198423036e-05 loss: 0.0861 (0.0965) time: 2.9130 data: 0.0091 max mem: 33301 Epoch: [27] [1510/4276] eta: 2:14:39 lr: 1.7737541178254004e-05 loss: 0.0948 (0.0965) time: 2.9413 data: 0.0089 max mem: 33301 Epoch: [27] [1520/4276] eta: 2:14:10 lr: 1.773458910349527e-05 loss: 0.0918 (0.0965) time: 2.9527 data: 0.0087 max mem: 33301 Epoch: [27] [1530/4276] eta: 2:13:41 lr: 1.7731636974135726e-05 loss: 0.0826 (0.0964) time: 2.9344 data: 0.0085 max mem: 33301 Epoch: [27] [1540/4276] eta: 2:13:12 lr: 1.7728684790164258e-05 loss: 0.0823 (0.0964) time: 2.9280 data: 0.0080 max mem: 33301 Epoch: [27] [1550/4276] eta: 2:12:43 lr: 1.772573255156975e-05 loss: 0.0856 (0.0964) time: 2.9269 data: 0.0080 max mem: 33301 Epoch: [27] [1560/4276] eta: 2:12:14 lr: 1.772278025834109e-05 loss: 0.0867 (0.0964) time: 2.9238 data: 
0.0080 max mem: 33301 Epoch: [27] [1570/4276] eta: 2:11:45 lr: 1.7719827910467146e-05 loss: 0.0942 (0.0963) time: 2.9226 data: 0.0078 max mem: 33301 Epoch: [27] [1580/4276] eta: 2:11:15 lr: 1.7716875507936793e-05 loss: 0.0872 (0.0963) time: 2.9226 data: 0.0078 max mem: 33301 Epoch: [27] [1590/4276] eta: 2:10:46 lr: 1.77139230507389e-05 loss: 0.0872 (0.0962) time: 2.9225 data: 0.0079 max mem: 33301 Epoch: [27] [1600/4276] eta: 2:10:17 lr: 1.7710970538862332e-05 loss: 0.0904 (0.0963) time: 2.9114 data: 0.0090 max mem: 33301 Epoch: [27] [1610/4276] eta: 2:09:47 lr: 1.770801797229594e-05 loss: 0.0816 (0.0962) time: 2.9113 data: 0.0093 max mem: 33301 Epoch: [27] [1620/4276] eta: 2:09:18 lr: 1.7705065351028585e-05 loss: 0.0818 (0.0962) time: 2.9240 data: 0.0080 max mem: 33301 Epoch: [27] [1630/4276] eta: 2:08:49 lr: 1.770211267504912e-05 loss: 0.0897 (0.0962) time: 2.9252 data: 0.0074 max mem: 33301 Epoch: [27] [1640/4276] eta: 2:08:20 lr: 1.769915994434639e-05 loss: 0.0802 (0.0961) time: 2.9237 data: 0.0074 max mem: 33301 Epoch: [27] [1650/4276] eta: 2:07:51 lr: 1.7696207158909226e-05 loss: 0.0768 (0.0960) time: 2.9179 data: 0.0073 max mem: 33301 Epoch: [27] [1660/4276] eta: 2:07:21 lr: 1.7693254318726473e-05 loss: 0.0816 (0.0960) time: 2.9160 data: 0.0082 max mem: 33301 Epoch: [27] [1670/4276] eta: 2:06:52 lr: 1.7690301423786967e-05 loss: 0.0824 (0.0959) time: 2.9264 data: 0.0083 max mem: 33301 Epoch: [27] [1680/4276] eta: 2:06:24 lr: 1.768734847407954e-05 loss: 0.0777 (0.0958) time: 2.9416 data: 0.0076 max mem: 33301 Epoch: [27] [1690/4276] eta: 2:05:54 lr: 1.7684395469593002e-05 loss: 0.0777 (0.0957) time: 2.9360 data: 0.0075 max mem: 33301 Epoch: [27] [1700/4276] eta: 2:05:25 lr: 1.768144241031618e-05 loss: 0.0815 (0.0957) time: 2.9221 data: 0.0073 max mem: 33301 Epoch: [27] [1710/4276] eta: 2:04:56 lr: 1.767848929623789e-05 loss: 0.0821 (0.0957) time: 2.9214 data: 0.0073 max mem: 33301 Epoch: [27] [1720/4276] eta: 2:04:27 lr: 1.7675536127346954e-05 loss: 0.0806 
(0.0956) time: 2.9266 data: 0.0082 max mem: 33301 Epoch: [27] [1730/4276] eta: 2:03:58 lr: 1.7672582903632158e-05 loss: 0.0762 (0.0955) time: 2.9271 data: 0.0087 max mem: 33301 Epoch: [27] [1740/4276] eta: 2:03:28 lr: 1.7669629625082316e-05 loss: 0.0813 (0.0954) time: 2.9223 data: 0.0083 max mem: 33301 Epoch: [27] [1750/4276] eta: 2:02:59 lr: 1.7666676291686223e-05 loss: 0.0851 (0.0954) time: 2.9230 data: 0.0083 max mem: 33301 Epoch: [27] [1760/4276] eta: 2:02:30 lr: 1.7663722903432682e-05 loss: 0.0851 (0.0953) time: 2.9208 data: 0.0086 max mem: 33301 Epoch: [27] [1770/4276] eta: 2:02:01 lr: 1.7660769460310467e-05 loss: 0.0885 (0.0953) time: 2.9201 data: 0.0084 max mem: 33301 Epoch: [27] [1780/4276] eta: 2:01:32 lr: 1.765781596230837e-05 loss: 0.0885 (0.0953) time: 2.9228 data: 0.0078 max mem: 33301 Epoch: [27] [1790/4276] eta: 2:01:02 lr: 1.7654862409415173e-05 loss: 0.0896 (0.0952) time: 2.9221 data: 0.0078 max mem: 33301 Epoch: [27] [1800/4276] eta: 2:00:33 lr: 1.765190880161966e-05 loss: 0.0865 (0.0952) time: 2.9229 data: 0.0086 max mem: 33301 Epoch: [27] [1810/4276] eta: 2:00:04 lr: 1.7648955138910585e-05 loss: 0.0999 (0.0953) time: 2.9231 data: 0.0093 max mem: 33301 Epoch: [27] [1820/4276] eta: 1:59:35 lr: 1.7646001421276725e-05 loss: 0.0999 (0.0952) time: 2.9174 data: 0.0086 max mem: 33301 Epoch: [27] [1830/4276] eta: 1:59:05 lr: 1.7643047648706843e-05 loss: 0.0948 (0.0952) time: 2.9184 data: 0.0083 max mem: 33301 Epoch: [27] [1840/4276] eta: 1:58:36 lr: 1.7640093821189706e-05 loss: 0.0859 (0.0951) time: 2.9230 data: 0.0084 max mem: 33301 Epoch: [27] [1850/4276] eta: 1:58:07 lr: 1.763713993871405e-05 loss: 0.0773 (0.0951) time: 2.9228 data: 0.0080 max mem: 33301 Epoch: [27] [1860/4276] eta: 1:57:38 lr: 1.763418600126863e-05 loss: 0.0934 (0.0951) time: 2.9221 data: 0.0078 max mem: 33301 Epoch: [27] [1870/4276] eta: 1:57:09 lr: 1.7631232008842206e-05 loss: 0.0934 (0.0951) time: 2.9219 data: 0.0078 max mem: 33301 Epoch: [27] [1880/4276] eta: 1:56:39 lr: 
1.7628277961423506e-05 loss: 0.0900 (0.0951) time: 2.9227 data: 0.0078 max mem: 33301 Epoch: [27] [1890/4276] eta: 1:56:10 lr: 1.7625323859001267e-05 loss: 0.0865 (0.0951) time: 2.9238 data: 0.0079 max mem: 33301 Epoch: [27] [1900/4276] eta: 1:55:41 lr: 1.762236970156422e-05 loss: 0.0762 (0.0950) time: 2.9252 data: 0.0082 max mem: 33301 Epoch: [27] [1910/4276] eta: 1:55:12 lr: 1.761941548910111e-05 loss: 0.0885 (0.0950) time: 2.9258 data: 0.0085 max mem: 33301 Epoch: [27] [1920/4276] eta: 1:54:43 lr: 1.7616461221600637e-05 loss: 0.0885 (0.0950) time: 2.9262 data: 0.0087 max mem: 33301 Epoch: [27] [1930/4276] eta: 1:54:14 lr: 1.761350689905153e-05 loss: 0.0870 (0.0949) time: 2.9286 data: 0.0085 max mem: 33301 Epoch: [27] [1940/4276] eta: 1:53:45 lr: 1.7610552521442504e-05 loss: 0.0925 (0.0950) time: 2.9306 data: 0.0083 max mem: 33301 Epoch: [27] [1950/4276] eta: 1:53:16 lr: 1.760759808876227e-05 loss: 0.0946 (0.0950) time: 2.9311 data: 0.0085 max mem: 33301 Epoch: [27] [1960/4276] eta: 1:52:46 lr: 1.7604643600999534e-05 loss: 0.0929 (0.0950) time: 2.9335 data: 0.0089 max mem: 33301 Epoch: [27] [1970/4276] eta: 1:52:17 lr: 1.7601689058142992e-05 loss: 0.0836 (0.0949) time: 2.9339 data: 0.0090 max mem: 33301 Epoch: [27] [1980/4276] eta: 1:51:48 lr: 1.7598734460181346e-05 loss: 0.0765 (0.0949) time: 2.9332 data: 0.0088 max mem: 33301 Epoch: [27] [1990/4276] eta: 1:51:19 lr: 1.759577980710329e-05 loss: 0.0867 (0.0949) time: 2.9326 data: 0.0087 max mem: 33301 Epoch: [27] [2000/4276] eta: 1:50:50 lr: 1.7592825098897507e-05 loss: 0.0993 (0.0949) time: 2.9323 data: 0.0086 max mem: 33301 Epoch: [27] [2010/4276] eta: 1:50:21 lr: 1.7589870335552682e-05 loss: 0.0929 (0.0949) time: 2.9326 data: 0.0087 max mem: 33301 Epoch: [27] [2020/4276] eta: 1:49:52 lr: 1.7586915517057494e-05 loss: 0.0903 (0.0949) time: 2.9333 data: 0.0087 max mem: 33301 Epoch: [27] [2030/4276] eta: 1:49:23 lr: 1.7583960643400626e-05 loss: 0.0864 (0.0948) time: 2.9343 data: 0.0087 max mem: 33301 Epoch: [27] 
[2040/4276] eta: 1:48:54 lr: 1.7581005714570737e-05 loss: 0.0803 (0.0947) time: 2.9384 data: 0.0085 max mem: 33301 Epoch: [27] [2050/4276] eta: 1:48:25 lr: 1.7578050730556495e-05 loss: 0.0892 (0.0948) time: 2.9435 data: 0.0081 max mem: 33301 Epoch: [27] [2060/4276] eta: 1:47:56 lr: 1.7575095691346564e-05 loss: 0.0918 (0.0948) time: 2.9434 data: 0.0077 max mem: 33301 Epoch: [27] [2070/4276] eta: 1:47:27 lr: 1.7572140596929607e-05 loss: 0.0854 (0.0947) time: 2.9599 data: 0.0077 max mem: 33301 Epoch: [27] [2080/4276] eta: 1:46:58 lr: 1.756918544729426e-05 loss: 0.0838 (0.0948) time: 2.9601 data: 0.0075 max mem: 33301 Epoch: [27] [2090/4276] eta: 1:46:29 lr: 1.7566230242429185e-05 loss: 0.0896 (0.0948) time: 2.9360 data: 0.0070 max mem: 33301 Epoch: [27] [2100/4276] eta: 1:46:02 lr: 1.7563274982323022e-05 loss: 0.0904 (0.0948) time: 3.0449 data: 0.0077 max mem: 33301 Epoch: [27] [2110/4276] eta: 1:45:36 lr: 1.7560319666964416e-05 loss: 0.0969 (0.0948) time: 3.2024 data: 0.0084 max mem: 33301 Epoch: [27] [2120/4276] eta: 1:45:10 lr: 1.7557364296341985e-05 loss: 0.0892 (0.0947) time: 3.2437 data: 0.0084 max mem: 33301 Epoch: [27] [2130/4276] eta: 1:44:44 lr: 1.7554408870444373e-05 loss: 0.0845 (0.0947) time: 3.2512 data: 0.0083 max mem: 33301 Epoch: [27] [2140/4276] eta: 1:44:18 lr: 1.7551453389260202e-05 loss: 0.0829 (0.0946) time: 3.2669 data: 0.0084 max mem: 33301 Epoch: [27] [2150/4276] eta: 1:43:53 lr: 1.7548497852778102e-05 loss: 0.0761 (0.0945) time: 3.2719 data: 0.0083 max mem: 33301 Epoch: [27] [2160/4276] eta: 1:43:26 lr: 1.7545542260986672e-05 loss: 0.0747 (0.0944) time: 3.2661 data: 0.0082 max mem: 33301 Epoch: [27] [2170/4276] eta: 1:43:00 lr: 1.7542586613874534e-05 loss: 0.0821 (0.0944) time: 3.2630 data: 0.0083 max mem: 33301 Epoch: [27] [2180/4276] eta: 1:42:34 lr: 1.7539630911430304e-05 loss: 0.0917 (0.0944) time: 3.2647 data: 0.0083 max mem: 33301 Epoch: [27] [2190/4276] eta: 1:42:08 lr: 1.753667515364257e-05 loss: 0.0852 (0.0944) time: 3.2695 data: 
0.0084 max mem: 33301 Epoch: [27] [2200/4276] eta: 1:41:42 lr: 1.753371934049994e-05 loss: 0.0903 (0.0944) time: 3.2746 data: 0.0084 max mem: 33301 Epoch: [27] [2210/4276] eta: 1:41:16 lr: 1.7530763471991004e-05 loss: 0.0974 (0.0944) time: 3.2750 data: 0.0084 max mem: 33301 Epoch: [27] [2220/4276] eta: 1:40:49 lr: 1.7527807548104362e-05 loss: 0.0855 (0.0944) time: 3.2655 data: 0.0082 max mem: 33301 Epoch: [27] [2230/4276] eta: 1:40:22 lr: 1.7524851568828586e-05 loss: 0.0804 (0.0944) time: 3.2496 data: 0.0094 max mem: 33301 Epoch: [27] [2240/4276] eta: 1:39:55 lr: 1.752189553415226e-05 loss: 0.0751 (0.0943) time: 3.2213 data: 0.0103 max mem: 33301 Epoch: [27] [2250/4276] eta: 1:39:29 lr: 1.7518939444063968e-05 loss: 0.0770 (0.0942) time: 3.2254 data: 0.0102 max mem: 33301 Epoch: [27] [2260/4276] eta: 1:39:01 lr: 1.751598329855228e-05 loss: 0.0820 (0.0943) time: 3.2170 data: 0.0109 max mem: 33301 Epoch: [27] [2270/4276] eta: 1:38:34 lr: 1.7513027097605755e-05 loss: 0.0874 (0.0942) time: 3.1729 data: 0.0103 max mem: 33301 Epoch: [27] [2280/4276] eta: 1:38:06 lr: 1.7510070841212965e-05 loss: 0.0820 (0.0942) time: 3.1779 data: 0.0104 max mem: 33301 Epoch: [27] [2290/4276] eta: 1:37:39 lr: 1.750711452936246e-05 loss: 0.0899 (0.0942) time: 3.1987 data: 0.0110 max mem: 33301 Epoch: [27] [2300/4276] eta: 1:37:12 lr: 1.750415816204281e-05 loss: 0.0934 (0.0942) time: 3.2278 data: 0.0104 max mem: 33301 Epoch: [27] [2310/4276] eta: 1:36:45 lr: 1.7501201739242544e-05 loss: 0.0933 (0.0943) time: 3.2221 data: 0.0111 max mem: 33301 Epoch: [27] [2320/4276] eta: 1:36:17 lr: 1.7498245260950215e-05 loss: 0.0967 (0.0943) time: 3.1853 data: 0.0114 max mem: 33301 Epoch: [27] [2330/4276] eta: 1:35:49 lr: 1.749528872715437e-05 loss: 0.0945 (0.0943) time: 3.1818 data: 0.0099 max mem: 33301 Epoch: [27] [2340/4276] eta: 1:35:21 lr: 1.7492332137843543e-05 loss: 0.0945 (0.0943) time: 3.1240 data: 0.0089 max mem: 33301 Epoch: [27] [2350/4276] eta: 1:34:51 lr: 1.7489375493006254e-05 loss: 0.0932 
(0.0943) time: 2.9890 data: 0.0078 max mem: 33301 Epoch: [27] [2360/4276] eta: 1:34:21 lr: 1.748641879263104e-05 loss: 0.0806 (0.0943) time: 2.9358 data: 0.0076 max mem: 33301 Epoch: [27] [2370/4276] eta: 1:33:51 lr: 1.7483462036706422e-05 loss: 0.0976 (0.0943) time: 2.9070 data: 0.0083 max mem: 33301 Epoch: [27] [2380/4276] eta: 1:33:21 lr: 1.748050522522092e-05 loss: 0.0966 (0.0943) time: 2.8564 data: 0.0085 max mem: 33301 Epoch: [27] [2390/4276] eta: 1:32:50 lr: 1.747754835816304e-05 loss: 0.0943 (0.0943) time: 2.8564 data: 0.0084 max mem: 33301 Epoch: [27] [2400/4276] eta: 1:32:20 lr: 1.7474591435521298e-05 loss: 0.0908 (0.0943) time: 2.8582 data: 0.0086 max mem: 33301 Epoch: [27] [2410/4276] eta: 1:31:50 lr: 1.747163445728419e-05 loss: 0.1001 (0.0943) time: 2.8713 data: 0.0092 max mem: 33301 Epoch: [27] [2420/4276] eta: 1:31:20 lr: 1.7468677423440232e-05 loss: 0.0897 (0.0943) time: 2.8827 data: 0.0091 max mem: 33301 Epoch: [27] [2430/4276] eta: 1:30:50 lr: 1.74657203339779e-05 loss: 0.0955 (0.0944) time: 2.8755 data: 0.0089 max mem: 33301 Epoch: [27] [2440/4276] eta: 1:30:20 lr: 1.746276318888569e-05 loss: 0.0967 (0.0944) time: 2.8645 data: 0.0085 max mem: 33301 Epoch: [27] [2450/4276] eta: 1:29:49 lr: 1.745980598815209e-05 loss: 0.0947 (0.0944) time: 2.8612 data: 0.0081 max mem: 33301 Epoch: [27] [2460/4276] eta: 1:29:19 lr: 1.7456848731765593e-05 loss: 0.0961 (0.0944) time: 2.8614 data: 0.0083 max mem: 33301 Epoch: [27] [2470/4276] eta: 1:28:49 lr: 1.7453891419714655e-05 loss: 0.0978 (0.0945) time: 2.8617 data: 0.0085 max mem: 33301 Epoch: [27] [2480/4276] eta: 1:28:19 lr: 1.745093405198776e-05 loss: 0.1008 (0.0945) time: 2.8662 data: 0.0083 max mem: 33301 Epoch: [27] [2490/4276] eta: 1:27:49 lr: 1.744797662857338e-05 loss: 0.0951 (0.0945) time: 2.8802 data: 0.0086 max mem: 33301 Epoch: [27] [2500/4276] eta: 1:27:19 lr: 1.744501914945996e-05 loss: 0.0918 (0.0945) time: 2.8940 data: 0.0087 max mem: 33301 Epoch: [27] [2510/4276] eta: 1:26:49 lr: 
1.744206161463598e-05 loss: 0.0903 (0.0945) time: 2.8973 data: 0.0082 max mem: 33301 Epoch: [27] [2520/4276] eta: 1:26:19 lr: 1.743910402408988e-05 loss: 0.0889 (0.0945) time: 2.8960 data: 0.0080 max mem: 33301 Epoch: [27] [2530/4276] eta: 1:25:50 lr: 1.7436146377810118e-05 loss: 0.0771 (0.0944) time: 2.9113 data: 0.0078 max mem: 33301 Epoch: [27] [2540/4276] eta: 1:25:20 lr: 1.7433188675785133e-05 loss: 0.0821 (0.0945) time: 2.9071 data: 0.0083 max mem: 33301 Epoch: [27] [2550/4276] eta: 1:24:50 lr: 1.743023091800337e-05 loss: 0.0802 (0.0944) time: 2.8761 data: 0.0089 max mem: 33301 Epoch: [27] [2560/4276] eta: 1:24:20 lr: 1.7427273104453258e-05 loss: 0.0709 (0.0943) time: 2.8634 data: 0.0087 max mem: 33301 Epoch: [27] [2570/4276] eta: 1:23:50 lr: 1.7424315235123244e-05 loss: 0.0739 (0.0943) time: 2.8602 data: 0.0087 max mem: 33301 Epoch: [27] [2580/4276] eta: 1:23:20 lr: 1.742135731000173e-05 loss: 0.0773 (0.0942) time: 2.8765 data: 0.0082 max mem: 33301 Epoch: [27] [2590/4276] eta: 1:22:50 lr: 1.7418399329077155e-05 loss: 0.0796 (0.0942) time: 2.9044 data: 0.0076 max mem: 33301 Epoch: [27] [2600/4276] eta: 1:22:20 lr: 1.7415441292337936e-05 loss: 0.0804 (0.0942) time: 2.9170 data: 0.0081 max mem: 33301 Epoch: [27] [2610/4276] eta: 1:21:51 lr: 1.7412483199772487e-05 loss: 0.0742 (0.0941) time: 2.9174 data: 0.0081 max mem: 33301 Epoch: [27] [2620/4276] eta: 1:21:21 lr: 1.7409525051369206e-05 loss: 0.0799 (0.0941) time: 2.8969 data: 0.0081 max mem: 33301 Epoch: [27] [2630/4276] eta: 1:20:51 lr: 1.7406566847116503e-05 loss: 0.0956 (0.0941) time: 2.8720 data: 0.0087 max mem: 33301 Epoch: [27] [2640/4276] eta: 1:20:21 lr: 1.7403608587002773e-05 loss: 0.0808 (0.0940) time: 2.8654 data: 0.0087 max mem: 33301 Epoch: [27] [2650/4276] eta: 1:19:51 lr: 1.7400650271016422e-05 loss: 0.0798 (0.0940) time: 2.8599 data: 0.0087 max mem: 33301 Epoch: [27] [2660/4276] eta: 1:19:21 lr: 1.739769189914583e-05 loss: 0.0799 (0.0940) time: 2.8562 data: 0.0089 max mem: 33301 Epoch: [27] 
[2670/4276] eta: 1:18:51 lr: 1.7394733471379383e-05 loss: 0.0817 (0.0940) time: 2.8598 data: 0.0085 max mem: 33301 Epoch: [27] [2680/4276] eta: 1:18:21 lr: 1.7391774987705462e-05 loss: 0.1025 (0.0940) time: 2.8622 data: 0.0079 max mem: 33301 Epoch: [27] [2690/4276] eta: 1:17:51 lr: 1.738881644811245e-05 loss: 0.0807 (0.0940) time: 2.8574 data: 0.0077 max mem: 33301 Epoch: [27] [2700/4276] eta: 1:17:21 lr: 1.7385857852588706e-05 loss: 0.0801 (0.0940) time: 2.8689 data: 0.0085 max mem: 33301 Epoch: [27] [2710/4276] eta: 1:16:51 lr: 1.7382899201122603e-05 loss: 0.0795 (0.0939) time: 2.8891 data: 0.0093 max mem: 33301 Epoch: [27] [2720/4276] eta: 1:16:22 lr: 1.7379940493702507e-05 loss: 0.0760 (0.0938) time: 2.8985 data: 0.0097 max mem: 33301 Epoch: [27] [2730/4276] eta: 1:15:52 lr: 1.7376981730316775e-05 loss: 0.0790 (0.0938) time: 2.8961 data: 0.0098 max mem: 33301 Epoch: [27] [2740/4276] eta: 1:15:22 lr: 1.7374022910953753e-05 loss: 0.0916 (0.0938) time: 2.8752 data: 0.0092 max mem: 33301 Epoch: [27] [2750/4276] eta: 1:14:52 lr: 1.7371064035601793e-05 loss: 0.0923 (0.0939) time: 2.8624 data: 0.0087 max mem: 33301 Epoch: [27] [2760/4276] eta: 1:14:22 lr: 1.7368105104249245e-05 loss: 0.0878 (0.0939) time: 2.8644 data: 0.0085 max mem: 33301 Epoch: [27] [2770/4276] eta: 1:13:52 lr: 1.736514611688444e-05 loss: 0.0809 (0.0938) time: 2.8669 data: 0.0085 max mem: 33301 Epoch: [27] [2780/4276] eta: 1:13:22 lr: 1.7362187073495713e-05 loss: 0.0883 (0.0939) time: 2.8686 data: 0.0090 max mem: 33301 Epoch: [27] [2790/4276] eta: 1:12:53 lr: 1.7359227974071395e-05 loss: 0.0913 (0.0939) time: 2.8708 data: 0.0092 max mem: 33301 Epoch: [27] [2800/4276] eta: 1:12:23 lr: 1.7356268818599817e-05 loss: 0.0847 (0.0938) time: 2.8723 data: 0.0092 max mem: 33301 Epoch: [27] [2810/4276] eta: 1:11:53 lr: 1.7353309607069286e-05 loss: 0.0716 (0.0938) time: 2.8973 data: 0.0095 max mem: 33301 Epoch: [27] [2820/4276] eta: 1:11:24 lr: 1.735035033946813e-05 loss: 0.0779 (0.0937) time: 2.9174 data: 
0.0093 max mem: 33301 Epoch: [27] [2830/4276] eta: 1:10:54 lr: 1.7347391015784657e-05 loss: 0.0870 (0.0938) time: 2.9145 data: 0.0092 max mem: 33301 Epoch: [27] [2840/4276] eta: 1:10:25 lr: 1.7344431636007175e-05 loss: 0.0975 (0.0938) time: 2.9169 data: 0.0095 max mem: 33301 Epoch: [27] [2850/4276] eta: 1:09:55 lr: 1.734147220012398e-05 loss: 0.0975 (0.0938) time: 2.9310 data: 0.0096 max mem: 33301 Epoch: [27] [2860/4276] eta: 1:09:26 lr: 1.733851270812337e-05 loss: 0.0968 (0.0938) time: 2.9626 data: 0.0096 max mem: 33301 Epoch: [27] [2870/4276] eta: 1:08:56 lr: 1.7335553159993643e-05 loss: 0.0896 (0.0938) time: 2.9478 data: 0.0100 max mem: 33301 Epoch: [27] [2880/4276] eta: 1:08:27 lr: 1.733259355572309e-05 loss: 0.0896 (0.0938) time: 2.9214 data: 0.0099 max mem: 33301 Epoch: [27] [2890/4276] eta: 1:07:57 lr: 1.7329633895299983e-05 loss: 0.0884 (0.0938) time: 2.9188 data: 0.0093 max mem: 33301 Epoch: [27] [2900/4276] eta: 1:07:28 lr: 1.7326674178712605e-05 loss: 0.0757 (0.0937) time: 2.9067 data: 0.0092 max mem: 33301 Epoch: [27] [2910/4276] eta: 1:06:58 lr: 1.732371440594923e-05 loss: 0.0771 (0.0938) time: 2.9023 data: 0.0090 max mem: 33301 Epoch: [27] [2920/4276] eta: 1:06:28 lr: 1.732075457699814e-05 loss: 0.0908 (0.0938) time: 2.8990 data: 0.0087 max mem: 33301 Epoch: [27] [2930/4276] eta: 1:05:59 lr: 1.731779469184758e-05 loss: 0.0890 (0.0938) time: 2.9041 data: 0.0089 max mem: 33301 Epoch: [27] [2940/4276] eta: 1:05:29 lr: 1.7314834750485813e-05 loss: 0.0875 (0.0938) time: 2.9053 data: 0.0091 max mem: 33301 Epoch: [27] [2950/4276] eta: 1:05:00 lr: 1.7311874752901103e-05 loss: 0.0950 (0.0938) time: 2.8975 data: 0.0090 max mem: 33301 Epoch: [27] [2960/4276] eta: 1:04:30 lr: 1.7308914699081704e-05 loss: 0.0861 (0.0938) time: 2.8944 data: 0.0089 max mem: 33301 Epoch: [27] [2970/4276] eta: 1:04:00 lr: 1.7305954589015843e-05 loss: 0.0813 (0.0938) time: 2.8909 data: 0.0088 max mem: 33301 Epoch: [27] [2980/4276] eta: 1:03:31 lr: 1.7302994422691776e-05 loss: 0.0798 
(0.0938) time: 2.8894 data: 0.0084 max mem: 33301 Epoch: [27] [2990/4276] eta: 1:03:01 lr: 1.7300034200097734e-05 loss: 0.0782 (0.0938) time: 2.9122 data: 0.0084 max mem: 33301 Epoch: [27] [3000/4276] eta: 1:02:32 lr: 1.7297073921221958e-05 loss: 0.0796 (0.0937) time: 2.9383 data: 0.0088 max mem: 33301 Epoch: [27] [3010/4276] eta: 1:02:03 lr: 1.729411358605266e-05 loss: 0.0831 (0.0937) time: 2.9435 data: 0.0087 max mem: 33301 Epoch: [27] [3020/4276] eta: 1:01:33 lr: 1.729115319457807e-05 loss: 0.0912 (0.0937) time: 2.9323 data: 0.0088 max mem: 33301 Epoch: [27] [3030/4276] eta: 1:01:03 lr: 1.728819274678641e-05 loss: 0.0958 (0.0938) time: 2.9057 data: 0.0091 max mem: 33301 Epoch: [27] [3040/4276] eta: 1:00:34 lr: 1.7285232242665892e-05 loss: 0.1041 (0.0938) time: 2.8898 data: 0.0088 max mem: 33301 Epoch: [27] [3050/4276] eta: 1:00:04 lr: 1.7282271682204713e-05 loss: 0.0892 (0.0938) time: 2.8887 data: 0.0089 max mem: 33301 Epoch: [27] [3060/4276] eta: 0:59:35 lr: 1.7279311065391085e-05 loss: 0.0749 (0.0938) time: 2.8897 data: 0.0091 max mem: 33301 Epoch: [27] [3070/4276] eta: 0:59:05 lr: 1.7276350392213205e-05 loss: 0.0766 (0.0937) time: 2.8894 data: 0.0090 max mem: 33301 Epoch: [27] [3080/4276] eta: 0:58:35 lr: 1.7273389662659275e-05 loss: 0.0841 (0.0937) time: 2.8922 data: 0.0088 max mem: 33301 Epoch: [27] [3090/4276] eta: 0:58:06 lr: 1.7270428876717474e-05 loss: 0.0812 (0.0937) time: 2.9012 data: 0.0088 max mem: 33301 Epoch: [27] [3100/4276] eta: 0:57:36 lr: 1.7267468034375985e-05 loss: 0.0804 (0.0936) time: 2.9069 data: 0.0089 max mem: 33301 Epoch: [27] [3110/4276] eta: 0:57:07 lr: 1.7264507135623002e-05 loss: 0.0871 (0.0936) time: 2.9235 data: 0.0090 max mem: 33301 Epoch: [27] [3120/4276] eta: 0:56:38 lr: 1.7261546180446685e-05 loss: 0.0836 (0.0936) time: 2.9388 data: 0.0091 max mem: 33301 Epoch: [27] [3130/4276] eta: 0:56:08 lr: 1.7258585168835208e-05 loss: 0.0871 (0.0936) time: 2.9383 data: 0.0086 max mem: 33301 Epoch: [27] [3140/4276] eta: 0:55:39 lr: 
1.7255624100776745e-05 loss: 0.0886 (0.0936) time: 2.9389 data: 0.0083 max mem: 33301 Epoch: [27] [3150/4276] eta: 0:55:09 lr: 1.7252662976259452e-05 loss: 0.0999 (0.0936) time: 2.9394 data: 0.0087 max mem: 33301 Epoch: [27] [3160/4276] eta: 0:54:40 lr: 1.7249701795271477e-05 loss: 0.0999 (0.0936) time: 2.9388 data: 0.0091 max mem: 33301 Epoch: [27] [3170/4276] eta: 0:54:11 lr: 1.724674055780098e-05 loss: 0.0916 (0.0937) time: 2.9366 data: 0.0088 max mem: 33301 Epoch: [27] [3180/4276] eta: 0:53:41 lr: 1.7243779263836108e-05 loss: 0.0906 (0.0937) time: 2.9358 data: 0.0085 max mem: 33301 Epoch: [27] [3190/4276] eta: 0:53:12 lr: 1.7240817913365008e-05 loss: 0.0982 (0.0937) time: 2.9379 data: 0.0087 max mem: 33301 Epoch: [27] [3200/4276] eta: 0:52:42 lr: 1.72378565063758e-05 loss: 0.1051 (0.0938) time: 2.9384 data: 0.0086 max mem: 33301 Epoch: [27] [3210/4276] eta: 0:52:13 lr: 1.7234895042856634e-05 loss: 0.0892 (0.0938) time: 2.9394 data: 0.0084 max mem: 33301 Epoch: [27] [3220/4276] eta: 0:51:44 lr: 1.7231933522795627e-05 loss: 0.0848 (0.0938) time: 2.9401 data: 0.0083 max mem: 33301 Epoch: [27] [3230/4276] eta: 0:51:14 lr: 1.722897194618091e-05 loss: 0.0899 (0.0938) time: 2.9399 data: 0.0085 max mem: 33301 Epoch: [27] [3240/4276] eta: 0:50:45 lr: 1.7226010313000593e-05 loss: 0.0977 (0.0938) time: 2.9411 data: 0.0087 max mem: 33301 Epoch: [27] [3250/4276] eta: 0:50:15 lr: 1.7223048623242793e-05 loss: 0.1022 (0.0939) time: 2.9407 data: 0.0085 max mem: 33301 Epoch: [27] [3260/4276] eta: 0:49:46 lr: 1.722008687689562e-05 loss: 0.1025 (0.0939) time: 2.9413 data: 0.0082 max mem: 33301 Epoch: [27] [3270/4276] eta: 0:49:17 lr: 1.7217125073947182e-05 loss: 0.0968 (0.0939) time: 2.9480 data: 0.0086 max mem: 33301 Epoch: [27] [3280/4276] eta: 0:48:47 lr: 1.7214163214385564e-05 loss: 0.0950 (0.0939) time: 2.9430 data: 0.0088 max mem: 33301 Epoch: [27] [3290/4276] eta: 0:48:18 lr: 1.7211201298198872e-05 loss: 0.0953 (0.0939) time: 2.9445 data: 0.0084 max mem: 33301 Epoch: [27] 
[3300/4276] eta: 0:47:49 lr: 1.7208239325375193e-05 loss: 0.0961 (0.0940) time: 2.9567 data: 0.0082 max mem: 33301 Epoch: [27] [3310/4276] eta: 0:47:19 lr: 1.720527729590262e-05 loss: 0.1010 (0.0940) time: 2.9527 data: 0.0085 max mem: 33301 Epoch: [27] [3320/4276] eta: 0:46:50 lr: 1.720231520976921e-05 loss: 0.1041 (0.0941) time: 2.9540 data: 0.0089 max mem: 33301 Epoch: [27] [3330/4276] eta: 0:46:21 lr: 1.7199353066963062e-05 loss: 0.0954 (0.0941) time: 2.9605 data: 0.0085 max mem: 33301 Epoch: [27] [3340/4276] eta: 0:45:51 lr: 1.7196390867472235e-05 loss: 0.0937 (0.0941) time: 2.9619 data: 0.0081 max mem: 33301 Epoch: [27] [3350/4276] eta: 0:45:22 lr: 1.71934286112848e-05 loss: 0.0938 (0.0941) time: 2.9617 data: 0.0084 max mem: 33301 Epoch: [27] [3360/4276] eta: 0:44:53 lr: 1.7190466298388807e-05 loss: 0.0938 (0.0941) time: 2.9603 data: 0.0088 max mem: 33301 Epoch: [27] [3370/4276] eta: 0:44:23 lr: 1.7187503928772324e-05 loss: 0.0960 (0.0941) time: 2.9602 data: 0.0090 max mem: 33301 Epoch: [27] [3380/4276] eta: 0:43:54 lr: 1.7184541502423393e-05 loss: 0.0841 (0.0940) time: 2.9618 data: 0.0088 max mem: 33301 Epoch: [27] [3390/4276] eta: 0:43:24 lr: 1.7181579019330072e-05 loss: 0.0786 (0.0940) time: 2.9608 data: 0.0086 max mem: 33301 Epoch: [27] [3400/4276] eta: 0:42:55 lr: 1.717861647948039e-05 loss: 0.0991 (0.0941) time: 2.9599 data: 0.0091 max mem: 33301 Epoch: [27] [3410/4276] eta: 0:42:26 lr: 1.7175653882862387e-05 loss: 0.0930 (0.0941) time: 2.9612 data: 0.0094 max mem: 33301 Epoch: [27] [3420/4276] eta: 0:41:56 lr: 1.7172691229464106e-05 loss: 0.0911 (0.0941) time: 2.9610 data: 0.0090 max mem: 33301 Epoch: [27] [3430/4276] eta: 0:41:27 lr: 1.7169728519273558e-05 loss: 0.0991 (0.0941) time: 2.9600 data: 0.0088 max mem: 33301 Epoch: [27] [3440/4276] eta: 0:40:58 lr: 1.7166765752278774e-05 loss: 0.0875 (0.0941) time: 2.9578 data: 0.0092 max mem: 33301 Epoch: [27] [3450/4276] eta: 0:40:28 lr: 1.716380292846777e-05 loss: 0.0875 (0.0942) time: 2.9565 data: 0.0090 
max mem: 33301 Epoch: [27] [3460/4276] eta: 0:39:59 lr: 1.7160840047828562e-05 loss: 0.1034 (0.0942) time: 2.9516 data: 0.0080 max mem: 33301 Epoch: [27] [3470/4276] eta: 0:39:30 lr: 1.7157877110349153e-05 loss: 0.0863 (0.0941) time: 2.9411 data: 0.0073 max mem: 33301 Epoch: [27] [3480/4276] eta: 0:42:35 lr: 1.7154914116017546e-05 loss: 0.0866 (0.0942) time: 49.9892 data: 47.0606 max mem: 33301 Epoch: [27] [3490/4276] eta: 0:42:03 lr: 1.7151951064821738e-05 loss: 0.0959 (0.0942) time: 49.9934 data: 47.0612 max mem: 33301 Epoch: [27] [3500/4276] eta: 0:41:30 lr: 1.714898795674973e-05 loss: 0.0923 (0.0942) time: 2.9445 data: 0.0087 max mem: 33301 Epoch: [27] [3510/4276] eta: 0:40:57 lr: 1.7146024791789506e-05 loss: 0.0803 (0.0942) time: 2.9433 data: 0.0087 max mem: 33301 Epoch: [27] [3520/4276] eta: 0:40:25 lr: 1.7143061569929044e-05 loss: 0.0928 (0.0942) time: 2.9402 data: 0.0088 max mem: 33301 Epoch: [27] [3530/4276] eta: 0:39:52 lr: 1.7140098291156327e-05 loss: 0.0960 (0.0942) time: 2.9397 data: 0.0088 max mem: 33301 Epoch: [27] [3540/4276] eta: 0:39:19 lr: 1.713713495545934e-05 loss: 0.0868 (0.0942) time: 2.9412 data: 0.0084 max mem: 33301 Epoch: [27] [3550/4276] eta: 0:38:47 lr: 1.7134171562826034e-05 loss: 0.0868 (0.0942) time: 2.9415 data: 0.0085 max mem: 33301 Epoch: [27] [3560/4276] eta: 0:38:14 lr: 1.7131208113244383e-05 loss: 0.0910 (0.0942) time: 2.9414 data: 0.0089 max mem: 33301 Epoch: [27] [3570/4276] eta: 0:37:42 lr: 1.712824460670234e-05 loss: 0.1018 (0.0942) time: 2.9429 data: 0.0091 max mem: 33301 Epoch: [27] [3580/4276] eta: 0:37:09 lr: 1.712528104318787e-05 loss: 0.0862 (0.0942) time: 2.9440 data: 0.0087 max mem: 33301 Epoch: [27] [3590/4276] eta: 0:36:36 lr: 1.7122317422688916e-05 loss: 0.0862 (0.0942) time: 2.9438 data: 0.0085 max mem: 33301 Epoch: [27] [3600/4276] eta: 0:36:04 lr: 1.711935374519342e-05 loss: 0.0907 (0.0942) time: 2.9413 data: 0.0088 max mem: 33301 Epoch: [27] [3610/4276] eta: 0:35:31 lr: 1.7116390010689326e-05 loss: 0.0897 
(0.0942) time: 2.9425 data: 0.0090 max mem: 33301 Epoch: [27] [3620/4276] eta: 0:34:59 lr: 1.7113426219164573e-05 loss: 0.0821 (0.0942) time: 2.9449 data: 0.0087 max mem: 33301 Epoch: [27] [3630/4276] eta: 0:34:27 lr: 1.7110462370607083e-05 loss: 0.0821 (0.0942) time: 2.9439 data: 0.0083 max mem: 33301 Epoch: [27] [3640/4276] eta: 0:33:54 lr: 1.7107498465004784e-05 loss: 0.0765 (0.0941) time: 2.9437 data: 0.0083 max mem: 33301 Epoch: [27] [3650/4276] eta: 0:33:22 lr: 1.7104534502345597e-05 loss: 0.0813 (0.0941) time: 2.9436 data: 0.0084 max mem: 33301 Epoch: [27] [3660/4276] eta: 0:32:49 lr: 1.710157048261744e-05 loss: 0.0880 (0.0941) time: 2.9430 data: 0.0082 max mem: 33301 Epoch: [27] [3670/4276] eta: 0:32:17 lr: 1.7098606405808222e-05 loss: 0.0880 (0.0941) time: 2.9423 data: 0.0080 max mem: 33301 Epoch: [27] [3680/4276] eta: 0:31:44 lr: 1.7095642271905844e-05 loss: 0.0915 (0.0941) time: 2.9428 data: 0.0081 max mem: 33301 Epoch: [27] [3690/4276] eta: 0:31:12 lr: 1.7092678080898212e-05 loss: 0.0988 (0.0941) time: 2.9438 data: 0.0083 max mem: 33301 Epoch: [27] [3700/4276] eta: 0:30:40 lr: 1.7089713832773228e-05 loss: 0.0872 (0.0941) time: 2.9427 data: 0.0084 max mem: 33301 Epoch: [27] [3710/4276] eta: 0:30:07 lr: 1.7086749527518768e-05 loss: 0.0776 (0.0941) time: 2.9368 data: 0.0081 max mem: 33301 Epoch: [27] [3720/4276] eta: 0:29:35 lr: 1.7083785165122727e-05 loss: 0.0943 (0.0941) time: 2.9249 data: 0.0082 max mem: 33301 Epoch: [27] [3730/4276] eta: 0:29:03 lr: 1.708082074557299e-05 loss: 0.0957 (0.0941) time: 2.9197 data: 0.0084 max mem: 33301 Epoch: [27] [3740/4276] eta: 0:28:30 lr: 1.7077856268857423e-05 loss: 0.0874 (0.0941) time: 2.9251 data: 0.0086 max mem: 33301 Epoch: [27] [3750/4276] eta: 0:27:58 lr: 1.7074891734963905e-05 loss: 0.0908 (0.0941) time: 2.9350 data: 0.0083 max mem: 33301 Epoch: [27] [3760/4276] eta: 0:27:26 lr: 1.70719271438803e-05 loss: 0.0908 (0.0941) time: 2.9413 data: 0.0080 max mem: 33301 Epoch: [27] [3770/4276] eta: 0:26:54 lr: 
1.7068962495594473e-05 loss: 0.0783 (0.0941) time: 2.9419 data: 0.0079 max mem: 33301 Epoch: [27] [3780/4276] eta: 0:26:21 lr: 1.7065997790094275e-05 loss: 0.0783 (0.0941) time: 2.9408 data: 0.0077 max mem: 33301 Epoch: [27] [3790/4276] eta: 0:25:49 lr: 1.7063033027367563e-05 loss: 0.0809 (0.0940) time: 2.9402 data: 0.0076 max mem: 33301 Epoch: [27] [3800/4276] eta: 0:25:17 lr: 1.706006820740218e-05 loss: 0.0813 (0.0940) time: 2.9427 data: 0.0076 max mem: 33301 Epoch: [27] [3810/4276] eta: 0:24:45 lr: 1.7057103330185973e-05 loss: 0.0806 (0.0940) time: 2.9439 data: 0.0076 max mem: 33301 Epoch: [27] [3820/4276] eta: 0:24:13 lr: 1.705413839570677e-05 loss: 0.0848 (0.0940) time: 2.9445 data: 0.0077 max mem: 33301 Epoch: [27] [3830/4276] eta: 0:23:41 lr: 1.7051173403952408e-05 loss: 0.0878 (0.0940) time: 2.9453 data: 0.0077 max mem: 33301 Epoch: [27] [3840/4276] eta: 0:23:08 lr: 1.7048208354910717e-05 loss: 0.0871 (0.0940) time: 2.9452 data: 0.0079 max mem: 33301 Epoch: [27] [3850/4276] eta: 0:22:36 lr: 1.704524324856952e-05 loss: 0.0828 (0.0939) time: 2.9567 data: 0.0082 max mem: 33301 Epoch: [27] [3860/4276] eta: 0:22:04 lr: 1.7042278084916623e-05 loss: 0.0848 (0.0940) time: 2.9617 data: 0.0086 max mem: 33301 Epoch: [27] [3870/4276] eta: 0:21:32 lr: 1.703931286393985e-05 loss: 0.0880 (0.0939) time: 2.9610 data: 0.0089 max mem: 33301 Epoch: [27] [3880/4276] eta: 0:21:00 lr: 1.7036347585627006e-05 loss: 0.0839 (0.0939) time: 2.9756 data: 0.0081 max mem: 33301 Epoch: [27] [3890/4276] eta: 0:22:19 lr: 1.7033382249965893e-05 loss: 0.0914 (0.0939) time: 59.1112 data: 56.1521 max mem: 33301 Epoch: [27] [3900/4276] eta: 0:21:44 lr: 1.7030416856944303e-05 loss: 0.0945 (0.0939) time: 59.0882 data: 56.1530 max mem: 33301 Epoch: [27] [3910/4276] eta: 0:21:09 lr: 1.7027451406550036e-05 loss: 0.0855 (0.0939) time: 2.9432 data: 0.0096 max mem: 33301 Epoch: [27] [3920/4276] eta: 0:20:34 lr: 1.7024485898770873e-05 loss: 0.0798 (0.0939) time: 2.9470 data: 0.0094 max mem: 33301 Epoch: 
[27] [3930/4276] eta: 0:19:59 lr: 1.7021520333594603e-05 loss: 0.0923 (0.0939) time: 2.9456 data: 0.0084 max mem: 33301 Epoch: [27] [3940/4276] eta: 0:19:24 lr: 1.7018554711008998e-05 loss: 0.0910 (0.0939) time: 2.9467 data: 0.0084 max mem: 33301 Epoch: [27] [3950/4276] eta: 0:18:49 lr: 1.7015589031001833e-05 loss: 0.0788 (0.0939) time: 2.9475 data: 0.0088 max mem: 33301 Epoch: [27] [3960/4276] eta: 0:18:13 lr: 1.7012623293560874e-05 loss: 0.0788 (0.0939) time: 2.9494 data: 0.0087 max mem: 33301 Epoch: [27] [3970/4276] eta: 0:17:38 lr: 1.7009657498673893e-05 loss: 0.0879 (0.0939) time: 2.9340 data: 0.0080 max mem: 33301 Epoch: [27] [3980/4276] eta: 0:17:03 lr: 1.7006691646328632e-05 loss: 0.0923 (0.0939) time: 2.9076 data: 0.0081 max mem: 33301 Epoch: [27] [3990/4276] eta: 0:16:28 lr: 1.700372573651285e-05 loss: 0.0915 (0.0939) time: 2.9052 data: 0.0082 max mem: 33301 Epoch: [27] [4000/4276] eta: 0:15:53 lr: 1.70007597692143e-05 loss: 0.0828 (0.0939) time: 2.9081 data: 0.0079 max mem: 33301 Epoch: [27] [4010/4276] eta: 0:15:19 lr: 1.6997793744420726e-05 loss: 0.0800 (0.0939) time: 2.9137 data: 0.0082 max mem: 33301 Epoch: [27] [4020/4276] eta: 0:14:44 lr: 1.6994827662119854e-05 loss: 0.0914 (0.0939) time: 2.9238 data: 0.0085 max mem: 33301 Epoch: [27] [4030/4276] eta: 0:14:09 lr: 1.6991861522299423e-05 loss: 0.0808 (0.0939) time: 2.9137 data: 0.0082 max mem: 33301 Epoch: [27] [4040/4276] eta: 0:13:34 lr: 1.6988895324947167e-05 loss: 0.0898 (0.0939) time: 2.9047 data: 0.0080 max mem: 33301 Epoch: [27] [4050/4276] eta: 0:12:59 lr: 1.69859290700508e-05 loss: 0.0838 (0.0939) time: 2.9018 data: 0.0082 max mem: 33301 Epoch: [27] [4060/4276] eta: 0:12:24 lr: 1.6982962757598038e-05 loss: 0.0884 (0.0939) time: 2.9016 data: 0.0085 max mem: 33301 Epoch: [27] [4070/4276] eta: 0:11:50 lr: 1.69799963875766e-05 loss: 0.0912 (0.0939) time: 2.9260 data: 0.0086 max mem: 33301 Epoch: [27] [4080/4276] eta: 0:11:15 lr: 1.6977029959974196e-05 loss: 0.0843 (0.0939) time: 2.9281 data: 
0.0090 max mem: 33301 Epoch: [27] [4090/4276] eta: 0:10:40 lr: 1.6974063474778524e-05 loss: 0.0913 (0.0939) time: 2.9265 data: 0.0093 max mem: 33301 Epoch: [27] [4100/4276] eta: 0:10:06 lr: 1.697109693197728e-05 loss: 0.1029 (0.0939) time: 2.9224 data: 0.0097 max mem: 33301 Epoch: [27] [4110/4276] eta: 0:09:31 lr: 1.6968130331558155e-05 loss: 0.1029 (0.0940) time: 2.9228 data: 0.0095 max mem: 33301 Epoch: [27] [4120/4276] eta: 0:08:56 lr: 1.696516367350885e-05 loss: 0.0968 (0.0940) time: 2.9258 data: 0.0091 max mem: 33301 Epoch: [27] [4130/4276] eta: 0:08:22 lr: 1.6962196957817034e-05 loss: 0.0830 (0.0939) time: 2.9277 data: 0.0093 max mem: 33301 Epoch: [27] [4140/4276] eta: 0:07:47 lr: 1.6959230184470387e-05 loss: 0.0830 (0.0939) time: 2.9395 data: 0.0092 max mem: 33301 Epoch: [27] [4150/4276] eta: 0:07:13 lr: 1.6956263353456584e-05 loss: 0.0868 (0.0939) time: 2.9171 data: 0.0089 max mem: 33301 Epoch: [27] [4160/4276] eta: 0:06:38 lr: 1.6953296464763297e-05 loss: 0.0835 (0.0939) time: 2.9156 data: 0.0088 max mem: 33301 Epoch: [27] [4170/4276] eta: 0:06:04 lr: 1.695032951837818e-05 loss: 0.0908 (0.0940) time: 2.9538 data: 0.0090 max mem: 33301 Epoch: [27] [4180/4276] eta: 0:05:29 lr: 1.694736251428889e-05 loss: 0.0848 (0.0939) time: 2.9748 data: 0.0090 max mem: 33301 Epoch: [27] [4190/4276] eta: 0:04:55 lr: 1.6944395452483086e-05 loss: 0.0818 (0.0940) time: 2.9634 data: 0.0087 max mem: 33301 Epoch: [27] [4200/4276] eta: 0:04:20 lr: 1.694142833294842e-05 loss: 0.0989 (0.0940) time: 2.9516 data: 0.0084 max mem: 33301 Epoch: [27] [4210/4276] eta: 0:03:46 lr: 1.693846115567252e-05 loss: 0.0970 (0.0940) time: 2.9639 data: 0.0082 max mem: 33301 Epoch: [27] [4220/4276] eta: 0:03:12 lr: 1.693549392064303e-05 loss: 0.0970 (0.0940) time: 2.9784 data: 0.0084 max mem: 33301 Epoch: [27] [4230/4276] eta: 0:02:37 lr: 1.6932526627847582e-05 loss: 0.1013 (0.0941) time: 2.9637 data: 0.0087 max mem: 33301 Epoch: [27] [4240/4276] eta: 0:02:03 lr: 1.692955927727381e-05 loss: 0.1013 
(0.0941) time: 2.9486 data: 0.0086 max mem: 33301 Epoch: [27] [4250/4276] eta: 0:01:29 lr: 1.6926591868909325e-05 loss: 0.0947 (0.0941) time: 2.9607 data: 0.0085 max mem: 33301 Epoch: [27] [4260/4276] eta: 0:00:54 lr: 1.6923624402741748e-05 loss: 0.0871 (0.0941) time: 2.9879 data: 0.0085 max mem: 33301 Epoch: [27] [4270/4276] eta: 0:00:20 lr: 1.6920656878758693e-05 loss: 0.0960 (0.0941) time: 2.9740 data: 0.0079 max mem: 33301 Epoch: [27] Total time: 4:03:58 Test: [ 0/21770] eta: 8:38:12 time: 1.4282 data: 1.3899 max mem: 33301 Test: [ 100/21770] eta: 0:19:00 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 200/21770] eta: 0:16:28 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 300/21770] eta: 0:15:27 time: 0.0378 data: 0.0010 max mem: 33301 Test: [ 400/21770] eta: 0:14:54 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 500/21770] eta: 0:14:33 time: 0.0377 data: 0.0009 max mem: 33301 Test: [ 600/21770] eta: 0:14:17 time: 0.0379 data: 0.0009 max mem: 33301 Test: [ 700/21770] eta: 0:14:05 time: 0.0378 data: 0.0009 max mem: 33301 Test: [ 800/21770] eta: 0:13:56 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 900/21770] eta: 0:13:49 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 1000/21770] eta: 0:13:42 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 1100/21770] eta: 0:13:36 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 1200/21770] eta: 0:13:30 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 1300/21770] eta: 0:13:25 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 1400/21770] eta: 0:13:19 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 1500/21770] eta: 0:13:14 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 1600/21770] eta: 0:13:09 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 1700/21770] eta: 0:13:04 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 1800/21770] eta: 0:12:59 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 1900/21770] eta: 0:12:55 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 2000/21770] eta: 0:12:50 time: 0.0382 data: 0.0009 max mem: 
33301 Test: [ 2100/21770] eta: 0:12:45 time: 0.0381 data: 0.0009 max mem: 33301 Test: [ 2200/21770] eta: 0:12:41 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 2300/21770] eta: 0:12:36 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 2400/21770] eta: 0:12:32 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 2500/21770] eta: 0:12:28 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 2600/21770] eta: 0:12:24 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 2700/21770] eta: 0:12:20 time: 0.0384 data: 0.0009 max mem: 33301 Test: [ 2800/21770] eta: 0:12:16 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 2900/21770] eta: 0:12:12 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 3000/21770] eta: 0:12:08 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 3100/21770] eta: 0:12:04 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 3200/21770] eta: 0:12:00 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 3300/21770] eta: 0:11:56 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 3400/21770] eta: 0:11:53 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 3500/21770] eta: 0:11:49 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 3600/21770] eta: 0:11:45 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 3700/21770] eta: 0:11:41 time: 0.0385 data: 0.0009 max mem: 33301 Test: [ 3800/21770] eta: 0:11:37 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 3900/21770] eta: 0:11:33 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 4000/21770] eta: 0:11:29 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 4100/21770] eta: 0:11:25 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 4200/21770] eta: 0:11:21 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 4300/21770] eta: 0:11:17 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 4400/21770] eta: 0:11:14 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 4500/21770] eta: 0:11:10 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 4600/21770] eta: 0:11:06 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 4700/21770] eta: 0:11:02 time: 0.0386 data: 0.0009 max mem: 
33301 Test: [ 4800/21770] eta: 0:10:58 time: 0.0383 data: 0.0009 max mem: 33301 Test: [ 4900/21770] eta: 0:10:54 time: 0.0386 data: 0.0009 max mem: 33301 Test: [ 5000/21770] eta: 0:10:50 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 5100/21770] eta: 0:10:46 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 5200/21770] eta: 0:10:42 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 5300/21770] eta: 0:10:38 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 5400/21770] eta: 0:10:34 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 5500/21770] eta: 0:10:31 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 5600/21770] eta: 0:10:27 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 5700/21770] eta: 0:10:23 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 5800/21770] eta: 0:10:19 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 5900/21770] eta: 0:10:15 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 6000/21770] eta: 0:10:11 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 6100/21770] eta: 0:10:07 time: 0.0388 data: 0.0008 max mem: 33301 Test: [ 6200/21770] eta: 0:10:03 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 6300/21770] eta: 0:10:00 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 6400/21770] eta: 0:09:56 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 6500/21770] eta: 0:09:52 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 6600/21770] eta: 0:09:48 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 6700/21770] eta: 0:09:44 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 6800/21770] eta: 0:09:40 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 6900/21770] eta: 0:09:36 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 7000/21770] eta: 0:09:33 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 7100/21770] eta: 0:09:29 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 7200/21770] eta: 0:09:25 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 7300/21770] eta: 0:09:21 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 7400/21770] eta: 0:09:17 time: 0.0387 data: 0.0009 max mem: 
33301 Test: [ 7500/21770] eta: 0:09:13 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 7600/21770] eta: 0:09:09 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 7700/21770] eta: 0:09:05 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 7800/21770] eta: 0:09:02 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 7900/21770] eta: 0:08:58 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 8000/21770] eta: 0:08:54 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 8100/21770] eta: 0:08:50 time: 0.0391 data: 0.0009 max mem: 33301 Test: [ 8200/21770] eta: 0:08:46 time: 0.0395 data: 0.0009 max mem: 33301 Test: [ 8300/21770] eta: 0:08:43 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 8400/21770] eta: 0:08:39 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 8500/21770] eta: 0:08:35 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 8600/21770] eta: 0:08:31 time: 0.0387 data: 0.0009 max mem: 33301 Test: [ 8700/21770] eta: 0:08:27 time: 0.0392 data: 0.0008 max mem: 33301 Test: [ 8800/21770] eta: 0:08:23 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 8900/21770] eta: 0:08:19 time: 0.0390 data: 0.0009 max mem: 33301 Test: [ 9000/21770] eta: 0:08:16 time: 0.0388 data: 0.0009 max mem: 33301 Test: [ 9100/21770] eta: 0:08:12 time: 0.0392 data: 0.0009 max mem: 33301 Test: [ 9200/21770] eta: 0:08:08 time: 0.0389 data: 0.0009 max mem: 33301 Test: [ 9300/21770] eta: 0:08:04 time: 0.0390 data: 0.0008 max mem: 33301 Test: [ 9400/21770] eta: 0:08:00 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 9500/21770] eta: 0:07:56 time: 0.0391 data: 0.0008 max mem: 33301 Test: [ 9600/21770] eta: 0:07:52 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 9700/21770] eta: 0:07:48 time: 0.0382 data: 0.0009 max mem: 33301 Test: [ 9800/21770] eta: 0:07:44 time: 0.0380 data: 0.0009 max mem: 33301 Test: [ 9900/21770] eta: 0:07:40 time: 0.0382 data: 0.0009 max mem: 33301 Test: [10000/21770] eta: 0:07:37 time: 0.0384 data: 0.0008 max mem: 33301 Test: [10100/21770] eta: 0:07:33 time: 0.0387 data: 0.0008 max mem: 
33301 Test: [10200/21770] eta: 0:07:29 time: 0.0395 data: 0.0008 max mem: 33301 Test: [10300/21770] eta: 0:07:25 time: 0.0400 data: 0.0009 max mem: 33301 Test: [10400/21770] eta: 0:07:21 time: 0.0393 data: 0.0009 max mem: 33301 Test: [10500/21770] eta: 0:07:17 time: 0.0379 data: 0.0008 max mem: 33301 Test: [10600/21770] eta: 0:07:13 time: 0.0383 data: 0.0009 max mem: 33301 Test: [10700/21770] eta: 0:07:09 time: 0.0382 data: 0.0009 max mem: 33301 Test: [10800/21770] eta: 0:07:05 time: 0.0390 data: 0.0009 max mem: 33301 Test: [10900/21770] eta: 0:07:02 time: 0.0398 data: 0.0009 max mem: 33301 Test: [11000/21770] eta: 0:06:58 time: 0.0391 data: 0.0009 max mem: 33301 Test: [11100/21770] eta: 0:06:54 time: 0.0386 data: 0.0009 max mem: 33301 Test: [11200/21770] eta: 0:06:50 time: 0.0389 data: 0.0009 max mem: 33301 Test: [11300/21770] eta: 0:06:46 time: 0.0394 data: 0.0009 max mem: 33301 Test: [11400/21770] eta: 0:06:42 time: 0.0389 data: 0.0008 max mem: 33301 Test: [11500/21770] eta: 0:06:38 time: 0.0392 data: 0.0009 max mem: 33301 Test: [11600/21770] eta: 0:06:35 time: 0.0396 data: 0.0009 max mem: 33301 Test: [11700/21770] eta: 0:06:31 time: 0.0394 data: 0.0009 max mem: 33301 Test: [11800/21770] eta: 0:06:27 time: 0.0393 data: 0.0008 max mem: 33301 Test: [11900/21770] eta: 0:06:23 time: 0.0390 data: 0.0008 max mem: 33301 Test: [12000/21770] eta: 0:06:19 time: 0.0381 data: 0.0009 max mem: 33301 Test: [12100/21770] eta: 0:06:15 time: 0.0383 data: 0.0009 max mem: 33301 Test: [12200/21770] eta: 0:06:11 time: 0.0383 data: 0.0009 max mem: 33301 Test: [12300/21770] eta: 0:06:07 time: 0.0388 data: 0.0008 max mem: 33301 Test: [12400/21770] eta: 0:06:03 time: 0.0388 data: 0.0008 max mem: 33301 Test: [12500/21770] eta: 0:06:00 time: 0.0389 data: 0.0008 max mem: 33301 Test: [12600/21770] eta: 0:05:56 time: 0.0386 data: 0.0008 max mem: 33301 Test: [12700/21770] eta: 0:05:52 time: 0.0382 data: 0.0008 max mem: 33301 Test: [12800/21770] eta: 0:05:48 time: 0.0378 data: 0.0008 max mem: 
33301 Test: [12900/21770] eta: 0:05:44 time: 0.0379 data: 0.0009 max mem: 33301 Test: [13000/21770] eta: 0:05:40 time: 0.0379 data: 0.0008 max mem: 33301 Test: [13100/21770] eta: 0:05:36 time: 0.0378 data: 0.0009 max mem: 33301 Test: [13200/21770] eta: 0:05:32 time: 0.0378 data: 0.0009 max mem: 33301 Test: [13300/21770] eta: 0:05:28 time: 0.0379 data: 0.0009 max mem: 33301 Test: [13400/21770] eta: 0:05:24 time: 0.0379 data: 0.0009 max mem: 33301 Test: [13500/21770] eta: 0:05:20 time: 0.0378 data: 0.0009 max mem: 33301 Test: [13600/21770] eta: 0:05:16 time: 0.0379 data: 0.0009 max mem: 33301 Test: [13700/21770] eta: 0:05:12 time: 0.0378 data: 0.0009 max mem: 33301 Test: [13800/21770] eta: 0:05:08 time: 0.0378 data: 0.0009 max mem: 33301 Test: [13900/21770] eta: 0:05:04 time: 0.0379 data: 0.0009 max mem: 33301 Test: [14000/21770] eta: 0:05:01 time: 0.0382 data: 0.0009 max mem: 33301 Test: [14100/21770] eta: 0:04:57 time: 0.0382 data: 0.0008 max mem: 33301 Test: [14200/21770] eta: 0:04:53 time: 0.0387 data: 0.0009 max mem: 33301 Test: [14300/21770] eta: 0:04:49 time: 0.0384 data: 0.0009 max mem: 33301 Test: [14400/21770] eta: 0:04:45 time: 0.0382 data: 0.0009 max mem: 33301 Test: [14500/21770] eta: 0:04:41 time: 0.0384 data: 0.0009 max mem: 33301 Test: [14600/21770] eta: 0:04:37 time: 0.0382 data: 0.0009 max mem: 33301 Test: [14700/21770] eta: 0:04:33 time: 0.0383 data: 0.0009 max mem: 33301 Test: [14800/21770] eta: 0:04:29 time: 0.0382 data: 0.0009 max mem: 33301 Test: [14900/21770] eta: 0:04:25 time: 0.0382 data: 0.0009 max mem: 33301 Test: [15000/21770] eta: 0:04:22 time: 0.0381 data: 0.0008 max mem: 33301 Test: [15100/21770] eta: 0:04:18 time: 0.0382 data: 0.0008 max mem: 33301 Test: [15200/21770] eta: 0:04:14 time: 0.0382 data: 0.0009 max mem: 33301 Test: [15300/21770] eta: 0:04:10 time: 0.0381 data: 0.0009 max mem: 33301 Test: [15400/21770] eta: 0:04:06 time: 0.0380 data: 0.0009 max mem: 33301 Test: [15500/21770] eta: 0:04:02 time: 0.0382 data: 0.0008 max mem: 
33301 Test: [15600/21770] eta: 0:03:58 time: 0.0381 data: 0.0009 max mem: 33301 Test: [15700/21770] eta: 0:03:54 time: 0.0383 data: 0.0009 max mem: 33301 Test: [15800/21770] eta: 0:03:50 time: 0.0381 data: 0.0009 max mem: 33301 Test: [15900/21770] eta: 0:03:47 time: 0.0382 data: 0.0008 max mem: 33301 Test: [16000/21770] eta: 0:03:43 time: 0.0382 data: 0.0009 max mem: 33301 Test: [16100/21770] eta: 0:03:39 time: 0.0382 data: 0.0009 max mem: 33301 Test: [16200/21770] eta: 0:03:35 time: 0.0380 data: 0.0008 max mem: 33301 Test: [16300/21770] eta: 0:03:31 time: 0.0382 data: 0.0008 max mem: 33301 Test: [16400/21770] eta: 0:03:27 time: 0.0380 data: 0.0008 max mem: 33301 Test: [16500/21770] eta: 0:03:23 time: 0.0385 data: 0.0009 max mem: 33301 Test: [16600/21770] eta: 0:03:19 time: 0.0388 data: 0.0009 max mem: 33301 Test: [16700/21770] eta: 0:03:16 time: 0.0385 data: 0.0009 max mem: 33301 Test: [16800/21770] eta: 0:03:12 time: 0.0386 data: 0.0008 max mem: 33301 Test: [16900/21770] eta: 0:03:08 time: 0.0384 data: 0.0009 max mem: 33301 Test: [17000/21770] eta: 0:03:04 time: 0.0385 data: 0.0009 max mem: 33301 Test: [17100/21770] eta: 0:03:00 time: 0.0387 data: 0.0009 max mem: 33301 Test: [17200/21770] eta: 0:02:56 time: 0.0388 data: 0.0008 max mem: 33301 Test: [17300/21770] eta: 0:02:52 time: 0.0381 data: 0.0009 max mem: 33301 Test: [17400/21770] eta: 0:02:48 time: 0.0385 data: 0.0009 max mem: 33301 Test: [17500/21770] eta: 0:02:45 time: 0.0381 data: 0.0009 max mem: 33301 Test: [17600/21770] eta: 0:02:41 time: 0.0384 data: 0.0009 max mem: 33301 Test: [17700/21770] eta: 0:02:37 time: 0.0383 data: 0.0009 max mem: 33301 Test: [17800/21770] eta: 0:02:33 time: 0.0386 data: 0.0009 max mem: 33301 Test: [17900/21770] eta: 0:02:29 time: 0.0383 data: 0.0009 max mem: 33301 Test: [18000/21770] eta: 0:02:25 time: 0.0385 data: 0.0009 max mem: 33301 Test: [18100/21770] eta: 0:02:21 time: 0.0383 data: 0.0009 max mem: 33301 Test: [18200/21770] eta: 0:02:17 time: 0.0385 data: 0.0009 max mem: 
33301 Test: [18300/21770] eta: 0:02:14 time: 0.0380 data: 0.0008 max mem: 33301 Test: [18400/21770] eta: 0:02:10 time: 0.0387 data: 0.0008 max mem: 33301 Test: [18500/21770] eta: 0:02:06 time: 0.0388 data: 0.0008 max mem: 33301 Test: [18600/21770] eta: 0:02:02 time: 0.0385 data: 0.0008 max mem: 33301 Test: [18700/21770] eta: 0:01:58 time: 0.0388 data: 0.0008 max mem: 33301 Test: [18800/21770] eta: 0:01:54 time: 0.0390 data: 0.0008 max mem: 33301 Test: [18900/21770] eta: 0:01:50 time: 0.0389 data: 0.0008 max mem: 33301 Test: [19000/21770] eta: 0:01:47 time: 0.0390 data: 0.0008 max mem: 33301 Test: [19100/21770] eta: 0:01:43 time: 0.0388 data: 0.0008 max mem: 33301 Test: [19200/21770] eta: 0:01:39 time: 0.0387 data: 0.0008 max mem: 33301 Test: [19300/21770] eta: 0:01:35 time: 0.0387 data: 0.0008 max mem: 33301 Test: [19400/21770] eta: 0:01:31 time: 0.0389 data: 0.0008 max mem: 33301 Test: [19500/21770] eta: 0:01:27 time: 0.0384 data: 0.0008 max mem: 33301 Test: [19600/21770] eta: 0:01:23 time: 0.0393 data: 0.0008 max mem: 33301 Test: [19700/21770] eta: 0:01:20 time: 0.0389 data: 0.0008 max mem: 33301 Test: [19800/21770] eta: 0:01:16 time: 0.0392 data: 0.0008 max mem: 33301 Test: [19900/21770] eta: 0:01:12 time: 0.0388 data: 0.0008 max mem: 33301 Test: [20000/21770] eta: 0:01:08 time: 0.0388 data: 0.0008 max mem: 33301 Test: [20100/21770] eta: 0:01:04 time: 0.0383 data: 0.0008 max mem: 33301 Test: [20200/21770] eta: 0:01:00 time: 0.0384 data: 0.0008 max mem: 33301 Test: [20300/21770] eta: 0:00:56 time: 0.0385 data: 0.0008 max mem: 33301 Test: [20400/21770] eta: 0:00:52 time: 0.0386 data: 0.0008 max mem: 33301 Test: [20500/21770] eta: 0:00:49 time: 0.0384 data: 0.0008 max mem: 33301 Test: [20600/21770] eta: 0:00:45 time: 0.0386 data: 0.0008 max mem: 33301 Test: [20700/21770] eta: 0:00:41 time: 0.0384 data: 0.0008 max mem: 33301 Test: [20800/21770] eta: 0:00:37 time: 0.0386 data: 0.0008 max mem: 33301 Test: [20900/21770] eta: 0:00:33 time: 0.0387 data: 0.0008 max mem: 
33301 Test: [21000/21770] eta: 0:00:29 time: 0.0390 data: 0.0008 max mem: 33301 Test: [21100/21770] eta: 0:00:25 time: 0.0389 data: 0.0008 max mem: 33301 Test: [21200/21770] eta: 0:00:22 time: 0.0391 data: 0.0009 max mem: 33301 Test: [21300/21770] eta: 0:00:18 time: 0.0397 data: 0.0009 max mem: 33301 Test: [21400/21770] eta: 0:00:14 time: 0.0387 data: 0.0008 max mem: 33301 Test: [21500/21770] eta: 0:00:10 time: 0.0389 data: 0.0008 max mem: 33301 Test: [21600/21770] eta: 0:00:06 time: 0.0387 data: 0.0008 max mem: 33301 Test: [21700/21770] eta: 0:00:02 time: 0.0387 data: 0.0008 max mem: 33301 Test: Total time: 0:14:01 Final results: Mean IoU is 0.00 precision@0.5 = 0.00 precision@0.6 = 0.00 precision@0.7 = 0.00 precision@0.8 = 0.00 precision@0.9 = 0.00 overall IoU = 0.00 mean IoU = 0.00 Mean accuracy for one-to-zero sample is 100.00 Average object IoU 0.0 Overall IoU 0.0 Epoch: [28] [ 0/4276] eta: 6:29:24 lr: 1.6918876336612275e-05 loss: 0.0710 (0.0710) time: 5.4641 data: 2.4603 max mem: 33301 Epoch: [28] [ 10/4276] eta: 3:43:28 lr: 1.6915908720098677e-05 loss: 0.1066 (0.0965) time: 3.1431 data: 0.2303 max mem: 33301 Epoch: [28] [ 20/4276] eta: 3:35:40 lr: 1.6912941045737364e-05 loss: 0.0900 (0.0960) time: 2.9193 data: 0.0073 max mem: 33301 Epoch: [28] [ 30/4276] eta: 3:33:27 lr: 1.6909973313515935e-05 loss: 0.0902 (0.0945) time: 2.9468 data: 0.0079 max mem: 33301 Epoch: [28] [ 40/4276] eta: 3:32:16 lr: 1.6907005523421973e-05 loss: 0.0891 (0.0914) time: 2.9714 data: 0.0082 max mem: 33301 Epoch: [28] [ 50/4276] eta: 3:31:10 lr: 1.690403767544307e-05 loss: 0.0831 (0.0902) time: 2.9700 data: 0.0077 max mem: 33301 Epoch: [28] [ 60/4276] eta: 3:30:00 lr: 1.69010697695668e-05 loss: 0.0774 (0.0886) time: 2.9519 data: 0.0075 max mem: 33301 Epoch: [28] [ 70/4276] eta: 3:29:08 lr: 1.689810180578073e-05 loss: 0.0746 (0.0869) time: 2.9460 data: 0.0073 max mem: 33301 Epoch: [28] [ 80/4276] eta: 3:28:23 lr: 1.6895133784072437e-05 loss: 0.0705 (0.0870) time: 2.9524 data: 0.0072 max 
mem: 33301 Epoch: [28] [ 90/4276] eta: 3:27:41 lr: 1.6892165704429487e-05 loss: 0.0772 (0.0871) time: 2.9536 data: 0.0077 max mem: 33301 Epoch: [28] [ 100/4276] eta: 3:27:00 lr: 1.6889197566839425e-05 loss: 0.0913 (0.0893) time: 2.9523 data: 0.0079 max mem: 33301 Epoch: [28] [ 110/4276] eta: 3:26:30 lr: 1.6886229371289815e-05 loss: 0.0983 (0.0900) time: 2.9614 data: 0.0077 max mem: 33301 Epoch: [28] [ 120/4276] eta: 3:25:45 lr: 1.6883261117768203e-05 loss: 0.0847 (0.0896) time: 2.9516 data: 0.0078 max mem: 33301 Epoch: [28] [ 130/4276] eta: 3:25:46 lr: 1.6880292806262127e-05 loss: 0.0849 (0.0902) time: 2.9996 data: 0.0078 max mem: 33301 Epoch: [28] [ 140/4276] eta: 3:25:09 lr: 1.6877324436759126e-05 loss: 0.0834 (0.0899) time: 3.0100 data: 0.0076 max mem: 33301 Epoch: [28] [ 150/4276] eta: 10:01:49 lr: 1.6874356009246735e-05 loss: 0.0915 (0.0905) time: 46.5688 data: 43.6053 max mem: 33301 Epoch: [28] [ 160/4276] eta: 9:35:39 lr: 1.687138752371249e-05 loss: 0.0938 (0.0904) time: 46.5708 data: 43.6053 max mem: 33301 Epoch: [28] [ 170/4276] eta: 9:12:31 lr: 1.6868418980143892e-05 loss: 0.0843 (0.0906) time: 2.9561 data: 0.0077 max mem: 33301 Epoch: [28] [ 180/4276] eta: 8:51:48 lr: 1.6865450378528476e-05 loss: 0.0843 (0.0908) time: 2.9492 data: 0.0083 max mem: 33301 Epoch: [28] [ 190/4276] eta: 8:33:18 lr: 1.686248171885375e-05 loss: 0.0842 (0.0904) time: 2.9537 data: 0.0079 max mem: 33301 Epoch: [28] [ 200/4276] eta: 8:16:33 lr: 1.685951300110722e-05 loss: 0.0821 (0.0902) time: 2.9596 data: 0.0072 max mem: 33301 Epoch: [28] [ 210/4276] eta: 8:01:23 lr: 1.685654422527638e-05 loss: 0.0867 (0.0903) time: 2.9586 data: 0.0070 max mem: 33301 Epoch: [28] [ 220/4276] eta: 7:47:32 lr: 1.6853575391348736e-05 loss: 0.0810 (0.0899) time: 2.9647 data: 0.0071 max mem: 33301 Epoch: [28] [ 230/4276] eta: 7:34:51 lr: 1.6850606499311778e-05 loss: 0.0776 (0.0895) time: 2.9638 data: 0.0071 max mem: 33301 Epoch: [28] [ 240/4276] eta: 7:23:09 lr: 1.6847637549152997e-05 loss: 0.0849 
(0.0903) time: 2.9597 data: 0.0073 max mem: 33301 Epoch: [28] [ 250/4276] eta: 7:12:24 lr: 1.6844668540859862e-05 loss: 0.0971 (0.0911) time: 2.9676 data: 0.0072 max mem: 33301 Epoch: [28] [ 260/4276] eta: 7:02:25 lr: 1.684169947441986e-05 loss: 0.0919 (0.0910) time: 2.9729 data: 0.0071 max mem: 33301 Epoch: [28] [ 270/4276] eta: 6:52:59 lr: 1.6838730349820454e-05 loss: 0.0670 (0.0909) time: 2.9393 data: 0.0076 max mem: 33301 Epoch: [28] [ 280/4276] eta: 6:44:33 lr: 1.683576116704912e-05 loss: 0.0683 (0.0905) time: 2.9878 data: 0.0077 max mem: 33301 Epoch: [28] [ 290/4276] eta: 10:09:53 lr: 1.6832791926093306e-05 loss: 0.0770 (0.0903) time: 49.7612 data: 46.7245 max mem: 33301 Epoch: [28] [ 300/4276] eta: 9:54:36 lr: 1.682982262694047e-05 loss: 0.0818 (0.0903) time: 49.6951 data: 46.7247 max mem: 33301 Epoch: [28] [ 310/4276] eta: 9:40:17 lr: 1.6826853269578072e-05 loss: 0.0798 (0.0900) time: 2.9366 data: 0.0072 max mem: 33301 Epoch: [28] [ 320/4276] eta: 9:26:50 lr: 1.6823883853993556e-05 loss: 0.0844 (0.0903) time: 2.9440 data: 0.0071 max mem: 33301 Epoch: [28] [ 330/4276] eta: 9:14:11 lr: 1.682091438017435e-05 loss: 0.0947 (0.0904) time: 2.9477 data: 0.0077 max mem: 33301 Epoch: [28] [ 340/4276] eta: 9:02:15 lr: 1.6817944848107894e-05 loss: 0.0856 (0.0910) time: 2.9502 data: 0.0078 max mem: 33301 Epoch: [28] [ 350/4276] eta: 8:50:56 lr: 1.6814975257781622e-05 loss: 0.0916 (0.0912) time: 2.9459 data: 0.0075 max mem: 33301 Epoch: [28] [ 360/4276] eta: 8:40:14 lr: 1.681200560918296e-05 loss: 0.0947 (0.0919) time: 2.9406 data: 0.0074 max mem: 33301 Epoch: [28] [ 370/4276] eta: 8:30:06 lr: 1.6809035902299316e-05 loss: 0.0986 (0.0923) time: 2.9458 data: 0.0076 max mem: 33301 Epoch: [28] [ 380/4276] eta: 8:20:28 lr: 1.6806066137118116e-05 loss: 0.0986 (0.0925) time: 2.9499 data: 0.0076 max mem: 33301 Epoch: [28] [ 390/4276] eta: 8:11:18 lr: 1.680309631362676e-05 loss: 0.1011 (0.0927) time: 2.9539 data: 0.0074 max mem: 33301 Epoch: [28] [ 400/4276] eta: 8:02:31 lr: 
1.680012643181266e-05 loss: 0.1033 (0.0931) time: 2.9352 data: 0.0071 max mem: 33301 Epoch: [28] [ 410/4276] eta: 7:54:10 lr: 1.6797156491663207e-05 loss: 0.1033 (0.0934) time: 2.9266 data: 0.0074 max mem: 33301 Epoch: [28] [ 420/4276] eta: 7:46:08 lr: 1.67941864931658e-05 loss: 0.0996 (0.0935) time: 2.9194 data: 0.0079 max mem: 33301 Epoch: [28] [ 430/4276] eta: 7:38:32 lr: 1.6791216436307823e-05 loss: 0.0947 (0.0936) time: 2.9283 data: 0.0081 max mem: 33301 Epoch: [28] [ 440/4276] eta: 7:31:15 lr: 1.678824632107666e-05 loss: 0.0904 (0.0934) time: 2.9541 data: 0.0079 max mem: 33301 Epoch: [28] [ 450/4276] eta: 7:24:16 lr: 1.6785276147459686e-05 loss: 0.0923 (0.0936) time: 2.9506 data: 0.0079 max mem: 33301 Epoch: [28] [ 460/4276] eta: 7:17:33 lr: 1.6782305915444275e-05 loss: 0.0929 (0.0935) time: 2.9454 data: 0.0078 max mem: 33301 Epoch: [28] [ 470/4276] eta: 7:11:07 lr: 1.6779335625017805e-05 loss: 0.0894 (0.0934) time: 2.9459 data: 0.0076 max mem: 33301 Epoch: [28] [ 480/4276] eta: 7:04:55 lr: 1.677636527616762e-05 loss: 0.0870 (0.0933) time: 2.9524 data: 0.0079 max mem: 33301 Epoch: [28] [ 490/4276] eta: 6:59:01 lr: 1.677339486888109e-05 loss: 0.0838 (0.0930) time: 2.9688 data: 0.0078 max mem: 33301 Epoch: [28] [ 500/4276] eta: 6:53:18 lr: 1.677042440314556e-05 loss: 0.0838 (0.0929) time: 2.9818 data: 0.0079 max mem: 33301 Epoch: [28] [ 510/4276] eta: 6:47:43 lr: 1.676745387894838e-05 loss: 0.0937 (0.0933) time: 2.9447 data: 0.0085 max mem: 33301 Epoch: [28] [ 520/4276] eta: 6:42:21 lr: 1.6764483296276887e-05 loss: 0.0977 (0.0934) time: 2.9211 data: 0.0087 max mem: 33301 Epoch: [28] [ 530/4276] eta: 6:37:12 lr: 1.676151265511842e-05 loss: 0.0894 (0.0937) time: 2.9474 data: 0.0086 max mem: 33301 Epoch: [28] [ 540/4276] eta: 6:32:15 lr: 1.6758541955460308e-05 loss: 0.0894 (0.0936) time: 2.9720 data: 0.0087 max mem: 33301 Epoch: [28] [ 550/4276] eta: 6:27:25 lr: 1.675557119728988e-05 loss: 0.0894 (0.0936) time: 2.9608 data: 0.0089 max mem: 33301 Epoch: [28] [ 
560/4276] eta: 6:22:44 lr: 1.675260038059445e-05 loss: 0.0904 (0.0936) time: 2.9353 data: 0.0087 max mem: 33301 Epoch: [28] [ 570/4276] eta: 6:18:10 lr: 1.674962950536134e-05 loss: 0.0923 (0.0935) time: 2.9236 data: 0.0082 max mem: 33301 Epoch: [28] [ 580/4276] eta: 6:13:43 lr: 1.674665857157785e-05 loss: 0.0867 (0.0935) time: 2.8987 data: 0.0084 max mem: 33301 Epoch: [28] [ 590/4276] eta: 6:09:24 lr: 1.67436875792313e-05 loss: 0.0793 (0.0933) time: 2.8901 data: 0.0088 max mem: 33301 Epoch: [28] [ 600/4276] eta: 6:05:13 lr: 1.674071652830897e-05 loss: 0.0808 (0.0932) time: 2.8922 data: 0.0088 max mem: 33301 Epoch: [28] [ 610/4276] eta: 6:01:10 lr: 1.673774541879817e-05 loss: 0.0808 (0.0931) time: 2.8916 data: 0.0088 max mem: 33301 Epoch: [28] [ 620/4276] eta: 5:57:18 lr: 1.6734774250686174e-05 loss: 0.0883 (0.0932) time: 2.9395 data: 0.0085 max mem: 33301 Epoch: [28] [ 630/4276] eta: 5:53:33 lr: 1.6731803023960285e-05 loss: 0.0921 (0.0933) time: 2.9796 data: 0.0082 max mem: 33301 Epoch: [28] [ 640/4276] eta: 5:49:52 lr: 1.6728831738607763e-05 loss: 0.0935 (0.0932) time: 2.9643 data: 0.0084 max mem: 33301 Epoch: [28] [ 650/4276] eta: 5:46:16 lr: 1.6725860394615885e-05 loss: 0.0904 (0.0932) time: 2.9436 data: 0.0085 max mem: 33301 Epoch: [28] [ 660/4276] eta: 5:42:48 lr: 1.6722888991971922e-05 loss: 0.0963 (0.0933) time: 2.9612 data: 0.0087 max mem: 33301 Epoch: [28] [ 670/4276] eta: 5:39:24 lr: 1.6719917530663142e-05 loss: 0.0963 (0.0931) time: 2.9637 data: 0.0084 max mem: 33301 Epoch: [28] [ 680/4276] eta: 5:36:04 lr: 1.6716946010676788e-05 loss: 0.0854 (0.0929) time: 2.9392 data: 0.0091 max mem: 33301 Epoch: [28] [ 690/4276] eta: 5:32:49 lr: 1.6713974432000117e-05 loss: 0.0854 (0.0929) time: 2.9269 data: 0.0093 max mem: 33301 Epoch: [28] [ 700/4276] eta: 5:29:39 lr: 1.671100279462038e-05 loss: 0.0819 (0.0927) time: 2.9323 data: 0.0089 max mem: 33301 Epoch: [28] [ 710/4276] eta: 5:26:35 lr: 1.6708031098524817e-05 loss: 0.0799 (0.0927) time: 2.9532 data: 0.0087 max 
mem: 33301 Epoch: [28] [ 720/4276] eta: 5:23:36 lr: 1.670505934370066e-05 loss: 0.0807 (0.0926) time: 2.9703 data: 0.0083 max mem: 33301 Epoch: [28] [ 730/4276] eta: 5:20:41 lr: 1.670208753013514e-05 loss: 0.0806 (0.0927) time: 2.9820 data: 0.0091 max mem: 33301 Epoch: [28] [ 740/4276] eta: 5:17:48 lr: 1.6699115657815492e-05 loss: 0.0825 (0.0926) time: 2.9622 data: 0.0090 max mem: 33301 Epoch: [28] [ 750/4276] eta: 5:14:59 lr: 1.669614372672892e-05 loss: 0.0860 (0.0925) time: 2.9393 data: 0.0082 max mem: 33301 Epoch: [28] [ 760/4276] eta: 5:12:14 lr: 1.6693171736862647e-05 loss: 0.0860 (0.0925) time: 2.9382 data: 0.0081 max mem: 33301 Epoch: [28] [ 770/4276] eta: 5:09:32 lr: 1.669019968820388e-05 loss: 0.0865 (0.0925) time: 2.9394 data: 0.0084 max mem: 33301 Epoch: [28] [ 780/4276] eta: 5:06:53 lr: 1.6687227580739835e-05 loss: 0.0871 (0.0925) time: 2.9404 data: 0.0085 max mem: 33301 Epoch: [28] [ 790/4276] eta: 5:04:18 lr: 1.668425541445769e-05 loss: 0.0894 (0.0925) time: 2.9394 data: 0.0082 max mem: 33301 Epoch: [28] [ 800/4276] eta: 5:01:46 lr: 1.668128318934465e-05 loss: 0.0888 (0.0926) time: 2.9379 data: 0.0081 max mem: 33301 Epoch: [28] [ 810/4276] eta: 4:59:17 lr: 1.66783109053879e-05 loss: 0.0935 (0.0928) time: 2.9385 data: 0.0086 max mem: 33301 Epoch: [28] [ 820/4276] eta: 4:56:51 lr: 1.667533856257463e-05 loss: 0.0905 (0.0926) time: 2.9382 data: 0.0089 max mem: 33301 Epoch: [28] [ 830/4276] eta: 4:54:27 lr: 1.6672366160892008e-05 loss: 0.0811 (0.0926) time: 2.9319 data: 0.0089 max mem: 33301 Epoch: [28] [ 840/4276] eta: 4:52:06 lr: 1.666939370032721e-05 loss: 0.1015 (0.0927) time: 2.9331 data: 0.0084 max mem: 33301 Epoch: [28] [ 850/4276] eta: 4:49:48 lr: 1.66664211808674e-05 loss: 0.0756 (0.0925) time: 2.9386 data: 0.0079 max mem: 33301 Epoch: [28] [ 860/4276] eta: 4:47:32 lr: 1.6663448602499748e-05 loss: 0.0865 (0.0926) time: 2.9383 data: 0.0078 max mem: 33301 Epoch: [28] [ 870/4276] eta: 4:45:19 lr: 1.66604759652114e-05 loss: 0.0878 (0.0925) time: 
2.9396 data: 0.0077 max mem: 33301 Epoch: [28] [ 880/4276] eta: 4:43:09 lr: 1.6657503268989508e-05 loss: 0.0848 (0.0927) time: 2.9388 data: 0.0077 max mem: 33301 Epoch: [28] [ 890/4276] eta: 4:41:00 lr: 1.665453051382122e-05 loss: 0.0940 (0.0929) time: 2.9373 data: 0.0078 max mem: 33301 Epoch: [28] [ 900/4276] eta: 4:38:54 lr: 1.6651557699693684e-05 loss: 0.0946 (0.0929) time: 2.9371 data: 0.0079 max mem: 33301 Epoch: [28] [ 910/4276] eta: 4:36:50 lr: 1.664858482659402e-05 loss: 0.0946 (0.0929) time: 2.9379 data: 0.0077 max mem: 33301 Epoch: [28] [ 920/4276] eta: 4:34:48 lr: 1.664561189450936e-05 loss: 0.0987 (0.0930) time: 2.9390 data: 0.0077 max mem: 33301 Epoch: [28] [ 930/4276] eta: 4:32:48 lr: 1.664263890342683e-05 loss: 0.0926 (0.0930) time: 2.9454 data: 0.0084 max mem: 33301 Epoch: [28] [ 940/4276] eta: 4:30:50 lr: 1.663966585333356e-05 loss: 0.0848 (0.0930) time: 2.9456 data: 0.0091 max mem: 33301 Epoch: [28] [ 950/4276] eta: 4:28:54 lr: 1.6636692744216643e-05 loss: 0.0858 (0.0931) time: 2.9403 data: 0.0085 max mem: 33301 Epoch: [28] [ 960/4276] eta: 4:26:59 lr: 1.6633719576063203e-05 loss: 0.0908 (0.0932) time: 2.9393 data: 0.0081 max mem: 33301 Epoch: [28] [ 970/4276] eta: 4:25:06 lr: 1.6630746348860334e-05 loss: 0.0862 (0.0931) time: 2.9295 data: 0.0084 max mem: 33301 Epoch: [28] [ 980/4276] eta: 4:23:13 lr: 1.6627773062595138e-05 loss: 0.0948 (0.0932) time: 2.9095 data: 0.0088 max mem: 33301 Epoch: [28] [ 990/4276] eta: 4:21:23 lr: 1.6624799717254697e-05 loss: 0.0917 (0.0931) time: 2.9060 data: 0.0092 max mem: 33301 Epoch: [28] [1000/4276] eta: 4:19:34 lr: 1.6621826312826105e-05 loss: 0.0836 (0.0931) time: 2.9119 data: 0.0087 max mem: 33301 Epoch: [28] [1010/4276] eta: 4:17:47 lr: 1.661885284929645e-05 loss: 0.0923 (0.0931) time: 2.9022 data: 0.0086 max mem: 33301 Epoch: [28] [1020/4276] eta: 4:16:00 lr: 1.6615879326652788e-05 loss: 0.0923 (0.0931) time: 2.8906 data: 0.0086 max mem: 33301 Epoch: [28] [1030/4276] eta: 4:14:16 lr: 1.6612905744882205e-05 
loss: 0.0947 (0.0931) time: 2.8846 data: 0.0080 max mem: 33301 Epoch: [28] [1040/4276] eta: 4:12:33 lr: 1.660993210397176e-05 loss: 0.0947 (0.0931) time: 2.8942 data: 0.0085 max mem: 33301 Epoch: [28] [1050/4276] eta: 4:10:52 lr: 1.660695840390852e-05 loss: 0.0906 (0.0932) time: 2.9111 data: 0.0089 max mem: 33301 Epoch: [28] [1060/4276] eta: 4:09:13 lr: 1.6603984644679522e-05 loss: 0.0983 (0.0933) time: 2.9274 data: 0.0085 max mem: 33301 Epoch: [28] [1070/4276] eta: 4:07:35 lr: 1.6601010826271827e-05 loss: 0.1042 (0.0934) time: 2.9384 data: 0.0086 max mem: 33301 Epoch: [28] [1080/4276] eta: 4:05:57 lr: 1.6598036948672474e-05 loss: 0.1029 (0.0935) time: 2.9202 data: 0.0085 max mem: 33301 Epoch: [28] [1090/4276] eta: 4:04:21 lr: 1.659506301186851e-05 loss: 0.1044 (0.0937) time: 2.8962 data: 0.0082 max mem: 33301 Epoch: [28] [1100/4276] eta: 4:02:46 lr: 1.659208901584695e-05 loss: 0.1089 (0.0937) time: 2.8991 data: 0.0081 max mem: 33301 Epoch: [28] [1110/4276] eta: 4:01:12 lr: 1.6589114960594832e-05 loss: 0.0984 (0.0938) time: 2.9025 data: 0.0083 max mem: 33301 Epoch: [28] [1120/4276] eta: 3:59:39 lr: 1.6586140846099174e-05 loss: 0.0984 (0.0938) time: 2.8936 data: 0.0086 max mem: 33301 Epoch: [28] [1130/4276] eta: 3:58:08 lr: 1.6583166672347e-05 loss: 0.0878 (0.0938) time: 2.9021 data: 0.0087 max mem: 33301 Epoch: [28] [1140/4276] eta: 3:56:38 lr: 1.658019243932531e-05 loss: 0.0865 (0.0938) time: 2.9276 data: 0.0087 max mem: 33301 Epoch: [28] [1150/4276] eta: 3:55:10 lr: 1.6577218147021108e-05 loss: 0.0913 (0.0937) time: 2.9408 data: 0.0082 max mem: 33301 Epoch: [28] [1160/4276] eta: 3:53:43 lr: 1.65742437954214e-05 loss: 0.0901 (0.0937) time: 2.9445 data: 0.0078 max mem: 33301 Epoch: [28] [1170/4276] eta: 3:52:16 lr: 1.657126938451318e-05 loss: 0.0871 (0.0937) time: 2.9433 data: 0.0074 max mem: 33301 Epoch: [28] [1180/4276] eta: 3:50:51 lr: 1.6568294914283435e-05 loss: 0.0871 (0.0937) time: 2.9450 data: 0.0074 max mem: 33301 Epoch: [28] [1190/4276] eta: 3:49:26 lr: 
1.656532038471915e-05 loss: 0.0777 (0.0936) time: 2.9254 data: 0.0074 max mem: 33301 Epoch: [28] [1200/4276] eta: 3:48:01 lr: 1.6562345795807297e-05 loss: 0.0755 (0.0935) time: 2.8972 data: 0.0071 max mem: 33301 Epoch: [28] [1210/4276] eta: 3:46:37 lr: 1.6559371147534854e-05 loss: 0.0846 (0.0935) time: 2.8918 data: 0.0071 max mem: 33301 Epoch: [28] [1220/4276] eta: 3:45:15 lr: 1.6556396439888786e-05 loss: 0.0896 (0.0935) time: 2.9091 data: 0.0072 max mem: 33301 Epoch: [28] [1230/4276] eta: 3:43:53 lr: 1.6553421672856055e-05 loss: 0.0871 (0.0935) time: 2.9106 data: 0.0078 max mem: 33301 Epoch: [28] [1240/4276] eta: 3:42:32 lr: 1.655044684642361e-05 loss: 0.0871 (0.0934) time: 2.8999 data: 0.0079 max mem: 33301 Epoch: [28] [1250/4276] eta: 3:41:12 lr: 1.654747196057842e-05 loss: 0.0891 (0.0934) time: 2.9179 data: 0.0079 max mem: 33301 Epoch: [28] [1260/4276] eta: 3:39:54 lr: 1.6544497015307412e-05 loss: 0.0857 (0.0934) time: 2.9275 data: 0.0078 max mem: 33301 Epoch: [28] [1270/4276] eta: 3:38:36 lr: 1.654152201059753e-05 loss: 0.0857 (0.0933) time: 2.9327 data: 0.0077 max mem: 33301 Epoch: [28] [1280/4276] eta: 3:37:19 lr: 1.653854694643571e-05 loss: 0.0950 (0.0933) time: 2.9371 data: 0.0077 max mem: 33301 Epoch: [28] [1290/4276] eta: 3:36:03 lr: 1.6535571822808884e-05 loss: 0.1019 (0.0934) time: 2.9396 data: 0.0077 max mem: 33301 Epoch: [28] [1300/4276] eta: 3:34:48 lr: 1.6532596639703965e-05 loss: 0.0873 (0.0934) time: 2.9532 data: 0.0076 max mem: 33301 Epoch: [28] [1310/4276] eta: 3:33:32 lr: 1.652962139710788e-05 loss: 0.0753 (0.0933) time: 2.9361 data: 0.0079 max mem: 33301 Epoch: [28] [1320/4276] eta: 3:32:18 lr: 1.652664609500754e-05 loss: 0.0901 (0.0935) time: 2.9222 data: 0.0080 max mem: 33301 Epoch: [28] [1330/4276] eta: 3:31:05 lr: 1.6523670733389844e-05 loss: 0.0891 (0.0934) time: 2.9414 data: 0.0073 max mem: 33301 Epoch: [28] [1340/4276] eta: 3:29:52 lr: 1.65206953122417e-05 loss: 0.0829 (0.0934) time: 2.9485 data: 0.0076 max mem: 33301 Epoch: [28] 
[1350/4276] eta: 3:28:40 lr: 1.6517719831550003e-05 loss: 0.0860 (0.0934) time: 2.9286 data: 0.0075 max mem: 33301 Epoch: [28] [1360/4276] eta: 3:27:27 lr: 1.6514744291301644e-05 loss: 0.0917 (0.0934) time: 2.9050 data: 0.0077 max mem: 33301 Epoch: [28] [1370/4276] eta: 3:26:16 lr: 1.65117686914835e-05 loss: 0.0780 (0.0934) time: 2.9098 data: 0.0079 max mem: 33301 Epoch: [28] [1380/4276] eta: 3:25:06 lr: 1.6508793032082458e-05 loss: 0.0932 (0.0935) time: 2.9249 data: 0.0077 max mem: 33301 Epoch: [28] [1390/4276] eta: 3:23:55 lr: 1.650581731308539e-05 loss: 0.1014 (0.0935) time: 2.9201 data: 0.0078 max mem: 33301 Epoch: [28] [1400/4276] eta: 3:22:45 lr: 1.6502841534479165e-05 loss: 0.0910 (0.0935) time: 2.9048 data: 0.0077 max mem: 33301 Epoch: [28] [1410/4276] eta: 3:21:36 lr: 1.6499865696250642e-05 loss: 0.0855 (0.0935) time: 2.8967 data: 0.0075 max mem: 33301 Epoch: [28] [1420/4276] eta: 3:20:27 lr: 1.6496889798386677e-05 loss: 0.0851 (0.0935) time: 2.8907 data: 0.0073 max mem: 33301 Epoch: [28] [1430/4276] eta: 3:19:19 lr: 1.6493913840874124e-05 loss: 0.0929 (0.0936) time: 2.8893 data: 0.0072 max mem: 33301 Epoch: [28] [1440/4276] eta: 3:18:11 lr: 1.6490937823699835e-05 loss: 0.0995 (0.0936) time: 2.8945 data: 0.0072 max mem: 33301 Epoch: [28] [1450/4276] eta: 3:17:04 lr: 1.648796174685064e-05 loss: 0.0898 (0.0936) time: 2.9033 data: 0.0073 max mem: 33301 Epoch: [28] [1460/4276] eta: 3:15:57 lr: 1.6484985610313376e-05 loss: 0.0841 (0.0935) time: 2.9040 data: 0.0074 max mem: 33301 Epoch: [28] [1470/4276] eta: 3:14:51 lr: 1.6482009414074872e-05 loss: 0.0834 (0.0935) time: 2.8957 data: 0.0073 max mem: 33301 Epoch: [28] [1480/4276] eta: 3:13:45 lr: 1.6479033158121962e-05 loss: 0.0867 (0.0935) time: 2.8865 data: 0.0072 max mem: 33301 Epoch: [28] [1490/4276] eta: 3:12:40 lr: 1.6476056842441452e-05 loss: 0.0867 (0.0934) time: 2.8860 data: 0.0073 max mem: 33301 Epoch: [28] [1500/4276] eta: 3:11:35 lr: 1.6473080467020155e-05 loss: 0.0791 (0.0934) time: 2.8888 data: 
0.0073 max mem: 33301 Epoch: [28] [1510/4276] eta: 3:10:31 lr: 1.6470104031844882e-05 loss: 0.0791 (0.0933) time: 2.8899 data: 0.0073 max mem: 33301 Epoch: [28] [1520/4276] eta: 3:09:28 lr: 1.646712753690244e-05 loss: 0.0754 (0.0933) time: 2.9102 data: 0.0072 max mem: 33301 Epoch: [28] [1530/4276] eta: 3:08:25 lr: 1.6464150982179616e-05 loss: 0.0769 (0.0932) time: 2.9323 data: 0.0071 max mem: 33301 Epoch: [28] [1540/4276] eta: 3:07:22 lr: 1.6461174367663202e-05 loss: 0.0806 (0.0931) time: 2.9205 data: 0.0071 max mem: 33301 Epoch: [28] [1550/4276] eta: 3:06:21 lr: 1.6458197693339984e-05 loss: 0.0823 (0.0931) time: 2.9244 data: 0.0074 max mem: 33301 Epoch: [28] [1560/4276] eta: 3:05:19 lr: 1.6455220959196744e-05 loss: 0.0821 (0.0931) time: 2.9355 data: 0.0083 max mem: 33301 Epoch: [28] [1570/4276] eta: 3:04:18 lr: 1.6452244165220253e-05 loss: 0.0818 (0.0930) time: 2.9266 data: 0.0082 max mem: 33301 Epoch: [28] [1580/4276] eta: 3:03:18 lr: 1.6449267311397272e-05 loss: 0.0783 (0.0930) time: 2.9353 data: 0.0073 max mem: 33301 Epoch: [28] [1590/4276] eta: 3:02:18 lr: 1.6446290397714574e-05 loss: 0.0828 (0.0931) time: 2.9384 data: 0.0073 max mem: 33301 Epoch: [28] [1600/4276] eta: 3:01:18 lr: 1.644331342415892e-05 loss: 0.0881 (0.0931) time: 2.9306 data: 0.0075 max mem: 33301 Epoch: [28] [1610/4276] eta: 3:00:18 lr: 1.644033639071704e-05 loss: 0.0942 (0.0931) time: 2.9247 data: 0.0079 max mem: 33301 Epoch: [28] [1620/4276] eta: 2:59:19 lr: 1.64373592973757e-05 loss: 0.0911 (0.0930) time: 2.9247 data: 0.0078 max mem: 33301 Epoch: [28] [1630/4276] eta: 2:58:21 lr: 1.6434382144121637e-05 loss: 0.0917 (0.0931) time: 2.9349 data: 0.0074 max mem: 33301 Epoch: [28] [1640/4276] eta: 2:57:22 lr: 1.6431404930941577e-05 loss: 0.0929 (0.0931) time: 2.9180 data: 0.0075 max mem: 33301 Epoch: [28] [1650/4276] eta: 2:56:24 lr: 1.642842765782225e-05 loss: 0.0799 (0.0930) time: 2.9155 data: 0.0076 max mem: 33301 Epoch: [28] [1660/4276] eta: 2:55:26 lr: 1.6425450324750383e-05 loss: 0.0799 
(0.0929) time: 2.9355 data: 0.0074 max mem: 33301 Epoch: [28] [1670/4276] eta: 2:54:29 lr: 1.6422472931712702e-05 loss: 0.0800 (0.0929) time: 2.9232 data: 0.0072 max mem: 33301 Epoch: [28] [1680/4276] eta: 2:53:31 lr: 1.64194954786959e-05 loss: 0.0871 (0.0929) time: 2.9086 data: 0.0073 max mem: 33301 Epoch: [28] [1690/4276] eta: 2:52:34 lr: 1.6416517965686696e-05 loss: 0.0807 (0.0928) time: 2.8978 data: 0.0073 max mem: 33301 Epoch: [28] [1700/4276] eta: 2:51:38 lr: 1.6413540392671787e-05 loss: 0.0822 (0.0928) time: 2.9210 data: 0.0072 max mem: 33301 Epoch: [28] [1710/4276] eta: 2:50:42 lr: 1.6410562759637878e-05 loss: 0.0944 (0.0928) time: 2.9319 data: 0.0073 max mem: 33301 Epoch: [28] [1720/4276] eta: 2:49:46 lr: 1.6407585066571637e-05 loss: 0.0797 (0.0928) time: 2.9169 data: 0.0074 max mem: 33301 Epoch: [28] [1730/4276] eta: 2:48:51 lr: 1.6404607313459766e-05 loss: 0.0797 (0.0927) time: 2.9195 data: 0.0075 max mem: 33301 Epoch: [28] [1740/4276] eta: 2:47:56 lr: 1.6401629500288936e-05 loss: 0.0825 (0.0927) time: 2.9281 data: 0.0079 max mem: 33301 Epoch: [28] [1750/4276] eta: 2:47:01 lr: 1.6398651627045828e-05 loss: 0.0788 (0.0926) time: 2.9419 data: 0.0080 max mem: 33301 Epoch: [28] [1760/4276] eta: 2:46:07 lr: 1.6395673693717096e-05 loss: 0.0789 (0.0925) time: 2.9406 data: 0.0080 max mem: 33301 Epoch: [28] [1770/4276] eta: 2:45:12 lr: 1.6392695700289408e-05 loss: 0.0848 (0.0925) time: 2.9264 data: 0.0079 max mem: 33301 Epoch: [28] [1780/4276] eta: 2:44:18 lr: 1.6389717646749418e-05 loss: 0.0906 (0.0925) time: 2.9269 data: 0.0074 max mem: 33301 Epoch: [28] [1790/4276] eta: 2:43:24 lr: 1.638673953308378e-05 loss: 0.0845 (0.0925) time: 2.9195 data: 0.0074 max mem: 33301 Epoch: [28] [1800/4276] eta: 2:42:31 lr: 1.6383761359279135e-05 loss: 0.0774 (0.0924) time: 2.9009 data: 0.0078 max mem: 33301 Epoch: [28] [1810/4276] eta: 2:41:37 lr: 1.638078312532212e-05 loss: 0.0790 (0.0924) time: 2.9057 data: 0.0076 max mem: 33301 Epoch: [28] [1820/4276] eta: 2:40:44 lr: 
1.6377804831199366e-05 loss: 0.0836 (0.0924) time: 2.9056 data: 0.0073 max mem: 33301 Epoch: [28] [1830/4276] eta: 2:39:51 lr: 1.6374826476897515e-05 loss: 0.0836 (0.0924) time: 2.9021 data: 0.0074 max mem: 33301 Epoch: [28] [1840/4276] eta: 2:38:58 lr: 1.637184806240317e-05 loss: 0.0766 (0.0923) time: 2.9051 data: 0.0077 max mem: 33301 Epoch: [28] [1850/4276] eta: 2:38:06 lr: 1.6368869587702955e-05 loss: 0.0830 (0.0923) time: 2.9134 data: 0.0082 max mem: 33301 Epoch: [28] [1860/4276] eta: 2:37:15 lr: 1.6365891052783475e-05 loss: 0.0888 (0.0923) time: 2.9359 data: 0.0080 max mem: 33301 Epoch: [28] [1870/4276] eta: 2:36:23 lr: 1.636291245763135e-05 loss: 0.0911 (0.0924) time: 2.9308 data: 0.0075 max mem: 33301 Epoch: [28] [1880/4276] eta: 2:35:31 lr: 1.635993380223316e-05 loss: 0.0911 (0.0924) time: 2.9076 data: 0.0075 max mem: 33302 Epoch: [28] [1890/4276] eta: 2:34:40 lr: 1.6356955086575508e-05 loss: 0.0853 (0.0923) time: 2.9142 data: 0.0073 max mem: 33302 Epoch: [28] [1900/4276] eta: 2:33:49 lr: 1.635397631064498e-05 loss: 0.0820 (0.0923) time: 2.9086 data: 0.0074 max mem: 33302 Epoch: [28] [1910/4276] eta: 2:32:58 lr: 1.6350997474428163e-05 loss: 0.0835 (0.0923) time: 2.9014 data: 0.0077 max mem: 33302 Epoch: [28] [1920/4276] eta: 2:32:07 lr: 1.634801857791162e-05 loss: 0.0823 (0.0922) time: 2.9051 data: 0.0076 max mem: 33302 Epoch: [28] [1930/4276] eta: 2:31:16 lr: 1.634503962108193e-05 loss: 0.0823 (0.0922) time: 2.8964 data: 0.0076 max mem: 33302 Epoch: [28] [1940/4276] eta: 2:30:26 lr: 1.6342060603925663e-05 loss: 0.0879 (0.0922) time: 2.9081 data: 0.0075 max mem: 33302 Epoch: [28] [1950/4276] eta: 2:29:36 lr: 1.6339081526429366e-05 loss: 0.0879 (0.0922) time: 2.9114 data: 0.0077 max mem: 33302 Epoch: [28] [1960/4276] eta: 2:28:46 lr: 1.6336102388579597e-05 loss: 0.0801 (0.0921) time: 2.9000 data: 0.0083 max mem: 33302 Epoch: [28] [1970/4276] eta: 2:27:56 lr: 1.6333123190362907e-05 loss: 0.0801 (0.0921) time: 2.9032 data: 0.0087 max mem: 33302 Epoch: [28] 
[1980/4276] eta: 2:27:07 lr: 1.6330143931765836e-05 loss: 0.0763 (0.0920) time: 2.9004 data: 0.0079 max mem: 33302 Epoch: [28] [1990/4276] eta: 2:26:18 lr: 1.6327164612774917e-05 loss: 0.0798 (0.0921) time: 2.9153 data: 0.0077 max mem: 33302 Epoch: [28] [2000/4276] eta: 2:25:29 lr: 1.632418523337668e-05 loss: 0.1021 (0.0921) time: 2.9185 data: 0.0081 max mem: 33302 Epoch: [28] [2010/4276] eta: 2:24:40 lr: 1.6321205793557652e-05 loss: 0.0899 (0.0921) time: 2.9059 data: 0.0078 max mem: 33302 Epoch: [28] [2020/4276] eta: 2:23:51 lr: 1.631822629330436e-05 loss: 0.0911 (0.0921) time: 2.8987 data: 0.0076 max mem: 33302 Epoch: [28] [2030/4276] eta: 2:23:03 lr: 1.63152467326033e-05 loss: 0.0815 (0.0920) time: 2.9188 data: 0.0075 max mem: 33302 Epoch: [28] [2040/4276] eta: 2:22:16 lr: 1.631226711144099e-05 loss: 0.0774 (0.0920) time: 2.9572 data: 0.0073 max mem: 33302 Epoch: [28] [2050/4276] eta: 2:21:28 lr: 1.630928742980393e-05 loss: 0.0886 (0.0920) time: 2.9321 data: 0.0079 max mem: 33302 Epoch: [28] [2060/4276] eta: 2:20:40 lr: 1.630630768767862e-05 loss: 0.0922 (0.0920) time: 2.9055 data: 0.0080 max mem: 33302 Epoch: [28] [2070/4276] eta: 2:19:52 lr: 1.630332788505154e-05 loss: 0.0862 (0.0920) time: 2.9055 data: 0.0072 max mem: 33302 Epoch: [28] [2080/4276] eta: 2:19:05 lr: 1.630034802190918e-05 loss: 0.0862 (0.0921) time: 2.9162 data: 0.0072 max mem: 33302 Epoch: [28] [2090/4276] eta: 2:18:17 lr: 1.629736809823802e-05 loss: 0.0835 (0.0920) time: 2.9149 data: 0.0076 max mem: 33302 Epoch: [28] [2100/4276] eta: 2:17:30 lr: 1.6294388114024534e-05 loss: 0.0883 (0.0920) time: 2.9088 data: 0.0078 max mem: 33302 Epoch: [28] [2110/4276] eta: 2:16:43 lr: 1.6291408069255182e-05 loss: 0.0883 (0.0920) time: 2.9067 data: 0.0078 max mem: 33302 Epoch: [28] [2120/4276] eta: 2:15:56 lr: 1.6288427963916432e-05 loss: 0.0679 (0.0919) time: 2.8933 data: 0.0078 max mem: 33302 Epoch: [28] [2130/4276] eta: 2:15:09 lr: 1.6285447797994734e-05 loss: 0.0696 (0.0919) time: 2.8958 data: 0.0080 max 
mem: 33302 Epoch: [28] [2140/4276] eta: 2:14:23 lr: 1.6282467571476546e-05 loss: 0.0812 (0.0918) time: 2.8952 data: 0.0079 max mem: 33302 Epoch: [28] [2150/4276] eta: 2:13:36 lr: 1.6279487284348303e-05 loss: 0.0896 (0.0918) time: 2.8932 data: 0.0076 max mem: 33302 Epoch: [28] [2160/4276] eta: 2:12:50 lr: 1.6276506936596445e-05 loss: 0.0859 (0.0918) time: 2.8937 data: 0.0082 max mem: 33302 Epoch: [28] [2170/4276] eta: 2:12:04 lr: 1.6273526528207405e-05 loss: 0.0879 (0.0919) time: 2.8935 data: 0.0083 max mem: 33302 Epoch: [28] [2180/4276] eta: 2:11:18 lr: 1.6270546059167618e-05 loss: 0.0906 (0.0919) time: 2.8915 data: 0.0079 max mem: 33302 Epoch: [28] [2190/4276] eta: 2:10:32 lr: 1.6267565529463488e-05 loss: 0.0831 (0.0918) time: 2.8927 data: 0.0080 max mem: 33302 Epoch: [28] [2200/4276] eta: 2:09:47 lr: 1.6264584939081444e-05 loss: 0.0861 (0.0919) time: 2.9063 data: 0.0078 max mem: 33302 Epoch: [28] [2210/4276] eta: 2:09:02 lr: 1.6261604288007886e-05 loss: 0.0920 (0.0919) time: 2.9308 data: 0.0078 max mem: 33302 Epoch: [28] [2220/4276] eta: 2:08:16 lr: 1.625862357622923e-05 loss: 0.0862 (0.0919) time: 2.9166 data: 0.0078 max mem: 33302 Epoch: [28] [2230/4276] eta: 2:07:31 lr: 1.6255642803731856e-05 loss: 0.0862 (0.0919) time: 2.9086 data: 0.0075 max mem: 33302 Epoch: [28] [2240/4276] eta: 2:06:46 lr: 1.6252661970502163e-05 loss: 0.0766 (0.0918) time: 2.9146 data: 0.0075 max mem: 33302 Epoch: [28] [2250/4276] eta: 2:06:01 lr: 1.624968107652655e-05 loss: 0.0776 (0.0918) time: 2.9016 data: 0.0076 max mem: 33302 Epoch: [28] [2260/4276] eta: 2:05:17 lr: 1.6246700121791372e-05 loss: 0.0821 (0.0918) time: 2.9026 data: 0.0077 max mem: 33302 Epoch: [28] [2270/4276] eta: 2:04:32 lr: 1.624371910628302e-05 loss: 0.0866 (0.0917) time: 2.9033 data: 0.0078 max mem: 33302 Epoch: [28] [2280/4276] eta: 2:03:48 lr: 1.624073802998786e-05 loss: 0.0835 (0.0917) time: 2.9119 data: 0.0075 max mem: 33302 Epoch: [28] [2290/4276] eta: 2:03:04 lr: 1.6237756892892255e-05 loss: 0.0848 (0.0917) 
time: 2.9298 data: 0.0072 max mem: 33302 Epoch: [28] [2300/4276] eta: 2:02:20 lr: 1.623477569498256e-05 loss: 0.0891 (0.0917) time: 2.9538 data: 0.0074 max mem: 33302 Epoch: [28] [2310/4276] eta: 2:01:36 lr: 1.623179443624512e-05 loss: 0.0956 (0.0917) time: 2.9474 data: 0.0075 max mem: 33302 Epoch: [28] [2320/4276] eta: 2:00:53 lr: 1.6228813116666285e-05 loss: 0.0971 (0.0918) time: 2.9271 data: 0.0073 max mem: 33302 Epoch: [28] [2330/4276] eta: 2:00:09 lr: 1.6225831736232403e-05 loss: 0.0923 (0.0918) time: 2.9269 data: 0.0077 max mem: 33302 Epoch: [28] [2340/4276] eta: 1:59:26 lr: 1.622285029492979e-05 loss: 0.0920 (0.0918) time: 2.9317 data: 0.0077 max mem: 33302 Epoch: [28] [2350/4276] eta: 1:58:42 lr: 1.6219868792744786e-05 loss: 0.0835 (0.0918) time: 2.9245 data: 0.0074 max mem: 33302 Epoch: [28] [2360/4276] eta: 1:57:59 lr: 1.6216887229663705e-05 loss: 0.0847 (0.0918) time: 2.9067 data: 0.0079 max mem: 33302 Epoch: [28] [2370/4276] eta: 1:57:15 lr: 1.6213905605672875e-05 loss: 0.0894 (0.0918) time: 2.9008 data: 0.0083 max mem: 33302 Epoch: [28] [2380/4276] eta: 1:56:32 lr: 1.6210923920758587e-05 loss: 0.0894 (0.0918) time: 2.8980 data: 0.0082 max mem: 33302 Epoch: [28] [2390/4276] eta: 1:55:49 lr: 1.6207942174907156e-05 loss: 0.0939 (0.0919) time: 2.8932 data: 0.0079 max mem: 33302 Epoch: [28] [2400/4276] eta: 1:55:06 lr: 1.620496036810488e-05 loss: 0.0939 (0.0919) time: 2.8909 data: 0.0078 max mem: 33302 Epoch: [28] [2410/4276] eta: 1:54:23 lr: 1.6201978500338054e-05 loss: 0.0894 (0.0919) time: 2.8919 data: 0.0083 max mem: 33302 Epoch: [28] [2420/4276] eta: 1:53:40 lr: 1.619899657159296e-05 loss: 0.0836 (0.0918) time: 2.8895 data: 0.0080 max mem: 33302 Epoch: [28] [2430/4276] eta: 1:52:58 lr: 1.6196014581855873e-05 loss: 0.0836 (0.0919) time: 2.9102 data: 0.0077 max mem: 33302 Epoch: [28] [2440/4276] eta: 1:52:15 lr: 1.6193032531113073e-05 loss: 0.0896 (0.0919) time: 2.9136 data: 0.0079 max mem: 33302 Epoch: [28] [2450/4276] eta: 1:51:33 lr: 
1.6190050419350836e-05 loss: 0.0884 (0.0919) time: 2.9060 data: 0.0084 max mem: 33302 Epoch: [28] [2460/4276] eta: 1:50:51 lr: 1.6187068246555413e-05 loss: 0.0903 (0.0919) time: 2.9064 data: 0.0084 max mem: 33302 Epoch: [28] [2470/4276] eta: 1:50:09 lr: 1.6184086012713058e-05 loss: 0.0899 (0.0919) time: 2.9165 data: 0.0085 max mem: 33302 Epoch: [28] [2480/4276] eta: 1:49:27 lr: 1.6181103717810036e-05 loss: 0.0939 (0.0919) time: 2.9397 data: 0.0082 max mem: 33302 Epoch: [28] [2490/4276] eta: 1:48:45 lr: 1.6178121361832588e-05 loss: 0.0939 (0.0919) time: 2.9425 data: 0.0077 max mem: 33302 Epoch: [28] [2500/4276] eta: 1:48:04 lr: 1.6175138944766944e-05 loss: 0.0893 (0.0919) time: 2.9438 data: 0.0078 max mem: 33302 Epoch: [28] [2510/4276] eta: 1:47:22 lr: 1.617215646659934e-05 loss: 0.0908 (0.0919) time: 2.9286 data: 0.0078 max mem: 33302 Epoch: [28] [2520/4276] eta: 1:46:40 lr: 1.6169173927316008e-05 loss: 0.0832 (0.0919) time: 2.9246 data: 0.0080 max mem: 33302 Epoch: [28] [2530/4276] eta: 1:45:59 lr: 1.6166191326903174e-05 loss: 0.0782 (0.0918) time: 2.9406 data: 0.0079 max mem: 33302 Epoch: [28] [2540/4276] eta: 1:45:18 lr: 1.616320866534704e-05 loss: 0.0920 (0.0918) time: 2.9440 data: 0.0083 max mem: 33302 Epoch: [28] [2550/4276] eta: 1:44:37 lr: 1.616022594263382e-05 loss: 0.0841 (0.0917) time: 2.9429 data: 0.0085 max mem: 33302 Epoch: [28] [2560/4276] eta: 1:43:56 lr: 1.615724315874973e-05 loss: 0.0716 (0.0917) time: 2.9423 data: 0.0081 max mem: 33302 Epoch: [28] [2570/4276] eta: 1:43:15 lr: 1.6154260313680945e-05 loss: 0.0736 (0.0917) time: 2.9434 data: 0.0079 max mem: 33302 Epoch: [28] [2580/4276] eta: 1:42:34 lr: 1.615127740741367e-05 loss: 0.0781 (0.0916) time: 2.9441 data: 0.0081 max mem: 33302 Epoch: [28] [2590/4276] eta: 1:41:53 lr: 1.6148294439934086e-05 loss: 0.0781 (0.0916) time: 2.9445 data: 0.0083 max mem: 33302 Epoch: [28] [2600/4276] eta: 1:41:12 lr: 1.6145311411228387e-05 loss: 0.0785 (0.0916) time: 2.9321 data: 0.0078 max mem: 33302 Epoch: [28] 
[2610/4276] eta: 1:40:32 lr: 1.6142328321282724e-05 loss: 0.0865 (0.0916) time: 2.9277 data: 0.0073 max mem: 33302 Epoch: [28] [2620/4276] eta: 1:39:51 lr: 1.6139345170083277e-05 loss: 0.0937 (0.0916) time: 2.9407 data: 0.0079 max mem: 33302 Epoch: [28] [2630/4276] eta: 1:39:11 lr: 1.613636195761621e-05 loss: 0.0803 (0.0915) time: 2.9440 data: 0.0082 max mem: 33302 Epoch: [28] [2640/4276] eta: 1:38:31 lr: 1.613337868386768e-05 loss: 0.0764 (0.0915) time: 2.9430 data: 0.0080 max mem: 33302 Epoch: [28] [2650/4276] eta: 1:37:50 lr: 1.6130395348823824e-05 loss: 0.0764 (0.0914) time: 2.9444 data: 0.0082 max mem: 33302 Epoch: [28] [2660/4276] eta: 1:37:10 lr: 1.6127411952470793e-05 loss: 0.0806 (0.0914) time: 2.9445 data: 0.0084 max mem: 33302 Epoch: [28] [2670/4276] eta: 1:36:30 lr: 1.612442849479473e-05 loss: 0.0899 (0.0915) time: 2.9443 data: 0.0084 max mem: 33302 Epoch: [28] [2680/4276] eta: 1:35:50 lr: 1.6121444975781767e-05 loss: 0.0936 (0.0915) time: 2.9441 data: 0.0079 max mem: 33302 Epoch: [28] [2690/4276] eta: 1:35:10 lr: 1.6118461395418017e-05 loss: 0.0838 (0.0915) time: 2.9329 data: 0.0082 max mem: 33302 Epoch: [28] [2700/4276] eta: 1:34:30 lr: 1.611547775368961e-05 loss: 0.0770 (0.0914) time: 2.9067 data: 0.0086 max mem: 33302 Epoch: [28] [2710/4276] eta: 1:33:50 lr: 1.611249405058266e-05 loss: 0.0782 (0.0914) time: 2.8897 data: 0.0084 max mem: 33302 Epoch: [28] [2720/4276] eta: 1:33:10 lr: 1.6109510286083277e-05 loss: 0.0649 (0.0913) time: 2.8915 data: 0.0088 max mem: 33302 Epoch: [28] [2730/4276] eta: 1:32:30 lr: 1.610652646017755e-05 loss: 0.0745 (0.0913) time: 2.8934 data: 0.0088 max mem: 33302 Epoch: [28] [2740/4276] eta: 1:31:50 lr: 1.610354257285159e-05 loss: 0.0872 (0.0913) time: 2.9010 data: 0.0082 max mem: 33302 Epoch: [28] [2750/4276] eta: 1:31:11 lr: 1.6100558624091478e-05 loss: 0.0914 (0.0914) time: 2.9079 data: 0.0085 max mem: 33302 Epoch: [28] [2760/4276] eta: 1:30:31 lr: 1.6097574613883305e-05 loss: 0.0882 (0.0913) time: 2.9005 data: 0.0091 
max mem: 33302 Epoch: [28] [2770/4276] eta: 1:29:51 lr: 1.609459054221314e-05 loss: 0.0848 (0.0913) time: 2.8914 data: 0.0085 max mem: 33302 Epoch: [28] [2780/4276] eta: 1:29:12 lr: 1.6091606409067057e-05 loss: 0.0888 (0.0913) time: 2.9021 data: 0.0085 max mem: 33302 Epoch: [28] [2790/4276] eta: 1:28:33 lr: 1.6088622214431128e-05 loss: 0.0941 (0.0914) time: 2.9067 data: 0.0088 max mem: 33302 Epoch: [28] [2800/4276] eta: 1:27:53 lr: 1.6085637958291415e-05 loss: 0.0839 (0.0913) time: 2.9021 data: 0.0084 max mem: 33302 Epoch: [28] [2810/4276] eta: 1:27:14 lr: 1.6082653640633955e-05 loss: 0.0722 (0.0913) time: 2.9177 data: 0.0087 max mem: 33302 Epoch: [28] [2820/4276] eta: 1:26:35 lr: 1.607966926144481e-05 loss: 0.0775 (0.0912) time: 2.9216 data: 0.0088 max mem: 33302 Epoch: [28] [2830/4276] eta: 1:25:56 lr: 1.6076684820710014e-05 loss: 0.0846 (0.0913) time: 2.9085 data: 0.0090 max mem: 33302 Epoch: [28] [2840/4276] eta: 1:25:17 lr: 1.6073700318415618e-05 loss: 0.0909 (0.0913) time: 2.9039 data: 0.0096 max mem: 33302 Epoch: [28] [2850/4276] eta: 1:24:38 lr: 1.6070715754547628e-05 loss: 0.0909 (0.0913) time: 2.9185 data: 0.0090 max mem: 33302 Epoch: [28] [2860/4276] eta: 1:24:00 lr: 1.606773112909208e-05 loss: 0.0738 (0.0913) time: 2.9412 data: 0.0078 max mem: 33302 Epoch: [28] [2870/4276] eta: 1:23:21 lr: 1.6064746442034997e-05 loss: 0.0738 (0.0913) time: 2.9459 data: 0.0075 max mem: 33302 Epoch: [28] [2880/4276] eta: 1:22:43 lr: 1.606176169336238e-05 loss: 0.0864 (0.0913) time: 2.9442 data: 0.0079 max mem: 33302 Epoch: [28] [2890/4276] eta: 1:22:04 lr: 1.6058776883060237e-05 loss: 0.0963 (0.0913) time: 2.9446 data: 0.0077 max mem: 33302 Epoch: [28] [2900/4276] eta: 1:21:26 lr: 1.6055792011114568e-05 loss: 0.0863 (0.0913) time: 2.9427 data: 0.0073 max mem: 33302 Epoch: [28] [2910/4276] eta: 1:20:47 lr: 1.6052807077511374e-05 loss: 0.0870 (0.0913) time: 2.9448 data: 0.0073 max mem: 33302 Epoch: [28] [2920/4276] eta: 1:20:09 lr: 1.604982208223663e-05 loss: 0.0910 
(0.0913) time: 2.9423 data: 0.0074 max mem: 33302 Epoch: [28] [2930/4276] eta: 1:19:30 lr: 1.604683702527632e-05 loss: 0.0851 (0.0913) time: 2.9177 data: 0.0077 max mem: 33302 Epoch: [28] [2940/4276] eta: 1:18:52 lr: 1.604385190661642e-05 loss: 0.0870 (0.0913) time: 2.8979 data: 0.0081 max mem: 33302 Epoch: [28] [2950/4276] eta: 1:18:14 lr: 1.6040866726242903e-05 loss: 0.0870 (0.0913) time: 2.9036 data: 0.0083 max mem: 33302 Epoch: [28] [2960/4276] eta: 1:17:36 lr: 1.6037881484141724e-05 loss: 0.0869 (0.0913) time: 2.9181 data: 0.0087 max mem: 33302 Epoch: [28] [2970/4276] eta: 1:16:57 lr: 1.6034896180298847e-05 loss: 0.0871 (0.0913) time: 2.9109 data: 0.0087 max mem: 33302 Epoch: [28] [2980/4276] eta: 1:16:19 lr: 1.6031910814700218e-05 loss: 0.0938 (0.0913) time: 2.8993 data: 0.0082 max mem: 33302 Epoch: [28] [2990/4276] eta: 1:15:41 lr: 1.602892538733179e-05 loss: 0.0868 (0.0913) time: 2.9188 data: 0.0084 max mem: 33302 Epoch: [28] [3000/4276] eta: 1:15:04 lr: 1.6025939898179485e-05 loss: 0.0772 (0.0913) time: 2.9373 data: 0.0081 max mem: 33302 Epoch: [28] [3010/4276] eta: 1:14:26 lr: 1.6022954347229244e-05 loss: 0.0772 (0.0913) time: 2.9415 data: 0.0078 max mem: 33302 Epoch: [28] [3020/4276] eta: 1:13:48 lr: 1.6019968734466993e-05 loss: 0.0915 (0.0913) time: 2.9322 data: 0.0085 max mem: 33302 Epoch: [28] [3030/4276] eta: 1:13:10 lr: 1.6016983059878663e-05 loss: 0.0913 (0.0913) time: 2.9067 data: 0.0083 max mem: 33302 Epoch: [28] [3040/4276] eta: 1:12:32 lr: 1.6013997323450147e-05 loss: 0.1089 (0.0914) time: 2.8903 data: 0.0074 max mem: 33302 Epoch: [28] [3050/4276] eta: 1:11:55 lr: 1.6011011525167366e-05 loss: 0.0983 (0.0914) time: 2.8884 data: 0.0072 max mem: 33302 Epoch: [28] [3060/4276] eta: 1:11:17 lr: 1.6008025665016217e-05 loss: 0.0821 (0.0913) time: 2.8996 data: 0.0072 max mem: 33302 Epoch: [28] [3070/4276] eta: 1:10:39 lr: 1.60050397429826e-05 loss: 0.0777 (0.0913) time: 2.9018 data: 0.0074 max mem: 33302 Epoch: [28] [3080/4276] eta: 1:10:02 lr: 
1.60020537590524e-05 loss: 0.0777 (0.0913) time: 2.8897 data: 0.0074 max mem: 33302 Epoch: [28] [3090/4276] eta: 1:09:24 lr: 1.5999067713211496e-05 loss: 0.0773 (0.0912) time: 2.8903 data: 0.0074 max mem: 33302 Epoch: [28] [3100/4276] eta: 1:08:47 lr: 1.5996081605445777e-05 loss: 0.0839 (0.0912) time: 2.9061 data: 0.0082 max mem: 33302 Epoch: [28] [3110/4276] eta: 1:08:10 lr: 1.5993095435741108e-05 loss: 0.0790 (0.0912) time: 2.9163 data: 0.0086 max mem: 33302 Epoch: [28] [3120/4276] eta: 1:07:32 lr: 1.599010920408335e-05 loss: 0.0790 (0.0912) time: 2.9260 data: 0.0086 max mem: 33302 Epoch: [28] [3130/4276] eta: 1:06:55 lr: 1.598712291045836e-05 loss: 0.0874 (0.0911) time: 2.9347 data: 0.0091 max mem: 33302 Epoch: [28] [3140/4276] eta: 1:06:18 lr: 1.5984136554852e-05 loss: 0.0812 (0.0911) time: 2.9111 data: 0.0088 max mem: 33302 Epoch: [28] [3150/4276] eta: 1:05:41 lr: 1.598115013725011e-05 loss: 0.0828 (0.0911) time: 2.8901 data: 0.0081 max mem: 33302 Epoch: [28] [3160/4276] eta: 1:05:04 lr: 1.5978163657638533e-05 loss: 0.0835 (0.0911) time: 2.8906 data: 0.0081 max mem: 33302 Epoch: [28] [3170/4276] eta: 1:04:26 lr: 1.5975177116003096e-05 loss: 0.0845 (0.0912) time: 2.8914 data: 0.0081 max mem: 33302 Epoch: [28] [3180/4276] eta: 1:03:49 lr: 1.597219051232964e-05 loss: 0.0815 (0.0911) time: 2.8897 data: 0.0079 max mem: 33302 Epoch: [28] [3190/4276] eta: 1:03:12 lr: 1.596920384660397e-05 loss: 0.0808 (0.0912) time: 2.8887 data: 0.0078 max mem: 33302 Epoch: [28] [3200/4276] eta: 1:02:35 lr: 1.5966217118811912e-05 loss: 0.0884 (0.0911) time: 2.8902 data: 0.0082 max mem: 33302 Epoch: [28] [3210/4276] eta: 1:01:59 lr: 1.596323032893927e-05 loss: 0.0882 (0.0911) time: 2.9077 data: 0.0091 max mem: 33302 Epoch: [28] [3220/4276] eta: 1:01:22 lr: 1.5960243476971857e-05 loss: 0.0882 (0.0912) time: 2.9151 data: 0.0093 max mem: 33302 Epoch: [28] [3230/4276] eta: 1:00:45 lr: 1.5957256562895457e-05 loss: 0.0818 (0.0911) time: 2.9266 data: 0.0085 max mem: 33302 Epoch: [28] 
[3240/4276] eta: 1:00:09 lr: 1.5954269586695862e-05 loss: 0.0896 (0.0912) time: 2.9432 data: 0.0076 max mem: 33302 Epoch: [28] [3250/4276] eta: 0:59:32 lr: 1.5951282548358866e-05 loss: 0.1028 (0.0912) time: 2.9189 data: 0.0077 max mem: 33302 Epoch: [28] [3260/4276] eta: 0:58:55 lr: 1.594829544787024e-05 loss: 0.0913 (0.0912) time: 2.8940 data: 0.0084 max mem: 33302 Epoch: [28] [3270/4276] eta: 0:58:19 lr: 1.5945308285215756e-05 loss: 0.0914 (0.0912) time: 2.8926 data: 0.0084 max mem: 33302 Epoch: [28] [3280/4276] eta: 0:57:42 lr: 1.594232106038118e-05 loss: 0.0921 (0.0912) time: 2.8926 data: 0.0083 max mem: 33302 Epoch: [28] [3290/4276] eta: 0:57:06 lr: 1.593933377335227e-05 loss: 0.0890 (0.0912) time: 2.9006 data: 0.0082 max mem: 33302 Epoch: [28] [3300/4276] eta: 0:56:29 lr: 1.5936346424114788e-05 loss: 0.0973 (0.0913) time: 2.9269 data: 0.0080 max mem: 33302 Epoch: [28] [3310/4276] eta: 0:55:53 lr: 1.593335901265447e-05 loss: 0.1038 (0.0913) time: 2.9425 data: 0.0077 max mem: 33302 Epoch: [28] [3320/4276] eta: 0:55:17 lr: 1.5930371538957062e-05 loss: 0.1064 (0.0913) time: 2.9411 data: 0.0076 max mem: 33302 Epoch: [28] [3330/4276] eta: 0:54:41 lr: 1.5927384003008294e-05 loss: 0.0914 (0.0913) time: 2.9218 data: 0.0082 max mem: 33302 Epoch: [28] [3340/4276] eta: 0:54:04 lr: 1.5924396404793907e-05 loss: 0.0843 (0.0913) time: 2.9223 data: 0.0083 max mem: 33302 Epoch: [28] [3350/4276] eta: 0:53:28 lr: 1.5921408744299606e-05 loss: 0.0814 (0.0913) time: 2.9428 data: 0.0076 max mem: 33302 Epoch: [28] [3360/4276] eta: 0:52:52 lr: 1.5918421021511114e-05 loss: 0.0817 (0.0913) time: 2.9424 data: 0.0074 max mem: 33302 Epoch: [28] [3370/4276] eta: 0:52:16 lr: 1.5915433236414143e-05 loss: 0.0868 (0.0913) time: 2.9409 data: 0.0076 max mem: 33302 Epoch: [28] [3380/4276] eta: 0:51:40 lr: 1.59124453889944e-05 loss: 0.0868 (0.0913) time: 2.9410 data: 0.0078 max mem: 33302 Epoch: [28] [3390/4276] eta: 0:51:04 lr: 1.590945747923757e-05 loss: 0.0865 (0.0913) time: 2.9416 data: 0.0079 
max mem: 33302 Epoch: [28] [3400/4276] eta: 0:50:28 lr: 1.5906469507129352e-05 loss: 0.0910 (0.0913) time: 2.9432 data: 0.0074 max mem: 33302 Epoch: [28] [3410/4276] eta: 0:49:52 lr: 1.5903481472655426e-05 loss: 0.0885 (0.0913) time: 2.9438 data: 0.0075 max mem: 33302 Epoch: [28] [3420/4276] eta: 0:49:17 lr: 1.5900493375801477e-05 loss: 0.1016 (0.0914) time: 2.9434 data: 0.0077 max mem: 33302 Epoch: [28] [3430/4276] eta: 0:48:41 lr: 1.589750521655317e-05 loss: 0.0965 (0.0914) time: 2.9449 data: 0.0077 max mem: 33302 Epoch: [28] [3440/4276] eta: 0:48:05 lr: 1.589451699489617e-05 loss: 0.0798 (0.0914) time: 2.9441 data: 0.0076 max mem: 33302 Epoch: [28] [3450/4276] eta: 0:47:29 lr: 1.5891528710816146e-05 loss: 0.1028 (0.0915) time: 2.9414 data: 0.0076 max mem: 33302 Epoch: [28] [3460/4276] eta: 0:46:53 lr: 1.5888540364298745e-05 loss: 0.1057 (0.0915) time: 2.9304 data: 0.0077 max mem: 33302 Epoch: [28] [3470/4276] eta: 0:46:18 lr: 1.5885551955329608e-05 loss: 0.0835 (0.0914) time: 2.9071 data: 0.0086 max mem: 33302 Epoch: [28] [3480/4276] eta: 0:45:42 lr: 1.5882563483894383e-05 loss: 0.0804 (0.0914) time: 2.8948 data: 0.0094 max mem: 33302 Epoch: [28] [3490/4276] eta: 0:45:06 lr: 1.5879574949978708e-05 loss: 0.0881 (0.0914) time: 2.8949 data: 0.0092 max mem: 33302 Epoch: [28] [3500/4276] eta: 0:44:31 lr: 1.5876586353568196e-05 loss: 0.0767 (0.0914) time: 2.8934 data: 0.0089 max mem: 33302 Epoch: [28] [3510/4276] eta: 0:43:55 lr: 1.5873597694648476e-05 loss: 0.0802 (0.0914) time: 2.8932 data: 0.0088 max mem: 33302 Epoch: [28] [3520/4276] eta: 0:43:19 lr: 1.5870608973205168e-05 loss: 0.0795 (0.0914) time: 2.8936 data: 0.0089 max mem: 33302 Epoch: [28] [3530/4276] eta: 0:42:44 lr: 1.586762018922388e-05 loss: 0.0810 (0.0914) time: 2.8923 data: 0.0090 max mem: 33302 Epoch: [28] [3540/4276] eta: 0:42:08 lr: 1.586463134269021e-05 loss: 0.0898 (0.0914) time: 2.8945 data: 0.0091 max mem: 33302 Epoch: [28] [3550/4276] eta: 0:41:33 lr: 1.586164243358975e-05 loss: 0.0845 
(0.0914) time: 2.8975 data: 0.0092 max mem: 33302 Epoch: [28] [3560/4276] eta: 0:40:58 lr: 1.5858653461908102e-05 loss: 0.0954 (0.0914) time: 2.9168 data: 0.0084 max mem: 33302 Epoch: [28] [3570/4276] eta: 0:40:22 lr: 1.5855664427630844e-05 loss: 0.0984 (0.0914) time: 2.9383 data: 0.0074 max mem: 33302 Epoch: [28] [3580/4276] eta: 0:39:47 lr: 1.5852675330743553e-05 loss: 0.0857 (0.0914) time: 2.9414 data: 0.0072 max mem: 33302 Epoch: [28] [3590/4276] eta: 0:39:12 lr: 1.5849686171231793e-05 loss: 0.0814 (0.0914) time: 2.9414 data: 0.0072 max mem: 33302 Epoch: [28] [3600/4276] eta: 0:38:37 lr: 1.584669694908114e-05 loss: 0.0848 (0.0914) time: 2.9392 data: 0.0073 max mem: 33302 Epoch: [28] [3610/4276] eta: 0:38:01 lr: 1.584370766427715e-05 loss: 0.0862 (0.0914) time: 2.9407 data: 0.0073 max mem: 33302 Epoch: [28] [3620/4276] eta: 0:37:26 lr: 1.5840718316805367e-05 loss: 0.0825 (0.0914) time: 2.9390 data: 0.0075 max mem: 33302 Epoch: [28] [3630/4276] eta: 0:36:51 lr: 1.583772890665134e-05 loss: 0.0849 (0.0914) time: 2.9381 data: 0.0074 max mem: 33302 Epoch: [28] [3640/4276] eta: 0:36:16 lr: 1.5834739433800612e-05 loss: 0.0831 (0.0914) time: 2.9425 data: 0.0072 max mem: 33302 Epoch: [28] [3650/4276] eta: 0:35:41 lr: 1.5831749898238715e-05 loss: 0.0739 (0.0913) time: 2.9428 data: 0.0073 max mem: 33302 Epoch: [28] [3660/4276] eta: 0:35:06 lr: 1.582876029995117e-05 loss: 0.0739 (0.0913) time: 2.9422 data: 0.0072 max mem: 33302 Epoch: [28] [3670/4276] eta: 0:34:31 lr: 1.58257706389235e-05 loss: 0.0805 (0.0913) time: 2.9431 data: 0.0072 max mem: 33302 Epoch: [28] [3680/4276] eta: 0:33:56 lr: 1.582278091514122e-05 loss: 0.0841 (0.0913) time: 2.9422 data: 0.0072 max mem: 33302 Epoch: [28] [3690/4276] eta: 0:33:21 lr: 1.581979112858984e-05 loss: 0.0878 (0.0913) time: 2.9421 data: 0.0072 max mem: 33302 Epoch: [28] [3700/4276] eta: 0:32:46 lr: 1.5816801279254854e-05 loss: 0.0878 (0.0913) time: 2.9430 data: 0.0072 max mem: 33302 Epoch: [28] [3710/4276] eta: 0:32:11 lr: 
1.5813811367121756e-05 loss: 0.0842 (0.0913) time: 2.9421 data: 0.0073 max mem: 33302 Epoch: [28] [3720/4276] eta: 0:31:37 lr: 1.5810821392176036e-05 loss: 0.0924 (0.0913) time: 2.9467 data: 0.0072 max mem: 33302 Epoch: [28] [3730/4276] eta: 0:31:02 lr: 1.5807831354403186e-05 loss: 0.0923 (0.0913) time: 2.9482 data: 0.0072 max mem: 33302 Epoch: [28] [3740/4276] eta: 0:30:27 lr: 1.5804841253788666e-05 loss: 0.0843 (0.0913) time: 2.9431 data: 0.0073 max mem: 33302 Epoch: [28] [3750/4276] eta: 0:29:52 lr: 1.580185109031795e-05 loss: 0.0958 (0.0913) time: 2.9415 data: 0.0073 max mem: 33302 Epoch: [28] [3760/4276] eta: 0:29:17 lr: 1.5798860863976506e-05 loss: 0.0800 (0.0912) time: 2.9408 data: 0.0072 max mem: 33302 Epoch: [28] [3770/4276] eta: 0:28:43 lr: 1.579587057474978e-05 loss: 0.0707 (0.0913) time: 2.9412 data: 0.0072 max mem: 33302 Epoch: [28] [3780/4276] eta: 0:28:08 lr: 1.5792880222623228e-05 loss: 0.0858 (0.0912) time: 2.9414 data: 0.0072 max mem: 33302 Epoch: [28] [3790/4276] eta: 0:27:33 lr: 1.578988980758229e-05 loss: 0.0739 (0.0912) time: 2.9406 data: 0.0072 max mem: 33302 Epoch: [28] [3800/4276] eta: 0:26:59 lr: 1.578689932961241e-05 loss: 0.0782 (0.0912) time: 2.9406 data: 0.0072 max mem: 33302 Epoch: [28] [3810/4276] eta: 0:26:24 lr: 1.5783908788699006e-05 loss: 0.0890 (0.0912) time: 2.9316 data: 0.0079 max mem: 33302 Epoch: [28] [3820/4276] eta: 0:25:50 lr: 1.5780918184827508e-05 loss: 0.0821 (0.0912) time: 2.9073 data: 0.0092 max mem: 33302 Epoch: [28] [3830/4276] eta: 0:25:15 lr: 1.5777927517983335e-05 loss: 0.0821 (0.0912) time: 2.9145 data: 0.0089 max mem: 33302 Epoch: [28] [3840/4276] eta: 0:24:41 lr: 1.57749367881519e-05 loss: 0.0886 (0.0912) time: 2.9377 data: 0.0079 max mem: 33302 Epoch: [28] [3850/4276] eta: 0:24:06 lr: 1.5771945995318598e-05 loss: 0.0787 (0.0912) time: 2.9422 data: 0.0079 max mem: 33302 Epoch: [28] [3860/4276] eta: 0:23:32 lr: 1.5768955139468835e-05 loss: 0.0831 (0.0912) time: 2.9449 data: 0.0081 max mem: 33302 Epoch: [28] 
[3870/4276] eta: 0:22:57 lr: 1.5765964220588e-05 loss: 0.0905 (0.0912) time: 2.9434 data: 0.0078 max mem: 33302 Epoch: [28] [3880/4276] eta: 0:22:23 lr: 1.576297323866148e-05 loss: 0.0778 (0.0911) time: 2.9423 data: 0.0075 max mem: 33302 Epoch: [28] [3890/4276] eta: 0:21:49 lr: 1.575998219367465e-05 loss: 0.0747 (0.0911) time: 2.9417 data: 0.0075 max mem: 33302 Epoch: [28] [3900/4276] eta: 0:21:14 lr: 1.5756991085612887e-05 loss: 0.0802 (0.0911) time: 2.9425 data: 0.0077 max mem: 33302 Epoch: [28] [3910/4276] eta: 0:20:40 lr: 1.5753999914461546e-05 loss: 0.0845 (0.0911) time: 2.9442 data: 0.0077 max mem: 33302 Epoch: [28] [3920/4276] eta: 0:20:06 lr: 1.5751008680206004e-05 loss: 0.0786 (0.0911) time: 2.9451 data: 0.0077 max mem: 33302 Epoch: [28] [3930/4276] eta: 0:19:31 lr: 1.5748017382831596e-05 loss: 0.0789 (0.0911) time: 2.9438 data: 0.0079 max mem: 33302 Epoch: [28] [3940/4276] eta: 0:18:57 lr: 1.5745026022323676e-05 loss: 0.0875 (0.0911) time: 2.9432 data: 0.0079 max mem: 33302 Epoch: [28] [3950/4276] eta: 0:18:23 lr: 1.5742034598667582e-05 loss: 0.0804 (0.0910) time: 2.9426 data: 0.0076 max mem: 33302 Epoch: [28] [3960/4276] eta: 0:17:49 lr: 1.573904311184865e-05 loss: 0.0824 (0.0911) time: 2.9424 data: 0.0076 max mem: 33302 Epoch: [28] [3970/4276] eta: 0:17:14 lr: 1.5736051561852207e-05 loss: 0.0985 (0.0911) time: 2.9432 data: 0.0076 max mem: 33302 Epoch: [28] [3980/4276] eta: 0:16:40 lr: 1.5733059948663567e-05 loss: 0.1028 (0.0911) time: 2.9462 data: 0.0078 max mem: 33302 Epoch: [28] [3990/4276] eta: 0:16:06 lr: 1.5730068272268047e-05 loss: 0.0896 (0.0911) time: 2.9408 data: 0.0079 max mem: 33302 Epoch: [28] [4000/4276] eta: 0:15:32 lr: 1.572707653265096e-05 loss: 0.0770 (0.0911) time: 2.9159 data: 0.0082 max mem: 33302 Epoch: [28] [4010/4276] eta: 0:14:58 lr: 1.5724084729797597e-05 loss: 0.0770 (0.0911) time: 2.9010 data: 0.0086 max mem: 33302 Epoch: [28] [4020/4276] eta: 0:14:24 lr: 1.5721092863693258e-05 loss: 0.0843 (0.0911) time: 2.8989 data: 0.0083 
max mem: 33302 Epoch: [28] [4030/4276] eta: 0:13:50 lr: 1.5718100934323224e-05 loss: 0.0835 (0.0911) time: 2.8938 data: 0.0080 max mem: 33302 Epoch: [28] [4040/4276] eta: 0:13:16 lr: 1.571510894167279e-05 loss: 0.0899 (0.0911) time: 2.8926 data: 0.0079 max mem: 33302 Epoch: [28] [4050/4276] eta: 0:12:42 lr: 1.5712116885727213e-05 loss: 0.0887 (0.0911) time: 2.8920 data: 0.0075 max mem: 33302 Epoch: [28] [4060/4276] eta: 0:12:08 lr: 1.570912476647177e-05 loss: 0.0723 (0.0911) time: 2.8936 data: 0.0074 max mem: 33302 Epoch: [28] [4070/4276] eta: 0:11:34 lr: 1.5706132583891726e-05 loss: 0.0865 (0.0911) time: 2.8947 data: 0.0081 max mem: 33302 Epoch: [28] [4080/4276] eta: 0:11:00 lr: 1.5703140337972327e-05 loss: 0.0938 (0.0911) time: 2.8963 data: 0.0087 max mem: 33302 Epoch: [28] [4090/4276] eta: 0:10:26 lr: 1.5700148028698825e-05 loss: 0.0980 (0.0912) time: 2.9007 data: 0.0090 max mem: 33302 Epoch: [28] [4100/4276] eta: 0:09:52 lr: 1.569715565605646e-05 loss: 0.1025 (0.0912) time: 2.8997 data: 0.0088 max mem: 33302 Epoch: [28] [4110/4276] eta: 0:09:18 lr: 1.569416322003048e-05 loss: 0.0998 (0.0912) time: 2.8934 data: 0.0084 max mem: 33302 Epoch: [28] [4120/4276] eta: 0:08:44 lr: 1.5691170720606095e-05 loss: 0.0986 (0.0912) time: 2.8923 data: 0.0087 max mem: 33302 Epoch: [28] [4130/4276] eta: 0:08:11 lr: 1.5688178157768533e-05 loss: 0.0832 (0.0912) time: 2.8958 data: 0.0087 max mem: 33302 Epoch: [28] [4140/4276] eta: 0:07:37 lr: 1.568518553150301e-05 loss: 0.0797 (0.0912) time: 2.8946 data: 0.0083 max mem: 33302 Epoch: [28] [4150/4276] eta: 0:07:03 lr: 1.5682192841794745e-05 loss: 0.0853 (0.0912) time: 2.8918 data: 0.0085 max mem: 33302 Epoch: [28] [4160/4276] eta: 0:06:29 lr: 1.5679200088628924e-05 loss: 0.0815 (0.0912) time: 2.8929 data: 0.0083 max mem: 33302 Epoch: [28] [4170/4276] eta: 0:05:56 lr: 1.5676207271990746e-05 loss: 0.0908 (0.0913) time: 2.8943 data: 0.0082 max mem: 33302 Epoch: [28] [4180/4276] eta: 0:05:22 lr: 1.567321439186541e-05 loss: 0.0983 (0.0913) 
time: 2.8927 data: 0.0084 max mem: 33302
Epoch: [28] [4190/4276] eta: 0:04:48 lr: 1.5670221448238094e-05 loss: 0.0838 (0.0913) time: 2.8936 data: 0.0082 max mem: 33302
Epoch: [28] [4200/4276] eta: 0:04:15 lr: 1.566722844109397e-05 loss: 0.1042 (0.0913) time: 2.8955 data: 0.0087 max mem: 33302
Epoch: [28] [4210/4276] eta: 0:03:41 lr: 1.5664235370418208e-05 loss: 0.1030 (0.0913) time: 2.8955 data: 0.0089 max mem: 33302
Epoch: [28] [4220/4276] eta: 0:03:07 lr: 1.566124223619597e-05 loss: 0.1044 (0.0914) time: 2.8941 data: 0.0088 max mem: 33302
Epoch: [28] [4230/4276] eta: 0:02:34 lr: 1.565824903841242e-05 loss: 0.1066 (0.0914) time: 2.8918 data: 0.0087 max mem: 33302
Epoch: [28] [4240/4276] eta: 0:02:00 lr: 1.56552557770527e-05 loss: 0.0975 (0.0914) time: 2.8907 data: 0.0083 max mem: 33302
Epoch: [28] [4250/4276] eta: 0:01:27 lr: 1.5652262452101947e-05 loss: 0.0993 (0.0915) time: 2.8915 data: 0.0089 max mem: 33302
Epoch: [28] [4260/4276] eta: 0:00:53 lr: 1.564926906354531e-05 loss: 0.0993 (0.0915) time: 2.8931 data: 0.0089 max mem: 33302
Epoch: [28] [4270/4276] eta: 0:00:20 lr: 1.5646275611367915e-05 loss: 0.0926 (0.0915) time: 2.8878 data: 0.0080 max mem: 33302
Epoch: [28] Total time: 3:58:34
Test: [ 0/21770] eta: 13:22:19 time: 2.2113 data: 2.1720 max mem: 33302
Test: [ 100/21770] eta: 0:21:40 time: 0.0387 data: 0.0008 max mem: 33302
Test: [ 200/21770] eta: 0:17:40 time: 0.0384 data: 0.0008 max mem: 33302
Test: [ 300/21770] eta: 0:16:17 time: 0.0384 data: 0.0008 max mem: 33302
Test: [ 400/21770] eta: 0:15:33 time: 0.0380 data: 0.0008 max mem: 33302
Test: [ 500/21770] eta: 0:15:05 time: 0.0382 data: 0.0008 max mem: 33302
Test: [ 600/21770] eta: 0:14:46 time: 0.0387 data: 0.0008 max mem: 33302
Test: [ 700/21770] eta: 0:14:32 time: 0.0385 data: 0.0008 max mem: 33302
Test: [ 800/21770] eta: 0:14:19 time: 0.0383 data: 0.0008 max mem: 33302
Test: [ 900/21770] eta: 0:14:09 time: 0.0381 data: 0.0008 max mem: 33302
Test: [ 1000/21770] eta: 0:14:00 time: 0.0386 data: 0.0008
max mem: 33302 Test: [ 1100/21770] eta: 0:13:51 time: 0.0380 data: 0.0009 max mem: 33302 Test: [ 1200/21770] eta: 0:13:43 time: 0.0380 data: 0.0009 max mem: 33302 Test: [ 1300/21770] eta: 0:13:36 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 1400/21770] eta: 0:13:30 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 1500/21770] eta: 0:13:23 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 1600/21770] eta: 0:13:18 time: 0.0380 data: 0.0009 max mem: 33302 Test: [ 1700/21770] eta: 0:13:12 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 1800/21770] eta: 0:13:07 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 1900/21770] eta: 0:13:02 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 2000/21770] eta: 0:12:57 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 2100/21770] eta: 0:12:52 time: 0.0382 data: 0.0008 max mem: 33302 Test: [ 2200/21770] eta: 0:12:47 time: 0.0383 data: 0.0008 max mem: 33302 Test: [ 2300/21770] eta: 0:12:42 time: 0.0380 data: 0.0008 max mem: 33302 Test: [ 2400/21770] eta: 0:12:38 time: 0.0388 data: 0.0009 max mem: 33302 Test: [ 2500/21770] eta: 0:12:34 time: 0.0387 data: 0.0009 max mem: 33302 Test: [ 2600/21770] eta: 0:12:30 time: 0.0387 data: 0.0009 max mem: 33302 Test: [ 2700/21770] eta: 0:12:25 time: 0.0385 data: 0.0009 max mem: 33302 Test: [ 2800/21770] eta: 0:12:21 time: 0.0388 data: 0.0009 max mem: 33302 Test: [ 2900/21770] eta: 0:12:17 time: 0.0388 data: 0.0009 max mem: 33302 Test: [ 3000/21770] eta: 0:12:13 time: 0.0388 data: 0.0009 max mem: 33302 Test: [ 3100/21770] eta: 0:12:09 time: 0.0386 data: 0.0009 max mem: 33302 Test: [ 3200/21770] eta: 0:12:05 time: 0.0388 data: 0.0009 max mem: 33302 Test: [ 3300/21770] eta: 0:12:01 time: 0.0388 data: 0.0009 max mem: 33302 Test: [ 3400/21770] eta: 0:11:57 time: 0.0388 data: 0.0009 max mem: 33302 Test: [ 3500/21770] eta: 0:11:53 time: 0.0387 data: 0.0009 max mem: 33302 Test: [ 3600/21770] eta: 0:11:48 time: 0.0389 data: 0.0009 max mem: 33302 Test: [ 3700/21770] eta: 0:11:44 time: 0.0386 data: 0.0009 
max mem: 33302 Test: [ 3800/21770] eta: 0:11:40 time: 0.0388 data: 0.0009 max mem: 33302 Test: [ 3900/21770] eta: 0:11:36 time: 0.0388 data: 0.0009 max mem: 33302 Test: [ 4000/21770] eta: 0:11:32 time: 0.0388 data: 0.0009 max mem: 33302 Test: [ 4100/21770] eta: 0:11:28 time: 0.0386 data: 0.0009 max mem: 33302 Test: [ 4200/21770] eta: 0:11:24 time: 0.0386 data: 0.0009 max mem: 33302 Test: [ 4300/21770] eta: 0:11:20 time: 0.0384 data: 0.0009 max mem: 33302 Test: [ 4400/21770] eta: 0:11:16 time: 0.0385 data: 0.0009 max mem: 33302 Test: [ 4500/21770] eta: 0:11:12 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 4600/21770] eta: 0:11:08 time: 0.0386 data: 0.0009 max mem: 33302 Test: [ 4700/21770] eta: 0:11:04 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 4800/21770] eta: 0:11:00 time: 0.0386 data: 0.0009 max mem: 33302 Test: [ 4900/21770] eta: 0:10:56 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 5000/21770] eta: 0:10:52 time: 0.0387 data: 0.0009 max mem: 33302 Test: [ 5100/21770] eta: 0:10:48 time: 0.0386 data: 0.0009 max mem: 33302 Test: [ 5200/21770] eta: 0:10:44 time: 0.0389 data: 0.0009 max mem: 33302 Test: [ 5300/21770] eta: 0:10:40 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 5400/21770] eta: 0:10:36 time: 0.0386 data: 0.0009 max mem: 33302 Test: [ 5500/21770] eta: 0:10:32 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 5600/21770] eta: 0:10:28 time: 0.0387 data: 0.0009 max mem: 33302 Test: [ 5700/21770] eta: 0:10:24 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 5800/21770] eta: 0:10:20 time: 0.0386 data: 0.0009 max mem: 33302 Test: [ 5900/21770] eta: 0:10:16 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 6000/21770] eta: 0:10:12 time: 0.0387 data: 0.0009 max mem: 33302 Test: [ 6100/21770] eta: 0:10:08 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 6200/21770] eta: 0:10:04 time: 0.0387 data: 0.0009 max mem: 33302 Test: [ 6300/21770] eta: 0:10:00 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 6400/21770] eta: 0:09:56 time: 0.0388 data: 0.0009 
max mem: 33302 Test: [ 6500/21770] eta: 0:09:52 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 6600/21770] eta: 0:09:48 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 6700/21770] eta: 0:09:44 time: 0.0380 data: 0.0009 max mem: 33302 Test: [ 6800/21770] eta: 0:09:40 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 6900/21770] eta: 0:09:36 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 7000/21770] eta: 0:09:32 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 7100/21770] eta: 0:09:28 time: 0.0380 data: 0.0009 max mem: 33302 Test: [ 7200/21770] eta: 0:09:24 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 7300/21770] eta: 0:09:20 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 7400/21770] eta: 0:09:16 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 7500/21770] eta: 0:09:12 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 7600/21770] eta: 0:09:08 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 7700/21770] eta: 0:09:04 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 7800/21770] eta: 0:09:00 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 7900/21770] eta: 0:08:56 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 8000/21770] eta: 0:08:52 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 8100/21770] eta: 0:08:48 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 8200/21770] eta: 0:08:44 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 8300/21770] eta: 0:08:40 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 8400/21770] eta: 0:08:36 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 8500/21770] eta: 0:08:32 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 8600/21770] eta: 0:08:28 time: 0.0378 data: 0.0009 max mem: 33302 Test: [ 8700/21770] eta: 0:08:24 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 8800/21770] eta: 0:08:20 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 8900/21770] eta: 0:08:16 time: 0.0378 data: 0.0009 max mem: 33302 Test: [ 9000/21770] eta: 0:08:12 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 9100/21770] eta: 0:08:09 time: 0.0381 data: 0.0009 
max mem: 33302 Test: [ 9200/21770] eta: 0:08:05 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 9300/21770] eta: 0:08:01 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 9400/21770] eta: 0:07:57 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 9500/21770] eta: 0:07:53 time: 0.0384 data: 0.0009 max mem: 33302 Test: [ 9600/21770] eta: 0:07:49 time: 0.0385 data: 0.0009 max mem: 33302 Test: [ 9700/21770] eta: 0:07:45 time: 0.0384 data: 0.0009 max mem: 33302 Test: [ 9800/21770] eta: 0:07:41 time: 0.0384 data: 0.0009 max mem: 33302 Test: [ 9900/21770] eta: 0:07:37 time: 0.0383 data: 0.0009 max mem: 33302 Test: [10000/21770] eta: 0:07:33 time: 0.0384 data: 0.0009 max mem: 33302 Test: [10100/21770] eta: 0:07:30 time: 0.0387 data: 0.0009 max mem: 33302 Test: [10200/21770] eta: 0:07:26 time: 0.0385 data: 0.0009 max mem: 33302 Test: [10300/21770] eta: 0:07:22 time: 0.0385 data: 0.0009 max mem: 33302 Test: [10400/21770] eta: 0:07:18 time: 0.0384 data: 0.0009 max mem: 33302 Test: [10500/21770] eta: 0:07:14 time: 0.0383 data: 0.0009 max mem: 33302 Test: [10600/21770] eta: 0:07:10 time: 0.0385 data: 0.0009 max mem: 33302 Test: [10700/21770] eta: 0:07:06 time: 0.0383 data: 0.0009 max mem: 33302 Test: [10800/21770] eta: 0:07:03 time: 0.0384 data: 0.0009 max mem: 33302 Test: [10900/21770] eta: 0:06:59 time: 0.0382 data: 0.0009 max mem: 33302 Test: [11000/21770] eta: 0:06:55 time: 0.0383 data: 0.0008 max mem: 33302 Test: [11100/21770] eta: 0:06:51 time: 0.0385 data: 0.0009 max mem: 33302 Test: [11200/21770] eta: 0:06:47 time: 0.0387 data: 0.0009 max mem: 33302 Test: [11300/21770] eta: 0:06:43 time: 0.0385 data: 0.0009 max mem: 33302 Test: [11400/21770] eta: 0:06:39 time: 0.0383 data: 0.0009 max mem: 33302 Test: [11500/21770] eta: 0:06:35 time: 0.0385 data: 0.0009 max mem: 33302 Test: [11600/21770] eta: 0:06:32 time: 0.0386 data: 0.0009 max mem: 33302 Test: [11700/21770] eta: 0:06:28 time: 0.0381 data: 0.0009 max mem: 33302 Test: [11800/21770] eta: 0:06:24 time: 0.0386 data: 0.0009 
max mem: 33302 Test: [11900/21770] eta: 0:06:20 time: 0.0383 data: 0.0009 max mem: 33302 Test: [12000/21770] eta: 0:06:16 time: 0.0384 data: 0.0009 max mem: 33302 Test: [12100/21770] eta: 0:06:12 time: 0.0382 data: 0.0009 max mem: 33302 Test: [12200/21770] eta: 0:06:08 time: 0.0385 data: 0.0009 max mem: 33302 Test: [12300/21770] eta: 0:06:05 time: 0.0383 data: 0.0009 max mem: 33302 Test: [12400/21770] eta: 0:06:01 time: 0.0387 data: 0.0009 max mem: 33302 Test: [12500/21770] eta: 0:05:57 time: 0.0386 data: 0.0009 max mem: 33302 Test: [12600/21770] eta: 0:05:53 time: 0.0382 data: 0.0009 max mem: 33302 Test: [12700/21770] eta: 0:05:49 time: 0.0385 data: 0.0009 max mem: 33302 Test: [12800/21770] eta: 0:05:45 time: 0.0385 data: 0.0009 max mem: 33302 Test: [12900/21770] eta: 0:05:41 time: 0.0383 data: 0.0009 max mem: 33302 Test: [13000/21770] eta: 0:05:38 time: 0.0384 data: 0.0009 max mem: 33302 Test: [13100/21770] eta: 0:05:34 time: 0.0386 data: 0.0009 max mem: 33302 Test: [13200/21770] eta: 0:05:30 time: 0.0386 data: 0.0009 max mem: 33302 Test: [13300/21770] eta: 0:05:26 time: 0.0386 data: 0.0009 max mem: 33302 Test: [13400/21770] eta: 0:05:22 time: 0.0386 data: 0.0008 max mem: 33302 Test: [13500/21770] eta: 0:05:18 time: 0.0390 data: 0.0009 max mem: 33302 Test: [13600/21770] eta: 0:05:14 time: 0.0386 data: 0.0009 max mem: 33302 Test: [13700/21770] eta: 0:05:11 time: 0.0390 data: 0.0009 max mem: 33302 Test: [13800/21770] eta: 0:05:07 time: 0.0387 data: 0.0009 max mem: 33302 Test: [13900/21770] eta: 0:05:03 time: 0.0388 data: 0.0009 max mem: 33302 Test: [14000/21770] eta: 0:04:59 time: 0.0386 data: 0.0009 max mem: 33302 Test: [14100/21770] eta: 0:04:55 time: 0.0388 data: 0.0009 max mem: 33302 Test: [14200/21770] eta: 0:04:51 time: 0.0389 data: 0.0009 max mem: 33302 Test: [14300/21770] eta: 0:04:48 time: 0.0390 data: 0.0009 max mem: 33302 Test: [14400/21770] eta: 0:04:44 time: 0.0388 data: 0.0009 max mem: 33302 Test: [14500/21770] eta: 0:04:40 time: 0.0389 data: 0.0009 
max mem: 33302 Test: [14600/21770] eta: 0:04:36 time: 0.0388 data: 0.0009 max mem: 33302 Test: [14700/21770] eta: 0:04:32 time: 0.0389 data: 0.0009 max mem: 33302 Test: [14800/21770] eta: 0:04:28 time: 0.0387 data: 0.0009 max mem: 33302 Test: [14900/21770] eta: 0:04:24 time: 0.0390 data: 0.0009 max mem: 33302 Test: [15000/21770] eta: 0:04:21 time: 0.0388 data: 0.0009 max mem: 33302 Test: [15100/21770] eta: 0:04:17 time: 0.0390 data: 0.0009 max mem: 33302 Test: [15200/21770] eta: 0:04:13 time: 0.0387 data: 0.0009 max mem: 33302 Test: [15300/21770] eta: 0:04:09 time: 0.0389 data: 0.0009 max mem: 33302 Test: [15400/21770] eta: 0:04:05 time: 0.0388 data: 0.0009 max mem: 33302 Test: [15500/21770] eta: 0:04:01 time: 0.0390 data: 0.0009 max mem: 33302 Test: [15600/21770] eta: 0:03:58 time: 0.0388 data: 0.0009 max mem: 33302 Test: [15700/21770] eta: 0:03:54 time: 0.0396 data: 0.0009 max mem: 33302 Test: [15800/21770] eta: 0:03:50 time: 0.0391 data: 0.0009 max mem: 33302 Test: [15900/21770] eta: 0:03:46 time: 0.0394 data: 0.0008 max mem: 33302 Test: [16000/21770] eta: 0:03:42 time: 0.0393 data: 0.0008 max mem: 33302 Test: [16100/21770] eta: 0:03:38 time: 0.0392 data: 0.0009 max mem: 33302 Test: [16200/21770] eta: 0:03:35 time: 0.0391 data: 0.0008 max mem: 33302 Test: [16300/21770] eta: 0:03:31 time: 0.0390 data: 0.0008 max mem: 33302 Test: [16400/21770] eta: 0:03:27 time: 0.0392 data: 0.0008 max mem: 33302 Test: [16500/21770] eta: 0:03:23 time: 0.0388 data: 0.0008 max mem: 33302 Test: [16600/21770] eta: 0:03:19 time: 0.0392 data: 0.0008 max mem: 33302 Test: [16700/21770] eta: 0:03:15 time: 0.0389 data: 0.0008 max mem: 33302 Test: [16800/21770] eta: 0:03:11 time: 0.0381 data: 0.0009 max mem: 33302 Test: [16900/21770] eta: 0:03:08 time: 0.0381 data: 0.0009 max mem: 33302 Test: [17000/21770] eta: 0:03:04 time: 0.0381 data: 0.0009 max mem: 33302 Test: [17100/21770] eta: 0:03:00 time: 0.0381 data: 0.0009 max mem: 33302 Test: [17200/21770] eta: 0:02:56 time: 0.0378 data: 0.0009 
max mem: 33302 Test: [17300/21770] eta: 0:02:52 time: 0.0379 data: 0.0009 max mem: 33302 Test: [17400/21770] eta: 0:02:48 time: 0.0379 data: 0.0009 max mem: 33302 Test: [17500/21770] eta: 0:02:44 time: 0.0378 data: 0.0009 max mem: 33302 Test: [17600/21770] eta: 0:02:40 time: 0.0378 data: 0.0009 max mem: 33302 Test: [17700/21770] eta: 0:02:37 time: 0.0379 data: 0.0009 max mem: 33302 Test: [17800/21770] eta: 0:02:33 time: 0.0387 data: 0.0009 max mem: 33302 Test: [17900/21770] eta: 0:02:29 time: 0.0387 data: 0.0009 max mem: 33302 Test: [18000/21770] eta: 0:02:25 time: 0.0387 data: 0.0008 max mem: 33302 Test: [18100/21770] eta: 0:02:21 time: 0.0388 data: 0.0009 max mem: 33302 Test: [18200/21770] eta: 0:02:17 time: 0.0387 data: 0.0008 max mem: 33302 Test: [18300/21770] eta: 0:02:13 time: 0.0388 data: 0.0009 max mem: 33302 Test: [18400/21770] eta: 0:02:10 time: 0.0386 data: 0.0009 max mem: 33302 Test: [18500/21770] eta: 0:02:06 time: 0.0387 data: 0.0009 max mem: 33302 Test: [18600/21770] eta: 0:02:02 time: 0.0387 data: 0.0008 max mem: 33302 Test: [18700/21770] eta: 0:01:58 time: 0.0388 data: 0.0008 max mem: 33302 Test: [18800/21770] eta: 0:01:54 time: 0.0388 data: 0.0009 max mem: 33302 Test: [18900/21770] eta: 0:01:50 time: 0.0391 data: 0.0008 max mem: 33302 Test: [19000/21770] eta: 0:01:46 time: 0.0391 data: 0.0008 max mem: 33302 Test: [19100/21770] eta: 0:01:43 time: 0.0389 data: 0.0008 max mem: 33302 Test: [19200/21770] eta: 0:01:39 time: 0.0390 data: 0.0008 max mem: 33302 Test: [19300/21770] eta: 0:01:35 time: 0.0391 data: 0.0008 max mem: 33302 Test: [19400/21770] eta: 0:01:31 time: 0.0391 data: 0.0008 max mem: 33302 Test: [19500/21770] eta: 0:01:27 time: 0.0390 data: 0.0008 max mem: 33302 Test: [19600/21770] eta: 0:01:23 time: 0.0389 data: 0.0009 max mem: 33302 Test: [19700/21770] eta: 0:01:19 time: 0.0387 data: 0.0009 max mem: 33302 Test: [19800/21770] eta: 0:01:16 time: 0.0387 data: 0.0009 max mem: 33302 Test: [19900/21770] eta: 0:01:12 time: 0.0388 data: 0.0009 
max mem: 33302
Test: [20000/21770] eta: 0:01:08 time: 0.0386 data: 0.0008 max mem: 33302
Test: [20100/21770] eta: 0:01:04 time: 0.0389 data: 0.0008 max mem: 33302
Test: [20200/21770] eta: 0:01:00 time: 0.0384 data: 0.0009 max mem: 33302
Test: [20300/21770] eta: 0:00:56 time: 0.0385 data: 0.0009 max mem: 33302
Test: [20400/21770] eta: 0:00:52 time: 0.0381 data: 0.0008 max mem: 33302
Test: [20500/21770] eta: 0:00:49 time: 0.0385 data: 0.0009 max mem: 33302
Test: [20600/21770] eta: 0:00:45 time: 0.0383 data: 0.0009 max mem: 33302
Test: [20700/21770] eta: 0:00:41 time: 0.0381 data: 0.0009 max mem: 33302
Test: [20800/21770] eta: 0:00:37 time: 0.0384 data: 0.0008 max mem: 33302
Test: [20900/21770] eta: 0:00:33 time: 0.0382 data: 0.0009 max mem: 33302
Test: [21000/21770] eta: 0:00:29 time: 0.0380 data: 0.0009 max mem: 33302
Test: [21100/21770] eta: 0:00:25 time: 0.0382 data: 0.0009 max mem: 33302
Test: [21200/21770] eta: 0:00:21 time: 0.0385 data: 0.0009 max mem: 33302
Test: [21300/21770] eta: 0:00:18 time: 0.0384 data: 0.0009 max mem: 33302
Test: [21400/21770] eta: 0:00:14 time: 0.0386 data: 0.0009 max mem: 33302
Test: [21500/21770] eta: 0:00:10 time: 0.0385 data: 0.0009 max mem: 33302
Test: [21600/21770] eta: 0:00:06 time: 0.0384 data: 0.0009 max mem: 33302
Test: [21700/21770] eta: 0:00:02 time: 0.0385 data: 0.0009 max mem: 33302
Test: Total time: 0:14:00
Final results:
Mean IoU is 2.27
precision@0.5 = 0.05
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 2.26
mean IoU = 2.27
Mean accuracy for one-to-zero sample is 0.00
Average object IoU 0.02268696492369037
Overall IoU 2.263010025024414
Epoch: [29] [ 0/4276] eta: 6:37:50 lr: 1.5644479509517323e-05 loss: 0.0641 (0.0641) time: 5.5825 data: 2.5154 max mem: 33302
Epoch: [29] [ 10/4276] eta: 3:47:12 lr: 1.5641485955515762e-05 loss: 0.0805 (0.0842) time: 3.1956 data: 0.2364 max mem: 33302
Epoch: [29] [ 20/4276] eta: 3:38:40 lr: 1.5638492337854742e-05 loss: 0.0835 (0.0917)
time: 2.9579 data: 0.0080 max mem: 33302 Epoch: [29] [ 30/4276] eta: 3:34:09 lr: 1.5635498656519375e-05 loss: 0.0850 (0.0916) time: 2.9331 data: 0.0074 max mem: 33302 Epoch: [29] [ 40/4276] eta: 3:31:41 lr: 1.5632504911494768e-05 loss: 0.0868 (0.0908) time: 2.9096 data: 0.0081 max mem: 33302 Epoch: [29] [ 50/4276] eta: 3:29:39 lr: 1.562951110276599e-05 loss: 0.0863 (0.0888) time: 2.8997 data: 0.0085 max mem: 33302 Epoch: [29] [ 60/4276] eta: 3:28:13 lr: 1.5626517230318145e-05 loss: 0.0768 (0.0882) time: 2.8913 data: 0.0086 max mem: 33302 Epoch: [29] [ 70/4276] eta: 3:27:07 lr: 1.5623523294136304e-05 loss: 0.0731 (0.0864) time: 2.8988 data: 0.0087 max mem: 33302 Epoch: [29] [ 80/4276] eta: 3:26:05 lr: 1.562052929420555e-05 loss: 0.0803 (0.0869) time: 2.8976 data: 0.0089 max mem: 33302 Epoch: [29] [ 90/4276] eta: 3:25:10 lr: 1.5617535230510933e-05 loss: 0.0837 (0.0871) time: 2.8916 data: 0.0089 max mem: 33302 Epoch: [29] [ 100/4276] eta: 3:24:21 lr: 1.561454110303752e-05 loss: 0.0929 (0.0882) time: 2.8921 data: 0.0087 max mem: 33302 Epoch: [29] [ 110/4276] eta: 3:23:35 lr: 1.561154691177036e-05 loss: 0.0996 (0.0898) time: 2.8929 data: 0.0087 max mem: 33302 Epoch: [29] [ 120/4276] eta: 3:22:50 lr: 1.5608552656694507e-05 loss: 0.0955 (0.0903) time: 2.8893 data: 0.0086 max mem: 33302 Epoch: [29] [ 130/4276] eta: 3:22:06 lr: 1.560555833779499e-05 loss: 0.0851 (0.0910) time: 2.8845 data: 0.0082 max mem: 33302 Epoch: [29] [ 140/4276] eta: 3:21:29 lr: 1.560256395505685e-05 loss: 0.0806 (0.0910) time: 2.8899 data: 0.0084 max mem: 33302 Epoch: [29] [ 150/4276] eta: 3:20:55 lr: 1.55995695084651e-05 loss: 0.0759 (0.0910) time: 2.9023 data: 0.0086 max mem: 33302 Epoch: [29] [ 160/4276] eta: 3:20:19 lr: 1.5596574998004777e-05 loss: 0.0862 (0.0911) time: 2.8999 data: 0.0085 max mem: 33302 Epoch: [29] [ 170/4276] eta: 3:19:43 lr: 1.5593580423660874e-05 loss: 0.0862 (0.0907) time: 2.8941 data: 0.0086 max mem: 33302 Epoch: [29] [ 180/4276] eta: 3:19:10 lr: 1.55905857854184e-05 loss: 
0.0769 (0.0904) time: 2.8965 data: 0.0089 max mem: 33302 Epoch: [29] [ 190/4276] eta: 3:18:35 lr: 1.558759108326237e-05 loss: 0.0772 (0.0900) time: 2.8948 data: 0.0087 max mem: 33302 Epoch: [29] [ 200/4276] eta: 3:18:00 lr: 1.5584596317177757e-05 loss: 0.0777 (0.0900) time: 2.8900 data: 0.0078 max mem: 33302 Epoch: [29] [ 210/4276] eta: 3:17:26 lr: 1.5581601487149556e-05 loss: 0.0856 (0.0903) time: 2.8900 data: 0.0073 max mem: 33302 Epoch: [29] [ 220/4276] eta: 3:16:54 lr: 1.5578606593162736e-05 loss: 0.0844 (0.0901) time: 2.8928 data: 0.0073 max mem: 33302 Epoch: [29] [ 230/4276] eta: 3:16:23 lr: 1.5575611635202286e-05 loss: 0.0698 (0.0896) time: 2.8981 data: 0.0076 max mem: 33302 Epoch: [29] [ 240/4276] eta: 3:15:52 lr: 1.5572616613253153e-05 loss: 0.0806 (0.0895) time: 2.9022 data: 0.0075 max mem: 33302 Epoch: [29] [ 250/4276] eta: 3:15:20 lr: 1.5569621527300302e-05 loss: 0.0895 (0.0900) time: 2.8981 data: 0.0072 max mem: 33302 Epoch: [29] [ 260/4276] eta: 3:14:51 lr: 1.5566626377328684e-05 loss: 0.0826 (0.0897) time: 2.9038 data: 0.0077 max mem: 33302 Epoch: [29] [ 270/4276] eta: 3:14:19 lr: 1.556363116332325e-05 loss: 0.0733 (0.0895) time: 2.9017 data: 0.0082 max mem: 33302 Epoch: [29] [ 280/4276] eta: 3:13:50 lr: 1.5560635885268926e-05 loss: 0.0733 (0.0892) time: 2.9001 data: 0.0085 max mem: 33302 Epoch: [29] [ 290/4276] eta: 3:13:22 lr: 1.555764054315065e-05 loss: 0.0780 (0.0892) time: 2.9146 data: 0.0090 max mem: 33302 Epoch: [29] [ 300/4276] eta: 3:12:56 lr: 1.5554645136953347e-05 loss: 0.0833 (0.0891) time: 2.9271 data: 0.0090 max mem: 33302 Epoch: [29] [ 310/4276] eta: 3:12:31 lr: 1.5551649666661936e-05 loss: 0.0803 (0.0887) time: 2.9413 data: 0.0091 max mem: 33302 Epoch: [29] [ 320/4276] eta: 3:12:07 lr: 1.554865413226132e-05 loss: 0.0805 (0.0888) time: 2.9474 data: 0.0090 max mem: 33302 Epoch: [29] [ 330/4276] eta: 3:11:41 lr: 1.554565853373641e-05 loss: 0.0965 (0.0892) time: 2.9444 data: 0.0085 max mem: 33302 Epoch: [29] [ 340/4276] eta: 3:11:15 lr: 
1.55426628710721e-05 loss: 0.0915 (0.0891) time: 2.9419 data: 0.0081 max mem: 33302 Epoch: [29] [ 350/4276] eta: 3:10:46 lr: 1.5539667144253288e-05 loss: 0.0799 (0.0893) time: 2.9289 data: 0.0082 max mem: 33302 Epoch: [29] [ 360/4276] eta: 3:10:16 lr: 1.5536671353264846e-05 loss: 0.0934 (0.0899) time: 2.9146 data: 0.0087 max mem: 33302 Epoch: [29] [ 370/4276] eta: 3:09:46 lr: 1.5533675498091655e-05 loss: 0.0938 (0.0898) time: 2.9082 data: 0.0092 max mem: 33302 Epoch: [29] [ 380/4276] eta: 3:09:17 lr: 1.5530679578718584e-05 loss: 0.0824 (0.0900) time: 2.9088 data: 0.0095 max mem: 33302 Epoch: [29] [ 390/4276] eta: 3:08:47 lr: 1.5527683595130506e-05 loss: 0.0895 (0.0901) time: 2.9134 data: 0.0088 max mem: 33302 Epoch: [29] [ 400/4276] eta: 3:08:18 lr: 1.5524687547312265e-05 loss: 0.0971 (0.0905) time: 2.9142 data: 0.0088 max mem: 33302 Epoch: [29] [ 410/4276] eta: 3:07:53 lr: 1.5521691435248716e-05 loss: 0.0976 (0.0909) time: 2.9370 data: 0.0092 max mem: 33302 Epoch: [29] [ 420/4276] eta: 3:07:26 lr: 1.55186952589247e-05 loss: 0.1006 (0.0910) time: 2.9490 data: 0.0084 max mem: 33302 Epoch: [29] [ 430/4276] eta: 3:06:59 lr: 1.551569901832506e-05 loss: 0.0957 (0.0910) time: 2.9407 data: 0.0077 max mem: 33302 Epoch: [29] [ 440/4276] eta: 3:06:32 lr: 1.5512702713434615e-05 loss: 0.0921 (0.0911) time: 2.9388 data: 0.0072 max mem: 33302 Epoch: [29] [ 450/4276] eta: 3:06:04 lr: 1.550970634423819e-05 loss: 0.0921 (0.0912) time: 2.9380 data: 0.0072 max mem: 33302 Epoch: [29] [ 460/4276] eta: 3:05:37 lr: 1.550670991072061e-05 loss: 0.0853 (0.0909) time: 2.9405 data: 0.0072 max mem: 33302 Epoch: [29] [ 470/4276] eta: 3:05:09 lr: 1.5503713412866665e-05 loss: 0.0671 (0.0905) time: 2.9395 data: 0.0072 max mem: 33302 Epoch: [29] [ 480/4276] eta: 3:04:39 lr: 1.550071685066117e-05 loss: 0.0740 (0.0903) time: 2.9216 data: 0.0075 max mem: 33302 Epoch: [29] [ 490/4276] eta: 3:04:10 lr: 1.549772022408892e-05 loss: 0.0740 (0.0900) time: 2.9111 data: 0.0082 max mem: 33302 Epoch: [29] [ 
500/4276] eta: 3:03:42 lr: 1.5494723533134703e-05 loss: 0.0797 (0.0899) time: 2.9284 data: 0.0082 max mem: 33302 Epoch: [29] [ 510/4276] eta: 3:03:14 lr: 1.549172677778329e-05 loss: 0.0830 (0.0899) time: 2.9378 data: 0.0084 max mem: 33302 Epoch: [29] [ 520/4276] eta: 3:02:43 lr: 1.5488729958019467e-05 loss: 0.0942 (0.0899) time: 2.9149 data: 0.0084 max mem: 33302 Epoch: [29] [ 530/4276] eta: 3:02:16 lr: 1.548573307382799e-05 loss: 0.0887 (0.0900) time: 2.9210 data: 0.0085 max mem: 33302 Epoch: [29] [ 540/4276] eta: 3:01:49 lr: 1.5482736125193636e-05 loss: 0.0880 (0.0900) time: 2.9463 data: 0.0087 max mem: 33302 Epoch: [29] [ 550/4276] eta: 3:01:21 lr: 1.5479739112101145e-05 loss: 0.0899 (0.0901) time: 2.9450 data: 0.0085 max mem: 33302 Epoch: [29] [ 560/4276] eta: 3:00:52 lr: 1.5476742034535267e-05 loss: 0.0899 (0.0902) time: 2.9336 data: 0.0088 max mem: 33302 Epoch: [29] [ 570/4276] eta: 3:00:21 lr: 1.547374489248074e-05 loss: 0.0897 (0.0902) time: 2.9087 data: 0.0083 max mem: 33302 Epoch: [29] [ 580/4276] eta: 2:59:50 lr: 1.547074768592231e-05 loss: 0.0899 (0.0902) time: 2.8943 data: 0.0079 max mem: 33302 Epoch: [29] [ 590/4276] eta: 2:59:19 lr: 1.5467750414844684e-05 loss: 0.0779 (0.0900) time: 2.8925 data: 0.0081 max mem: 33302 Epoch: [29] [ 600/4276] eta: 2:58:48 lr: 1.546475307923259e-05 loss: 0.0779 (0.0899) time: 2.8911 data: 0.0079 max mem: 33302 Epoch: [29] [ 610/4276] eta: 2:58:20 lr: 1.5461755679070744e-05 loss: 0.0827 (0.0899) time: 2.9112 data: 0.0083 max mem: 33302 Epoch: [29] [ 620/4276] eta: 2:57:53 lr: 1.545875821434385e-05 loss: 0.0829 (0.0898) time: 2.9412 data: 0.0091 max mem: 33302 Epoch: [29] [ 630/4276] eta: 2:57:25 lr: 1.5455760685036598e-05 loss: 0.0892 (0.0900) time: 2.9496 data: 0.0092 max mem: 33302 Epoch: [29] [ 640/4276] eta: 2:56:58 lr: 1.5452763091133685e-05 loss: 0.0892 (0.0902) time: 2.9533 data: 0.0093 max mem: 33302 Epoch: [29] [ 650/4276] eta: 2:56:30 lr: 1.5449765432619798e-05 loss: 0.0802 (0.0901) time: 2.9494 data: 0.0095 
max mem: 33302 Epoch: [29] [ 660/4276] eta: 2:56:02 lr: 1.5446767709479617e-05 loss: 0.0986 (0.0904) time: 2.9380 data: 0.0094 max mem: 33302 Epoch: [29] [ 670/4276] eta: 2:55:34 lr: 1.5443769921697802e-05 loss: 0.0913 (0.0903) time: 2.9407 data: 0.0090 max mem: 33302 Epoch: [29] [ 680/4276] eta: 2:55:06 lr: 1.5440772069259024e-05 loss: 0.0844 (0.0902) time: 2.9481 data: 0.0085 max mem: 33302 Epoch: [29] [ 690/4276] eta: 2:54:38 lr: 1.543777415214794e-05 loss: 0.0864 (0.0901) time: 2.9461 data: 0.0082 max mem: 33302 Epoch: [29] [ 700/4276] eta: 2:54:10 lr: 1.5434776170349203e-05 loss: 0.0817 (0.0899) time: 2.9413 data: 0.0077 max mem: 33302 Epoch: [29] [ 710/4276] eta: 2:53:41 lr: 1.5431778123847442e-05 loss: 0.0789 (0.0900) time: 2.9368 data: 0.0079 max mem: 33302 Epoch: [29] [ 720/4276] eta: 2:53:13 lr: 1.542878001262731e-05 loss: 0.0977 (0.0900) time: 2.9361 data: 0.0078 max mem: 33302 Epoch: [29] [ 730/4276] eta: 2:52:42 lr: 1.542578183667342e-05 loss: 0.0929 (0.0900) time: 2.9180 data: 0.0074 max mem: 33302 Epoch: [29] [ 740/4276] eta: 2:52:12 lr: 1.5422783595970418e-05 loss: 0.0870 (0.0899) time: 2.8967 data: 0.0077 max mem: 33302 Epoch: [29] [ 750/4276] eta: 2:51:41 lr: 1.5419785290502892e-05 loss: 0.0847 (0.0900) time: 2.8969 data: 0.0081 max mem: 33302 Epoch: [29] [ 760/4276] eta: 2:51:11 lr: 1.5416786920255462e-05 loss: 0.0847 (0.0900) time: 2.9026 data: 0.0082 max mem: 33302 Epoch: [29] [ 770/4276] eta: 2:50:41 lr: 1.5413788485212733e-05 loss: 0.0827 (0.0899) time: 2.9048 data: 0.0082 max mem: 33302 Epoch: [29] [ 780/4276] eta: 2:50:12 lr: 1.541078998535929e-05 loss: 0.0822 (0.0898) time: 2.9061 data: 0.0086 max mem: 33302 Epoch: [29] [ 790/4276] eta: 2:49:42 lr: 1.5407791420679724e-05 loss: 0.0793 (0.0899) time: 2.9114 data: 0.0087 max mem: 33302 Epoch: [29] [ 800/4276] eta: 2:49:12 lr: 1.540479279115862e-05 loss: 0.0872 (0.0900) time: 2.9041 data: 0.0082 max mem: 33302 Epoch: [29] [ 810/4276] eta: 2:48:41 lr: 1.5401794096780544e-05 loss: 0.0879 
(0.0900) time: 2.8952 data: 0.0077 max mem: 33302 Epoch: [29] [ 820/4276] eta: 2:48:11 lr: 1.5398795337530065e-05 loss: 0.0842 (0.0899) time: 2.8927 data: 0.0080 max mem: 33302 Epoch: [29] [ 830/4276] eta: 2:47:41 lr: 1.5395796513391743e-05 loss: 0.0851 (0.0899) time: 2.8922 data: 0.0081 max mem: 33302 Epoch: [29] [ 840/4276] eta: 2:47:10 lr: 1.5392797624350125e-05 loss: 0.0891 (0.0899) time: 2.8928 data: 0.0080 max mem: 33302 Epoch: [29] [ 850/4276] eta: 2:46:40 lr: 1.5389798670389767e-05 loss: 0.0823 (0.0899) time: 2.8945 data: 0.0077 max mem: 33302 Epoch: [29] [ 860/4276] eta: 2:46:10 lr: 1.5386799651495192e-05 loss: 0.0840 (0.0902) time: 2.8971 data: 0.0081 max mem: 33302 Epoch: [29] [ 870/4276] eta: 2:45:42 lr: 1.5383800567650944e-05 loss: 0.0870 (0.0902) time: 2.9279 data: 0.0089 max mem: 33302 Epoch: [29] [ 880/4276] eta: 2:45:14 lr: 1.5380801418841536e-05 loss: 0.0844 (0.0905) time: 2.9523 data: 0.0094 max mem: 33302 Epoch: [29] [ 890/4276] eta: 2:44:46 lr: 1.53778022050515e-05 loss: 0.0917 (0.0905) time: 2.9470 data: 0.0094 max mem: 33302 Epoch: [29] [ 900/4276] eta: 2:44:18 lr: 1.537480292626533e-05 loss: 0.0917 (0.0905) time: 2.9457 data: 0.0089 max mem: 33302 Epoch: [29] [ 910/4276] eta: 2:43:50 lr: 1.5371803582467535e-05 loss: 0.0905 (0.0905) time: 2.9429 data: 0.0087 max mem: 33302 Epoch: [29] [ 920/4276] eta: 2:43:21 lr: 1.536880417364261e-05 loss: 0.0893 (0.0905) time: 2.9424 data: 0.0089 max mem: 33302 Epoch: [29] [ 930/4276] eta: 2:42:53 lr: 1.536580469977505e-05 loss: 0.0835 (0.0906) time: 2.9432 data: 0.0088 max mem: 33302 Epoch: [29] [ 940/4276] eta: 2:42:24 lr: 1.5362805160849326e-05 loss: 0.0908 (0.0905) time: 2.9428 data: 0.0085 max mem: 33302 Epoch: [29] [ 950/4276] eta: 2:41:56 lr: 1.5359805556849914e-05 loss: 0.0915 (0.0906) time: 2.9430 data: 0.0085 max mem: 33302 Epoch: [29] [ 960/4276] eta: 2:41:27 lr: 1.535680588776129e-05 loss: 0.0919 (0.0906) time: 2.9417 data: 0.0087 max mem: 33302 Epoch: [29] [ 970/4276] eta: 2:40:59 lr: 
1.5353806153567912e-05 loss: 0.0919 (0.0907) time: 2.9414 data: 0.0087 max mem: 33302 Epoch: [29] [ 980/4276] eta: 2:40:30 lr: 1.5350806354254226e-05 loss: 0.0923 (0.0906) time: 2.9415 data: 0.0085 max mem: 33302 Epoch: [29] [ 990/4276] eta: 2:40:00 lr: 1.5347806489804685e-05 loss: 0.0784 (0.0905) time: 2.9212 data: 0.0082 max mem: 33302 Epoch: [29] [1000/4276] eta: 2:39:32 lr: 1.534480656020372e-05 loss: 0.0775 (0.0904) time: 2.9222 data: 0.0085 max mem: 33302 Epoch: [29] [1010/4276] eta: 2:39:03 lr: 1.534180656543578e-05 loss: 0.0812 (0.0904) time: 2.9428 data: 0.0085 max mem: 33302 Epoch: [29] [1020/4276] eta: 2:38:35 lr: 1.533880650548527e-05 loss: 0.0878 (0.0904) time: 2.9447 data: 0.0082 max mem: 33302 Epoch: [29] [1030/4276] eta: 2:38:06 lr: 1.5335806380336617e-05 loss: 0.0909 (0.0906) time: 2.9466 data: 0.0079 max mem: 33302 Epoch: [29] [1040/4276] eta: 2:37:37 lr: 1.533280618997423e-05 loss: 0.0848 (0.0905) time: 2.9396 data: 0.0076 max mem: 33302 Epoch: [29] [1050/4276] eta: 2:37:09 lr: 1.5329805934382523e-05 loss: 0.0834 (0.0906) time: 2.9390 data: 0.0074 max mem: 33302 Epoch: [29] [1060/4276] eta: 2:36:40 lr: 1.5326805613545873e-05 loss: 0.1020 (0.0908) time: 2.9326 data: 0.0077 max mem: 33302 Epoch: [29] [1070/4276] eta: 2:36:10 lr: 1.5323805227448683e-05 loss: 0.1020 (0.0909) time: 2.9184 data: 0.0082 max mem: 33302 Epoch: [29] [1080/4276] eta: 2:35:41 lr: 1.5320804776075335e-05 loss: 0.0968 (0.0909) time: 2.9294 data: 0.0079 max mem: 33302 Epoch: [29] [1090/4276] eta: 2:35:13 lr: 1.5317804259410198e-05 loss: 0.0968 (0.0910) time: 2.9405 data: 0.0072 max mem: 33302 Epoch: [29] [1100/4276] eta: 2:34:44 lr: 1.5314803677437638e-05 loss: 0.0988 (0.0911) time: 2.9409 data: 0.0071 max mem: 33302 Epoch: [29] [1110/4276] eta: 2:34:15 lr: 1.5311803030142026e-05 loss: 0.1041 (0.0912) time: 2.9408 data: 0.0071 max mem: 33302 Epoch: [29] [1120/4276] eta: 2:33:47 lr: 1.530880231750771e-05 loss: 0.1012 (0.0912) time: 2.9401 data: 0.0073 max mem: 33302 Epoch: [29] 
[1130/4276] eta: 2:33:18 lr: 1.5305801539519033e-05 loss: 0.0796 (0.0911) time: 2.9405 data: 0.0078 max mem: 33302 Epoch: [29] [1140/4276] eta: 2:32:49 lr: 1.530280069616034e-05 loss: 0.0817 (0.0912) time: 2.9402 data: 0.0077 max mem: 33302 Epoch: [29] [1150/4276] eta: 2:32:20 lr: 1.5299799787415956e-05 loss: 0.0882 (0.0912) time: 2.9407 data: 0.0079 max mem: 33302 Epoch: [29] [1160/4276] eta: 2:31:51 lr: 1.5296798813270218e-05 loss: 0.0874 (0.0912) time: 2.9407 data: 0.0082 max mem: 33302 Epoch: [29] [1170/4276] eta: 2:31:23 lr: 1.529379777370743e-05 loss: 0.0874 (0.0912) time: 2.9435 data: 0.0079 max mem: 33302 Epoch: [29] [1180/4276] eta: 2:30:54 lr: 1.529079666871191e-05 loss: 0.0857 (0.0912) time: 2.9422 data: 0.0078 max mem: 33302 Epoch: [29] [1190/4276] eta: 2:30:25 lr: 1.528779549826796e-05 loss: 0.0801 (0.0911) time: 2.9372 data: 0.0080 max mem: 33302 Epoch: [29] [1200/4276] eta: 2:29:56 lr: 1.528479426235988e-05 loss: 0.0775 (0.0911) time: 2.9398 data: 0.0083 max mem: 33302 Epoch: [29] [1210/4276] eta: 2:29:27 lr: 1.528179296097195e-05 loss: 0.0782 (0.0910) time: 2.9403 data: 0.0081 max mem: 33302 Epoch: [29] [1220/4276] eta: 2:28:58 lr: 1.5278791594088464e-05 loss: 0.0801 (0.0910) time: 2.9393 data: 0.0079 max mem: 33302 Epoch: [29] [1230/4276] eta: 2:28:29 lr: 1.5275790161693683e-05 loss: 0.0938 (0.0910) time: 2.9396 data: 0.0080 max mem: 33302 Epoch: [29] [1240/4276] eta: 2:28:01 lr: 1.5272788663771888e-05 loss: 0.0863 (0.0911) time: 2.9397 data: 0.0083 max mem: 33302 Epoch: [29] [1250/4276] eta: 2:27:32 lr: 1.5269787100307327e-05 loss: 0.0853 (0.0911) time: 2.9374 data: 0.0083 max mem: 33302 Epoch: [29] [1260/4276] eta: 2:27:03 lr: 1.5266785471284263e-05 loss: 0.0790 (0.0910) time: 2.9386 data: 0.0079 max mem: 33302 Epoch: [29] [1270/4276] eta: 2:26:34 lr: 1.526378377668693e-05 loss: 0.0783 (0.0909) time: 2.9415 data: 0.0079 max mem: 33302 Epoch: [29] [1280/4276] eta: 2:26:04 lr: 1.5260782016499584e-05 loss: 0.0867 (0.0909) time: 2.9312 data: 0.0079 
max mem: 33302 Epoch: [29] [1290/4276] eta: 2:25:35 lr: 1.525778019070644e-05 loss: 0.0920 (0.0909) time: 2.9155 data: 0.0080 max mem: 33302 Epoch: [29] [1300/4276] eta: 2:25:06 lr: 1.5254778299291724e-05 loss: 0.0728 (0.0908) time: 2.9283 data: 0.0088 max mem: 33302 Epoch: [29] [1310/4276] eta: 2:24:37 lr: 1.5251776342239662e-05 loss: 0.0732 (0.0908) time: 2.9397 data: 0.0092 max mem: 33302 Epoch: [29] [1320/4276] eta: 2:24:08 lr: 1.5248774319534461e-05 loss: 0.0845 (0.0909) time: 2.9344 data: 0.0081 max mem: 33302 Epoch: [29] [1330/4276] eta: 2:23:39 lr: 1.5245772231160315e-05 loss: 0.0903 (0.0908) time: 2.9372 data: 0.0073 max mem: 33302 Epoch: [29] [1340/4276] eta: 2:23:10 lr: 1.5242770077101425e-05 loss: 0.0843 (0.0908) time: 2.9375 data: 0.0073 max mem: 33302 Epoch: [29] [1350/4276] eta: 2:22:41 lr: 1.523976785734198e-05 loss: 0.0903 (0.0909) time: 2.9355 data: 0.0072 max mem: 33302 Epoch: [29] [1360/4276] eta: 2:22:12 lr: 1.5236765571866163e-05 loss: 0.0944 (0.0909) time: 2.9351 data: 0.0072 max mem: 33302 Epoch: [29] [1370/4276] eta: 2:21:43 lr: 1.5233763220658138e-05 loss: 0.0799 (0.0909) time: 2.9391 data: 0.0072 max mem: 33302 Epoch: [29] [1380/4276] eta: 2:21:14 lr: 1.5230760803702076e-05 loss: 0.0868 (0.0909) time: 2.9387 data: 0.0072 max mem: 33302 Epoch: [29] [1390/4276] eta: 2:20:45 lr: 1.5227758320982141e-05 loss: 0.0997 (0.0910) time: 2.9360 data: 0.0074 max mem: 33302 Epoch: [29] [1400/4276] eta: 2:20:16 lr: 1.5224755772482475e-05 loss: 0.0984 (0.0910) time: 2.9434 data: 0.0076 max mem: 33302 Epoch: [29] [1410/4276] eta: 2:19:47 lr: 1.5221753158187223e-05 loss: 0.0930 (0.0910) time: 2.9450 data: 0.0080 max mem: 33302 Epoch: [29] [1420/4276] eta: 2:19:18 lr: 1.5218750478080526e-05 loss: 0.0807 (0.0910) time: 2.9385 data: 0.0082 max mem: 33302 Epoch: [29] [1430/4276] eta: 2:18:49 lr: 1.521574773214652e-05 loss: 0.0792 (0.0910) time: 2.9359 data: 0.0076 max mem: 33302 Epoch: [29] [1440/4276] eta: 2:18:20 lr: 1.5212744920369312e-05 loss: 0.0813 
(0.0909) time: 2.9351 data: 0.0074 max mem: 33302 Epoch: [29] [1450/4276] eta: 2:17:51 lr: 1.5209742042733025e-05 loss: 0.0867 (0.0909) time: 2.9383 data: 0.0078 max mem: 33302 Epoch: [29] [1460/4276] eta: 2:17:22 lr: 1.5206739099221768e-05 loss: 0.0854 (0.0909) time: 2.9402 data: 0.0078 max mem: 33302 Epoch: [29] [1470/4276] eta: 2:16:52 lr: 1.5203736089819643e-05 loss: 0.0854 (0.0909) time: 2.9221 data: 0.0081 max mem: 33302 Epoch: [29] [1480/4276] eta: 2:16:23 lr: 1.5200733014510732e-05 loss: 0.0923 (0.0909) time: 2.9144 data: 0.0088 max mem: 33302 Epoch: [29] [1490/4276] eta: 2:15:53 lr: 1.5197729873279128e-05 loss: 0.0833 (0.0908) time: 2.9222 data: 0.0088 max mem: 33302 Epoch: [29] [1500/4276] eta: 2:15:24 lr: 1.519472666610891e-05 loss: 0.0784 (0.0908) time: 2.9310 data: 0.0085 max mem: 33302 Epoch: [29] [1510/4276] eta: 2:14:55 lr: 1.5191723392984156e-05 loss: 0.0812 (0.0908) time: 2.9396 data: 0.0078 max mem: 33302 Epoch: [29] [1520/4276] eta: 2:14:26 lr: 1.518872005388891e-05 loss: 0.0812 (0.0908) time: 2.9363 data: 0.0071 max mem: 33302 Epoch: [29] [1530/4276] eta: 2:13:57 lr: 1.5185716648807244e-05 loss: 0.0797 (0.0907) time: 2.9343 data: 0.0071 max mem: 33302 Epoch: [29] [1540/4276] eta: 2:13:28 lr: 1.5182713177723201e-05 loss: 0.0799 (0.0907) time: 2.9355 data: 0.0071 max mem: 33302 Epoch: [29] [1550/4276] eta: 2:12:59 lr: 1.517970964062083e-05 loss: 0.0806 (0.0907) time: 2.9407 data: 0.0071 max mem: 33302 Epoch: [29] [1560/4276] eta: 2:12:30 lr: 1.517670603748415e-05 loss: 0.0781 (0.0907) time: 2.9411 data: 0.0071 max mem: 33302 Epoch: [29] [1570/4276] eta: 2:12:01 lr: 1.5173702368297199e-05 loss: 0.0851 (0.0907) time: 2.9392 data: 0.0072 max mem: 33302 Epoch: [29] [1580/4276] eta: 2:11:32 lr: 1.5170698633043995e-05 loss: 0.0796 (0.0906) time: 2.9412 data: 0.0071 max mem: 33302 Epoch: [29] [1590/4276] eta: 2:11:03 lr: 1.5167694831708553e-05 loss: 0.0796 (0.0906) time: 2.9403 data: 0.0071 max mem: 33302 Epoch: [29] [1600/4276] eta: 2:10:34 lr: 
1.5164690964274867e-05 loss: 0.0823 (0.0906) time: 2.9391 data: 0.0071 max mem: 33302 Epoch: [29] [1610/4276] eta: 2:10:05 lr: 1.5161687030726942e-05 loss: 0.0892 (0.0906) time: 2.9401 data: 0.0072 max mem: 33302 Epoch: [29] [1620/4276] eta: 2:09:36 lr: 1.5158683031048765e-05 loss: 0.0876 (0.0906) time: 2.9404 data: 0.0071 max mem: 33302 Epoch: [29] [1630/4276] eta: 2:09:06 lr: 1.5155678965224326e-05 loss: 0.0876 (0.0906) time: 2.9400 data: 0.0071 max mem: 33302 Epoch: [29] [1640/4276] eta: 2:08:37 lr: 1.5152674833237588e-05 loss: 0.0880 (0.0906) time: 2.9383 data: 0.0072 max mem: 33302 Epoch: [29] [1650/4276] eta: 2:08:08 lr: 1.5149670635072525e-05 loss: 0.0735 (0.0905) time: 2.9372 data: 0.0072 max mem: 33302 Epoch: [29] [1660/4276] eta: 2:07:39 lr: 1.5146666370713094e-05 loss: 0.0798 (0.0905) time: 2.9464 data: 0.0072 max mem: 33302 Epoch: [29] [1670/4276] eta: 2:07:10 lr: 1.5143662040143256e-05 loss: 0.0826 (0.0905) time: 2.9500 data: 0.0072 max mem: 33302 Epoch: [29] [1680/4276] eta: 2:06:41 lr: 1.5140657643346944e-05 loss: 0.0801 (0.0905) time: 2.9396 data: 0.0077 max mem: 33302 Epoch: [29] [1690/4276] eta: 2:06:12 lr: 1.5137653180308106e-05 loss: 0.0798 (0.0905) time: 2.9295 data: 0.0080 max mem: 33302 Epoch: [29] [1700/4276] eta: 2:05:42 lr: 1.5134648651010672e-05 loss: 0.0881 (0.0905) time: 2.9203 data: 0.0083 max mem: 33302 Epoch: [29] [1710/4276] eta: 2:05:13 lr: 1.5131644055438553e-05 loss: 0.0881 (0.0905) time: 2.9362 data: 0.0081 max mem: 33302 Epoch: [29] [1720/4276] eta: 2:04:45 lr: 1.5128639393575678e-05 loss: 0.0873 (0.0905) time: 2.9525 data: 0.0073 max mem: 33302 Epoch: [29] [1730/4276] eta: 2:04:16 lr: 1.5125634665405947e-05 loss: 0.0847 (0.0905) time: 2.9488 data: 0.0075 max mem: 33302 Epoch: [29] [1740/4276] eta: 2:03:47 lr: 1.5122629870913272e-05 loss: 0.0836 (0.0905) time: 2.9480 data: 0.0075 max mem: 33302 Epoch: [29] [1750/4276] eta: 2:03:17 lr: 1.5119625010081529e-05 loss: 0.0817 (0.0904) time: 2.9370 data: 0.0075 max mem: 33302 Epoch: 
[29] [1760/4276] eta: 2:02:47 lr: 1.5116620082894613e-05 loss: 0.0730 (0.0903) time: 2.9092 data: 0.0078 max mem: 33302 Epoch: [29] [1770/4276] eta: 2:02:18 lr: 1.5113615089336402e-05 loss: 0.0749 (0.0903) time: 2.8947 data: 0.0077 max mem: 33302 Epoch: [29] [1780/4276] eta: 2:01:49 lr: 1.5110610029390772e-05 loss: 0.0791 (0.0903) time: 2.9201 data: 0.0080 max mem: 33302 Epoch: [29] [1790/4276] eta: 2:01:20 lr: 1.5107604903041575e-05 loss: 0.0760 (0.0902) time: 2.9466 data: 0.0079 max mem: 33302 Epoch: [29] [1800/4276] eta: 2:00:51 lr: 1.5104599710272675e-05 loss: 0.0780 (0.0902) time: 2.9474 data: 0.0075 max mem: 33302 Epoch: [29] [1810/4276] eta: 2:00:22 lr: 1.5101594451067912e-05 loss: 0.0905 (0.0903) time: 2.9460 data: 0.0079 max mem: 33302 Epoch: [29] [1820/4276] eta: 1:59:52 lr: 1.5098589125411142e-05 loss: 0.0903 (0.0902) time: 2.9453 data: 0.0080 max mem: 33302 Epoch: [29] [1830/4276] eta: 1:59:23 lr: 1.509558373328618e-05 loss: 0.0848 (0.0903) time: 2.9462 data: 0.0076 max mem: 33302 Epoch: [29] [1840/4276] eta: 1:58:54 lr: 1.5092578274676864e-05 loss: 0.0758 (0.0902) time: 2.9372 data: 0.0078 max mem: 33302 Epoch: [29] [1850/4276] eta: 1:58:24 lr: 1.5089572749567004e-05 loss: 0.0758 (0.0902) time: 2.9100 data: 0.0077 max mem: 33302 Epoch: [29] [1860/4276] eta: 1:57:55 lr: 1.5086567157940423e-05 loss: 0.0842 (0.0901) time: 2.8967 data: 0.0073 max mem: 33302 Epoch: [29] [1870/4276] eta: 1:57:25 lr: 1.5083561499780912e-05 loss: 0.0829 (0.0901) time: 2.8989 data: 0.0076 max mem: 33302 Epoch: [29] [1880/4276] eta: 1:56:55 lr: 1.508055577507227e-05 loss: 0.0772 (0.0901) time: 2.8993 data: 0.0078 max mem: 33302 Epoch: [29] [1890/4276] eta: 1:56:26 lr: 1.5077549983798287e-05 loss: 0.0772 (0.0900) time: 2.8973 data: 0.0077 max mem: 33302 Epoch: [29] [1900/4276] eta: 1:55:56 lr: 1.5074544125942744e-05 loss: 0.0803 (0.0900) time: 2.8969 data: 0.0079 max mem: 33302 Epoch: [29] [1910/4276] eta: 1:55:27 lr: 1.5071538201489411e-05 loss: 0.0854 (0.0901) time: 2.9172 
data: 0.0085 max mem: 33302 Epoch: [29] [1920/4276] eta: 1:54:57 lr: 1.5068532210422053e-05 loss: 0.0855 (0.0900) time: 2.9185 data: 0.0084 max mem: 33302 Epoch: [29] [1930/4276] eta: 1:54:28 lr: 1.5065526152724433e-05 loss: 0.0804 (0.0900) time: 2.9082 data: 0.0081 max mem: 33302 Epoch: [29] [1940/4276] eta: 1:53:59 lr: 1.50625200283803e-05 loss: 0.0837 (0.0900) time: 2.9386 data: 0.0088 max mem: 33302 Epoch: [29] [1950/4276] eta: 1:53:30 lr: 1.5059513837373393e-05 loss: 0.0922 (0.0901) time: 2.9604 data: 0.0090 max mem: 33302 Epoch: [29] [1960/4276] eta: 1:53:01 lr: 1.5056507579687446e-05 loss: 0.0793 (0.0900) time: 2.9549 data: 0.0087 max mem: 33302 Epoch: [29] [1970/4276] eta: 1:52:32 lr: 1.505350125530619e-05 loss: 0.0675 (0.0899) time: 2.9511 data: 0.0089 max mem: 33302 Epoch: [29] [1980/4276] eta: 1:52:03 lr: 1.5050494864213352e-05 loss: 0.0680 (0.0898) time: 2.9575 data: 0.0090 max mem: 33302 Epoch: [29] [1990/4276] eta: 1:51:34 lr: 1.5047488406392632e-05 loss: 0.0793 (0.0898) time: 2.9581 data: 0.0090 max mem: 33302 Epoch: [29] [2000/4276] eta: 1:51:05 lr: 1.504448188182774e-05 loss: 0.0816 (0.0898) time: 2.9538 data: 0.0084 max mem: 33302 Epoch: [29] [2010/4276] eta: 1:50:36 lr: 1.504147529050238e-05 loss: 0.0816 (0.0898) time: 2.9555 data: 0.0087 max mem: 33302 Epoch: [29] [2020/4276] eta: 1:50:07 lr: 1.5038468632400227e-05 loss: 0.0932 (0.0899) time: 2.9473 data: 0.0090 max mem: 33302 Epoch: [29] [2030/4276] eta: 1:49:38 lr: 1.5035461907504973e-05 loss: 0.0836 (0.0898) time: 2.9271 data: 0.0089 max mem: 33302 Epoch: [29] [2040/4276] eta: 1:49:08 lr: 1.5032455115800289e-05 loss: 0.0786 (0.0898) time: 2.9304 data: 0.0092 max mem: 33302 Epoch: [29] [2050/4276] eta: 1:48:39 lr: 1.502944825726985e-05 loss: 0.0886 (0.0898) time: 2.9508 data: 0.0085 max mem: 33302 Epoch: [29] [2060/4276] eta: 1:48:10 lr: 1.5026441331897303e-05 loss: 0.0888 (0.0898) time: 2.9570 data: 0.0076 max mem: 33302 Epoch: [29] [2070/4276] eta: 1:47:41 lr: 1.5023434339666306e-05 loss: 
0.0804 (0.0898) time: 2.9562 data: 0.0077 max mem: 33302 Epoch: [29] [2080/4276] eta: 1:47:12 lr: 1.50204272805605e-05 loss: 0.0864 (0.0899) time: 2.9528 data: 0.0079 max mem: 33302 Epoch: [29] [2090/4276] eta: 1:46:43 lr: 1.5017420154563532e-05 loss: 0.0925 (0.0899) time: 2.9463 data: 0.0074 max mem: 33302 Epoch: [29] [2100/4276] eta: 1:46:14 lr: 1.5014412961659013e-05 loss: 0.0842 (0.0899) time: 2.9426 data: 0.0071 max mem: 33302 Epoch: [29] [2110/4276] eta: 1:45:45 lr: 1.5011405701830575e-05 loss: 0.0806 (0.0898) time: 2.9472 data: 0.0073 max mem: 33302 Epoch: [29] [2120/4276] eta: 1:45:16 lr: 1.500839837506183e-05 loss: 0.0778 (0.0898) time: 2.9495 data: 0.0073 max mem: 33302 Epoch: [29] [2130/4276] eta: 1:44:47 lr: 1.5005390981336387e-05 loss: 0.0715 (0.0897) time: 2.9474 data: 0.0074 max mem: 33302 Epoch: [29] [2140/4276] eta: 1:44:18 lr: 1.5002383520637833e-05 loss: 0.0826 (0.0898) time: 2.9442 data: 0.0074 max mem: 33302 Epoch: [29] [2150/4276] eta: 1:43:48 lr: 1.4999375992949769e-05 loss: 0.0821 (0.0897) time: 2.9436 data: 0.0073 max mem: 33302 Epoch: [29] [2160/4276] eta: 1:43:19 lr: 1.499636839825577e-05 loss: 0.0743 (0.0897) time: 2.9443 data: 0.0077 max mem: 33302 Epoch: [29] [2170/4276] eta: 1:42:50 lr: 1.4993360736539422e-05 loss: 0.0793 (0.0897) time: 2.9436 data: 0.0076 max mem: 33302 Epoch: [29] [2180/4276] eta: 1:42:20 lr: 1.499035300778428e-05 loss: 0.0793 (0.0896) time: 2.9205 data: 0.0080 max mem: 33302 Epoch: [29] [2190/4276] eta: 1:41:51 lr: 1.4987345211973911e-05 loss: 0.0815 (0.0896) time: 2.8931 data: 0.0084 max mem: 33302 Epoch: [29] [2200/4276] eta: 1:41:22 lr: 1.4984337349091859e-05 loss: 0.0936 (0.0897) time: 2.9182 data: 0.0084 max mem: 33302 Epoch: [29] [2210/4276] eta: 1:40:52 lr: 1.4981329419121685e-05 loss: 0.0953 (0.0897) time: 2.9276 data: 0.0084 max mem: 33302 Epoch: [29] [2220/4276] eta: 1:40:23 lr: 1.4978321422046906e-05 loss: 0.0795 (0.0896) time: 2.9230 data: 0.0079 max mem: 33302 Epoch: [29] [2230/4276] eta: 1:39:54 lr: 
1.4975313357851059e-05 loss: 0.0799 (0.0896) time: 2.9406 data: 0.0074 max mem: 33302 Epoch: [29] [2240/4276] eta: 1:39:25 lr: 1.4972305226517663e-05 loss: 0.0743 (0.0896) time: 2.9423 data: 0.0071 max mem: 33302 Epoch: [29] [2250/4276] eta: 1:38:55 lr: 1.4969297028030244e-05 loss: 0.0756 (0.0895) time: 2.9425 data: 0.0072 max mem: 33302 Epoch: [29] [2260/4276] eta: 1:38:26 lr: 1.4966288762372289e-05 loss: 0.0813 (0.0896) time: 2.9409 data: 0.0077 max mem: 33302 Epoch: [29] [2270/4276] eta: 1:37:57 lr: 1.4963280429527302e-05 loss: 0.0869 (0.0896) time: 2.9400 data: 0.0077 max mem: 33302 Epoch: [29] [2280/4276] eta: 1:37:28 lr: 1.4960272029478776e-05 loss: 0.0857 (0.0896) time: 2.9408 data: 0.0071 max mem: 33302 Epoch: [29] [2290/4276] eta: 1:36:59 lr: 1.49572635622102e-05 loss: 0.0857 (0.0896) time: 2.9388 data: 0.0072 max mem: 33302 Epoch: [29] [2300/4276] eta: 1:36:29 lr: 1.4954255027705034e-05 loss: 0.0841 (0.0896) time: 2.9389 data: 0.0072 max mem: 33302 Epoch: [29] [2310/4276] eta: 1:36:00 lr: 1.495124642594675e-05 loss: 0.0879 (0.0896) time: 2.9413 data: 0.0072 max mem: 33302 Epoch: [29] [2320/4276] eta: 1:35:31 lr: 1.4948237756918814e-05 loss: 0.0990 (0.0897) time: 2.9419 data: 0.0072 max mem: 33302 Epoch: [29] [2330/4276] eta: 1:35:02 lr: 1.4945229020604667e-05 loss: 0.0877 (0.0897) time: 2.9434 data: 0.0072 max mem: 33302 Epoch: [29] [2340/4276] eta: 1:34:33 lr: 1.494222021698776e-05 loss: 0.0863 (0.0896) time: 2.9436 data: 0.0074 max mem: 33302 Epoch: [29] [2350/4276] eta: 1:34:03 lr: 1.4939211346051524e-05 loss: 0.0787 (0.0896) time: 2.9415 data: 0.0083 max mem: 33302 Epoch: [29] [2360/4276] eta: 1:33:34 lr: 1.4936202407779396e-05 loss: 0.0824 (0.0896) time: 2.9428 data: 0.0084 max mem: 33302 Epoch: [29] [2370/4276] eta: 1:33:05 lr: 1.4933193402154786e-05 loss: 0.0852 (0.0896) time: 2.9441 data: 0.0078 max mem: 33302 Epoch: [29] [2380/4276] eta: 1:32:36 lr: 1.4930184329161107e-05 loss: 0.0891 (0.0897) time: 2.9427 data: 0.0076 max mem: 33302 Epoch: [29] 
[2390/4276] eta: 1:32:07 lr: 1.4927175188781767e-05 loss: 0.0891 (0.0897) time: 2.9430 data: 0.0078 max mem: 33302 Epoch: [29] [2400/4276] eta: 1:31:37 lr: 1.4924165981000169e-05 loss: 0.0837 (0.0897) time: 2.9436 data: 0.0078 max mem: 33302 Epoch: [29] [2410/4276] eta: 1:31:08 lr: 1.4921156705799688e-05 loss: 0.0858 (0.0897) time: 2.9606 data: 0.0079 max mem: 33302 Epoch: [29] [2420/4276] eta: 1:30:39 lr: 1.4918147363163715e-05 loss: 0.0798 (0.0897) time: 2.9451 data: 0.0080 max mem: 33302 Epoch: [29] [2430/4276] eta: 1:30:10 lr: 1.4915137953075618e-05 loss: 0.0865 (0.0897) time: 2.9244 data: 0.0082 max mem: 33302 Epoch: [29] [2440/4276] eta: 1:29:41 lr: 1.4912128475518771e-05 loss: 0.0838 (0.0897) time: 2.9444 data: 0.0086 max mem: 33302 Epoch: [29] [2450/4276] eta: 1:29:11 lr: 1.4909118930476523e-05 loss: 0.0812 (0.0897) time: 2.9486 data: 0.0083 max mem: 33302 Epoch: [29] [2460/4276] eta: 1:28:42 lr: 1.4906109317932223e-05 loss: 0.0861 (0.0897) time: 2.9448 data: 0.0079 max mem: 33302 Epoch: [29] [2470/4276] eta: 1:28:13 lr: 1.490309963786922e-05 loss: 0.0855 (0.0897) time: 2.9401 data: 0.0078 max mem: 33302 Epoch: [29] [2480/4276] eta: 1:27:43 lr: 1.4900089890270848e-05 loss: 0.0903 (0.0898) time: 2.9278 data: 0.0082 max mem: 33302 Epoch: [29] [2490/4276] eta: 1:27:14 lr: 1.4897080075120428e-05 loss: 0.0895 (0.0898) time: 2.9305 data: 0.0080 max mem: 33302 Epoch: [29] [2500/4276] eta: 1:26:45 lr: 1.489407019240128e-05 loss: 0.0869 (0.0898) time: 2.9421 data: 0.0072 max mem: 33302 Epoch: [29] [2510/4276] eta: 1:26:16 lr: 1.4891060242096713e-05 loss: 0.0863 (0.0898) time: 2.9412 data: 0.0072 max mem: 33302 Epoch: [29] [2520/4276] eta: 1:25:46 lr: 1.488805022419004e-05 loss: 0.0855 (0.0898) time: 2.9399 data: 0.0072 max mem: 33302 Epoch: [29] [2530/4276] eta: 1:25:17 lr: 1.4885040138664538e-05 loss: 0.0838 (0.0897) time: 2.9396 data: 0.0072 max mem: 33302 Epoch: [29] [2540/4276] eta: 1:24:48 lr: 1.4882029985503507e-05 loss: 0.0831 (0.0898) time: 2.9426 data: 
0.0071 max mem: 33302 Epoch: [29] [2550/4276] eta: 1:24:19 lr: 1.4879019764690221e-05 loss: 0.0802 (0.0897) time: 2.9441 data: 0.0075 max mem: 33302 Epoch: [29] [2560/4276] eta: 1:23:50 lr: 1.4876009476207958e-05 loss: 0.0668 (0.0896) time: 2.9428 data: 0.0077 max mem: 33302 Epoch: [29] [2570/4276] eta: 1:23:20 lr: 1.4872999120039973e-05 loss: 0.0755 (0.0896) time: 2.9444 data: 0.0077 max mem: 33302 Epoch: [29] [2580/4276] eta: 1:22:51 lr: 1.4869988696169524e-05 loss: 0.0766 (0.0896) time: 2.9449 data: 0.0078 max mem: 33302 Epoch: [29] [2590/4276] eta: 1:22:22 lr: 1.4866978204579856e-05 loss: 0.0747 (0.0895) time: 2.9444 data: 0.0076 max mem: 33302 Epoch: [29] [2600/4276] eta: 1:21:53 lr: 1.4863967645254217e-05 loss: 0.0713 (0.0895) time: 2.9447 data: 0.0075 max mem: 33302 Epoch: [29] [2610/4276] eta: 1:21:23 lr: 1.486095701817583e-05 loss: 0.0721 (0.0894) time: 2.9468 data: 0.0076 max mem: 33302 Epoch: [29] [2620/4276] eta: 1:20:54 lr: 1.485794632332792e-05 loss: 0.0795 (0.0895) time: 2.9491 data: 0.0078 max mem: 33302 Epoch: [29] [2630/4276] eta: 1:20:25 lr: 1.4854935560693712e-05 loss: 0.0753 (0.0894) time: 2.9491 data: 0.0079 max mem: 33302 Epoch: [29] [2640/4276] eta: 1:19:56 lr: 1.4851924730256397e-05 loss: 0.0705 (0.0894) time: 2.9479 data: 0.0078 max mem: 33302 Epoch: [29] [2650/4276] eta: 1:19:27 lr: 1.4848913831999186e-05 loss: 0.0724 (0.0893) time: 2.9471 data: 0.0076 max mem: 33302 Epoch: [29] [2660/4276] eta: 1:18:57 lr: 1.484590286590527e-05 loss: 0.0807 (0.0893) time: 2.9324 data: 0.0082 max mem: 33302 Epoch: [29] [2670/4276] eta: 1:18:28 lr: 1.4842891831957833e-05 loss: 0.0835 (0.0893) time: 2.9115 data: 0.0087 max mem: 33302 Epoch: [29] [2680/4276] eta: 1:17:58 lr: 1.483988073014005e-05 loss: 0.0865 (0.0893) time: 2.9161 data: 0.0086 max mem: 33302 Epoch: [29] [2690/4276] eta: 1:17:29 lr: 1.4836869560435084e-05 loss: 0.0799 (0.0892) time: 2.9379 data: 0.0086 max mem: 33302 Epoch: [29] [2700/4276] eta: 1:17:00 lr: 1.4833858322826102e-05 loss: 0.0695 
(0.0892) time: 2.9490 data: 0.0086 max mem: 33302 Epoch: [29] [2710/4276] eta: 1:16:31 lr: 1.483084701729626e-05 loss: 0.0785 (0.0892) time: 2.9509 data: 0.0087 max mem: 33302 Epoch: [29] [2720/4276] eta: 1:16:02 lr: 1.482783564382869e-05 loss: 0.0785 (0.0891) time: 2.9507 data: 0.0085 max mem: 33302 Epoch: [29] [2730/4276] eta: 1:15:32 lr: 1.4824824202406534e-05 loss: 0.0844 (0.0891) time: 2.9484 data: 0.0081 max mem: 33302 Epoch: [29] [2740/4276] eta: 1:15:03 lr: 1.482181269301292e-05 loss: 0.0844 (0.0891) time: 2.9477 data: 0.0079 max mem: 33302 Epoch: [29] [2750/4276] eta: 1:14:34 lr: 1.4818801115630975e-05 loss: 0.0802 (0.0891) time: 2.9484 data: 0.0081 max mem: 33302 Epoch: [29] [2760/4276] eta: 1:14:05 lr: 1.48157894702438e-05 loss: 0.0766 (0.0891) time: 2.9502 data: 0.0081 max mem: 33302 Epoch: [29] [2770/4276] eta: 1:13:35 lr: 1.4812777756834501e-05 loss: 0.0766 (0.0891) time: 2.9488 data: 0.0078 max mem: 33302 Epoch: [29] [2780/4276] eta: 1:13:06 lr: 1.4809765975386175e-05 loss: 0.0754 (0.0891) time: 2.9470 data: 0.0076 max mem: 33302 Epoch: [29] [2790/4276] eta: 1:12:37 lr: 1.4806754125881923e-05 loss: 0.0753 (0.0890) time: 2.9316 data: 0.0082 max mem: 33302 Epoch: [29] [2800/4276] eta: 1:12:07 lr: 1.4803742208304802e-05 loss: 0.0753 (0.0890) time: 2.9089 data: 0.0084 max mem: 33302 Epoch: [29] [2810/4276] eta: 1:11:38 lr: 1.4800730222637902e-05 loss: 0.0684 (0.0889) time: 2.9041 data: 0.0082 max mem: 33302 Epoch: [29] [2820/4276] eta: 1:11:08 lr: 1.4797718168864275e-05 loss: 0.0684 (0.0889) time: 2.9053 data: 0.0083 max mem: 33302 Epoch: [29] [2830/4276] eta: 1:10:39 lr: 1.4794706046966992e-05 loss: 0.0824 (0.0889) time: 2.9173 data: 0.0092 max mem: 33302 Epoch: [29] [2840/4276] eta: 1:10:10 lr: 1.4791693856929084e-05 loss: 0.0951 (0.0889) time: 2.9387 data: 0.0095 max mem: 33302 Epoch: [29] [2850/4276] eta: 1:09:41 lr: 1.47886815987336e-05 loss: 0.0945 (0.0889) time: 2.9471 data: 0.0085 max mem: 33302 Epoch: [29] [2860/4276] eta: 1:09:11 lr: 
1.478566927236357e-05 loss: 0.0695 (0.0889) time: 2.9307 data: 0.0087 max mem: 33302 Epoch: [29] [2870/4276] eta: 1:08:42 lr: 1.478265687780202e-05 loss: 0.0759 (0.0889) time: 2.9077 data: 0.0095 max mem: 33302 Epoch: [29] [2880/4276] eta: 1:08:12 lr: 1.4779644415031962e-05 loss: 0.0766 (0.0889) time: 2.9220 data: 0.0101 max mem: 33302 Epoch: [29] [2890/4276] eta: 1:07:43 lr: 1.4776631884036405e-05 loss: 0.0768 (0.0889) time: 2.9379 data: 0.0097 max mem: 33302 Epoch: [29] [2900/4276] eta: 1:07:14 lr: 1.4773619284798346e-05 loss: 0.0840 (0.0889) time: 2.9211 data: 0.0090 max mem: 33302 Epoch: [29] [2910/4276] eta: 1:06:44 lr: 1.4770606617300783e-05 loss: 0.0840 (0.0889) time: 2.9233 data: 0.0087 max mem: 33302 Epoch: [29] [2920/4276] eta: 1:06:15 lr: 1.4767593881526692e-05 loss: 0.0829 (0.0889) time: 2.9414 data: 0.0080 max mem: 33302 Epoch: [29] [2930/4276] eta: 1:05:46 lr: 1.4764581077459053e-05 loss: 0.0825 (0.0889) time: 2.9441 data: 0.0073 max mem: 33302 Epoch: [29] [2940/4276] eta: 1:05:17 lr: 1.4761568205080834e-05 loss: 0.0823 (0.0889) time: 2.9456 data: 0.0072 max mem: 33302 Epoch: [29] [2950/4276] eta: 1:04:47 lr: 1.4758555264374988e-05 loss: 0.0841 (0.0889) time: 2.9467 data: 0.0073 max mem: 33302 Epoch: [29] [2960/4276] eta: 1:04:18 lr: 1.4755542255324468e-05 loss: 0.0834 (0.0889) time: 2.9465 data: 0.0075 max mem: 33302 Epoch: [29] [2970/4276] eta: 1:03:49 lr: 1.4752529177912217e-05 loss: 0.0844 (0.0889) time: 2.9486 data: 0.0076 max mem: 33302 Epoch: [29] [2980/4276] eta: 1:03:20 lr: 1.4749516032121174e-05 loss: 0.0876 (0.0889) time: 2.9510 data: 0.0078 max mem: 33302 Epoch: [29] [2990/4276] eta: 1:02:50 lr: 1.4746502817934261e-05 loss: 0.0824 (0.0889) time: 2.9506 data: 0.0077 max mem: 33302 Epoch: [29] [3000/4276] eta: 1:02:21 lr: 1.4743489535334393e-05 loss: 0.0747 (0.0888) time: 2.9492 data: 0.0077 max mem: 33302 Epoch: [29] [3010/4276] eta: 1:01:52 lr: 1.4740476184304486e-05 loss: 0.0798 (0.0888) time: 2.9480 data: 0.0077 max mem: 33302 Epoch: 
[29] [3020/4276] eta: 1:01:23 lr: 1.4737462764827448e-05 loss: 0.0808 (0.0888) time: 2.9470 data: 0.0075 max mem: 33302 Epoch: [29] [3030/4276] eta: 1:00:53 lr: 1.4734449276886156e-05 loss: 0.0739 (0.0888) time: 2.9481 data: 0.0074 max mem: 33302 Epoch: [29] [3040/4276] eta: 1:00:24 lr: 1.4731435720463508e-05 loss: 0.0919 (0.0888) time: 2.9512 data: 0.0080 max mem: 33302 Epoch: [29] [3050/4276] eta: 0:59:55 lr: 1.4728422095542377e-05 loss: 0.0970 (0.0888) time: 2.9561 data: 0.0080 max mem: 33302 Epoch: [29] [3060/4276] eta: 0:59:26 lr: 1.4725408402105639e-05 loss: 0.0730 (0.0888) time: 2.9528 data: 0.0075 max mem: 33302 Epoch: [29] [3070/4276] eta: 0:58:56 lr: 1.4722394640136144e-05 loss: 0.0755 (0.0888) time: 2.9467 data: 0.0076 max mem: 33302 Epoch: [29] [3080/4276] eta: 0:58:27 lr: 1.4719380809616751e-05 loss: 0.0755 (0.0887) time: 2.9467 data: 0.0082 max mem: 33302 Epoch: [29] [3090/4276] eta: 0:57:58 lr: 1.4716366910530305e-05 loss: 0.0686 (0.0887) time: 2.9477 data: 0.0084 max mem: 33302 Epoch: [29] [3100/4276] eta: 0:57:29 lr: 1.4713352942859649e-05 loss: 0.0784 (0.0887) time: 2.9487 data: 0.0080 max mem: 33302 Epoch: [29] [3110/4276] eta: 0:56:59 lr: 1.4710338906587596e-05 loss: 0.0752 (0.0886) time: 2.9487 data: 0.0075 max mem: 33302 Epoch: [29] [3120/4276] eta: 0:56:30 lr: 1.4707324801696975e-05 loss: 0.0768 (0.0886) time: 2.9473 data: 0.0071 max mem: 33302 Epoch: [29] [3130/4276] eta: 0:56:01 lr: 1.4704310628170601e-05 loss: 0.0832 (0.0886) time: 2.9460 data: 0.0071 max mem: 33302 Epoch: [29] [3140/4276] eta: 0:55:31 lr: 1.4701296385991278e-05 loss: 0.0832 (0.0886) time: 2.9499 data: 0.0072 max mem: 33302 Epoch: [29] [3150/4276] eta: 0:55:02 lr: 1.4698282075141796e-05 loss: 0.0889 (0.0886) time: 2.9516 data: 0.0073 max mem: 33302 Epoch: [29] [3160/4276] eta: 0:54:33 lr: 1.4695267695604938e-05 loss: 0.0855 (0.0886) time: 2.9474 data: 0.0073 max mem: 33302 Epoch: [29] [3170/4276] eta: 0:54:04 lr: 1.4692253247363493e-05 loss: 0.0855 (0.0887) time: 2.9466 
data: 0.0072 max mem: 33302 Epoch: [29] [3180/4276] eta: 0:53:34 lr: 1.4689238730400234e-05 loss: 0.0835 (0.0886) time: 2.9473 data: 0.0072 max mem: 33302 Epoch: [29] [3190/4276] eta: 0:53:05 lr: 1.468622414469791e-05 loss: 0.0747 (0.0886) time: 2.9481 data: 0.0072 max mem: 33302 Epoch: [29] [3200/4276] eta: 0:52:36 lr: 1.4683209490239287e-05 loss: 0.0806 (0.0886) time: 2.9502 data: 0.0072 max mem: 33302 Epoch: [29] [3210/4276] eta: 0:52:06 lr: 1.4680194767007108e-05 loss: 0.0761 (0.0886) time: 2.9389 data: 0.0077 max mem: 33302 Epoch: [29] [3220/4276] eta: 0:51:37 lr: 1.4677179974984104e-05 loss: 0.0898 (0.0886) time: 2.9135 data: 0.0087 max mem: 33302 Epoch: [29] [3230/4276] eta: 0:51:08 lr: 1.4674165114153013e-05 loss: 0.0898 (0.0886) time: 2.9095 data: 0.0094 max mem: 33302 Epoch: [29] [3240/4276] eta: 0:50:38 lr: 1.4671150184496552e-05 loss: 0.0838 (0.0886) time: 2.9276 data: 0.0086 max mem: 33302 Epoch: [29] [3250/4276] eta: 0:50:09 lr: 1.4668135185997443e-05 loss: 0.0868 (0.0887) time: 2.9376 data: 0.0074 max mem: 33302 Epoch: [29] [3260/4276] eta: 0:49:40 lr: 1.4665120118638376e-05 loss: 0.0856 (0.0887) time: 2.9322 data: 0.0075 max mem: 33302 Epoch: [29] [3270/4276] eta: 0:49:10 lr: 1.4662104982402052e-05 loss: 0.0848 (0.0887) time: 2.9216 data: 0.0081 max mem: 33302 Epoch: [29] [3280/4276] eta: 0:48:41 lr: 1.4659089777271165e-05 loss: 0.0815 (0.0887) time: 2.9324 data: 0.0077 max mem: 33302 Epoch: [29] [3290/4276] eta: 0:48:12 lr: 1.4656074503228392e-05 loss: 0.0939 (0.0887) time: 2.9386 data: 0.0077 max mem: 33302 Epoch: [29] [3300/4276] eta: 0:47:42 lr: 1.4653059160256402e-05 loss: 0.0967 (0.0888) time: 2.9338 data: 0.0078 max mem: 33302 Epoch: [29] [3310/4276] eta: 0:47:13 lr: 1.4650043748337857e-05 loss: 0.1014 (0.0888) time: 2.9369 data: 0.0073 max mem: 33302 Epoch: [29] [3320/4276] eta: 0:46:44 lr: 1.4647028267455417e-05 loss: 0.1030 (0.0889) time: 2.9360 data: 0.0072 max mem: 33302 Epoch: [29] [3330/4276] eta: 0:46:14 lr: 1.4644012717591729e-05 
loss: 0.0974 (0.0888) time: 2.9377 data: 0.0072 max mem: 33302 Epoch: [29] [3340/4276] eta: 0:45:45 lr: 1.4640997098729426e-05 loss: 0.0824 (0.0888) time: 2.9390 data: 0.0071 max mem: 33302 Epoch: [29] [3350/4276] eta: 0:45:16 lr: 1.4637981410851135e-05 loss: 0.0831 (0.0888) time: 2.9450 data: 0.0073 max mem: 33302 Epoch: [29] [3360/4276] eta: 0:44:46 lr: 1.4634965653939484e-05 loss: 0.0783 (0.0888) time: 2.9446 data: 0.0075 max mem: 33302 Epoch: [29] [3370/4276] eta: 0:44:17 lr: 1.4631949827977086e-05 loss: 0.0795 (0.0888) time: 2.9395 data: 0.0073 max mem: 33302 Epoch: [29] [3380/4276] eta: 0:43:48 lr: 1.4628933932946542e-05 loss: 0.0926 (0.0888) time: 2.9463 data: 0.0071 max mem: 33302 Epoch: [29] [3390/4276] eta: 0:43:18 lr: 1.4625917968830449e-05 loss: 0.0845 (0.0888) time: 2.9472 data: 0.0071 max mem: 33302 Epoch: [29] [3400/4276] eta: 0:42:49 lr: 1.4622901935611397e-05 loss: 0.0930 (0.0888) time: 2.9427 data: 0.0073 max mem: 33302 Epoch: [29] [3410/4276] eta: 0:42:20 lr: 1.4619885833271968e-05 loss: 0.0803 (0.0888) time: 2.9329 data: 0.0082 max mem: 33302 Epoch: [29] [3420/4276] eta: 0:41:50 lr: 1.4616869661794727e-05 loss: 0.0851 (0.0888) time: 2.9234 data: 0.0083 max mem: 33302 Epoch: [29] [3430/4276] eta: 0:41:21 lr: 1.4613853421162236e-05 loss: 0.0906 (0.0889) time: 2.9127 data: 0.0082 max mem: 33302 Epoch: [29] [3440/4276] eta: 0:40:52 lr: 1.4610837111357054e-05 loss: 0.0840 (0.0888) time: 2.9247 data: 0.0082 max mem: 33302 Epoch: [29] [3450/4276] eta: 0:40:22 lr: 1.4607820732361733e-05 loss: 0.0863 (0.0889) time: 2.9490 data: 0.0077 max mem: 33302 Epoch: [29] [3460/4276] eta: 0:39:53 lr: 1.4604804284158793e-05 loss: 0.1128 (0.0890) time: 2.9495 data: 0.0073 max mem: 33302 Epoch: [29] [3470/4276] eta: 0:39:24 lr: 1.4601787766730779e-05 loss: 0.0871 (0.0889) time: 2.9489 data: 0.0073 max mem: 33302 Epoch: [29] [3480/4276] eta: 0:38:55 lr: 1.4598771180060203e-05 loss: 0.0851 (0.0890) time: 2.9507 data: 0.0075 max mem: 33302 Epoch: [29] [3490/4276] eta: 
0:38:25 lr: 1.4595754524129584e-05 loss: 0.0855 (0.0890) time: 2.9537 data: 0.0075 max mem: 33302 Epoch: [29] [3500/4276] eta: 0:37:56 lr: 1.4592737798921421e-05 loss: 0.0801 (0.0889) time: 2.9534 data: 0.0073 max mem: 33302 Epoch: [29] [3510/4276] eta: 0:37:27 lr: 1.4589721004418208e-05 loss: 0.0816 (0.0889) time: 2.9503 data: 0.0074 max mem: 33302 Epoch: [29] [3520/4276] eta: 0:36:57 lr: 1.4586704140602443e-05 loss: 0.0839 (0.0889) time: 2.9530 data: 0.0078 max mem: 33302 Epoch: [29] [3530/4276] eta: 0:36:28 lr: 1.4583687207456587e-05 loss: 0.0839 (0.0889) time: 2.9561 data: 0.0077 max mem: 33302 Epoch: [29] [3540/4276] eta: 0:35:59 lr: 1.4580670204963124e-05 loss: 0.0852 (0.0889) time: 2.9537 data: 0.0075 max mem: 33302 Epoch: [29] [3550/4276] eta: 0:35:29 lr: 1.4577653133104505e-05 loss: 0.0819 (0.0889) time: 2.9521 data: 0.0075 max mem: 33302 Epoch: [29] [3560/4276] eta: 0:35:00 lr: 1.4574635991863198e-05 loss: 0.0903 (0.0890) time: 2.9525 data: 0.0076 max mem: 33302 Epoch: [29] [3570/4276] eta: 0:34:31 lr: 1.4571618781221632e-05 loss: 0.0962 (0.0890) time: 2.9548 data: 0.0080 max mem: 33302 Epoch: [29] [3580/4276] eta: 0:34:02 lr: 1.4568601501162252e-05 loss: 0.0762 (0.0889) time: 2.9335 data: 0.0084 max mem: 33302 Epoch: [29] [3590/4276] eta: 0:33:32 lr: 1.4565584151667483e-05 loss: 0.0801 (0.0890) time: 2.9044 data: 0.0081 max mem: 33302 Epoch: [29] [3600/4276] eta: 0:33:03 lr: 1.4562566732719748e-05 loss: 0.0890 (0.0890) time: 2.9012 data: 0.0077 max mem: 33302 Epoch: [29] [3610/4276] eta: 0:32:33 lr: 1.4559549244301452e-05 loss: 0.0897 (0.0890) time: 2.9282 data: 0.0078 max mem: 33302 Epoch: [29] [3620/4276] eta: 0:32:04 lr: 1.4556531686394997e-05 loss: 0.0812 (0.0889) time: 2.9509 data: 0.0077 max mem: 33302 Epoch: [29] [3630/4276] eta: 0:31:35 lr: 1.455351405898278e-05 loss: 0.0805 (0.0890) time: 2.9496 data: 0.0075 max mem: 33302 Epoch: [29] [3640/4276] eta: 0:31:05 lr: 1.4550496362047192e-05 loss: 0.0913 (0.0890) time: 2.9510 data: 0.0075 max mem: 
33302 Epoch: [29] [3650/4276] eta: 0:30:36 lr: 1.4547478595570596e-05 loss: 0.0866 (0.0890) time: 2.9664 data: 0.0081 max mem: 33302 Epoch: [29] [3660/4276] eta: 0:30:07 lr: 1.4544460759535369e-05 loss: 0.0844 (0.0889) time: 2.9470 data: 0.0086 max mem: 33302 Epoch: [29] [3670/4276] eta: 0:29:37 lr: 1.4541442853923868e-05 loss: 0.0856 (0.0889) time: 2.9144 data: 0.0085 max mem: 33302 Epoch: [29] [3680/4276] eta: 0:29:08 lr: 1.4538424878718448e-05 loss: 0.0828 (0.0889) time: 2.9280 data: 0.0089 max mem: 33302 Epoch: [29] [3690/4276] eta: 0:28:39 lr: 1.4535406833901446e-05 loss: 0.0836 (0.0890) time: 2.9471 data: 0.0090 max mem: 33302 Epoch: [29] [3700/4276] eta: 0:28:10 lr: 1.4532388719455198e-05 loss: 0.0839 (0.0890) time: 2.9514 data: 0.0085 max mem: 33302 Epoch: [29] [3710/4276] eta: 0:27:40 lr: 1.4529370535362028e-05 loss: 0.0739 (0.0889) time: 2.9490 data: 0.0081 max mem: 33302 Epoch: [29] [3720/4276] eta: 0:27:11 lr: 1.4526352281604261e-05 loss: 0.0725 (0.0889) time: 2.9487 data: 0.0081 max mem: 33302 Epoch: [29] [3730/4276] eta: 0:26:42 lr: 1.4523333958164192e-05 loss: 0.0860 (0.0889) time: 2.9522 data: 0.0080 max mem: 33302 Epoch: [29] [3740/4276] eta: 0:26:12 lr: 1.4520315565024129e-05 loss: 0.0864 (0.0889) time: 2.9549 data: 0.0078 max mem: 33302 Epoch: [29] [3750/4276] eta: 0:25:43 lr: 1.4517297102166357e-05 loss: 0.0919 (0.0889) time: 2.9402 data: 0.0082 max mem: 33302 Epoch: [29] [3760/4276] eta: 0:25:14 lr: 1.4514278569573173e-05 loss: 0.0733 (0.0889) time: 2.9395 data: 0.0083 max mem: 33302 Epoch: [29] [3770/4276] eta: 0:24:44 lr: 1.4511259967226833e-05 loss: 0.0714 (0.0889) time: 2.9524 data: 0.0079 max mem: 33302 Epoch: [29] [3780/4276] eta: 0:24:15 lr: 1.4508241295109613e-05 loss: 0.0736 (0.0889) time: 2.9521 data: 0.0076 max mem: 33302 Epoch: [29] [3790/4276] eta: 0:23:46 lr: 1.4505222553203762e-05 loss: 0.0769 (0.0889) time: 2.9501 data: 0.0075 max mem: 33302 Epoch: [29] [3800/4276] eta: 0:23:16 lr: 1.4502203741491537e-05 loss: 0.0855 (0.0889) 
time: 2.9479 data: 0.0075 max mem: 33302 Epoch: [29] [3810/4276] eta: 0:22:47 lr: 1.4499184859955173e-05 loss: 0.0857 (0.0889) time: 2.9474 data: 0.0077 max mem: 33302 Epoch: [29] [3820/4276] eta: 0:22:18 lr: 1.4496165908576897e-05 loss: 0.0797 (0.0889) time: 2.9432 data: 0.0078 max mem: 33302 Epoch: [29] [3830/4276] eta: 0:21:48 lr: 1.449314688733894e-05 loss: 0.0870 (0.0889) time: 2.9463 data: 0.0073 max mem: 33302 Epoch: [29] [3840/4276] eta: 0:21:19 lr: 1.4490127796223507e-05 loss: 0.0870 (0.0889) time: 2.9500 data: 0.0071 max mem: 33302 Epoch: [29] [3850/4276] eta: 0:20:50 lr: 1.4487108635212807e-05 loss: 0.0752 (0.0888) time: 2.9482 data: 0.0072 max mem: 33302 Epoch: [29] [3860/4276] eta: 0:20:20 lr: 1.4484089404289033e-05 loss: 0.0873 (0.0889) time: 2.9506 data: 0.0073 max mem: 33302 Epoch: [29] [3870/4276] eta: 0:19:51 lr: 1.448107010343438e-05 loss: 0.0894 (0.0889) time: 2.9501 data: 0.0071 max mem: 33302 Epoch: [29] [3880/4276] eta: 0:19:22 lr: 1.4478050732631018e-05 loss: 0.0805 (0.0888) time: 2.9470 data: 0.0071 max mem: 33302 Epoch: [29] [3890/4276] eta: 0:18:52 lr: 1.4475031291861122e-05 loss: 0.0805 (0.0888) time: 2.9495 data: 0.0071 max mem: 33302 Epoch: [29] [3900/4276] eta: 0:18:23 lr: 1.4472011781106851e-05 loss: 0.0765 (0.0888) time: 2.9540 data: 0.0076 max mem: 33302 Epoch: [29] [3910/4276] eta: 0:17:54 lr: 1.4468992200350365e-05 loss: 0.0725 (0.0888) time: 2.9523 data: 0.0080 max mem: 33302 Epoch: [29] [3920/4276] eta: 0:17:24 lr: 1.4465972549573797e-05 loss: 0.0759 (0.0888) time: 2.9368 data: 0.0087 max mem: 33302 Epoch: [29] [3930/4276] eta: 0:16:55 lr: 1.4462952828759288e-05 loss: 0.0821 (0.0888) time: 2.9146 data: 0.0093 max mem: 33302 Epoch: [29] [3940/4276] eta: 0:16:26 lr: 1.4459933037888965e-05 loss: 0.0729 (0.0888) time: 2.9032 data: 0.0083 max mem: 33302 Epoch: [29] [3950/4276] eta: 0:15:56 lr: 1.4456913176944953e-05 loss: 0.0729 (0.0888) time: 2.9023 data: 0.0081 max mem: 33302 Epoch: [29] [3960/4276] eta: 0:15:27 lr: 
1.4453893245909345e-05 loss: 0.0817 (0.0888) time: 2.9034 data: 0.0083 max mem: 33302 Epoch: [29] [3970/4276] eta: 0:14:57 lr: 1.4450873244764255e-05 loss: 0.0852 (0.0888) time: 2.9059 data: 0.0083 max mem: 33302 Epoch: [29] [3980/4276] eta: 0:14:28 lr: 1.4447853173491767e-05 loss: 0.0796 (0.0888) time: 2.9085 data: 0.0082 max mem: 33302 Epoch: [29] [3990/4276] eta: 0:13:59 lr: 1.4444833032073976e-05 loss: 0.0789 (0.0888) time: 2.9076 data: 0.0080 max mem: 33302 Epoch: [29] [4000/4276] eta: 0:13:29 lr: 1.4441812820492943e-05 loss: 0.0755 (0.0888) time: 2.9203 data: 0.0081 max mem: 33302 Epoch: [29] [4010/4276] eta: 0:13:00 lr: 1.4438792538730737e-05 loss: 0.0775 (0.0888) time: 2.9487 data: 0.0085 max mem: 33302 Epoch: [29] [4020/4276] eta: 0:12:31 lr: 1.4435772186769416e-05 loss: 0.0775 (0.0887) time: 2.9620 data: 0.0086 max mem: 33302 Epoch: [29] [4030/4276] eta: 0:12:01 lr: 1.4432751764591035e-05 loss: 0.0745 (0.0887) time: 2.9600 data: 0.0080 max mem: 33302 Epoch: [29] [4040/4276] eta: 0:11:32 lr: 1.4429731272177624e-05 loss: 0.0871 (0.0888) time: 2.9638 data: 0.0079 max mem: 33302 Epoch: [29] [4050/4276] eta: 0:11:03 lr: 1.4426710709511215e-05 loss: 0.0873 (0.0888) time: 2.9664 data: 0.0080 max mem: 33302 Epoch: [29] [4060/4276] eta: 0:10:33 lr: 1.4423690076573831e-05 loss: 0.0882 (0.0888) time: 2.9663 data: 0.0082 max mem: 33302 Epoch: [29] [4070/4276] eta: 0:10:04 lr: 1.4420669373347492e-05 loss: 0.0858 (0.0888) time: 2.9630 data: 0.0081 max mem: 33302 Epoch: [29] [4080/4276] eta: 0:09:35 lr: 1.441764859981419e-05 loss: 0.0890 (0.0888) time: 2.9588 data: 0.0080 max mem: 33302 Epoch: [29] [4090/4276] eta: 0:09:05 lr: 1.441462775595593e-05 loss: 0.0895 (0.0888) time: 2.9576 data: 0.0081 max mem: 33302 Epoch: [29] [4100/4276] eta: 0:08:36 lr: 1.4411606841754691e-05 loss: 0.0882 (0.0888) time: 2.9536 data: 0.0082 max mem: 33302 Epoch: [29] [4110/4276] eta: 0:08:07 lr: 1.440858585719246e-05 loss: 0.0919 (0.0888) time: 2.9503 data: 0.0084 max mem: 33302 Epoch: [29] 
[4120/4276] eta: 0:07:37 lr: 1.4405564802251198e-05 loss: 0.0910 (0.0889) time: 2.9487 data: 0.0083 max mem: 33302 Epoch: [29] [4130/4276] eta: 0:07:08 lr: 1.4402543676912867e-05 loss: 0.0806 (0.0888) time: 2.9482 data: 0.0081 max mem: 33302 Epoch: [29] [4140/4276] eta: 0:06:39 lr: 1.4399522481159426e-05 loss: 0.0810 (0.0888) time: 2.9589 data: 0.0085 max mem: 33302 Epoch: [29] [4150/4276] eta: 0:06:09 lr: 1.4396501214972804e-05 loss: 0.0828 (0.0888) time: 2.9764 data: 0.0085 max mem: 33302 Epoch: [29] [4160/4276] eta: 0:05:40 lr: 1.4393479878334943e-05 loss: 0.0791 (0.0888) time: 2.9710 data: 0.0081 max mem: 33302 Epoch: [29] [4170/4276] eta: 0:05:11 lr: 1.4390458471227767e-05 loss: 0.0932 (0.0889) time: 2.9572 data: 0.0079 max mem: 33302 Epoch: [29] [4180/4276] eta: 0:04:41 lr: 1.4387436993633194e-05 loss: 0.0821 (0.0888) time: 2.9556 data: 0.0082 max mem: 33302 Epoch: [29] [4190/4276] eta: 0:04:12 lr: 1.4384415445533125e-05 loss: 0.0805 (0.0888) time: 2.9564 data: 0.0082 max mem: 33302 Epoch: [29] [4200/4276] eta: 0:03:43 lr: 1.4381393826909462e-05 loss: 0.0886 (0.0889) time: 2.9570 data: 0.0084 max mem: 33302 Epoch: [29] [4210/4276] eta: 0:03:13 lr: 1.4378372137744094e-05 loss: 0.0979 (0.0889) time: 2.9577 data: 0.0089 max mem: 33302 Epoch: [29] [4220/4276] eta: 0:02:44 lr: 1.437535037801891e-05 loss: 0.0935 (0.0889) time: 2.9464 data: 0.0091 max mem: 33302 Epoch: [29] [4230/4276] eta: 0:02:15 lr: 1.4372328547715764e-05 loss: 0.0934 (0.0889) time: 2.9186 data: 0.0090 max mem: 33302 Epoch: [29] [4240/4276] eta: 0:01:45 lr: 1.4369306646816527e-05 loss: 0.0934 (0.0890) time: 2.9034 data: 0.0087 max mem: 33302 Epoch: [29] [4250/4276] eta: 0:01:16 lr: 1.4366284675303058e-05 loss: 0.1041 (0.0890) time: 2.9040 data: 0.0084 max mem: 33302 Epoch: [29] [4260/4276] eta: 0:00:46 lr: 1.43632626331572e-05 loss: 0.0858 (0.0890) time: 2.9240 data: 0.0087 max mem: 33302 Epoch: [29] [4270/4276] eta: 0:00:17 lr: 1.4360240520360782e-05 loss: 0.0853 (0.0890) time: 2.9435 data: 
0.0081 max mem: 33302 Epoch: [29] Total time: 3:29:12 Test: [ 0/21770] eta: 12:11:08 time: 2.0151 data: 1.9760 max mem: 33302 Test: [ 100/21770] eta: 0:20:31 time: 0.0374 data: 0.0009 max mem: 33302 Test: [ 200/21770] eta: 0:16:57 time: 0.0374 data: 0.0009 max mem: 33302 Test: [ 300/21770] eta: 0:15:42 time: 0.0373 data: 0.0009 max mem: 33302 Test: [ 400/21770] eta: 0:15:03 time: 0.0374 data: 0.0009 max mem: 33302 Test: [ 500/21770] eta: 0:14:38 time: 0.0374 data: 0.0009 max mem: 33302 Test: [ 600/21770] eta: 0:14:20 time: 0.0375 data: 0.0009 max mem: 33302 Test: [ 700/21770] eta: 0:14:07 time: 0.0375 data: 0.0009 max mem: 33302 Test: [ 800/21770] eta: 0:13:56 time: 0.0375 data: 0.0009 max mem: 33302 Test: [ 900/21770] eta: 0:13:46 time: 0.0375 data: 0.0009 max mem: 33302 Test: [ 1000/21770] eta: 0:13:38 time: 0.0375 data: 0.0009 max mem: 33302 Test: [ 1100/21770] eta: 0:13:30 time: 0.0375 data: 0.0009 max mem: 33302 Test: [ 1200/21770] eta: 0:13:25 time: 0.0388 data: 0.0008 max mem: 33302 Test: [ 1300/21770] eta: 0:13:19 time: 0.0380 data: 0.0009 max mem: 33302 Test: [ 1400/21770] eta: 0:13:14 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 1500/21770] eta: 0:13:08 time: 0.0377 data: 0.0009 max mem: 33302 Test: [ 1600/21770] eta: 0:13:03 time: 0.0377 data: 0.0009 max mem: 33302 Test: [ 1700/21770] eta: 0:12:58 time: 0.0377 data: 0.0009 max mem: 33302 Test: [ 1800/21770] eta: 0:12:52 time: 0.0377 data: 0.0009 max mem: 33302 Test: [ 1900/21770] eta: 0:12:48 time: 0.0377 data: 0.0009 max mem: 33302 Test: [ 2000/21770] eta: 0:12:43 time: 0.0377 data: 0.0009 max mem: 33302 Test: [ 2100/21770] eta: 0:12:38 time: 0.0377 data: 0.0009 max mem: 33302 Test: [ 2200/21770] eta: 0:12:33 time: 0.0377 data: 0.0008 max mem: 33302 Test: [ 2300/21770] eta: 0:12:29 time: 0.0377 data: 0.0009 max mem: 33302 Test: [ 2400/21770] eta: 0:12:25 time: 0.0378 data: 0.0009 max mem: 33302 Test: [ 2500/21770] eta: 0:12:20 time: 0.0378 data: 0.0009 max mem: 33302 Test: [ 2600/21770] eta: 0:12:16 
time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 2700/21770] eta: 0:12:12 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 2800/21770] eta: 0:12:07 time: 0.0378 data: 0.0009 max mem: 33302 Test: [ 2900/21770] eta: 0:12:03 time: 0.0378 data: 0.0009 max mem: 33302 Test: [ 3000/21770] eta: 0:11:59 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 3100/21770] eta: 0:11:55 time: 0.0378 data: 0.0009 max mem: 33302 Test: [ 3200/21770] eta: 0:11:51 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 3300/21770] eta: 0:11:47 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 3400/21770] eta: 0:11:43 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 3500/21770] eta: 0:11:39 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 3600/21770] eta: 0:11:35 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 3700/21770] eta: 0:11:31 time: 0.0378 data: 0.0009 max mem: 33302 Test: [ 3800/21770] eta: 0:11:27 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 3900/21770] eta: 0:11:23 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 4000/21770] eta: 0:11:19 time: 0.0404 data: 0.0009 max mem: 33302 Test: [ 4100/21770] eta: 0:11:16 time: 0.0401 data: 0.0008 max mem: 33302 Test: [ 4200/21770] eta: 0:11:12 time: 0.0392 data: 0.0009 max mem: 33302 Test: [ 4300/21770] eta: 0:11:09 time: 0.0392 data: 0.0009 max mem: 33302 Test: [ 4400/21770] eta: 0:11:06 time: 0.0393 data: 0.0009 max mem: 33302 Test: [ 4500/21770] eta: 0:11:02 time: 0.0394 data: 0.0009 max mem: 33302 Test: [ 4600/21770] eta: 0:10:59 time: 0.0390 data: 0.0009 max mem: 33302 Test: [ 4700/21770] eta: 0:10:55 time: 0.0394 data: 0.0009 max mem: 33302 Test: [ 4800/21770] eta: 0:10:51 time: 0.0390 data: 0.0008 max mem: 33302 Test: [ 4900/21770] eta: 0:10:48 time: 0.0394 data: 0.0009 max mem: 33302 Test: [ 5000/21770] eta: 0:10:44 time: 0.0394 data: 0.0009 max mem: 33302 Test: [ 5100/21770] eta: 0:10:41 time: 0.0394 data: 0.0009 max mem: 33302 Test: [ 5200/21770] eta: 0:10:37 time: 0.0391 data: 0.0008 max mem: 33302 Test: [ 5300/21770] eta: 0:10:34 
time: 0.0392 data: 0.0009 max mem: 33302 Test: [ 5400/21770] eta: 0:10:30 time: 0.0394 data: 0.0009 max mem: 33302 Test: [ 5500/21770] eta: 0:10:26 time: 0.0391 data: 0.0009 max mem: 33302 Test: [ 5600/21770] eta: 0:10:23 time: 0.0389 data: 0.0009 max mem: 33302 Test: [ 5700/21770] eta: 0:10:19 time: 0.0389 data: 0.0009 max mem: 33302 Test: [ 5800/21770] eta: 0:10:15 time: 0.0389 data: 0.0009 max mem: 33302 Test: [ 5900/21770] eta: 0:10:11 time: 0.0387 data: 0.0008 max mem: 33302 Test: [ 6000/21770] eta: 0:10:07 time: 0.0387 data: 0.0009 max mem: 33302 Test: [ 6100/21770] eta: 0:10:04 time: 0.0385 data: 0.0008 max mem: 33302 Test: [ 6200/21770] eta: 0:10:00 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 6300/21770] eta: 0:09:56 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 6400/21770] eta: 0:09:52 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 6500/21770] eta: 0:09:48 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 6600/21770] eta: 0:09:44 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 6700/21770] eta: 0:09:40 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 6800/21770] eta: 0:09:36 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 6900/21770] eta: 0:09:32 time: 0.0384 data: 0.0009 max mem: 33302 Test: [ 7000/21770] eta: 0:09:28 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 7100/21770] eta: 0:09:24 time: 0.0377 data: 0.0009 max mem: 33302 Test: [ 7200/21770] eta: 0:09:20 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 7300/21770] eta: 0:09:16 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 7400/21770] eta: 0:09:12 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 7500/21770] eta: 0:09:09 time: 0.0383 data: 0.0008 max mem: 33302 Test: [ 7600/21770] eta: 0:09:05 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 7700/21770] eta: 0:09:01 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 7800/21770] eta: 0:08:57 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 7900/21770] eta: 0:08:53 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 8000/21770] eta: 0:08:49 
time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 8100/21770] eta: 0:08:45 time: 0.0383 data: 0.0009 max mem: 33302 Test: [ 8200/21770] eta: 0:08:41 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 8300/21770] eta: 0:08:37 time: 0.0382 data: 0.0008 max mem: 33302 Test: [ 8400/21770] eta: 0:08:34 time: 0.0381 data: 0.0008 max mem: 33302 Test: [ 8500/21770] eta: 0:08:30 time: 0.0382 data: 0.0008 max mem: 33302 Test: [ 8600/21770] eta: 0:08:26 time: 0.0381 data: 0.0009 max mem: 33302 Test: [ 8700/21770] eta: 0:08:22 time: 0.0382 data: 0.0008 max mem: 33302 Test: [ 8800/21770] eta: 0:08:18 time: 0.0381 data: 0.0008 max mem: 33302 Test: [ 8900/21770] eta: 0:08:14 time: 0.0382 data: 0.0009 max mem: 33302 Test: [ 9000/21770] eta: 0:08:10 time: 0.0381 data: 0.0008 max mem: 33302 Test: [ 9100/21770] eta: 0:08:06 time: 0.0383 data: 0.0008 max mem: 33302 Test: [ 9200/21770] eta: 0:08:02 time: 0.0381 data: 0.0008 max mem: 33302 Test: [ 9300/21770] eta: 0:07:59 time: 0.0382 data: 0.0008 max mem: 33302 Test: [ 9400/21770] eta: 0:07:55 time: 0.0381 data: 0.0008 max mem: 33302 Test: [ 9500/21770] eta: 0:07:51 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 9600/21770] eta: 0:07:47 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 9700/21770] eta: 0:07:43 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 9800/21770] eta: 0:07:39 time: 0.0379 data: 0.0009 max mem: 33302 Test: [ 9900/21770] eta: 0:07:35 time: 0.0378 data: 0.0009 max mem: 33302 Test: [10000/21770] eta: 0:07:31 time: 0.0379 data: 0.0009 max mem: 33302 Test: [10100/21770] eta: 0:07:27 time: 0.0383 data: 0.0009 max mem: 33302 Test: [10200/21770] eta: 0:07:24 time: 0.0387 data: 0.0009 max mem: 33302 Test: [10300/21770] eta: 0:07:20 time: 0.0390 data: 0.0009 max mem: 33302 Test: [10400/21770] eta: 0:07:16 time: 0.0388 data: 0.0009 max mem: 33302 Test: [10500/21770] eta: 0:07:12 time: 0.0388 data: 0.0009 max mem: 33302 Test: [10600/21770] eta: 0:07:08 time: 0.0388 data: 0.0009 max mem: 33302 Test: [10700/21770] eta: 0:07:05 
time: 0.0388 data: 0.0009 max mem: 33302 Test: [10800/21770] eta: 0:07:01 time: 0.0388 data: 0.0009 max mem: 33302 Test: [10900/21770] eta: 0:06:57 time: 0.0389 data: 0.0009 max mem: 33302 Test: [11000/21770] eta: 0:06:53 time: 0.0388 data: 0.0009 max mem: 33302 Test: [11100/21770] eta: 0:06:49 time: 0.0389 data: 0.0009 max mem: 33302 Test: [11200/21770] eta: 0:06:46 time: 0.0387 data: 0.0009 max mem: 33302 Test: [11300/21770] eta: 0:06:42 time: 0.0390 data: 0.0009 max mem: 33302 Test: [11400/21770] eta: 0:06:38 time: 0.0388 data: 0.0009 max mem: 33302 Test: [11500/21770] eta: 0:06:34 time: 0.0388 data: 0.0009 max mem: 33302 Test: [11600/21770] eta: 0:06:30 time: 0.0389 data: 0.0009 max mem: 33302 Test: [11700/21770] eta: 0:06:27 time: 0.0389 data: 0.0009 max mem: 33302 Test: [11800/21770] eta: 0:06:23 time: 0.0384 data: 0.0009 max mem: 33302 Test: [11900/21770] eta: 0:06:19 time: 0.0385 data: 0.0009 max mem: 33302 Test: [12000/21770] eta: 0:06:15 time: 0.0389 data: 0.0009 max mem: 33302 Test: [12100/21770] eta: 0:06:11 time: 0.0389 data: 0.0009 max mem: 33302 Test: [12200/21770] eta: 0:06:08 time: 0.0388 data: 0.0009 max mem: 33302 Test: [12300/21770] eta: 0:06:04 time: 0.0389 data: 0.0009 max mem: 33302 Test: [12400/21770] eta: 0:06:00 time: 0.0387 data: 0.0009 max mem: 33302 Test: [12500/21770] eta: 0:05:56 time: 0.0389 data: 0.0009 max mem: 33302 Test: [12600/21770] eta: 0:05:52 time: 0.0388 data: 0.0009 max mem: 33302 Test: [12700/21770] eta: 0:05:48 time: 0.0387 data: 0.0009 max mem: 33302 Test: [12800/21770] eta: 0:05:45 time: 0.0388 data: 0.0009 max mem: 33302 Test: [12900/21770] eta: 0:05:41 time: 0.0389 data: 0.0009 max mem: 33302 Test: [13000/21770] eta: 0:05:37 time: 0.0388 data: 0.0009 max mem: 33302 Test: [13100/21770] eta: 0:05:33 time: 0.0387 data: 0.0009 max mem: 33302 Test: [13200/21770] eta: 0:05:29 time: 0.0386 data: 0.0009 max mem: 33302 Test: [13300/21770] eta: 0:05:25 time: 0.0384 data: 0.0009 max mem: 33302 Test: [13400/21770] eta: 0:05:22 
time: 0.0384 data: 0.0009 max mem: 33302 Test: [13500/21770] eta: 0:05:18 time: 0.0386 data: 0.0009 max mem: 33302 Test: [13600/21770] eta: 0:05:14 time: 0.0385 data: 0.0009 max mem: 33302 Test: [13700/21770] eta: 0:05:10 time: 0.0385 data: 0.0009 max mem: 33302 Test: [13800/21770] eta: 0:05:06 time: 0.0385 data: 0.0009 max mem: 33302 Test: [13900/21770] eta: 0:05:02 time: 0.0386 data: 0.0009 max mem: 33302 Test: [14000/21770] eta: 0:04:59 time: 0.0384 data: 0.0009 max mem: 33302 Test: [14100/21770] eta: 0:04:55 time: 0.0385 data: 0.0009 max mem: 33302 Test: [14200/21770] eta: 0:04:51 time: 0.0386 data: 0.0009 max mem: 33302 Test: [14300/21770] eta: 0:04:47 time: 0.0387 data: 0.0009 max mem: 33302 Test: [14400/21770] eta: 0:04:43 time: 0.0386 data: 0.0009 max mem: 33302 Test: [14500/21770] eta: 0:04:39 time: 0.0385 data: 0.0009 max mem: 33302 Test: [14600/21770] eta: 0:04:35 time: 0.0385 data: 0.0009 max mem: 33302 Test: [14700/21770] eta: 0:04:32 time: 0.0385 data: 0.0009 max mem: 33302 Test: [14800/21770] eta: 0:04:28 time: 0.0385 data: 0.0009 max mem: 33302 Test: [14900/21770] eta: 0:04:24 time: 0.0387 data: 0.0009 max mem: 33302 Test: [15000/21770] eta: 0:04:20 time: 0.0385 data: 0.0009 max mem: 33302 Test: [15100/21770] eta: 0:04:16 time: 0.0386 data: 0.0009 max mem: 33302 Test: [15200/21770] eta: 0:04:12 time: 0.0389 data: 0.0009 max mem: 33302 Test: [15300/21770] eta: 0:04:09 time: 0.0385 data: 0.0009 max mem: 33302 Test: [15400/21770] eta: 0:04:05 time: 0.0387 data: 0.0009 max mem: 33302 Test: [15500/21770] eta: 0:04:01 time: 0.0386 data: 0.0009 max mem: 33302 Test: [15600/21770] eta: 0:03:57 time: 0.0386 data: 0.0009 max mem: 33302 Test: [15700/21770] eta: 0:03:53 time: 0.0385 data: 0.0009 max mem: 33302 Test: [15800/21770] eta: 0:03:49 time: 0.0385 data: 0.0009 max mem: 33302 Test: [15900/21770] eta: 0:03:45 time: 0.0385 data: 0.0009 max mem: 33302 Test: [16000/21770] eta: 0:03:42 time: 0.0385 data: 0.0009 max mem: 33302 Test: [16100/21770] eta: 0:03:38 
time: 0.0385 data: 0.0009 max mem: 33302 Test: [16200/21770] eta: 0:03:34 time: 0.0386 data: 0.0009 max mem: 33302 Test: [16300/21770] eta: 0:03:30 time: 0.0385 data: 0.0009 max mem: 33302 Test: [16400/21770] eta: 0:03:26 time: 0.0386 data: 0.0009 max mem: 33302 Test: [16500/21770] eta: 0:03:22 time: 0.0385 data: 0.0009 max mem: 33302 Test: [16600/21770] eta: 0:03:19 time: 0.0385 data: 0.0009 max mem: 33302 Test: [16700/21770] eta: 0:03:15 time: 0.0384 data: 0.0009 max mem: 33302 Test: [16800/21770] eta: 0:03:11 time: 0.0385 data: 0.0009 max mem: 33302 Test: [16900/21770] eta: 0:03:07 time: 0.0385 data: 0.0009 max mem: 33302 Test: [17000/21770] eta: 0:03:03 time: 0.0384 data: 0.0009 max mem: 33302 Test: [17100/21770] eta: 0:02:59 time: 0.0384 data: 0.0009 max mem: 33302 Test: [17200/21770] eta: 0:02:55 time: 0.0383 data: 0.0009 max mem: 33302 Test: [17300/21770] eta: 0:02:52 time: 0.0386 data: 0.0009 max mem: 33302 Test: [17400/21770] eta: 0:02:48 time: 0.0385 data: 0.0009 max mem: 33302 Test: [17500/21770] eta: 0:02:44 time: 0.0383 data: 0.0009 max mem: 33302 Test: [17600/21770] eta: 0:02:40 time: 0.0385 data: 0.0008 max mem: 33302 Test: [17700/21770] eta: 0:02:36 time: 0.0384 data: 0.0009 max mem: 33302 Test: [17800/21770] eta: 0:02:32 time: 0.0385 data: 0.0009 max mem: 33302 Test: [17900/21770] eta: 0:02:28 time: 0.0384 data: 0.0009 max mem: 33302 Test: [18000/21770] eta: 0:02:25 time: 0.0385 data: 0.0009 max mem: 33302 Test: [18100/21770] eta: 0:02:21 time: 0.0385 data: 0.0008 max mem: 33302 Test: [18200/21770] eta: 0:02:17 time: 0.0384 data: 0.0009 max mem: 33302 Test: [18300/21770] eta: 0:02:13 time: 0.0385 data: 0.0009 max mem: 33302 Test: [18400/21770] eta: 0:02:09 time: 0.0385 data: 0.0009 max mem: 33302 Test: [18500/21770] eta: 0:02:05 time: 0.0384 data: 0.0009 max mem: 33302 Test: [18600/21770] eta: 0:02:02 time: 0.0381 data: 0.0009 max mem: 33302 Test: [18700/21770] eta: 0:01:58 time: 0.0383 data: 0.0009 max mem: 33302 Test: [18800/21770] eta: 0:01:54 
time: 0.0381 data: 0.0009 max mem: 33302 Test: [18900/21770] eta: 0:01:50 time: 0.0382 data: 0.0008 max mem: 33302 Test: [19000/21770] eta: 0:01:46 time: 0.0381 data: 0.0009 max mem: 33302 Test: [19100/21770] eta: 0:01:42 time: 0.0382 data: 0.0009 max mem: 33302 Test: [19200/21770] eta: 0:01:38 time: 0.0380 data: 0.0009 max mem: 33302 Test: [19300/21770] eta: 0:01:35 time: 0.0381 data: 0.0009 max mem: 33302 Test: [19400/21770] eta: 0:01:31 time: 0.0381 data: 0.0009 max mem: 33302 Test: [19500/21770] eta: 0:01:27 time: 0.0382 data: 0.0008 max mem: 33302 Test: [19600/21770] eta: 0:01:23 time: 0.0381 data: 0.0009 max mem: 33302 Test: [19700/21770] eta: 0:01:19 time: 0.0382 data: 0.0009 max mem: 33302 Test: [19800/21770] eta: 0:01:15 time: 0.0383 data: 0.0008 max mem: 33302 Test: [19900/21770] eta: 0:01:11 time: 0.0387 data: 0.0008 max mem: 33302 Test: [20000/21770] eta: 0:01:08 time: 0.0385 data: 0.0008 max mem: 33302 Test: [20100/21770] eta: 0:01:04 time: 0.0381 data: 0.0009 max mem: 33302 Test: [20200/21770] eta: 0:01:00 time: 0.0380 data: 0.0009 max mem: 33302 Test: [20300/21770] eta: 0:00:56 time: 0.0379 data: 0.0009 max mem: 33302 Test: [20400/21770] eta: 0:00:52 time: 0.0378 data: 0.0009 max mem: 33302 Test: [20500/21770] eta: 0:00:48 time: 0.0379 data: 0.0009 max mem: 33302 Test: [20600/21770] eta: 0:00:44 time: 0.0379 data: 0.0009 max mem: 33302 Test: [20700/21770] eta: 0:00:41 time: 0.0378 data: 0.0009 max mem: 33302 Test: [20800/21770] eta: 0:00:37 time: 0.0379 data: 0.0009 max mem: 33302 Test: [20900/21770] eta: 0:00:33 time: 0.0379 data: 0.0009 max mem: 33302 Test: [21000/21770] eta: 0:00:29 time: 0.0379 data: 0.0009 max mem: 33302 Test: [21100/21770] eta: 0:00:25 time: 0.0379 data: 0.0009 max mem: 33302 Test: [21200/21770] eta: 0:00:21 time: 0.0379 data: 0.0009 max mem: 33302 Test: [21300/21770] eta: 0:00:18 time: 0.0379 data: 0.0009 max mem: 33302 Test: [21400/21770] eta: 0:00:14 time: 0.0379 data: 0.0009 max mem: 33302 Test: [21500/21770] eta: 0:00:10 
time: 0.0379 data: 0.0009 max mem: 33302
Test: [21600/21770] eta: 0:00:06 time: 0.0379 data: 0.0009 max mem: 33302
Test: [21700/21770] eta: 0:00:02 time: 0.0379 data: 0.0009 max mem: 33302
Test: Total time: 0:13:56
Final results:
Mean IoU is 16.39
precision@0.5 = 3.00
precision@0.6 = 1.32
precision@0.7 = 0.40
precision@0.8 = 0.03
precision@0.9 = 0.00
overall IoU = 16.52
mean IoU = 16.39
Mean accuracy for one-to-zero sample is 0.00
Average object IoU 0.1638517244923191
Overall IoU 16.523422241210938
Better epoch: 29
Epoch: [30] [ 0/4276] eta: 6:46:31 lr: 1.4358427218763106e-05 loss: 0.0741 (0.0741) time: 5.7044 data: 2.6220 max mem: 33302
Epoch: [30] [ 10/4276] eta: 3:47:59 lr: 1.4355404992888e-05 loss: 0.0741 (0.0820) time: 3.2066 data: 0.2457 max mem: 33302
Epoch: [30] [ 20/4276] eta: 3:38:50 lr: 1.4352382696315067e-05 loss: 0.0895 (0.0888) time: 2.9542 data: 0.0077 max mem: 33302
Epoch: [30] [ 30/4276] eta: 3:35:20 lr: 1.434936032902612e-05 loss: 0.0895 (0.0879) time: 2.9528 data: 0.0076 max mem: 33302
Epoch: [30] [ 40/4276] eta: 3:33:25 lr: 1.4346337891002942e-05 loss: 0.0839 (0.0874) time: 2.9578 data: 0.0080 max mem: 33302
Epoch: [30] [ 50/4276] eta: 3:31:55 lr: 1.4343315382227327e-05 loss: 0.0812 (0.0858) time: 2.9564 data: 0.0079 max mem: 33302
Epoch: [30] [ 60/4276] eta: 3:30:47 lr: 1.4340292802681052e-05 loss: 0.0817 (0.0863) time: 2.9526 data: 0.0076 max mem: 33302
Epoch: [30] [ 70/4276] eta: 3:29:50 lr: 1.4337270152345886e-05 loss: 0.0851 (0.0858) time: 2.9540 data: 0.0076 max mem: 33302
Epoch: [30] [ 80/4276] eta: 3:29:00 lr: 1.433424743120358e-05 loss: 0.0785 (0.0858) time: 2.9548 data: 0.0078 max mem: 33302
Epoch: [30] [ 90/4276] eta: 3:28:12 lr: 1.433122463923589e-05 loss: 0.0785 (0.0866) time: 2.9519 data: 0.0078 max mem: 33302
Epoch: [30] [ 100/4276] eta: 3:27:28 lr: 1.4328201776424557e-05 loss: 0.0818 (0.0867) time: 2.9492 data: 0.0077 max mem: 33302
Epoch: [30] [ 110/4276] eta: 3:26:49 lr: 1.4325178842751319e-05 loss: 0.0831 (0.0870) time: 2.9529
data: 0.0077 max mem: 33302 Epoch: [30] [ 120/4276] eta: 3:26:10 lr: 1.4322155838197887e-05 loss: 0.0828 (0.0872) time: 2.9539 data: 0.0077 max mem: 33302 Epoch: [30] [ 130/4276] eta: 3:25:33 lr: 1.4319132762745978e-05 loss: 0.0810 (0.0882) time: 2.9532 data: 0.0077 max mem: 33302 Epoch: [30] [ 140/4276] eta: 3:24:57 lr: 1.4316109616377305e-05 loss: 0.0801 (0.0872) time: 2.9539 data: 0.0075 max mem: 33302 Epoch: [30] [ 150/4276] eta: 3:24:23 lr: 1.4313086399073561e-05 loss: 0.0759 (0.0870) time: 2.9549 data: 0.0075 max mem: 33302 Epoch: [30] [ 160/4276] eta: 3:23:43 lr: 1.4310063110816427e-05 loss: 0.0865 (0.0871) time: 2.9461 data: 0.0078 max mem: 33302 Epoch: [30] [ 170/4276] eta: 3:22:57 lr: 1.4307039751587586e-05 loss: 0.0871 (0.0871) time: 2.9183 data: 0.0078 max mem: 33302 Epoch: [30] [ 180/4276] eta: 3:22:14 lr: 1.4304016321368708e-05 loss: 0.0884 (0.0872) time: 2.9034 data: 0.0077 max mem: 33302 Epoch: [30] [ 190/4276] eta: 3:21:33 lr: 1.4300992820141454e-05 loss: 0.0863 (0.0869) time: 2.9085 data: 0.0076 max mem: 33302 Epoch: [30] [ 200/4276] eta: 3:20:53 lr: 1.429796924788747e-05 loss: 0.0715 (0.0865) time: 2.9094 data: 0.0078 max mem: 33302 Epoch: [30] [ 210/4276] eta: 3:20:12 lr: 1.4294945604588396e-05 loss: 0.0755 (0.0866) time: 2.9031 data: 0.0076 max mem: 33302 Epoch: [30] [ 220/4276] eta: 3:19:36 lr: 1.4291921890225876e-05 loss: 0.0801 (0.0861) time: 2.9092 data: 0.0078 max mem: 33302 Epoch: [30] [ 230/4276] eta: 3:19:00 lr: 1.4288898104781522e-05 loss: 0.0760 (0.0858) time: 2.9165 data: 0.0080 max mem: 33302 Epoch: [30] [ 240/4276] eta: 3:18:25 lr: 1.4285874248236955e-05 loss: 0.0744 (0.0860) time: 2.9146 data: 0.0079 max mem: 33302 Epoch: [30] [ 250/4276] eta: 3:17:48 lr: 1.4282850320573776e-05 loss: 0.0934 (0.0872) time: 2.9118 data: 0.0080 max mem: 33302 Epoch: [30] [ 260/4276] eta: 3:17:13 lr: 1.4279826321773593e-05 loss: 0.0934 (0.0870) time: 2.9093 data: 0.0078 max mem: 33302 Epoch: [30] [ 270/4276] eta: 3:16:41 lr: 1.4276802251817978e-05 
loss: 0.0740 (0.0869) time: 2.9202 data: 0.0084 max mem: 33302 Epoch: [30] [ 280/4276] eta: 3:16:07 lr: 1.4273778110688516e-05 loss: 0.0744 (0.0867) time: 2.9219 data: 0.0082 max mem: 33302 Epoch: [30] [ 290/4276] eta: 3:15:33 lr: 1.4270753898366776e-05 loss: 0.0856 (0.0870) time: 2.9114 data: 0.0081 max mem: 33302 Epoch: [30] [ 300/4276] eta: 3:14:58 lr: 1.4267729614834326e-05 loss: 0.0808 (0.0867) time: 2.9062 data: 0.0081 max mem: 33302 Epoch: [30] [ 310/4276] eta: 3:14:24 lr: 1.4264705260072703e-05 loss: 0.0718 (0.0863) time: 2.9032 data: 0.0073 max mem: 33302 Epoch: [30] [ 320/4276] eta: 3:13:49 lr: 1.4261680834063456e-05 loss: 0.0819 (0.0862) time: 2.9025 data: 0.0070 max mem: 33302 Epoch: [30] [ 330/4276] eta: 3:13:16 lr: 1.4258656336788117e-05 loss: 0.0920 (0.0867) time: 2.9034 data: 0.0071 max mem: 33302 Epoch: [30] [ 340/4276] eta: 3:12:42 lr: 1.4255631768228215e-05 loss: 0.0920 (0.0867) time: 2.9026 data: 0.0070 max mem: 33302 Epoch: [30] [ 350/4276] eta: 3:12:15 lr: 1.4252607128365253e-05 loss: 0.0845 (0.0866) time: 2.9321 data: 0.0075 max mem: 33302 Epoch: [30] [ 360/4276] eta: 3:11:49 lr: 1.4249582417180742e-05 loss: 0.0864 (0.0870) time: 2.9620 data: 0.0087 max mem: 33302 Epoch: [30] [ 370/4276] eta: 3:11:21 lr: 1.4246557634656179e-05 loss: 0.0792 (0.0870) time: 2.9607 data: 0.0091 max mem: 33302 Epoch: [30] [ 380/4276] eta: 3:10:54 lr: 1.4243532780773059e-05 loss: 0.0792 (0.0870) time: 2.9619 data: 0.0085 max mem: 33302 Epoch: [30] [ 390/4276] eta: 3:10:27 lr: 1.4240507855512844e-05 loss: 0.0900 (0.0871) time: 2.9618 data: 0.0078 max mem: 33302 Epoch: [30] [ 400/4276] eta: 3:10:00 lr: 1.4237482858857007e-05 loss: 0.0972 (0.0874) time: 2.9612 data: 0.0074 max mem: 33302 Epoch: [30] [ 410/4276] eta: 3:09:32 lr: 1.4234457790787015e-05 loss: 0.1008 (0.0876) time: 2.9598 data: 0.0074 max mem: 33302 Epoch: [30] [ 420/4276] eta: 3:09:05 lr: 1.4231432651284316e-05 loss: 0.0979 (0.0876) time: 2.9651 data: 0.0077 max mem: 33302 Epoch: [30] [ 430/4276] eta: 
3:08:37 lr: 1.4228407440330344e-05 loss: 0.0977 (0.0880) time: 2.9630 data: 0.0077 max mem: 33302 Epoch: [30] [ 440/4276] eta: 3:08:08 lr: 1.4225382157906533e-05 loss: 0.0938 (0.0881) time: 2.9541 data: 0.0074 max mem: 33302 Epoch: [30] [ 450/4276] eta: 3:07:40 lr: 1.4222356803994311e-05 loss: 0.0854 (0.0882) time: 2.9519 data: 0.0073 max mem: 33302 Epoch: [30] [ 460/4276] eta: 3:07:12 lr: 1.421933137857509e-05 loss: 0.0854 (0.0880) time: 2.9587 data: 0.0075 max mem: 33302 Epoch: [30] [ 470/4276] eta: 3:06:40 lr: 1.421630588163027e-05 loss: 0.0801 (0.0880) time: 2.9380 data: 0.0080 max mem: 33302 Epoch: [30] [ 480/4276] eta: 3:06:12 lr: 1.4213280313141247e-05 loss: 0.0799 (0.0878) time: 2.9327 data: 0.0087 max mem: 33302 Epoch: [30] [ 490/4276] eta: 3:05:43 lr: 1.421025467308941e-05 loss: 0.0789 (0.0878) time: 2.9567 data: 0.0085 max mem: 33302 Epoch: [30] [ 500/4276] eta: 3:05:15 lr: 1.4207228961456138e-05 loss: 0.0762 (0.0876) time: 2.9563 data: 0.0079 max mem: 33302 Epoch: [30] [ 510/4276] eta: 3:04:46 lr: 1.4204203178222786e-05 loss: 0.0761 (0.0877) time: 2.9577 data: 0.0082 max mem: 33302 Epoch: [30] [ 520/4276] eta: 3:04:18 lr: 1.420117732337072e-05 loss: 0.0850 (0.0878) time: 2.9567 data: 0.0082 max mem: 33302 Epoch: [30] [ 530/4276] eta: 3:03:49 lr: 1.4198151396881296e-05 loss: 0.1006 (0.0881) time: 2.9552 data: 0.0079 max mem: 33302 Epoch: [30] [ 540/4276] eta: 3:03:20 lr: 1.4195125398735836e-05 loss: 0.0857 (0.0879) time: 2.9556 data: 0.0082 max mem: 33302 Epoch: [30] [ 550/4276] eta: 3:02:52 lr: 1.4192099328915683e-05 loss: 0.0834 (0.0880) time: 2.9558 data: 0.0085 max mem: 33302 Epoch: [30] [ 560/4276] eta: 3:02:23 lr: 1.4189073187402156e-05 loss: 0.0863 (0.0880) time: 2.9556 data: 0.0081 max mem: 33302 Epoch: [30] [ 570/4276] eta: 3:01:54 lr: 1.4186046974176564e-05 loss: 0.0776 (0.0879) time: 2.9565 data: 0.0079 max mem: 33302 Epoch: [30] [ 580/4276] eta: 3:01:25 lr: 1.4183020689220209e-05 loss: 0.0776 (0.0880) time: 2.9558 data: 0.0083 max mem: 33302 
Epoch: [30] [ 590/4276] eta: 3:00:54 lr: 1.4179994332514383e-05 loss: 0.0881 (0.0882) time: 2.9304 data: 0.0078 max mem: 33302 Epoch: [30] [ 600/4276] eta: 3:00:23 lr: 1.4176967904040372e-05 loss: 0.0779 (0.0882) time: 2.9127 data: 0.0075 max mem: 33302 Epoch: [30] [ 610/4276] eta: 2:59:51 lr: 1.4173941403779456e-05 loss: 0.0850 (0.0882) time: 2.9139 data: 0.0076 max mem: 33302 Epoch: [30] [ 620/4276] eta: 2:59:19 lr: 1.4170914831712886e-05 loss: 0.0850 (0.0881) time: 2.9068 data: 0.0075 max mem: 33302 Epoch: [30] [ 630/4276] eta: 2:58:48 lr: 1.4167888187821927e-05 loss: 0.0804 (0.0882) time: 2.9040 data: 0.0074 max mem: 33302 Epoch: [30] [ 640/4276] eta: 2:58:16 lr: 1.4164861472087823e-05 loss: 0.0808 (0.0881) time: 2.9082 data: 0.0074 max mem: 33302 Epoch: [30] [ 650/4276] eta: 2:57:47 lr: 1.4161834684491814e-05 loss: 0.0808 (0.0882) time: 2.9254 data: 0.0082 max mem: 33302 Epoch: [30] [ 660/4276] eta: 2:57:16 lr: 1.4158807825015122e-05 loss: 0.0827 (0.0884) time: 2.9255 data: 0.0084 max mem: 33302 Epoch: [30] [ 670/4276] eta: 2:56:47 lr: 1.4155780893638968e-05 loss: 0.0860 (0.0883) time: 2.9346 data: 0.0087 max mem: 33302 Epoch: [30] [ 680/4276] eta: 2:56:18 lr: 1.4152753890344558e-05 loss: 0.0860 (0.0883) time: 2.9546 data: 0.0091 max mem: 33302 Epoch: [30] [ 690/4276] eta: 2:55:50 lr: 1.4149726815113104e-05 loss: 0.0867 (0.0882) time: 2.9534 data: 0.0082 max mem: 33302 Epoch: [30] [ 700/4276] eta: 2:55:21 lr: 1.4146699667925778e-05 loss: 0.0758 (0.0882) time: 2.9559 data: 0.0078 max mem: 33302 Epoch: [30] [ 710/4276] eta: 2:54:52 lr: 1.4143672448763768e-05 loss: 0.0791 (0.0882) time: 2.9574 data: 0.0078 max mem: 33302 Epoch: [30] [ 720/4276] eta: 2:54:23 lr: 1.4140645157608245e-05 loss: 0.0822 (0.0880) time: 2.9560 data: 0.0078 max mem: 33302 Epoch: [30] [ 730/4276] eta: 2:53:55 lr: 1.413761779444038e-05 loss: 0.0741 (0.0879) time: 2.9555 data: 0.0078 max mem: 33302 Epoch: [30] [ 740/4276] eta: 2:53:26 lr: 1.4134590359241313e-05 loss: 0.0750 (0.0878) time: 
2.9578 data: 0.0078 max mem: 33302 Epoch: [30] [ 750/4276] eta: 2:52:57 lr: 1.4131562851992186e-05 loss: 0.0757 (0.0877) time: 2.9581 data: 0.0078 max mem: 33302 Epoch: [30] [ 760/4276] eta: 2:52:28 lr: 1.412853527267414e-05 loss: 0.0757 (0.0879) time: 2.9552 data: 0.0080 max mem: 33302 Epoch: [30] [ 770/4276] eta: 2:51:59 lr: 1.4125507621268305e-05 loss: 0.0858 (0.0879) time: 2.9550 data: 0.0082 max mem: 33302 Epoch: [30] [ 780/4276] eta: 2:51:31 lr: 1.412247989775578e-05 loss: 0.0734 (0.0877) time: 2.9586 data: 0.0083 max mem: 33302 Epoch: [30] [ 790/4276] eta: 2:51:01 lr: 1.4119452102117678e-05 loss: 0.0807 (0.0878) time: 2.9512 data: 0.0085 max mem: 33302 Epoch: [30] [ 800/4276] eta: 2:50:33 lr: 1.4116424234335094e-05 loss: 0.0899 (0.0878) time: 2.9515 data: 0.0083 max mem: 33302 Epoch: [30] [ 810/4276] eta: 2:50:04 lr: 1.411339629438912e-05 loss: 0.0899 (0.0878) time: 2.9578 data: 0.0079 max mem: 33302 Epoch: [30] [ 820/4276] eta: 2:49:35 lr: 1.4110368282260821e-05 loss: 0.0851 (0.0878) time: 2.9549 data: 0.0081 max mem: 33302 Epoch: [30] [ 830/4276] eta: 2:49:06 lr: 1.4107340197931277e-05 loss: 0.0858 (0.0879) time: 2.9548 data: 0.0080 max mem: 33302 Epoch: [30] [ 840/4276] eta: 2:48:37 lr: 1.410431204138154e-05 loss: 0.0865 (0.0879) time: 2.9534 data: 0.0076 max mem: 33302 Epoch: [30] [ 850/4276] eta: 2:48:07 lr: 1.4101283812592656e-05 loss: 0.0801 (0.0879) time: 2.9532 data: 0.0076 max mem: 33302 Epoch: [30] [ 860/4276] eta: 2:47:38 lr: 1.4098255511545664e-05 loss: 0.0825 (0.0879) time: 2.9547 data: 0.0080 max mem: 33302 Epoch: [30] [ 870/4276] eta: 2:47:10 lr: 1.4095227138221598e-05 loss: 0.0868 (0.0879) time: 2.9573 data: 0.0080 max mem: 33302 Epoch: [30] [ 880/4276] eta: 2:46:41 lr: 1.409219869260148e-05 loss: 0.0877 (0.0881) time: 2.9575 data: 0.0076 max mem: 33302 Epoch: [30] [ 890/4276] eta: 2:46:11 lr: 1.408917017466631e-05 loss: 0.1017 (0.0882) time: 2.9548 data: 0.0078 max mem: 33302 Epoch: [30] [ 900/4276] eta: 2:45:42 lr: 1.4086141584397095e-05 
loss: 0.0970 (0.0882) time: 2.9501 data: 0.0082 max mem: 33302 Epoch: [30] [ 910/4276] eta: 2:45:13 lr: 1.4083112921774826e-05 loss: 0.0878 (0.0882) time: 2.9511 data: 0.0082 max mem: 33302 Epoch: [30] [ 920/4276] eta: 2:44:44 lr: 1.4080084186780493e-05 loss: 0.0865 (0.0882) time: 2.9557 data: 0.0077 max mem: 33302 Epoch: [30] [ 930/4276] eta: 2:44:15 lr: 1.4077055379395054e-05 loss: 0.0835 (0.0883) time: 2.9554 data: 0.0076 max mem: 33302 Epoch: [30] [ 940/4276] eta: 2:43:46 lr: 1.4074026499599477e-05 loss: 0.0835 (0.0883) time: 2.9539 data: 0.0079 max mem: 33302 Epoch: [30] [ 950/4276] eta: 2:43:16 lr: 1.4070997547374715e-05 loss: 0.0888 (0.0884) time: 2.9530 data: 0.0079 max mem: 33302 Epoch: [30] [ 960/4276] eta: 2:42:47 lr: 1.4067968522701722e-05 loss: 0.0877 (0.0884) time: 2.9531 data: 0.0074 max mem: 33303 Epoch: [30] [ 970/4276] eta: 2:42:19 lr: 1.4064939425561413e-05 loss: 0.0865 (0.0885) time: 2.9600 data: 0.0073 max mem: 33303 Epoch: [30] [ 980/4276] eta: 2:41:49 lr: 1.4061910255934724e-05 loss: 0.0865 (0.0885) time: 2.9615 data: 0.0076 max mem: 33303 Epoch: [30] [ 990/4276] eta: 2:41:20 lr: 1.405888101380257e-05 loss: 0.0833 (0.0884) time: 2.9529 data: 0.0075 max mem: 33303 Epoch: [30] [1000/4276] eta: 2:40:51 lr: 1.4055851699145856e-05 loss: 0.0834 (0.0885) time: 2.9499 data: 0.0072 max mem: 33303 Epoch: [30] [1010/4276] eta: 2:40:22 lr: 1.405282231194547e-05 loss: 0.0893 (0.0885) time: 2.9531 data: 0.0074 max mem: 33303 Epoch: [30] [1020/4276] eta: 2:39:52 lr: 1.4049792852182307e-05 loss: 0.0868 (0.0885) time: 2.9560 data: 0.0075 max mem: 33303 Epoch: [30] [1030/4276] eta: 2:39:23 lr: 1.4046763319837238e-05 loss: 0.0893 (0.0885) time: 2.9564 data: 0.0074 max mem: 33303 Epoch: [30] [1040/4276] eta: 2:38:54 lr: 1.4043733714891138e-05 loss: 0.0861 (0.0884) time: 2.9588 data: 0.0072 max mem: 33303 Epoch: [30] [1050/4276] eta: 2:38:25 lr: 1.4040704037324853e-05 loss: 0.0835 (0.0885) time: 2.9569 data: 0.0073 max mem: 33303 Epoch: [30] [1060/4276] eta: 
2:37:56 lr: 1.4037674287119234e-05 loss: 0.0878 (0.0885) time: 2.9545 data: 0.0075 max mem: 33303 Epoch: [30] [1070/4276] eta: 2:37:27 lr: 1.4034644464255123e-05 loss: 0.0888 (0.0886) time: 2.9563 data: 0.0074 max mem: 33303 Epoch: [30] [1080/4276] eta: 2:36:57 lr: 1.403161456871335e-05 loss: 0.0842 (0.0886) time: 2.9553 data: 0.0072 max mem: 33303 Epoch: [30] [1090/4276] eta: 2:36:28 lr: 1.4028584600474724e-05 loss: 0.0950 (0.0887) time: 2.9557 data: 0.0074 max mem: 33303 Epoch: [30] [1100/4276] eta: 2:35:59 lr: 1.4025554559520057e-05 loss: 0.0950 (0.0888) time: 2.9554 data: 0.0077 max mem: 33303 Epoch: [30] [1110/4276] eta: 2:35:29 lr: 1.4022524445830151e-05 loss: 0.0908 (0.0890) time: 2.9485 data: 0.0083 max mem: 33303 Epoch: [30] [1120/4276] eta: 2:35:00 lr: 1.4019494259385802e-05 loss: 0.0875 (0.0890) time: 2.9495 data: 0.0078 max mem: 33303 Epoch: [30] [1130/4276] eta: 2:34:31 lr: 1.4016464000167775e-05 loss: 0.0826 (0.0889) time: 2.9541 data: 0.0068 max mem: 33303 Epoch: [30] [1140/4276] eta: 2:34:02 lr: 1.401343366815685e-05 loss: 0.0858 (0.0889) time: 2.9530 data: 0.0068 max mem: 33303 Epoch: [30] [1150/4276] eta: 2:33:32 lr: 1.401040326333379e-05 loss: 0.0877 (0.0889) time: 2.9530 data: 0.0069 max mem: 33303 Epoch: [30] [1160/4276] eta: 2:33:03 lr: 1.4007372785679335e-05 loss: 0.0883 (0.0890) time: 2.9512 data: 0.0068 max mem: 33303 Epoch: [30] [1170/4276] eta: 2:32:33 lr: 1.4004342235174231e-05 loss: 0.0912 (0.0890) time: 2.9488 data: 0.0070 max mem: 33303 Epoch: [30] [1180/4276] eta: 2:32:04 lr: 1.4001311611799212e-05 loss: 0.0875 (0.0890) time: 2.9499 data: 0.0070 max mem: 33303 Epoch: [30] [1190/4276] eta: 2:31:35 lr: 1.3998280915535e-05 loss: 0.0849 (0.0890) time: 2.9482 data: 0.0074 max mem: 33303 Epoch: [30] [1200/4276] eta: 2:31:05 lr: 1.3995250146362302e-05 loss: 0.0805 (0.0890) time: 2.9487 data: 0.0076 max mem: 33303 Epoch: [30] [1210/4276] eta: 2:30:36 lr: 1.3992219304261822e-05 loss: 0.0710 (0.0888) time: 2.9535 data: 0.0075 max mem: 33303 
Epoch: [30] [1220/4276] eta: 2:30:06 lr: 1.3989188389214254e-05 loss: 0.0779 (0.0889) time: 2.9472 data: 0.0076 max mem: 33303 Epoch: [30] [1230/4276] eta: 2:29:37 lr: 1.3986157401200283e-05 loss: 0.0871 (0.0889) time: 2.9482 data: 0.0071 max mem: 33303 Epoch: [30] [1240/4276] eta: 2:29:08 lr: 1.3983126340200573e-05 loss: 0.0871 (0.0889) time: 2.9566 data: 0.0071 max mem: 33303 Epoch: [30] [1250/4276] eta: 2:28:39 lr: 1.3980095206195792e-05 loss: 0.0854 (0.0889) time: 2.9571 data: 0.0073 max mem: 33303 Epoch: [30] [1260/4276] eta: 2:28:09 lr: 1.3977063999166596e-05 loss: 0.0779 (0.0888) time: 2.9548 data: 0.0076 max mem: 33303 Epoch: [30] [1270/4276] eta: 2:27:40 lr: 1.397403271909363e-05 loss: 0.0776 (0.0887) time: 2.9521 data: 0.0077 max mem: 33303 Epoch: [30] [1280/4276] eta: 2:27:11 lr: 1.3971001365957517e-05 loss: 0.0839 (0.0888) time: 2.9542 data: 0.0073 max mem: 33303 Epoch: [30] [1290/4276] eta: 2:26:41 lr: 1.3967969939738887e-05 loss: 0.0839 (0.0888) time: 2.9429 data: 0.0076 max mem: 33303 Epoch: [30] [1300/4276] eta: 2:26:10 lr: 1.3964938440418353e-05 loss: 0.0802 (0.0888) time: 2.9170 data: 0.0077 max mem: 33303 Epoch: [30] [1310/4276] eta: 2:25:40 lr: 1.3961906867976532e-05 loss: 0.0780 (0.0887) time: 2.9135 data: 0.0081 max mem: 33303 Epoch: [30] [1320/4276] eta: 2:25:10 lr: 1.3958875222393997e-05 loss: 0.0795 (0.0888) time: 2.9252 data: 0.0083 max mem: 33303 Epoch: [30] [1330/4276] eta: 2:24:40 lr: 1.3955843503651342e-05 loss: 0.0823 (0.0887) time: 2.9226 data: 0.0084 max mem: 33303 Epoch: [30] [1340/4276] eta: 2:24:10 lr: 1.3952811711729144e-05 loss: 0.0735 (0.0887) time: 2.9105 data: 0.0079 max mem: 33303 Epoch: [30] [1350/4276] eta: 2:23:40 lr: 1.3949779846607971e-05 loss: 0.0823 (0.0887) time: 2.9226 data: 0.0071 max mem: 33303 Epoch: [30] [1360/4276] eta: 2:23:11 lr: 1.3946747908268368e-05 loss: 0.0824 (0.0887) time: 2.9416 data: 0.0082 max mem: 33303 Epoch: [30] [1370/4276] eta: 2:22:41 lr: 1.3943715896690885e-05 loss: 0.0872 (0.0887) time: 
2.9500 data: 0.0087 max mem: 33303 Epoch: [30] [1380/4276] eta: 2:22:12 lr: 1.3940683811856056e-05 loss: 0.0872 (0.0887) time: 2.9594 data: 0.0079 max mem: 33303 Epoch: [30] [1390/4276] eta: 2:21:42 lr: 1.3937651653744416e-05 loss: 0.0942 (0.0888) time: 2.9446 data: 0.0079 max mem: 33303 Epoch: [30] [1400/4276] eta: 2:21:13 lr: 1.3934619422336464e-05 loss: 0.0942 (0.0889) time: 2.9365 data: 0.0082 max mem: 33303 Epoch: [30] [1410/4276] eta: 2:20:44 lr: 1.3931587117612713e-05 loss: 0.0838 (0.0889) time: 2.9496 data: 0.0074 max mem: 33303 Epoch: [30] [1420/4276] eta: 2:20:13 lr: 1.3928554739553662e-05 loss: 0.0783 (0.0888) time: 2.9302 data: 0.0080 max mem: 33303 Epoch: [30] [1430/4276] eta: 2:19:43 lr: 1.39255222881398e-05 loss: 0.0733 (0.0888) time: 2.9144 data: 0.0085 max mem: 33303 Epoch: [30] [1440/4276] eta: 2:19:14 lr: 1.392248976335159e-05 loss: 0.0886 (0.0887) time: 2.9388 data: 0.0080 max mem: 33303 Epoch: [30] [1450/4276] eta: 2:18:45 lr: 1.3919457165169506e-05 loss: 0.0822 (0.0887) time: 2.9542 data: 0.0077 max mem: 33303 Epoch: [30] [1460/4276] eta: 2:18:16 lr: 1.391642449357401e-05 loss: 0.0822 (0.0887) time: 2.9556 data: 0.0072 max mem: 33303 Epoch: [30] [1470/4276] eta: 2:17:47 lr: 1.3913391748545531e-05 loss: 0.0829 (0.0887) time: 2.9661 data: 0.0074 max mem: 33303 Epoch: [30] [1480/4276] eta: 2:17:18 lr: 1.391035893006452e-05 loss: 0.0865 (0.0887) time: 2.9691 data: 0.0080 max mem: 33303 Epoch: [30] [1490/4276] eta: 2:16:48 lr: 1.3907326038111396e-05 loss: 0.0852 (0.0886) time: 2.9520 data: 0.0082 max mem: 33303 Epoch: [30] [1500/4276] eta: 2:16:18 lr: 1.3904293072666585e-05 loss: 0.0747 (0.0886) time: 2.9197 data: 0.0082 max mem: 33303 Epoch: [30] [1510/4276] eta: 2:15:47 lr: 1.390126003371048e-05 loss: 0.0758 (0.0885) time: 2.9005 data: 0.0083 max mem: 33303 Epoch: [30] [1520/4276] eta: 2:15:17 lr: 1.3898226921223484e-05 loss: 0.0776 (0.0885) time: 2.9007 data: 0.0078 max mem: 33303 Epoch: [30] [1530/4276] eta: 2:14:47 lr: 1.389519373518598e-05 
loss: 0.0800 (0.0884) time: 2.8983 data: 0.0074 max mem: 33303 Epoch: [30] [1540/4276] eta: 2:14:16 lr: 1.3892160475578355e-05 loss: 0.0853 (0.0884) time: 2.8972 data: 0.0075 max mem: 33303 Epoch: [30] [1550/4276] eta: 2:13:46 lr: 1.388912714238096e-05 loss: 0.0752 (0.0884) time: 2.8985 data: 0.0073 max mem: 33303 Epoch: [30] [1560/4276] eta: 2:13:17 lr: 1.388609373557416e-05 loss: 0.0790 (0.0884) time: 2.9148 data: 0.0079 max mem: 33303 Epoch: [30] [1570/4276] eta: 2:12:47 lr: 1.3883060255138297e-05 loss: 0.0790 (0.0884) time: 2.9408 data: 0.0085 max mem: 33303 Epoch: [30] [1580/4276] eta: 2:12:18 lr: 1.3880026701053719e-05 loss: 0.0695 (0.0883) time: 2.9536 data: 0.0084 max mem: 33303 Epoch: [30] [1590/4276] eta: 2:11:49 lr: 1.3876993073300737e-05 loss: 0.0782 (0.0884) time: 2.9515 data: 0.0083 max mem: 33303 Epoch: [30] [1600/4276] eta: 2:11:19 lr: 1.387395937185967e-05 loss: 0.0885 (0.0883) time: 2.9448 data: 0.0084 max mem: 33303 Epoch: [30] [1610/4276] eta: 2:10:50 lr: 1.3870925596710831e-05 loss: 0.0812 (0.0883) time: 2.9467 data: 0.0084 max mem: 33303 Epoch: [30] [1620/4276] eta: 2:10:20 lr: 1.3867891747834519e-05 loss: 0.0806 (0.0883) time: 2.9532 data: 0.0083 max mem: 33303 Epoch: [30] [1630/4276] eta: 2:09:51 lr: 1.3864857825211008e-05 loss: 0.0806 (0.0883) time: 2.9552 data: 0.0083 max mem: 33303 Epoch: [30] [1640/4276] eta: 2:09:22 lr: 1.3861823828820578e-05 loss: 0.0794 (0.0883) time: 2.9547 data: 0.0085 max mem: 33303 Epoch: [30] [1650/4276] eta: 2:08:53 lr: 1.3858789758643497e-05 loss: 0.0792 (0.0883) time: 2.9537 data: 0.0089 max mem: 33303 Epoch: [30] [1660/4276] eta: 2:08:23 lr: 1.3855755614660026e-05 loss: 0.0796 (0.0882) time: 2.9552 data: 0.0089 max mem: 33303 Epoch: [30] [1670/4276] eta: 2:07:54 lr: 1.3852721396850402e-05 loss: 0.0820 (0.0882) time: 2.9560 data: 0.0088 max mem: 33303 Epoch: [30] [1680/4276] eta: 2:07:25 lr: 1.3849687105194861e-05 loss: 0.0820 (0.0882) time: 2.9547 data: 0.0090 max mem: 33303 Epoch: [30] [1690/4276] eta: 
2:06:55 lr: 1.3846652739673633e-05 loss: 0.0758 (0.0881) time: 2.9533 data: 0.0088 max mem: 33303 Epoch: [30] [1700/4276] eta: 2:06:26 lr: 1.3843618300266935e-05 loss: 0.0775 (0.0881) time: 2.9500 data: 0.0086 max mem: 33303 Epoch: [30] [1710/4276] eta: 2:05:57 lr: 1.3840583786954966e-05 loss: 0.0794 (0.0880) time: 2.9489 data: 0.0090 max mem: 33303 Epoch: [30] [1720/4276] eta: 2:05:27 lr: 1.3837549199717926e-05 loss: 0.0782 (0.0880) time: 2.9502 data: 0.0092 max mem: 33303 Epoch: [30] [1730/4276] eta: 2:04:58 lr: 1.3834514538535995e-05 loss: 0.0763 (0.0880) time: 2.9505 data: 0.0088 max mem: 33303 Epoch: [30] [1740/4276] eta: 2:04:29 lr: 1.3831479803389358e-05 loss: 0.0765 (0.0879) time: 2.9518 data: 0.0086 max mem: 33303 Epoch: [30] [1750/4276] eta: 2:03:59 lr: 1.382844499425817e-05 loss: 0.0748 (0.0878) time: 2.9536 data: 0.0087 max mem: 33303 Epoch: [30] [1760/4276] eta: 2:03:30 lr: 1.3825410111122587e-05 loss: 0.0747 (0.0878) time: 2.9545 data: 0.0090 max mem: 33303 Epoch: [30] [1770/4276] eta: 2:03:00 lr: 1.3822375153962758e-05 loss: 0.0753 (0.0878) time: 2.9528 data: 0.0088 max mem: 33303 Epoch: [30] [1780/4276] eta: 2:02:31 lr: 1.381934012275881e-05 loss: 0.0839 (0.0878) time: 2.9541 data: 0.0089 max mem: 33303 Epoch: [30] [1790/4276] eta: 2:02:02 lr: 1.3816305017490873e-05 loss: 0.0885 (0.0878) time: 2.9565 data: 0.0093 max mem: 33303 Epoch: [30] [1800/4276] eta: 2:01:33 lr: 1.3813269838139061e-05 loss: 0.0867 (0.0878) time: 2.9554 data: 0.0092 max mem: 33303 Epoch: [30] [1810/4276] eta: 2:01:03 lr: 1.3810234584683479e-05 loss: 0.0827 (0.0878) time: 2.9548 data: 0.0091 max mem: 33303 Epoch: [30] [1820/4276] eta: 2:00:34 lr: 1.3807199257104215e-05 loss: 0.0874 (0.0879) time: 2.9537 data: 0.0090 max mem: 33303 Epoch: [30] [1830/4276] eta: 2:00:05 lr: 1.380416385538135e-05 loss: 0.0874 (0.0878) time: 2.9529 data: 0.0092 max mem: 33303 Epoch: [30] [1840/4276] eta: 1:59:35 lr: 1.3801128379494966e-05 loss: 0.0701 (0.0877) time: 2.9535 data: 0.0093 max mem: 33303 
Epoch: [30] [1850/4276] eta: 1:59:06 lr: 1.3798092829425125e-05 loss: 0.0750 (0.0878) time: 2.9555 data: 0.0091 max mem: 33303 Epoch: [30] [1860/4276] eta: 1:58:37 lr: 1.3795057205151871e-05 loss: 0.0861 (0.0877) time: 2.9570 data: 0.0090 max mem: 33303 Epoch: [30] [1870/4276] eta: 1:58:07 lr: 1.3792021506655253e-05 loss: 0.0887 (0.0878) time: 2.9562 data: 0.0092 max mem: 33303 Epoch: [30] [1880/4276] eta: 1:57:38 lr: 1.3788985733915299e-05 loss: 0.0856 (0.0877) time: 2.9565 data: 0.0088 max mem: 33303 Epoch: [30] [1890/4276] eta: 1:57:09 lr: 1.3785949886912041e-05 loss: 0.0826 (0.0877) time: 2.9551 data: 0.0085 max mem: 33303 Epoch: [30] [1900/4276] eta: 1:56:39 lr: 1.378291396562548e-05 loss: 0.0877 (0.0877) time: 2.9562 data: 0.0087 max mem: 33303 Epoch: [30] [1910/4276] eta: 1:56:10 lr: 1.3779877970035618e-05 loss: 0.0745 (0.0877) time: 2.9571 data: 0.0089 max mem: 33303 Epoch: [30] [1920/4276] eta: 1:55:41 lr: 1.377684190012245e-05 loss: 0.0705 (0.0876) time: 2.9551 data: 0.0087 max mem: 33303 Epoch: [30] [1930/4276] eta: 1:55:11 lr: 1.377380575586596e-05 loss: 0.0670 (0.0876) time: 2.9551 data: 0.0085 max mem: 33303 Epoch: [30] [1940/4276] eta: 1:54:42 lr: 1.3770769537246109e-05 loss: 0.0679 (0.0875) time: 2.9489 data: 0.0084 max mem: 33303 Epoch: [30] [1950/4276] eta: 1:54:12 lr: 1.376773324424286e-05 loss: 0.0839 (0.0876) time: 2.9328 data: 0.0086 max mem: 33303 Epoch: [30] [1960/4276] eta: 1:53:42 lr: 1.3764696876836167e-05 loss: 0.0821 (0.0875) time: 2.9128 data: 0.0085 max mem: 33303 Epoch: [30] [1970/4276] eta: 1:53:12 lr: 1.3761660435005972e-05 loss: 0.0748 (0.0875) time: 2.9071 data: 0.0081 max mem: 33303 Epoch: [30] [1980/4276] eta: 1:52:42 lr: 1.3758623918732194e-05 loss: 0.0827 (0.0875) time: 2.9076 data: 0.0077 max mem: 33303 Epoch: [30] [1990/4276] eta: 1:52:12 lr: 1.3755587327994756e-05 loss: 0.0799 (0.0875) time: 2.9021 data: 0.0073 max mem: 33303 Epoch: [30] [2000/4276] eta: 1:51:42 lr: 1.375255066277357e-05 loss: 0.0827 (0.0875) time: 2.9088 
data: 0.0075 max mem: 33303 Epoch: [30] [2010/4276] eta: 1:51:13 lr: 1.3749513923048538e-05 loss: 0.0901 (0.0876) time: 2.9091 data: 0.0079 max mem: 33303 Epoch: [30] [2020/4276] eta: 1:50:43 lr: 1.3746477108799532e-05 loss: 0.0982 (0.0876) time: 2.9018 data: 0.0075 max mem: 33303 Epoch: [30] [2030/4276] eta: 1:50:13 lr: 1.3743440220006443e-05 loss: 0.0846 (0.0876) time: 2.9071 data: 0.0073 max mem: 33303 Epoch: [30] [2040/4276] eta: 1:49:43 lr: 1.3740403256649134e-05 loss: 0.0789 (0.0875) time: 2.9162 data: 0.0079 max mem: 33303 Epoch: [30] [2050/4276] eta: 1:49:13 lr: 1.3737366218707468e-05 loss: 0.0815 (0.0875) time: 2.9254 data: 0.0084 max mem: 33303 Epoch: [30] [2060/4276] eta: 1:48:44 lr: 1.3734329106161276e-05 loss: 0.0863 (0.0875) time: 2.9293 data: 0.0085 max mem: 33303 Epoch: [30] [2070/4276] eta: 1:48:14 lr: 1.3731291918990404e-05 loss: 0.0827 (0.0875) time: 2.9375 data: 0.0088 max mem: 33303 Epoch: [30] [2080/4276] eta: 1:47:45 lr: 1.3728254657174686e-05 loss: 0.0847 (0.0876) time: 2.9403 data: 0.0089 max mem: 33303 Epoch: [30] [2090/4276] eta: 1:47:16 lr: 1.3725217320693918e-05 loss: 0.0864 (0.0876) time: 2.9440 data: 0.0083 max mem: 33303 Epoch: [30] [2100/4276] eta: 1:46:46 lr: 1.3722179909527915e-05 loss: 0.0824 (0.0876) time: 2.9541 data: 0.0075 max mem: 33303 Epoch: [30] [2110/4276] eta: 1:46:17 lr: 1.3719142423656473e-05 loss: 0.0780 (0.0875) time: 2.9534 data: 0.0072 max mem: 33303 Epoch: [30] [2120/4276] eta: 1:45:47 lr: 1.3716104863059376e-05 loss: 0.0689 (0.0874) time: 2.9509 data: 0.0072 max mem: 33303 Epoch: [30] [2130/4276] eta: 1:45:18 lr: 1.3713067227716389e-05 loss: 0.0687 (0.0874) time: 2.9502 data: 0.0070 max mem: 33303 Epoch: [30] [2140/4276] eta: 1:44:49 lr: 1.371002951760728e-05 loss: 0.0749 (0.0873) time: 2.9499 data: 0.0070 max mem: 33303 Epoch: [30] [2150/4276] eta: 1:44:19 lr: 1.3706991732711802e-05 loss: 0.0794 (0.0873) time: 2.9502 data: 0.0072 max mem: 33303 Epoch: [30] [2160/4276] eta: 1:43:50 lr: 1.3703953873009704e-05 
loss: 0.0797 (0.0873) time: 2.9508 data: 0.0077 max mem: 33303 Epoch: [30] [2170/4276] eta: 1:43:21 lr: 1.3700915938480705e-05 loss: 0.0820 (0.0873) time: 2.9539 data: 0.0079 max mem: 33303 Epoch: [30] [2180/4276] eta: 1:42:51 lr: 1.3697877929104528e-05 loss: 0.0906 (0.0873) time: 2.9554 data: 0.0077 max mem: 33303 Epoch: [30] [2190/4276] eta: 1:42:22 lr: 1.3694839844860889e-05 loss: 0.0881 (0.0873) time: 2.9533 data: 0.0075 max mem: 33303 Epoch: [30] [2200/4276] eta: 1:41:53 lr: 1.3691801685729488e-05 loss: 0.0866 (0.0874) time: 2.9524 data: 0.0073 max mem: 33303 Epoch: [30] [2210/4276] eta: 1:41:23 lr: 1.368876345169001e-05 loss: 0.0866 (0.0874) time: 2.9520 data: 0.0074 max mem: 33303 Epoch: [30] [2220/4276] eta: 1:40:53 lr: 1.3685725142722133e-05 loss: 0.0830 (0.0874) time: 2.9315 data: 0.0080 max mem: 33303 Epoch: [30] [2230/4276] eta: 1:40:24 lr: 1.368268675880553e-05 loss: 0.0830 (0.0874) time: 2.9213 data: 0.0083 max mem: 33303 Epoch: [30] [2240/4276] eta: 1:39:54 lr: 1.3679648299919862e-05 loss: 0.0765 (0.0873) time: 2.9440 data: 0.0079 max mem: 33303 Epoch: [30] [2250/4276] eta: 1:39:25 lr: 1.3676609766044765e-05 loss: 0.0696 (0.0872) time: 2.9558 data: 0.0077 max mem: 33303 Epoch: [30] [2260/4276] eta: 1:38:56 lr: 1.3673571157159882e-05 loss: 0.0724 (0.0872) time: 2.9549 data: 0.0075 max mem: 33303 Epoch: [30] [2270/4276] eta: 1:38:26 lr: 1.3670532473244841e-05 loss: 0.0778 (0.0872) time: 2.9491 data: 0.0074 max mem: 33303 Epoch: [30] [2280/4276] eta: 1:37:57 lr: 1.366749371427926e-05 loss: 0.0765 (0.0872) time: 2.9499 data: 0.0073 max mem: 33303 Epoch: [30] [2290/4276] eta: 1:37:27 lr: 1.3664454880242736e-05 loss: 0.0765 (0.0872) time: 2.9330 data: 0.0079 max mem: 33303 Epoch: [30] [2300/4276] eta: 1:36:57 lr: 1.3661415971114868e-05 loss: 0.0826 (0.0872) time: 2.9059 data: 0.0085 max mem: 33303 Epoch: [30] [2310/4276] eta: 1:36:28 lr: 1.3658376986875238e-05 loss: 0.0934 (0.0872) time: 2.9017 data: 0.0080 max mem: 33303 Epoch: [30] [2320/4276] eta: 
1:35:58 lr: 1.3655337927503428e-05 loss: 0.0851 (0.0873) time: 2.9000 data: 0.0077 max mem: 33303 Epoch: [30] [2330/4276] eta: 1:35:28 lr: 1.3652298792978987e-05 loss: 0.0845 (0.0873) time: 2.9088 data: 0.0085 max mem: 33303 Epoch: [30] [2340/4276] eta: 1:34:59 lr: 1.3649259583281473e-05 loss: 0.0823 (0.0872) time: 2.9363 data: 0.0091 max mem: 33303 Epoch: [30] [2350/4276] eta: 1:34:29 lr: 1.364622029839043e-05 loss: 0.0787 (0.0872) time: 2.9542 data: 0.0081 max mem: 33303 Epoch: [30] [2360/4276] eta: 1:34:00 lr: 1.3643180938285391e-05 loss: 0.0784 (0.0872) time: 2.9551 data: 0.0074 max mem: 33303 Epoch: [30] [2370/4276] eta: 1:33:31 lr: 1.3640141502945866e-05 loss: 0.0847 (0.0872) time: 2.9541 data: 0.0073 max mem: 33303 Epoch: [30] [2380/4276] eta: 1:33:01 lr: 1.3637101992351373e-05 loss: 0.0899 (0.0873) time: 2.9531 data: 0.0076 max mem: 33303 Epoch: [30] [2390/4276] eta: 1:32:32 lr: 1.3634062406481413e-05 loss: 0.0984 (0.0873) time: 2.9532 data: 0.0076 max mem: 33303 Epoch: [30] [2400/4276] eta: 1:32:03 lr: 1.3631022745315463e-05 loss: 0.0881 (0.0873) time: 2.9484 data: 0.0075 max mem: 33303 Epoch: [30] [2410/4276] eta: 1:31:33 lr: 1.362798300883301e-05 loss: 0.0828 (0.0873) time: 2.9454 data: 0.0079 max mem: 33303 Epoch: [30] [2420/4276] eta: 1:31:04 lr: 1.3624943197013516e-05 loss: 0.0801 (0.0873) time: 2.9507 data: 0.0078 max mem: 33303 Epoch: [30] [2430/4276] eta: 1:30:34 lr: 1.3621903309836442e-05 loss: 0.0868 (0.0873) time: 2.9569 data: 0.0076 max mem: 33303 Epoch: [30] [2440/4276] eta: 1:30:05 lr: 1.3618863347281228e-05 loss: 0.0815 (0.0873) time: 2.9568 data: 0.0075 max mem: 33303 Epoch: [30] [2450/4276] eta: 1:29:36 lr: 1.361582330932731e-05 loss: 0.0815 (0.0873) time: 2.9526 data: 0.0074 max mem: 33303 Epoch: [30] [2460/4276] eta: 1:29:06 lr: 1.3612783195954118e-05 loss: 0.0837 (0.0873) time: 2.9533 data: 0.0079 max mem: 33303 Epoch: [30] [2470/4276] eta: 1:28:37 lr: 1.3609743007141062e-05 loss: 0.0848 (0.0873) time: 2.9549 data: 0.0078 max mem: 33303 
Epoch: [30] [2480/4276] eta: 1:28:08 lr: 1.360670274286754e-05 loss: 0.0848 (0.0873) time: 2.9523 data: 0.0074 max mem: 33303 Epoch: [30] [2490/4276] eta: 1:27:38 lr: 1.3603662403112947e-05 loss: 0.0825 (0.0873) time: 2.9467 data: 0.0075 max mem: 33303 Epoch: [30] [2500/4276] eta: 1:27:08 lr: 1.3600621987856663e-05 loss: 0.0846 (0.0873) time: 2.9328 data: 0.0079 max mem: 33303 Epoch: [30] [2510/4276] eta: 1:26:39 lr: 1.3597581497078066e-05 loss: 0.0873 (0.0873) time: 2.9384 data: 0.0076 max mem: 33303 Epoch: [30] [2520/4276] eta: 1:26:10 lr: 1.3594540930756506e-05 loss: 0.0873 (0.0873) time: 2.9529 data: 0.0071 max mem: 33303 Epoch: [30] [2530/4276] eta: 1:25:40 lr: 1.359150028887133e-05 loss: 0.0727 (0.0873) time: 2.9524 data: 0.0072 max mem: 33303 Epoch: [30] [2540/4276] eta: 1:25:11 lr: 1.3588459571401885e-05 loss: 0.0779 (0.0873) time: 2.9528 data: 0.0070 max mem: 33303 Epoch: [30] [2550/4276] eta: 1:24:42 lr: 1.3585418778327499e-05 loss: 0.0767 (0.0872) time: 2.9519 data: 0.0070 max mem: 33303 Epoch: [30] [2560/4276] eta: 1:24:12 lr: 1.358237790962748e-05 loss: 0.0716 (0.0872) time: 2.9514 data: 0.0071 max mem: 33303 Epoch: [30] [2570/4276] eta: 1:23:43 lr: 1.3579336965281136e-05 loss: 0.0798 (0.0872) time: 2.9519 data: 0.0072 max mem: 33303 Epoch: [30] [2580/4276] eta: 1:23:13 lr: 1.3576295945267765e-05 loss: 0.0848 (0.0872) time: 2.9525 data: 0.0072 max mem: 33303 Epoch: [30] [2590/4276] eta: 1:22:44 lr: 1.3573254849566652e-05 loss: 0.0795 (0.0872) time: 2.9519 data: 0.0070 max mem: 33303 Epoch: [30] [2600/4276] eta: 1:22:15 lr: 1.3570213678157064e-05 loss: 0.0775 (0.0872) time: 2.9528 data: 0.0071 max mem: 33303 Epoch: [30] [2610/4276] eta: 1:21:45 lr: 1.3567172431018268e-05 loss: 0.0810 (0.0872) time: 2.9530 data: 0.0070 max mem: 33303 Epoch: [30] [2620/4276] eta: 1:21:16 lr: 1.3564131108129513e-05 loss: 0.0869 (0.0872) time: 2.9521 data: 0.0070 max mem: 33303 Epoch: [30] [2630/4276] eta: 1:20:46 lr: 1.3561089709470046e-05 loss: 0.0797 (0.0871) time: 
2.9511 data: 0.0071 max mem: 33303 Epoch: [30] [2640/4276] eta: 1:20:17 lr: 1.3558048235019089e-05 loss: 0.0782 (0.0871) time: 2.9514 data: 0.0072 max mem: 33303 Epoch: [30] [2650/4276] eta: 1:19:48 lr: 1.355500668475586e-05 loss: 0.0821 (0.0871) time: 2.9537 data: 0.0074 max mem: 33303 Epoch: [30] [2660/4276] eta: 1:19:18 lr: 1.3551965058659577e-05 loss: 0.0782 (0.0871) time: 2.9533 data: 0.0075 max mem: 33303 Epoch: [30] [2670/4276] eta: 1:18:49 lr: 1.3548923356709428e-05 loss: 0.0845 (0.0871) time: 2.9525 data: 0.0075 max mem: 33303 Epoch: [30] [2680/4276] eta: 1:18:19 lr: 1.35458815788846e-05 loss: 0.0847 (0.0871) time: 2.9576 data: 0.0078 max mem: 33303 Epoch: [30] [2690/4276] eta: 1:17:50 lr: 1.3542839725164271e-05 loss: 0.0808 (0.0871) time: 2.9596 data: 0.0080 max mem: 33303 Epoch: [30] [2700/4276] eta: 1:17:21 lr: 1.3539797795527612e-05 loss: 0.0730 (0.0870) time: 2.9544 data: 0.0080 max mem: 33303 Epoch: [30] [2710/4276] eta: 1:16:51 lr: 1.3536755789953761e-05 loss: 0.0792 (0.0870) time: 2.9535 data: 0.0079 max mem: 33303 Epoch: [30] [2720/4276] eta: 1:16:22 lr: 1.353371370842187e-05 loss: 0.0783 (0.0869) time: 2.9549 data: 0.0080 max mem: 33303 Epoch: [30] [2730/4276] eta: 1:15:52 lr: 1.353067155091107e-05 loss: 0.0804 (0.0869) time: 2.9542 data: 0.0079 max mem: 33303 Epoch: [30] [2740/4276] eta: 1:15:23 lr: 1.3527629317400486e-05 loss: 0.0844 (0.0869) time: 2.9535 data: 0.0079 max mem: 33303 Epoch: [30] [2750/4276] eta: 1:14:54 lr: 1.352458700786922e-05 loss: 0.0798 (0.0869) time: 2.9533 data: 0.0080 max mem: 33303 Epoch: [30] [2760/4276] eta: 1:14:24 lr: 1.352154462229637e-05 loss: 0.0733 (0.0869) time: 2.9450 data: 0.0081 max mem: 33303 Epoch: [30] [2770/4276] eta: 1:13:55 lr: 1.3518502160661032e-05 loss: 0.0835 (0.0869) time: 2.9457 data: 0.0082 max mem: 33303 Epoch: [30] [2780/4276] eta: 1:13:25 lr: 1.3515459622942277e-05 loss: 0.0866 (0.0869) time: 2.9546 data: 0.0077 max mem: 33303 Epoch: [30] [2790/4276] eta: 1:12:56 lr: 1.3512417009119175e-05 
loss: 0.0854 (0.0869) time: 2.9531 data: 0.0074 max mem: 33303 Epoch: [30] [2800/4276] eta: 1:12:27 lr: 1.3509374319170773e-05 loss: 0.0818 (0.0869) time: 2.9547 data: 0.0078 max mem: 33303 Epoch: [30] [2810/4276] eta: 1:11:57 lr: 1.3506331553076119e-05 loss: 0.0690 (0.0868) time: 2.9549 data: 0.0081 max mem: 33303 Epoch: [30] [2820/4276] eta: 1:11:28 lr: 1.3503288710814254e-05 loss: 0.0714 (0.0868) time: 2.9529 data: 0.0079 max mem: 33303 Epoch: [30] [2830/4276] eta: 1:10:58 lr: 1.3500245792364185e-05 loss: 0.0863 (0.0868) time: 2.9538 data: 0.0077 max mem: 33303 Epoch: [30] [2840/4276] eta: 1:10:29 lr: 1.3497202797704934e-05 loss: 0.0863 (0.0868) time: 2.9581 data: 0.0081 max mem: 33303 Epoch: [30] [2850/4276] eta: 1:10:00 lr: 1.3494159726815492e-05 loss: 0.0917 (0.0869) time: 2.9508 data: 0.0087 max mem: 33303 Epoch: [30] [2860/4276] eta: 1:09:30 lr: 1.3491116579674859e-05 loss: 0.0793 (0.0869) time: 2.9488 data: 0.0082 max mem: 33303 Epoch: [30] [2870/4276] eta: 1:09:01 lr: 1.3488073356262002e-05 loss: 0.0738 (0.0868) time: 2.9606 data: 0.0074 max mem: 33303 Epoch: [30] [2880/4276] eta: 1:08:31 lr: 1.348503005655589e-05 loss: 0.0811 (0.0868) time: 2.9569 data: 0.0074 max mem: 33303 Epoch: [30] [2890/4276] eta: 1:08:02 lr: 1.348198668053548e-05 loss: 0.0805 (0.0869) time: 2.9536 data: 0.0078 max mem: 33303 Epoch: [30] [2900/4276] eta: 1:07:33 lr: 1.3478943228179722e-05 loss: 0.0784 (0.0868) time: 2.9558 data: 0.0076 max mem: 33303 Epoch: [30] [2910/4276] eta: 1:07:03 lr: 1.347589969946754e-05 loss: 0.0808 (0.0868) time: 2.9532 data: 0.0070 max mem: 33303 Epoch: [30] [2920/4276] eta: 1:06:34 lr: 1.3472856094377857e-05 loss: 0.0872 (0.0869) time: 2.9528 data: 0.0069 max mem: 33303 Epoch: [30] [2930/4276] eta: 1:06:04 lr: 1.3469812412889585e-05 loss: 0.0918 (0.0869) time: 2.9532 data: 0.0070 max mem: 33303 Epoch: [30] [2940/4276] eta: 1:05:35 lr: 1.3466768654981635e-05 loss: 0.0844 (0.0869) time: 2.9526 data: 0.0070 max mem: 33303 Epoch: [30] [2950/4276] eta: 
1:05:05 lr: 1.3463724820632879e-05 loss: 0.0879 (0.0869) time: 2.9521 data: 0.0070 max mem: 33303 Epoch: [30] [2960/4276] eta: 1:04:36 lr: 1.3460680909822202e-05 loss: 0.0889 (0.0869) time: 2.9542 data: 0.0070 max mem: 33303 Epoch: [30] [2970/4276] eta: 1:04:07 lr: 1.3457636922528474e-05 loss: 0.0889 (0.0869) time: 2.9541 data: 0.0069 max mem: 33303 Epoch: [30] [2980/4276] eta: 1:03:37 lr: 1.3454592858730545e-05 loss: 0.0810 (0.0869) time: 2.9509 data: 0.0070 max mem: 33303 Epoch: [30] [2990/4276] eta: 1:03:08 lr: 1.3451548718407258e-05 loss: 0.0810 (0.0869) time: 2.9508 data: 0.0070 max mem: 33303 Epoch: [30] [3000/4276] eta: 1:02:38 lr: 1.3448504501537452e-05 loss: 0.0803 (0.0869) time: 2.9458 data: 0.0069 max mem: 33303 Epoch: [30] [3010/4276] eta: 1:02:09 lr: 1.344546020809995e-05 loss: 0.0818 (0.0868) time: 2.9222 data: 0.0078 max mem: 33303 Epoch: [30] [3020/4276] eta: 1:01:39 lr: 1.3442415838073552e-05 loss: 0.0818 (0.0868) time: 2.9286 data: 0.0085 max mem: 33303 Epoch: [30] [3030/4276] eta: 1:01:10 lr: 1.3439371391437067e-05 loss: 0.0757 (0.0868) time: 2.9561 data: 0.0083 max mem: 33303 Epoch: [30] [3040/4276] eta: 1:00:40 lr: 1.3436326868169277e-05 loss: 0.0962 (0.0869) time: 2.9555 data: 0.0084 max mem: 33303 Epoch: [30] [3050/4276] eta: 1:00:11 lr: 1.343328226824897e-05 loss: 0.0962 (0.0869) time: 2.9551 data: 0.0084 max mem: 33303 Epoch: [30] [3060/4276] eta: 0:59:41 lr: 1.3430237591654899e-05 loss: 0.0729 (0.0868) time: 2.9554 data: 0.0089 max mem: 33303 Epoch: [30] [3070/4276] eta: 0:59:12 lr: 1.342719283836582e-05 loss: 0.0722 (0.0868) time: 2.9552 data: 0.0089 max mem: 33303 Epoch: [30] [3080/4276] eta: 0:58:43 lr: 1.3424148008360485e-05 loss: 0.0766 (0.0867) time: 2.9557 data: 0.0087 max mem: 33303 Epoch: [30] [3090/4276] eta: 0:58:13 lr: 1.3421103101617624e-05 loss: 0.0766 (0.0868) time: 2.9551 data: 0.0087 max mem: 33303 Epoch: [30] [3100/4276] eta: 0:57:44 lr: 1.3418058118115947e-05 loss: 0.0735 (0.0867) time: 2.9546 data: 0.0088 max mem: 33303 
Epoch: [30] [3110/4276] eta: 0:57:14 lr: 1.3415013057834175e-05 loss: 0.0712 (0.0867) time: 2.9544 data: 0.0087 max mem: 33303 Epoch: [30] [3120/4276] eta: 0:56:45 lr: 1.3411967920751003e-05 loss: 0.0730 (0.0867) time: 2.9559 data: 0.0088 max mem: 33303 Epoch: [30] [3130/4276] eta: 0:56:16 lr: 1.3408922706845123e-05 loss: 0.0834 (0.0867) time: 2.9568 data: 0.0088 max mem: 33303 Epoch: [30] [3140/4276] eta: 0:55:46 lr: 1.3405877416095196e-05 loss: 0.0842 (0.0866) time: 2.9564 data: 0.0086 max mem: 33303 Epoch: [30] [3150/4276] eta: 0:55:17 lr: 1.3402832048479903e-05 loss: 0.0772 (0.0866) time: 2.9551 data: 0.0083 max mem: 33303 Epoch: [30] [3160/4276] eta: 0:54:47 lr: 1.3399786603977885e-05 loss: 0.0787 (0.0866) time: 2.9544 data: 0.0082 max mem: 33303 Epoch: [30] [3170/4276] eta: 0:54:18 lr: 1.3396741082567796e-05 loss: 0.0793 (0.0866) time: 2.9533 data: 0.0082 max mem: 33303 Epoch: [30] [3180/4276] eta: 0:53:48 lr: 1.3393695484228253e-05 loss: 0.0835 (0.0866) time: 2.9538 data: 0.0084 max mem: 33303 Epoch: [30] [3190/4276] eta: 0:53:19 lr: 1.339064980893788e-05 loss: 0.0840 (0.0867) time: 2.9557 data: 0.0085 max mem: 33303 Epoch: [30] [3200/4276] eta: 0:52:50 lr: 1.3387604056675288e-05 loss: 0.0791 (0.0866) time: 2.9521 data: 0.0084 max mem: 33303 Epoch: [30] [3210/4276] eta: 0:52:20 lr: 1.3384558227419077e-05 loss: 0.0736 (0.0866) time: 2.9524 data: 0.0086 max mem: 33303 Epoch: [30] [3220/4276] eta: 0:51:51 lr: 1.338151232114782e-05 loss: 0.0775 (0.0866) time: 2.9548 data: 0.0085 max mem: 33303 Epoch: [30] [3230/4276] eta: 0:51:21 lr: 1.3378466337840098e-05 loss: 0.0820 (0.0866) time: 2.9543 data: 0.0081 max mem: 33303 Epoch: [30] [3240/4276] eta: 0:50:52 lr: 1.3375420277474474e-05 loss: 0.0854 (0.0866) time: 2.9548 data: 0.0080 max mem: 33303 Epoch: [30] [3250/4276] eta: 0:50:22 lr: 1.3372374140029501e-05 loss: 0.0854 (0.0866) time: 2.9554 data: 0.0081 max mem: 33303 Epoch: [30] [3260/4276] eta: 0:49:53 lr: 1.3369327925483707e-05 loss: 0.0863 (0.0866) time: 
2.9552 data: 0.0081 max mem: 33303 Epoch: [30] [3270/4276] eta: 0:49:23 lr: 1.3366281633815631e-05 loss: 0.0849 (0.0866) time: 2.9535 data: 0.0080 max mem: 33303 Epoch: [30] [3280/4276] eta: 0:48:54 lr: 1.336323526500379e-05 loss: 0.0841 (0.0866) time: 2.9532 data: 0.0080 max mem: 33303 Epoch: [30] [3290/4276] eta: 0:48:25 lr: 1.3360188819026684e-05 loss: 0.0919 (0.0867) time: 2.9533 data: 0.0080 max mem: 33303 Epoch: [30] [3300/4276] eta: 0:47:55 lr: 1.3357142295862804e-05 loss: 0.0999 (0.0867) time: 2.9542 data: 0.0080 max mem: 33303 Epoch: [30] [3310/4276] eta: 0:47:26 lr: 1.335409569549064e-05 loss: 0.0956 (0.0867) time: 2.9393 data: 0.0088 max mem: 33303 Epoch: [30] [3320/4276] eta: 0:46:56 lr: 1.3351049017888664e-05 loss: 0.1004 (0.0868) time: 2.9154 data: 0.0090 max mem: 33303 Epoch: [30] [3330/4276] eta: 0:46:27 lr: 1.3348002263035325e-05 loss: 0.0862 (0.0868) time: 2.9265 data: 0.0089 max mem: 33303 Epoch: [30] [3340/4276] eta: 0:45:57 lr: 1.3344955430909078e-05 loss: 0.0862 (0.0868) time: 2.9500 data: 0.0089 max mem: 33303 Epoch: [30] [3350/4276] eta: 0:45:28 lr: 1.3341908521488359e-05 loss: 0.0876 (0.0868) time: 2.9546 data: 0.0084 max mem: 33303 Epoch: [30] [3360/4276] eta: 0:44:58 lr: 1.3338861534751595e-05 loss: 0.0776 (0.0868) time: 2.9565 data: 0.0086 max mem: 33303 Epoch: [30] [3370/4276] eta: 0:44:29 lr: 1.3335814470677197e-05 loss: 0.0820 (0.0868) time: 2.9568 data: 0.0086 max mem: 33303 Epoch: [30] [3380/4276] eta: 0:43:59 lr: 1.3332767329243562e-05 loss: 0.0838 (0.0868) time: 2.9546 data: 0.0084 max mem: 33303 Epoch: [30] [3390/4276] eta: 0:43:30 lr: 1.3329720110429086e-05 loss: 0.0770 (0.0868) time: 2.9542 data: 0.0084 max mem: 33303 Epoch: [30] [3400/4276] eta: 0:43:01 lr: 1.3326672814212158e-05 loss: 0.0825 (0.0868) time: 2.9545 data: 0.0086 max mem: 33303 Epoch: [30] [3410/4276] eta: 0:42:31 lr: 1.3323625440571125e-05 loss: 0.0862 (0.0869) time: 2.9552 data: 0.0086 max mem: 33303 Epoch: [30] [3420/4276] eta: 0:42:02 lr: 
1.3320577989484351e-05 loss: 0.0872 (0.0869) time: 2.9520 data: 0.0088 max mem: 33303 Epoch: [30] [3430/4276] eta: 0:41:32 lr: 1.3317530460930186e-05 loss: 0.0966 (0.0869) time: 2.9515 data: 0.0088 max mem: 33303 Epoch: [30] [3440/4276] eta: 0:41:03 lr: 1.3314482854886962e-05 loss: 0.0815 (0.0869) time: 2.9571 data: 0.0086 max mem: 33303 Epoch: [30] [3450/4276] eta: 0:40:33 lr: 1.3311435171332991e-05 loss: 0.0848 (0.0870) time: 2.9569 data: 0.0086 max mem: 33303 Epoch: [30] [3460/4276] eta: 0:40:04 lr: 1.330838741024659e-05 loss: 0.1021 (0.0870) time: 2.9543 data: 0.0085 max mem: 33303 Epoch: [30] [3470/4276] eta: 0:39:34 lr: 1.3305339571606054e-05 loss: 0.0859 (0.0870) time: 2.9537 data: 0.0084 max mem: 33303 Epoch: [30] [3480/4276] eta: 0:39:05 lr: 1.3302291655389676e-05 loss: 0.0845 (0.0870) time: 2.9535 data: 0.0086 max mem: 33303 Epoch: [30] [3490/4276] eta: 0:38:36 lr: 1.3299243661575723e-05 loss: 0.0825 (0.0870) time: 2.9553 data: 0.0083 max mem: 33303 Epoch: [30] [3500/4276] eta: 0:38:06 lr: 1.3296195590142457e-05 loss: 0.0825 (0.0870) time: 2.9613 data: 0.0079 max mem: 33303 Epoch: [30] [3510/4276] eta: 0:37:37 lr: 1.3293147441068135e-05 loss: 0.0780 (0.0870) time: 2.9601 data: 0.0083 max mem: 33303 Epoch: [30] [3520/4276] eta: 0:37:07 lr: 1.3290099214331e-05 loss: 0.0780 (0.0870) time: 2.9547 data: 0.0087 max mem: 33303 Epoch: [30] [3530/4276] eta: 0:36:38 lr: 1.3287050909909269e-05 loss: 0.0838 (0.0870) time: 2.9533 data: 0.0083 max mem: 33303 Epoch: [30] [3540/4276] eta: 0:36:08 lr: 1.3284002527781167e-05 loss: 0.0859 (0.0870) time: 2.9530 data: 0.0080 max mem: 33303 Epoch: [30] [3550/4276] eta: 0:35:39 lr: 1.3280954067924895e-05 loss: 0.0859 (0.0870) time: 2.9548 data: 0.0081 max mem: 33303 Epoch: [30] [3560/4276] eta: 0:35:09 lr: 1.3277905530318654e-05 loss: 0.0868 (0.0870) time: 2.9534 data: 0.0081 max mem: 33303 Epoch: [30] [3570/4276] eta: 0:34:40 lr: 1.3274856914940614e-05 loss: 0.0868 (0.0870) time: 2.9511 data: 0.0082 max mem: 33303 Epoch: [30] 
[3580/4276] eta: 0:34:10 lr: 1.3271808221768952e-05 loss: 0.0786 (0.0870) time: 2.9513 data: 0.0081 max mem: 33303 Epoch: [30] [3590/4276] eta: 0:33:41 lr: 1.326875945078183e-05 loss: 0.0782 (0.0870) time: 2.9540 data: 0.0080 max mem: 33303 Epoch: [30] [3600/4276] eta: 0:33:12 lr: 1.3265710601957382e-05 loss: 0.0751 (0.0870) time: 2.9543 data: 0.0082 max mem: 33303 Epoch: [30] [3610/4276] eta: 0:32:42 lr: 1.3262661675273752e-05 loss: 0.0796 (0.0870) time: 2.9543 data: 0.0083 max mem: 33303 Epoch: [30] [3620/4276] eta: 0:32:13 lr: 1.325961267070906e-05 loss: 0.0778 (0.0869) time: 2.9560 data: 0.0080 max mem: 33303 Epoch: [30] [3630/4276] eta: 0:31:43 lr: 1.3256563588241422e-05 loss: 0.0820 (0.0870) time: 2.9547 data: 0.0080 max mem: 33303 Epoch: [30] [3640/4276] eta: 0:31:14 lr: 1.3253514427848931e-05 loss: 0.0852 (0.0869) time: 2.9528 data: 0.0080 max mem: 33303 Epoch: [30] [3650/4276] eta: 0:30:44 lr: 1.3250465189509675e-05 loss: 0.0810 (0.0869) time: 2.9527 data: 0.0080 max mem: 33303 Epoch: [30] [3660/4276] eta: 0:30:15 lr: 1.3247415873201734e-05 loss: 0.0808 (0.0869) time: 2.9543 data: 0.0081 max mem: 33303 Epoch: [30] [3670/4276] eta: 0:29:45 lr: 1.3244366478903177e-05 loss: 0.0774 (0.0869) time: 2.9546 data: 0.0081 max mem: 33303 Epoch: [30] [3680/4276] eta: 0:29:16 lr: 1.3241317006592044e-05 loss: 0.0774 (0.0869) time: 2.9538 data: 0.0080 max mem: 33303 Epoch: [30] [3690/4276] eta: 0:28:46 lr: 1.3238267456246384e-05 loss: 0.0839 (0.0869) time: 2.9543 data: 0.0080 max mem: 33303 Epoch: [30] [3700/4276] eta: 0:28:17 lr: 1.323521782784422e-05 loss: 0.0889 (0.0869) time: 2.9540 data: 0.0079 max mem: 33303 Epoch: [30] [3710/4276] eta: 0:27:48 lr: 1.3232168121363581e-05 loss: 0.0781 (0.0869) time: 2.9528 data: 0.0079 max mem: 33303 Epoch: [30] [3720/4276] eta: 0:27:18 lr: 1.3229118336782456e-05 loss: 0.0727 (0.0869) time: 2.9534 data: 0.0080 max mem: 33303 Epoch: [30] [3730/4276] eta: 0:26:49 lr: 1.3226068474078848e-05 loss: 0.0919 (0.0869) time: 2.9539 data: 
0.0080 max mem: 33303 Epoch: [30] [3740/4276] eta: 0:26:19 lr: 1.3223018533230738e-05 loss: 0.0806 (0.0869) time: 2.9523 data: 0.0080 max mem: 33303 Epoch: [30] [3750/4276] eta: 0:25:50 lr: 1.3219968514216097e-05 loss: 0.0785 (0.0869) time: 2.9527 data: 0.0080 max mem: 33303 Epoch: [30] [3760/4276] eta: 0:25:20 lr: 1.3216918417012878e-05 loss: 0.0765 (0.0868) time: 2.9582 data: 0.0081 max mem: 33303 Epoch: [30] [3770/4276] eta: 0:24:51 lr: 1.3213868241599028e-05 loss: 0.0728 (0.0868) time: 2.9605 data: 0.0083 max mem: 33303 Epoch: [30] [3780/4276] eta: 0:24:21 lr: 1.3210817987952481e-05 loss: 0.0777 (0.0868) time: 2.9565 data: 0.0085 max mem: 33303 Epoch: [30] [3790/4276] eta: 0:23:52 lr: 1.320776765605117e-05 loss: 0.0806 (0.0868) time: 2.9546 data: 0.0082 max mem: 33303 Epoch: [30] [3800/4276] eta: 0:23:22 lr: 1.320471724587299e-05 loss: 0.0806 (0.0868) time: 2.9546 data: 0.0080 max mem: 33303 Epoch: [30] [3810/4276] eta: 0:22:53 lr: 1.3201666757395841e-05 loss: 0.0774 (0.0868) time: 2.9541 data: 0.0080 max mem: 33303 Epoch: [30] [3820/4276] eta: 0:22:23 lr: 1.3198616190597619e-05 loss: 0.0750 (0.0867) time: 2.9550 data: 0.0080 max mem: 33303 Epoch: [30] [3830/4276] eta: 0:21:54 lr: 1.3195565545456196e-05 loss: 0.0733 (0.0867) time: 2.9557 data: 0.0080 max mem: 33303 Epoch: [30] [3840/4276] eta: 0:21:25 lr: 1.3192514821949426e-05 loss: 0.0744 (0.0867) time: 2.9584 data: 0.0080 max mem: 33303 Epoch: [30] [3850/4276] eta: 0:20:55 lr: 1.3189464020055166e-05 loss: 0.0647 (0.0867) time: 2.9478 data: 0.0081 max mem: 33303 Epoch: [30] [3860/4276] eta: 0:20:26 lr: 1.3186413139751255e-05 loss: 0.0896 (0.0867) time: 2.9448 data: 0.0085 max mem: 33303 Epoch: [30] [3870/4276] eta: 0:19:56 lr: 1.3183362181015524e-05 loss: 0.0896 (0.0867) time: 2.9549 data: 0.0086 max mem: 33303 Epoch: [30] [3880/4276] eta: 0:19:27 lr: 1.318031114382578e-05 loss: 0.0801 (0.0867) time: 2.9550 data: 0.0082 max mem: 33303 Epoch: [30] [3890/4276] eta: 0:18:57 lr: 1.3177260028159826e-05 loss: 
0.0817 (0.0867) time: 2.9539 data: 0.0080 max mem: 33303 Epoch: [30] [3900/4276] eta: 0:18:28 lr: 1.3174208833995464e-05 loss: 0.0834 (0.0867) time: 2.9534 data: 0.0080 max mem: 33303 Epoch: [30] [3910/4276] eta: 0:17:58 lr: 1.3171157561310454e-05 loss: 0.0822 (0.0866) time: 2.9546 data: 0.0080 max mem: 33303 Epoch: [30] [3920/4276] eta: 0:17:29 lr: 1.3168106210082575e-05 loss: 0.0713 (0.0867) time: 2.9540 data: 0.0082 max mem: 33303 Epoch: [30] [3930/4276] eta: 0:16:59 lr: 1.3165054780289581e-05 loss: 0.0771 (0.0867) time: 2.9544 data: 0.0082 max mem: 33303 Epoch: [30] [3940/4276] eta: 0:16:30 lr: 1.3162003271909221e-05 loss: 0.0771 (0.0866) time: 2.9530 data: 0.0080 max mem: 33303 Epoch: [30] [3950/4276] eta: 0:16:00 lr: 1.315895168491921e-05 loss: 0.0763 (0.0866) time: 2.9538 data: 0.0080 max mem: 33303 Epoch: [30] [3960/4276] eta: 0:15:31 lr: 1.3155900019297273e-05 loss: 0.0855 (0.0866) time: 2.9540 data: 0.0082 max mem: 33303 Epoch: [30] [3970/4276] eta: 0:15:01 lr: 1.3152848275021124e-05 loss: 0.0880 (0.0866) time: 2.9508 data: 0.0084 max mem: 33303 Epoch: [30] [3980/4276] eta: 0:14:32 lr: 1.3149796452068453e-05 loss: 0.0766 (0.0866) time: 2.9578 data: 0.0082 max mem: 33303 Epoch: [30] [3990/4276] eta: 0:14:02 lr: 1.3146744550416937e-05 loss: 0.0879 (0.0866) time: 2.9512 data: 0.0080 max mem: 33303 Epoch: [30] [4000/4276] eta: 0:13:33 lr: 1.3143692570044253e-05 loss: 0.0871 (0.0866) time: 2.9328 data: 0.0090 max mem: 33303 Epoch: [30] [4010/4276] eta: 0:13:04 lr: 1.3140640510928053e-05 loss: 0.0833 (0.0866) time: 2.9313 data: 0.0093 max mem: 33303 Epoch: [30] [4020/4276] eta: 0:12:34 lr: 1.3137588373045997e-05 loss: 0.0834 (0.0866) time: 2.9447 data: 0.0079 max mem: 33303 Epoch: [30] [4030/4276] eta: 0:12:05 lr: 1.3134536156375702e-05 loss: 0.0790 (0.0866) time: 2.9534 data: 0.0072 max mem: 33303 Epoch: [30] [4040/4276] eta: 0:11:35 lr: 1.3131483860894797e-05 loss: 0.0772 (0.0866) time: 2.9530 data: 0.0070 max mem: 33303 Epoch: [30] [4050/4276] eta: 0:11:06 
lr: 1.3128431486580891e-05 loss: 0.0775 (0.0866) time: 2.9465 data: 0.0070 max mem: 33303 Epoch: [30] [4060/4276] eta: 0:10:36 lr: 1.312537903341159e-05 loss: 0.0775 (0.0866) time: 2.9456 data: 0.0071 max mem: 33303 Epoch: [30] [4070/4276] eta: 0:10:07 lr: 1.3122326501364462e-05 loss: 0.0864 (0.0866) time: 2.9516 data: 0.0072 max mem: 33303 Epoch: [30] [4080/4276] eta: 0:09:37 lr: 1.3119273890417095e-05 loss: 0.0829 (0.0866) time: 2.9517 data: 0.0070 max mem: 33303 Epoch: [30] [4090/4276] eta: 0:09:08 lr: 1.3116221200547043e-05 loss: 0.0829 (0.0866) time: 2.9534 data: 0.0072 max mem: 33303 Epoch: [30] [4100/4276] eta: 0:08:38 lr: 1.3113168431731862e-05 loss: 0.0895 (0.0867) time: 2.9517 data: 0.0072 max mem: 33303 Epoch: [30] [4110/4276] eta: 0:08:09 lr: 1.3110115583949082e-05 loss: 0.0869 (0.0867) time: 2.9525 data: 0.0070 max mem: 33303 Epoch: [30] [4120/4276] eta: 0:07:39 lr: 1.3107062657176226e-05 loss: 0.0914 (0.0867) time: 2.9529 data: 0.0071 max mem: 33303 Epoch: [30] [4130/4276] eta: 0:07:10 lr: 1.3104009651390811e-05 loss: 0.0784 (0.0867) time: 2.9536 data: 0.0075 max mem: 33303 Epoch: [30] [4140/4276] eta: 0:06:40 lr: 1.3100956566570342e-05 loss: 0.0765 (0.0867) time: 2.9531 data: 0.0076 max mem: 33303 Epoch: [30] [4150/4276] eta: 0:06:11 lr: 1.3097903402692294e-05 loss: 0.0747 (0.0867) time: 2.9522 data: 0.0072 max mem: 33303 Epoch: [30] [4160/4276] eta: 0:05:41 lr: 1.3094850159734148e-05 loss: 0.0769 (0.0867) time: 2.9529 data: 0.0071 max mem: 33303 Epoch: [30] [4170/4276] eta: 0:05:12 lr: 1.309179683767337e-05 loss: 0.0902 (0.0867) time: 2.9514 data: 0.0070 max mem: 33303 Epoch: [30] [4180/4276] eta: 0:04:42 lr: 1.3088743436487416e-05 loss: 0.0889 (0.0867) time: 2.9504 data: 0.0071 max mem: 33303 Epoch: [30] [4190/4276] eta: 0:04:13 lr: 1.3085689956153712e-05 loss: 0.0851 (0.0867) time: 2.9501 data: 0.0071 max mem: 33303 Epoch: [30] [4200/4276] eta: 0:03:44 lr: 1.3082636396649692e-05 loss: 0.0905 (0.0867) time: 2.9526 data: 0.0073 max mem: 33303 Epoch: 
[30] [4210/4276] eta: 0:03:14 lr: 1.3079582757952778e-05 loss: 0.0908 (0.0867) time: 2.9605 data: 0.0075 max mem: 33303 Epoch: [30] [4220/4276] eta: 0:02:45 lr: 1.3076529040040352e-05 loss: 0.0931 (0.0868) time: 2.9595 data: 0.0073 max mem: 33303 Epoch: [30] [4230/4276] eta: 0:02:15 lr: 1.3073475242889816e-05 loss: 0.0864 (0.0868) time: 2.9525 data: 0.0071 max mem: 33303 Epoch: [30] [4240/4276] eta: 0:01:46 lr: 1.3070421366478547e-05 loss: 0.0959 (0.0868) time: 2.9530 data: 0.0074 max mem: 33303 Epoch: [30] [4250/4276] eta: 0:01:16 lr: 1.3067367410783914e-05 loss: 0.0959 (0.0868) time: 2.9503 data: 0.0075 max mem: 33303 Epoch: [30] [4260/4276] eta: 0:00:47 lr: 1.306431337578326e-05 loss: 0.0803 (0.0868) time: 2.9307 data: 0.0080 max mem: 33303 Epoch: [30] [4270/4276] eta: 0:00:17 lr: 1.3061259261453932e-05 loss: 0.0895 (0.0868) time: 2.9092 data: 0.0078 max mem: 33303 Epoch: [30] Total time: 3:30:04 Test: [ 0/21770] eta: 11:18:45 time: 1.8707 data: 1.8323 max mem: 33303 Test: [ 100/21770] eta: 0:21:01 time: 0.0380 data: 0.0008 max mem: 33303 Test: [ 200/21770] eta: 0:17:23 time: 0.0386 data: 0.0008 max mem: 33303 Test: [ 300/21770] eta: 0:16:07 time: 0.0384 data: 0.0008 max mem: 33303 Test: [ 400/21770] eta: 0:15:28 time: 0.0386 data: 0.0008 max mem: 33303 Test: [ 500/21770] eta: 0:15:02 time: 0.0383 data: 0.0008 max mem: 33303 Test: [ 600/21770] eta: 0:14:44 time: 0.0387 data: 0.0008 max mem: 33303 Test: [ 700/21770] eta: 0:14:30 time: 0.0382 data: 0.0008 max mem: 33303 Test: [ 800/21770] eta: 0:14:19 time: 0.0390 data: 0.0008 max mem: 33303 Test: [ 900/21770] eta: 0:14:09 time: 0.0383 data: 0.0009 max mem: 33303 Test: [ 1000/21770] eta: 0:14:00 time: 0.0379 data: 0.0009 max mem: 33303 Test: [ 1100/21770] eta: 0:13:51 time: 0.0381 data: 0.0009 max mem: 33303 Test: [ 1200/21770] eta: 0:13:43 time: 0.0379 data: 0.0009 max mem: 33303 Test: [ 1300/21770] eta: 0:13:36 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 1400/21770] eta: 0:13:29 time: 0.0380 data: 0.0009 
max mem: 33303 Test: [ 1500/21770] eta: 0:13:23 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 1600/21770] eta: 0:13:18 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 1700/21770] eta: 0:13:13 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 1800/21770] eta: 0:13:07 time: 0.0383 data: 0.0009 max mem: 33303 Test: [ 1900/21770] eta: 0:13:02 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 2000/21770] eta: 0:12:57 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 2100/21770] eta: 0:12:53 time: 0.0384 data: 0.0008 max mem: 33303 Test: [ 2200/21770] eta: 0:12:48 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 2300/21770] eta: 0:12:44 time: 0.0388 data: 0.0008 max mem: 33303 Test: [ 2400/21770] eta: 0:12:40 time: 0.0395 data: 0.0008 max mem: 33303 Test: [ 2500/21770] eta: 0:12:36 time: 0.0392 data: 0.0008 max mem: 33303 Test: [ 2600/21770] eta: 0:12:32 time: 0.0399 data: 0.0008 max mem: 33303 Test: [ 2700/21770] eta: 0:12:28 time: 0.0378 data: 0.0009 max mem: 33303 Test: [ 2800/21770] eta: 0:12:24 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 2900/21770] eta: 0:12:19 time: 0.0381 data: 0.0009 max mem: 33303 Test: [ 3000/21770] eta: 0:12:14 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 3100/21770] eta: 0:12:10 time: 0.0380 data: 0.0009 max mem: 33303 Test: [ 3200/21770] eta: 0:12:05 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 3300/21770] eta: 0:12:01 time: 0.0381 data: 0.0009 max mem: 33303 Test: [ 3400/21770] eta: 0:11:56 time: 0.0381 data: 0.0009 max mem: 33303 Test: [ 3500/21770] eta: 0:11:52 time: 0.0380 data: 0.0009 max mem: 33303 Test: [ 3600/21770] eta: 0:11:48 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 3700/21770] eta: 0:11:43 time: 0.0380 data: 0.0009 max mem: 33303 Test: [ 3800/21770] eta: 0:11:39 time: 0.0381 data: 0.0009 max mem: 33303 Test: [ 3900/21770] eta: 0:11:35 time: 0.0381 data: 0.0009 max mem: 33303 Test: [ 4000/21770] eta: 0:11:31 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 4100/21770] eta: 0:11:26 time: 0.0380 data: 0.0009 
max mem: 33303 Test: [ 4200/21770] eta: 0:11:22 time: 0.0381 data: 0.0009 max mem: 33303 Test: [ 4300/21770] eta: 0:11:18 time: 0.0380 data: 0.0009 max mem: 33303 Test: [ 4400/21770] eta: 0:11:14 time: 0.0386 data: 0.0009 max mem: 33303 Test: [ 4500/21770] eta: 0:11:10 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 4600/21770] eta: 0:11:06 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 4700/21770] eta: 0:11:02 time: 0.0383 data: 0.0009 max mem: 33303 Test: [ 4800/21770] eta: 0:10:58 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 4900/21770] eta: 0:10:54 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 5000/21770] eta: 0:10:50 time: 0.0385 data: 0.0009 max mem: 33303 Test: [ 5100/21770] eta: 0:10:46 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 5200/21770] eta: 0:10:42 time: 0.0386 data: 0.0009 max mem: 33303 Test: [ 5300/21770] eta: 0:10:38 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 5400/21770] eta: 0:10:34 time: 0.0387 data: 0.0009 max mem: 33303 Test: [ 5500/21770] eta: 0:10:30 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 5600/21770] eta: 0:10:26 time: 0.0390 data: 0.0008 max mem: 33303 Test: [ 5700/21770] eta: 0:10:22 time: 0.0394 data: 0.0008 max mem: 33303 Test: [ 5800/21770] eta: 0:10:19 time: 0.0390 data: 0.0008 max mem: 33303 Test: [ 5900/21770] eta: 0:10:15 time: 0.0390 data: 0.0008 max mem: 33303 Test: [ 6000/21770] eta: 0:10:11 time: 0.0386 data: 0.0008 max mem: 33303 Test: [ 6100/21770] eta: 0:10:07 time: 0.0389 data: 0.0008 max mem: 33303 Test: [ 6200/21770] eta: 0:10:03 time: 0.0387 data: 0.0008 max mem: 33303 Test: [ 6300/21770] eta: 0:09:59 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 6400/21770] eta: 0:09:56 time: 0.0387 data: 0.0009 max mem: 33303 Test: [ 6500/21770] eta: 0:09:52 time: 0.0388 data: 0.0008 max mem: 33303 Test: [ 6600/21770] eta: 0:09:48 time: 0.0386 data: 0.0008 max mem: 33303 Test: [ 6700/21770] eta: 0:09:44 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 6800/21770] eta: 0:09:40 time: 0.0391 data: 0.0008 
max mem: 33303 Test: [ 6900/21770] eta: 0:09:36 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 7000/21770] eta: 0:09:33 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 7100/21770] eta: 0:09:29 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 7200/21770] eta: 0:09:25 time: 0.0392 data: 0.0008 max mem: 33303 Test: [ 7300/21770] eta: 0:09:21 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 7400/21770] eta: 0:09:17 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 7500/21770] eta: 0:09:13 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 7600/21770] eta: 0:09:10 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 7700/21770] eta: 0:09:06 time: 0.0392 data: 0.0008 max mem: 33303 Test: [ 7800/21770] eta: 0:09:02 time: 0.0388 data: 0.0008 max mem: 33303 Test: [ 7900/21770] eta: 0:08:58 time: 0.0387 data: 0.0008 max mem: 33303 Test: [ 8000/21770] eta: 0:08:54 time: 0.0388 data: 0.0008 max mem: 33303 Test: [ 8100/21770] eta: 0:08:50 time: 0.0387 data: 0.0008 max mem: 33303 Test: [ 8200/21770] eta: 0:08:46 time: 0.0386 data: 0.0008 max mem: 33303 Test: [ 8300/21770] eta: 0:08:42 time: 0.0387 data: 0.0008 max mem: 33303 Test: [ 8400/21770] eta: 0:08:39 time: 0.0390 data: 0.0008 max mem: 33303 Test: [ 8500/21770] eta: 0:08:35 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 8600/21770] eta: 0:08:31 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 8700/21770] eta: 0:08:27 time: 0.0390 data: 0.0008 max mem: 33303 Test: [ 8800/21770] eta: 0:08:23 time: 0.0391 data: 0.0008 max mem: 33303 Test: [ 8900/21770] eta: 0:08:19 time: 0.0392 data: 0.0008 max mem: 33303 Test: [ 9000/21770] eta: 0:08:15 time: 0.0389 data: 0.0008 max mem: 33303 Test: [ 9100/21770] eta: 0:08:12 time: 0.0389 data: 0.0008 max mem: 33303 Test: [ 9200/21770] eta: 0:08:08 time: 0.0386 data: 0.0008 max mem: 33303 Test: [ 9300/21770] eta: 0:08:04 time: 0.0387 data: 0.0008 max mem: 33303 Test: [ 9400/21770] eta: 0:08:00 time: 0.0385 data: 0.0008 max mem: 33303 Test: [ 9500/21770] eta: 0:07:56 time: 0.0389 data: 0.0008 
max mem: 33303 Test: [ 9600/21770] eta: 0:07:52 time: 0.0387 data: 0.0008 max mem: 33303 Test: [ 9700/21770] eta: 0:07:48 time: 0.0388 data: 0.0008 max mem: 33303 Test: [ 9800/21770] eta: 0:07:44 time: 0.0387 data: 0.0008 max mem: 33303 Test: [ 9900/21770] eta: 0:07:40 time: 0.0390 data: 0.0008 max mem: 33303 Test: [10000/21770] eta: 0:07:36 time: 0.0387 data: 0.0009 max mem: 33303 Test: [10100/21770] eta: 0:07:33 time: 0.0387 data: 0.0008 max mem: 33303 Test: [10200/21770] eta: 0:07:29 time: 0.0387 data: 0.0008 max mem: 33303 Test: [10300/21770] eta: 0:07:25 time: 0.0390 data: 0.0008 max mem: 33303 Test: [10400/21770] eta: 0:07:21 time: 0.0387 data: 0.0008 max mem: 33303 Test: [10500/21770] eta: 0:07:17 time: 0.0389 data: 0.0008 max mem: 33303 Test: [10600/21770] eta: 0:07:13 time: 0.0388 data: 0.0009 max mem: 33303 Test: [10700/21770] eta: 0:07:09 time: 0.0388 data: 0.0008 max mem: 33303 Test: [10800/21770] eta: 0:07:05 time: 0.0391 data: 0.0008 max mem: 33303 Test: [10900/21770] eta: 0:07:02 time: 0.0391 data: 0.0008 max mem: 33303 Test: [11000/21770] eta: 0:06:58 time: 0.0390 data: 0.0008 max mem: 33303 Test: [11100/21770] eta: 0:06:54 time: 0.0388 data: 0.0008 max mem: 33303 Test: [11200/21770] eta: 0:06:50 time: 0.0387 data: 0.0009 max mem: 33303 Test: [11300/21770] eta: 0:06:46 time: 0.0389 data: 0.0008 max mem: 33303 Test: [11400/21770] eta: 0:06:42 time: 0.0391 data: 0.0008 max mem: 33303 Test: [11500/21770] eta: 0:06:38 time: 0.0388 data: 0.0008 max mem: 33303 Test: [11600/21770] eta: 0:06:34 time: 0.0386 data: 0.0008 max mem: 33303 Test: [11700/21770] eta: 0:06:30 time: 0.0384 data: 0.0008 max mem: 33303 Test: [11800/21770] eta: 0:06:27 time: 0.0384 data: 0.0008 max mem: 33303 Test: [11900/21770] eta: 0:06:23 time: 0.0385 data: 0.0008 max mem: 33303 Test: [12000/21770] eta: 0:06:19 time: 0.0385 data: 0.0009 max mem: 33303 Test: [12100/21770] eta: 0:06:15 time: 0.0385 data: 0.0008 max mem: 33303 Test: [12200/21770] eta: 0:06:11 time: 0.0384 data: 0.0009 
max mem: 33303 Test: [12300/21770] eta: 0:06:07 time: 0.0385 data: 0.0008 max mem: 33303 Test: [12400/21770] eta: 0:06:03 time: 0.0385 data: 0.0009 max mem: 33303 Test: [12500/21770] eta: 0:05:59 time: 0.0384 data: 0.0008 max mem: 33303 Test: [12600/21770] eta: 0:05:55 time: 0.0386 data: 0.0009 max mem: 33303 Test: [12700/21770] eta: 0:05:51 time: 0.0384 data: 0.0008 max mem: 33303 Test: [12800/21770] eta: 0:05:47 time: 0.0385 data: 0.0009 max mem: 33303 Test: [12900/21770] eta: 0:05:44 time: 0.0384 data: 0.0009 max mem: 33303 Test: [13000/21770] eta: 0:05:40 time: 0.0386 data: 0.0009 max mem: 33303 Test: [13100/21770] eta: 0:05:36 time: 0.0384 data: 0.0008 max mem: 33303 Test: [13200/21770] eta: 0:05:32 time: 0.0385 data: 0.0008 max mem: 33303 Test: [13300/21770] eta: 0:05:28 time: 0.0385 data: 0.0008 max mem: 33303 Test: [13400/21770] eta: 0:05:24 time: 0.0384 data: 0.0008 max mem: 33303 Test: [13500/21770] eta: 0:05:20 time: 0.0388 data: 0.0008 max mem: 33303 Test: [13600/21770] eta: 0:05:16 time: 0.0389 data: 0.0008 max mem: 33303 Test: [13700/21770] eta: 0:05:12 time: 0.0388 data: 0.0008 max mem: 33303 Test: [13800/21770] eta: 0:05:09 time: 0.0389 data: 0.0008 max mem: 33303 Test: [13900/21770] eta: 0:05:05 time: 0.0387 data: 0.0008 max mem: 33303 Test: [14000/21770] eta: 0:05:01 time: 0.0390 data: 0.0008 max mem: 33303 Test: [14100/21770] eta: 0:04:57 time: 0.0388 data: 0.0009 max mem: 33303 Test: [14200/21770] eta: 0:04:53 time: 0.0389 data: 0.0009 max mem: 33303 Test: [14300/21770] eta: 0:04:49 time: 0.0385 data: 0.0009 max mem: 33303 Test: [14400/21770] eta: 0:04:45 time: 0.0385 data: 0.0009 max mem: 33303 Test: [14500/21770] eta: 0:04:41 time: 0.0385 data: 0.0008 max mem: 33303 Test: [14600/21770] eta: 0:04:38 time: 0.0386 data: 0.0009 max mem: 33303 Test: [14700/21770] eta: 0:04:34 time: 0.0385 data: 0.0008 max mem: 33303 Test: [14800/21770] eta: 0:04:30 time: 0.0384 data: 0.0009 max mem: 33303 Test: [14900/21770] eta: 0:04:26 time: 0.0386 data: 0.0008 
max mem: 33303 Test: [15000/21770] eta: 0:04:22 time: 0.0386 data: 0.0009 max mem: 33303 Test: [15100/21770] eta: 0:04:18 time: 0.0384 data: 0.0008 max mem: 33303 Test: [15200/21770] eta: 0:04:14 time: 0.0385 data: 0.0008 max mem: 33303 Test: [15300/21770] eta: 0:04:10 time: 0.0385 data: 0.0008 max mem: 33303 Test: [15400/21770] eta: 0:04:06 time: 0.0385 data: 0.0008 max mem: 33303 Test: [15500/21770] eta: 0:04:03 time: 0.0385 data: 0.0009 max mem: 33303 Test: [15600/21770] eta: 0:03:59 time: 0.0387 data: 0.0009 max mem: 33303 Test: [15700/21770] eta: 0:03:55 time: 0.0389 data: 0.0009 max mem: 33303 Test: [15800/21770] eta: 0:03:51 time: 0.0392 data: 0.0009 max mem: 33303 Test: [15900/21770] eta: 0:03:47 time: 0.0390 data: 0.0009 max mem: 33303 Test: [16000/21770] eta: 0:03:43 time: 0.0391 data: 0.0009 max mem: 33303 Test: [16100/21770] eta: 0:03:39 time: 0.0392 data: 0.0009 max mem: 33303 Test: [16200/21770] eta: 0:03:35 time: 0.0390 data: 0.0009 max mem: 33303 Test: [16300/21770] eta: 0:03:32 time: 0.0390 data: 0.0009 max mem: 33303 Test: [16400/21770] eta: 0:03:28 time: 0.0390 data: 0.0009 max mem: 33303 Test: [16500/21770] eta: 0:03:24 time: 0.0389 data: 0.0009 max mem: 33303 Test: [16600/21770] eta: 0:03:20 time: 0.0391 data: 0.0009 max mem: 33303 Test: [16700/21770] eta: 0:03:16 time: 0.0389 data: 0.0008 max mem: 33303 Test: [16800/21770] eta: 0:03:12 time: 0.0392 data: 0.0008 max mem: 33303 Test: [16900/21770] eta: 0:03:08 time: 0.0391 data: 0.0008 max mem: 33303 Test: [17000/21770] eta: 0:03:04 time: 0.0390 data: 0.0008 max mem: 33303 Test: [17100/21770] eta: 0:03:01 time: 0.0390 data: 0.0008 max mem: 33303 Test: [17200/21770] eta: 0:02:57 time: 0.0387 data: 0.0008 max mem: 33303 Test: [17300/21770] eta: 0:02:53 time: 0.0391 data: 0.0008 max mem: 33303 Test: [17400/21770] eta: 0:02:49 time: 0.0391 data: 0.0008 max mem: 33303 Test: [17500/21770] eta: 0:02:45 time: 0.0389 data: 0.0008 max mem: 33303 Test: [17600/21770] eta: 0:02:41 time: 0.0391 data: 0.0008 
max mem: 33303 Test: [17700/21770] eta: 0:02:37 time: 0.0389 data: 0.0008 max mem: 33303 Test: [17800/21770] eta: 0:02:33 time: 0.0388 data: 0.0008 max mem: 33303 Test: [17900/21770] eta: 0:02:30 time: 0.0386 data: 0.0008 max mem: 33303 Test: [18000/21770] eta: 0:02:26 time: 0.0388 data: 0.0008 max mem: 33303 Test: [18100/21770] eta: 0:02:22 time: 0.0384 data: 0.0008 max mem: 33303 Test: [18200/21770] eta: 0:02:18 time: 0.0387 data: 0.0008 max mem: 33303 Test: [18300/21770] eta: 0:02:14 time: 0.0386 data: 0.0008 max mem: 33303 Test: [18400/21770] eta: 0:02:10 time: 0.0390 data: 0.0008 max mem: 33303 Test: [18500/21770] eta: 0:02:06 time: 0.0386 data: 0.0009 max mem: 33303 Test: [18600/21770] eta: 0:02:02 time: 0.0387 data: 0.0009 max mem: 33303 Test: [18700/21770] eta: 0:01:59 time: 0.0388 data: 0.0009 max mem: 33303 Test: [18800/21770] eta: 0:01:55 time: 0.0387 data: 0.0009 max mem: 33303 Test: [18900/21770] eta: 0:01:51 time: 0.0388 data: 0.0009 max mem: 33303 Test: [19000/21770] eta: 0:01:47 time: 0.0387 data: 0.0009 max mem: 33303 Test: [19100/21770] eta: 0:01:43 time: 0.0390 data: 0.0009 max mem: 33303 Test: [19200/21770] eta: 0:01:39 time: 0.0388 data: 0.0009 max mem: 33303 Test: [19300/21770] eta: 0:01:35 time: 0.0391 data: 0.0009 max mem: 33303 Test: [19400/21770] eta: 0:01:31 time: 0.0392 data: 0.0009 max mem: 33303 Test: [19500/21770] eta: 0:01:28 time: 0.0392 data: 0.0009 max mem: 33303 Test: [19600/21770] eta: 0:01:24 time: 0.0395 data: 0.0009 max mem: 33303 Test: [19700/21770] eta: 0:01:20 time: 0.0394 data: 0.0009 max mem: 33303 Test: [19800/21770] eta: 0:01:16 time: 0.0390 data: 0.0009 max mem: 33303 Test: [19900/21770] eta: 0:01:12 time: 0.0394 data: 0.0009 max mem: 33303 Test: [20000/21770] eta: 0:01:08 time: 0.0392 data: 0.0009 max mem: 33303 Test: [20100/21770] eta: 0:01:04 time: 0.0394 data: 0.0009 max mem: 33303 Test: [20200/21770] eta: 0:01:00 time: 0.0393 data: 0.0009 max mem: 33303 Test: [20300/21770] eta: 0:00:57 time: 0.0394 data: 0.0009 
max mem: 33303 Test: [20400/21770] eta: 0:00:53 time: 0.0391 data: 0.0009 max mem: 33303 Test: [20500/21770] eta: 0:00:49 time: 0.0394 data: 0.0009 max mem: 33303 Test: [20600/21770] eta: 0:00:45 time: 0.0390 data: 0.0009 max mem: 33303 Test: [20700/21770] eta: 0:00:41 time: 0.0390 data: 0.0009 max mem: 33303 Test: [20800/21770] eta: 0:00:37 time: 0.0385 data: 0.0008 max mem: 33303 Test: [20900/21770] eta: 0:00:33 time: 0.0387 data: 0.0009 max mem: 33303 Test: [21000/21770] eta: 0:00:29 time: 0.0387 data: 0.0009 max mem: 33303 Test: [21100/21770] eta: 0:00:26 time: 0.0398 data: 0.0009 max mem: 33303 Test: [21200/21770] eta: 0:00:22 time: 0.0389 data: 0.0009 max mem: 33303 Test: [21300/21770] eta: 0:00:18 time: 0.0388 data: 0.0009 max mem: 33303 Test: [21400/21770] eta: 0:00:14 time: 0.0386 data: 0.0009 max mem: 33303 Test: [21500/21770] eta: 0:00:10 time: 0.0384 data: 0.0009 max mem: 33303 Test: [21600/21770] eta: 0:00:06 time: 0.0386 data: 0.0009 max mem: 33303 Test: [21700/21770] eta: 0:00:02 time: 0.0384 data: 0.0008 max mem: 33303 Test: Total time: 0:14:05 Final results: Mean IoU is 0.00 precision@0.5 = 0.00 precision@0.6 = 0.00 precision@0.7 = 0.00 precision@0.8 = 0.00 precision@0.9 = 0.00 overall IoU = 0.00 mean IoU = 0.00 Mean accuracy for one-to-zero sample is 99.96 Average object IoU 0.0 Overall IoU 0.0 Epoch: [31] [ 0/4276] eta: 6:30:52 lr: 1.3059426754769138e-05 loss: 0.0745 (0.0745) time: 5.4847 data: 2.3910 max mem: 33303 Epoch: [31] [ 10/4276] eta: 3:45:45 lr: 1.3056372513466764e-05 loss: 0.0745 (0.0817) time: 3.1752 data: 0.2252 max mem: 33303 Epoch: [31] [ 20/4276] eta: 3:37:41 lr: 1.3053318192776742e-05 loss: 0.0699 (0.0878) time: 2.9483 data: 0.0087 max mem: 33303 Epoch: [31] [ 30/4276] eta: 3:34:39 lr: 1.3050263792676364e-05 loss: 0.0804 (0.0878) time: 2.9554 data: 0.0082 max mem: 33303 Epoch: [31] [ 40/4276] eta: 3:32:50 lr: 1.3047209313142916e-05 loss: 0.0817 (0.0874) time: 2.9575 data: 0.0074 max mem: 33303 Epoch: [31] [ 50/4276] eta: 3:31:29 
lr: 1.3044154754153667e-05 loss: 0.0731 (0.0848) time: 2.9551 data: 0.0072 max mem: 33303 Epoch: [31] [ 60/4276] eta: 3:30:29 lr: 1.3041100115685886e-05 loss: 0.0664 (0.0828) time: 2.9562 data: 0.0071 max mem: 33303 Epoch: [31] [ 70/4276] eta: 3:29:23 lr: 1.3038045397716803e-05 loss: 0.0619 (0.0802) time: 2.9472 data: 0.0073 max mem: 33303 Epoch: [31] [ 80/4276] eta: 3:28:15 lr: 1.3034990600223656e-05 loss: 0.0756 (0.0810) time: 2.9241 data: 0.0079 max mem: 33303 Epoch: [31] [ 90/4276] eta: 3:27:13 lr: 1.3031935723183671e-05 loss: 0.0856 (0.0827) time: 2.9109 data: 0.0085 max mem: 33303 Epoch: [31] [ 100/4276] eta: 3:26:17 lr: 1.3028880766574062e-05 loss: 0.0856 (0.0846) time: 2.9077 data: 0.0083 max mem: 33303 Epoch: [31] [ 110/4276] eta: 3:25:31 lr: 1.3025825730372007e-05 loss: 0.0928 (0.0857) time: 2.9135 data: 0.0085 max mem: 33303 Epoch: [31] [ 120/4276] eta: 3:24:46 lr: 1.3022770614554705e-05 loss: 0.0928 (0.0859) time: 2.9180 data: 0.0088 max mem: 33303 Epoch: [31] [ 130/4276] eta: 3:24:02 lr: 1.3019715419099322e-05 loss: 0.0903 (0.0862) time: 2.9129 data: 0.0088 max mem: 33303 Epoch: [31] [ 140/4276] eta: 3:23:21 lr: 1.301666014398302e-05 loss: 0.0795 (0.0855) time: 2.9120 data: 0.0084 max mem: 33303 Epoch: [31] [ 150/4276] eta: 3:22:53 lr: 1.301360478918294e-05 loss: 0.0736 (0.0849) time: 2.9342 data: 0.0086 max mem: 33303 Epoch: [31] [ 160/4276] eta: 3:22:24 lr: 1.3010549354676218e-05 loss: 0.0798 (0.0851) time: 2.9549 data: 0.0081 max mem: 33303 Epoch: [31] [ 170/4276] eta: 3:21:56 lr: 1.3007493840439978e-05 loss: 0.0850 (0.0853) time: 2.9545 data: 0.0071 max mem: 33303 Epoch: [31] [ 180/4276] eta: 3:21:27 lr: 1.3004438246451328e-05 loss: 0.0954 (0.0856) time: 2.9530 data: 0.0071 max mem: 33303 Epoch: [31] [ 190/4276] eta: 3:20:58 lr: 1.300138257268736e-05 loss: 0.0789 (0.0851) time: 2.9532 data: 0.0071 max mem: 33303 Epoch: [31] [ 200/4276] eta: 3:20:28 lr: 1.2998326819125156e-05 loss: 0.0771 (0.0851) time: 2.9531 data: 0.0071 max mem: 33303 Epoch: [31] 
[ 210/4276] eta: 3:19:57 lr: 1.2995270985741794e-05 loss: 0.0884 (0.0853) time: 2.9451 data: 0.0073 max mem: 33303 Epoch: [31] [ 220/4276] eta: 3:19:24 lr: 1.2992215072514333e-05 loss: 0.0842 (0.0849) time: 2.9377 data: 0.0078 max mem: 33303 Epoch: [31] [ 230/4276] eta: 3:18:55 lr: 1.298915907941981e-05 loss: 0.0698 (0.0843) time: 2.9425 data: 0.0077 max mem: 33303 Epoch: [31] [ 240/4276] eta: 3:18:26 lr: 1.2986103006435266e-05 loss: 0.0704 (0.0840) time: 2.9503 data: 0.0073 max mem: 33303 Epoch: [31] [ 250/4276] eta: 3:17:56 lr: 1.2983046853537716e-05 loss: 0.0808 (0.0844) time: 2.9522 data: 0.0071 max mem: 33303 Epoch: [31] [ 260/4276] eta: 3:17:27 lr: 1.2979990620704177e-05 loss: 0.0844 (0.0845) time: 2.9522 data: 0.0072 max mem: 33303 Epoch: [31] [ 270/4276] eta: 3:16:58 lr: 1.2976934307911634e-05 loss: 0.0724 (0.0848) time: 2.9515 data: 0.0073 max mem: 33303 Epoch: [31] [ 280/4276] eta: 3:16:29 lr: 1.2973877915137073e-05 loss: 0.0806 (0.0847) time: 2.9519 data: 0.0071 max mem: 33303 Epoch: [31] [ 290/4276] eta: 3:15:59 lr: 1.297082144235747e-05 loss: 0.0808 (0.0844) time: 2.9523 data: 0.0073 max mem: 33303 Epoch: [31] [ 300/4276] eta: 3:15:30 lr: 1.2967764889549772e-05 loss: 0.0777 (0.0844) time: 2.9501 data: 0.0073 max mem: 33303 Epoch: [31] [ 310/4276] eta: 3:15:00 lr: 1.296470825669093e-05 loss: 0.0766 (0.0839) time: 2.9504 data: 0.0070 max mem: 33303 Epoch: [31] [ 320/4276] eta: 3:14:32 lr: 1.2961651543757875e-05 loss: 0.0879 (0.0845) time: 2.9564 data: 0.0072 max mem: 33303 Epoch: [31] [ 330/4276] eta: 3:14:03 lr: 1.2958594750727529e-05 loss: 0.1031 (0.0850) time: 2.9568 data: 0.0076 max mem: 33303 Epoch: [31] [ 340/4276] eta: 3:13:33 lr: 1.2955537877576792e-05 loss: 0.0873 (0.0850) time: 2.9520 data: 0.0080 max mem: 33303 Epoch: [31] [ 350/4276] eta: 3:13:01 lr: 1.2952480924282564e-05 loss: 0.0859 (0.0852) time: 2.9360 data: 0.0087 max mem: 33303 Epoch: [31] [ 360/4276] eta: 3:12:27 lr: 1.2949423890821721e-05 loss: 0.0935 (0.0858) time: 2.9143 data: 
0.0088 max mem: 33303 Epoch: [31] [ 370/4276] eta: 3:11:53 lr: 1.2946366777171142e-05 loss: 0.0874 (0.0858) time: 2.9094 data: 0.0084 max mem: 33303 Epoch: [31] [ 380/4276] eta: 3:11:23 lr: 1.294330958330767e-05 loss: 0.0816 (0.0862) time: 2.9263 data: 0.0086 max mem: 33303 Epoch: [31] [ 390/4276] eta: 3:10:54 lr: 1.294025230920815e-05 loss: 0.0816 (0.0862) time: 2.9480 data: 0.0084 max mem: 33303 Epoch: [31] [ 400/4276] eta: 3:10:25 lr: 1.2937194954849419e-05 loss: 0.0890 (0.0865) time: 2.9550 data: 0.0079 max mem: 33303 Epoch: [31] [ 410/4276] eta: 3:09:58 lr: 1.2934137520208295e-05 loss: 0.0950 (0.0867) time: 2.9618 data: 0.0077 max mem: 33303 Epoch: [31] [ 420/4276] eta: 3:09:29 lr: 1.293108000526157e-05 loss: 0.0885 (0.0870) time: 2.9612 data: 0.0077 max mem: 33303 Epoch: [31] [ 430/4276] eta: 3:09:01 lr: 1.2928022409986048e-05 loss: 0.0841 (0.0872) time: 2.9571 data: 0.0078 max mem: 33303 Epoch: [31] [ 440/4276] eta: 3:08:32 lr: 1.2924964734358503e-05 loss: 0.0889 (0.0872) time: 2.9595 data: 0.0078 max mem: 33303 Epoch: [31] [ 450/4276] eta: 3:08:03 lr: 1.2921906978355708e-05 loss: 0.0777 (0.0871) time: 2.9583 data: 0.0078 max mem: 33303 Epoch: [31] [ 460/4276] eta: 3:07:34 lr: 1.2918849141954404e-05 loss: 0.0743 (0.0867) time: 2.9557 data: 0.0079 max mem: 33303 Epoch: [31] [ 470/4276] eta: 3:07:05 lr: 1.2915791225131338e-05 loss: 0.0660 (0.0865) time: 2.9543 data: 0.0080 max mem: 33303 Epoch: [31] [ 480/4276] eta: 3:06:37 lr: 1.2912733227863238e-05 loss: 0.0824 (0.0864) time: 2.9603 data: 0.0079 max mem: 33303 Epoch: [31] [ 490/4276] eta: 3:06:06 lr: 1.2909675150126826e-05 loss: 0.0785 (0.0862) time: 2.9492 data: 0.0084 max mem: 33303 Epoch: [31] [ 500/4276] eta: 3:05:37 lr: 1.290661699189879e-05 loss: 0.0755 (0.0861) time: 2.9424 data: 0.0084 max mem: 33303 Epoch: [31] [ 510/4276] eta: 3:05:08 lr: 1.2903558753155826e-05 loss: 0.0755 (0.0859) time: 2.9531 data: 0.0075 max mem: 33303 Epoch: [31] [ 520/4276] eta: 3:04:37 lr: 1.290050043387461e-05 loss: 0.0778 
(0.0860) time: 2.9462 data: 0.0077 max mem: 33303 Epoch: [31] [ 530/4276] eta: 3:04:08 lr: 1.289744203403181e-05 loss: 0.0813 (0.0860) time: 2.9459 data: 0.0077 max mem: 33303 Epoch: [31] [ 540/4276] eta: 3:03:39 lr: 1.2894383553604065e-05 loss: 0.0774 (0.0859) time: 2.9536 data: 0.0074 max mem: 33303 Epoch: [31] [ 550/4276] eta: 3:03:10 lr: 1.289132499256802e-05 loss: 0.0778 (0.0859) time: 2.9578 data: 0.0074 max mem: 33303 Epoch: [31] [ 560/4276] eta: 3:02:41 lr: 1.2888266350900299e-05 loss: 0.0828 (0.0859) time: 2.9581 data: 0.0075 max mem: 33303 Epoch: [31] [ 570/4276] eta: 3:02:12 lr: 1.2885207628577516e-05 loss: 0.0907 (0.0861) time: 2.9566 data: 0.0075 max mem: 33303 Epoch: [31] [ 580/4276] eta: 3:01:43 lr: 1.2882148825576266e-05 loss: 0.0907 (0.0861) time: 2.9569 data: 0.0073 max mem: 33303 Epoch: [31] [ 590/4276] eta: 3:01:14 lr: 1.2879089941873129e-05 loss: 0.0744 (0.0858) time: 2.9572 data: 0.0074 max mem: 33303 Epoch: [31] [ 600/4276] eta: 3:00:45 lr: 1.2876030977444695e-05 loss: 0.0724 (0.0858) time: 2.9589 data: 0.0078 max mem: 33303 Epoch: [31] [ 610/4276] eta: 3:00:16 lr: 1.2872971932267502e-05 loss: 0.0782 (0.0857) time: 2.9595 data: 0.0078 max mem: 33303 Epoch: [31] [ 620/4276] eta: 2:59:47 lr: 1.286991280631811e-05 loss: 0.0815 (0.0858) time: 2.9609 data: 0.0080 max mem: 33303 Epoch: [31] [ 630/4276] eta: 2:59:18 lr: 1.2866853599573051e-05 loss: 0.0893 (0.0860) time: 2.9582 data: 0.0081 max mem: 33303 Epoch: [31] [ 640/4276] eta: 2:58:49 lr: 1.2863794312008847e-05 loss: 0.0814 (0.0859) time: 2.9563 data: 0.0080 max mem: 33303 Epoch: [31] [ 650/4276] eta: 2:58:20 lr: 1.2860734943601998e-05 loss: 0.0813 (0.0864) time: 2.9568 data: 0.0078 max mem: 33303 Epoch: [31] [ 660/4276] eta: 2:57:50 lr: 1.2857675494329005e-05 loss: 0.0985 (0.0867) time: 2.9523 data: 0.0075 max mem: 33303 Epoch: [31] [ 670/4276] eta: 2:57:18 lr: 1.2854615964166347e-05 loss: 0.0965 (0.0867) time: 2.9254 data: 0.0073 max mem: 33303 Epoch: [31] [ 680/4276] eta: 2:56:46 lr: 
1.2851556353090499e-05 loss: 0.0817 (0.0867) time: 2.9008 data: 0.0071 max mem: 33303 Epoch: [31] [ 690/4276] eta: 2:56:14 lr: 1.2848496661077907e-05 loss: 0.0837 (0.0867) time: 2.9003 data: 0.0070 max mem: 33303 Epoch: [31] [ 700/4276] eta: 2:55:42 lr: 1.2845436888105017e-05 loss: 0.0870 (0.0867) time: 2.9080 data: 0.0074 max mem: 33303 Epoch: [31] [ 710/4276] eta: 2:55:12 lr: 1.2842377034148258e-05 loss: 0.0777 (0.0866) time: 2.9232 data: 0.0080 max mem: 33303 Epoch: [31] [ 720/4276] eta: 2:54:41 lr: 1.2839317099184054e-05 loss: 0.0738 (0.0865) time: 2.9238 data: 0.0082 max mem: 33303 Epoch: [31] [ 730/4276] eta: 2:54:09 lr: 1.2836257083188791e-05 loss: 0.0751 (0.0866) time: 2.9089 data: 0.0079 max mem: 33303 Epoch: [31] [ 740/4276] eta: 2:53:38 lr: 1.2833196986138871e-05 loss: 0.0753 (0.0865) time: 2.9030 data: 0.0074 max mem: 33303 Epoch: [31] [ 750/4276] eta: 2:53:07 lr: 1.2830136808010667e-05 loss: 0.0710 (0.0863) time: 2.9143 data: 0.0073 max mem: 33303 Epoch: [31] [ 760/4276] eta: 2:52:36 lr: 1.2827076548780553e-05 loss: 0.0736 (0.0863) time: 2.9136 data: 0.0075 max mem: 33303 Epoch: [31] [ 770/4276] eta: 2:52:05 lr: 1.282401620842486e-05 loss: 0.0860 (0.0864) time: 2.9037 data: 0.0076 max mem: 33303 Epoch: [31] [ 780/4276] eta: 2:51:33 lr: 1.2820955786919939e-05 loss: 0.0783 (0.0863) time: 2.9039 data: 0.0074 max mem: 33303 Epoch: [31] [ 790/4276] eta: 2:51:04 lr: 1.281789528424211e-05 loss: 0.0839 (0.0864) time: 2.9295 data: 0.0080 max mem: 33303 Epoch: [31] [ 800/4276] eta: 2:50:35 lr: 1.281483470036769e-05 loss: 0.0846 (0.0863) time: 2.9568 data: 0.0089 max mem: 33303 Epoch: [31] [ 810/4276] eta: 2:50:06 lr: 1.2811774035272963e-05 loss: 0.0810 (0.0865) time: 2.9568 data: 0.0088 max mem: 33303 Epoch: [31] [ 820/4276] eta: 2:49:37 lr: 1.2808713288934223e-05 loss: 0.0810 (0.0864) time: 2.9548 data: 0.0079 max mem: 33303 Epoch: [31] [ 830/4276] eta: 2:49:08 lr: 1.2805652461327742e-05 loss: 0.0778 (0.0864) time: 2.9544 data: 0.0073 max mem: 33303 Epoch: [31] 
[ 840/4276] eta: 2:48:39 lr: 1.280259155242978e-05 loss: 0.0778 (0.0864) time: 2.9551 data: 0.0075 max mem: 33303 Epoch: [31] [ 850/4276] eta: 2:48:10 lr: 1.2799530562216572e-05 loss: 0.0741 (0.0864) time: 2.9559 data: 0.0075 max mem: 33303 Epoch: [31] [ 860/4276] eta: 2:47:41 lr: 1.2796469490664354e-05 loss: 0.0782 (0.0865) time: 2.9577 data: 0.0073 max mem: 33303 Epoch: [31] [ 870/4276] eta: 2:47:12 lr: 1.2793408337749347e-05 loss: 0.0814 (0.0866) time: 2.9571 data: 0.0073 max mem: 33303 Epoch: [31] [ 880/4276] eta: 2:46:43 lr: 1.2790347103447758e-05 loss: 0.0945 (0.0868) time: 2.9572 data: 0.0075 max mem: 33303 Epoch: [31] [ 890/4276] eta: 2:46:14 lr: 1.278728578773577e-05 loss: 0.0983 (0.0869) time: 2.9588 data: 0.0075 max mem: 33303 Epoch: [31] [ 900/4276] eta: 2:45:45 lr: 1.278422439058957e-05 loss: 0.0932 (0.0871) time: 2.9565 data: 0.0073 max mem: 33303 Epoch: [31] [ 910/4276] eta: 2:45:16 lr: 1.2781162911985319e-05 loss: 0.0841 (0.0870) time: 2.9541 data: 0.0073 max mem: 33303 Epoch: [31] [ 920/4276] eta: 2:44:47 lr: 1.2778101351899169e-05 loss: 0.0814 (0.0871) time: 2.9551 data: 0.0074 max mem: 33303 Epoch: [31] [ 930/4276] eta: 2:44:18 lr: 1.2775039710307254e-05 loss: 0.0844 (0.0871) time: 2.9543 data: 0.0075 max mem: 33303 Epoch: [31] [ 940/4276] eta: 2:43:49 lr: 1.2771977987185707e-05 loss: 0.0806 (0.0871) time: 2.9542 data: 0.0075 max mem: 33303 Epoch: [31] [ 950/4276] eta: 2:43:19 lr: 1.2768916182510643e-05 loss: 0.0806 (0.0871) time: 2.9539 data: 0.0076 max mem: 33303 Epoch: [31] [ 960/4276] eta: 2:42:49 lr: 1.2765854296258145e-05 loss: 0.0830 (0.0872) time: 2.9413 data: 0.0076 max mem: 33303 Epoch: [31] [ 970/4276] eta: 2:42:19 lr: 1.2762792328404308e-05 loss: 0.0841 (0.0873) time: 2.9269 data: 0.0079 max mem: 33303 Epoch: [31] [ 980/4276] eta: 2:41:49 lr: 1.2759730278925203e-05 loss: 0.0893 (0.0873) time: 2.9218 data: 0.0080 max mem: 33303 Epoch: [31] [ 990/4276] eta: 2:41:18 lr: 1.275666814779689e-05 loss: 0.0809 (0.0873) time: 2.9124 data: 
0.0080 max mem: 33303 Epoch: [31] [1000/4276] eta: 2:40:47 lr: 1.2753605934995411e-05 loss: 0.0777 (0.0872) time: 2.9059 data: 0.0083 max mem: 33303 Epoch: [31] [1010/4276] eta: 2:40:17 lr: 1.2750543640496793e-05 loss: 0.0854 (0.0872) time: 2.9072 data: 0.0088 max mem: 33303 Epoch: [31] [1020/4276] eta: 2:39:46 lr: 1.2747481264277062e-05 loss: 0.0808 (0.0872) time: 2.9039 data: 0.0085 max mem: 33303 Epoch: [31] [1030/4276] eta: 2:39:16 lr: 1.2744418806312222e-05 loss: 0.0727 (0.0872) time: 2.9244 data: 0.0088 max mem: 33303 Epoch: [31] [1040/4276] eta: 2:38:47 lr: 1.2741356266578256e-05 loss: 0.0761 (0.0872) time: 2.9534 data: 0.0094 max mem: 33303 Epoch: [31] [1050/4276] eta: 2:38:18 lr: 1.2738293645051147e-05 loss: 0.0820 (0.0873) time: 2.9555 data: 0.0090 max mem: 33303 Epoch: [31] [1060/4276] eta: 2:37:49 lr: 1.2735230941706858e-05 loss: 0.0820 (0.0872) time: 2.9535 data: 0.0089 max mem: 33303 Epoch: [31] [1070/4276] eta: 2:37:20 lr: 1.2732168156521345e-05 loss: 0.0768 (0.0873) time: 2.9555 data: 0.0084 max mem: 33303 Epoch: [31] [1080/4276] eta: 2:36:51 lr: 1.2729105289470538e-05 loss: 0.0790 (0.0873) time: 2.9546 data: 0.0081 max mem: 33303 Epoch: [31] [1090/4276] eta: 2:36:22 lr: 1.2726042340530362e-05 loss: 0.0960 (0.0874) time: 2.9545 data: 0.0081 max mem: 33303 Epoch: [31] [1100/4276] eta: 2:35:52 lr: 1.2722979309676725e-05 loss: 0.0960 (0.0875) time: 2.9524 data: 0.0085 max mem: 33303 Epoch: [31] [1110/4276] eta: 2:35:22 lr: 1.2719916196885535e-05 loss: 0.0877 (0.0875) time: 2.9362 data: 0.0086 max mem: 33303 Epoch: [31] [1120/4276] eta: 2:34:52 lr: 1.2716853002132661e-05 loss: 0.0922 (0.0876) time: 2.9203 data: 0.0075 max mem: 33303 Epoch: [31] [1130/4276] eta: 2:34:24 lr: 1.2713789725393976e-05 loss: 0.0888 (0.0875) time: 2.9504 data: 0.0075 max mem: 33303 Epoch: [31] [1140/4276] eta: 2:33:53 lr: 1.271072636664534e-05 loss: 0.0836 (0.0875) time: 2.9479 data: 0.0083 max mem: 33303 Epoch: [31] [1150/4276] eta: 2:33:23 lr: 1.2707662925862598e-05 loss: 
0.0885 (0.0875) time: 2.9073 data: 0.0088 max mem: 33303 Epoch: [31] [1160/4276] eta: 2:32:52 lr: 1.2704599403021567e-05 loss: 0.0885 (0.0875) time: 2.9010 data: 0.0084 max mem: 33303 Epoch: [31] [1170/4276] eta: 2:32:22 lr: 1.270153579809807e-05 loss: 0.0826 (0.0875) time: 2.9060 data: 0.0079 max mem: 33303 Epoch: [31] [1180/4276] eta: 2:31:52 lr: 1.2698472111067906e-05 loss: 0.0849 (0.0875) time: 2.9137 data: 0.0083 max mem: 33303 Epoch: [31] [1190/4276] eta: 2:31:21 lr: 1.269540834190687e-05 loss: 0.0876 (0.0875) time: 2.9099 data: 0.0085 max mem: 33303 Epoch: [31] [1200/4276] eta: 2:30:51 lr: 1.2692344490590727e-05 loss: 0.0730 (0.0874) time: 2.9052 data: 0.0085 max mem: 33303 Epoch: [31] [1210/4276] eta: 2:30:21 lr: 1.268928055709524e-05 loss: 0.0740 (0.0873) time: 2.9057 data: 0.0086 max mem: 33303 Epoch: [31] [1220/4276] eta: 2:29:50 lr: 1.2686216541396163e-05 loss: 0.0805 (0.0873) time: 2.9108 data: 0.0086 max mem: 33303 Epoch: [31] [1230/4276] eta: 2:29:21 lr: 1.2683152443469215e-05 loss: 0.0867 (0.0873) time: 2.9307 data: 0.0087 max mem: 33303 Epoch: [31] [1240/4276] eta: 2:28:51 lr: 1.2680088263290127e-05 loss: 0.0788 (0.0873) time: 2.9319 data: 0.0088 max mem: 33303 Epoch: [31] [1250/4276] eta: 2:28:21 lr: 1.26770240008346e-05 loss: 0.0861 (0.0873) time: 2.9121 data: 0.0088 max mem: 33303 Epoch: [31] [1260/4276] eta: 2:27:51 lr: 1.2673959656078333e-05 loss: 0.0851 (0.0872) time: 2.9071 data: 0.0086 max mem: 33303 Epoch: [31] [1270/4276] eta: 2:27:20 lr: 1.2670895228996996e-05 loss: 0.0841 (0.0871) time: 2.9061 data: 0.0084 max mem: 33303 Epoch: [31] [1280/4276] eta: 2:26:50 lr: 1.2667830719566254e-05 loss: 0.0878 (0.0872) time: 2.9082 data: 0.0086 max mem: 33303 Epoch: [31] [1290/4276] eta: 2:26:20 lr: 1.2664766127761762e-05 loss: 0.0939 (0.0872) time: 2.9111 data: 0.0087 max mem: 33303 Epoch: [31] [1300/4276] eta: 2:25:50 lr: 1.2661701453559164e-05 loss: 0.0811 (0.0872) time: 2.9124 data: 0.0083 max mem: 33303 Epoch: [31] [1310/4276] eta: 2:25:20 lr: 
1.2658636696934067e-05 loss: 0.0741 (0.0872) time: 2.9128 data: 0.0082 max mem: 33303 Epoch: [31] [1320/4276] eta: 2:24:51 lr: 1.2655571857862093e-05 loss: 0.0850 (0.0873) time: 2.9294 data: 0.0088 max mem: 33303 Epoch: [31] [1330/4276] eta: 2:24:21 lr: 1.265250693631883e-05 loss: 0.0760 (0.0872) time: 2.9327 data: 0.0090 max mem: 33303 Epoch: [31] [1340/4276] eta: 2:23:52 lr: 1.2649441932279873e-05 loss: 0.0760 (0.0872) time: 2.9392 data: 0.0089 max mem: 33303 Epoch: [31] [1350/4276] eta: 2:23:23 lr: 1.2646376845720779e-05 loss: 0.0864 (0.0872) time: 2.9615 data: 0.0085 max mem: 33303 Epoch: [31] [1360/4276] eta: 2:22:54 lr: 1.2643311676617101e-05 loss: 0.0853 (0.0872) time: 2.9604 data: 0.0077 max mem: 33303 Epoch: [31] [1370/4276] eta: 2:22:25 lr: 1.2640246424944385e-05 loss: 0.0805 (0.0872) time: 2.9580 data: 0.0074 max mem: 33303 Epoch: [31] [1380/4276] eta: 2:21:56 lr: 1.2637181090678166e-05 loss: 0.0862 (0.0872) time: 2.9569 data: 0.0074 max mem: 33303 Epoch: [31] [1390/4276] eta: 2:21:27 lr: 1.263411567379394e-05 loss: 0.0904 (0.0872) time: 2.9572 data: 0.0076 max mem: 33303 Epoch: [31] [1400/4276] eta: 2:20:58 lr: 1.2631050174267215e-05 loss: 0.0904 (0.0872) time: 2.9527 data: 0.0076 max mem: 33303 Epoch: [31] [1410/4276] eta: 2:20:28 lr: 1.2627984592073475e-05 loss: 0.0806 (0.0872) time: 2.9275 data: 0.0083 max mem: 33303 Epoch: [31] [1420/4276] eta: 2:19:57 lr: 1.2624918927188194e-05 loss: 0.0734 (0.0871) time: 2.9065 data: 0.0086 max mem: 33303 Epoch: [31] [1430/4276] eta: 2:19:27 lr: 1.2621853179586826e-05 loss: 0.0837 (0.0872) time: 2.9060 data: 0.0080 max mem: 33303 Epoch: [31] [1440/4276] eta: 2:18:57 lr: 1.2618787349244814e-05 loss: 0.0849 (0.0872) time: 2.9062 data: 0.0081 max mem: 33303 Epoch: [31] [1450/4276] eta: 2:18:27 lr: 1.2615721436137592e-05 loss: 0.0787 (0.0871) time: 2.9070 data: 0.0085 max mem: 33303 Epoch: [31] [1460/4276] eta: 2:17:57 lr: 1.2612655440240579e-05 loss: 0.0800 (0.0871) time: 2.9087 data: 0.0085 max mem: 33303 Epoch: 
[31] [1470/4276] eta: 2:17:28 lr: 1.2609589361529165e-05 loss: 0.0823 (0.0871) time: 2.9393 data: 0.0088 max mem: 33303 Epoch: [31] [1480/4276] eta: 2:16:59 lr: 1.2606523199978743e-05 loss: 0.0851 (0.0871) time: 2.9467 data: 0.0090 max mem: 33303 Epoch: [31] [1490/4276] eta: 2:16:29 lr: 1.2603456955564685e-05 loss: 0.0851 (0.0871) time: 2.9253 data: 0.0086 max mem: 33303 Epoch: [31] [1500/4276] eta: 2:16:00 lr: 1.2600390628262362e-05 loss: 0.0834 (0.0871) time: 2.9370 data: 0.0086 max mem: 33303 Epoch: [31] [1510/4276] eta: 2:15:30 lr: 1.2597324218047107e-05 loss: 0.0812 (0.0870) time: 2.9354 data: 0.0089 max mem: 33303 Epoch: [31] [1520/4276] eta: 2:15:01 lr: 1.2594257724894254e-05 loss: 0.0789 (0.0870) time: 2.9390 data: 0.0090 max mem: 33303 Epoch: [31] [1530/4276] eta: 2:14:32 lr: 1.2591191148779131e-05 loss: 0.0789 (0.0870) time: 2.9569 data: 0.0083 max mem: 33303 Epoch: [31] [1540/4276] eta: 2:14:03 lr: 1.2588124489677028e-05 loss: 0.0834 (0.0870) time: 2.9602 data: 0.0076 max mem: 33303 Epoch: [31] [1550/4276] eta: 2:13:34 lr: 1.2585057747563239e-05 loss: 0.0846 (0.0869) time: 2.9578 data: 0.0076 max mem: 33303 Epoch: [31] [1560/4276] eta: 2:13:05 lr: 1.2581990922413042e-05 loss: 0.0764 (0.0869) time: 2.9557 data: 0.0076 max mem: 33303 Epoch: [31] [1570/4276] eta: 2:12:36 lr: 1.2578924014201707e-05 loss: 0.0764 (0.0869) time: 2.9568 data: 0.0074 max mem: 33303 Epoch: [31] [1580/4276] eta: 2:12:06 lr: 1.2575857022904464e-05 loss: 0.0752 (0.0868) time: 2.9549 data: 0.0074 max mem: 33303 Epoch: [31] [1590/4276] eta: 2:11:37 lr: 1.2572789948496558e-05 loss: 0.0752 (0.0867) time: 2.9549 data: 0.0076 max mem: 33303 Epoch: [31] [1600/4276] eta: 2:11:08 lr: 1.2569722790953203e-05 loss: 0.0798 (0.0867) time: 2.9526 data: 0.0075 max mem: 33303 Epoch: [31] [1610/4276] eta: 2:10:38 lr: 1.2566655550249614e-05 loss: 0.0798 (0.0867) time: 2.9275 data: 0.0074 max mem: 33303 Epoch: [31] [1620/4276] eta: 2:10:08 lr: 1.256358822636097e-05 loss: 0.0762 (0.0866) time: 2.8978 
data: 0.0075 max mem: 33303 Epoch: [31] [1630/4276] eta: 2:09:38 lr: 1.2560520819262453e-05 loss: 0.0767 (0.0866) time: 2.8892 data: 0.0076 max mem: 33303 Epoch: [31] [1640/4276] eta: 2:09:07 lr: 1.2557453328929227e-05 loss: 0.0807 (0.0866) time: 2.8920 data: 0.0079 max mem: 33303 Epoch: [31] [1650/4276] eta: 2:08:38 lr: 1.2554385755336445e-05 loss: 0.0743 (0.0865) time: 2.9142 data: 0.0082 max mem: 33303 Epoch: [31] [1660/4276] eta: 2:08:09 lr: 1.2551318098459234e-05 loss: 0.0717 (0.0865) time: 2.9391 data: 0.0085 max mem: 33303 Epoch: [31] [1670/4276] eta: 2:07:40 lr: 1.2548250358272712e-05 loss: 0.0812 (0.0864) time: 2.9489 data: 0.0086 max mem: 33303 Epoch: [31] [1680/4276] eta: 2:07:10 lr: 1.2545182534751992e-05 loss: 0.0767 (0.0865) time: 2.9506 data: 0.0082 max mem: 33303 Epoch: [31] [1690/4276] eta: 2:06:41 lr: 1.2542114627872172e-05 loss: 0.0736 (0.0864) time: 2.9469 data: 0.0081 max mem: 33303 Epoch: [31] [1700/4276] eta: 2:06:12 lr: 1.2539046637608316e-05 loss: 0.0823 (0.0864) time: 2.9467 data: 0.0082 max mem: 33303 Epoch: [31] [1710/4276] eta: 2:05:42 lr: 1.2535978563935493e-05 loss: 0.0833 (0.0864) time: 2.9503 data: 0.0082 max mem: 33303 Epoch: [31] [1720/4276] eta: 2:05:13 lr: 1.2532910406828751e-05 loss: 0.0705 (0.0863) time: 2.9375 data: 0.0083 max mem: 33303 Epoch: [31] [1730/4276] eta: 2:04:43 lr: 1.2529842166263136e-05 loss: 0.0645 (0.0862) time: 2.9109 data: 0.0081 max mem: 33303 Epoch: [31] [1740/4276] eta: 2:04:14 lr: 1.2526773842213655e-05 loss: 0.0696 (0.0862) time: 2.9229 data: 0.0077 max mem: 33303 Epoch: [31] [1750/4276] eta: 2:03:44 lr: 1.2523705434655316e-05 loss: 0.0764 (0.0862) time: 2.9462 data: 0.0076 max mem: 33303 Epoch: [31] [1760/4276] eta: 2:03:15 lr: 1.2520636943563116e-05 loss: 0.0762 (0.0861) time: 2.9470 data: 0.0075 max mem: 33303 Epoch: [31] [1770/4276] eta: 2:02:46 lr: 1.2517568368912037e-05 loss: 0.0798 (0.0861) time: 2.9494 data: 0.0074 max mem: 33303 Epoch: [31] [1780/4276] eta: 2:02:17 lr: 1.2514499710677034e-05 
loss: 0.0745 (0.0861) time: 2.9501 data: 0.0073 max mem: 33303 Epoch: [31] [1790/4276] eta: 2:01:47 lr: 1.2511430968833055e-05 loss: 0.0740 (0.0860) time: 2.9488 data: 0.0075 max mem: 33303 Epoch: [31] [1800/4276] eta: 2:01:18 lr: 1.2508362143355042e-05 loss: 0.0801 (0.0860) time: 2.9424 data: 0.0077 max mem: 33303 Epoch: [31] [1810/4276] eta: 2:00:48 lr: 1.2505293234217918e-05 loss: 0.0801 (0.0861) time: 2.9347 data: 0.0080 max mem: 33303 Epoch: [31] [1820/4276] eta: 2:00:19 lr: 1.2502224241396581e-05 loss: 0.0925 (0.0861) time: 2.9404 data: 0.0084 max mem: 33303 Epoch: [31] [1830/4276] eta: 1:59:50 lr: 1.2499155164865924e-05 loss: 0.0786 (0.0861) time: 2.9472 data: 0.0083 max mem: 33303 Epoch: [31] [1840/4276] eta: 1:59:20 lr: 1.2496086004600833e-05 loss: 0.0716 (0.0860) time: 2.9465 data: 0.0080 max mem: 33303 Epoch: [31] [1850/4276] eta: 1:58:51 lr: 1.2493016760576161e-05 loss: 0.0746 (0.0860) time: 2.9502 data: 0.0078 max mem: 33303 Epoch: [31] [1860/4276] eta: 1:58:22 lr: 1.2489947432766759e-05 loss: 0.0862 (0.0861) time: 2.9527 data: 0.0078 max mem: 33303 Epoch: [31] [1870/4276] eta: 1:57:53 lr: 1.2486878021147462e-05 loss: 0.0863 (0.0861) time: 2.9491 data: 0.0081 max mem: 33303 Epoch: [31] [1880/4276] eta: 1:57:23 lr: 1.2483808525693097e-05 loss: 0.0693 (0.0860) time: 2.9457 data: 0.0079 max mem: 33303 Epoch: [31] [1890/4276] eta: 1:56:54 lr: 1.2480738946378458e-05 loss: 0.0676 (0.0859) time: 2.9436 data: 0.0078 max mem: 33303 Epoch: [31] [1900/4276] eta: 1:56:25 lr: 1.247766928317834e-05 loss: 0.0687 (0.0859) time: 2.9419 data: 0.0078 max mem: 33303 Epoch: [31] [1910/4276] eta: 1:55:55 lr: 1.2474599536067521e-05 loss: 0.0770 (0.0859) time: 2.9419 data: 0.0078 max mem: 33303 Epoch: [31] [1920/4276] eta: 1:55:26 lr: 1.2471529705020769e-05 loss: 0.0860 (0.0859) time: 2.9426 data: 0.0078 max mem: 33303 Epoch: [31] [1930/4276] eta: 1:54:56 lr: 1.2468459790012817e-05 loss: 0.0824 (0.0859) time: 2.9391 data: 0.0080 max mem: 33303 Epoch: [31] [1940/4276] eta: 
1:54:27 lr: 1.2465389791018409e-05 loss: 0.0800 (0.0859) time: 2.9310 data: 0.0081 max mem: 33303 Epoch: [31] [1950/4276] eta: 1:53:57 lr: 1.2462319708012257e-05 loss: 0.0905 (0.0859) time: 2.9289 data: 0.0081 max mem: 33303 Epoch: [31] [1960/4276] eta: 1:53:28 lr: 1.2459249540969076e-05 loss: 0.0740 (0.0858) time: 2.9339 data: 0.0084 max mem: 33303 Epoch: [31] [1970/4276] eta: 1:52:58 lr: 1.2456179289863542e-05 loss: 0.0794 (0.0858) time: 2.9370 data: 0.0085 max mem: 33303 Epoch: [31] [1980/4276] eta: 1:52:29 lr: 1.2453108954670335e-05 loss: 0.0802 (0.0858) time: 2.9362 data: 0.0083 max mem: 33303 Epoch: [31] [1990/4276] eta: 1:52:00 lr: 1.2450038535364115e-05 loss: 0.0790 (0.0858) time: 2.9347 data: 0.0083 max mem: 33303 Epoch: [31] [2000/4276] eta: 1:51:30 lr: 1.2446968031919532e-05 loss: 0.0851 (0.0858) time: 2.9362 data: 0.0084 max mem: 33303 Epoch: [31] [2010/4276] eta: 1:51:01 lr: 1.2443897444311212e-05 loss: 0.0848 (0.0858) time: 2.9412 data: 0.0084 max mem: 33303 Epoch: [31] [2020/4276] eta: 1:50:31 lr: 1.244082677251377e-05 loss: 0.0790 (0.0858) time: 2.9338 data: 0.0084 max mem: 33303 Epoch: [31] [2030/4276] eta: 1:50:02 lr: 1.2437756016501812e-05 loss: 0.0714 (0.0857) time: 2.9277 data: 0.0084 max mem: 33303 Epoch: [31] [2040/4276] eta: 1:49:32 lr: 1.2434685176249928e-05 loss: 0.0664 (0.0857) time: 2.9187 data: 0.0083 max mem: 33303 Epoch: [31] [2050/4276] eta: 1:49:02 lr: 1.2431614251732682e-05 loss: 0.0796 (0.0857) time: 2.9108 data: 0.0084 max mem: 33303 Epoch: [31] [2060/4276] eta: 1:48:33 lr: 1.2428543242924637e-05 loss: 0.0846 (0.0857) time: 2.9151 data: 0.0085 max mem: 33303 Epoch: [31] [2070/4276] eta: 1:48:03 lr: 1.2425472149800333e-05 loss: 0.0787 (0.0857) time: 2.9139 data: 0.0082 max mem: 33303 Epoch: [31] [2080/4276] eta: 1:47:33 lr: 1.2422400972334309e-05 loss: 0.0770 (0.0857) time: 2.9107 data: 0.0078 max mem: 33303 Epoch: [31] [2090/4276] eta: 1:47:03 lr: 1.2419329710501066e-05 loss: 0.0770 (0.0857) time: 2.9031 data: 0.0078 max mem: 
33303 Epoch: [31] [2100/4276] eta: 1:46:34 lr: 1.2416258364275108e-05 loss: 0.0900 (0.0857) time: 2.8924 data: 0.0082 max mem: 33303 Epoch: [31] [2110/4276] eta: 1:46:04 lr: 1.2413186933630924e-05 loss: 0.0886 (0.0857) time: 2.8995 data: 0.0079 max mem: 33303 Epoch: [31] [2120/4276] eta: 1:45:34 lr: 1.2410115418542973e-05 loss: 0.0666 (0.0856) time: 2.9238 data: 0.0077 max mem: 33303 Epoch: [31] [2130/4276] eta: 1:45:05 lr: 1.2407043818985719e-05 loss: 0.0641 (0.0856) time: 2.9340 data: 0.0077 max mem: 33303 Epoch: [31] [2140/4276] eta: 1:44:36 lr: 1.2403972134933597e-05 loss: 0.0758 (0.0855) time: 2.9320 data: 0.0076 max mem: 33303 Epoch: [31] [2150/4276] eta: 1:44:06 lr: 1.2400900366361044e-05 loss: 0.0739 (0.0855) time: 2.9317 data: 0.0078 max mem: 33303 Epoch: [31] [2160/4276] eta: 1:43:37 lr: 1.2397828513242452e-05 loss: 0.0697 (0.0854) time: 2.9359 data: 0.0080 max mem: 33303 Epoch: [31] [2170/4276] eta: 1:43:07 lr: 1.2394756575552227e-05 loss: 0.0817 (0.0855) time: 2.9197 data: 0.0078 max mem: 33303 Epoch: [31] [2180/4276] eta: 1:42:37 lr: 1.2391684553264752e-05 loss: 0.0817 (0.0854) time: 2.8930 data: 0.0080 max mem: 33303 Epoch: [31] [2190/4276] eta: 1:42:08 lr: 1.2388612446354394e-05 loss: 0.0847 (0.0855) time: 2.8998 data: 0.0085 max mem: 33303 Epoch: [31] [2200/4276] eta: 1:41:38 lr: 1.2385540254795496e-05 loss: 0.0847 (0.0855) time: 2.9258 data: 0.0086 max mem: 33303 Epoch: [31] [2210/4276] eta: 1:41:09 lr: 1.23824679785624e-05 loss: 0.0861 (0.0855) time: 2.9379 data: 0.0083 max mem: 33303 Epoch: [31] [2220/4276] eta: 1:40:39 lr: 1.2379395617629429e-05 loss: 0.0861 (0.0855) time: 2.9367 data: 0.0080 max mem: 33303 Epoch: [31] [2230/4276] eta: 1:40:10 lr: 1.2376323171970892e-05 loss: 0.0766 (0.0855) time: 2.9344 data: 0.0079 max mem: 33303 Epoch: [31] [2240/4276] eta: 1:39:41 lr: 1.2373250641561073e-05 loss: 0.0693 (0.0854) time: 2.9351 data: 0.0081 max mem: 33303 Epoch: [31] [2250/4276] eta: 1:39:11 lr: 1.237017802637425e-05 loss: 0.0672 (0.0853) time: 
2.9352 data: 0.0080 max mem: 33303 Epoch: [31] [2260/4276] eta: 1:38:42 lr: 1.236710532638469e-05 loss: 0.0762 (0.0854) time: 2.9342 data: 0.0078 max mem: 33303 Epoch: [31] [2270/4276] eta: 1:38:12 lr: 1.2364032541566645e-05 loss: 0.0881 (0.0854) time: 2.9254 data: 0.0079 max mem: 33303 Epoch: [31] [2280/4276] eta: 1:37:43 lr: 1.2360959671894335e-05 loss: 0.0797 (0.0854) time: 2.9177 data: 0.0081 max mem: 33303 Epoch: [31] [2290/4276] eta: 1:37:13 lr: 1.2357886717341985e-05 loss: 0.0773 (0.0853) time: 2.8964 data: 0.0086 max mem: 33303 Epoch: [31] [2300/4276] eta: 1:36:43 lr: 1.2354813677883793e-05 loss: 0.0773 (0.0853) time: 2.8685 data: 0.0088 max mem: 33303 Epoch: [31] [2310/4276] eta: 1:36:13 lr: 1.2351740553493956e-05 loss: 0.0860 (0.0854) time: 2.8629 data: 0.0086 max mem: 33303 Epoch: [31] [2320/4276] eta: 1:35:43 lr: 1.2348667344146633e-05 loss: 0.0836 (0.0854) time: 2.8662 data: 0.0085 max mem: 33303 Epoch: [31] [2330/4276] eta: 1:35:13 lr: 1.2345594049815987e-05 loss: 0.0787 (0.0853) time: 2.8789 data: 0.0081 max mem: 33303 Epoch: [31] [2340/4276] eta: 1:34:44 lr: 1.2342520670476162e-05 loss: 0.0815 (0.0853) time: 2.9045 data: 0.0084 max mem: 33303 Epoch: [31] [2350/4276] eta: 1:34:14 lr: 1.2339447206101291e-05 loss: 0.0770 (0.0853) time: 2.9085 data: 0.0091 max mem: 33303 Epoch: [31] [2360/4276] eta: 1:33:44 lr: 1.2336373656665474e-05 loss: 0.0739 (0.0853) time: 2.8951 data: 0.0090 max mem: 33303 Epoch: [31] [2370/4276] eta: 1:33:14 lr: 1.2333300022142814e-05 loss: 0.0748 (0.0853) time: 2.8886 data: 0.0082 max mem: 33303 Epoch: [31] [2380/4276] eta: 1:32:45 lr: 1.2330226302507394e-05 loss: 0.0698 (0.0853) time: 2.8882 data: 0.0078 max mem: 33303 Epoch: [31] [2390/4276] eta: 1:32:15 lr: 1.2327152497733287e-05 loss: 0.0751 (0.0852) time: 2.8888 data: 0.0081 max mem: 33303 Epoch: [31] [2400/4276] eta: 1:31:45 lr: 1.2324078607794532e-05 loss: 0.0869 (0.0853) time: 2.8838 data: 0.0080 max mem: 33303 Epoch: [31] [2410/4276] eta: 1:31:15 lr: 
1.2321004632665172e-05 loss: 0.0808 (0.0852) time: 2.8819 data: 0.0078 max mem: 33303 Epoch: [31] [2420/4276] eta: 1:30:46 lr: 1.2317930572319235e-05 loss: 0.0734 (0.0852) time: 2.8817 data: 0.0080 max mem: 33303 Epoch: [31] [2430/4276] eta: 1:30:16 lr: 1.2314856426730719e-05 loss: 0.0818 (0.0853) time: 2.8829 data: 0.0082 max mem: 33303 Epoch: [31] [2440/4276] eta: 1:29:46 lr: 1.2311782195873617e-05 loss: 0.0876 (0.0852) time: 2.8820 data: 0.0084 max mem: 33303 Epoch: [31] [2450/4276] eta: 1:29:16 lr: 1.2308707879721906e-05 loss: 0.0791 (0.0852) time: 2.8822 data: 0.0086 max mem: 33303 Epoch: [31] [2460/4276] eta: 1:28:47 lr: 1.2305633478249557e-05 loss: 0.0783 (0.0852) time: 2.8833 data: 0.0088 max mem: 33303 Epoch: [31] [2470/4276] eta: 1:28:17 lr: 1.2302558991430501e-05 loss: 0.0687 (0.0852) time: 2.8840 data: 0.0088 max mem: 33303 Epoch: [31] [2480/4276] eta: 1:27:47 lr: 1.2299484419238674e-05 loss: 0.0778 (0.0852) time: 2.8834 data: 0.0088 max mem: 33303 Epoch: [31] [2490/4276] eta: 1:27:18 lr: 1.2296409761647995e-05 loss: 0.0886 (0.0852) time: 2.8824 data: 0.0088 max mem: 33303 Epoch: [31] [2500/4276] eta: 1:26:48 lr: 1.2293335018632369e-05 loss: 0.0883 (0.0852) time: 2.8836 data: 0.0088 max mem: 33303 Epoch: [31] [2510/4276] eta: 1:26:18 lr: 1.2290260190165667e-05 loss: 0.0883 (0.0852) time: 2.8823 data: 0.0088 max mem: 33303 Epoch: [31] [2520/4276] eta: 1:25:49 lr: 1.228718527622177e-05 loss: 0.0861 (0.0852) time: 2.8818 data: 0.0088 max mem: 33303 Epoch: [31] [2530/4276] eta: 1:25:19 lr: 1.2284110276774526e-05 loss: 0.0697 (0.0852) time: 2.8838 data: 0.0088 max mem: 33303 Epoch: [31] [2540/4276] eta: 1:24:49 lr: 1.2281035191797786e-05 loss: 0.0697 (0.0852) time: 2.8838 data: 0.0088 max mem: 33303 Epoch: [31] [2550/4276] eta: 1:24:20 lr: 1.2277960021265359e-05 loss: 0.0760 (0.0851) time: 2.8835 data: 0.0088 max mem: 33303 Epoch: [31] [2560/4276] eta: 1:23:50 lr: 1.2274884765151063e-05 loss: 0.0677 (0.0851) time: 2.8827 data: 0.0088 max mem: 33303 Epoch: 
[31] [2570/4276] eta: 1:23:20 lr: 1.2271809423428688e-05 loss: 0.0674 (0.0851) time: 2.8820 data: 0.0088 max mem: 33303 Epoch: [31] [2580/4276] eta: 1:22:51 lr: 1.2268733996072023e-05 loss: 0.0716 (0.0850) time: 2.8823 data: 0.0088 max mem: 33303 Epoch: [31] [2590/4276] eta: 1:22:21 lr: 1.2265658483054813e-05 loss: 0.0673 (0.0850) time: 2.8803 data: 0.0088 max mem: 33303 Epoch: [31] [2600/4276] eta: 1:21:52 lr: 1.2262582884350814e-05 loss: 0.0658 (0.0849) time: 2.8827 data: 0.0085 max mem: 33303 Epoch: [31] [2610/4276] eta: 1:21:22 lr: 1.2259507199933761e-05 loss: 0.0767 (0.0849) time: 2.8944 data: 0.0087 max mem: 33303 Epoch: [31] [2620/4276] eta: 1:20:52 lr: 1.2256431429777373e-05 loss: 0.0828 (0.0849) time: 2.8937 data: 0.0087 max mem: 33303 Epoch: [31] [2630/4276] eta: 1:20:23 lr: 1.225335557385534e-05 loss: 0.0849 (0.0849) time: 2.8894 data: 0.0079 max mem: 33303 Epoch: [31] [2640/4276] eta: 1:19:53 lr: 1.2250279632141357e-05 loss: 0.0635 (0.0849) time: 2.8973 data: 0.0079 max mem: 33303 Epoch: [31] [2650/4276] eta: 1:19:24 lr: 1.2247203604609092e-05 loss: 0.0727 (0.0849) time: 2.9032 data: 0.0085 max mem: 33303 Epoch: [31] [2660/4276] eta: 1:18:55 lr: 1.2244127491232207e-05 loss: 0.0769 (0.0849) time: 2.9127 data: 0.0086 max mem: 33303 Epoch: [31] [2670/4276] eta: 1:18:25 lr: 1.2241051291984327e-05 loss: 0.0822 (0.0849) time: 2.9075 data: 0.0084 max mem: 33303 Epoch: [31] [2680/4276] eta: 1:17:55 lr: 1.2237975006839087e-05 loss: 0.0843 (0.0849) time: 2.8883 data: 0.0080 max mem: 33303 Epoch: [31] [2690/4276] eta: 1:17:26 lr: 1.2234898635770092e-05 loss: 0.0799 (0.0849) time: 2.8812 data: 0.0078 max mem: 33303 Epoch: [31] [2700/4276] eta: 1:16:56 lr: 1.2231822178750942e-05 loss: 0.0750 (0.0849) time: 2.8771 data: 0.0078 max mem: 33303 Epoch: [31] [2710/4276] eta: 1:16:27 lr: 1.2228745635755207e-05 loss: 0.0750 (0.0849) time: 2.8746 data: 0.0077 max mem: 33303 Epoch: [31] [2720/4276] eta: 1:15:57 lr: 1.2225669006756449e-05 loss: 0.0663 (0.0848) time: 2.8761 
data: 0.0076 max mem: 33303 Epoch: [31] [2730/4276] eta: 1:15:28 lr: 1.2222592291728224e-05 loss: 0.0674 (0.0848) time: 2.8874 data: 0.0079 max mem: 33303 Epoch: [31] [2740/4276] eta: 1:14:58 lr: 1.2219515490644051e-05 loss: 0.0758 (0.0848) time: 2.8992 data: 0.0087 max mem: 33303 Epoch: [31] [2750/4276] eta: 1:14:29 lr: 1.2216438603477452e-05 loss: 0.0753 (0.0848) time: 2.8973 data: 0.0085 max mem: 33303 Epoch: [31] [2760/4276] eta: 1:13:59 lr: 1.2213361630201926e-05 loss: 0.0652 (0.0847) time: 2.9005 data: 0.0087 max mem: 33303 Epoch: [31] [2770/4276] eta: 1:13:30 lr: 1.2210284570790965e-05 loss: 0.0652 (0.0847) time: 2.9047 data: 0.0093 max mem: 33303 Epoch: [31] [2780/4276] eta: 1:13:00 lr: 1.2207207425218025e-05 loss: 0.0816 (0.0847) time: 2.8905 data: 0.0086 max mem: 33303 Epoch: [31] [2790/4276] eta: 1:12:31 lr: 1.2204130193456566e-05 loss: 0.0858 (0.0847) time: 2.8835 data: 0.0083 max mem: 33303 Epoch: [31] [2800/4276] eta: 1:12:01 lr: 1.2201052875480025e-05 loss: 0.0785 (0.0847) time: 2.8965 data: 0.0082 max mem: 33303 Epoch: [31] [2810/4276] eta: 1:11:32 lr: 1.219797547126183e-05 loss: 0.0648 (0.0846) time: 2.9120 data: 0.0086 max mem: 33303 Epoch: [31] [2820/4276] eta: 1:11:03 lr: 1.2194897980775378e-05 loss: 0.0650 (0.0846) time: 2.9264 data: 0.0089 max mem: 33303 Epoch: [31] [2830/4276] eta: 1:10:34 lr: 1.219182040399406e-05 loss: 0.0740 (0.0846) time: 2.9319 data: 0.0086 max mem: 33303 Epoch: [31] [2840/4276] eta: 1:10:04 lr: 1.2188742740891258e-05 loss: 0.0894 (0.0846) time: 2.9315 data: 0.0087 max mem: 33303 Epoch: [31] [2850/4276] eta: 1:09:35 lr: 1.2185664991440332e-05 loss: 0.0909 (0.0846) time: 2.9327 data: 0.0087 max mem: 33303 Epoch: [31] [2860/4276] eta: 1:09:06 lr: 1.2182587155614617e-05 loss: 0.0772 (0.0846) time: 2.9309 data: 0.0083 max mem: 33303 Epoch: [31] [2870/4276] eta: 1:08:36 lr: 1.2179509233387446e-05 loss: 0.0794 (0.0846) time: 2.9159 data: 0.0080 max mem: 33303 Epoch: [31] [2880/4276] eta: 1:08:07 lr: 1.2176431224732131e-05 
loss: 0.0794 (0.0846) time: 2.9156 data: 0.0079 max mem: 33303 Epoch: [31] [2890/4276] eta: 1:07:38 lr: 1.2173353129621975e-05 loss: 0.0786 (0.0846) time: 2.9275 data: 0.0076 max mem: 33303 Epoch: [31] [2900/4276] eta: 1:07:08 lr: 1.2170274948030245e-05 loss: 0.0742 (0.0846) time: 2.9261 data: 0.0072 max mem: 33303 Epoch: [31] [2910/4276] eta: 1:06:39 lr: 1.2167196679930215e-05 loss: 0.0869 (0.0846) time: 2.9268 data: 0.0073 max mem: 33303 Epoch: [31] [2920/4276] eta: 1:06:10 lr: 1.2164118325295133e-05 loss: 0.0919 (0.0846) time: 2.9285 data: 0.0075 max mem: 33303 Epoch: [31] [2930/4276] eta: 1:05:41 lr: 1.2161039884098238e-05 loss: 0.0863 (0.0846) time: 2.9274 data: 0.0075 max mem: 33303 Epoch: [31] [2940/4276] eta: 1:05:11 lr: 1.2157961356312737e-05 loss: 0.0785 (0.0846) time: 2.9272 data: 0.0073 max mem: 33303 Epoch: [31] [2950/4276] eta: 1:04:42 lr: 1.2154882741911835e-05 loss: 0.0804 (0.0846) time: 2.9277 data: 0.0073 max mem: 33303 Epoch: [31] [2960/4276] eta: 1:04:13 lr: 1.215180404086872e-05 loss: 0.0816 (0.0846) time: 2.9224 data: 0.0078 max mem: 33303 Epoch: [31] [2970/4276] eta: 1:03:43 lr: 1.2148725253156568e-05 loss: 0.0785 (0.0846) time: 2.9226 data: 0.0083 max mem: 33303 Epoch: [31] [2980/4276] eta: 1:03:14 lr: 1.2145646378748523e-05 loss: 0.0756 (0.0846) time: 2.9261 data: 0.0080 max mem: 33303 Epoch: [31] [2990/4276] eta: 1:02:45 lr: 1.2142567417617725e-05 loss: 0.0755 (0.0846) time: 2.9251 data: 0.0076 max mem: 33303 Epoch: [31] [3000/4276] eta: 1:02:16 lr: 1.2139488369737302e-05 loss: 0.0764 (0.0846) time: 2.9274 data: 0.0081 max mem: 33303 Epoch: [31] [3010/4276] eta: 1:01:46 lr: 1.213640923508036e-05 loss: 0.0764 (0.0846) time: 2.9311 data: 0.0084 max mem: 33303 Epoch: [31] [3020/4276] eta: 1:01:17 lr: 1.2133330013619983e-05 loss: 0.0766 (0.0846) time: 2.9327 data: 0.0079 max mem: 33303 Epoch: [31] [3030/4276] eta: 1:00:48 lr: 1.213025070532925e-05 loss: 0.0803 (0.0846) time: 2.9298 data: 0.0077 max mem: 33303 Epoch: [31] [3040/4276] eta: 
1:00:19 lr: 1.2127171310181223e-05 loss: 0.0899 (0.0846) time: 2.9258 data: 0.0076 max mem: 33303 Epoch: [31] [3050/4276] eta: 0:59:49 lr: 1.2124091828148939e-05 loss: 0.0912 (0.0846) time: 2.9256 data: 0.0079 max mem: 33303 Epoch: [31] [3060/4276] eta: 0:59:20 lr: 1.2121012259205425e-05 loss: 0.0682 (0.0845) time: 2.9260 data: 0.0081 max mem: 33303 Epoch: [31] [3070/4276] eta: 0:58:51 lr: 1.2117932603323695e-05 loss: 0.0702 (0.0845) time: 2.9263 data: 0.0079 max mem: 33303 Epoch: [31] [3080/4276] eta: 0:58:21 lr: 1.211485286047675e-05 loss: 0.0719 (0.0845) time: 2.9259 data: 0.0079 max mem: 33303 Epoch: [31] [3090/4276] eta: 0:57:52 lr: 1.2111773030637555e-05 loss: 0.0736 (0.0845) time: 2.9258 data: 0.0079 max mem: 33303 Epoch: [31] [3100/4276] eta: 0:57:23 lr: 1.2108693113779079e-05 loss: 0.0736 (0.0845) time: 2.9293 data: 0.0077 max mem: 33303 Epoch: [31] [3110/4276] eta: 0:56:54 lr: 1.210561310987427e-05 loss: 0.0729 (0.0845) time: 2.9261 data: 0.0080 max mem: 33303 Epoch: [31] [3120/4276] eta: 0:56:24 lr: 1.2102533018896063e-05 loss: 0.0729 (0.0844) time: 2.9282 data: 0.0083 max mem: 33303 Epoch: [31] [3130/4276] eta: 0:55:55 lr: 1.2099452840817362e-05 loss: 0.0766 (0.0844) time: 2.9193 data: 0.0088 max mem: 33303 Epoch: [31] [3140/4276] eta: 0:55:25 lr: 1.2096372575611073e-05 loss: 0.0766 (0.0844) time: 2.8939 data: 0.0089 max mem: 33303 Epoch: [31] [3150/4276] eta: 0:54:56 lr: 1.2093292223250075e-05 loss: 0.0732 (0.0844) time: 2.9068 data: 0.0081 max mem: 33303 Epoch: [31] [3160/4276] eta: 0:54:27 lr: 1.2090211783707241e-05 loss: 0.0798 (0.0844) time: 2.9280 data: 0.0077 max mem: 33303 Epoch: [31] [3170/4276] eta: 0:53:58 lr: 1.2087131256955411e-05 loss: 0.0805 (0.0844) time: 2.9332 data: 0.0078 max mem: 33303 Epoch: [31] [3180/4276] eta: 0:53:28 lr: 1.2084050642967423e-05 loss: 0.0805 (0.0844) time: 2.9391 data: 0.0079 max mem: 33303 Epoch: [31] [3190/4276] eta: 0:52:59 lr: 1.2080969941716098e-05 loss: 0.0821 (0.0844) time: 2.9393 data: 0.0078 max mem: 
33303 Epoch: [31] [3200/4276] eta: 0:52:30 lr: 1.207788915317424e-05 loss: 0.0813 (0.0844) time: 2.9399 data: 0.0079 max mem: 33303 Epoch: [31] [3210/4276] eta: 0:52:01 lr: 1.2074808277314625e-05 loss: 0.0739 (0.0844) time: 2.9359 data: 0.0080 max mem: 33303 Epoch: [31] [3220/4276] eta: 0:51:31 lr: 1.2071727314110024e-05 loss: 0.0754 (0.0844) time: 2.9100 data: 0.0082 max mem: 33303 Epoch: [31] [3230/4276] eta: 0:51:02 lr: 1.2068646263533196e-05 loss: 0.0772 (0.0844) time: 2.8879 data: 0.0081 max mem: 33303 Epoch: [31] [3240/4276] eta: 0:50:32 lr: 1.206556512555688e-05 loss: 0.0795 (0.0844) time: 2.8881 data: 0.0082 max mem: 33303 Epoch: [31] [3250/4276] eta: 0:50:03 lr: 1.2062483900153789e-05 loss: 0.0922 (0.0845) time: 2.8874 data: 0.0085 max mem: 33303 Epoch: [31] [3260/4276] eta: 0:49:34 lr: 1.2059402587296627e-05 loss: 0.0843 (0.0845) time: 2.9033 data: 0.0090 max mem: 33303 Epoch: [31] [3270/4276] eta: 0:49:04 lr: 1.2056321186958085e-05 loss: 0.0843 (0.0845) time: 2.9177 data: 0.0095 max mem: 33303 Epoch: [31] [3280/4276] eta: 0:48:35 lr: 1.2053239699110844e-05 loss: 0.0881 (0.0845) time: 2.9114 data: 0.0093 max mem: 33303 Epoch: [31] [3290/4276] eta: 0:48:06 lr: 1.2050158123727543e-05 loss: 0.0909 (0.0845) time: 2.9153 data: 0.0088 max mem: 33303 Epoch: [31] [3300/4276] eta: 0:47:37 lr: 1.2047076460780827e-05 loss: 0.0874 (0.0846) time: 2.9333 data: 0.0082 max mem: 33303 Epoch: [31] [3310/4276] eta: 0:47:07 lr: 1.2043994710243322e-05 loss: 0.0869 (0.0846) time: 2.9409 data: 0.0082 max mem: 33303 Epoch: [31] [3320/4276] eta: 0:46:38 lr: 1.2040912872087638e-05 loss: 0.0869 (0.0846) time: 2.9396 data: 0.0080 max mem: 33303 Epoch: [31] [3330/4276] eta: 0:46:09 lr: 1.2037830946286353e-05 loss: 0.0815 (0.0846) time: 2.9407 data: 0.0079 max mem: 33303 Epoch: [31] [3340/4276] eta: 0:45:40 lr: 1.203474893281205e-05 loss: 0.0797 (0.0846) time: 2.9404 data: 0.0081 max mem: 33303 Epoch: [31] [3350/4276] eta: 0:45:10 lr: 1.2031666831637289e-05 loss: 0.0812 (0.0846) time: 
2.9331 data: 0.0084 max mem: 33303 Epoch: [31] [3360/4276] eta: 0:44:41 lr: 1.20285846427346e-05 loss: 0.0776 (0.0845) time: 2.9153 data: 0.0087 max mem: 33303 Epoch: [31] [3370/4276] eta: 0:44:12 lr: 1.2025502366076514e-05 loss: 0.0815 (0.0846) time: 2.8996 data: 0.0083 max mem: 33303 Epoch: [31] [3380/4276] eta: 0:43:42 lr: 1.2022420001635538e-05 loss: 0.0833 (0.0846) time: 2.9001 data: 0.0083 max mem: 33303 Epoch: [31] [3390/4276] eta: 0:43:13 lr: 1.2019337549384172e-05 loss: 0.0751 (0.0846) time: 2.8989 data: 0.0089 max mem: 33303 Epoch: [31] [3400/4276] eta: 0:42:44 lr: 1.2016255009294876e-05 loss: 0.0781 (0.0845) time: 2.8931 data: 0.0085 max mem: 33303 Epoch: [31] [3410/4276] eta: 0:42:14 lr: 1.2013172381340117e-05 loss: 0.0861 (0.0846) time: 2.9029 data: 0.0081 max mem: 33303 Epoch: [31] [3420/4276] eta: 0:41:45 lr: 1.2010089665492335e-05 loss: 0.0801 (0.0846) time: 2.9108 data: 0.0084 max mem: 33303 Epoch: [31] [3430/4276] eta: 0:41:16 lr: 1.2007006861723965e-05 loss: 0.0820 (0.0846) time: 2.9045 data: 0.0089 max mem: 33303 Epoch: [31] [3440/4276] eta: 0:40:46 lr: 1.2003923970007401e-05 loss: 0.0839 (0.0846) time: 2.9083 data: 0.0088 max mem: 33303 Epoch: [31] [3450/4276] eta: 0:40:17 lr: 1.2000840990315043e-05 loss: 0.0859 (0.0846) time: 2.9110 data: 0.0087 max mem: 33303 Epoch: [31] [3460/4276] eta: 0:39:48 lr: 1.1997757922619269e-05 loss: 0.0928 (0.0847) time: 2.9000 data: 0.0084 max mem: 33303 Epoch: [31] [3470/4276] eta: 0:39:18 lr: 1.199467476689244e-05 loss: 0.0821 (0.0846) time: 2.9150 data: 0.0083 max mem: 33303 Epoch: [31] [3480/4276] eta: 0:38:49 lr: 1.1991591523106892e-05 loss: 0.0845 (0.0847) time: 2.9357 data: 0.0079 max mem: 33303 Epoch: [31] [3490/4276] eta: 0:38:20 lr: 1.1988508191234953e-05 loss: 0.0817 (0.0847) time: 2.9360 data: 0.0074 max mem: 33303 Epoch: [31] [3500/4276] eta: 0:37:51 lr: 1.1985424771248936e-05 loss: 0.0725 (0.0846) time: 2.9343 data: 0.0076 max mem: 33303 Epoch: [31] [3510/4276] eta: 0:37:21 lr: 
1.1982341263121138e-05 loss: 0.0842 (0.0846) time: 2.9323 data: 0.0076 max mem: 33303 Epoch: [31] [3520/4276] eta: 0:36:52 lr: 1.1979257666823826e-05 loss: 0.0774 (0.0846) time: 2.9298 data: 0.0079 max mem: 33303 Epoch: [31] [3530/4276] eta: 0:36:23 lr: 1.1976173982329264e-05 loss: 0.0774 (0.0847) time: 2.9337 data: 0.0082 max mem: 33303 Epoch: [31] [3540/4276] eta: 0:35:54 lr: 1.1973090209609695e-05 loss: 0.0841 (0.0847) time: 2.9372 data: 0.0082 max mem: 33303 Epoch: [31] [3550/4276] eta: 0:35:24 lr: 1.1970006348637352e-05 loss: 0.0789 (0.0846) time: 2.9300 data: 0.0086 max mem: 33303 Epoch: [31] [3560/4276] eta: 0:34:55 lr: 1.1966922399384433e-05 loss: 0.0751 (0.0846) time: 2.9078 data: 0.0088 max mem: 33303 Epoch: [31] [3570/4276] eta: 0:34:26 lr: 1.1963838361823137e-05 loss: 0.0828 (0.0847) time: 2.8988 data: 0.0087 max mem: 33303 Epoch: [31] [3580/4276] eta: 0:33:56 lr: 1.196075423592564e-05 loss: 0.0774 (0.0846) time: 2.8950 data: 0.0086 max mem: 33303 Epoch: [31] [3590/4276] eta: 0:33:27 lr: 1.195767002166411e-05 loss: 0.0744 (0.0846) time: 2.8934 data: 0.0086 max mem: 33303 Epoch: [31] [3600/4276] eta: 0:32:58 lr: 1.1954585719010673e-05 loss: 0.0814 (0.0846) time: 2.9067 data: 0.0081 max mem: 33303 Epoch: [31] [3610/4276] eta: 0:32:29 lr: 1.1951501327937466e-05 loss: 0.0813 (0.0846) time: 2.9246 data: 0.0076 max mem: 33303 Epoch: [31] [3620/4276] eta: 0:31:59 lr: 1.1948416848416595e-05 loss: 0.0728 (0.0846) time: 2.9400 data: 0.0072 max mem: 33303 Epoch: [31] [3630/4276] eta: 0:31:30 lr: 1.194533228042016e-05 loss: 0.0815 (0.0846) time: 2.9380 data: 0.0071 max mem: 33303 Epoch: [31] [3640/4276] eta: 0:31:01 lr: 1.1942247623920226e-05 loss: 0.0918 (0.0846) time: 2.9377 data: 0.0070 max mem: 33303 Epoch: [31] [3650/4276] eta: 0:30:32 lr: 1.1939162878888854e-05 loss: 0.0816 (0.0846) time: 2.9487 data: 0.0070 max mem: 33303 Epoch: [31] [3660/4276] eta: 0:30:02 lr: 1.1936078045298096e-05 loss: 0.0832 (0.0846) time: 2.9647 data: 0.0074 max mem: 33303 Epoch: [31] 
[3670/4276] eta: 0:29:33 lr: 1.1932993123119964e-05 loss: 0.0853 (0.0846) time: 2.9566 data: 0.0079 max mem: 33303 Epoch: [31] [3680/4276] eta: 0:29:04 lr: 1.1929908112326472e-05 loss: 0.0853 (0.0846) time: 2.9401 data: 0.0079 max mem: 33303 Epoch: [31] [3690/4276] eta: 0:28:35 lr: 1.1926823012889609e-05 loss: 0.0909 (0.0847) time: 2.9403 data: 0.0075 max mem: 33303 Epoch: [31] [3700/4276] eta: 0:28:05 lr: 1.192373782478136e-05 loss: 0.0866 (0.0847) time: 2.9429 data: 0.0076 max mem: 33303 Epoch: [31] [3710/4276] eta: 0:27:36 lr: 1.1920652547973669e-05 loss: 0.0719 (0.0846) time: 2.9549 data: 0.0077 max mem: 33303 Epoch: [31] [3720/4276] eta: 0:27:07 lr: 1.191756718243848e-05 loss: 0.0734 (0.0847) time: 2.9596 data: 0.0077 max mem: 33303 Epoch: [31] [3730/4276] eta: 0:26:38 lr: 1.1914481728147722e-05 loss: 0.0890 (0.0847) time: 2.9283 data: 0.0071 max mem: 33303 Epoch: [31] [3740/4276] eta: 0:26:08 lr: 1.19113961850733e-05 loss: 0.0794 (0.0846) time: 2.9006 data: 0.0065 max mem: 33303 Epoch: [31] [3750/4276] eta: 0:25:39 lr: 1.1908310553187099e-05 loss: 0.0864 (0.0847) time: 2.8938 data: 0.0067 max mem: 33303 Epoch: [31] [3760/4276] eta: 0:25:10 lr: 1.1905224832460994e-05 loss: 0.0832 (0.0846) time: 2.8985 data: 0.0066 max mem: 33303 Epoch: [31] [3770/4276] eta: 0:24:40 lr: 1.1902139022866843e-05 loss: 0.0689 (0.0846) time: 2.9029 data: 0.0067 max mem: 33303 Epoch: [31] [3780/4276] eta: 0:24:11 lr: 1.1899053124376489e-05 loss: 0.0706 (0.0846) time: 2.9006 data: 0.0068 max mem: 33303 Epoch: [31] [3790/4276] eta: 0:23:42 lr: 1.1895967136961742e-05 loss: 0.0706 (0.0846) time: 2.9296 data: 0.0074 max mem: 33303 Epoch: [31] [3800/4276] eta: 0:23:13 lr: 1.1892881060594412e-05 loss: 0.0788 (0.0846) time: 2.9306 data: 0.0074 max mem: 33303 Epoch: [31] [3810/4276] eta: 0:22:43 lr: 1.1889794895246287e-05 loss: 0.0831 (0.0845) time: 2.9131 data: 0.0078 max mem: 33303 Epoch: [31] [3820/4276] eta: 0:22:14 lr: 1.1886708640889146e-05 loss: 0.0693 (0.0845) time: 2.9203 data: 
0.0086 max mem: 33303 Epoch: [31] [3830/4276] eta: 0:21:45 lr: 1.1883622297494726e-05 loss: 0.0693 (0.0845) time: 2.9161 data: 0.0086 max mem: 33303 Epoch: [31] [3840/4276] eta: 0:21:16 lr: 1.1880535865034772e-05 loss: 0.0733 (0.0845) time: 2.9355 data: 0.0079 max mem: 33303 Epoch: [31] [3850/4276] eta: 0:20:46 lr: 1.1877449343481003e-05 loss: 0.0733 (0.0844) time: 2.9571 data: 0.0071 max mem: 33303 Epoch: [31] [3860/4276] eta: 0:20:17 lr: 1.1874362732805127e-05 loss: 0.0818 (0.0845) time: 2.9553 data: 0.0075 max mem: 33303 Epoch: [31] [3870/4276] eta: 0:19:48 lr: 1.1871276032978815e-05 loss: 0.0787 (0.0844) time: 2.9337 data: 0.0078 max mem: 33303 Epoch: [31] [3880/4276] eta: 0:19:19 lr: 1.1868189243973742e-05 loss: 0.0770 (0.0844) time: 2.9111 data: 0.0075 max mem: 33303 Epoch: [31] [3890/4276] eta: 0:18:49 lr: 1.1865102365761559e-05 loss: 0.0807 (0.0844) time: 2.9054 data: 0.0074 max mem: 33303 Epoch: [31] [3900/4276] eta: 0:18:20 lr: 1.1862015398313903e-05 loss: 0.0726 (0.0844) time: 2.9029 data: 0.0075 max mem: 33303 Epoch: [31] [3910/4276] eta: 0:17:51 lr: 1.1858928341602381e-05 loss: 0.0709 (0.0844) time: 2.9004 data: 0.0075 max mem: 33303 Epoch: [31] [3920/4276] eta: 0:17:21 lr: 1.1855841195598594e-05 loss: 0.0719 (0.0844) time: 2.9071 data: 0.0078 max mem: 33303 Epoch: [31] [3930/4276] eta: 0:16:52 lr: 1.1852753960274128e-05 loss: 0.0764 (0.0844) time: 2.9267 data: 0.0082 max mem: 33303 Epoch: [31] [3940/4276] eta: 0:16:23 lr: 1.184966663560055e-05 loss: 0.0764 (0.0844) time: 2.9469 data: 0.0085 max mem: 33303 Epoch: [31] [3950/4276] eta: 0:15:54 lr: 1.1846579221549396e-05 loss: 0.0763 (0.0843) time: 2.9567 data: 0.0083 max mem: 33303 Epoch: [31] [3960/4276] eta: 0:15:24 lr: 1.1843491718092201e-05 loss: 0.0851 (0.0843) time: 2.9508 data: 0.0079 max mem: 33303 Epoch: [31] [3970/4276] eta: 0:14:55 lr: 1.1840404125200484e-05 loss: 0.0821 (0.0843) time: 2.9490 data: 0.0079 max mem: 33303 Epoch: [31] [3980/4276] eta: 0:14:26 lr: 1.1837316442845728e-05 loss: 
0.0830 (0.0844) time: 2.9490 data: 0.0079 max mem: 33303 Epoch: [31] [3990/4276] eta: 0:13:57 lr: 1.1834228670999415e-05 loss: 0.0844 (0.0844) time: 2.9461 data: 0.0081 max mem: 33303 Epoch: [31] [4000/4276] eta: 0:13:27 lr: 1.1831140809633007e-05 loss: 0.0728 (0.0844) time: 2.9433 data: 0.0086 max mem: 33303 Epoch: [31] [4010/4276] eta: 0:12:58 lr: 1.1828052858717955e-05 loss: 0.0752 (0.0844) time: 2.9439 data: 0.0088 max mem: 33303 Epoch: [31] [4020/4276] eta: 0:12:29 lr: 1.1824964818225666e-05 loss: 0.0890 (0.0844) time: 2.9537 data: 0.0085 max mem: 33303 Epoch: [31] [4030/4276] eta: 0:12:00 lr: 1.1821876688127561e-05 loss: 0.0780 (0.0844) time: 2.9753 data: 0.0085 max mem: 33303 Epoch: [31] [4040/4276] eta: 0:11:30 lr: 1.1818788468395025e-05 loss: 0.0768 (0.0844) time: 2.9642 data: 0.0087 max mem: 33303 Epoch: [31] [4050/4276] eta: 0:11:01 lr: 1.1815700158999442e-05 loss: 0.0891 (0.0844) time: 2.9458 data: 0.0085 max mem: 33303 Epoch: [31] [4060/4276] eta: 0:10:32 lr: 1.1812611759912152e-05 loss: 0.0771 (0.0844) time: 2.9524 data: 0.0082 max mem: 33303 Epoch: [31] [4070/4276] eta: 0:10:03 lr: 1.1809523271104503e-05 loss: 0.0771 (0.0844) time: 2.9342 data: 0.0080 max mem: 33303 Epoch: [31] [4080/4276] eta: 0:09:33 lr: 1.1806434692547811e-05 loss: 0.0793 (0.0844) time: 2.9084 data: 0.0083 max mem: 33303 Epoch: [31] [4090/4276] eta: 0:09:04 lr: 1.180334602421339e-05 loss: 0.0990 (0.0844) time: 2.9057 data: 0.0085 max mem: 33303 Epoch: [31] [4100/4276] eta: 0:08:35 lr: 1.180025726607251e-05 loss: 0.0889 (0.0844) time: 2.9432 data: 0.0094 max mem: 33303 Epoch: [31] [4110/4276] eta: 0:08:05 lr: 1.1797168418096447e-05 loss: 0.0834 (0.0844) time: 2.9691 data: 0.0094 max mem: 33303 Epoch: [31] [4120/4276] eta: 0:07:36 lr: 1.1794079480256453e-05 loss: 0.0797 (0.0844) time: 2.9564 data: 0.0082 max mem: 33303 Epoch: [31] [4130/4276] eta: 0:07:07 lr: 1.1790990452523764e-05 loss: 0.0725 (0.0844) time: 2.9513 data: 0.0083 max mem: 33303 Epoch: [31] [4140/4276] eta: 0:06:38 
lr: 1.1787901334869585e-05 loss: 0.0725 (0.0844) time: 2.9540 data: 0.0087 max mem: 33303 Epoch: [31] [4150/4276] eta: 0:06:08 lr: 1.1784812127265122e-05 loss: 0.0817 (0.0844) time: 2.9526 data: 0.0087 max mem: 33303 Epoch: [31] [4160/4276] eta: 0:05:39 lr: 1.1781722829681551e-05 loss: 0.0890 (0.0844) time: 2.9509 data: 0.0085 max mem: 33303 Epoch: [31] [4170/4276] eta: 0:05:10 lr: 1.1778633442090044e-05 loss: 0.0816 (0.0845) time: 2.9480 data: 0.0085 max mem: 33303 Epoch: [31] [4180/4276] eta: 0:04:41 lr: 1.1775543964461733e-05 loss: 0.0735 (0.0844) time: 2.9435 data: 0.0087 max mem: 33303 Epoch: [31] [4190/4276] eta: 0:04:11 lr: 1.177245439676775e-05 loss: 0.0701 (0.0844) time: 2.9473 data: 0.0086 max mem: 33303 Epoch: [31] [4200/4276] eta: 0:03:42 lr: 1.1769364738979207e-05 loss: 0.0832 (0.0844) time: 2.9514 data: 0.0087 max mem: 33303 Epoch: [31] [4210/4276] eta: 0:03:13 lr: 1.17662749910672e-05 loss: 0.0861 (0.0844) time: 2.9516 data: 0.0090 max mem: 33303 Epoch: [31] [4220/4276] eta: 0:02:43 lr: 1.1763185153002792e-05 loss: 0.0874 (0.0845) time: 2.9524 data: 0.0089 max mem: 33303 Epoch: [31] [4230/4276] eta: 0:02:14 lr: 1.1760095224757046e-05 loss: 0.0888 (0.0845) time: 2.9545 data: 0.0087 max mem: 33303 Epoch: [31] [4240/4276] eta: 0:01:45 lr: 1.1757005206300998e-05 loss: 0.0888 (0.0845) time: 2.9544 data: 0.0086 max mem: 33303 Epoch: [31] [4250/4276] eta: 0:01:16 lr: 1.1753915097605679e-05 loss: 0.0968 (0.0846) time: 2.9522 data: 0.0085 max mem: 33303 Epoch: [31] [4260/4276] eta: 0:00:46 lr: 1.1750824898642079e-05 loss: 0.0973 (0.0846) time: 2.9506 data: 0.0087 max mem: 33303 Epoch: [31] [4270/4276] eta: 0:00:17 lr: 1.1747734609381188e-05 loss: 0.0892 (0.0846) time: 2.9453 data: 0.0079 max mem: 33303 Epoch: [31] Total time: 3:28:42 Test: [ 0/21770] eta: 11:30:34 time: 1.9033 data: 1.8645 max mem: 33303 Test: [ 100/21770] eta: 0:20:17 time: 0.0383 data: 0.0008 max mem: 33303 Test: [ 200/21770] eta: 0:16:58 time: 0.0381 data: 0.0008 max mem: 33303 Test: [ 
300/21770] eta: 0:15:50 time: 0.0386 data: 0.0008 max mem: 33303 Test: [ 400/21770] eta: 0:15:15 time: 0.0383 data: 0.0008 max mem: 33303 Test: [ 500/21770] eta: 0:14:53 time: 0.0386 data: 0.0008 max mem: 33303 Test: [ 600/21770] eta: 0:14:36 time: 0.0386 data: 0.0008 max mem: 33303 Test: [ 700/21770] eta: 0:14:24 time: 0.0385 data: 0.0009 max mem: 33303 Test: [ 800/21770] eta: 0:14:13 time: 0.0388 data: 0.0009 max mem: 33303 Test: [ 900/21770] eta: 0:14:05 time: 0.0386 data: 0.0009 max mem: 33303 Test: [ 1000/21770] eta: 0:13:57 time: 0.0388 data: 0.0009 max mem: 33303 Test: [ 1100/21770] eta: 0:13:50 time: 0.0386 data: 0.0009 max mem: 33303 Test: [ 1200/21770] eta: 0:13:43 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 1300/21770] eta: 0:13:37 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 1400/21770] eta: 0:13:31 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 1500/21770] eta: 0:13:25 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 1600/21770] eta: 0:13:20 time: 0.0384 data: 0.0008 max mem: 33303 Test: [ 1700/21770] eta: 0:13:14 time: 0.0385 data: 0.0009 max mem: 33303 Test: [ 1800/21770] eta: 0:13:09 time: 0.0385 data: 0.0009 max mem: 33303 Test: [ 1900/21770] eta: 0:13:04 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 2000/21770] eta: 0:12:59 time: 0.0385 data: 0.0009 max mem: 33303 Test: [ 2100/21770] eta: 0:12:54 time: 0.0386 data: 0.0009 max mem: 33303 Test: [ 2200/21770] eta: 0:12:50 time: 0.0387 data: 0.0009 max mem: 33303 Test: [ 2300/21770] eta: 0:12:45 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 2400/21770] eta: 0:12:41 time: 0.0387 data: 0.0009 max mem: 33303 Test: [ 2500/21770] eta: 0:12:36 time: 0.0384 data: 0.0009 max mem: 33303 Test: [ 2600/21770] eta: 0:12:32 time: 0.0387 data: 0.0009 max mem: 33303 Test: [ 2700/21770] eta: 0:12:27 time: 0.0383 data: 0.0009 max mem: 33303 Test: [ 2800/21770] eta: 0:12:23 time: 0.0387 data: 0.0009 max mem: 33303 Test: [ 2900/21770] eta: 0:12:18 time: 0.0387 data: 0.0009 max mem: 33303 Test: [ 
3000/21770] eta: 0:12:14 time: 0.0388 data: 0.0009 max mem: 33303 Test: [ 3100/21770] eta: 0:12:10 time: 0.0387 data: 0.0009 max mem: 33303 Test: [ 3200/21770] eta: 0:12:06 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 3300/21770] eta: 0:12:02 time: 0.0394 data: 0.0009 max mem: 33303 Test: [ 3400/21770] eta: 0:11:58 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 3500/21770] eta: 0:11:54 time: 0.0394 data: 0.0009 max mem: 33303 Test: [ 3600/21770] eta: 0:11:50 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 3700/21770] eta: 0:11:46 time: 0.0392 data: 0.0009 max mem: 33303 Test: [ 3800/21770] eta: 0:11:43 time: 0.0392 data: 0.0009 max mem: 33303 Test: [ 3900/21770] eta: 0:11:39 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 4000/21770] eta: 0:11:35 time: 0.0393 data: 0.0009 max mem: 33303 Test: [ 4100/21770] eta: 0:11:31 time: 0.0393 data: 0.0009 max mem: 33303 Test: [ 4200/21770] eta: 0:11:27 time: 0.0392 data: 0.0009 max mem: 33303 Test: [ 4300/21770] eta: 0:11:23 time: 0.0393 data: 0.0009 max mem: 33303 Test: [ 4400/21770] eta: 0:11:19 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 4500/21770] eta: 0:11:15 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 4600/21770] eta: 0:11:11 time: 0.0385 data: 0.0009 max mem: 33303 Test: [ 4700/21770] eta: 0:11:07 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 4800/21770] eta: 0:11:03 time: 0.0386 data: 0.0009 max mem: 33303 Test: [ 4900/21770] eta: 0:10:59 time: 0.0390 data: 0.0009 max mem: 33303 Test: [ 5000/21770] eta: 0:10:55 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 5100/21770] eta: 0:10:51 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 5200/21770] eta: 0:10:47 time: 0.0388 data: 0.0009 max mem: 33303 Test: [ 5300/21770] eta: 0:10:43 time: 0.0392 data: 0.0009 max mem: 33303 Test: [ 5400/21770] eta: 0:10:39 time: 0.0392 data: 0.0009 max mem: 33303 Test: [ 5500/21770] eta: 0:10:35 time: 0.0392 data: 0.0009 max mem: 33303 Test: [ 5600/21770] eta: 0:10:32 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 
5700/21770] eta: 0:10:28 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 5800/21770] eta: 0:10:24 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 5900/21770] eta: 0:10:20 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 6000/21770] eta: 0:10:16 time: 0.0392 data: 0.0009 max mem: 33303 Test: [ 6100/21770] eta: 0:10:12 time: 0.0390 data: 0.0009 max mem: 33303 Test: [ 6200/21770] eta: 0:10:08 time: 0.0393 data: 0.0009 max mem: 33303 Test: [ 6300/21770] eta: 0:10:04 time: 0.0390 data: 0.0009 max mem: 33303 Test: [ 6400/21770] eta: 0:10:00 time: 0.0382 data: 0.0008 max mem: 33303 Test: [ 6500/21770] eta: 0:09:56 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 6600/21770] eta: 0:09:52 time: 0.0383 data: 0.0008 max mem: 33303 Test: [ 6700/21770] eta: 0:09:48 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 6800/21770] eta: 0:09:44 time: 0.0398 data: 0.0009 max mem: 33303 Test: [ 6900/21770] eta: 0:09:40 time: 0.0402 data: 0.0009 max mem: 33303 Test: [ 7000/21770] eta: 0:09:37 time: 0.0393 data: 0.0009 max mem: 33303 Test: [ 7100/21770] eta: 0:09:33 time: 0.0392 data: 0.0009 max mem: 33303 Test: [ 7200/21770] eta: 0:09:29 time: 0.0392 data: 0.0009 max mem: 33303 Test: [ 7300/21770] eta: 0:09:25 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 7400/21770] eta: 0:09:21 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 7500/21770] eta: 0:09:17 time: 0.0394 data: 0.0009 max mem: 33303 Test: [ 7600/21770] eta: 0:09:13 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 7700/21770] eta: 0:09:09 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 7800/21770] eta: 0:09:05 time: 0.0383 data: 0.0009 max mem: 33303 Test: [ 7900/21770] eta: 0:09:01 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 8000/21770] eta: 0:08:57 time: 0.0383 data: 0.0009 max mem: 33303 Test: [ 8100/21770] eta: 0:08:53 time: 0.0382 data: 0.0009 max mem: 33303 Test: [ 8200/21770] eta: 0:08:49 time: 0.0387 data: 0.0009 max mem: 33303 Test: [ 8300/21770] eta: 0:08:45 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 
8400/21770] eta: 0:08:41 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 8500/21770] eta: 0:08:37 time: 0.0392 data: 0.0009 max mem: 33303 Test: [ 8600/21770] eta: 0:08:33 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 8700/21770] eta: 0:08:30 time: 0.0392 data: 0.0009 max mem: 33303 Test: [ 8800/21770] eta: 0:08:26 time: 0.0387 data: 0.0009 max mem: 33303 Test: [ 8900/21770] eta: 0:08:22 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 9000/21770] eta: 0:08:18 time: 0.0388 data: 0.0009 max mem: 33303 Test: [ 9100/21770] eta: 0:08:14 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 9200/21770] eta: 0:08:10 time: 0.0387 data: 0.0009 max mem: 33303 Test: [ 9300/21770] eta: 0:08:06 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 9400/21770] eta: 0:08:02 time: 0.0388 data: 0.0009 max mem: 33303 Test: [ 9500/21770] eta: 0:07:58 time: 0.0389 data: 0.0009 max mem: 33303 Test: [ 9600/21770] eta: 0:07:54 time: 0.0388 data: 0.0009 max mem: 33303 Test: [ 9700/21770] eta: 0:07:50 time: 0.0391 data: 0.0009 max mem: 33303 Test: [ 9800/21770] eta: 0:07:46 time: 0.0388 data: 0.0009 max mem: 33303 Test: [ 9900/21770] eta: 0:07:42 time: 0.0388 data: 0.0009 max mem: 33303 Test: [10000/21770] eta: 0:07:38 time: 0.0389 data: 0.0009 max mem: 33303 Test: [10100/21770] eta: 0:07:35 time: 0.0390 data: 0.0009 max mem: 33303 Test: [10200/21770] eta: 0:07:31 time: 0.0387 data: 0.0009 max mem: 33303 Test: [10300/21770] eta: 0:07:27 time: 0.0390 data: 0.0009 max mem: 33303 Test: [10400/21770] eta: 0:07:23 time: 0.0388 data: 0.0009 max mem: 33303 Test: [10500/21770] eta: 0:07:19 time: 0.0390 data: 0.0009 max mem: 33303 Test: [10600/21770] eta: 0:07:15 time: 0.0389 data: 0.0009 max mem: 33303 Test: [10700/21770] eta: 0:07:11 time: 0.0388 data: 0.0009 max mem: 33303 Test: [10800/21770] eta: 0:07:07 time: 0.0388 data: 0.0009 max mem: 33303 Test: [10900/21770] eta: 0:07:03 time: 0.0389 data: 0.0009 max mem: 33303 Test: [11000/21770] eta: 0:06:59 time: 0.0388 data: 0.0009 max mem: 33303 Test: 
[11100/21770] eta: 0:06:55 time: 0.0388 data: 0.0009 max mem: 33303 Test: [11200/21770] eta: 0:06:51 time: 0.0387 data: 0.0009 max mem: 33303 Test: [11300/21770] eta: 0:06:48 time: 0.0390 data: 0.0009 max mem: 33303 Test: [11400/21770] eta: 0:06:44 time: 0.0388 data: 0.0009 max mem: 33303 Test: [11500/21770] eta: 0:06:40 time: 0.0392 data: 0.0009 max mem: 33303 Test: [11600/21770] eta: 0:06:36 time: 0.0392 data: 0.0009 max mem: 33303 Test: [11700/21770] eta: 0:06:32 time: 0.0389 data: 0.0009 max mem: 33303 Test: [11800/21770] eta: 0:06:28 time: 0.0388 data: 0.0009 max mem: 33303 Test: [11900/21770] eta: 0:06:24 time: 0.0388 data: 0.0009 max mem: 33303 Test: [12000/21770] eta: 0:06:20 time: 0.0389 data: 0.0009 max mem: 33303 Test: [12100/21770] eta: 0:06:16 time: 0.0389 data: 0.0009 max mem: 33303 Test: [12200/21770] eta: 0:06:12 time: 0.0386 data: 0.0009 max mem: 33303 Test: [12300/21770] eta: 0:06:09 time: 0.0389 data: 0.0009 max mem: 33303 Test: [12400/21770] eta: 0:06:05 time: 0.0389 data: 0.0009 max mem: 33303 Test: [12500/21770] eta: 0:06:01 time: 0.0390 data: 0.0009 max mem: 33303 Test: [12600/21770] eta: 0:05:57 time: 0.0388 data: 0.0009 max mem: 33303 Test: [12700/21770] eta: 0:05:53 time: 0.0390 data: 0.0009 max mem: 33303 Test: [12800/21770] eta: 0:05:49 time: 0.0387 data: 0.0009 max mem: 33303 Test: [12900/21770] eta: 0:05:45 time: 0.0391 data: 0.0009 max mem: 33303 Test: [13000/21770] eta: 0:05:41 time: 0.0388 data: 0.0009 max mem: 33303 Test: [13100/21770] eta: 0:05:37 time: 0.0393 data: 0.0009 max mem: 33303 Test: [13200/21770] eta: 0:05:33 time: 0.0391 data: 0.0008 max mem: 33303 Test: [13300/21770] eta: 0:05:30 time: 0.0393 data: 0.0009 max mem: 33303 Test: [13400/21770] eta: 0:05:26 time: 0.0392 data: 0.0009 max mem: 33303 Test: [13500/21770] eta: 0:05:22 time: 0.0394 data: 0.0009 max mem: 33303 Test: [13600/21770] eta: 0:05:18 time: 0.0391 data: 0.0008 max mem: 33303 Test: [13700/21770] eta: 0:05:14 time: 0.0394 data: 0.0009 max mem: 33303 Test: 
[13800/21770] eta: 0:05:10 time: 0.0391 data: 0.0008 max mem: 33303 Test: [13900/21770] eta: 0:05:06 time: 0.0394 data: 0.0009 max mem: 33303 Test: [14000/21770] eta: 0:05:02 time: 0.0391 data: 0.0009 max mem: 33303 Test: [14100/21770] eta: 0:04:59 time: 0.0393 data: 0.0009 max mem: 33303 Test: [14200/21770] eta: 0:04:55 time: 0.0391 data: 0.0009 max mem: 33303 Test: [14300/21770] eta: 0:04:51 time: 0.0394 data: 0.0009 max mem: 33303 Test: [14400/21770] eta: 0:04:47 time: 0.0391 data: 0.0009 max mem: 33303 Test: [14500/21770] eta: 0:04:43 time: 0.0401 data: 0.0009 max mem: 33303 Test: [14600/21770] eta: 0:04:39 time: 0.0395 data: 0.0009 max mem: 33303 Test: [14700/21770] eta: 0:04:35 time: 0.0396 data: 0.0009 max mem: 33303 Test: [14800/21770] eta: 0:04:31 time: 0.0393 data: 0.0009 max mem: 33303 Test: [14900/21770] eta: 0:04:27 time: 0.0398 data: 0.0009 max mem: 33303 Test: [15000/21770] eta: 0:04:24 time: 0.0390 data: 0.0009 max mem: 33303 Test: [15100/21770] eta: 0:04:20 time: 0.0395 data: 0.0010 max mem: 33303 Test: [15200/21770] eta: 0:04:16 time: 0.0392 data: 0.0009 max mem: 33303 Test: [15300/21770] eta: 0:04:12 time: 0.0392 data: 0.0009 max mem: 33303 Test: [15400/21770] eta: 0:04:08 time: 0.0394 data: 0.0009 max mem: 33303 Test: [15500/21770] eta: 0:04:04 time: 0.0394 data: 0.0009 max mem: 33303 Test: [15600/21770] eta: 0:04:00 time: 0.0398 data: 0.0008 max mem: 33303 Test: [15700/21770] eta: 0:03:56 time: 0.0393 data: 0.0009 max mem: 33303 Test: [15800/21770] eta: 0:03:52 time: 0.0391 data: 0.0009 max mem: 33303 Test: [15900/21770] eta: 0:03:49 time: 0.0395 data: 0.0009 max mem: 33303 Test: [16000/21770] eta: 0:03:45 time: 0.0391 data: 0.0009 max mem: 33303 Test: [16100/21770] eta: 0:03:41 time: 0.0393 data: 0.0009 max mem: 33303 Test: [16200/21770] eta: 0:03:37 time: 0.0389 data: 0.0009 max mem: 33303 Test: [16300/21770] eta: 0:03:33 time: 0.0396 data: 0.0008 max mem: 33303 Test: [16400/21770] eta: 0:03:29 time: 0.0393 data: 0.0008 max mem: 33303 Test: 
[16500/21770] eta: 0:03:25 time: 0.0393 data: 0.0009 max mem: 33303 Test: [16600/21770] eta: 0:03:21 time: 0.0393 data: 0.0009 max mem: 33303 Test: [16700/21770] eta: 0:03:17 time: 0.0395 data: 0.0008 max mem: 33303 Test: [16800/21770] eta: 0:03:14 time: 0.0395 data: 0.0008 max mem: 33303 Test: [16900/21770] eta: 0:03:10 time: 0.0397 data: 0.0008 max mem: 33303 Test: [17000/21770] eta: 0:03:06 time: 0.0392 data: 0.0008 max mem: 33303 Test: [17100/21770] eta: 0:03:02 time: 0.0389 data: 0.0008 max mem: 33303 Test: [17200/21770] eta: 0:02:58 time: 0.0388 data: 0.0008 max mem: 33303 Test: [17300/21770] eta: 0:02:54 time: 0.0471 data: 0.0099 max mem: 33303 Test: [17400/21770] eta: 0:02:50 time: 0.0384 data: 0.0009 max mem: 33303 Test: [17500/21770] eta: 0:02:46 time: 0.0386 data: 0.0008 max mem: 33303 Test: [17600/21770] eta: 0:02:42 time: 0.0387 data: 0.0008 max mem: 33303 Test: [17700/21770] eta: 0:02:38 time: 0.0386 data: 0.0009 max mem: 33303 Test: [17800/21770] eta: 0:02:34 time: 0.0386 data: 0.0009 max mem: 33303 Test: [17900/21770] eta: 0:02:31 time: 0.0472 data: 0.0091 max mem: 33303 Test: [18000/21770] eta: 0:02:27 time: 0.0385 data: 0.0009 max mem: 33303 Test: [18100/21770] eta: 0:02:23 time: 0.0388 data: 0.0009 max mem: 33303 Test: [18200/21770] eta: 0:02:19 time: 0.0383 data: 0.0009 max mem: 33303 Test: [18300/21770] eta: 0:02:15 time: 0.0392 data: 0.0009 max mem: 33303 Test: [18400/21770] eta: 0:02:11 time: 0.0387 data: 0.0009 max mem: 33303 Test: [18500/21770] eta: 0:02:07 time: 0.0390 data: 0.0009 max mem: 33303 Test: [18600/21770] eta: 0:02:03 time: 0.0388 data: 0.0009 max mem: 33303 Test: [18700/21770] eta: 0:01:59 time: 0.0390 data: 0.0009 max mem: 33303 Test: [18800/21770] eta: 0:01:55 time: 0.0387 data: 0.0009 max mem: 33303 Test: [18900/21770] eta: 0:01:52 time: 0.0390 data: 0.0009 max mem: 33303 Test: [19000/21770] eta: 0:01:48 time: 0.0389 data: 0.0009 max mem: 33303 Test: [19100/21770] eta: 0:01:44 time: 0.0388 data: 0.0009 max mem: 33303 Test: 
[19200/21770] eta: 0:01:40 time: 0.0388 data: 0.0008 max mem: 33303 Test: [19300/21770] eta: 0:01:36 time: 0.0390 data: 0.0009 max mem: 33303 Test: [19400/21770] eta: 0:01:32 time: 0.0388 data: 0.0009 max mem: 33303 Test: [19500/21770] eta: 0:01:28 time: 0.0388 data: 0.0008 max mem: 33303 Test: [19600/21770] eta: 0:01:24 time: 0.0387 data: 0.0009 max mem: 33303 Test: [19700/21770] eta: 0:01:20 time: 0.0389 data: 0.0009 max mem: 33303 Test: [19800/21770] eta: 0:01:16 time: 0.0388 data: 0.0008 max mem: 33303 Test: [19900/21770] eta: 0:01:12 time: 0.0386 data: 0.0009 max mem: 33303 Test: [20000/21770] eta: 0:01:09 time: 0.0386 data: 0.0009 max mem: 33303 Test: [20100/21770] eta: 0:01:05 time: 0.0385 data: 0.0009 max mem: 33303 Test: [20200/21770] eta: 0:01:01 time: 0.0386 data: 0.0009 max mem: 33303 Test: [20300/21770] eta: 0:00:57 time: 0.0386 data: 0.0009 max mem: 33303 Test: [20400/21770] eta: 0:00:53 time: 0.0386 data: 0.0009 max mem: 33303 Test: [20500/21770] eta: 0:00:49 time: 0.0384 data: 0.0008 max mem: 33303 Test: [20600/21770] eta: 0:00:45 time: 0.0388 data: 0.0009 max mem: 33303 Test: [20700/21770] eta: 0:00:41 time: 0.0389 data: 0.0008 max mem: 33303 Test: [20800/21770] eta: 0:00:37 time: 0.0388 data: 0.0008 max mem: 33303 Test: [20900/21770] eta: 0:00:33 time: 0.0391 data: 0.0009 max mem: 33303 Test: [21000/21770] eta: 0:00:30 time: 0.0389 data: 0.0009 max mem: 33303 Test: [21100/21770] eta: 0:00:26 time: 0.0391 data: 0.0008 max mem: 33303 Test: [21200/21770] eta: 0:00:22 time: 0.0389 data: 0.0009 max mem: 33303 Test: [21300/21770] eta: 0:00:18 time: 0.0390 data: 0.0008 max mem: 33303 Test: [21400/21770] eta: 0:00:14 time: 0.0390 data: 0.0009 max mem: 33303 Test: [21500/21770] eta: 0:00:10 time: 0.0391 data: 0.0009 max mem: 33303 Test: [21600/21770] eta: 0:00:06 time: 0.0395 data: 0.0009 max mem: 33303 Test: [21700/21770] eta: 0:00:02 time: 0.0396 data: 0.0009 max mem: 33303 Test: Total time: 0:14:09 Final results: Mean IoU is 0.00 precision@0.5 = 0.00 
precision@0.6 = 0.00 precision@0.7 = 0.00 precision@0.8 = 0.00 precision@0.9 = 0.00 overall IoU = 0.00 mean IoU = 0.00 Mean accuracy for one-to-zero sample is 100.00 Average object IoU 0.0 Overall IoU 0.0 Epoch: [32] [ 0/4276] eta: 6:24:59 lr: 1.174588039246988e-05 loss: 0.0738 (0.0738) time: 5.4022 data: 2.3652 max mem: 33303 Epoch: [32] [ 10/4276] eta: 3:44:21 lr: 1.174278995867293e-05 loss: 0.0738 (0.0819) time: 3.1556 data: 0.2229 max mem: 33303 Epoch: [32] [ 20/4276] eta: 3:36:54 lr: 1.1739699434503175e-05 loss: 0.0769 (0.0839) time: 2.9407 data: 0.0084 max mem: 33303 Epoch: [32] [ 30/4276] eta: 3:33:55 lr: 1.1736608819931524e-05 loss: 0.0814 (0.0836) time: 2.9499 data: 0.0077 max mem: 33303 Epoch: [32] [ 40/4276] eta: 3:32:09 lr: 1.1733518114928887e-05 loss: 0.0777 (0.0823) time: 2.9494 data: 0.0072 max mem: 33303 Epoch: [32] [ 50/4276] eta: 3:30:53 lr: 1.1730427319466154e-05 loss: 0.0766 (0.0833) time: 2.9500 data: 0.0072 max mem: 33303 Epoch: [32] [ 60/4276] eta: 3:29:41 lr: 1.1727336433514182e-05 loss: 0.0752 (0.0814) time: 2.9415 data: 0.0072 max mem: 33303 Epoch: [32] [ 70/4276] eta: 3:28:42 lr: 1.1724245457043825e-05 loss: 0.0702 (0.0799) time: 2.9338 data: 0.0072 max mem: 33303 Epoch: [32] [ 80/4276] eta: 3:27:49 lr: 1.1721154390025917e-05 loss: 0.0762 (0.0807) time: 2.9340 data: 0.0071 max mem: 33303 Epoch: [32] [ 90/4276] eta: 3:27:04 lr: 1.1718063232431276e-05 loss: 0.0801 (0.0804) time: 2.9353 data: 0.0071 max mem: 33303 Epoch: [32] [ 100/4276] eta: 3:26:24 lr: 1.1714971984230687e-05 loss: 0.0816 (0.0814) time: 2.9410 data: 0.0072 max mem: 33303 Epoch: [32] [ 110/4276] eta: 3:25:49 lr: 1.1711880645394932e-05 loss: 0.0833 (0.0821) time: 2.9470 data: 0.0072 max mem: 33303 Epoch: [32] [ 120/4276] eta: 3:25:16 lr: 1.1708789215894772e-05 loss: 0.0833 (0.0821) time: 2.9523 data: 0.0071 max mem: 33303 Epoch: [32] [ 130/4276] eta: 3:24:40 lr: 1.1705697695700957e-05 loss: 0.0877 (0.0827) time: 2.9494 data: 0.0073 max mem: 33303 Epoch: [32] [ 140/4276] eta: 
3:24:06 lr: 1.1702606084784195e-05 loss: 0.0754 (0.0818) time: 2.9452 data: 0.0073 max mem: 33303 Epoch: [32] [ 150/4276] eta: 3:23:33 lr: 1.16995143831152e-05 loss: 0.0720 (0.0815) time: 2.9479 data: 0.0071 max mem: 33303 Epoch: [32] [ 160/4276] eta: 3:22:57 lr: 1.1696422590664659e-05 loss: 0.0754 (0.0815) time: 2.9420 data: 0.0072 max mem: 33303 Epoch: [32] [ 170/4276] eta: 3:22:12 lr: 1.1693330707403246e-05 loss: 0.0806 (0.0815) time: 2.9146 data: 0.0080 max mem: 33303 Epoch: [32] [ 180/4276] eta: 3:21:29 lr: 1.1690238733301604e-05 loss: 0.0810 (0.0824) time: 2.8953 data: 0.0088 max mem: 33303 Epoch: [32] [ 190/4276] eta: 3:21:00 lr: 1.1687146668330372e-05 loss: 0.0807 (0.0821) time: 2.9256 data: 0.0084 max mem: 33303 Epoch: [32] [ 200/4276] eta: 3:20:30 lr: 1.1684054512460162e-05 loss: 0.0711 (0.0817) time: 2.9506 data: 0.0077 max mem: 33303 Epoch: [32] [ 210/4276] eta: 3:19:58 lr: 1.168096226566158e-05 loss: 0.0755 (0.0822) time: 2.9435 data: 0.0074 max mem: 33303 Epoch: [32] [ 220/4276] eta: 3:19:25 lr: 1.1677869927905192e-05 loss: 0.0755 (0.0823) time: 2.9364 data: 0.0074 max mem: 33303 Epoch: [32] [ 230/4276] eta: 3:18:55 lr: 1.1674777499161565e-05 loss: 0.0780 (0.0824) time: 2.9381 data: 0.0083 max mem: 33303 Epoch: [32] [ 240/4276] eta: 3:18:28 lr: 1.1671684979401242e-05 loss: 0.0832 (0.0827) time: 2.9555 data: 0.0086 max mem: 33303 Epoch: [32] [ 250/4276] eta: 3:17:59 lr: 1.166859236859475e-05 loss: 0.0864 (0.0829) time: 2.9592 data: 0.0075 max mem: 33303 Epoch: [32] [ 260/4276] eta: 3:17:29 lr: 1.166549966671259e-05 loss: 0.0891 (0.0834) time: 2.9522 data: 0.0071 max mem: 33303 Epoch: [32] [ 270/4276] eta: 3:17:00 lr: 1.166240687372525e-05 loss: 0.0852 (0.0834) time: 2.9513 data: 0.0072 max mem: 33303 Epoch: [32] [ 280/4276] eta: 3:16:31 lr: 1.1659313989603201e-05 loss: 0.0648 (0.0832) time: 2.9521 data: 0.0070 max mem: 33303 Epoch: [32] [ 290/4276] eta: 3:16:02 lr: 1.1656221014316901e-05 loss: 0.0710 (0.0830) time: 2.9551 data: 0.0070 max mem: 33303 
Epoch: [32] [ 300/4276] eta: 3:15:33 lr: 1.1653127947836772e-05 loss: 0.0716 (0.0829) time: 2.9573 data: 0.0073 max mem: 33303 Epoch: [32] [ 310/4276] eta: 3:15:04 lr: 1.1650034790133233e-05 loss: 0.0662 (0.0824) time: 2.9559 data: 0.0073 max mem: 33303 Epoch: [32] [ 320/4276] eta: 3:14:35 lr: 1.1646941541176681e-05 loss: 0.0720 (0.0825) time: 2.9519 data: 0.0071 max mem: 33303 Epoch: [32] [ 330/4276] eta: 3:14:03 lr: 1.1643848200937501e-05 loss: 0.0794 (0.0826) time: 2.9426 data: 0.0072 max mem: 33303 Epoch: [32] [ 340/4276] eta: 3:13:33 lr: 1.164075476938604e-05 loss: 0.0789 (0.0825) time: 2.9399 data: 0.0073 max mem: 33303 Epoch: [32] [ 350/4276] eta: 3:13:03 lr: 1.1637661246492643e-05 loss: 0.0805 (0.0828) time: 2.9450 data: 0.0073 max mem: 33303 Epoch: [32] [ 360/4276] eta: 3:12:33 lr: 1.1634567632227643e-05 loss: 0.0935 (0.0833) time: 2.9443 data: 0.0071 max mem: 33303 Epoch: [32] [ 370/4276] eta: 3:12:02 lr: 1.163147392656133e-05 loss: 0.0859 (0.0835) time: 2.9440 data: 0.0069 max mem: 33303 Epoch: [32] [ 380/4276] eta: 3:11:32 lr: 1.1628380129463998e-05 loss: 0.0846 (0.0835) time: 2.9439 data: 0.0069 max mem: 33303 Epoch: [32] [ 390/4276] eta: 3:11:03 lr: 1.1625286240905913e-05 loss: 0.0753 (0.0834) time: 2.9457 data: 0.0069 max mem: 33303 Epoch: [32] [ 400/4276] eta: 3:10:32 lr: 1.162219226085733e-05 loss: 0.0875 (0.0837) time: 2.9454 data: 0.0069 max mem: 33303 Epoch: [32] [ 410/4276] eta: 3:10:02 lr: 1.161909818928847e-05 loss: 0.0903 (0.0838) time: 2.9439 data: 0.0069 max mem: 33303 Epoch: [32] [ 420/4276] eta: 3:09:33 lr: 1.1616004026169548e-05 loss: 0.0834 (0.0839) time: 2.9453 data: 0.0070 max mem: 33303 Epoch: [32] [ 430/4276] eta: 3:09:03 lr: 1.161290977147076e-05 loss: 0.0887 (0.0840) time: 2.9458 data: 0.0070 max mem: 33303 Epoch: [32] [ 440/4276] eta: 3:08:33 lr: 1.1609815425162287e-05 loss: 0.0802 (0.0839) time: 2.9447 data: 0.0070 max mem: 33303 Epoch: [32] [ 450/4276] eta: 3:08:03 lr: 1.160672098721427e-05 loss: 0.0795 (0.0840) time: 2.9444 
data: 0.0069 max mem: 33303 Epoch: [32] [ 460/4276] eta: 3:07:33 lr: 1.160362645759686e-05 loss: 0.0806 (0.0838) time: 2.9435 data: 0.0071 max mem: 33303 Epoch: [32] [ 470/4276] eta: 3:07:03 lr: 1.1600531836280172e-05 loss: 0.0798 (0.0837) time: 2.9412 data: 0.0072 max mem: 33303 Epoch: [32] [ 480/4276] eta: 3:06:32 lr: 1.1597437123234313e-05 loss: 0.0817 (0.0836) time: 2.9371 data: 0.0073 max mem: 33303 Epoch: [32] [ 490/4276] eta: 3:06:02 lr: 1.1594342318429353e-05 loss: 0.0795 (0.0834) time: 2.9365 data: 0.0074 max mem: 33303 Epoch: [32] [ 500/4276] eta: 3:05:31 lr: 1.1591247421835363e-05 loss: 0.0691 (0.0833) time: 2.9360 data: 0.0071 max mem: 33303 Epoch: [32] [ 510/4276] eta: 3:05:01 lr: 1.1588152433422386e-05 loss: 0.0726 (0.0833) time: 2.9327 data: 0.0070 max mem: 33303 Epoch: [32] [ 520/4276] eta: 3:04:31 lr: 1.1585057353160458e-05 loss: 0.0792 (0.0833) time: 2.9380 data: 0.0069 max mem: 33303 Epoch: [32] [ 530/4276] eta: 3:03:59 lr: 1.158196218101957e-05 loss: 0.0792 (0.0834) time: 2.9305 data: 0.0073 max mem: 33303 Epoch: [32] [ 540/4276] eta: 3:03:26 lr: 1.157886691696972e-05 loss: 0.0804 (0.0834) time: 2.9042 data: 0.0077 max mem: 33303 Epoch: [32] [ 550/4276] eta: 3:02:53 lr: 1.1575771560980877e-05 loss: 0.0809 (0.0834) time: 2.8902 data: 0.0079 max mem: 33303 Epoch: [32] [ 560/4276] eta: 3:02:20 lr: 1.1572676113022999e-05 loss: 0.0849 (0.0836) time: 2.8907 data: 0.0082 max mem: 33303 Epoch: [32] [ 570/4276] eta: 3:01:47 lr: 1.1569580573066006e-05 loss: 0.0795 (0.0836) time: 2.8904 data: 0.0083 max mem: 33303 Epoch: [32] [ 580/4276] eta: 3:01:14 lr: 1.156648494107982e-05 loss: 0.0774 (0.0837) time: 2.8894 data: 0.0081 max mem: 33303 Epoch: [32] [ 590/4276] eta: 3:00:41 lr: 1.1563389217034332e-05 loss: 0.0805 (0.0836) time: 2.8886 data: 0.0081 max mem: 33303 Epoch: [32] [ 600/4276] eta: 3:00:09 lr: 1.156029340089943e-05 loss: 0.0805 (0.0836) time: 2.8939 data: 0.0083 max mem: 33303 Epoch: [32] [ 610/4276] eta: 2:59:38 lr: 1.1557197492644954e-05 loss: 
0.0759 (0.0836) time: 2.9062 data: 0.0084 max mem: 33303 Epoch: [32] [ 620/4276] eta: 2:59:06 lr: 1.1554101492240752e-05 loss: 0.0730 (0.0836) time: 2.9033 data: 0.0082 max mem: 33303 Epoch: [32] [ 630/4276] eta: 2:58:34 lr: 1.1551005399656642e-05 loss: 0.0797 (0.0837) time: 2.8938 data: 0.0082 max mem: 33303 Epoch: [32] [ 640/4276] eta: 2:58:02 lr: 1.1547909214862433e-05 loss: 0.0803 (0.0837) time: 2.8924 data: 0.0084 max mem: 33303 Epoch: [32] [ 650/4276] eta: 2:57:30 lr: 1.1544812937827892e-05 loss: 0.0761 (0.0836) time: 2.8933 data: 0.0083 max mem: 33303 Epoch: [32] [ 660/4276] eta: 2:56:58 lr: 1.154171656852279e-05 loss: 0.0820 (0.0838) time: 2.8885 data: 0.0080 max mem: 33303 Epoch: [32] [ 670/4276] eta: 2:56:25 lr: 1.1538620106916879e-05 loss: 0.0833 (0.0840) time: 2.8754 data: 0.0081 max mem: 33303 Epoch: [32] [ 680/4276] eta: 2:55:52 lr: 1.153552355297987e-05 loss: 0.0854 (0.0840) time: 2.8718 data: 0.0084 max mem: 33303 Epoch: [32] [ 690/4276] eta: 2:55:18 lr: 1.1532426906681476e-05 loss: 0.0831 (0.0840) time: 2.8643 data: 0.0086 max mem: 33303 Epoch: [32] [ 700/4276] eta: 2:54:45 lr: 1.1529330167991382e-05 loss: 0.0766 (0.0839) time: 2.8563 data: 0.0084 max mem: 33303 Epoch: [32] [ 710/4276] eta: 2:54:15 lr: 1.1526233336879266e-05 loss: 0.0775 (0.0839) time: 2.8823 data: 0.0084 max mem: 33303 Epoch: [32] [ 720/4276] eta: 2:53:45 lr: 1.1523136413314763e-05 loss: 0.0778 (0.0838) time: 2.9169 data: 0.0085 max mem: 33303 Epoch: [32] [ 730/4276] eta: 2:53:16 lr: 1.152003939726751e-05 loss: 0.0748 (0.0838) time: 2.9337 data: 0.0081 max mem: 33303 Epoch: [32] [ 740/4276] eta: 2:52:47 lr: 1.1516942288707117e-05 loss: 0.0731 (0.0836) time: 2.9386 data: 0.0078 max mem: 33303 Epoch: [32] [ 750/4276] eta: 2:52:18 lr: 1.1513845087603186e-05 loss: 0.0721 (0.0837) time: 2.9334 data: 0.0080 max mem: 33303 Epoch: [32] [ 760/4276] eta: 2:51:49 lr: 1.1510747793925272e-05 loss: 0.0721 (0.0837) time: 2.9343 data: 0.0078 max mem: 33303 Epoch: [32] [ 770/4276] eta: 2:51:20 lr: 
1.150765040764294e-05 loss: 0.0801 (0.0838) time: 2.9374 data: 0.0075 max mem: 33303 Epoch: [32] [ 780/4276] eta: 2:50:50 lr: 1.1504552928725723e-05 loss: 0.0797 (0.0836) time: 2.9343 data: 0.0075 max mem: 33303 Epoch: [32] [ 790/4276] eta: 2:50:21 lr: 1.1501455357143143e-05 loss: 0.0797 (0.0837) time: 2.9382 data: 0.0073 max mem: 33303 Epoch: [32] [ 800/4276] eta: 2:49:50 lr: 1.1498357692864683e-05 loss: 0.0792 (0.0836) time: 2.9126 data: 0.0074 max mem: 33303 Epoch: [32] [ 810/4276] eta: 2:49:18 lr: 1.1495259935859828e-05 loss: 0.0795 (0.0837) time: 2.8821 data: 0.0077 max mem: 33303 Epoch: [32] [ 820/4276] eta: 2:48:47 lr: 1.1492162086098037e-05 loss: 0.0780 (0.0837) time: 2.8836 data: 0.0076 max mem: 33303 Epoch: [32] [ 830/4276] eta: 2:48:16 lr: 1.1489064143548754e-05 loss: 0.0756 (0.0837) time: 2.8870 data: 0.0074 max mem: 33303 Epoch: [32] [ 840/4276] eta: 2:47:46 lr: 1.1485966108181385e-05 loss: 0.0756 (0.0839) time: 2.8997 data: 0.0076 max mem: 33303 Epoch: [32] [ 850/4276] eta: 2:47:15 lr: 1.1482867979965339e-05 loss: 0.0761 (0.0837) time: 2.8995 data: 0.0076 max mem: 33303 Epoch: [32] [ 860/4276] eta: 2:46:44 lr: 1.1479769758869994e-05 loss: 0.0774 (0.0839) time: 2.8878 data: 0.0077 max mem: 33303 Epoch: [32] [ 870/4276] eta: 2:46:13 lr: 1.1476671444864723e-05 loss: 0.0889 (0.0839) time: 2.8837 data: 0.0078 max mem: 33303 Epoch: [32] [ 880/4276] eta: 2:45:42 lr: 1.1473573037918852e-05 loss: 0.0866 (0.0840) time: 2.8848 data: 0.0078 max mem: 33303 Epoch: [32] [ 890/4276] eta: 2:45:12 lr: 1.1470474538001711e-05 loss: 0.0866 (0.0841) time: 2.8925 data: 0.0078 max mem: 33303 Epoch: [32] [ 900/4276] eta: 2:44:42 lr: 1.1467375945082608e-05 loss: 0.0862 (0.0842) time: 2.9109 data: 0.0076 max mem: 33303 Epoch: [32] [ 910/4276] eta: 2:44:14 lr: 1.146427725913083e-05 loss: 0.0899 (0.0843) time: 2.9354 data: 0.0078 max mem: 33303 Epoch: [32] [ 920/4276] eta: 2:43:45 lr: 1.146117848011563e-05 loss: 0.0940 (0.0844) time: 2.9431 data: 0.0083 max mem: 33303 Epoch: [32] 
[ 930/4276] eta: 2:43:16 lr: 1.145807960800626e-05 loss: 0.0905 (0.0844) time: 2.9426 data: 0.0083 max mem: 33303 Epoch: [32] [ 940/4276] eta: 2:42:47 lr: 1.1454980642771948e-05 loss: 0.0799 (0.0845) time: 2.9415 data: 0.0081 max mem: 33303 Epoch: [32] [ 950/4276] eta: 2:42:19 lr: 1.1451881584381906e-05 loss: 0.0874 (0.0846) time: 2.9392 data: 0.0082 max mem: 33303 Epoch: [32] [ 960/4276] eta: 2:41:50 lr: 1.144878243280531e-05 loss: 0.0874 (0.0845) time: 2.9394 data: 0.0085 max mem: 33303 Epoch: [32] [ 970/4276] eta: 2:41:21 lr: 1.1445683188011334e-05 loss: 0.0775 (0.0845) time: 2.9400 data: 0.0085 max mem: 33303 Epoch: [32] [ 980/4276] eta: 2:40:52 lr: 1.1442583849969134e-05 loss: 0.0802 (0.0845) time: 2.9372 data: 0.0082 max mem: 33303 Epoch: [32] [ 990/4276] eta: 2:40:23 lr: 1.1439484418647824e-05 loss: 0.0826 (0.0844) time: 2.9367 data: 0.0082 max mem: 33303 Epoch: [32] [1000/4276] eta: 2:39:54 lr: 1.1436384894016522e-05 loss: 0.0732 (0.0843) time: 2.9480 data: 0.0087 max mem: 33303 Epoch: [32] [1010/4276] eta: 2:39:25 lr: 1.1433285276044317e-05 loss: 0.0733 (0.0843) time: 2.9334 data: 0.0092 max mem: 33303 Epoch: [32] [1020/4276] eta: 2:38:54 lr: 1.1430185564700286e-05 loss: 0.0782 (0.0843) time: 2.9054 data: 0.0090 max mem: 33303 Epoch: [32] [1030/4276] eta: 2:38:24 lr: 1.1427085759953468e-05 loss: 0.0813 (0.0844) time: 2.9023 data: 0.0088 max mem: 33303 Epoch: [32] [1040/4276] eta: 2:37:54 lr: 1.1423985861772901e-05 loss: 0.0797 (0.0843) time: 2.9032 data: 0.0091 max mem: 33303 Epoch: [32] [1050/4276] eta: 2:37:24 lr: 1.1420885870127595e-05 loss: 0.0797 (0.0844) time: 2.8996 data: 0.0089 max mem: 33303 Epoch: [32] [1060/4276] eta: 2:36:54 lr: 1.141778578498655e-05 loss: 0.0878 (0.0846) time: 2.8977 data: 0.0087 max mem: 33303 Epoch: [32] [1070/4276] eta: 2:36:23 lr: 1.1414685606318726e-05 loss: 0.0926 (0.0847) time: 2.8945 data: 0.0082 max mem: 33303 Epoch: [32] [1080/4276] eta: 2:35:54 lr: 1.1411585334093083e-05 loss: 0.0913 (0.0848) time: 2.8996 data: 
0.0081 max mem: 33303 Epoch: [32] [1090/4276] eta: 2:35:24 lr: 1.140848496827855e-05 loss: 0.0876 (0.0849) time: 2.9180 data: 0.0087 max mem: 33303 Epoch: [32] [1100/4276] eta: 2:34:55 lr: 1.1405384508844049e-05 loss: 0.0949 (0.0850) time: 2.9235 data: 0.0089 max mem: 33303 Epoch: [32] [1110/4276] eta: 2:34:26 lr: 1.1402283955758463e-05 loss: 0.0949 (0.0851) time: 2.9340 data: 0.0085 max mem: 33303 Epoch: [32] [1120/4276] eta: 2:33:58 lr: 1.1399183308990671e-05 loss: 0.0870 (0.0851) time: 2.9500 data: 0.0080 max mem: 33303 Epoch: [32] [1130/4276] eta: 2:33:29 lr: 1.1396082568509529e-05 loss: 0.0814 (0.0851) time: 2.9501 data: 0.0077 max mem: 33303 Epoch: [32] [1140/4276] eta: 2:33:00 lr: 1.1392981734283872e-05 loss: 0.0814 (0.0851) time: 2.9500 data: 0.0077 max mem: 33303 Epoch: [32] [1150/4276] eta: 2:32:32 lr: 1.1389880806282508e-05 loss: 0.0785 (0.0851) time: 2.9512 data: 0.0077 max mem: 33303 Epoch: [32] [1160/4276] eta: 2:32:03 lr: 1.1386779784474236e-05 loss: 0.0813 (0.0851) time: 2.9519 data: 0.0078 max mem: 33303 Epoch: [32] [1170/4276] eta: 2:31:33 lr: 1.138367866882783e-05 loss: 0.0821 (0.0852) time: 2.9334 data: 0.0080 max mem: 33303 Epoch: [32] [1180/4276] eta: 2:31:04 lr: 1.1380577459312055e-05 loss: 0.0796 (0.0852) time: 2.9102 data: 0.0084 max mem: 33303 Epoch: [32] [1190/4276] eta: 2:30:35 lr: 1.137747615589563e-05 loss: 0.0795 (0.0851) time: 2.9305 data: 0.0085 max mem: 33303 Epoch: [32] [1200/4276] eta: 2:30:06 lr: 1.1374374758547277e-05 loss: 0.0795 (0.0851) time: 2.9541 data: 0.0084 max mem: 33303 Epoch: [32] [1210/4276] eta: 2:29:38 lr: 1.1371273267235693e-05 loss: 0.0708 (0.0850) time: 2.9529 data: 0.0082 max mem: 33303 Epoch: [32] [1220/4276] eta: 2:29:09 lr: 1.1368171681929561e-05 loss: 0.0819 (0.0851) time: 2.9530 data: 0.0082 max mem: 33303 Epoch: [32] [1230/4276] eta: 2:28:40 lr: 1.1365070002597522e-05 loss: 0.0928 (0.0851) time: 2.9512 data: 0.0085 max mem: 33303 Epoch: [32] [1240/4276] eta: 2:28:11 lr: 1.1361968229208218e-05 loss: 
0.0929 (0.0852) time: 2.9448 data: 0.0084 max mem: 33303 Epoch: [32] [1250/4276] eta: 2:27:43 lr: 1.1358866361730264e-05 loss: 0.0880 (0.0852) time: 2.9464 data: 0.0091 max mem: 33303 Epoch: [32] [1260/4276] eta: 2:27:14 lr: 1.1355764400132266e-05 loss: 0.0712 (0.0851) time: 2.9506 data: 0.0089 max mem: 33303 Epoch: [32] [1270/4276] eta: 2:26:45 lr: 1.1352662344382781e-05 loss: 0.0699 (0.0850) time: 2.9494 data: 0.0079 max mem: 33303 Epoch: [32] [1280/4276] eta: 2:26:16 lr: 1.1349560194450378e-05 loss: 0.0806 (0.0850) time: 2.9525 data: 0.0081 max mem: 33303 Epoch: [32] [1290/4276] eta: 2:25:48 lr: 1.1346457950303593e-05 loss: 0.0855 (0.0851) time: 2.9541 data: 0.0080 max mem: 33303 Epoch: [32] [1300/4276] eta: 2:25:19 lr: 1.1343355611910932e-05 loss: 0.0835 (0.0851) time: 2.9516 data: 0.0076 max mem: 33303 Epoch: [32] [1310/4276] eta: 2:24:50 lr: 1.1340253179240896e-05 loss: 0.0751 (0.0850) time: 2.9527 data: 0.0074 max mem: 33303 Epoch: [32] [1320/4276] eta: 2:24:21 lr: 1.133715065226196e-05 loss: 0.0751 (0.0851) time: 2.9475 data: 0.0074 max mem: 33303 Epoch: [32] [1330/4276] eta: 2:23:52 lr: 1.1334048030942586e-05 loss: 0.0837 (0.0850) time: 2.9411 data: 0.0078 max mem: 33303 Epoch: [32] [1340/4276] eta: 2:23:23 lr: 1.1330945315251196e-05 loss: 0.0713 (0.0850) time: 2.9412 data: 0.0080 max mem: 33303 Epoch: [32] [1350/4276] eta: 2:22:54 lr: 1.1327842505156211e-05 loss: 0.0721 (0.0849) time: 2.9484 data: 0.0076 max mem: 33303 Epoch: [32] [1360/4276] eta: 2:22:25 lr: 1.1324739600626027e-05 loss: 0.0821 (0.0849) time: 2.9540 data: 0.0074 max mem: 33303 Epoch: [32] [1370/4276] eta: 2:21:56 lr: 1.1321636601629022e-05 loss: 0.0865 (0.0849) time: 2.9519 data: 0.0077 max mem: 33303 Epoch: [32] [1380/4276] eta: 2:21:28 lr: 1.131853350813354e-05 loss: 0.0817 (0.0849) time: 2.9532 data: 0.0081 max mem: 33303 Epoch: [32] [1390/4276] eta: 2:20:58 lr: 1.1315430320107924e-05 loss: 0.0834 (0.0849) time: 2.9422 data: 0.0084 max mem: 33303 Epoch: [32] [1400/4276] eta: 2:20:29 
lr: 1.1312327037520482e-05 loss: 0.0829 (0.0849) time: 2.9333 data: 0.0082 max mem: 33303 Epoch: [32] [1410/4276] eta: 2:20:00 lr: 1.1309223660339518e-05 loss: 0.0780 (0.0848) time: 2.9438 data: 0.0075 max mem: 33303 Epoch: [32] [1420/4276] eta: 2:19:31 lr: 1.130612018853329e-05 loss: 0.0723 (0.0848) time: 2.9520 data: 0.0072 max mem: 33303 Epoch: [32] [1430/4276] eta: 2:19:02 lr: 1.1303016622070057e-05 loss: 0.0765 (0.0849) time: 2.9458 data: 0.0070 max mem: 33303 Epoch: [32] [1440/4276] eta: 2:18:33 lr: 1.1299912960918055e-05 loss: 0.0845 (0.0849) time: 2.9342 data: 0.0074 max mem: 33303 Epoch: [32] [1450/4276] eta: 2:18:03 lr: 1.1296809205045499e-05 loss: 0.0737 (0.0848) time: 2.9222 data: 0.0075 max mem: 33303 Epoch: [32] [1460/4276] eta: 2:17:33 lr: 1.1293705354420568e-05 loss: 0.0737 (0.0848) time: 2.9064 data: 0.0078 max mem: 33303 Epoch: [32] [1470/4276] eta: 2:17:03 lr: 1.1290601409011444e-05 loss: 0.0864 (0.0849) time: 2.8962 data: 0.0079 max mem: 33303 Epoch: [32] [1480/4276] eta: 2:16:33 lr: 1.1287497368786272e-05 loss: 0.0864 (0.0849) time: 2.8984 data: 0.0075 max mem: 33303 Epoch: [32] [1490/4276] eta: 2:16:03 lr: 1.1284393233713193e-05 loss: 0.0845 (0.0848) time: 2.9017 data: 0.0075 max mem: 33303 Epoch: [32] [1500/4276] eta: 2:15:34 lr: 1.1281289003760303e-05 loss: 0.0796 (0.0848) time: 2.9013 data: 0.0078 max mem: 33303 Epoch: [32] [1510/4276] eta: 2:15:04 lr: 1.1278184678895699e-05 loss: 0.0752 (0.0847) time: 2.9072 data: 0.0076 max mem: 33303 Epoch: [32] [1520/4276] eta: 2:14:35 lr: 1.1275080259087448e-05 loss: 0.0692 (0.0847) time: 2.9263 data: 0.0073 max mem: 33303 Epoch: [32] [1530/4276] eta: 2:14:06 lr: 1.1271975744303605e-05 loss: 0.0669 (0.0846) time: 2.9328 data: 0.0079 max mem: 33303 Epoch: [32] [1540/4276] eta: 2:13:36 lr: 1.1268871134512188e-05 loss: 0.0769 (0.0846) time: 2.9303 data: 0.0083 max mem: 33303 Epoch: [32] [1550/4276] eta: 2:13:07 lr: 1.126576642968121e-05 loss: 0.0782 (0.0846) time: 2.9415 data: 0.0078 max mem: 33303 Epoch: 
[32] [1560/4276] eta: 2:12:38 lr: 1.1262661629778662e-05 loss: 0.0725 (0.0845) time: 2.9497 data: 0.0073 max mem: 33303 Epoch: [32] [1570/4276] eta: 2:12:09 lr: 1.12595567347725e-05 loss: 0.0725 (0.0845) time: 2.9496 data: 0.0074 max mem: 33303 Epoch: [32] [1580/4276] eta: 2:11:40 lr: 1.1256451744630676e-05 loss: 0.0684 (0.0844) time: 2.9484 data: 0.0072 max mem: 33303 Epoch: [32] [1590/4276] eta: 2:11:11 lr: 1.1253346659321114e-05 loss: 0.0699 (0.0844) time: 2.9488 data: 0.0070 max mem: 33303 Epoch: [32] [1600/4276] eta: 2:10:42 lr: 1.1250241478811725e-05 loss: 0.0879 (0.0844) time: 2.9482 data: 0.0070 max mem: 33303 Epoch: [32] [1610/4276] eta: 2:10:13 lr: 1.124713620307038e-05 loss: 0.0873 (0.0844) time: 2.9310 data: 0.0074 max mem: 33303 Epoch: [32] [1620/4276] eta: 2:09:44 lr: 1.1244030832064947e-05 loss: 0.0787 (0.0844) time: 2.9307 data: 0.0077 max mem: 33303 Epoch: [32] [1630/4276] eta: 2:09:15 lr: 1.1240925365763271e-05 loss: 0.0917 (0.0845) time: 2.9510 data: 0.0075 max mem: 33303 Epoch: [32] [1640/4276] eta: 2:08:46 lr: 1.123781980413318e-05 loss: 0.0876 (0.0844) time: 2.9532 data: 0.0072 max mem: 33303 Epoch: [32] [1650/4276] eta: 2:08:17 lr: 1.1234714147142462e-05 loss: 0.0676 (0.0844) time: 2.9522 data: 0.0070 max mem: 33303 Epoch: [32] [1660/4276] eta: 2:07:48 lr: 1.1231608394758902e-05 loss: 0.0802 (0.0844) time: 2.9513 data: 0.0070 max mem: 33303 Epoch: [32] [1670/4276] eta: 2:07:19 lr: 1.1228502546950258e-05 loss: 0.0831 (0.0845) time: 2.9483 data: 0.0070 max mem: 33303 Epoch: [32] [1680/4276] eta: 2:06:49 lr: 1.1225396603684276e-05 loss: 0.0738 (0.0844) time: 2.9338 data: 0.0073 max mem: 33303 Epoch: [32] [1690/4276] eta: 2:06:19 lr: 1.1222290564928663e-05 loss: 0.0734 (0.0844) time: 2.9086 data: 0.0078 max mem: 33303 Epoch: [32] [1700/4276] eta: 2:05:50 lr: 1.1219184430651124e-05 loss: 0.0801 (0.0844) time: 2.9041 data: 0.0081 max mem: 33303 Epoch: [32] [1710/4276] eta: 2:05:21 lr: 1.1216078200819329e-05 loss: 0.0785 (0.0843) time: 2.9340 data: 
0.0086 max mem: 33303 Epoch: [32] [1720/4276] eta: 2:04:52 lr: 1.1212971875400941e-05 loss: 0.0730 (0.0843) time: 2.9466 data: 0.0087 max mem: 33303 Epoch: [32] [1730/4276] eta: 2:04:23 lr: 1.1209865454363586e-05 loss: 0.0730 (0.0843) time: 2.9430 data: 0.0081 max mem: 33303 Epoch: [32] [1740/4276] eta: 2:03:54 lr: 1.1206758937674879e-05 loss: 0.0698 (0.0842) time: 2.9477 data: 0.0075 max mem: 33303 Epoch: [32] [1750/4276] eta: 2:03:24 lr: 1.1203652325302413e-05 loss: 0.0698 (0.0842) time: 2.9464 data: 0.0073 max mem: 33303 Epoch: [32] [1760/4276] eta: 2:02:55 lr: 1.1200545617213766e-05 loss: 0.0695 (0.0841) time: 2.9477 data: 0.0073 max mem: 33303 Epoch: [32] [1770/4276] eta: 2:02:26 lr: 1.1197438813376477e-05 loss: 0.0744 (0.0841) time: 2.9366 data: 0.0082 max mem: 33303 Epoch: [32] [1780/4276] eta: 2:01:57 lr: 1.1194331913758082e-05 loss: 0.0775 (0.0841) time: 2.9392 data: 0.0085 max mem: 33303 Epoch: [32] [1790/4276] eta: 2:01:28 lr: 1.1191224918326085e-05 loss: 0.0775 (0.0841) time: 2.9516 data: 0.0078 max mem: 33303 Epoch: [32] [1800/4276] eta: 2:00:59 lr: 1.1188117827047983e-05 loss: 0.0752 (0.0840) time: 2.9493 data: 0.0077 max mem: 33303 Epoch: [32] [1810/4276] eta: 2:00:30 lr: 1.1185010639891231e-05 loss: 0.0842 (0.0841) time: 2.9480 data: 0.0079 max mem: 33303 Epoch: [32] [1820/4276] eta: 2:00:01 lr: 1.1181903356823277e-05 loss: 0.0851 (0.0840) time: 2.9489 data: 0.0080 max mem: 33303 Epoch: [32] [1830/4276] eta: 1:59:31 lr: 1.1178795977811548e-05 loss: 0.0772 (0.0840) time: 2.9483 data: 0.0079 max mem: 33303 Epoch: [32] [1840/4276] eta: 1:59:02 lr: 1.117568850282345e-05 loss: 0.0722 (0.0840) time: 2.9495 data: 0.0079 max mem: 33303 Epoch: [32] [1850/4276] eta: 1:58:34 lr: 1.1172580931826353e-05 loss: 0.0693 (0.0840) time: 2.9746 data: 0.0076 max mem: 33303 Epoch: [32] [1860/4276] eta: 1:58:06 lr: 1.1169473264787627e-05 loss: 0.0711 (0.0840) time: 2.9998 data: 0.0076 max mem: 33303 Epoch: [32] [1870/4276] eta: 1:57:37 lr: 1.1166365501674612e-05 loss: 
0.0719 (0.0839) time: 2.9817 data: 0.0080 max mem: 33303 Epoch: [32] [1880/4276] eta: 1:57:07 lr: 1.1163257642454618e-05 loss: 0.0725 (0.0839) time: 2.9511 data: 0.0074 max mem: 33303 Epoch: [32] [1890/4276] eta: 1:56:38 lr: 1.1160149687094949e-05 loss: 0.0754 (0.0839) time: 2.9313 data: 0.0074 max mem: 33303 Epoch: [32] [1900/4276] eta: 1:56:09 lr: 1.1157041635562877e-05 loss: 0.0738 (0.0838) time: 2.9421 data: 0.0077 max mem: 33303 Epoch: [32] [1910/4276] eta: 1:55:40 lr: 1.1153933487825665e-05 loss: 0.0738 (0.0838) time: 2.9618 data: 0.0074 max mem: 33303 Epoch: [32] [1920/4276] eta: 1:55:11 lr: 1.1150825243850531e-05 loss: 0.0731 (0.0838) time: 2.9611 data: 0.0072 max mem: 33303 Epoch: [32] [1930/4276] eta: 1:54:42 lr: 1.1147716903604696e-05 loss: 0.0754 (0.0838) time: 2.9595 data: 0.0071 max mem: 33303 Epoch: [32] [1940/4276] eta: 1:54:13 lr: 1.114460846705535e-05 loss: 0.0754 (0.0838) time: 2.9646 data: 0.0074 max mem: 33303 Epoch: [32] [1950/4276] eta: 1:53:44 lr: 1.1141499934169665e-05 loss: 0.0743 (0.0838) time: 2.9599 data: 0.0075 max mem: 33303 Epoch: [32] [1960/4276] eta: 1:53:15 lr: 1.113839130491478e-05 loss: 0.0702 (0.0837) time: 2.9468 data: 0.0074 max mem: 33303 Epoch: [32] [1970/4276] eta: 1:52:45 lr: 1.1135282579257827e-05 loss: 0.0667 (0.0837) time: 2.9479 data: 0.0077 max mem: 33303 Epoch: [32] [1980/4276] eta: 1:52:17 lr: 1.113217375716591e-05 loss: 0.0667 (0.0836) time: 2.9596 data: 0.0082 max mem: 33303 Epoch: [32] [1990/4276] eta: 1:51:47 lr: 1.1129064838606118e-05 loss: 0.0697 (0.0836) time: 2.9584 data: 0.0084 max mem: 33303 Epoch: [32] [2000/4276] eta: 1:51:18 lr: 1.1125955823545503e-05 loss: 0.0755 (0.0836) time: 2.9478 data: 0.0080 max mem: 33303 Epoch: [32] [2010/4276] eta: 1:50:49 lr: 1.112284671195111e-05 loss: 0.0786 (0.0836) time: 2.9595 data: 0.0083 max mem: 33303 Epoch: [32] [2020/4276] eta: 1:50:20 lr: 1.1119737503789958e-05 loss: 0.0820 (0.0836) time: 2.9622 data: 0.0089 max mem: 33303 Epoch: [32] [2030/4276] eta: 1:49:51 lr: 
1.111662819902905e-05 loss: 0.0708 (0.0835) time: 2.9598 data: 0.0088 max mem: 33303 Epoch: [32] [2040/4276] eta: 1:49:22 lr: 1.1113518797635352e-05 loss: 0.0691 (0.0835) time: 2.9624 data: 0.0084 max mem: 33303 Epoch: [32] [2050/4276] eta: 1:48:53 lr: 1.1110409299575822e-05 loss: 0.0771 (0.0836) time: 2.9610 data: 0.0088 max mem: 33303 Epoch: [32] [2060/4276] eta: 1:48:24 lr: 1.1107299704817395e-05 loss: 0.0810 (0.0835) time: 2.9590 data: 0.0095 max mem: 33303 Epoch: [32] [2070/4276] eta: 1:47:55 lr: 1.1104190013326986e-05 loss: 0.0781 (0.0835) time: 2.9519 data: 0.0095 max mem: 33303 Epoch: [32] [2080/4276] eta: 1:47:25 lr: 1.1101080225071477e-05 loss: 0.0815 (0.0835) time: 2.9487 data: 0.0089 max mem: 33303 Epoch: [32] [2090/4276] eta: 1:46:56 lr: 1.1097970340017737e-05 loss: 0.0796 (0.0835) time: 2.9483 data: 0.0087 max mem: 33303 Epoch: [32] [2100/4276] eta: 1:46:27 lr: 1.1094860358132613e-05 loss: 0.0693 (0.0835) time: 2.9479 data: 0.0092 max mem: 33303 Epoch: [32] [2110/4276] eta: 1:45:58 lr: 1.1091750279382938e-05 loss: 0.0781 (0.0834) time: 2.9472 data: 0.0091 max mem: 33303 Epoch: [32] [2120/4276] eta: 1:45:29 lr: 1.1088640103735502e-05 loss: 0.0640 (0.0833) time: 2.9471 data: 0.0088 max mem: 33303 Epoch: [32] [2130/4276] eta: 1:44:59 lr: 1.108552983115709e-05 loss: 0.0624 (0.0833) time: 2.9395 data: 0.0087 max mem: 33303 Epoch: [32] [2140/4276] eta: 1:44:30 lr: 1.1082419461614465e-05 loss: 0.0794 (0.0833) time: 2.9419 data: 0.0091 max mem: 33303 Epoch: [32] [2150/4276] eta: 1:44:01 lr: 1.107930899507437e-05 loss: 0.0794 (0.0832) time: 2.9518 data: 0.0091 max mem: 33303 Epoch: [32] [2160/4276] eta: 1:43:32 lr: 1.1076198431503507e-05 loss: 0.0716 (0.0832) time: 2.9509 data: 0.0087 max mem: 33303 Epoch: [32] [2170/4276] eta: 1:43:03 lr: 1.1073087770868577e-05 loss: 0.0772 (0.0832) time: 2.9643 data: 0.0088 max mem: 33303 Epoch: [32] [2180/4276] eta: 1:42:33 lr: 1.1069977013136257e-05 loss: 0.0792 (0.0832) time: 2.9678 data: 0.0087 max mem: 33303 Epoch: [32] 
[2190/4276] eta: 1:42:04 lr: 1.1066866158273188e-05 loss: 0.0697 (0.0831) time: 2.9582 data: 0.0088 max mem: 33303 Epoch: [32] [2200/4276] eta: 1:41:35 lr: 1.1063755206246003e-05 loss: 0.0820 (0.0832) time: 2.9579 data: 0.0087 max mem: 33303 Epoch: [32] [2210/4276] eta: 1:41:06 lr: 1.1060644157021308e-05 loss: 0.0826 (0.0832) time: 2.9584 data: 0.0087 max mem: 33303 Epoch: [32] [2220/4276] eta: 1:40:37 lr: 1.1057533010565695e-05 loss: 0.0811 (0.0832) time: 2.9541 data: 0.0091 max mem: 33303 Epoch: [32] [2230/4276] eta: 1:40:07 lr: 1.1054421766845715e-05 loss: 0.0830 (0.0832) time: 2.9321 data: 0.0089 max mem: 33303 Epoch: [32] [2240/4276] eta: 1:39:38 lr: 1.1051310425827912e-05 loss: 0.0645 (0.0831) time: 2.9152 data: 0.0091 max mem: 33303 Epoch: [32] [2250/4276] eta: 1:39:08 lr: 1.1048198987478808e-05 loss: 0.0676 (0.0831) time: 2.9182 data: 0.0093 max mem: 33303 Epoch: [32] [2260/4276] eta: 1:38:39 lr: 1.1045087451764904e-05 loss: 0.0829 (0.0831) time: 2.9435 data: 0.0084 max mem: 33303 Epoch: [32] [2270/4276] eta: 1:38:10 lr: 1.1041975818652665e-05 loss: 0.0839 (0.0831) time: 2.9623 data: 0.0079 max mem: 33303 Epoch: [32] [2280/4276] eta: 1:37:41 lr: 1.1038864088108547e-05 loss: 0.0749 (0.0831) time: 2.9582 data: 0.0080 max mem: 33303 Epoch: [32] [2290/4276] eta: 1:37:11 lr: 1.1035752260098982e-05 loss: 0.0778 (0.0831) time: 2.9579 data: 0.0080 max mem: 33303 Epoch: [32] [2300/4276] eta: 1:36:42 lr: 1.1032640334590384e-05 loss: 0.0816 (0.0831) time: 2.9594 data: 0.0080 max mem: 33303 Epoch: [32] [2310/4276] eta: 1:36:13 lr: 1.102952831154913e-05 loss: 0.0906 (0.0832) time: 2.9424 data: 0.0085 max mem: 33303 Epoch: [32] [2320/4276] eta: 1:35:44 lr: 1.1026416190941586e-05 loss: 0.0837 (0.0831) time: 2.9400 data: 0.0087 max mem: 33303 Epoch: [32] [2330/4276] eta: 1:35:14 lr: 1.1023303972734097e-05 loss: 0.0829 (0.0832) time: 2.9565 data: 0.0079 max mem: 33303 Epoch: [32] [2340/4276] eta: 1:34:45 lr: 1.1020191656892989e-05 loss: 0.0829 (0.0832) time: 2.9592 data: 
0.0077 max mem: 33303 Epoch: [32] [2350/4276] eta: 1:34:16 lr: 1.1017079243384549e-05 loss: 0.0818 (0.0832) time: 2.9602 data: 0.0079 max mem: 33303 Epoch: [32] [2360/4276] eta: 1:33:47 lr: 1.1013966732175055e-05 loss: 0.0818 (0.0832) time: 2.9629 data: 0.0080 max mem: 33303 Epoch: [32] [2370/4276] eta: 1:33:18 lr: 1.1010854123230762e-05 loss: 0.0781 (0.0833) time: 2.9605 data: 0.0080 max mem: 33303 Epoch: [32] [2380/4276] eta: 1:32:48 lr: 1.1007741416517908e-05 loss: 0.0857 (0.0833) time: 2.9410 data: 0.0082 max mem: 33303 Epoch: [32] [2390/4276] eta: 1:32:19 lr: 1.100462861200269e-05 loss: 0.0826 (0.0833) time: 2.9354 data: 0.0084 max mem: 33303 Epoch: [32] [2400/4276] eta: 1:31:50 lr: 1.10015157096513e-05 loss: 0.0838 (0.0833) time: 2.9478 data: 0.0084 max mem: 33303 Epoch: [32] [2410/4276] eta: 1:31:20 lr: 1.0998402709429902e-05 loss: 0.0870 (0.0833) time: 2.9471 data: 0.0080 max mem: 33303 Epoch: [32] [2420/4276] eta: 1:30:51 lr: 1.0995289611304645e-05 loss: 0.0769 (0.0833) time: 2.9429 data: 0.0075 max mem: 33303 Epoch: [32] [2430/4276] eta: 1:30:22 lr: 1.0992176415241633e-05 loss: 0.0855 (0.0833) time: 2.9430 data: 0.0077 max mem: 33303 Epoch: [32] [2440/4276] eta: 1:29:52 lr: 1.0989063121206972e-05 loss: 0.0855 (0.0833) time: 2.9450 data: 0.0077 max mem: 33303 Epoch: [32] [2450/4276] eta: 1:29:23 lr: 1.0985949729166739e-05 loss: 0.0746 (0.0833) time: 2.9468 data: 0.0075 max mem: 33303 Epoch: [32] [2460/4276] eta: 1:28:54 lr: 1.0982836239086989e-05 loss: 0.0783 (0.0833) time: 2.9468 data: 0.0084 max mem: 33303 Epoch: [32] [2470/4276] eta: 1:28:25 lr: 1.0979722650933739e-05 loss: 0.0772 (0.0833) time: 2.9558 data: 0.0088 max mem: 33303 Epoch: [32] [2480/4276] eta: 1:27:55 lr: 1.0976608964673004e-05 loss: 0.0772 (0.0833) time: 2.9605 data: 0.0080 max mem: 33303 Epoch: [32] [2490/4276] eta: 1:27:26 lr: 1.0973495180270773e-05 loss: 0.0833 (0.0833) time: 2.9563 data: 0.0078 max mem: 33303 Epoch: [32] [2500/4276] eta: 1:26:57 lr: 1.0970381297693e-05 loss: 0.0842 
(0.0833) time: 2.9571 data: 0.0080 max mem: 33303 Epoch: [32] [2510/4276] eta: 1:26:28 lr: 1.096726731690563e-05 loss: 0.0842 (0.0833) time: 2.9579 data: 0.0082 max mem: 33303 Epoch: [32] [2520/4276] eta: 1:25:58 lr: 1.0964153237874576e-05 loss: 0.0814 (0.0833) time: 2.9572 data: 0.0080 max mem: 33303 Epoch: [32] [2530/4276] eta: 1:25:29 lr: 1.0961039060565741e-05 loss: 0.0694 (0.0832) time: 2.9607 data: 0.0077 max mem: 33303 Epoch: [32] [2540/4276] eta: 1:25:00 lr: 1.0957924784944986e-05 loss: 0.0668 (0.0832) time: 2.9593 data: 0.0074 max mem: 33303 Epoch: [32] [2550/4276] eta: 1:35:16 lr: 1.0954810410978166e-05 loss: 0.0693 (0.0832) time: 50.6122 data: 47.6538 max mem: 33303 Epoch: [32] [2560/4276] eta: 1:34:40 lr: 1.0951695938631109e-05 loss: 0.0693 (0.0831) time: 50.6048 data: 47.6539 max mem: 33303 Epoch: [32] [2570/4276] eta: 1:34:05 lr: 1.0948581367869621e-05 loss: 0.0698 (0.0831) time: 2.9446 data: 0.0074 max mem: 33303 Epoch: [32] [2580/4276] eta: 1:33:29 lr: 1.0945466698659476e-05 loss: 0.0717 (0.0831) time: 2.9578 data: 0.0076 max mem: 33303 Epoch: [32] [2590/4276] eta: 1:32:54 lr: 1.0942351930966434e-05 loss: 0.0691 (0.0831) time: 2.9436 data: 0.0076 max mem: 33303 Epoch: [32] [2600/4276] eta: 1:32:18 lr: 1.0939237064756236e-05 loss: 0.0663 (0.0831) time: 2.9291 data: 0.0077 max mem: 33303 Epoch: [32] [2610/4276] eta: 1:31:43 lr: 1.0936122099994598e-05 loss: 0.0753 (0.0831) time: 2.9313 data: 0.0079 max mem: 33303 Epoch: [32] [2620/4276] eta: 1:31:07 lr: 1.0933007036647198e-05 loss: 0.0797 (0.0831) time: 2.9333 data: 0.0078 max mem: 33303 Epoch: [32] [2630/4276] eta: 1:30:32 lr: 1.092989187467971e-05 loss: 0.0720 (0.0831) time: 2.9390 data: 0.0080 max mem: 33303 Epoch: [32] [2640/4276] eta: 1:29:57 lr: 1.092677661405778e-05 loss: 0.0720 (0.0831) time: 2.9424 data: 0.0076 max mem: 33303 Epoch: [32] [2650/4276] eta: 1:29:22 lr: 1.0923661254747034e-05 loss: 0.0789 (0.0831) time: 2.9484 data: 0.0071 max mem: 33303 Epoch: [32] [2660/4276] eta: 1:28:47 lr: 
1.0920545796713059e-05 loss: 0.0891 (0.0831) time: 2.9496 data: 0.0071 max mem: 33303 Epoch: [32] [2670/4276] eta: 1:28:12 lr: 1.0917430239921439e-05 loss: 0.0829 (0.0832) time: 2.9478 data: 0.0077 max mem: 33303 Epoch: [32] [2680/4276] eta: 1:27:37 lr: 1.0914314584337725e-05 loss: 0.0831 (0.0832) time: 2.9535 data: 0.0078 max mem: 33303 Epoch: [32] [2690/4276] eta: 1:27:02 lr: 1.0911198829927453e-05 loss: 0.0722 (0.0831) time: 2.9547 data: 0.0078 max mem: 33303 Epoch: [32] [2700/4276] eta: 1:26:27 lr: 1.0908082976656121e-05 loss: 0.0722 (0.0831) time: 2.9412 data: 0.0081 max mem: 33303 Epoch: [32] [2710/4276] eta: 1:25:51 lr: 1.0904967024489216e-05 loss: 0.0695 (0.0831) time: 2.9123 data: 0.0084 max mem: 33303 Epoch: [32] [2720/4276] eta: 1:25:16 lr: 1.0901850973392202e-05 loss: 0.0675 (0.0830) time: 2.9142 data: 0.0086 max mem: 33303 Epoch: [32] [2730/4276] eta: 1:24:42 lr: 1.0898734823330522e-05 loss: 0.0711 (0.0830) time: 2.9404 data: 0.0086 max mem: 33303 Epoch: [32] [2740/4276] eta: 1:24:07 lr: 1.0895618574269579e-05 loss: 0.0773 (0.0830) time: 2.9517 data: 0.0083 max mem: 33303 Epoch: [32] [2750/4276] eta: 1:23:32 lr: 1.089250222617477e-05 loss: 0.0773 (0.0830) time: 2.9804 data: 0.0078 max mem: 33303 Epoch: [32] [2760/4276] eta: 1:22:58 lr: 1.0889385779011463e-05 loss: 0.0786 (0.0830) time: 2.9919 data: 0.0076 max mem: 33303 Epoch: [32] [2770/4276] eta: 1:22:23 lr: 1.0886269232745014e-05 loss: 0.0787 (0.0830) time: 2.9631 data: 0.0077 max mem: 33303 Epoch: [32] [2780/4276] eta: 1:21:49 lr: 1.0883152587340731e-05 loss: 0.0799 (0.0830) time: 2.9498 data: 0.0079 max mem: 33303 Epoch: [32] [2790/4276] eta: 1:21:14 lr: 1.088003584276392e-05 loss: 0.0839 (0.0830) time: 2.9537 data: 0.0074 max mem: 33303 Epoch: [32] [2800/4276] eta: 1:20:40 lr: 1.0876918998979862e-05 loss: 0.0775 (0.0830) time: 2.9468 data: 0.0074 max mem: 33303 Epoch: [32] [2810/4276] eta: 1:20:05 lr: 1.08738020559538e-05 loss: 0.0637 (0.0829) time: 2.9407 data: 0.0080 max mem: 33303 Epoch: [32] 
[2820/4276] eta: 1:19:31 lr: 1.0870685013650967e-05 loss: 0.0606 (0.0829) time: 2.9439 data: 0.0080 max mem: 33303 Epoch: [32] [2830/4276] eta: 1:18:56 lr: 1.0867567872036574e-05 loss: 0.0812 (0.0829) time: 2.9676 data: 0.0076 max mem: 33303 Epoch: [32] [2840/4276] eta: 1:18:22 lr: 1.0864450631075806e-05 loss: 0.0859 (0.0829) time: 2.9638 data: 0.0073 max mem: 33303 Epoch: [32] [2850/4276] eta: 1:17:48 lr: 1.0861333290733814e-05 loss: 0.0902 (0.0829) time: 2.9619 data: 0.0073 max mem: 33303 Epoch: [32] [2860/4276] eta: 1:17:13 lr: 1.085821585097574e-05 loss: 0.0761 (0.0829) time: 2.9625 data: 0.0073 max mem: 33303 Epoch: [32] [2870/4276] eta: 1:16:39 lr: 1.0855098311766694e-05 loss: 0.0761 (0.0829) time: 2.9686 data: 0.0077 max mem: 33303 Epoch: [32] [2880/4276] eta: 1:16:05 lr: 1.0851980673071776e-05 loss: 0.0730 (0.0828) time: 2.9921 data: 0.0077 max mem: 33303 Epoch: [32] [2890/4276] eta: 1:15:31 lr: 1.0848862934856039e-05 loss: 0.0730 (0.0829) time: 2.9726 data: 0.0076 max mem: 33303 Epoch: [32] [2900/4276] eta: 1:14:57 lr: 1.0845745097084531e-05 loss: 0.0727 (0.0828) time: 2.9435 data: 0.0078 max mem: 33303 Epoch: [32] [2910/4276] eta: 1:14:22 lr: 1.0842627159722274e-05 loss: 0.0806 (0.0829) time: 2.9407 data: 0.0080 max mem: 33303 Epoch: [32] [2920/4276] eta: 1:13:48 lr: 1.0839509122734267e-05 loss: 0.0860 (0.0829) time: 2.9454 data: 0.0080 max mem: 33303 Epoch: [32] [2930/4276] eta: 1:13:14 lr: 1.0836390986085474e-05 loss: 0.0789 (0.0828) time: 2.9554 data: 0.0079 max mem: 33303 Epoch: [32] [2940/4276] eta: 1:12:40 lr: 1.0833272749740849e-05 loss: 0.0751 (0.0829) time: 2.9572 data: 0.0083 max mem: 33303 Epoch: [32] [2950/4276] eta: 1:12:06 lr: 1.0830154413665315e-05 loss: 0.0798 (0.0829) time: 2.9468 data: 0.0085 max mem: 33303 Epoch: [32] [2960/4276] eta: 1:11:32 lr: 1.0827035977823783e-05 loss: 0.0769 (0.0828) time: 2.9397 data: 0.0085 max mem: 33303 Epoch: [32] [2970/4276] eta: 1:10:58 lr: 1.0823917442181119e-05 loss: 0.0757 (0.0828) time: 2.9394 data: 
0.0086 max mem: 33303 Epoch: [32] [2980/4276] eta: 1:10:24 lr: 1.0820798806702183e-05 loss: 0.0757 (0.0828) time: 2.9265 data: 0.0084 max mem: 33303 Epoch: [32] [2990/4276] eta: 1:09:50 lr: 1.0817680071351808e-05 loss: 0.0739 (0.0828) time: 2.9000 data: 0.0084 max mem: 33303 Epoch: [32] [3000/4276] eta: 1:09:16 lr: 1.0814561236094803e-05 loss: 0.0740 (0.0828) time: 2.9027 data: 0.0091 max mem: 33303 Epoch: [32] [3010/4276] eta: 1:08:42 lr: 1.0811442300895945e-05 loss: 0.0766 (0.0828) time: 2.9359 data: 0.0089 max mem: 33303 Epoch: [32] [3020/4276] eta: 1:08:08 lr: 1.0808323265719998e-05 loss: 0.0746 (0.0828) time: 2.9808 data: 0.0082 max mem: 33303 Epoch: [32] [3030/4276] eta: 1:07:34 lr: 1.08052041305317e-05 loss: 0.0775 (0.0828) time: 2.9968 data: 0.0083 max mem: 33303 Epoch: [32] [3040/4276] eta: 1:07:01 lr: 1.0802084895295766e-05 loss: 0.0837 (0.0828) time: 2.9682 data: 0.0083 max mem: 33303 Epoch: [32] [3050/4276] eta: 1:06:27 lr: 1.0798965559976875e-05 loss: 0.0820 (0.0828) time: 2.9441 data: 0.0080 max mem: 33303 Epoch: [32] [3060/4276] eta: 1:05:53 lr: 1.0795846124539698e-05 loss: 0.0664 (0.0827) time: 2.9452 data: 0.0080 max mem: 33303 Epoch: [32] [3070/4276] eta: 1:05:19 lr: 1.0792726588948876e-05 loss: 0.0696 (0.0827) time: 2.9468 data: 0.0083 max mem: 33303 Epoch: [32] [3080/4276] eta: 1:04:46 lr: 1.0789606953169034e-05 loss: 0.0717 (0.0827) time: 2.9606 data: 0.0081 max mem: 33303 Epoch: [32] [3090/4276] eta: 1:04:12 lr: 1.0786487217164752e-05 loss: 0.0706 (0.0827) time: 2.9801 data: 0.0080 max mem: 33303 Epoch: [32] [3100/4276] eta: 1:03:39 lr: 1.0783367380900605e-05 loss: 0.0705 (0.0826) time: 2.9680 data: 0.0081 max mem: 33303 Epoch: [32] [3110/4276] eta: 1:03:05 lr: 1.0780247444341147e-05 loss: 0.0723 (0.0826) time: 2.9327 data: 0.0081 max mem: 33303 Epoch: [32] [3120/4276] eta: 1:02:31 lr: 1.0777127407450885e-05 loss: 0.0743 (0.0826) time: 2.9340 data: 0.0084 max mem: 33303 Epoch: [32] [3130/4276] eta: 1:01:58 lr: 1.0774007270194326e-05 loss: 
0.0773 (0.0826) time: 2.9345 data: 0.0094 max mem: 33303 Epoch: [32] [3140/4276] eta: 1:01:24 lr: 1.0770887032535942e-05 loss: 0.0773 (0.0826) time: 2.9536 data: 0.0095 max mem: 33303 Epoch: [32] [3150/4276] eta: 1:00:51 lr: 1.0767766694440188e-05 loss: 0.0843 (0.0826) time: 2.9810 data: 0.0083 max mem: 33303 Epoch: [32] [3160/4276] eta: 1:00:18 lr: 1.0764646255871478e-05 loss: 0.0850 (0.0826) time: 2.9843 data: 0.0079 max mem: 33303 Epoch: [32] [3170/4276] eta: 0:59:44 lr: 1.0761525716794222e-05 loss: 0.0800 (0.0826) time: 2.9915 data: 0.0075 max mem: 33303 Epoch: [32] [3180/4276] eta: 0:59:11 lr: 1.0758405077172795e-05 loss: 0.0765 (0.0826) time: 2.9785 data: 0.0071 max mem: 33303 Epoch: [32] [3190/4276] eta: 0:58:38 lr: 1.0755284336971556e-05 loss: 0.0765 (0.0826) time: 2.9540 data: 0.0073 max mem: 33303 Epoch: [32] [3200/4276] eta: 0:58:04 lr: 1.0752163496154826e-05 loss: 0.0834 (0.0826) time: 2.9637 data: 0.0073 max mem: 33303 Epoch: [32] [3210/4276] eta: 0:57:31 lr: 1.0749042554686913e-05 loss: 0.0775 (0.0826) time: 2.9742 data: 0.0071 max mem: 33303 Epoch: [32] [3220/4276] eta: 0:56:58 lr: 1.07459215125321e-05 loss: 0.0715 (0.0826) time: 2.9650 data: 0.0070 max mem: 33303 Epoch: [32] [3230/4276] eta: 0:56:25 lr: 1.0742800369654648e-05 loss: 0.0780 (0.0826) time: 2.9658 data: 0.0071 max mem: 33303 Epoch: [32] [3240/4276] eta: 0:55:51 lr: 1.073967912601878e-05 loss: 0.0858 (0.0826) time: 2.9547 data: 0.0073 max mem: 33303 Epoch: [32] [3250/4276] eta: 0:55:18 lr: 1.0736557781588706e-05 loss: 0.0906 (0.0826) time: 2.9432 data: 0.0073 max mem: 33303 Epoch: [32] [3260/4276] eta: 0:54:45 lr: 1.0733436336328616e-05 loss: 0.0886 (0.0827) time: 2.9534 data: 0.0074 max mem: 33303 Epoch: [32] [3270/4276] eta: 0:54:12 lr: 1.073031479020267e-05 loss: 0.0842 (0.0827) time: 2.9602 data: 0.0074 max mem: 33303 Epoch: [32] [3280/4276] eta: 0:53:38 lr: 1.0727193143174996e-05 loss: 0.0843 (0.0827) time: 2.9581 data: 0.0073 max mem: 33303 Epoch: [32] [3290/4276] eta: 0:53:05 lr: 
1.0724071395209707e-05 loss: 0.0863 (0.0827) time: 2.9503 data: 0.0073 max mem: 33303 Epoch: [32] [3300/4276] eta: 0:52:32 lr: 1.0720949546270894e-05 loss: 0.0867 (0.0828) time: 2.9387 data: 0.0074 max mem: 33303 Epoch: [32] [3310/4276] eta: 0:51:59 lr: 1.0717827596322623e-05 loss: 0.0938 (0.0828) time: 2.9555 data: 0.0073 max mem: 33303 Epoch: [32] [3320/4276] eta: 0:51:26 lr: 1.0714705545328918e-05 loss: 0.0905 (0.0828) time: 2.9631 data: 0.0072 max mem: 33303 Epoch: [32] [3330/4276] eta: 0:50:53 lr: 1.0711583393253802e-05 loss: 0.0816 (0.0828) time: 2.9668 data: 0.0072 max mem: 33303 Epoch: [32] [3340/4276] eta: 0:50:20 lr: 1.0708461140061263e-05 loss: 0.0736 (0.0828) time: 2.9639 data: 0.0072 max mem: 33303 Epoch: [32] [3350/4276] eta: 0:49:47 lr: 1.070533878571527e-05 loss: 0.0794 (0.0828) time: 2.9731 data: 0.0075 max mem: 33303 Epoch: [32] [3360/4276] eta: 0:49:14 lr: 1.070221633017975e-05 loss: 0.0794 (0.0828) time: 2.9612 data: 0.0077 max mem: 33303 Epoch: [32] [3370/4276] eta: 0:48:41 lr: 1.0699093773418628e-05 loss: 0.0818 (0.0828) time: 2.9306 data: 0.0076 max mem: 33303 Epoch: [32] [3380/4276] eta: 0:48:08 lr: 1.0695971115395792e-05 loss: 0.0766 (0.0828) time: 2.9367 data: 0.0075 max mem: 33303 Epoch: [32] [3390/4276] eta: 0:47:35 lr: 1.0692848356075116e-05 loss: 0.0707 (0.0828) time: 2.9383 data: 0.0074 max mem: 33303 Epoch: [32] [3400/4276] eta: 0:47:02 lr: 1.0689725495420427e-05 loss: 0.0772 (0.0828) time: 2.9388 data: 0.0073 max mem: 33303 Epoch: [32] [3410/4276] eta: 0:46:29 lr: 1.068660253339555e-05 loss: 0.0753 (0.0828) time: 2.9685 data: 0.0072 max mem: 33303 Epoch: [32] [3420/4276] eta: 0:45:56 lr: 1.0683479469964284e-05 loss: 0.0875 (0.0828) time: 2.9782 data: 0.0074 max mem: 33303 Epoch: [32] [3430/4276] eta: 0:45:23 lr: 1.0680356305090381e-05 loss: 0.0908 (0.0828) time: 2.9531 data: 0.0076 max mem: 33303 Epoch: [32] [3440/4276] eta: 0:44:51 lr: 1.0677233038737594e-05 loss: 0.0784 (0.0828) time: 2.9440 data: 0.0080 max mem: 33303 Epoch: [32] 
[3450/4276] eta: 0:44:18 lr: 1.067410967086964e-05 loss: 0.0909 (0.0829) time: 2.9616 data: 0.0078 max mem: 33303 Epoch: [32] [3460/4276] eta: 0:43:45 lr: 1.0670986201450215e-05 loss: 0.0932 (0.0829) time: 2.9851 data: 0.0073 max mem: 33303 Epoch: [32] [3470/4276] eta: 0:43:12 lr: 1.066786263044298e-05 loss: 0.0803 (0.0829) time: 2.9974 data: 0.0072 max mem: 33303 Epoch: [32] [3480/4276] eta: 0:42:40 lr: 1.066473895781158e-05 loss: 0.0804 (0.0829) time: 2.9724 data: 0.0073 max mem: 33303 Epoch: [32] [3490/4276] eta: 0:42:07 lr: 1.066161518351964e-05 loss: 0.0850 (0.0829) time: 2.9413 data: 0.0072 max mem: 33303 Epoch: [32] [3500/4276] eta: 0:41:34 lr: 1.0658491307530755e-05 loss: 0.0756 (0.0829) time: 2.9444 data: 0.0073 max mem: 33303 Epoch: [32] [3510/4276] eta: 0:41:01 lr: 1.0655367329808483e-05 loss: 0.0768 (0.0829) time: 2.9457 data: 0.0077 max mem: 33303 Epoch: [32] [3520/4276] eta: 0:40:29 lr: 1.0652243250316375e-05 loss: 0.0789 (0.0829) time: 2.9465 data: 0.0077 max mem: 33303 Epoch: [32] [3530/4276] eta: 0:39:56 lr: 1.064911906901795e-05 loss: 0.0881 (0.0829) time: 2.9457 data: 0.0075 max mem: 33303 Epoch: [32] [3540/4276] eta: 0:39:23 lr: 1.0645994785876709e-05 loss: 0.0881 (0.0829) time: 2.9469 data: 0.0077 max mem: 33303 Epoch: [32] [3550/4276] eta: 0:38:51 lr: 1.0642870400856107e-05 loss: 0.0759 (0.0829) time: 2.9483 data: 0.0080 max mem: 33303 Epoch: [32] [3560/4276] eta: 0:38:18 lr: 1.0639745913919596e-05 loss: 0.0759 (0.0829) time: 2.9472 data: 0.0080 max mem: 33303 Epoch: [32] [3570/4276] eta: 0:37:45 lr: 1.0636621325030594e-05 loss: 0.0826 (0.0829) time: 2.9591 data: 0.0080 max mem: 33303 Epoch: [32] [3580/4276] eta: 0:37:13 lr: 1.0633496634152502e-05 loss: 0.0826 (0.0830) time: 2.9741 data: 0.0080 max mem: 33303 Epoch: [32] [3590/4276] eta: 0:36:40 lr: 1.0630371841248678e-05 loss: 0.0767 (0.0830) time: 2.9640 data: 0.0079 max mem: 33303 Epoch: [32] [3600/4276] eta: 0:36:08 lr: 1.0627246946282469e-05 loss: 0.0859 (0.0830) time: 2.9326 data: 0.0078 
max mem: 33303 Epoch: [32] [3610/4276] eta: 0:35:35 lr: 1.0624121949217197e-05 loss: 0.0856 (0.0830) time: 2.9124 data: 0.0080 max mem: 33303 Epoch: [32] [3620/4276] eta: 0:35:02 lr: 1.0620996850016158e-05 loss: 0.0731 (0.0829) time: 2.9007 data: 0.0082 max mem: 33303 Epoch: [32] [3630/4276] eta: 0:34:30 lr: 1.0617871648642613e-05 loss: 0.0832 (0.0829) time: 2.8864 data: 0.0075 max mem: 33303 Epoch: [32] [3640/4276] eta: 0:33:57 lr: 1.0614746345059804e-05 loss: 0.0832 (0.0829) time: 2.8835 data: 0.0072 max mem: 33303 Epoch: [32] [3650/4276] eta: 0:33:25 lr: 1.0611620939230957e-05 loss: 0.0810 (0.0829) time: 2.8834 data: 0.0079 max mem: 33303 Epoch: [32] [3660/4276] eta: 0:32:52 lr: 1.0608495431119263e-05 loss: 0.0827 (0.0829) time: 2.9114 data: 0.0085 max mem: 33303 Epoch: [32] [3670/4276] eta: 0:32:20 lr: 1.0605369820687883e-05 loss: 0.0827 (0.0829) time: 2.9441 data: 0.0088 max mem: 33303 Epoch: [32] [3680/4276] eta: 0:31:47 lr: 1.0602244107899965e-05 loss: 0.0764 (0.0829) time: 2.9479 data: 0.0089 max mem: 33303 Epoch: [32] [3690/4276] eta: 0:31:15 lr: 1.0599118292718619e-05 loss: 0.0827 (0.0830) time: 2.9474 data: 0.0088 max mem: 33303 Epoch: [32] [3700/4276] eta: 0:30:42 lr: 1.0595992375106948e-05 loss: 0.0827 (0.0829) time: 2.9484 data: 0.0086 max mem: 33303 Epoch: [32] [3710/4276] eta: 0:30:10 lr: 1.0592866355028005e-05 loss: 0.0813 (0.0829) time: 2.9485 data: 0.0086 max mem: 33303 Epoch: [32] [3720/4276] eta: 0:29:38 lr: 1.0589740232444837e-05 loss: 0.0777 (0.0829) time: 2.9474 data: 0.0088 max mem: 33303 Epoch: [32] [3730/4276] eta: 0:29:05 lr: 1.0586614007320461e-05 loss: 0.0777 (0.0830) time: 2.9331 data: 0.0091 max mem: 33303 Epoch: [32] [3740/4276] eta: 0:28:33 lr: 1.058348767961786e-05 loss: 0.0769 (0.0829) time: 2.9297 data: 0.0086 max mem: 33303 Epoch: [32] [3750/4276] eta: 0:28:01 lr: 1.0580361249299999e-05 loss: 0.0791 (0.0829) time: 2.9436 data: 0.0080 max mem: 33303 Epoch: [32] [3760/4276] eta: 0:27:28 lr: 1.0577234716329818e-05 loss: 0.0762 
(0.0829) time: 3.0116 data: 0.0075 max mem: 33303 Epoch: [32] [3770/4276] eta: 0:33:44 lr: 1.0574108080670235e-05 loss: 0.0701 (0.0830) time: 154.9523 data: 151.3996 max mem: 33303 Epoch: [32] [3780/4276] eta: 0:33:36 lr: 1.0570981342284127e-05 loss: 0.0701 (0.0829) time: 167.7421 data: 163.9925 max mem: 33303 Epoch: [32] [3790/4276] eta: 0:33:27 lr: 1.0567854501134363e-05 loss: 0.0668 (0.0829) time: 28.4175 data: 24.8906 max mem: 33303 Epoch: [32] [3800/4276] eta: 0:33:14 lr: 1.0564727557183775e-05 loss: 0.0809 (0.0829) time: 27.7598 data: 24.1809 max mem: 33303 Epoch: [32] [3810/4276] eta: 0:33:33 lr: 1.0561600510395182e-05 loss: 0.0808 (0.0829) time: 40.6605 data: 36.8461 max mem: 33303 Epoch: [32] [3820/4276] eta: 0:33:18 lr: 1.0558473360731354e-05 loss: 0.0788 (0.0829) time: 41.0438 data: 37.2699 max mem: 33303 Epoch: [32] [3830/4276] eta: 0:33:05 lr: 1.0555346108155058e-05 loss: 0.0728 (0.0829) time: 29.2520 data: 25.6704 max mem: 33303 Epoch: [32] [3840/4276] eta: 0:32:48 lr: 1.0552218752629028e-05 loss: 0.0686 (0.0829) time: 29.5477 data: 25.9210 max mem: 33303 Epoch: [32] [3850/4276] eta: 0:32:57 lr: 1.0549091294115974e-05 loss: 0.0685 (0.0828) time: 41.0501 data: 37.2423 max mem: 33303 Epoch: [32] [3860/4276] eta: 0:32:36 lr: 1.054596373257857e-05 loss: 0.0880 (0.0829) time: 40.9166 data: 37.1445 max mem: 33303 Epoch: [32] [3870/4276] eta: 0:32:15 lr: 1.0542836067979473e-05 loss: 0.0751 (0.0829) time: 28.7478 data: 25.2886 max mem: 33303 Epoch: [32] [3880/4276] eta: 0:31:50 lr: 1.0539708300281317e-05 loss: 0.0729 (0.0829) time: 28.4353 data: 25.0021 max mem: 33303 Epoch: [32] [3890/4276] eta: 0:31:50 lr: 1.053658042944671e-05 loss: 0.0794 (0.0829) time: 40.3014 data: 36.5665 max mem: 33303 Epoch: [32] [3900/4276] eta: 0:31:24 lr: 1.0533452455438217e-05 loss: 0.0876 (0.0829) time: 41.1279 data: 37.4009 max mem: 33303 Epoch: [32] [3910/4276] eta: 0:30:54 lr: 1.0530324378218397e-05 loss: 0.0773 (0.0829) time: 28.2113 data: 24.7791 max mem: 33303 Epoch: [32] 
[3920/4276] eta: 0:30:27 lr: 1.052719619774978e-05 loss: 0.0681 (0.0828) time: 28.7242 data: 25.1891 max mem: 33303 Epoch: [32] [3930/4276] eta: 0:30:18 lr: 1.0524067913994865e-05 loss: 0.0727 (0.0828) time: 41.9998 data: 38.1094 max mem: 33303 Epoch: [32] [3940/4276] eta: 0:29:45 lr: 1.0520939526916121e-05 loss: 0.0780 (0.0828) time: 40.6515 data: 36.8656 max mem: 33303 Epoch: [32] [3950/4276] eta: 0:29:12 lr: 1.0517811036475998e-05 loss: 0.0697 (0.0828) time: 28.8474 data: 25.3436 max mem: 33303 Epoch: [32] [3960/4276] eta: 0:28:36 lr: 1.051468244263692e-05 loss: 0.0702 (0.0828) time: 28.6106 data: 24.9910 max mem: 33303 Epoch: [32] [3970/4276] eta: 0:28:18 lr: 1.0511553745361286e-05 loss: 0.0749 (0.0828) time: 40.4271 data: 36.5256 max mem: 33303 Epoch: [32] [3980/4276] eta: 0:27:40 lr: 1.050842494461146e-05 loss: 0.0737 (0.0828) time: 41.0342 data: 37.2264 max mem: 33303 Epoch: [32] [3990/4276] eta: 0:27:01 lr: 1.0505296040349787e-05 loss: 0.0697 (0.0828) time: 28.9490 data: 25.4384 max mem: 33303 Epoch: [32] [4000/4276] eta: 0:26:19 lr: 1.0502167032538585e-05 loss: 0.0697 (0.0827) time: 28.5461 data: 24.9885 max mem: 33303 Epoch: [32] [4010/4276] eta: 0:25:53 lr: 1.0499037921140151e-05 loss: 0.0772 (0.0828) time: 39.6831 data: 35.8415 max mem: 33303 Epoch: [32] [4020/4276] eta: 0:25:09 lr: 1.0495908706116742e-05 loss: 0.0849 (0.0828) time: 40.1610 data: 36.3922 max mem: 33303 Epoch: [32] [4030/4276] eta: 0:24:24 lr: 1.0492779387430598e-05 loss: 0.0809 (0.0827) time: 28.9585 data: 25.4129 max mem: 33303 Epoch: [32] [4040/4276] eta: 0:23:38 lr: 1.0489649965043942e-05 loss: 0.0792 (0.0828) time: 29.2380 data: 25.6673 max mem: 33303 Epoch: [32] [4050/4276] eta: 0:23:05 lr: 1.0486520438918944e-05 loss: 0.0787 (0.0828) time: 42.0404 data: 38.2103 max mem: 33303 Epoch: [32] [4060/4276] eta: 0:22:15 lr: 1.0483390809017773e-05 loss: 0.0778 (0.0828) time: 40.9861 data: 37.1628 max mem: 33303 Epoch: [32] [4070/4276] eta: 0:21:25 lr: 1.0480261075302562e-05 loss: 0.0872 
(0.0828) time: 28.1877 data: 24.7401 max mem: 33303 Epoch: [32] [4080/4276] eta: 0:20:34 lr: 1.0477131237735424e-05 loss: 0.0865 (0.0828) time: 29.4350 data: 25.9820 max mem: 33303 Epoch: [32] [4090/4276] eta: 0:19:52 lr: 1.0474001296278428e-05 loss: 0.0765 (0.0828) time: 41.5018 data: 37.7305 max mem: 33303 Epoch: [32] [4100/4276] eta: 0:18:57 lr: 1.0470871250893635e-05 loss: 0.0816 (0.0829) time: 40.5349 data: 36.7747 max mem: 33303 Epoch: [32] [4110/4276] eta: 0:18:01 lr: 1.0467741101543072e-05 loss: 0.0805 (0.0828) time: 27.9231 data: 24.4707 max mem: 33303 Epoch: [32] [4120/4276] eta: 0:17:04 lr: 1.0464610848188746e-05 loss: 0.0811 (0.0829) time: 27.8533 data: 24.4135 max mem: 33303 Epoch: [32] [4130/4276] eta: 0:16:15 lr: 1.0461480490792621e-05 loss: 0.0753 (0.0828) time: 40.3034 data: 36.5728 max mem: 33303 Epoch: [32] [4140/4276] eta: 0:15:15 lr: 1.0458350029316652e-05 loss: 0.0706 (0.0828) time: 40.9835 data: 37.2438 max mem: 33303 Epoch: [32] [4150/4276] eta: 0:14:14 lr: 1.0455219463722759e-05 loss: 0.0732 (0.0828) time: 28.3766 data: 24.8062 max mem: 33303 Epoch: [32] [4160/4276] eta: 0:13:13 lr: 1.0452088793972845e-05 loss: 0.0812 (0.0828) time: 28.2818 data: 24.7149 max mem: 33303 Epoch: [32] [4170/4276] eta: 0:12:16 lr: 1.0448958020028764e-05 loss: 0.0873 (0.0828) time: 40.8825 data: 37.1729 max mem: 33303 Epoch: [32] [4180/4276] eta: 0:11:12 lr: 1.0445827141852365e-05 loss: 0.0729 (0.0828) time: 40.8810 data: 37.0865 max mem: 33303 Epoch: [32] [4190/4276] eta: 0:10:06 lr: 1.0442696159405466e-05 loss: 0.0744 (0.0828) time: 28.6628 data: 25.1524 max mem: 33303 Epoch: [32] [4200/4276] eta: 0:08:59 lr: 1.0439565072649858e-05 loss: 0.0892 (0.0828) time: 28.2972 data: 24.8513 max mem: 33303 Epoch: [32] [4210/4276] eta: 0:07:55 lr: 1.0436433881547293e-05 loss: 0.0856 (0.0829) time: 40.7430 data: 37.0055 max mem: 33303 Epoch: [32] [4220/4276] eta: 0:06:46 lr: 1.043330258605951e-05 loss: 0.0891 (0.0829) time: 41.0736 data: 37.3511 max mem: 33303 Epoch: [32] 
[4230/4276] eta: 0:05:36 lr: 1.0430171186148217e-05 loss: 0.0900 (0.0829) time: 28.1218 data: 24.6755 max mem: 33303 Epoch: [32] [4240/4276] eta: 0:04:24 lr: 1.0427039681775105e-05 loss: 0.0873 (0.0829) time: 28.0412 data: 24.5811 max mem: 33303 Epoch: [32] [4250/4276] eta: 0:03:14 lr: 1.0423908072901812e-05 loss: 0.0865 (0.0830) time: 40.6843 data: 36.9223 max mem: 33303 Epoch: [32] [4260/4276] eta: 0:02:00 lr: 1.0420776359489976e-05 loss: 0.0865 (0.0830) time: 41.2194 data: 37.4587 max mem: 33303 Epoch: [32] [4270/4276] eta: 0:00:45 lr: 1.0417644541501192e-05 loss: 0.0818 (0.0830) time: 28.8658 data: 25.4146 max mem: 33303 Epoch: [32] Total time: 9:02:05 Test: [ 0/21770] eta: 21 days, 12:15:10 time: 85.3702 data: 85.2991 max mem: 33303 Test: [ 100/21770] eta: 16:19:01 time: 1.6439 data: 1.6074 max mem: 33303 Test: [ 200/21770] eta: 14:41:46 time: 2.6530 data: 2.5411 max mem: 33303 Test: [ 300/21770] eta: 13:56:57 time: 1.7601 data: 1.6778 max mem: 33303 Test: [ 400/21770] eta: 13:32:25 time: 2.3405 data: 2.3023 max mem: 33303 Test: [ 500/21770] eta: 13:18:32 time: 1.3401 data: 1.2999 max mem: 33303 Test: [ 600/21770] eta: 13:08:21 time: 2.3098 data: 2.2725 max mem: 33303 Test: [ 700/21770] eta: 12:51:08 time: 1.3578 data: 1.3216 max mem: 33303 Test: [ 800/21770] eta: 12:50:31 time: 2.7036 data: 2.6639 max mem: 33303 Test: [ 900/21770] eta: 12:39:01 time: 1.7043 data: 1.6316 max mem: 33303 Test: [ 1000/21770] eta: 12:38:21 time: 2.5611 data: 2.4512 max mem: 33303 Test: [ 1100/21770] eta: 12:28:34 time: 1.5621 data: 1.5063 max mem: 33303 Test: [ 1200/21770] eta: 12:24:45 time: 2.4986 data: 2.4252 max mem: 33303 Test: [ 1300/21770] eta: 12:16:53 time: 1.7037 data: 1.6335 max mem: 33303 Test: [ 1400/21770] eta: 12:15:36 time: 1.8536 data: 1.8171 max mem: 33303 Test: [ 1500/21770] eta: 12:07:05 time: 1.6502 data: 1.5976 max mem: 33303 Test: [ 1600/21770] eta: 12:03:38 time: 2.1231 data: 2.0859 max mem: 33303 Test: [ 1700/21770] eta: 11:59:18 time: 1.8699 data: 1.8278 
max mem: 33303 Test: [ 1800/21770] eta: 11:58:08 time: 3.0532 data: 3.0091 max mem: 33303 Test: [ 1900/21770] eta: 11:51:48 time: 1.8725 data: 1.8326 max mem: 33303 Test: [ 2000/21770] eta: 11:50:18 time: 2.6809 data: 2.5873 max mem: 33303 Test: [ 2100/21770] eta: 11:44:40 time: 1.8137 data: 1.7043 max mem: 33303 Test: [ 2200/21770] eta: 11:40:45 time: 2.5805 data: 2.4707 max mem: 33303 Test: [ 2300/21770] eta: 11:36:07 time: 1.7695 data: 1.6590 max mem: 33303 Test: [ 2400/21770] eta: 11:21:53 time: 1.3729 data: 1.3364 max mem: 33303 Test: [ 2500/21770] eta: 10:57:31 time: 0.8081 data: 0.6804 max mem: 33303 Test: [ 2600/21770] eta: 10:37:22 time: 0.8899 data: 0.7986 max mem: 33303 Test: [ 2700/21770] eta: 10:18:01 time: 0.7917 data: 0.6828 max mem: 33303 Test: [ 2800/21770] eta: 10:01:49 time: 0.8004 data: 0.7639 max mem: 33303 Test: [ 2900/21770] eta: 9:46:02 time: 0.4770 data: 0.4406 max mem: 33303 Test: [ 3000/21770] eta: 9:31:32 time: 0.4042 data: 0.3679 max mem: 33303 Test: [ 3100/21770] eta: 9:17:34 time: 0.7194 data: 0.6827 max mem: 33303 Test: [ 3200/21770] eta: 9:03:49 time: 0.4307 data: 0.3816 max mem: 33303 Test: [ 3300/21770] eta: 8:52:13 time: 0.8507 data: 0.8138 max mem: 33303 Test: [ 3400/21770] eta: 8:41:40 time: 0.9179 data: 0.8810 max mem: 33303 Test: [ 3500/21770] eta: 8:30:05 time: 0.4613 data: 0.4247 max mem: 33303 Test: [ 3600/21770] eta: 8:21:03 time: 0.6782 data: 0.6414 max mem: 33303 Test: [ 3700/21770] eta: 8:09:19 time: 0.6238 data: 0.5870 max mem: 33303 Test: [ 3800/21770] eta: 7:59:29 time: 1.1443 data: 1.1068 max mem: 33303 Test: [ 3900/21770] eta: 7:49:49 time: 0.4142 data: 0.3777 max mem: 33303 Test: [ 4000/21770] eta: 7:41:10 time: 0.8516 data: 0.8154 max mem: 33303 Test: [ 4100/21770] eta: 7:31:42 time: 0.4692 data: 0.4325 max mem: 33303 Test: [ 4200/21770] eta: 7:22:55 time: 0.3734 data: 0.3370 max mem: 33303 Test: [ 4300/21770] eta: 7:15:12 time: 0.6483 data: 0.6119 max mem: 33303 Test: [ 4400/21770] eta: 7:08:13 time: 0.7961 
data: 0.7589 max mem: 33303 Test: [ 4500/21770] eta: 7:01:20 time: 0.7449 data: 0.7082 max mem: 33303 Test: [ 4600/21770] eta: 6:53:52 time: 0.4828 data: 0.4461 max mem: 33303 Test: [ 4700/21770] eta: 6:46:46 time: 0.7492 data: 0.7125 max mem: 33303 Test: [ 4800/21770] eta: 6:39:28 time: 0.3132 data: 0.2768 max mem: 33303 Test: [ 4900/21770] eta: 6:33:02 time: 0.6623 data: 0.6260 max mem: 33303 Test: [ 5000/21770] eta: 6:26:29 time: 0.0604 data: 0.0239 max mem: 33303 Test: [ 5100/21770] eta: 6:20:32 time: 0.7039 data: 0.6676 max mem: 33303 Test: [ 5200/21770] eta: 6:15:16 time: 0.7911 data: 0.7545 max mem: 33303 Test: [ 5300/21770] eta: 6:10:22 time: 0.3493 data: 0.3129 max mem: 33303 Test: [ 5400/21770] eta: 6:04:48 time: 0.3628 data: 0.3256 max mem: 33303 Test: [ 5500/21770] eta: 6:00:09 time: 0.9205 data: 0.8843 max mem: 33303 Test: [ 5600/21770] eta: 5:55:21 time: 0.7688 data: 0.6763 max mem: 33303 Test: [ 5700/21770] eta: 5:50:16 time: 0.9072 data: 0.8533 max mem: 33303 Test: [ 5800/21770] eta: 5:44:55 time: 0.8218 data: 0.7845 max mem: 33303 Test: [ 5900/21770] eta: 5:40:18 time: 1.1742 data: 1.1371 max mem: 33303 Test: [ 6000/21770] eta: 5:35:33 time: 0.6246 data: 0.5876 max mem: 33303 Test: [ 6100/21770] eta: 5:31:13 time: 0.3879 data: 0.3509 max mem: 33303 Test: [ 6200/21770] eta: 5:26:20 time: 0.6639 data: 0.6270 max mem: 33303 Test: [ 6300/21770] eta: 5:22:01 time: 0.7345 data: 0.6983 max mem: 33303 Test: [ 6400/21770] eta: 5:17:49 time: 0.7233 data: 0.6870 max mem: 33303 Test: [ 6500/21770] eta: 5:13:25 time: 0.7645 data: 0.7280 max mem: 33303 Test: [ 6600/21770] eta: 5:09:05 time: 0.7502 data: 0.7138 max mem: 33303 Test: [ 6700/21770] eta: 5:05:02 time: 0.6553 data: 0.6191 max mem: 33303 Test: [ 6800/21770] eta: 5:01:01 time: 0.7090 data: 0.6728 max mem: 33303 Test: [ 6900/21770] eta: 4:57:20 time: 0.7782 data: 0.7420 max mem: 33303 Test: [ 7000/21770] eta: 4:54:01 time: 1.2029 data: 1.1485 max mem: 33303 Test: [ 7100/21770] eta: 4:50:05 time: 0.8529 
data: 0.7659 max mem: 33303 Test: [ 7200/21770] eta: 4:46:03 time: 0.8144 data: 0.7412 max mem: 33303 Test: [ 7300/21770] eta: 4:42:32 time: 0.8998 data: 0.8282 max mem: 33303 Test: [ 7400/21770] eta: 4:38:52 time: 0.7076 data: 0.6706 max mem: 33303 Test: [ 7500/21770] eta: 4:35:28 time: 0.3533 data: 0.3168 max mem: 33303 Test: [ 7600/21770] eta: 4:32:21 time: 0.6807 data: 0.6443 max mem: 33303 Test: [ 7700/21770] eta: 4:28:48 time: 0.3960 data: 0.3593 max mem: 33303 Test: [ 7800/21770] eta: 4:26:07 time: 0.8743 data: 0.8376 max mem: 33303 Test: [ 7900/21770] eta: 4:23:12 time: 1.2020 data: 1.1479 max mem: 33303 Test: [ 8000/21770] eta: 4:20:04 time: 0.6033 data: 0.5669 max mem: 33303 Test: [ 8100/21770] eta: 4:17:07 time: 0.8539 data: 0.8174 max mem: 33303 Test: [ 8200/21770] eta: 4:14:12 time: 0.7662 data: 0.7296 max mem: 33303 Test: [ 8300/21770] eta: 4:11:57 time: 1.3704 data: 1.2609 max mem: 33303 Test: [ 8400/21770] eta: 4:11:23 time: 2.0193 data: 1.9827 max mem: 33303 Test: [ 8500/21770] eta: 4:10:14 time: 1.1573 data: 1.1204 max mem: 33303 Test: [ 8600/21770] eta: 4:10:01 time: 1.7428 data: 1.7066 max mem: 33303 Test: [ 8700/21770] eta: 4:09:06 time: 1.8650 data: 1.7946 max mem: 33303 Test: [ 8800/21770] eta: 4:07:59 time: 1.7988 data: 1.7621 max mem: 33303 Test: [ 8900/21770] eta: 4:07:04 time: 1.3195 data: 1.2831 max mem: 33303 Test: [ 9000/21770] eta: 4:05:51 time: 1.2977 data: 1.2117 max mem: 33303 Test: [ 9100/21770] eta: 4:04:50 time: 1.2311 data: 1.1948 max mem: 33303 Test: [ 9200/21770] eta: 4:04:08 time: 1.3052 data: 1.2323 max mem: 33303 Test: [ 9300/21770] eta: 4:03:03 time: 1.4293 data: 1.3926 max mem: 33303 Test: [ 9400/21770] eta: 4:02:07 time: 1.3666 data: 1.3300 max mem: 33303 Test: [ 9500/21770] eta: 4:01:16 time: 1.3119 data: 1.2756 max mem: 33303 Test: [ 9600/21770] eta: 4:00:34 time: 1.6153 data: 1.5594 max mem: 33303 Test: [ 9700/21770] eta: 4:00:30 time: 2.3909 data: 2.3528 max mem: 33303 Test: [ 9800/21770] eta: 4:00:20 time: 1.5649 
data: 1.5235 max mem: 33303 Test: [ 9900/21770] eta: 3:59:32 time: 1.7203 data: 1.6831 max mem: 33303 Test: [10000/21770] eta: 3:59:07 time: 2.3751 data: 2.2839 max mem: 33303 Test: [10100/21770] eta: 3:58:13 time: 1.5343 data: 1.4977 max mem: 33303 Test: [10200/21770] eta: 3:57:40 time: 1.4240 data: 1.3872 max mem: 33303 Test: [10300/21770] eta: 3:57:00 time: 1.3754 data: 1.3387 max mem: 33303 Test: [10400/21770] eta: 3:55:31 time: 1.3491 data: 1.2936 max mem: 33303 Test: [10500/21770] eta: 3:53:54 time: 2.3991 data: 2.3417 max mem: 33303 Test: [10600/21770] eta: 3:53:17 time: 1.7986 data: 1.6716 max mem: 33303 Test: [10700/21770] eta: 3:53:00 time: 2.7143 data: 2.6154 max mem: 33303 Test: [10800/21770] eta: 3:52:27 time: 2.8380 data: 2.7794 max mem: 33303 Test: [10900/21770] eta: 3:51:39 time: 2.1590 data: 2.1223 max mem: 33303 Test: [11000/21770] eta: 3:50:40 time: 1.7189 data: 1.6817 max mem: 33303 Test: [11100/21770] eta: 3:50:14 time: 2.6536 data: 2.5410 max mem: 33303 Test: [11200/21770] eta: 3:49:10 time: 1.7055 data: 1.5991 max mem: 33303 Test: [11300/21770] eta: 3:48:32 time: 2.6089 data: 2.5041 max mem: 33303 Test: [11400/21770] eta: 3:47:25 time: 1.6282 data: 1.5909 max mem: 33303 Test: [11500/21770] eta: 3:46:31 time: 2.5128 data: 2.4746 max mem: 33303 Test: [11600/21770] eta: 3:45:26 time: 1.8160 data: 1.7778 max mem: 33303 Test: [11700/21770] eta: 3:44:40 time: 2.2073 data: 2.1203 max mem: 33303 Test: [11800/21770] eta: 3:43:22 time: 1.9232 data: 1.8322 max mem: 33303 Test: [11900/21770] eta: 3:42:24 time: 3.0335 data: 2.9372 max mem: 33303 Test: [12000/21770] eta: 3:40:55 time: 1.7284 data: 1.6138 max mem: 33303 Test: [12100/21770] eta: 3:39:53 time: 2.3714 data: 2.2843 max mem: 33303 Test: [12200/21770] eta: 3:38:31 time: 1.5436 data: 1.4618 max mem: 33303 Test: [12300/21770] eta: 3:37:13 time: 2.4665 data: 2.3628 max mem: 33303 Test: [12400/21770] eta: 3:35:48 time: 1.8642 data: 1.7649 max mem: 33303 Test: [12500/21770] eta: 3:34:41 time: 3.6349 
data: 3.5928 max mem: 33303 Test: [12600/21770] eta: 3:32:57 time: 1.5701 data: 1.5336 max mem: 33303 Test: [12700/21770] eta: 3:31:31 time: 2.4395 data: 2.3889 max mem: 33303 Test: [12800/21770] eta: 3:30:06 time: 1.9181 data: 1.8261 max mem: 33303 Test: [12900/21770] eta: 3:28:35 time: 2.5816 data: 2.5160 max mem: 33303 Test: [13000/21770] eta: 3:27:04 time: 1.7956 data: 1.6872 max mem: 33303 Test: [13100/21770] eta: 3:25:04 time: 1.2308 data: 1.1401 max mem: 33303 Test: [13200/21770] eta: 3:22:55 time: 2.0750 data: 1.9875 max mem: 33303 Test: [13300/21770] eta: 3:20:36 time: 1.7524 data: 1.7161 max mem: 33303 Test: [13400/21770] eta: 3:18:20 time: 1.1003 data: 1.0280 max mem: 33303 Test: [13500/21770] eta: 3:15:58 time: 1.0714 data: 1.0344 max mem: 33303 Test: [13600/21770] eta: 3:13:25 time: 1.2095 data: 1.1730 max mem: 33303 Test: [13700/21770] eta: 3:10:53 time: 0.6528 data: 0.6162 max mem: 33303 Test: [13800/21770] eta: 3:08:43 time: 1.4102 data: 1.3370 max mem: 33303 Test: [13900/21770] eta: 3:06:45 time: 1.4361 data: 1.3995 max mem: 33303 Test: [14000/21770] eta: 3:04:20 time: 1.1511 data: 1.1149 max mem: 33303 Test: [14100/21770] eta: 3:01:52 time: 1.8890 data: 1.8348 max mem: 33303 Test: [14200/21770] eta: 2:59:29 time: 1.2949 data: 1.2584 max mem: 33303 Test: [14300/21770] eta: 2:57:05 time: 1.1400 data: 1.1035 max mem: 33303 Test: [14400/21770] eta: 2:54:41 time: 1.4047 data: 1.3683 max mem: 33303 Test: [14500/21770] eta: 2:52:14 time: 2.1118 data: 2.0726 max mem: 33303 Test: [14600/21770] eta: 2:49:27 time: 0.9492 data: 0.8604 max mem: 33303 Test: [14700/21770] eta: 2:46:39 time: 0.8767 data: 0.8405 max mem: 33303 Test: [14800/21770] eta: 2:44:06 time: 1.3647 data: 1.3104 max mem: 33303 Test: [14900/21770] eta: 2:41:30 time: 0.9806 data: 0.9441 max mem: 33303 Test: [15000/21770] eta: 2:38:55 time: 1.0676 data: 1.0308 max mem: 33303 Test: [15100/21770] eta: 2:36:23 time: 1.2044 data: 1.1680 max mem: 33303 Test: [15200/21770] eta: 2:33:55 time: 1.2117 
data: 1.1753 max mem: 33303 Test: [15300/21770] eta: 2:31:30 time: 0.9867 data: 0.9505 max mem: 33303 Test: [15400/21770] eta: 2:29:04 time: 1.5350 data: 1.4985 max mem: 33303 Test: [15500/21770] eta: 2:26:34 time: 1.1191 data: 1.0468 max mem: 33303 Test: [15600/21770] eta: 2:24:07 time: 1.6153 data: 1.5783 max mem: 33303 Test: [15700/21770] eta: 2:21:36 time: 1.1371 data: 1.1008 max mem: 33303 Test: [15800/21770] eta: 2:19:13 time: 1.2595 data: 1.2231 max mem: 33303 Test: [15900/21770] eta: 2:16:51 time: 0.9571 data: 0.9209 max mem: 33303 Test: [16000/21770] eta: 2:14:33 time: 1.5876 data: 1.5512 max mem: 33303 Test: [16100/21770] eta: 2:12:12 time: 1.2896 data: 1.1999 max mem: 33303 Test: [16200/21770] eta: 2:09:51 time: 1.5374 data: 1.4468 max mem: 33303 Test: [16300/21770] eta: 2:07:48 time: 2.3510 data: 2.2964 max mem: 33303 Test: [16400/21770] eta: 2:05:44 time: 1.5959 data: 1.5586 max mem: 33303 Test: [16500/21770] eta: 2:03:42 time: 2.2195 data: 2.1461 max mem: 33303 Test: [16600/21770] eta: 2:01:37 time: 1.3952 data: 1.3589 max mem: 33303 Test: [16700/21770] eta: 1:59:39 time: 2.6105 data: 2.5708 max mem: 33303 Test: [16800/21770] eta: 1:57:37 time: 2.8817 data: 2.8424 max mem: 33303 Test: [16900/21770] eta: 1:55:30 time: 1.5693 data: 1.5321 max mem: 33303 Test: [17000/21770] eta: 1:53:28 time: 2.4144 data: 2.3769 max mem: 33303 Test: [17100/21770] eta: 1:51:17 time: 1.5134 data: 1.4624 max mem: 33303 Test: [17200/21770] eta: 1:49:13 time: 2.3087 data: 2.2712 max mem: 33303 Test: [17300/21770] eta: 1:47:01 time: 1.4816 data: 1.4453 max mem: 33303 Test: [17400/21770] eta: 1:44:55 time: 2.8198 data: 2.7441 max mem: 33303 Test: [17500/21770] eta: 1:42:42 time: 1.4000 data: 1.3632 max mem: 33303 Test: [17600/21770] eta: 1:40:34 time: 1.7068 data: 1.6693 max mem: 33303 Test: [17700/21770] eta: 1:38:24 time: 2.8321 data: 2.7382 max mem: 33303 Test: [17800/21770] eta: 1:36:09 time: 2.2525 data: 2.2150 max mem: 33303 Test: [17900/21770] eta: 1:33:58 time: 1.6429 
data: 1.5304 max mem: 33303 Test: [18000/21770] eta: 1:31:49 time: 3.5906 data: 3.5313 max mem: 33303 Test: [18100/21770] eta: 1:29:30 time: 1.6074 data: 1.5702 max mem: 33303 Test: [18200/21770] eta: 1:27:16 time: 2.2963 data: 2.2539 max mem: 33303 Test: [18300/21770] eta: 1:24:59 time: 1.6512 data: 1.6139 max mem: 33303 Test: [18400/21770] eta: 1:22:43 time: 2.2448 data: 2.2082 max mem: 33303 Test: [18500/21770] eta: 1:20:25 time: 2.1964 data: 2.1595 max mem: 33303 Test: [18600/21770] eta: 1:18:06 time: 1.8197 data: 1.7826 max mem: 33303 Test: [18700/21770] eta: 1:15:49 time: 2.6936 data: 2.6537 max mem: 33303 Test: [18800/21770] eta: 1:13:29 time: 1.5673 data: 1.5301 max mem: 33303 Test: [18900/21770] eta: 1:11:10 time: 2.1619 data: 2.1253 max mem: 33303 Test: [19000/21770] eta: 1:08:48 time: 2.3811 data: 2.3441 max mem: 33303 Test: [19100/21770] eta: 1:06:24 time: 2.4113 data: 2.3730 max mem: 33303 Test: [19200/21770] eta: 1:04:03 time: 1.5340 data: 1.4967 max mem: 33303 Test: [19300/21770] eta: 1:01:41 time: 2.3534 data: 2.3167 max mem: 33303 Test: [19400/21770] eta: 0:59:17 time: 1.6554 data: 1.6185 max mem: 33303 Test: [19500/21770] eta: 0:56:55 time: 2.4024 data: 2.3483 max mem: 33303 Test: [19600/21770] eta: 0:54:30 time: 2.4010 data: 2.3261 max mem: 33303 Test: [19700/21770] eta: 0:52:03 time: 1.6258 data: 1.5516 max mem: 33303 Test: [19800/21770] eta: 0:49:40 time: 2.3147 data: 2.2594 max mem: 33303 Test: [19900/21770] eta: 0:47:13 time: 1.7295 data: 1.6359 max mem: 33303 Test: [20000/21770] eta: 0:44:46 time: 2.4153 data: 2.3416 max mem: 33303 Test: [20100/21770] eta: 0:42:19 time: 1.6770 data: 1.6397 max mem: 33303 Test: [20200/21770] eta: 0:39:52 time: 2.4333 data: 2.3054 max mem: 33303 Test: [20300/21770] eta: 0:37:23 time: 1.6281 data: 1.5605 max mem: 33303 Test: [20400/21770] eta: 0:34:54 time: 2.1323 data: 2.0956 max mem: 33303 Test: [20500/21770] eta: 0:32:25 time: 1.5226 data: 1.4860 max mem: 33303 Test: [20600/21770] eta: 0:29:55 time: 2.6251 
data: 2.5127 max mem: 33303
Test: [20700/21770] eta: 0:27:25 time: 1.5586 data: 1.5215 max mem: 33303
Test: [20800/21770] eta: 0:24:53 time: 2.2196 data: 2.1671 max mem: 33303
Test: [20900/21770] eta: 0:22:21 time: 2.9420 data: 2.8507 max mem: 33303
Test: [21000/21770] eta: 0:19:46 time: 1.1398 data: 1.0856 max mem: 33303
Test: [21100/21770] eta: 0:17:12 time: 1.6279 data: 1.5904 max mem: 33303
Test: [21200/21770] eta: 0:14:38 time: 1.8079 data: 1.7716 max mem: 33303
Test: [21300/21770] eta: 0:12:04 time: 2.3175 data: 2.2807 max mem: 33303
Test: [21400/21770] eta: 0:09:30 time: 1.1380 data: 1.0722 max mem: 33303
Test: [21500/21770] eta: 0:06:56 time: 1.4121 data: 1.3758 max mem: 33303
Test: [21600/21770] eta: 0:04:21 time: 1.7925 data: 1.7557 max mem: 33303
Test: [21700/21770] eta: 0:01:47 time: 1.4611 data: 1.4071 max mem: 33303
Test: Total time: 9:19:37
Final results:
Mean IoU is 0.00
precision@0.5 = 0.00
precision@0.6 = 0.00
precision@0.7 = 0.00
precision@0.8 = 0.00
precision@0.9 = 0.00
overall IoU = 0.00
mean IoU = 0.00
Mean accuracy for one-to-zero sample is 37.60
Average object IoU 4.5207339821769405e-06
Overall IoU 0.0004990559536963701
Epoch: [33] [ 0/4276] eta: 16 days, 2:38:51 lr: 1.0415765400495008e-05 loss: 0.0901 (0.0901) time: 325.5219 data: 316.8533 max mem: 33303
Epoch: [33] [ 10/4276] eta: 2 days, 17:57:30 lr: 1.0412633415103181e-05 loss: 0.0823 (0.0804) time: 55.6612 data: 51.7041 max mem: 33303
Epoch: [33] [ 20/4276] eta: 2 days, 2:20:40 lr: 1.0409501325034442e-05 loss: 0.0805 (0.0789) time: 28.4378 data: 24.9456 max mem: 33303
Epoch: [33] [ 30/4276] eta: 1 day, 21:11:04 lr: 1.0406369130250295e-05 loss: 0.0832 (0.0808) time: 28.7670 data: 25.2706 max mem: 33303
Epoch: [33] [ 40/4276] eta: 2 days, 1:18:08 lr: 1.040323683071222e-05 loss: 0.0822 (0.0804) time: 41.1813 data: 37.4012 max mem: 33303
Epoch: [33] [ 50/4276] eta: 1 day, 22:15:25 lr: 1.0400104426381674e-05 loss: 0.0757 (0.0798) time: 41.1021 data: 37.3331 max mem: 33303
Epoch: [33] [
60/4276] eta: 1 day, 19:52:24 lr: 1.0396971917220064e-05 loss: 0.0694 (0.0793) time: 28.3675 data: 24.9104 max mem: 33303 Epoch: [33] [ 70/4276] eta: 1 day, 18:22:56 lr: 1.0393839303188791e-05 loss: 0.0667 (0.0778) time: 28.2968 data: 24.8457 max mem: 33303 Epoch: [33] [ 80/4276] eta: 1 day, 20:49:28 lr: 1.0390706584249227e-05 loss: 0.0696 (0.0784) time: 41.4910 data: 37.7564 max mem: 33303 Epoch: [33] [ 90/4276] eta: 1 day, 19:20:25 lr: 1.0387573760362716e-05 loss: 0.0818 (0.0795) time: 40.8133 data: 37.0782 max mem: 33303 Epoch: [33] [ 100/4276] eta: 1 day, 18:19:27 lr: 1.038444083149056e-05 loss: 0.0771 (0.0800) time: 28.5028 data: 25.0416 max mem: 33303 Epoch: [33] [ 110/4276] eta: 1 day, 17:21:23 lr: 1.0381307797594052e-05 loss: 0.0851 (0.0813) time: 28.7514 data: 25.3003 max mem: 33303 Epoch: [33] [ 120/4276] eta: 1 day, 18:50:17 lr: 1.0378174658634457e-05 loss: 0.0851 (0.0811) time: 40.2414 data: 36.4406 max mem: 33303 Epoch: [33] [ 130/4276] eta: 1 day, 17:56:12 lr: 1.0375041414572995e-05 loss: 0.0803 (0.0823) time: 40.1671 data: 36.2988 max mem: 33303 Epoch: [33] [ 140/4276] eta: 1 day, 17:12:31 lr: 1.0371908065370877e-05 loss: 0.0757 (0.0817) time: 28.3745 data: 24.8615 max mem: 33303 Epoch: [33] [ 150/4276] eta: 1 day, 16:27:29 lr: 1.0368774610989282e-05 loss: 0.0715 (0.0816) time: 28.0067 data: 24.5493 max mem: 33303 Epoch: [33] [ 160/4276] eta: 1 day, 17:40:50 lr: 1.0365641051389362e-05 loss: 0.0793 (0.0817) time: 40.5944 data: 36.7703 max mem: 33303 Epoch: [33] [ 170/4276] eta: 1 day, 17:05:25 lr: 1.036250738653223e-05 loss: 0.0765 (0.0815) time: 41.5110 data: 37.6764 max mem: 33303 Epoch: [33] [ 180/4276] eta: 1 day, 16:28:23 lr: 1.0359373616378988e-05 loss: 0.0768 (0.0816) time: 28.4612 data: 24.9865 max mem: 33303 Epoch: [33] [ 190/4276] eta: 1 day, 16:04:23 lr: 1.03562397408907e-05 loss: 0.0761 (0.0815) time: 29.1503 data: 25.6582 max mem: 33303 Epoch: [33] [ 200/4276] eta: 1 day, 17:02:55 lr: 1.0353105760028418e-05 loss: 0.0704 (0.0815) time: 
42.4340 data: 38.5715 max mem: 33303 Epoch: [33] [ 210/4276] eta: 1 day, 16:27:22 lr: 1.0349971673753137e-05 loss: 0.0724 (0.0812) time: 40.7166 data: 36.8822 max mem: 33303 Epoch: [33] [ 220/4276] eta: 1 day, 16:00:39 lr: 1.0346837482025849e-05 loss: 0.0713 (0.0811) time: 28.0541 data: 24.5927 max mem: 33303 Epoch: [33] [ 230/4276] eta: 1 day, 15:37:39 lr: 1.0343703184807516e-05 loss: 0.0674 (0.0806) time: 29.3490 data: 25.8152 max mem: 33303 Epoch: [33] [ 240/4276] eta: 1 day, 16:20:44 lr: 1.0340568782059067e-05 loss: 0.0712 (0.0804) time: 41.2311 data: 37.3136 max mem: 33303 Epoch: [33] [ 250/4276] eta: 1 day, 15:53:36 lr: 1.0337434273741399e-05 loss: 0.0811 (0.0812) time: 40.4429 data: 36.5593 max mem: 33303 Epoch: [33] [ 260/4276] eta: 1 day, 15:33:34 lr: 1.0334299659815387e-05 loss: 0.0833 (0.0815) time: 29.1296 data: 25.5182 max mem: 33303 Epoch: [33] [ 270/4276] eta: 1 day, 15:06:27 lr: 1.0331164940241881e-05 loss: 0.0752 (0.0813) time: 28.5169 data: 24.9487 max mem: 33303 Epoch: [33] [ 280/4276] eta: 1 day, 15:43:37 lr: 1.0328030114981708e-05 loss: 0.0677 (0.0813) time: 40.0763 data: 36.3448 max mem: 33303 Epoch: [33] [ 290/4276] eta: 1 day, 15:22:47 lr: 1.0324895183995644e-05 loss: 0.0771 (0.0810) time: 41.2850 data: 37.5626 max mem: 33303 Epoch: [33] [ 300/4276] eta: 1 day, 15:01:34 lr: 1.0321760147244457e-05 loss: 0.0791 (0.0812) time: 28.9478 data: 25.5000 max mem: 33303 Epoch: [33] [ 310/4276] eta: 1 day, 14:42:44 lr: 1.0318625004688888e-05 loss: 0.0744 (0.0807) time: 28.9356 data: 25.3789 max mem: 33303 Epoch: [33] [ 320/4276] eta: 1 day, 15:11:02 lr: 1.0315489756289648e-05 loss: 0.0735 (0.0810) time: 40.5057 data: 36.6092 max mem: 33303 Epoch: [33] [ 330/4276] eta: 1 day, 14:49:45 lr: 1.0312354402007406e-05 loss: 0.0772 (0.0809) time: 39.8504 data: 36.0639 max mem: 33303 Epoch: [33] [ 340/4276] eta: 1 day, 14:30:54 lr: 1.030921894180282e-05 loss: 0.0762 (0.0809) time: 28.3168 data: 24.8765 max mem: 33303 Epoch: [33] [ 350/4276] eta: 1 day, 14:10:19 
lr: 1.0306083375636515e-05 loss: 0.0741 (0.0809) time: 28.0165 data: 24.5738 max mem: 33303 Epoch: [33] [ 360/4276] eta: 1 day, 14:36:33 lr: 1.030294770346909e-05 loss: 0.0810 (0.0812) time: 40.0385 data: 36.2948 max mem: 33303 Epoch: [33] [ 370/4276] eta: 1 day, 14:17:33 lr: 1.0299811925261106e-05 loss: 0.0858 (0.0813) time: 40.3859 data: 36.6402 max mem: 33303 Epoch: [33] [ 380/4276] eta: 1 day, 13:59:47 lr: 1.0296676040973108e-05 loss: 0.0809 (0.0815) time: 28.1758 data: 24.7061 max mem: 33303 Epoch: [33] [ 390/4276] eta: 1 day, 13:45:54 lr: 1.0293540050565607e-05 loss: 0.0896 (0.0818) time: 29.2933 data: 25.8171 max mem: 33303 Epoch: [33] [ 400/4276] eta: 1 day, 14:10:00 lr: 1.0290403953999094e-05 loss: 0.0907 (0.0819) time: 41.9156 data: 38.1766 max mem: 33303 Epoch: [33] [ 410/4276] eta: 1 day, 13:53:49 lr: 1.0287267751234013e-05 loss: 0.0883 (0.0821) time: 41.2294 data: 37.4996 max mem: 33303 Epoch: [33] [ 420/4276] eta: 1 day, 13:35:28 lr: 1.02841314422308e-05 loss: 0.0791 (0.0821) time: 28.0059 data: 24.5602 max mem: 33303 Epoch: [33] [ 430/4276] eta: 1 day, 13:22:04 lr: 1.0280995026949856e-05 loss: 0.0773 (0.0821) time: 28.5676 data: 25.1160 max mem: 33303 Epoch: [33] [ 440/4276] eta: 1 day, 13:44:06 lr: 1.0277858505351546e-05 loss: 0.0765 (0.0820) time: 42.1097 data: 38.2708 max mem: 33303 Epoch: [33] [ 450/4276] eta: 1 day, 13:28:09 lr: 1.0274721877396216e-05 loss: 0.0765 (0.0819) time: 41.2535 data: 37.2921 max mem: 33303 Epoch: [33] [ 460/4276] eta: 1 day, 13:13:30 lr: 1.0271585143044184e-05 loss: 0.0723 (0.0817) time: 28.6000 data: 24.9898 max mem: 33303 Epoch: [33] [ 470/4276] eta: 1 day, 13:00:20 lr: 1.026844830225574e-05 loss: 0.0719 (0.0817) time: 29.2948 data: 25.7569 max mem: 33303 Epoch: [33] [ 480/4276] eta: 1 day, 13:17:29 lr: 1.0265311354991134e-05 loss: 0.0778 (0.0817) time: 41.0862 data: 37.2268 max mem: 33303 Epoch: [33] [ 490/4276] eta: 1 day, 13:01:56 lr: 1.0262174301210598e-05 loss: 0.0741 (0.0815) time: 40.1633 data: 36.2710 max mem: 
33303 Epoch: [33] [ 500/4276] eta: 1 day, 12:48:45 lr: 1.0259037140874336e-05 loss: 0.0728 (0.0813) time: 28.6209 data: 24.9677 max mem: 33303 Epoch: [33] [ 510/4276] eta: 1 day, 12:33:02 lr: 1.025589987394253e-05 loss: 0.0733 (0.0813) time: 28.2243 data: 24.5427 max mem: 33303 Epoch: [33] [ 520/4276] eta: 1 day, 12:48:22 lr: 1.025276250037531e-05 loss: 0.0778 (0.0814) time: 39.8043 data: 35.8959 max mem: 33303 Epoch: [33] [ 530/4276] eta: 1 day, 12:35:12 lr: 1.02496250201328e-05 loss: 0.0773 (0.0814) time: 40.8143 data: 37.0232 max mem: 33303 Epoch: [33] [ 540/4276] eta: 1 day, 12:22:14 lr: 1.0246487433175087e-05 loss: 0.0822 (0.0815) time: 29.0363 data: 25.5970 max mem: 33303 Epoch: [33] [ 550/4276] eta: 1 day, 12:09:21 lr: 1.0243349739462236e-05 loss: 0.0827 (0.0818) time: 28.8934 data: 25.2892 max mem: 33303 Epoch: [33] [ 560/4276] eta: 1 day, 12:23:57 lr: 1.024021193895427e-05 loss: 0.0841 (0.0818) time: 41.1117 data: 37.1252 max mem: 33303 Epoch: [33] [ 570/4276] eta: 1 day, 12:10:23 lr: 1.0237074031611193e-05 loss: 0.0783 (0.0817) time: 40.7913 data: 36.9627 max mem: 33303 Epoch: [33] [ 580/4276] eta: 1 day, 11:58:28 lr: 1.0233936017392982e-05 loss: 0.0763 (0.0819) time: 28.7882 data: 25.3336 max mem: 33303 Epoch: [33] [ 590/4276] eta: 1 day, 11:46:38 lr: 1.0230797896259585e-05 loss: 0.0771 (0.0818) time: 29.3537 data: 25.8794 max mem: 33303 Epoch: [33] [ 600/4276] eta: 1 day, 11:59:31 lr: 1.0227659668170912e-05 loss: 0.0734 (0.0817) time: 41.2840 data: 37.4753 max mem: 33303 Epoch: [33] [ 610/4276] eta: 1 day, 11:48:13 lr: 1.0224521333086853e-05 loss: 0.0781 (0.0818) time: 41.5573 data: 37.7250 max mem: 33303 Epoch: [33] [ 620/4276] eta: 1 day, 11:35:45 lr: 1.0221382890967265e-05 loss: 0.0772 (0.0819) time: 29.1207 data: 25.6060 max mem: 33303 Epoch: [33] [ 630/4276] eta: 1 day, 11:23:48 lr: 1.0218244341771989e-05 loss: 0.0830 (0.0822) time: 28.5657 data: 25.0847 max mem: 33303 Epoch: [33] [ 640/4276] eta: 1 day, 11:33:49 lr: 1.0215105685460813e-05 loss: 
0.0853 (0.0822) time: 40.2143 data: 36.4722 max mem: 33303 Epoch: [33] [ 650/4276] eta: 1 day, 11:22:47 lr: 1.0211966921993515e-05 loss: 0.0700 (0.0822) time: 40.6727 data: 36.8611 max mem: 33303 Epoch: [33] [ 660/4276] eta: 1 day, 11:10:34 lr: 1.020882805132984e-05 loss: 0.0821 (0.0824) time: 28.8944 data: 25.3424 max mem: 33303 Epoch: [33] [ 670/4276] eta: 1 day, 10:59:38 lr: 1.0205689073429508e-05 loss: 0.0828 (0.0824) time: 28.7460 data: 25.2700 max mem: 33303 Epoch: [33] [ 680/4276] eta: 1 day, 11:09:39 lr: 1.0202549988252195e-05 loss: 0.0726 (0.0823) time: 41.1294 data: 37.3844 max mem: 33303 Epoch: [33] [ 690/4276] eta: 1 day, 11:00:07 lr: 1.019941079575756e-05 loss: 0.0751 (0.0822) time: 41.9453 data: 38.0634 max mem: 33303 Epoch: [33] [ 700/4276] eta: 1 day, 10:50:18 lr: 1.0196271495905237e-05 loss: 0.0739 (0.0822) time: 30.7208 data: 27.1337 max mem: 33303 Epoch: [33] [ 710/4276] eta: 1 day, 10:39:43 lr: 1.0193132088654827e-05 loss: 0.0739 (0.0822) time: 29.9464 data: 26.5041 max mem: 33303 Epoch: [33] [ 720/4276] eta: 1 day, 10:51:45 lr: 1.018999257396589e-05 loss: 0.0744 (0.0821) time: 43.0641 data: 39.3248 max mem: 33303 Epoch: [33] [ 730/4276] eta: 1 day, 10:41:36 lr: 1.0186852951797973e-05 loss: 0.0689 (0.0820) time: 43.3640 data: 39.5395 max mem: 33303 Epoch: [33] [ 740/4276] eta: 1 day, 10:31:27 lr: 1.0183713222110595e-05 loss: 0.0742 (0.0819) time: 29.9332 data: 26.3888 max mem: 33303 Epoch: [33] [ 750/4276] eta: 1 day, 10:22:45 lr: 1.0180573384863223e-05 loss: 0.0707 (0.0818) time: 30.6777 data: 27.1576 max mem: 33303 Epoch: [33] [ 760/4276] eta: 1 day, 10:32:56 lr: 1.017743344001532e-05 loss: 0.0705 (0.0817) time: 43.7155 data: 39.9157 max mem: 33303 Epoch: [33] [ 770/4276] eta: 1 day, 10:24:09 lr: 1.017429338752631e-05 loss: 0.0740 (0.0818) time: 43.7464 data: 39.9170 max mem: 33303 Epoch: [33] [ 780/4276] eta: 1 day, 10:13:09 lr: 1.0171153227355596e-05 loss: 0.0831 (0.0818) time: 30.0180 data: 26.4633 max mem: 33303 Epoch: [33] [ 790/4276] 
eta: 1 day, 10:02:24 lr: 1.016801295946253e-05 loss: 0.0886 (0.0819) time: 28.5410 data: 25.0645 max mem: 33303 Epoch: [33] [ 800/4276] eta: 1 day, 10:07:56 lr: 1.0164872583806454e-05 loss: 0.0879 (0.0819) time: 39.7564 data: 35.9311 max mem: 33303 Epoch: [33] [ 810/4276] eta: 1 day, 9:58:08 lr: 1.0161732100346677e-05 loss: 0.0879 (0.0821) time: 40.3739 data: 36.5400 max mem: 33303 Epoch: [33] [ 820/4276] eta: 1 day, 9:47:42 lr: 1.0158591509042483e-05 loss: 0.0745 (0.0820) time: 29.3230 data: 25.8372 max mem: 33303 Epoch: [33] [ 830/4276] eta: 1 day, 9:36:57 lr: 1.015545080985311e-05 loss: 0.0686 (0.0820) time: 28.4672 data: 24.9124 max mem: 33303 Epoch: [33] [ 840/4276] eta: 1 day, 9:42:38 lr: 1.015231000273778e-05 loss: 0.0756 (0.0820) time: 40.1085 data: 36.2601 max mem: 33303 Epoch: [33] [ 850/4276] eta: 1 day, 9:31:57 lr: 1.0149169087655685e-05 loss: 0.0775 (0.0819) time: 40.1141 data: 36.3303 max mem: 33303 Epoch: [33] [ 860/4276] eta: 1 day, 9:22:18 lr: 1.0146028064565993e-05 loss: 0.0799 (0.0821) time: 28.8402 data: 25.2901 max mem: 33303 Epoch: [33] [ 870/4276] eta: 1 day, 9:11:49 lr: 1.014288693342782e-05 loss: 0.0758 (0.0820) time: 28.8094 data: 25.2761 max mem: 33303 Epoch: [33] [ 880/4276] eta: 1 day, 9:16:48 lr: 1.0139745694200275e-05 loss: 0.0809 (0.0822) time: 40.0160 data: 36.2320 max mem: 33303 Epoch: [33] [ 890/4276] eta: 1 day, 9:06:46 lr: 1.013660434684243e-05 loss: 0.0934 (0.0823) time: 40.3245 data: 36.5367 max mem: 33303 Epoch: [33] [ 900/4276] eta: 1 day, 8:58:03 lr: 1.0133462891313333e-05 loss: 0.0851 (0.0824) time: 29.6837 data: 26.1112 max mem: 33303 Epoch: [33] [ 910/4276] eta: 1 day, 8:48:01 lr: 1.0130321327571984e-05 loss: 0.0797 (0.0824) time: 29.5204 data: 25.9603 max mem: 33303 Epoch: [33] [ 920/4276] eta: 1 day, 8:53:56 lr: 1.0127179655577371e-05 loss: 0.0805 (0.0826) time: 41.4071 data: 37.5818 max mem: 33303 Epoch: [33] [ 930/4276] eta: 1 day, 8:43:49 lr: 1.0124037875288451e-05 loss: 0.0817 (0.0826) time: 41.3283 data: 37.4361 
max mem: 33303 Epoch: [33] [ 940/4276] eta: 1 day, 8:34:39 lr: 1.012089598666415e-05 loss: 0.0793 (0.0826) time: 28.9288 data: 25.3483 max mem: 33303 Epoch: [33] [ 950/4276] eta: 1 day, 8:24:39 lr: 1.0117753989663351e-05 loss: 0.0840 (0.0827) time: 28.8523 data: 25.2819 max mem: 33303 Epoch: [33] [ 960/4276] eta: 1 day, 8:29:55 lr: 1.0114611884244926e-05 loss: 0.0840 (0.0826) time: 41.2345 data: 37.3421 max mem: 33303 Epoch: [33] [ 970/4276] eta: 1 day, 8:19:41 lr: 1.0111469670367705e-05 loss: 0.0779 (0.0826) time: 40.9944 data: 37.1473 max mem: 33303 Epoch: [33] [ 980/4276] eta: 1 day, 8:10:22 lr: 1.0108327347990501e-05 loss: 0.0792 (0.0827) time: 28.3198 data: 24.8656 max mem: 33303 Epoch: [33] [ 990/4276] eta: 1 day, 8:00:47 lr: 1.0105184917072078e-05 loss: 0.0829 (0.0826) time: 28.7245 data: 25.2769 max mem: 33303 Epoch: [33] [1000/4276] eta: 1 day, 8:04:28 lr: 1.0102042377571181e-05 loss: 0.0791 (0.0826) time: 40.4726 data: 36.7427 max mem: 33303 Epoch: [33] [1010/4276] eta: 1 day, 7:55:10 lr: 1.0098899729446537e-05 loss: 0.0791 (0.0826) time: 40.7089 data: 36.9707 max mem: 33303 Epoch: [33] [1020/4276] eta: 1 day, 7:46:01 lr: 1.0095756972656816e-05 loss: 0.0769 (0.0826) time: 28.9400 data: 25.4463 max mem: 33303 Epoch: [33] [1030/4276] eta: 1 day, 7:36:57 lr: 1.0092614107160678e-05 loss: 0.0883 (0.0827) time: 29.0023 data: 25.4985 max mem: 33303 Epoch: [33] [1040/4276] eta: 1 day, 7:39:50 lr: 1.0089471132916747e-05 loss: 0.0839 (0.0827) time: 40.4556 data: 36.6602 max mem: 33303 Epoch: [33] [1050/4276] eta: 1 day, 7:30:21 lr: 1.0086328049883626e-05 loss: 0.0807 (0.0827) time: 40.0409 data: 36.2469 max mem: 33303 Epoch: [33] [1060/4276] eta: 1 day, 7:22:04 lr: 1.0083184858019864e-05 loss: 0.0862 (0.0829) time: 29.2661 data: 25.6973 max mem: 33303 Epoch: [33] [1070/4276] eta: 1 day, 7:12:46 lr: 1.0080041557284005e-05 loss: 0.0871 (0.0829) time: 29.2823 data: 25.6736 max mem: 33303 Epoch: [33] [1080/4276] eta: 1 day, 7:16:16 lr: 1.007689814763455e-05 loss: 
0.0871 (0.0829) time: 41.1108 data: 37.3127 max mem: 33303 Epoch: [33] [1090/4276] eta: 1 day, 7:07:07 lr: 1.0073754629029981e-05 loss: 0.0989 (0.0831) time: 41.2396 data: 37.4775 max mem: 33303 Epoch: [33] [1100/4276] eta: 1 day, 6:58:36 lr: 1.007061100142873e-05 loss: 0.0833 (0.0832) time: 29.0572 data: 25.4944 max mem: 33303 Epoch: [33] [1110/4276] eta: 1 day, 6:49:50 lr: 1.006746726478921e-05 loss: 0.0814 (0.0834) time: 29.3207 data: 25.7450 max mem: 33303 Epoch: [33] [1120/4276] eta: 1 day, 6:52:19 lr: 1.0064323419069814e-05 loss: 0.0908 (0.0835) time: 40.8890 data: 37.0445 max mem: 33303 Epoch: [33] [1130/4276] eta: 1 day, 6:43:28 lr: 1.0061179464228896e-05 loss: 0.0826 (0.0834) time: 40.7910 data: 36.9788 max mem: 33303 Epoch: [33] [1140/4276] eta: 1 day, 6:34:28 lr: 1.0058035400224763e-05 loss: 0.0812 (0.0835) time: 28.5466 data: 25.0362 max mem: 33303 Epoch: [33] [1150/4276] eta: 1 day, 6:25:47 lr: 1.0054891227015719e-05 loss: 0.0769 (0.0834) time: 28.5682 data: 25.0645 max mem: 33303 Epoch: [33] [1160/4276] eta: 1 day, 6:28:02 lr: 1.005174694456002e-05 loss: 0.0753 (0.0835) time: 40.9714 data: 37.2398 max mem: 33303 Epoch: [33] [1170/4276] eta: 1 day, 6:19:12 lr: 1.004860255281591e-05 loss: 0.0941 (0.0836) time: 40.8113 data: 37.0034 max mem: 33303 Epoch: [33] [1180/4276] eta: 1 day, 6:11:17 lr: 1.004545805174157e-05 loss: 0.0941 (0.0837) time: 29.4580 data: 25.8308 max mem: 33303 Epoch: [33] [1190/4276] eta: 1 day, 6:02:10 lr: 1.0042313441295181e-05 loss: 0.0746 (0.0836) time: 28.9880 data: 25.3665 max mem: 33303 Epoch: [33] [1200/4276] eta: 1 day, 6:03:31 lr: 1.003916872143488e-05 loss: 0.0711 (0.0837) time: 39.7171 data: 35.8822 max mem: 33303 Epoch: [33] [1210/4276] eta: 1 day, 5:55:14 lr: 1.0036023892118783e-05 loss: 0.0779 (0.0836) time: 40.6537 data: 36.8436 max mem: 33303 Epoch: [33] [1220/4276] eta: 1 day, 5:46:50 lr: 1.0032878953304955e-05 loss: 0.0779 (0.0836) time: 29.2349 data: 25.6408 max mem: 33303 Epoch: [33] [1230/4276] eta: 1 day, 
5:37:50 lr: 1.0029733904951453e-05 loss: 0.0838 (0.0838) time: 28.2358 data: 24.6583 max mem: 33303 Epoch: [33] [1240/4276] eta: 1 day, 5:39:17 lr: 1.002658874701629e-05 loss: 0.0786 (0.0837) time: 40.1785 data: 36.3916 max mem: 33303 Epoch: [33] [1250/4276] eta: 1 day, 5:30:44 lr: 1.0023443479457459e-05 loss: 0.0778 (0.0838) time: 40.6840 data: 36.9161 max mem: 33303 Epoch: [33] [1260/4276] eta: 1 day, 5:22:53 lr: 1.0020298102232905e-05 loss: 0.0799 (0.0837) time: 29.2845 data: 25.8184 max mem: 33303 Epoch: [33] [1270/4276] eta: 1 day, 5:14:24 lr: 1.0017152615300557e-05 loss: 0.0722 (0.0836) time: 29.2381 data: 25.7783 max mem: 33303 Epoch: [33] [1280/4276] eta: 1 day, 5:16:10 lr: 1.001400701861831e-05 loss: 0.0779 (0.0836) time: 41.4626 data: 37.7234 max mem: 33303 Epoch: [33] [1290/4276] eta: 1 day, 5:07:24 lr: 1.0010861312144032e-05 loss: 0.0851 (0.0837) time: 41.0938 data: 37.2968 max mem: 33303 Epoch: [33] [1300/4276] eta: 1 day, 4:59:32 lr: 1.0007715495835544e-05 loss: 0.0792 (0.0838) time: 28.7287 data: 25.1850 max mem: 33303 Epoch: [33] [1310/4276] eta: 1 day, 4:51:14 lr: 1.000456956965065e-05 loss: 0.0747 (0.0837) time: 29.1814 data: 25.6739 max mem: 33303 Epoch: [33] [1320/4276] eta: 1 day, 4:52:12 lr: 1.0001423533547131e-05 loss: 0.0755 (0.0837) time: 40.9114 data: 37.1840 max mem: 33303 Epoch: [33] [1330/4276] eta: 1 day, 4:43:57 lr: 9.998277387482709e-06 loss: 0.0893 (0.0837) time: 40.9817 data: 37.2793 max mem: 33303 Epoch: [33] [1340/4276] eta: 1 day, 4:36:13 lr: 9.995131131415102e-06 loss: 0.0820 (0.0837) time: 29.3140 data: 25.7872 max mem: 33303 Epoch: [33] [1350/4276] eta: 1 day, 4:27:55 lr: 9.991984765301986e-06 loss: 0.0806 (0.0837) time: 29.1195 data: 25.5754 max mem: 33303 Epoch: [33] [1360/4276] eta: 1 day, 4:28:38 lr: 9.988838289101012e-06 loss: 0.0823 (0.0837) time: 40.8233 data: 37.0456 max mem: 33303 Epoch: [33] [1370/4276] eta: 1 day, 4:20:41 lr: 9.985691702769784e-06 loss: 0.0812 (0.0836) time: 41.2973 data: 37.4752 max mem: 33303 
Epoch: [33] [1380/4276] eta: 1 day, 4:12:44 lr: 9.98254500626589e-06 loss: 0.0789 (0.0836) time: 29.1829 data: 25.6203 max mem: 33303 Epoch: [33] [1390/4276] eta: 1 day, 4:04:19 lr: 9.979398199546884e-06 loss: 0.0787 (0.0837) time: 28.3747 data: 24.8769 max mem: 33303 Epoch: [33] [1400/4276] eta: 1 day, 4:04:49 lr: 9.976251282570292e-06 loss: 0.0775 (0.0837) time: 40.5888 data: 36.8734 max mem: 33303 Epoch: [33] [1410/4276] eta: 1 day, 3:56:43 lr: 9.973104255293594e-06 loss: 0.0800 (0.0836) time: 41.0368 data: 37.3161 max mem: 33303 Epoch: [33] [1420/4276] eta: 1 day, 3:48:59 lr: 9.969957117674252e-06 loss: 0.0738 (0.0836) time: 29.0033 data: 25.5452 max mem: 33303 Epoch: [33] [1430/4276] eta: 1 day, 3:40:59 lr: 9.966809869669695e-06 loss: 0.0711 (0.0836) time: 29.0069 data: 25.4997 max mem: 33303 Epoch: [33] [1440/4276] eta: 1 day, 3:40:56 lr: 9.963662511237325e-06 loss: 0.0718 (0.0836) time: 40.6057 data: 36.8158 max mem: 33303 Epoch: [33] [1450/4276] eta: 1 day, 3:33:04 lr: 9.960515042334493e-06 loss: 0.0733 (0.0835) time: 40.8175 data: 37.0769 max mem: 33303 Epoch: [33] [1460/4276] eta: 1 day, 3:25:27 lr: 9.95736746291854e-06 loss: 0.0777 (0.0835) time: 29.2721 data: 25.7493 max mem: 33303 Epoch: [33] [1470/4276] eta: 1 day, 3:17:19 lr: 9.954219772946766e-06 loss: 0.0777 (0.0835) time: 28.7270 data: 25.1636 max mem: 33303 Epoch: [33] [1480/4276] eta: 1 day, 3:17:22 lr: 9.951071972376448e-06 loss: 0.0879 (0.0836) time: 40.7648 data: 37.0078 max mem: 33303 Epoch: [33] [1490/4276] eta: 1 day, 3:09:38 lr: 9.947924061164812e-06 loss: 0.0773 (0.0835) time: 41.3954 data: 37.6806 max mem: 33303 Epoch: [33] [1500/4276] eta: 1 day, 3:01:40 lr: 9.944776039269072e-06 loss: 0.0674 (0.0834) time: 28.6891 data: 25.1543 max mem: 33303 Epoch: [33] [1510/4276] eta: 1 day, 2:53:45 lr: 9.941627906646401e-06 loss: 0.0694 (0.0834) time: 28.2559 data: 24.7070 max mem: 33303 Epoch: [33] [1520/4276] eta: 1 day, 2:53:31 lr: 9.938479663253953e-06 loss: 0.0737 (0.0833) time: 40.8807 data: 
37.0960 max mem: 33303 Epoch: [33] [1530/4276] eta: 1 day, 2:45:48 lr: 9.935331309048823e-06 loss: 0.0736 (0.0833) time: 41.2308 data: 37.3762 max mem: 33303 Epoch: [33] [1540/4276] eta: 1 day, 2:38:20 lr: 9.932182843988101e-06 loss: 0.0760 (0.0832) time: 29.2675 data: 25.7331 max mem: 33303 Epoch: [33] [1550/4276] eta: 1 day, 2:29:52 lr: 9.929034268028833e-06 loss: 0.0818 (0.0832) time: 27.8531 data: 24.3277 max mem: 33303 Epoch: [33] [1560/4276] eta: 1 day, 2:29:09 lr: 9.925885581128046e-06 loss: 0.0739 (0.0832) time: 39.3502 data: 35.5437 max mem: 33303 Epoch: [33] [1570/4276] eta: 1 day, 2:21:37 lr: 9.922736783242709e-06 loss: 0.0698 (0.0831) time: 40.9182 data: 37.1902 max mem: 33303 Epoch: [33] [1580/4276] eta: 1 day, 2:14:10 lr: 9.919587874329782e-06 loss: 0.0677 (0.0831) time: 29.3324 data: 25.7279 max mem: 33303 Epoch: [33] [1590/4276] eta: 1 day, 2:06:21 lr: 9.916438854346187e-06 loss: 0.0774 (0.0831) time: 28.7246 data: 25.0346 max mem: 33303 Epoch: [33] [1600/4276] eta: 1 day, 2:05:21 lr: 9.913289723248818e-06 loss: 0.0865 (0.0831) time: 40.1855 data: 36.3050 max mem: 33303 Epoch: [33] [1610/4276] eta: 1 day, 1:57:38 lr: 9.910140480994523e-06 loss: 0.0863 (0.0831) time: 40.3340 data: 36.4384 max mem: 33303 Epoch: [33] [1620/4276] eta: 1 day, 1:49:58 lr: 9.906991127540132e-06 loss: 0.0757 (0.0831) time: 28.3354 data: 24.7688 max mem: 33303 Epoch: [33] [1630/4276] eta: 1 day, 1:42:36 lr: 9.903841662842443e-06 loss: 0.0839 (0.0832) time: 28.8630 data: 25.2967 max mem: 33303 Epoch: [33] [1640/4276] eta: 1 day, 1:41:52 lr: 9.900692086858209e-06 loss: 0.0814 (0.0832) time: 41.6892 data: 37.7055 max mem: 33303 Epoch: [33] [1650/4276] eta: 1 day, 1:33:47 lr: 9.89754239954416e-06 loss: 0.0803 (0.0831) time: 40.3447 data: 36.4175 max mem: 33303 Epoch: [33] [1660/4276] eta: 1 day, 1:26:12 lr: 9.894392600856997e-06 loss: 0.0780 (0.0831) time: 27.5450 data: 24.0360 max mem: 33303 Epoch: [33] [1670/4276] eta: 1 day, 1:18:30 lr: 9.891242690753389e-06 loss: 0.0795 
(0.0832) time: 28.1250 data: 24.6385 max mem: 33303 Epoch: [33] [1680/4276] eta: 1 day, 1:17:19 lr: 9.88809266918996e-06 loss: 0.0795 (0.0832) time: 40.4056 data: 36.6610 max mem: 33303 Epoch: [33] [1690/4276] eta: 1 day, 1:09:46 lr: 9.884942536123311e-06 loss: 0.0749 (0.0832) time: 40.6657 data: 36.9336 max mem: 33303 Epoch: [33] [1700/4276] eta: 1 day, 1:02:29 lr: 9.881792291510015e-06 loss: 0.0810 (0.0832) time: 28.8510 data: 25.3447 max mem: 33303 Epoch: [33] [1710/4276] eta: 1 day, 0:55:04 lr: 9.87864193530661e-06 loss: 0.0818 (0.0832) time: 28.9804 data: 25.3416 max mem: 33303 Epoch: [33] [1720/4276] eta: 1 day, 0:53:58 lr: 9.875491467469592e-06 loss: 0.0738 (0.0831) time: 41.3442 data: 37.4781 max mem: 33303 Epoch: [33] [1730/4276] eta: 1 day, 0:46:45 lr: 9.872340887955435e-06 loss: 0.0735 (0.0831) time: 41.7867 data: 38.0621 max mem: 33303 Epoch: [33] [1740/4276] eta: 1 day, 0:39:10 lr: 9.86919019672058e-06 loss: 0.0788 (0.0831) time: 28.6931 data: 25.2300 max mem: 33303 Epoch: [33] [1750/4276] eta: 1 day, 0:32:18 lr: 9.866039393721436e-06 loss: 0.0748 (0.0830) time: 29.2675 data: 25.7168 max mem: 33303 Epoch: [33] [1760/4276] eta: 1 day, 0:30:50 lr: 9.862888478914365e-06 loss: 0.0692 (0.0830) time: 41.9636 data: 38.1608 max mem: 33303 Epoch: [33] [1770/4276] eta: 1 day, 0:23:18 lr: 9.85973745225572e-06 loss: 0.0669 (0.0829) time: 40.5917 data: 36.8644 max mem: 33303 Epoch: [33] [1780/4276] eta: 1 day, 0:16:13 lr: 9.856586313701804e-06 loss: 0.0669 (0.0829) time: 28.8474 data: 25.3709 max mem: 33303 Epoch: [33] [1790/4276] eta: 1 day, 0:08:51 lr: 9.8534350632089e-06 loss: 0.0732 (0.0828) time: 29.0665 data: 25.4910 max mem: 33303 Epoch: [33] [1800/4276] eta: 1 day, 0:07:20 lr: 9.850283700733243e-06 loss: 0.0784 (0.0829) time: 41.0480 data: 37.1492 max mem: 33303 Epoch: [33] [1810/4276] eta: 23:59:39 lr: 9.847132226231045e-06 loss: 0.0761 (0.0829) time: 40.3809 data: 36.5830 max mem: 33303 Epoch: [33] [1820/4276] eta: 23:52:29 lr: 9.843980639658488e-06 loss: 
0.0761 (0.0828) time: 28.0626 data: 24.5742 max mem: 33303 Epoch: [33] [1830/4276] eta: 23:45:00 lr: 9.840828940971722e-06 loss: 0.0740 (0.0828) time: 28.3424 data: 24.8668 max mem: 33303 Epoch: [33] [1840/4276] eta: 23:43:24 lr: 9.837677130126845e-06 loss: 0.0665 (0.0827) time: 40.8687 data: 37.1587 max mem: 33303 Epoch: [33] [1850/4276] eta: 23:36:14 lr: 9.834525207079949e-06 loss: 0.0688 (0.0828) time: 41.5528 data: 37.8424 max mem: 33303 Epoch: [33] [1860/4276] eta: 23:28:49 lr: 9.831373171787075e-06 loss: 0.0875 (0.0828) time: 28.3633 data: 24.9198 max mem: 33303 Epoch: [33] [1870/4276] eta: 23:21:47 lr: 9.828221024204249e-06 loss: 0.0850 (0.0827) time: 28.5747 data: 25.0379 max mem: 33303 Epoch: [33] [1880/4276] eta: 23:20:05 lr: 9.825068764287433e-06 loss: 0.0689 (0.0827) time: 41.8545 data: 37.9851 max mem: 33303 Epoch: [33] [1890/4276] eta: 23:12:51 lr: 9.821916391992586e-06 loss: 0.0685 (0.0826) time: 41.4144 data: 37.6133 max mem: 33303 Epoch: [33] [1900/4276] eta: 23:05:35 lr: 9.818763907275623e-06 loss: 0.0775 (0.0826) time: 28.3194 data: 24.8574 max mem: 33303 Epoch: [33] [1910/4276] eta: 22:58:53 lr: 9.815611310092431e-06 loss: 0.0741 (0.0826) time: 29.4629 data: 25.9382 max mem: 33303 Epoch: [33] [1920/4276] eta: 22:56:45 lr: 9.812458600398849e-06 loss: 0.0621 (0.0826) time: 41.9356 data: 38.1313 max mem: 33303 Epoch: [33] [1930/4276] eta: 22:49:30 lr: 9.809305778150695e-06 loss: 0.0715 (0.0825) time: 40.5839 data: 36.8554 max mem: 33303 Epoch: [33] [1940/4276] eta: 22:42:27 lr: 9.806152843303762e-06 loss: 0.0715 (0.0825) time: 28.5487 data: 25.1072 max mem: 33303 Epoch: [33] [1950/4276] eta: 22:35:19 lr: 9.802999795813786e-06 loss: 0.0841 (0.0826) time: 28.6973 data: 25.1727 max mem: 33303 Epoch: [33] [1960/4276] eta: 22:33:11 lr: 9.799846635636492e-06 loss: 0.0752 (0.0826) time: 41.0572 data: 37.2420 max mem: 33303 Epoch: [33] [1970/4276] eta: 22:25:56 lr: 9.796693362727558e-06 loss: 0.0751 (0.0826) time: 40.7935 data: 37.0313 max mem: 33303 
Epoch: [33] [1980/4276] eta: 22:18:56 lr: 9.793539977042644e-06 loss: 0.0712 (0.0825) time: 28.4143 data: 24.8585 max mem: 33303 Epoch: [33] [1990/4276] eta: 22:11:54 lr: 9.790386478537356e-06 loss: 0.0714 (0.0825) time: 28.8709 data: 25.2816 max mem: 33303 Epoch: [33] [2000/4276] eta: 22:09:31 lr: 9.787232867167279e-06 loss: 0.0826 (0.0825) time: 40.9489 data: 37.1502 max mem: 33303 Epoch: [33] [2010/4276] eta: 22:02:26 lr: 9.784079142887966e-06 loss: 0.0835 (0.0826) time: 40.7738 data: 37.0474 max mem: 33303 Epoch: [33] [2020/4276] eta: 21:55:09 lr: 9.780925305654937e-06 loss: 0.0837 (0.0826) time: 27.8306 data: 24.3822 max mem: 33303 Epoch: [33] [2030/4276] eta: 21:47:57 lr: 9.777771355423667e-06 loss: 0.0692 (0.0825) time: 27.3966 data: 23.9473 max mem: 33303 Epoch: [33] [2040/4276] eta: 21:45:44 lr: 9.774617292149608e-06 loss: 0.0656 (0.0825) time: 41.0907 data: 37.3698 max mem: 33303 Epoch: [33] [2050/4276] eta: 21:38:38 lr: 9.771463115788177e-06 loss: 0.0798 (0.0825) time: 41.3644 data: 37.6350 max mem: 33303 Epoch: [33] [2060/4276] eta: 21:31:50 lr: 9.768308826294765e-06 loss: 0.0771 (0.0825) time: 28.8793 data: 25.4375 max mem: 33303 Epoch: [33] [2070/4276] eta: 21:24:37 lr: 9.765154423624703e-06 loss: 0.0722 (0.0825) time: 28.3985 data: 24.8720 max mem: 33303 Epoch: [33] [2080/4276] eta: 21:22:07 lr: 9.761999907733317e-06 loss: 0.0783 (0.0825) time: 40.4685 data: 36.5827 max mem: 33303 Epoch: [33] [2090/4276] eta: 21:15:21 lr: 9.758845278575886e-06 loss: 0.0858 (0.0825) time: 41.7387 data: 37.9048 max mem: 33303 Epoch: [33] [2100/4276] eta: 21:08:38 lr: 9.755690536107666e-06 loss: 0.0765 (0.0825) time: 29.8007 data: 26.2286 max mem: 33303 Epoch: [33] [2110/4276] eta: 21:01:59 lr: 9.752535680283855e-06 loss: 0.0752 (0.0824) time: 30.0774 data: 26.5343 max mem: 33303 Epoch: [33] [2120/4276] eta: 20:59:15 lr: 9.749380711059645e-06 loss: 0.0651 (0.0823) time: 41.6839 data: 37.8953 max mem: 33303 Epoch: [33] [2130/4276] eta: 20:52:12 lr: 9.746225628390176e-06 
loss: 0.0651 (0.0823) time: 40.5131 data: 36.6446 max mem: 33303 Epoch: [33] [2140/4276] eta: 20:45:25 lr: 9.74307043223057e-06 loss: 0.0752 (0.0823) time: 28.5908 data: 24.9932 max mem: 33303 Epoch: [33] [2150/4276] eta: 20:38:59 lr: 9.739915122535893e-06 loss: 0.0716 (0.0822) time: 30.3307 data: 26.8131 max mem: 33303 Epoch: [33] [2160/4276] eta: 20:36:24 lr: 9.736759699261196e-06 loss: 0.0716 (0.0822) time: 43.1076 data: 39.3701 max mem: 33303 Epoch: [33] [2170/4276] eta: 20:29:15 lr: 9.733604162361488e-06 loss: 0.0765 (0.0823) time: 40.8965 data: 37.1594 max mem: 33303 Epoch: [33] [2180/4276] eta: 20:22:25 lr: 9.730448511791753e-06 loss: 0.0804 (0.0823) time: 27.8934 data: 24.4458 max mem: 33303 Epoch: [33] [2190/4276] eta: 20:15:30 lr: 9.72729274750692e-06 loss: 0.0827 (0.0823) time: 28.4988 data: 25.0626 max mem: 33303 Epoch: [33] [2200/4276] eta: 20:12:35 lr: 9.724136869461904e-06 loss: 0.0840 (0.0823) time: 40.8213 data: 37.0894 max mem: 33303 Epoch: [33] [2210/4276] eta: 20:05:52 lr: 9.72098087761158e-06 loss: 0.0782 (0.0823) time: 41.4262 data: 37.6731 max mem: 33303 Epoch: [33] [2220/4276] eta: 19:58:59 lr: 9.717824771910792e-06 loss: 0.0771 (0.0823) time: 28.8212 data: 25.3613 max mem: 33303 Epoch: [33] [2230/4276] eta: 19:52:01 lr: 9.714668552314335e-06 loss: 0.0771 (0.0823) time: 27.8778 data: 24.4367 max mem: 33303 Epoch: [33] [2240/4276] eta: 19:49:04 lr: 9.711512218776987e-06 loss: 0.0717 (0.0823) time: 40.7674 data: 37.0409 max mem: 33303 Epoch: [33] [2250/4276] eta: 19:42:17 lr: 9.70835577125349e-06 loss: 0.0717 (0.0823) time: 41.4015 data: 37.6763 max mem: 33303 Epoch: [33] [2260/4276] eta: 19:35:23 lr: 9.705199209698536e-06 loss: 0.0794 (0.0823) time: 28.2533 data: 24.8123 max mem: 33303 Epoch: [33] [2270/4276] eta: 19:28:35 lr: 9.702042534066799e-06 loss: 0.0739 (0.0822) time: 28.1086 data: 24.5591 max mem: 33303 Epoch: [33] [2280/4276] eta: 19:25:22 lr: 9.698885744312913e-06 loss: 0.0749 (0.0822) time: 40.6598 data: 36.7104 max mem: 33303 
Epoch: [33] [2290/4276] eta: 19:18:36 lr: 9.695728840391483e-06 loss: 0.0793 (0.0822) time: 40.6937 data: 36.8588 max mem: 33303 Epoch: [33] [2300/4276] eta: 19:12:00 lr: 9.692571822257062e-06 loss: 0.0746 (0.0822) time: 29.0922 data: 25.5883 max mem: 33303 Epoch: [33] [2310/4276] eta: 19:05:12 lr: 9.689414689864189e-06 loss: 0.0840 (0.0822) time: 28.9232 data: 25.3219 max mem: 33303 Epoch: [33] [2320/4276] eta: 19:01:45 lr: 9.686257443167357e-06 loss: 0.0895 (0.0822) time: 40.0271 data: 36.1600 max mem: 33303 Epoch: [33] [2330/4276] eta: 18:55:02 lr: 9.683100082121036e-06 loss: 0.0832 (0.0823) time: 40.2770 data: 36.4997 max mem: 33303 Epoch: [33] [2340/4276] eta: 18:48:18 lr: 9.679942606679637e-06 loss: 0.0809 (0.0823) time: 28.5508 data: 25.1096 max mem: 33303 Epoch: [33] [2350/4276] eta: 18:41:48 lr: 9.676785016797562e-06 loss: 0.0755 (0.0822) time: 29.2282 data: 25.6826 max mem: 33303 Epoch: [33] [2360/4276] eta: 18:38:26 lr: 9.673627312429167e-06 loss: 0.0727 (0.0822) time: 41.6084 data: 37.6953 max mem: 33303 Epoch: [33] [2370/4276] eta: 18:31:43 lr: 9.67046949352878e-06 loss: 0.0822 (0.0822) time: 40.8028 data: 36.9813 max mem: 33303 Epoch: [33] [2380/4276] eta: 18:25:01 lr: 9.667311560050678e-06 loss: 0.0757 (0.0822) time: 28.4602 data: 25.0059 max mem: 33303 Epoch: [33] [2390/4276] eta: 18:18:16 lr: 9.664153511949117e-06 loss: 0.0749 (0.0822) time: 28.2445 data: 24.6300 max mem: 33303 Epoch: [33] [2400/4276] eta: 18:14:53 lr: 9.660995349178319e-06 loss: 0.0809 (0.0822) time: 40.7858 data: 36.7961 max mem: 33303 Epoch: [33] [2410/4276] eta: 18:08:12 lr: 9.65783707169247e-06 loss: 0.0853 (0.0823) time: 41.0197 data: 37.2080 max mem: 33303 Epoch: [33] [2420/4276] eta: 18:01:29 lr: 9.654678679445709e-06 loss: 0.0814 (0.0823) time: 28.2956 data: 24.8603 max mem: 33303 Epoch: [33] [2430/4276] eta: 17:54:48 lr: 9.651520172392152e-06 loss: 0.0788 (0.0823) time: 28.1817 data: 24.7231 max mem: 33303 Epoch: [33] [2440/4276] eta: 17:51:17 lr: 9.64836155048588e-06 
loss: 0.0760 (0.0822) time: 40.7351 data: 36.9724 max mem: 33303 Epoch: [33] [2450/4276] eta: 17:44:38 lr: 9.645202813680938e-06 loss: 0.0735 (0.0822) time: 40.8539 data: 37.1100 max mem: 33303 Epoch: [33] [2460/4276] eta: 17:38:01 lr: 9.642043961931326e-06 loss: 0.0756 (0.0822) time: 28.5436 data: 25.1057 max mem: 33303 Epoch: [33] [2470/4276] eta: 17:31:10 lr: 9.638884995191024e-06 loss: 0.0776 (0.0822) time: 27.5942 data: 24.1461 max mem: 33303 Epoch: [33] [2480/4276] eta: 17:27:25 lr: 9.635725913413964e-06 loss: 0.0878 (0.0823) time: 39.3304 data: 35.5965 max mem: 33303 Epoch: [33] [2490/4276] eta: 17:20:54 lr: 9.632566716554058e-06 loss: 0.0795 (0.0823) time: 40.6583 data: 36.9293 max mem: 33303 Epoch: [33] [2500/4276] eta: 17:14:20 lr: 9.629407404565159e-06 loss: 0.0795 (0.0822) time: 28.9744 data: 25.5183 max mem: 33303 Epoch: [33] [2510/4276] eta: 17:07:53 lr: 9.626247977401107e-06 loss: 0.0880 (0.0823) time: 29.1814 data: 25.6705 max mem: 33303 Epoch: [33] [2520/4276] eta: 17:04:19 lr: 9.623088435015698e-06 loss: 0.0751 (0.0822) time: 41.9550 data: 38.1494 max mem: 33303 Epoch: [33] [2530/4276] eta: 16:57:39 lr: 9.619928777362697e-06 loss: 0.0696 (0.0822) time: 41.0160 data: 37.2651 max mem: 33303 Epoch: [33] [2540/4276] eta: 16:51:01 lr: 9.61676900439582e-06 loss: 0.0716 (0.0822) time: 27.8520 data: 24.3304 max mem: 33303 Epoch: [33] [2550/4276] eta: 16:44:30 lr: 9.61360911606876e-06 loss: 0.0698 (0.0821) time: 28.3548 data: 24.8242 max mem: 33303 Epoch: [33] [2560/4276] eta: 16:40:48 lr: 9.610449112335183e-06 loss: 0.0635 (0.0821) time: 41.3416 data: 37.5753 max mem: 33303 Epoch: [33] [2570/4276] eta: 16:34:09 lr: 9.60728899314869e-06 loss: 0.0716 (0.0821) time: 40.7780 data: 36.9567 max mem: 33303 Epoch: [33] [2580/4276] eta: 16:27:44 lr: 9.604128758462873e-06 loss: 0.0758 (0.0821) time: 28.6233 data: 25.0122 max mem: 33303 Epoch: [33] [2590/4276] eta: 16:21:05 lr: 9.60096840823128e-06 loss: 0.0701 (0.0820) time: 28.4131 data: 24.7985 max mem: 33303 
Epoch: [33] [2600/4276] eta: 16:17:13 lr: 9.597807942407428e-06 loss: 0.0669 (0.0820) time: 40.1962 data: 36.3283 max mem: 33303 Epoch: [33] [2610/4276] eta: 16:10:46 lr: 9.594647360944783e-06 loss: 0.0703 (0.0820) time: 41.1617 data: 37.2952 max mem: 33303 Epoch: [33] [2620/4276] eta: 16:04:16 lr: 9.59148666379679e-06 loss: 0.0785 (0.0820) time: 28.8212 data: 25.2916 max mem: 33303 Epoch: [33] [2630/4276] eta: 15:57:52 lr: 9.588325850916854e-06 loss: 0.0752 (0.0819) time: 28.9930 data: 25.5386 max mem: 33303 Epoch: [33] [2640/4276] eta: 15:53:55 lr: 9.585164922258349e-06 loss: 0.0703 (0.0819) time: 41.2468 data: 37.5132 max mem: 33303 Epoch: [33] [2650/4276] eta: 15:47:24 lr: 9.5820038777746e-06 loss: 0.0736 (0.0819) time: 40.5933 data: 36.7682 max mem: 33303 Epoch: [33] [2660/4276] eta: 15:41:01 lr: 9.578842717418906e-06 loss: 0.0764 (0.0819) time: 28.8330 data: 25.2159 max mem: 33303 Epoch: [33] [2670/4276] eta: 15:34:25 lr: 9.575681441144529e-06 loss: 0.0771 (0.0819) time: 28.3622 data: 24.7500 max mem: 33303 Epoch: [33] [2680/4276] eta: 15:30:28 lr: 9.572520048904699e-06 loss: 0.0792 (0.0819) time: 40.4622 data: 36.6453 max mem: 33303 Epoch: [33] [2690/4276] eta: 15:23:59 lr: 9.569358540652598e-06 loss: 0.0742 (0.0818) time: 41.0045 data: 37.2765 max mem: 33303 Epoch: [33] [2700/4276] eta: 15:17:31 lr: 9.566196916341382e-06 loss: 0.0677 (0.0818) time: 28.4059 data: 24.9516 max mem: 33303 Epoch: [33] [2710/4276] eta: 15:11:04 lr: 9.563035175924165e-06 loss: 0.0698 (0.0818) time: 28.4043 data: 24.9468 max mem: 33303 Epoch: [33] [2720/4276] eta: 15:07:01 lr: 9.55987331935404e-06 loss: 0.0713 (0.0817) time: 40.9243 data: 37.2053 max mem: 33303 Epoch: [33] [2730/4276] eta: 15:00:37 lr: 9.556711346584033e-06 loss: 0.0713 (0.0817) time: 41.1620 data: 37.4361 max mem: 33303 Epoch: [33] [2740/4276] eta: 14:54:12 lr: 9.553549257567161e-06 loss: 0.0752 (0.0817) time: 28.7515 data: 25.2960 max mem: 33303 Epoch: [33] [2750/4276] eta: 14:47:53 lr: 9.550387052256399e-06 
loss: 0.0791 (0.0817) time: 29.1798 data: 25.7063 max mem: 33303 Epoch: [33] [2760/4276] eta: 14:43:49 lr: 9.547224730604682e-06 loss: 0.0793 (0.0817) time: 41.7875 data: 38.0465 max mem: 33303 Epoch: [33] [2770/4276] eta: 14:37:23 lr: 9.544062292564906e-06 loss: 0.0682 (0.0816) time: 41.1094 data: 37.3906 max mem: 33303 Epoch: [33] [2780/4276] eta: 14:31:04 lr: 9.54089973808993e-06 loss: 0.0692 (0.0816) time: 28.8922 data: 25.4027 max mem: 33303 Epoch: [33] [2790/4276] eta: 14:24:36 lr: 9.537737067132589e-06 loss: 0.0873 (0.0817) time: 28.5961 data: 25.1078 max mem: 33303 Epoch: [33] [2800/4276] eta: 14:20:26 lr: 9.534574279645673e-06 loss: 0.0692 (0.0816) time: 40.7078 data: 36.9052 max mem: 33303 Epoch: [33] [2810/4276] eta: 14:14:09 lr: 9.531411375581927e-06 loss: 0.0602 (0.0815) time: 41.7634 data: 37.8792 max mem: 33303 Epoch: [33] [2820/4276] eta: 14:07:44 lr: 9.528248354894073e-06 loss: 0.0602 (0.0815) time: 29.0015 data: 25.4518 max mem: 33303 Epoch: [33] [2830/4276] eta: 14:01:23 lr: 9.525085217534788e-06 loss: 0.0737 (0.0815) time: 28.3706 data: 24.8785 max mem: 33303 Epoch: [33] [2840/4276] eta: 13:57:03 lr: 9.521921963456727e-06 loss: 0.0894 (0.0816) time: 40.6381 data: 36.8785 max mem: 33303 Epoch: [33] [2850/4276] eta: 13:50:41 lr: 9.51875859261248e-06 loss: 0.0767 (0.0816) time: 40.5902 data: 36.8324 max mem: 33303 Epoch: [33] [2860/4276] eta: 13:44:23 lr: 9.515595104954625e-06 loss: 0.0730 (0.0816) time: 28.8589 data: 25.3695 max mem: 33303 Epoch: [33] [2870/4276] eta: 13:37:58 lr: 9.5124315004357e-06 loss: 0.0713 (0.0816) time: 28.4452 data: 24.9712 max mem: 33303 Epoch: [33] [2880/4276] eta: 13:33:38 lr: 9.509267779008192e-06 loss: 0.0731 (0.0816) time: 40.3870 data: 36.6462 max mem: 33303 Epoch: [33] [2890/4276] eta: 13:27:10 lr: 9.506103940624564e-06 loss: 0.0731 (0.0815) time: 40.0962 data: 36.2480 max mem: 33303 Epoch: [33] [2900/4276] eta: 13:20:49 lr: 9.502939985237238e-06 loss: 0.0649 (0.0815) time: 27.7224 data: 24.1579 max mem: 33303 
Epoch: [33] [2910/4276] eta: 13:14:26 lr: 9.499775912798608e-06 loss: 0.0778 (0.0815) time: 27.9605 data: 24.5116 max mem: 33303 Epoch: [33] [2920/4276] eta: 13:09:58 lr: 9.49661172326101e-06 loss: 0.0778 (0.0815) time: 39.9430 data: 36.2191 max mem: 33303 Epoch: [33] [2930/4276] eta: 13:03:35 lr: 9.493447416576761e-06 loss: 0.0755 (0.0816) time: 40.0664 data: 36.3234 max mem: 33303 Epoch: [33] [2940/4276] eta: 12:57:21 lr: 9.490282992698135e-06 loss: 0.0853 (0.0815) time: 28.6449 data: 25.1841 max mem: 33303 Epoch: [33] [2950/4276] eta: 12:51:02 lr: 9.487118451577378e-06 loss: 0.0845 (0.0816) time: 28.9038 data: 25.3952 max mem: 33303 Epoch: [33] [2960/4276] eta: 12:46:35 lr: 9.483953793166674e-06 loss: 0.0757 (0.0816) time: 40.8029 data: 37.0175 max mem: 33303 Epoch: [33] [2970/4276] eta: 12:40:20 lr: 9.480789017418195e-06 loss: 0.0757 (0.0816) time: 41.2152 data: 37.4814 max mem: 33303 Epoch: [33] [2980/4276] eta: 12:34:02 lr: 9.477624124284065e-06 loss: 0.0783 (0.0815) time: 28.7381 data: 25.2775 max mem: 33303 Epoch: [33] [2990/4276] eta: 12:27:52 lr: 9.474459113716379e-06 loss: 0.0700 (0.0815) time: 29.1475 data: 25.7006 max mem: 33303 Epoch: [33] [3000/4276] eta: 12:23:13 lr: 9.471293985667176e-06 loss: 0.0700 (0.0815) time: 40.5944 data: 36.7738 max mem: 33303 Epoch: [33] [3010/4276] eta: 12:16:54 lr: 9.468128740088477e-06 loss: 0.0764 (0.0815) time: 39.6322 data: 35.8204 max mem: 33303 Epoch: [33] [3020/4276] eta: 12:10:40 lr: 9.464963376932257e-06 loss: 0.0684 (0.0814) time: 28.4735 data: 25.0499 max mem: 33303 Epoch: [33] [3030/4276] eta: 12:04:27 lr: 9.46179789615046e-06 loss: 0.0680 (0.0814) time: 29.0274 data: 25.5004 max mem: 33303 Epoch: [33] [3040/4276] eta: 11:59:57 lr: 9.458632297694975e-06 loss: 0.0768 (0.0814) time: 41.6992 data: 37.8094 max mem: 33303 Epoch: [33] [3050/4276] eta: 11:53:40 lr: 9.455466581517673e-06 loss: 0.0777 (0.0814) time: 41.1735 data: 37.3022 max mem: 33303 Epoch: [33] [3060/4276] eta: 11:47:26 lr: 9.45230074757038e-06 
loss: 0.0664 (0.0814) time: 28.3634 data: 24.8528 max mem: 33303 Epoch: [33] [3070/4276] eta: 11:41:11 lr: 9.44913479580489e-06 loss: 0.0748 (0.0814) time: 28.5017 data: 25.0628 max mem: 33303 Epoch: [33] [3080/4276] eta: 11:36:33 lr: 9.445968726172941e-06 loss: 0.0817 (0.0814) time: 40.7742 data: 37.0188 max mem: 33303 Epoch: [33] [3090/4276] eta: 11:30:19 lr: 9.442802538626255e-06 loss: 0.0663 (0.0813) time: 40.8680 data: 37.1099 max mem: 33303 Epoch: [33] [3100/4276] eta: 11:24:04 lr: 9.439636233116503e-06 loss: 0.0663 (0.0813) time: 28.3990 data: 24.8374 max mem: 33303 Epoch: [33] [3110/4276] eta: 11:17:51 lr: 9.43646980959533e-06 loss: 0.0657 (0.0812) time: 28.3521 data: 24.7652 max mem: 33303 Epoch: [33] [3120/4276] eta: 11:13:13 lr: 9.433303268014325e-06 loss: 0.0671 (0.0812) time: 41.1857 data: 37.3333 max mem: 33303 Epoch: [33] [3130/4276] eta: 11:07:01 lr: 9.430136608325053e-06 loss: 0.0779 (0.0812) time: 41.4112 data: 37.5736 max mem: 33303 Epoch: [33] [3140/4276] eta: 11:00:49 lr: 9.426969830479038e-06 loss: 0.0728 (0.0812) time: 28.6443 data: 25.1097 max mem: 33303 Epoch: [33] [3150/4276] eta: 10:54:35 lr: 9.423802934427776e-06 loss: 0.0676 (0.0812) time: 28.1585 data: 24.6427 max mem: 33303 Epoch: [33] [3160/4276] eta: 10:49:49 lr: 9.420635920122695e-06 loss: 0.0704 (0.0812) time: 40.4126 data: 36.6899 max mem: 33303 Epoch: [33] [3170/4276] eta: 10:43:37 lr: 9.417468787515218e-06 loss: 0.0771 (0.0812) time: 40.6743 data: 36.8926 max mem: 33303 Epoch: [33] [3180/4276] eta: 10:37:30 lr: 9.414301536556719e-06 loss: 0.0771 (0.0812) time: 28.9804 data: 25.3948 max mem: 33303 Epoch: [33] [3190/4276] eta: 10:31:20 lr: 9.411134167198517e-06 loss: 0.0779 (0.0812) time: 29.2068 data: 25.5924 max mem: 33303 Epoch: [33] [3200/4276] eta: 10:26:31 lr: 9.407966679391916e-06 loss: 0.0771 (0.0812) time: 40.7527 data: 36.9105 max mem: 33303 Epoch: [33] [3210/4276] eta: 10:20:19 lr: 9.404799073088172e-06 loss: 0.0734 (0.0812) time: 40.3589 data: 36.5604 max mem: 33303 
Epoch: [33] [3220/4276] eta: 10:14:09 lr: 9.40163134823851e-06 loss: 0.0734 (0.0812) time: 28.3530 data: 24.7810 max mem: 33303 Epoch: [33] [3230/4276] eta: 10:07:58 lr: 9.398463504794096e-06 loss: 0.0766 (0.0812) time: 28.3813 data: 24.7902 max mem: 33303 Epoch: [33] [3240/4276] eta: 10:03:07 lr: 9.39529554270608e-06 loss: 0.0766 (0.0812) time: 40.4587 data: 36.5914 max mem: 33303 Epoch: [33] [3250/4276] eta: 9:57:01 lr: 9.392127461925564e-06 loss: 0.0769 (0.0812) time: 41.1694 data: 37.2989 max mem: 33303 Epoch: [33] [3260/4276] eta: 9:50:49 lr: 9.388959262403618e-06 loss: 0.0774 (0.0813) time: 28.6390 data: 25.0821 max mem: 33303 Epoch: [33] [3270/4276] eta: 9:44:45 lr: 9.38579094409126e-06 loss: 0.0774 (0.0812) time: 28.8106 data: 25.3007 max mem: 33303 Epoch: [33] [3280/4276] eta: 9:39:54 lr: 9.382622506939477e-06 loss: 0.0813 (0.0813) time: 41.9037 data: 38.1548 max mem: 33303 Epoch: [33] [3290/4276] eta: 9:33:45 lr: 9.379453950899224e-06 loss: 0.0907 (0.0813) time: 41.1449 data: 37.3787 max mem: 33303 Epoch: [33] [3300/4276] eta: 9:27:40 lr: 9.376285275921416e-06 loss: 0.0917 (0.0813) time: 29.0097 data: 25.4205 max mem: 33303 Epoch: [33] [3310/4276] eta: 9:21:28 lr: 9.37311648195691e-06 loss: 0.0851 (0.0813) time: 28.2546 data: 24.6524 max mem: 33303 Epoch: [33] [3320/4276] eta: 9:16:39 lr: 9.369947568956545e-06 loss: 0.0851 (0.0814) time: 41.3710 data: 37.5700 max mem: 33303 Epoch: [33] [3330/4276] eta: 9:10:30 lr: 9.36677853687112e-06 loss: 0.0905 (0.0814) time: 41.9032 data: 38.1164 max mem: 33303 Epoch: [33] [3340/4276] eta: 9:04:23 lr: 9.363609385651391e-06 loss: 0.0858 (0.0814) time: 28.2413 data: 24.6321 max mem: 33303 Epoch: [33] [3350/4276] eta: 8:58:16 lr: 9.360440115248066e-06 loss: 0.0833 (0.0814) time: 28.3923 data: 24.7897 max mem: 33303 Epoch: [33] [3360/4276] eta: 8:53:23 lr: 9.357270725611824e-06 loss: 0.0729 (0.0814) time: 41.9148 data: 38.1794 max mem: 33303 Epoch: [33] [3370/4276] eta: 8:47:15 lr: 9.354101216693307e-06 loss: 0.0752 
(0.0814) time: 41.7178 data: 37.9687 max mem: 33303 Epoch: [33] [3380/4276] eta: 8:41:09 lr: 9.35093158844312e-06 loss: 0.0758 (0.0814) time: 28.2699 data: 24.7285 max mem: 33303 Epoch: [33] [3390/4276] eta: 8:35:03 lr: 9.34776184081181e-06 loss: 0.0808 (0.0814) time: 28.4080 data: 24.8082 max mem: 33303 Epoch: [33] [3400/4276] eta: 8:30:02 lr: 9.344591973749905e-06 loss: 0.0808 (0.0814) time: 40.8313 data: 36.9107 max mem: 33303 Epoch: [33] [3410/4276] eta: 8:23:54 lr: 9.341421987207887e-06 loss: 0.0789 (0.0814) time: 40.5766 data: 36.6683 max mem: 33303 Epoch: [33] [3420/4276] eta: 8:17:49 lr: 9.338251881136206e-06 loss: 0.0747 (0.0814) time: 28.1413 data: 24.5744 max mem: 33303 Epoch: [33] [3430/4276] eta: 8:11:48 lr: 9.33508165548525e-06 loss: 0.0822 (0.0814) time: 29.3874 data: 25.7889 max mem: 33303 Epoch: [33] [3440/4276] eta: 8:06:46 lr: 9.331911310205391e-06 loss: 0.0774 (0.0814) time: 42.0589 data: 38.1173 max mem: 33303 Epoch: [33] [3450/4276] eta: 8:00:41 lr: 9.328740845246958e-06 loss: 0.0752 (0.0815) time: 41.2077 data: 37.3486 max mem: 33303 Epoch: [33] [3460/4276] eta: 7:54:36 lr: 9.325570260560237e-06 loss: 0.1053 (0.0815) time: 28.2422 data: 24.7392 max mem: 33303 Epoch: [33] [3470/4276] eta: 7:48:35 lr: 9.322399556095464e-06 loss: 0.0745 (0.0815) time: 28.8353 data: 25.2572 max mem: 33303 Epoch: [33] [3480/4276] eta: 7:43:26 lr: 9.319228731802851e-06 loss: 0.0706 (0.0815) time: 40.9569 data: 37.1204 max mem: 33303 Epoch: [33] [3490/4276] eta: 7:37:20 lr: 9.316057787632573e-06 loss: 0.0754 (0.0815) time: 40.0529 data: 36.2792 max mem: 33303 Epoch: [33] [3500/4276] eta: 7:31:18 lr: 9.312886723534743e-06 loss: 0.0724 (0.0815) time: 28.3202 data: 24.7276 max mem: 33303 Epoch: [33] [3510/4276] eta: 7:25:15 lr: 9.309715539459457e-06 loss: 0.0724 (0.0815) time: 28.7447 data: 25.1787 max mem: 33303 Epoch: [33] [3520/4276] eta: 7:20:04 lr: 9.306544235356763e-06 loss: 0.0753 (0.0815) time: 40.5045 data: 36.6540 max mem: 33303 Epoch: [33] [3530/4276] eta: 
7:14:04 lr: 9.303372811176672e-06 loss: 0.0753 (0.0815) time: 41.2198 data: 37.2778 max mem: 33303 Epoch: [33] [3540/4276] eta: 7:08:04 lr: 9.300201266869145e-06 loss: 0.0771 (0.0815) time: 29.7309 data: 26.1139 max mem: 33303 Epoch: [33] [3550/4276] eta: 7:02:02 lr: 9.297029602384115e-06 loss: 0.0800 (0.0815) time: 28.9795 data: 25.4675 max mem: 33303 Epoch: [33] [3560/4276] eta: 6:56:50 lr: 9.293857817671472e-06 loss: 0.0858 (0.0815) time: 40.9942 data: 37.2726 max mem: 33303 Epoch: [33] [3570/4276] eta: 6:50:50 lr: 9.290685912681068e-06 loss: 0.0858 (0.0815) time: 41.3522 data: 37.6269 max mem: 33303 Epoch: [33] [3580/4276] eta: 6:44:48 lr: 9.287513887362706e-06 loss: 0.0755 (0.0815) time: 28.7342 data: 25.2606 max mem: 33303 Epoch: [33] [3590/4276] eta: 6:38:51 lr: 9.284341741666156e-06 loss: 0.0720 (0.0815) time: 29.4663 data: 25.8927 max mem: 33303 Epoch: [33] [3600/4276] eta: 6:33:36 lr: 9.281169475541148e-06 loss: 0.0792 (0.0815) time: 41.8468 data: 37.7974 max mem: 33303 Epoch: [33] [3610/4276] eta: 6:27:37 lr: 9.277997088937379e-06 loss: 0.0792 (0.0815) time: 41.2808 data: 37.1079 max mem: 33303 Epoch: [33] [3620/4276] eta: 6:21:35 lr: 9.274824581804484e-06 loss: 0.0727 (0.0815) time: 28.8561 data: 25.1748 max mem: 33303 Epoch: [33] [3630/4276] eta: 6:15:33 lr: 9.271651954092078e-06 loss: 0.0741 (0.0815) time: 27.9469 data: 24.4830 max mem: 33303 Epoch: [33] [3640/4276] eta: 6:10:22 lr: 9.268479205749731e-06 loss: 0.0775 (0.0814) time: 42.1813 data: 38.4340 max mem: 33303 Epoch: [33] [3650/4276] eta: 6:04:22 lr: 9.265306336726975e-06 loss: 0.0692 (0.0814) time: 42.5731 data: 38.8451 max mem: 33303 Epoch: [33] [3660/4276] eta: 5:58:25 lr: 9.262133346973289e-06 loss: 0.0701 (0.0814) time: 29.4576 data: 25.9420 max mem: 33303 Epoch: [33] [3670/4276] eta: 5:52:25 lr: 9.258960236438124e-06 loss: 0.0777 (0.0814) time: 29.1807 data: 25.6260 max mem: 33303 Epoch: [33] [3680/4276] eta: 5:47:05 lr: 9.255787005070885e-06 loss: 0.0728 (0.0814) time: 40.4411 data: 
36.6139 max mem: 33303 Epoch: [33] [3690/4276] eta: 5:41:06 lr: 9.25261365282095e-06 loss: 0.0716 (0.0814) time: 40.8040 data: 36.9616 max mem: 33303 Epoch: [33] [3700/4276] eta: 5:35:06 lr: 9.249440179637627e-06 loss: 0.0878 (0.0814) time: 28.6378 data: 25.0902 max mem: 33303 Epoch: [33] [3710/4276] eta: 5:29:12 lr: 9.246266585470212e-06 loss: 0.0685 (0.0814) time: 29.9363 data: 26.3414 max mem: 33303 Epoch: [33] [3720/4276] eta: 5:23:47 lr: 9.243092870267948e-06 loss: 0.0657 (0.0814) time: 41.2448 data: 37.4262 max mem: 33303 Epoch: [33] [3730/4276] eta: 5:17:48 lr: 9.239919033980046e-06 loss: 0.0799 (0.0814) time: 39.8085 data: 35.9995 max mem: 33303 Epoch: [33] [3740/4276] eta: 5:11:49 lr: 9.236745076555657e-06 loss: 0.0782 (0.0813) time: 28.3190 data: 24.7876 max mem: 33303 Epoch: [33] [3750/4276] eta: 5:05:51 lr: 9.23357099794391e-06 loss: 0.0691 (0.0814) time: 28.2920 data: 24.8465 max mem: 33303 Epoch: [33] [3760/4276] eta: 5:00:28 lr: 9.230396798093894e-06 loss: 0.0684 (0.0813) time: 41.2179 data: 37.4802 max mem: 33303 Epoch: [33] [3770/4276] eta: 4:54:29 lr: 9.227222476954634e-06 loss: 0.0811 (0.0814) time: 40.5631 data: 36.8061 max mem: 33303 Epoch: [33] [3780/4276] eta: 4:48:30 lr: 9.224048034475142e-06 loss: 0.0772 (0.0814) time: 27.6168 data: 24.1356 max mem: 33303 Epoch: [33] [3790/4276] eta: 4:42:35 lr: 9.220873470604371e-06 loss: 0.0749 (0.0814) time: 28.9910 data: 25.4945 max mem: 33303 slurmstepd-node07: error: *** JOB 5753 ON node07 CANCELLED AT 2025-02-02T19:09:53 *** slurmstepd-node07: error: *** JOB 5753 STEPD TERMINATED ON node07 AT 2025-02-02T19:10:54 DUE TO JOB NOT ENDING WITH SIGNALS *** slurmstepd-node07: error: Container 3801075 in cgroup plugin has 2 processes, giving up after 63 sec