2024-02-12,02:49:55 | INFO | Running with a single process. Device cuda:0.
2024-02-12,02:49:55 | INFO | Loaded ViT-B-32 model config.
2024-02-12,02:49:58 | INFO | Loading pretrained ViT-B-32 weights (laion2b_s34b_b79k).
2024-02-12,02:49:58 | INFO | Model:
2024-02-12,02:49:58 | INFO | CLIP(
(visual): VisionTransformer(
(conv1): Conv2d(3, 768, kernel_size=(32, 32), stride=(32, 32), bias=False)
(patch_dropout): Identity()
(ln_pre): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
(transformer): Transformer(
(resblocks): ModuleList(
(0-11): 12 x ResidualAttentionBlock(
(ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
(attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
)
(ls_1): Identity()
(ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
(mlp): Sequential(
(c_fc): Linear(in_features=768, out_features=3072, bias=True)
(gelu): GELU(approximate='none')
(c_proj): Linear(in_features=3072, out_features=768, bias=True)
)
(ls_2): Identity()
)
)
)
(ln_post): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
)
(transformer): Transformer(
(resblocks): ModuleList(
(0-11): 12 x ResidualAttentionBlock(
(ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
)
(ls_1): Identity()
(ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(mlp): Sequential(
(c_fc): Linear(in_features=512, out_features=2048, bias=True)
(gelu): GELU(approximate='none')
(c_proj): Linear(in_features=2048, out_features=512, bias=True)
)
(ls_2): Identity()
)
)
)
(token_embedding): Embedding(49408, 512)
(ln_final): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
2024-02-12,02:49:58 | INFO | Params:
2024-02-12,02:49:58 | INFO | accum_freq: 1
2024-02-12,02:49:58 | INFO | aug_cfg: {}
2024-02-12,02:49:58 | INFO | batch_size: 256
2024-02-12,02:49:58 | INFO | beta1: 0.9
2024-02-12,02:49:58 | INFO | beta2: 0.98
2024-02-12,02:49:58 | INFO | checkpoint_path: ./logs/2024_02_12-02_49_55-model_ViT-B-32-lr_1e-05-b_256-j_8-p_amp_bf16/checkpoints
2024-02-12,02:49:58 | INFO | coca_caption_loss_weight: 2.0
2024-02-12,02:49:58 | INFO | coca_contrastive_loss_weight: 1.0
2024-02-12,02:49:58 | INFO | copy_codebase: False
2024-02-12,02:49:58 | INFO | csv_caption_key: captions
2024-02-12,02:49:58 | INFO | csv_img_key: images
2024-02-12,02:49:58 | INFO | csv_separator:
2024-02-12,02:49:58 | INFO | dataset_resampled: False
2024-02-12,02:49:58 | INFO | dataset_type: auto
2024-02-12,02:49:58 | INFO | ddp_static_graph: True
2024-02-12,02:49:58 | INFO | debug: False
2024-02-12,02:49:58 | INFO | delete_previous_checkpoint: False
2024-02-12,02:49:58 | INFO | device: cuda:0
2024-02-12,02:49:58 | INFO | dist_backend: nccl
2024-02-12,02:49:58 | INFO | dist_url: env://
2024-02-12,02:49:58 | INFO | distill: False
2024-02-12,02:49:58 | INFO | distill_model: None
2024-02-12,02:49:58 | INFO | distill_pretrained: None
2024-02-12,02:49:58 | INFO | distributed: False
2024-02-12,02:49:58 | INFO | epochs: 5
2024-02-12,02:49:58 | INFO | epochs_cooldown: None
2024-02-12,02:49:58 | INFO | eps: 1e-06
2024-02-12,02:49:58 | INFO | force_custom_text: False
2024-02-12,02:49:58 | INFO | force_image_size: None
2024-02-12,02:49:58 | INFO | force_patch_dropout: None
2024-02-12,02:49:58 | INFO | force_quick_gelu: False
2024-02-12,02:49:58 | INFO | gather_with_grad: True
2024-02-12,02:49:58 | INFO | grad_checkpointing: False
2024-02-12,02:49:58 | INFO | grad_clip_norm: None
2024-02-12,02:49:58 | INFO | horovod: False
2024-02-12,02:49:58 | INFO | image_interpolation: None
2024-02-12,02:49:58 | INFO | image_mean: None
2024-02-12,02:49:58 | INFO | image_resize_mode: None
2024-02-12,02:49:58 | INFO | image_std: None
2024-02-12,02:49:58 | INFO | imagenet_v2: None
2024-02-12,02:49:58 | INFO | imagenet_val: None
2024-02-12,02:49:58 | INFO | local_loss: True
2024-02-12,02:49:58 | INFO | local_rank: 0
2024-02-12,02:49:58 | INFO | lock_image: False
2024-02-12,02:49:58 | INFO | lock_image_freeze_bn_stats: False
2024-02-12,02:49:58 | INFO | lock_image_unlocked_groups: 0
2024-02-12,02:49:58 | INFO | lock_text: False
2024-02-12,02:49:58 | INFO | lock_text_freeze_layer_norm: False
2024-02-12,02:49:58 | INFO | lock_text_unlocked_layers: 0
2024-02-12,02:49:58 | INFO | log_every_n_steps: 100
2024-02-12,02:49:58 | INFO | log_level: 20
2024-02-12,02:49:58 | INFO | log_local: False
2024-02-12,02:49:58 | INFO | log_path: ./logs/2024_02_12-02_49_55-model_ViT-B-32-lr_1e-05-b_256-j_8-p_amp_bf16/out.log
2024-02-12,02:49:58 | INFO | logs: ./logs/
2024-02-12,02:49:58 | INFO | lr: 1e-05
2024-02-12,02:49:58 | INFO | lr_cooldown_end: 0.0
2024-02-12,02:49:58 | INFO | lr_cooldown_power: 1.0
2024-02-12,02:49:58 | INFO | lr_scheduler: cosine
2024-02-12,02:49:58 | INFO | model: ViT-B-32
2024-02-12,02:49:58 | INFO | name: 2024_02_12-02_49_55-model_ViT-B-32-lr_1e-05-b_256-j_8-p_amp_bf16
2024-02-12,02:49:58 | INFO | no_set_device_rank: False
2024-02-12,02:49:58 | INFO | precision: amp_bf16
2024-02-12,02:49:58 | INFO | pretrained: laion2b_s34b_b79k
2024-02-12,02:49:58 | INFO | pretrained_image: False
2024-02-12,02:49:58 | INFO | rank: 0
2024-02-12,02:49:58 | INFO | remote_sync: None
2024-02-12,02:49:58 | INFO | remote_sync_frequency: 300
2024-02-12,02:49:58 | INFO | remote_sync_protocol: s3
2024-02-12,02:49:58 | INFO | report_to:
2024-02-12,02:49:58 | INFO | resume: None
2024-02-12,02:49:58 | INFO | save_frequency: 5
2024-02-12,02:49:58 | INFO | save_most_recent: False
2024-02-12,02:49:58 | INFO | seed: 0
2024-02-12,02:49:58 | INFO | siglip: False
2024-02-12,02:49:58 | INFO | skip_scheduler: False
2024-02-12,02:49:58 | INFO | tensorboard: False
2024-02-12,02:49:58 | INFO | tensorboard_path:
2024-02-12,02:49:58 | INFO | torchcompile: False
2024-02-12,02:49:58 | INFO | torchscript: False
2024-02-12,02:49:58 | INFO | trace: False
2024-02-12,02:49:58 | INFO | train_data: ../../train_data_counterfactuals_neg_clip2.csv
2024-02-12,02:49:58 | INFO | train_data_upsampling_factors: None
2024-02-12,02:49:58 | INFO | train_num_samples: None
2024-02-12,02:49:58 | INFO | use_bn_sync: False
2024-02-12,02:49:58 | INFO | use_bnb_linear: None
2024-02-12,02:49:58 | INFO | val_data: None
2024-02-12,02:49:58 | INFO | val_frequency: 5
2024-02-12,02:49:58 | INFO | val_num_samples: None
2024-02-12,02:49:58 | INFO | wandb: False
2024-02-12,02:49:58 | INFO | wandb_notes:
2024-02-12,02:49:58 | INFO | wandb_project_name: open-clip
2024-02-12,02:49:58 | INFO | warmup: 1024
2024-02-12,02:49:58 | INFO | wd: 0.2
2024-02-12,02:49:58 | INFO | workers: 8
2024-02-12,02:49:58 | INFO | world_size: 1
2024-02-12,02:49:58 | INFO | zeroshot_frequency: 5
2024-02-12,02:49:58 | INFO | Start epoch 0
2024-02-12,02:50:15 | INFO | Train Epoch: 0 [ 1024/27087 (1%)] Data (t): 12.525 Batch (t): 16.592, 15.4295/s, 15.4295/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 1.0551 (1.0551) Loss: 1.0551 (1.0551)
2024-02-12,02:52:13 | INFO | Train Epoch: 0 [103424/27087 (96%)] Data (t): 0.645 Batch (t): 1.175, 459.500/s, 459.500/s/gpu LR: 0.000001 Logit Scale: 99.996 Contrastive_loss: 0.80440 (0.92975) Loss: 0.80440 (0.92975)
2024-02-12,02:52:20 | INFO | Train Epoch: 0 [107520/27087 (100%)] Data (t): 1.439 Batch (t): 1.884, 43.6989/s, 43.6989/s/gpu LR: 0.000001 Logit Scale: 99.996 Contrastive_loss: 0.73623 (0.86524) Loss: 0.73623 (0.86524)
2024-02-12,02:52:21 | INFO | Start epoch 1
2024-02-12,02:52:33 | INFO | Train Epoch: 1 [ 1024/27087 (1%)] Data (t): 11.817 Batch (t): 12.154, 21.0639/s, 21.0639/s/gpu LR: 0.000001 Logit Scale: 99.995 Contrastive_loss: 0.75390 (0.75390) Loss: 0.75390 (0.75390)
2024-02-12,02:54:37 | INFO | Train Epoch: 1 [103424/27087 (96%)] Data (t): 0.740 Batch (t): 1.238, 460.135/s, 460.135/s/gpu LR: 0.000002 Logit Scale: 99.988 Contrastive_loss: 0.65958 (0.70674) Loss: 0.65958 (0.70674)
2024-02-12,02:54:39 | INFO | Train Epoch: 1 [107520/27087 (100%)] Data (t): 0.058 Batch (t): 0.557, 459.304/s, 459.304/s/gpu LR: 0.000002 Logit Scale: 99.988 Contrastive_loss: 0.64635 (0.68661) Loss: 0.64635 (0.68661)
2024-02-12,02:54:39 | INFO | Start epoch 2
2024-02-12,02:54:51 | INFO | Train Epoch: 2 [ 1024/27087 (1%)] Data (t): 11.166 Batch (t): 11.505, 22.2512/s, 22.2512/s/gpu LR: 0.000002 Logit Scale: 99.988 Contrastive_loss: 0.53999 (0.53999) Loss: 0.53999 (0.53999)
2024-02-12,02:56:51 | INFO | Train Epoch: 2 [103424/27087 (96%)] Data (t): 0.696 Batch (t): 1.195, 459.292/s, 459.292/s/gpu LR: 0.000003 Logit Scale: 99.983 Contrastive_loss: 0.56759 (0.55379) Loss: 0.56759 (0.55379)
2024-02-12,02:56:54 | INFO | Train Epoch: 2 [107520/27087 (100%)] Data (t): 0.387 Batch (t): 0.888, 457.597/s, 457.597/s/gpu LR: 0.000003 Logit Scale: 99.983 Contrastive_loss: 0.48756 (0.53171) Loss: 0.48756 (0.53171)
2024-02-12,02:56:55 | INFO | Start epoch 3
2024-02-12,02:57:07 | INFO | Train Epoch: 3 [ 1024/27087 (1%)] Data (t): 11.677 Batch (t): 12.022, 21.2941/s, 21.2941/s/gpu LR: 0.000003 Logit Scale: 99.983 Contrastive_loss: 0.44987 (0.44987) Loss: 0.44987 (0.44987)
2024-02-12,02:59:10 | INFO | Train Epoch: 3 [103424/27087 (96%)] Data (t): 0.718 Batch (t): 1.230, 459.886/s, 459.886/s/gpu LR: 0.000004 Logit Scale: 99.981 Contrastive_loss: 0.42789 (0.43888) Loss: 0.42789 (0.43888)
2024-02-12,02:59:12 | INFO | Train Epoch: 3 [107520/27087 (100%)] Data (t): 0.058 Batch (t): 0.558, 459.170/s, 459.170/s/gpu LR: 0.000004 Logit Scale: 99.980 Contrastive_loss: 0.42664 (0.43480) Loss: 0.42664 (0.43480)
2024-02-12,02:59:12 | INFO | Start epoch 4
2024-02-12,02:59:24 | INFO | Train Epoch: 4 [ 1024/27087 (1%)] Data (t): 11.325 Batch (t): 11.659, 21.9575/s, 21.9575/s/gpu LR: 0.000004 Logit Scale: 99.980 Contrastive_loss: 0.34311 (0.34311) Loss: 0.34311 (0.34311)
2024-02-12,03:01:24 | INFO | Train Epoch: 4 [103424/27087 (96%)] Data (t): 0.712 Batch (t): 1.198, 459.840/s, 459.840/s/gpu LR: 0.000005 Logit Scale: 99.989 Contrastive_loss: 0.32785 (0.33548) Loss: 0.32785 (0.33548)
2024-02-12,03:01:27 | INFO | Train Epoch: 4 [107520/27087 (100%)] Data (t): 0.180 Batch (t): 0.623, 313.004/s, 313.004/s/gpu LR: 0.000005 Logit Scale: 99.989 Contrastive_loss: 0.36298 (0.34464) Loss: 0.36298 (0.34464)