2024-02-13,14:16:04 | INFO | Running with a single process. Device cuda:0.
2024-02-13,14:16:04 | INFO | Loaded ViT-B-32 model config.
2024-02-13,14:16:07 | INFO | Loading pretrained ViT-B-32 weights (laion2b_s34b_b79k).
2024-02-13,14:16:08 | INFO | Model:
2024-02-13,14:16:08 | INFO | CLIP(
  (visual): VisionTransformer(
    (conv1): Conv2d(3, 768, kernel_size=(32, 32), stride=(32, 32), bias=False)
    (patch_dropout): Identity()
    (ln_pre): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (transformer): Transformer(
      (resblocks): ModuleList(
        (0-11): 12 x ResidualAttentionBlock(
          (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          (attn): MultiheadAttention(
            (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
          )
          (ls_1): Identity()
          (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          (mlp): Sequential(
            (c_fc): Linear(in_features=768, out_features=3072, bias=True)
            (gelu): GELU(approximate='none')
            (c_proj): Linear(in_features=3072, out_features=768, bias=True)
          )
          (ls_2): Identity()
        )
      )
    )
    (ln_post): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (transformer): Transformer(
    (resblocks): ModuleList(
      (0-11): 12 x ResidualAttentionBlock(
        (ln_1): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (attn): MultiheadAttention(
          (out_proj): NonDynamicallyQuantizableLinear(in_features=512, out_features=512, bias=True)
        )
        (ls_1): Identity()
        (ln_2): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
        (mlp): Sequential(
          (c_fc): Linear(in_features=512, out_features=2048, bias=True)
          (gelu): GELU(approximate='none')
          (c_proj): Linear(in_features=2048, out_features=512, bias=True)
        )
        (ls_2): Identity()
      )
    )
  )
  (token_embedding): Embedding(49408, 512)
  (ln_final): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
)
2024-02-13,14:16:08 | INFO | Params:
2024-02-13,14:16:08 | INFO | accum_freq: 1
2024-02-13,14:16:08 | INFO | aug_cfg: {}
2024-02-13,14:16:08 | INFO | batch_size: 512
2024-02-13,14:16:08 | INFO | beta1: 0.9
2024-02-13,14:16:08 | INFO | beta2: 0.98
2024-02-13,14:16:08 | INFO | checkpoint_path: ./logs/2024_02_13-14_16_04-model_ViT-B-32-lr_5e-05-b_512-j_8-p_amp_bf16/checkpoints
2024-02-13,14:16:08 | INFO | coca_caption_loss_weight: 2.0
2024-02-13,14:16:08 | INFO | coca_contrastive_loss_weight: 1.0
2024-02-13,14:16:08 | INFO | copy_codebase: False
2024-02-13,14:16:08 | INFO | csv_caption_key: captions
2024-02-13,14:16:08 | INFO | csv_img_key: images
2024-02-13,14:16:08 | INFO | csv_separator:
2024-02-13,14:16:08 | INFO | dataset_resampled: False
2024-02-13,14:16:08 | INFO | dataset_type: auto
2024-02-13,14:16:08 | INFO | ddp_static_graph: True
2024-02-13,14:16:08 | INFO | debug: False
2024-02-13,14:16:08 | INFO | delete_previous_checkpoint: False
2024-02-13,14:16:08 | INFO | device: cuda:0
2024-02-13,14:16:08 | INFO | dist_backend: nccl
2024-02-13,14:16:08 | INFO | dist_url: env://
2024-02-13,14:16:08 | INFO | distill: False
2024-02-13,14:16:08 | INFO | distill_model: None
2024-02-13,14:16:08 | INFO | distill_pretrained: None
2024-02-13,14:16:08 | INFO | distributed: False
2024-02-13,14:16:08 | INFO | epochs: 15
2024-02-13,14:16:08 | INFO | epochs_cooldown: None
2024-02-13,14:16:08 | INFO | eps: 1e-06
2024-02-13,14:16:08 | INFO | force_custom_text: False
2024-02-13,14:16:08 | INFO | force_image_size: None
2024-02-13,14:16:08 | INFO | force_patch_dropout: None
2024-02-13,14:16:08 | INFO | force_quick_gelu: False
2024-02-13,14:16:08 | INFO | gather_with_grad: True
2024-02-13,14:16:08 | INFO | grad_checkpointing: False
2024-02-13,14:16:08 | INFO | grad_clip_norm: None
2024-02-13,14:16:08 | INFO | horovod: False
2024-02-13,14:16:08 | INFO | image_interpolation: None
2024-02-13,14:16:08 | INFO | image_mean: None
2024-02-13,14:16:08 | INFO | image_resize_mode: None
2024-02-13,14:16:08 | INFO | image_std: None
2024-02-13,14:16:08 | INFO | imagenet_v2: None
2024-02-13,14:16:08 | INFO | imagenet_val: None
2024-02-13,14:16:08 | INFO | local_loss: True
2024-02-13,14:16:08 | INFO | local_rank: 0
2024-02-13,14:16:08 | INFO | lock_image: False
2024-02-13,14:16:08 | INFO | lock_image_freeze_bn_stats: False
2024-02-13,14:16:08 | INFO | lock_image_unlocked_groups: 0
2024-02-13,14:16:08 | INFO | lock_text: False
2024-02-13,14:16:08 | INFO | lock_text_freeze_layer_norm: False
2024-02-13,14:16:08 | INFO | lock_text_unlocked_layers: 0
2024-02-13,14:16:08 | INFO | log_every_n_steps: 100
2024-02-13,14:16:08 | INFO | log_level: 20
2024-02-13,14:16:08 | INFO | log_local: False
2024-02-13,14:16:08 | INFO | log_path: ./logs/2024_02_13-14_16_04-model_ViT-B-32-lr_5e-05-b_512-j_8-p_amp_bf16/out.log
2024-02-13,14:16:08 | INFO | logs: ./logs/
2024-02-13,14:16:08 | INFO | lr: 5e-05
2024-02-13,14:16:08 | INFO | lr_cooldown_end: 0.0
2024-02-13,14:16:08 | INFO | lr_cooldown_power: 1.0
2024-02-13,14:16:08 | INFO | lr_scheduler: cosine
2024-02-13,14:16:08 | INFO | model: ViT-B-32
2024-02-13,14:16:08 | INFO | name: 2024_02_13-14_16_04-model_ViT-B-32-lr_5e-05-b_512-j_8-p_amp_bf16
2024-02-13,14:16:08 | INFO | no_set_device_rank: False
2024-02-13,14:16:08 | INFO | precision: amp_bf16
2024-02-13,14:16:08 | INFO | pretrained: laion2b_s34b_b79k
2024-02-13,14:16:08 | INFO | pretrained_image: False
2024-02-13,14:16:08 | INFO | rank: 0
2024-02-13,14:16:08 | INFO | remote_sync: None
2024-02-13,14:16:08 | INFO | remote_sync_frequency: 300
2024-02-13,14:16:08 | INFO | remote_sync_protocol: s3
2024-02-13,14:16:08 | INFO | report_to:
2024-02-13,14:16:08 | INFO | resume: None
2024-02-13,14:16:08 | INFO | save_frequency: 5
2024-02-13,14:16:08 | INFO | save_most_recent: False
2024-02-13,14:16:08 | INFO | seed: 0
2024-02-13,14:16:08 | INFO | siglip: False
2024-02-13,14:16:08 | INFO | skip_scheduler: False
2024-02-13,14:16:08 | INFO | tensorboard: False
2024-02-13,14:16:08 | INFO | tensorboard_path:
2024-02-13,14:16:08 | INFO | torchcompile: False
2024-02-13,14:16:08 | INFO | torchscript: False
2024-02-13,14:16:08 | INFO | trace: False
2024-02-13,14:16:08 | INFO | train_data: /staging/jzhang2427/flickr30k_entities/train_data/train_data_pos_neg_clip.csv
2024-02-13,14:16:08 | INFO | train_data_upsampling_factors: None
2024-02-13,14:16:08 | INFO | train_num_samples: None
2024-02-13,14:16:08 | INFO | use_bn_sync: False
2024-02-13,14:16:08 | INFO | use_bnb_linear: None
2024-02-13,14:16:08 | INFO | val_data: None
2024-02-13,14:16:08 | INFO | val_frequency: 5
2024-02-13,14:16:08 | INFO | val_num_samples: None
2024-02-13,14:16:08 | INFO | wandb: False
2024-02-13,14:16:08 | INFO | wandb_notes:
2024-02-13,14:16:08 | INFO | wandb_project_name: open-clip
2024-02-13,14:16:08 | INFO | warmup: 1024
2024-02-13,14:16:08 | INFO | wd: 0.2
2024-02-13,14:16:08 | INFO | workers: 8
2024-02-13,14:16:08 | INFO | world_size: 1
2024-02-13,14:16:08 | INFO | zeroshot_frequency: 5
2024-02-13,14:16:08 | INFO | Start epoch 0
2024-02-13,14:16:23 | INFO | Train Epoch: 0 [ 1024/45368 (1%)] Data (t): 5.578 Batch (t): 15.287, 33.4927/s, 33.4927/s/gpu LR: 0.000000 Logit Scale: 100.000 Contrastive_loss: 5.5776 (5.5776) Loss: 5.5776 (5.5776)
2024-02-13,14:17:12 | INFO | Train Epoch: 0 [90112/45368 (100%)] Data (t): 0.001 Batch (t): 0.561, 920.721/s, 920.721/s/gpu LR: 0.000004 Logit Scale: 99.995 Contrastive_loss: 3.0440 (4.3108) Loss: 3.0440 (4.3108)
2024-02-13,14:17:12 | INFO | Start epoch 1
2024-02-13,14:17:19 | INFO | Train Epoch: 1 [ 1024/45368 (1%)] Data (t): 6.202 Batch (t): 6.546, 78.2156/s, 78.2156/s/gpu LR: 0.000004 Logit Scale: 99.995 Contrastive_loss: 3.0290 (3.0290) Loss: 3.0290 (3.0290)
2024-02-13,14:18:08 | INFO | Train Epoch: 1 [90112/45368 (100%)] Data (t): 0.003 Batch (t): 0.558, 923.024/s, 923.024/s/gpu LR: 0.000009 Logit Scale: 99.994 Contrastive_loss: 2.4762 (2.7526) Loss: 2.4762 (2.7526)
2024-02-13,14:18:08 | INFO | Start epoch 2
2024-02-13,14:18:15 | INFO | Train Epoch: 2 [ 1024/45368 (1%)] Data (t): 5.969 Batch (t): 6.301, 81.2608/s, 81.2608/s/gpu LR: 0.000009 Logit Scale: 99.994 Contrastive_loss: 2.1421 (2.1421) Loss: 2.1421 (2.1421)
2024-02-13,14:19:03 | INFO | Train Epoch: 2 [90112/45368 (100%)] Data (t): 0.003 Batch (t): 0.559, 866.204/s, 866.204/s/gpu LR: 0.000013 Logit Scale: 99.992 Contrastive_loss: 1.9024 (2.0222) Loss: 1.9024 (2.0222)
2024-02-13,14:19:04 | INFO | Start epoch 3
2024-02-13,14:19:12 | INFO | Train Epoch: 3 [ 1024/45368 (1%)] Data (t): 7.652 Batch (t): 7.986, 64.1139/s, 64.1139/s/gpu LR: 0.000013 Logit Scale: 99.992 Contrastive_loss: 1.5644 (1.5644) Loss: 1.5644 (1.5644)
2024-02-13,14:20:03 | INFO | Train Epoch: 3 [90112/45368 (100%)] Data (t): 0.052 Batch (t): 0.583, 922.873/s, 922.873/s/gpu LR: 0.000017 Logit Scale: 99.990 Contrastive_loss: 1.5900 (1.5772) Loss: 1.5900 (1.5772)
2024-02-13,14:20:03 | INFO | Start epoch 4
2024-02-13,14:20:10 | INFO | Train Epoch: 4 [ 1024/45368 (1%)] Data (t): 6.206 Batch (t): 6.543, 78.2560/s, 78.2560/s/gpu LR: 0.000017 Logit Scale: 99.990 Contrastive_loss: 1.1582 (1.1582) Loss: 1.1582 (1.1582)
2024-02-13,14:21:00 | INFO | Train Epoch: 4 [90112/45368 (100%)] Data (t): 0.026 Batch (t): 0.573, 865.168/s, 865.168/s/gpu LR: 0.000021 Logit Scale: 99.986 Contrastive_loss: 1.1860 (1.1721) Loss: 1.1860 (1.1721)
2024-02-13,14:21:02 | INFO | Start epoch 5
2024-02-13,14:21:08 | INFO | Train Epoch: 5 [ 1024/45368 (1%)] Data (t): 6.005 Batch (t): 6.350, 80.6259/s, 80.6259/s/gpu LR: 0.000022 Logit Scale: 99.985 Contrastive_loss: 0.90302 (0.90302) Loss: 0.90302 (0.90302)
2024-02-13,14:21:57 | INFO | Train Epoch: 5 [90112/45368 (100%)] Data (t): 0.003 Batch (t): 0.558, 923.267/s, 923.267/s/gpu LR: 0.000026 Logit Scale: 99.970 Contrastive_loss: 1.0292 (0.96613) Loss: 1.0292 (0.96613)
2024-02-13,14:21:57 | INFO | Start epoch 6
2024-02-13,14:22:04 | INFO | Train Epoch: 6 [ 1024/45368 (1%)] Data (t): 6.185 Batch (t): 6.521, 78.5109/s, 78.5109/s/gpu LR: 0.000026 Logit Scale: 99.969 Contrastive_loss: 0.77590 (0.77590) Loss: 0.77590 (0.77590)
2024-02-13,14:22:53 | INFO | Train Epoch: 6 [90112/45368 (100%)] Data (t): 0.007 Batch (t): 0.560, 921.958/s, 921.958/s/gpu LR: 0.000030 Logit Scale: 99.916 Contrastive_loss: 0.92965 (0.85277) Loss: 0.92965 (0.85277)
2024-02-13,14:22:53 | INFO | Start epoch 7
2024-02-13,14:22:59 | INFO | Train Epoch: 7 [ 1024/45368 (1%)] Data (t): 5.502 Batch (t): 5.834, 87.7655/s, 87.7655/s/gpu LR: 0.000030 Logit Scale: 99.915 Contrastive_loss: 0.72926 (0.72926) Loss: 0.72926 (0.72926)
2024-02-13,14:23:49 | INFO | Train Epoch: 7 [90112/45368 (100%)] Data (t): 0.026 Batch (t): 0.574, 911.691/s, 911.691/s/gpu LR: 0.000034 Logit Scale: 99.846 Contrastive_loss: 0.86592 (0.79759) Loss: 0.86592 (0.79759)
2024-02-13,14:23:49 | INFO | Start epoch 8
2024-02-13,14:23:56 | INFO | Train Epoch: 8 [ 1024/45368 (1%)] Data (t): 6.269 Batch (t): 6.609, 77.4675/s, 77.4675/s/gpu LR: 0.000034 Logit Scale: 99.846 Contrastive_loss: 0.57734 (0.57734) Loss: 0.57734 (0.57734)
2024-02-13,14:24:45 | INFO | Train Epoch: 8 [90112/45368 (100%)] Data (t): 0.004 Batch (t): 0.559, 890.899/s, 890.899/s/gpu LR: 0.000039 Logit Scale: 99.772 Contrastive_loss: 0.62928 (0.60331) Loss: 0.62928 (0.60331)
2024-02-13,14:24:45 | INFO | Start epoch 9
2024-02-13,14:24:51 | INFO | Train Epoch: 9 [ 1024/45368 (1%)] Data (t): 5.301 Batch (t): 5.647, 90.6720/s, 90.6720/s/gpu LR: 0.000039 Logit Scale: 99.772 Contrastive_loss: 0.53649 (0.53649) Loss: 0.53649 (0.53649)
2024-02-13,14:25:41 | INFO | Train Epoch: 9 [90112/45368 (100%)] Data (t): 0.026 Batch (t): 0.571, 864.281/s, 864.281/s/gpu LR: 0.000043 Logit Scale: 99.671 Contrastive_loss: 0.61911 (0.57780) Loss: 0.61911 (0.57780)
2024-02-13,14:25:43 | INFO | Start epoch 10
2024-02-13,14:25:50 | INFO | Train Epoch: 10 [ 1024/45368 (1%)] Data (t): 6.521 Batch (t): 6.853, 74.7106/s, 74.7106/s/gpu LR: 0.000043 Logit Scale: 99.669 Contrastive_loss: 0.48743 (0.48743) Loss: 0.48743 (0.48743)
2024-02-13,14:26:39 | INFO | Train Epoch: 10 [90112/45368 (100%)] Data (t): 0.015 Batch (t): 0.566, 922.642/s, 922.642/s/gpu LR: 0.000047 Logit Scale: 99.527 Contrastive_loss: 0.58032 (0.53388) Loss: 0.58032 (0.53388)
2024-02-13,14:26:40 | INFO | Start epoch 11
2024-02-13,14:26:47 | INFO | Train Epoch: 11 [ 1024/45368 (1%)] Data (t): 7.228 Batch (t): 7.567, 67.6600/s, 67.6600/s/gpu LR: 0.000047 Logit Scale: 99.525 Contrastive_loss: 0.38881 (0.38881) Loss: 0.38881 (0.38881)
2024-02-13,14:27:37 | INFO | Train Epoch: 11 [90112/45368 (100%)] Data (t): 0.025 Batch (t): 0.568, 915.143/s, 915.143/s/gpu LR: 0.000049 Logit Scale: 99.360 Contrastive_loss: 0.47798 (0.43340) Loss: 0.47798 (0.43340)
2024-02-13,14:27:37 | INFO | Start epoch 12
2024-02-13,14:27:44 | INFO | Train Epoch: 12 [ 1024/45368 (1%)] Data (t): 6.610 Batch (t): 6.942, 73.7587/s, 73.7587/s/gpu LR: 0.000049 Logit Scale: 99.357 Contrastive_loss: 0.34789 (0.34789) Loss: 0.34789 (0.34789)
2024-02-13,14:28:33 | INFO | Train Epoch: 12 [90112/45368 (100%)] Data (t): 0.009 Batch (t): 0.563, 923.298/s, 923.298/s/gpu LR: 0.000033 Logit Scale: 99.254 Contrastive_loss: 0.32083 (0.33436) Loss: 0.32083 (0.33436)
2024-02-13,14:28:33 | INFO | Start epoch 13
2024-02-13,14:28:39 | INFO | Train Epoch: 13 [ 1024/45368 (1%)] Data (t): 5.156 Batch (t): 5.490, 93.2544/s, 93.2544/s/gpu LR: 0.000032 Logit Scale: 99.253 Contrastive_loss: 0.23794 (0.23794) Loss: 0.23794 (0.23794)
2024-02-13,14:29:29 | INFO | Train Epoch: 13 [90112/45368 (100%)] Data (t): 0.020 Batch (t): 0.574, 922.599/s, 922.599/s/gpu LR: 0.000010 Logit Scale: 99.241 Contrastive_loss: 0.13853 (0.18823) Loss: 0.13853 (0.18823)
2024-02-13,14:29:30 | INFO | Start epoch 14
2024-02-13,14:29:36 | INFO | Train Epoch: 14 [ 1024/45368 (1%)] Data (t): 5.908 Batch (t): 6.247, 81.9657/s, 81.9657/s/gpu LR: 0.000010 Logit Scale: 99.241 Contrastive_loss: 0.16239 (0.16239) Loss: 0.16239 (0.16239)
2024-02-13,14:30:25 | INFO | Train Epoch: 14 [90112/45368 (100%)] Data (t): 0.004 Batch (t): 0.561, 888.267/s, 888.267/s/gpu LR: 0.000000 Logit Scale: 99.249 Contrastive_loss: 0.11300 (0.13770) Loss: 0.11300 (0.13770)