xmutly committed · Commit 2aa0c85 (verified) · 1 parent: b81f863

Upload README.md

Files changed (1): README.md (+68 −0)
README.md CHANGED
@@ -55,3 +55,71 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 python -m train.training_clip_slots --clip_model_na
The reconstruction results after slot attention, together with the checkpoints, are stored in './output_slots/ViT-L-14_openai_imagenet_l2_imagenet_SLOTS_xxxxx'

## Stage 2: Training and Evaluation with Object-centric Representations

- SlotVLM<sup>4</sup>
```shell
python -m train.adversarial_training_clip_with_object_token --clip_model_name ViT-L-14 --slots_ckp ./ckps/model_slots_step_300000.pt --pretrained openai --dataset imagenet --imagenet_root /path/to/imagenet --template std --output_normalize False --steps 20000 --warmup 1400 --batch_size 128 --loss l2 --opt adamw --lr 1e-5 --wd 1e-4 --attack pgd --inner_loss l2 --norm linf --eps 4 --iterations_adv 10 --stepsize_adv 1 --wandb False --output_dir ./output --experiment_name with_OT --log_freq 10 --eval_freq 10
```

Set `--eps 2` to obtain SlotVLM<sup>2</sup> models.
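The two variants differ only in the perturbation budget. A minimal sketch of selecting it (the wrapper variable names are hypothetical; only `--eps` and the module name are taken from the command above):

```shell
# Hypothetical sketch: SlotVLM^2 and SlotVLM^4 share every flag except --eps.
EPS=2  # 2 -> SlotVLM^2, 4 -> SlotVLM^4
CMD="python -m train.adversarial_training_clip_with_object_token --eps ${EPS}"
echo "$CMD"  # remaining flags as in the command above
```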

## Evaluation
Make sure the files in the `bash` directory are executable: `chmod +x bash/*`
### CLIP ImageNet
```shell
python -m CLIP_eval.clip_robustbench --clip_model_name ViT-L-14 --pretrained /path/to/ckpt.pt --dataset imagenet --imagenet_root /path/to/imagenet --wandb False --norm linf --eps 2
```
For SlotVLM<sup>2</sup> and SlotVLM<sup>4</sup> models, point `--pretrained` to the corresponding checkpoint and set `--eps 2` or `--eps 4`, respectively.

### CLIP Zero-Shot
Set the models to be evaluated in `CLIP_benchmark/benchmark/models.txt` and the datasets in `CLIP_benchmark/benchmark/datasets.txt`
(the datasets are downloaded from HuggingFace). Then run
```shell
cd CLIP_benchmark
./bash/run_benchmark_adv.sh
```
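Both list files are plain text with one entry per line. A hypothetical illustration (the exact identifier format these benchmark scripts expect is an assumption, not taken from this repository):

```
# CLIP_benchmark/benchmark/models.txt (hypothetical entries)
ViT-L-14,openai
ViT-L-14,/path/to/finetuned_ckpt.pt

# CLIP_benchmark/benchmark/datasets.txt (hypothetical entries)
cifar10
imagenet1k
```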

### VLM Captioning and VQA
#### LLaVA
In `bash/llava_eval.sh`, supply the paths to the datasets. The required annotation files can be obtained from this [HuggingFace repository](https://huggingface.co/datasets/openflamingo/eval_benchmark/tree/main).
Set `--vision_encoder_pretrained` to `openai` or supply the path to a fine-tuned CLIP model checkpoint.
Then run
```shell
./bash/llava_eval.sh
```
The LLaVA model will be downloaded automatically from HuggingFace.
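For example, the encoder choice can be expressed as a single variable in the script (a hypothetical excerpt; the actual contents of `bash/llava_eval.sh` may differ):

```shell
# Hypothetical excerpt: select the vision encoder for the evaluation run.
VISION_ENCODER_PRETRAINED=openai              # clean OpenAI CLIP weights
# VISION_ENCODER_PRETRAINED=/path/to/ckpt.pt  # or a fine-tuned robust checkpoint
echo "--vision_encoder_pretrained ${VISION_ENCODER_PRETRAINED}"
```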

#### OpenFlamingo
Download the OpenFlamingo 9B [model](https://huggingface.co/openflamingo/OpenFlamingo-9B-vitl-mpt7b/tree/main), supply the paths in `bash/of_eval_9B.sh`, and run
```shell
./bash/of_eval_9B.sh
```

Some non-standard annotation files are supplied [here](https://nc.mlcloud.uni-tuebingen.de/index.php/s/mtRnQFaZJkR9zaX) and [here](https://github.com/mlfoundations/open_flamingo/tree/main/open_flamingo/eval/data).
105
+ ### VLM Stealthy Targeted Attacks
106
+ For targeted attacks on COCO, run
107
+ ```shell
108
+ ./bash/llava_eval_targeted.sh
109
+ ```
110
+ For targeted attacks on self-selected images, set images and target captions in `vlm_eval/run_evaluation_qualitative.py` and run
111
+ ```shell
112
+ python -m vlm_eval.run_evaluation_qualitative --precision float32 --attack apgd --eps 2 --steps 10000 --vlm_model_name llava --vision_encoder_pretrained openai --verbose
113
+ ```
114
+ With 10,000 iterations it takes about 2 hours per image on an A100 GPU.
115
+

### POPE
```shell
./bash/eval_pope.sh openai   # clean model evaluation
./bash/eval_pope.sh          # robust model evaluation; set path_to_ckpt in the bash file
```
### SQA
```shell
./bash/eval_scienceqa.sh openai   # clean model evaluation
./bash/eval_scienceqa.sh          # robust model evaluation; set path_to_ckpt in the bash file
```