# RobustVLM (Foundation Models) via Object-centric Learning

## Table of Contents
- [Installation](#installation)
- [Stage1: Get Object-centric Models](#stage1-get-object-centric-models)
- [Dataset](#dataset)
- [Training](#training)

## Installation
Create and activate an Anaconda environment:
```shell
conda create -n robustclip python==3.11
```
```shell
conda activate robustclip
```

The code is tested with Python 3.11. To install the required packages, run:
```shell
pip install -r requirements.txt
```

To install open_clip_torch locally, run:
```shell
cd ./open_clip_torch
```
```shell
python setup.py develop
```

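If the editable install succeeded, importing `open_clip` and building the ViT-L-14 model should work. A minimal sanity-check sketch (the model name and pretrained tag match the training command below; the first call downloads the OpenAI weights):

```python
import torch
import open_clip

# Build CLIP ViT-L-14 with OpenAI pretrained weights, matching the
# --clip_model_name / --pretrained flags used in the training command below.
model, _, preprocess = open_clip.create_model_and_transforms("ViT-L-14", pretrained="openai")
tokenizer = open_clip.get_tokenizer("ViT-L-14")

with torch.no_grad():
    text = tokenizer(["a photo of a dog"])
    print(model.encode_text(text).shape)  # expected: torch.Size([1, 768])
```
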
## Stage1: Get Object-centric Models

### Dataset
Prepare the ImageNet dataset in a `torchvision.datasets.ImageFolder`-style layout:
```
dataset_path
└─imagenet
   └─train
      └─n01440764
         xxxxxx.JPEG
         .....
      └─......
   └─val
      └─n04254680
         xxxxxx.JPEG
         .....
      └─......
```

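This layout can be read directly with `torchvision.datasets.ImageFolder`; a minimal loading sketch (the root path is a placeholder for your own `dataset_path`):

```python
from torch.utils.data import DataLoader
import torchvision.datasets as datasets
import torchvision.transforms as transforms

# Simple resize/crop preprocessing for illustration; the training script
# applies its own preprocessing internally.
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# "dataset_path" is a placeholder; point it at the directory tree shown above.
train_set = datasets.ImageFolder("dataset_path/imagenet/train", transform=transform)
loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=8)
print(len(train_set.classes))  # expected: 1000 ImageNet synset folders
```
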
### Training
- Slot-Attention on 4 GPUs
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m train.training_clip_slots \
  --clip_model_name ViT-L-14 --pretrained openai \
  --dataset imagenet --imagenet_root /.../.../dataset_path/imagenet \
  --template std --output_normalize False \
  --steps 1000000 --warmup 10000 --batch_size 128 \
  --loss l2 --opt adamw --lr 5e-5 --wd 1e-4 \
  --attack pgd --inner_loss l2 --norm linf --eps 4 \
  --iterations_adv 10 --stepsize_adv 1 \
  --wandb False --output_dir ./output_slots --experiment_name SLOTS \
  --log_freq 1000 --eval_freq 1000
```

The slot-attention reconstruction results and checkpoints are stored in `./output_slots/ViT-L-14_openai_imagenet_l2_imagenet_SLOTS_xxxxx`.

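The exact checkpoint filenames inside that run directory depend on the training script's saving logic; a small sketch for listing and inspecting whatever `.pt` files it produced (the directory suffix `xxxxx` above is run-specific and must be adjusted):

```python
import glob
import torch

# Run-directory pattern from above; replace the suffix with your actual run name.
run_dir = "./output_slots/ViT-L-14_openai_imagenet_l2_imagenet_SLOTS_xxxxx"

# List any PyTorch checkpoints the training script saved and report their size.
for ckpt in sorted(glob.glob(f"{run_dir}/**/*.pt", recursive=True)):
    state = torch.load(ckpt, map_location="cpu")
    entries = len(state) if isinstance(state, dict) else 1
    print(f"{ckpt}: {entries} top-level entries")
```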