Commit 48f7d41 by JOY-Huang (1 parent: 04f6fe0)

Update README.md

Files changed (1): README.md +154 -152
README.md CHANGED
@@ -1,153 +1,155 @@
- ---
- license: apache-2.0
- language:
- - en
- library_name: diffusers
- pipeline_tag: image-to-image
- ---
-
- # InstantIR Model Card
-
- <!-- > **InstantIR: Blind Image Restoration with Instant Generative Reference**<br>
- > Jen-Yuan Huang<sup>1,2</sup>, Haofan Wang<sup>2</sup>, Qixun Wang<sup>2</sup>, Xu Bai<sup>2</sup>, Hao Ai<sup>2</sup>, Peng Xing<sup>2</sup>, Jen-Tse Huang<sup>3</sup> <br>
- > <sup>1</sup>Peking University, <sup>2</sup>InstantX Team, <sup>3</sup>The Chinese University of Hong Kong -->
-
- <a href='https://arxiv.org/abs/2410.06551'><img src='https://img.shields.io/badge/arXiv-b31b1b.svg'>
- <a href='https://jy-joy.github.io/InstantIR'><img src='https://img.shields.io/badge/Website-informational'></a>
- <a href='https://github.com/JY-Joy/InstantIR'><img src='https://img.shields.io/badge/Github-gray'></a>
-
- > **InstantIR** is a novel single-image restoration model designed to resurrect your damaged images, delivering extrem-quality yet realistic details. You can further boost **InstantIR** performance with additional text prompts, even achieve customized editing!
-
- <div align="center">
- <img src='assets/teaser_figure.png'>
- </div>
-
-
- ## Usage
-
- ### 1. Clone the github repo
- ```sh
- git clone https://github.com/JY-Joy/InstantIR.git
- cd InstantIR
- ```
-
- ### 2. Download model weights
- You can directly download InstantIR weights in this repository, or
- you can download them using python script:
-
- ```python
- from huggingface_hub import hf_hub_download
- hf_hub_download(repo_id="InstantX/InstantIR", filename="models/adapter.pt", local_dir="./models")
- hf_hub_download(repo_id="InstantX/InstantIR", filename="models/aggregator.pt", local_dir="./models")
- hf_hub_download(repo_id="InstantX/InstantIR", filename="models/previewer_lora_weights.bin", local_dir="./models")
- ```
-
- ### 3. Load InstantIR with 🧨 diffusers
-
- ```python
- # !pip install opencv-python transformers accelerate
- import torch
- from PIL import Image
-
- import diffusers
- from diffusers import DDPMScheduler, StableDiffusionXLPipeline
- from diffusers.utils import load_image
- from schedulers.lcm_single_step_scheduler import LCMSingleStepScheduler
-
- from transformers import AutoImageProcessor, AutoModel
-
- from module.ip_adapter.utils import load_ip_adapter_to_pipe, revise_state_dict, init_ip_adapter_in_unet
- from module.ip_adapter.resampler import Resampler
- from module.aggregator import Aggregator
- from pipelines.sdxl_instantir import InstantIRPipeline
-
- # prepare 'dinov2'
- image_encoder = AutoModel.from_pretrained('facebook/dinov2-large')
- image_processor = AutoImageProcessor.from_pretrained('facebook/dinov2-large')
-
- # prepare models under ./checkpoints
- dcp_adapter = f'./models/adapter.pt'
- previewer_lora_path = f'./models'
- instantir_path = f'./models/aggregator.pt'
-
- # load SDXL
- sdxl = StableDiffusionXLPipeline.from_pretrained('stabilityai/stable-diffusion-xl-base-1.0', torch_dtype=torch.float16)
-
- # load adapter
- image_proj_model = Resampler(
-     embedding_dim=image_encoder.config.hidden_size,
-     output_dim=sdxl.unet.config.cross_attention_dim,
- )
- init_ip_adapter_in_unet(
-     sdxl.unet,
-     image_proj_model,
-     dcp_adapter,
- )
-
- pipe = InstantIRPipeline(
-     sdxl.vae, sdxl.text_encoder, sdxl.text_encoder_2, sdxl.tokenizer, sdxl.tokenizer_2,
-     sdxl.unet, sdxl.scheduler, feature_extractor=image_processor, image_encoder=image_encoder,
- )
- pipe.cuda()
-
- # load previewer lora
- pipe.prepare_previewers(previewer_lora_path)
- pipe.unet.to(dtype=torch.float16)
- pipe.scheduler = DDPMScheduler.from_pretrained('stabilityai/stable-diffusion-xl-base-1.0', subfolder="scheduler")
- lcm_scheduler = LCMSingleStepScheduler.from_config(pipe.scheduler.config)
-
- # load aggregator weights
- pretrained_state_dict = torch.load(instantir_path)
- pipe.aggregator.load_state_dict(pretrained_state_dict)
- pipe.aggregator.to(dtype=torch.float16)
- ```
-
- Then, you can restore your broken images with:
-
- ```python
- # load a broken image
- image = Image.open('path/to/your-image').convert("RGB")
-
- # InstantIR restoration
- image = pipe(
-     prompt='',
-     image=image,
-     ip_adapter_image=[image],
-     negative_prompt='',
-     guidance_scale=7.0,
-     previewer_scheduler=lcm_scheduler,
-     return_dict=False,
- )[0]
- ```
-
- For more details including text-guided enhancement/editing, please refer to our [GitHub repository](https://github.com/JY-Joy/InstantIR).
-
- <!-- ## Usage Tips
- 1. If you're not satisfied with the similarity, try to increase the weight of "IdentityNet Strength" and "Adapter Strength".
- 2. If you feel that the saturation is too high, first decrease the Adapter strength. If it is still too high, then decrease the IdentityNet strength.
- 3. If you find that text control is not as expected, decrease Adapter strength.
- 4. If you find that realistic style is not good enough, go for our Github repo and use a more realistic base model. -->
-
- ## Examples
-
- <div align="center">
- <img src='assets/qualitative_real.png'>
- </div>
-
- <div align="center">
- <img src='assets/outdomain_preview.png'>
- </div>
-
- ## Disclaimer
-
- This project is released under Apache License and aims to positively impact the field of AI-driven image generation. Users are granted the freedom to create images using this tool, but they are obligated to comply with local laws and utilize it responsibly. The developers will not assume any responsibility for potential misuse by users.
-
- ## Citation
- ```bibtex
- @article{huang2024instantir,
- title={InstantIR: Blind Image Restoration with Instant Generative Reference},
- author={Huang, Jen-Yuan and Wang, Haofan and Wang, Qixun and Bai, Xu and Ai, Hao and Xing, Peng and Huang, Jen-Tse},
- journal={arXiv preprint arXiv:2410.06551},
- year={2024}
- }
+ ---
+ license: apache-2.0
+ language:
+ - en
+ library_name: diffusers
+ pipeline_tag: image-to-image
+ ---
+
+ # InstantIR Model Card
+
+ <!-- > **InstantIR: Blind Image Restoration with Instant Generative Reference**<br>
+ > Jen-Yuan Huang<sup>1,2</sup>, Haofan Wang<sup>2</sup>, Qixun Wang<sup>2</sup>, Xu Bai<sup>2</sup>, Hao Ai<sup>2</sup>, Peng Xing<sup>2</sup>, Jen-Tse Huang<sup>3</sup> <br>
+ > <sup>1</sup>Peking University, <sup>2</sup>InstantX Team, <sup>3</sup>The Chinese University of Hong Kong -->
+
+ <div align="center">
+ <a href='https://arxiv.org/abs/2410.06551'><img src='https://img.shields.io/badge/arXiv-b31b1b.svg'></a>
+ <a href='https://jy-joy.github.io/InstantIR'><img src='https://img.shields.io/badge/Website-informational'></a>
+ <a href='https://github.com/JY-Joy/InstantIR'><img src='https://img.shields.io/badge/Github-gray'></a>
+ </div>
+
+ > **InstantIR** is a novel single-image restoration model designed to resurrect your damaged images, delivering extreme-quality yet realistic details. You can further boost **InstantIR** performance with additional text prompts, and even achieve customized editing!
+
+ <div align="center">
+ <img src='assets/teaser_figure.png'>
+ </div>
+
+
+ ## Usage
+
+ ### 1. Clone the GitHub repo
+ ```sh
+ git clone https://github.com/JY-Joy/InstantIR.git
+ cd InstantIR
+ ```
+
+ ### 2. Download model weights
+ You can download the InstantIR weights directly from this repository, or
+ fetch them with a Python script:
+
+ ```python
+ from huggingface_hub import hf_hub_download
+ # local_dir="." keeps the repo-relative "models/" prefix, so the files land in ./models/
+ hf_hub_download(repo_id="InstantX/InstantIR", filename="models/adapter.pt", local_dir=".")
+ hf_hub_download(repo_id="InstantX/InstantIR", filename="models/aggregator.pt", local_dir=".")
+ hf_hub_download(repo_id="InstantX/InstantIR", filename="models/previewer_lora_weights.bin", local_dir=".")
+ ```
+
+ ### 3. Load InstantIR with 🧨 diffusers
+
+ ```python
+ # !pip install opencv-python transformers accelerate
+ import torch
+ from PIL import Image
+
+ import diffusers
+ from diffusers import DDPMScheduler, StableDiffusionXLPipeline
+ from diffusers.utils import load_image
+ from schedulers.lcm_single_step_scheduler import LCMSingleStepScheduler
+
+ from transformers import AutoImageProcessor, AutoModel
+
+ from module.ip_adapter.utils import load_ip_adapter_to_pipe, revise_state_dict, init_ip_adapter_in_unet
+ from module.ip_adapter.resampler import Resampler
+ from module.aggregator import Aggregator
+ from pipelines.sdxl_instantir import InstantIRPipeline
+
+ # prepare 'dinov2'
+ image_encoder = AutoModel.from_pretrained('facebook/dinov2-large')
+ image_processor = AutoImageProcessor.from_pretrained('facebook/dinov2-large')
+
+ # paths to the weights downloaded into ./models
+ dcp_adapter = './models/adapter.pt'
+ previewer_lora_path = './models'
+ instantir_path = './models/aggregator.pt'
+
+ # load SDXL
+ sdxl = StableDiffusionXLPipeline.from_pretrained('stabilityai/stable-diffusion-xl-base-1.0', torch_dtype=torch.float16)
+
+ # load adapter
+ image_proj_model = Resampler(
+     embedding_dim=image_encoder.config.hidden_size,
+     output_dim=sdxl.unet.config.cross_attention_dim,
+ )
+ init_ip_adapter_in_unet(
+     sdxl.unet,
+     image_proj_model,
+     dcp_adapter,
+ )
+
+ pipe = InstantIRPipeline(
+     sdxl.vae, sdxl.text_encoder, sdxl.text_encoder_2, sdxl.tokenizer, sdxl.tokenizer_2,
+     sdxl.unet, sdxl.scheduler, feature_extractor=image_processor, image_encoder=image_encoder,
+ )
+ pipe.cuda()
+
+ # load previewer lora
+ pipe.prepare_previewers(previewer_lora_path)
+ pipe.unet.to(dtype=torch.float16)
+ pipe.scheduler = DDPMScheduler.from_pretrained('stabilityai/stable-diffusion-xl-base-1.0', subfolder="scheduler")
+ lcm_scheduler = LCMSingleStepScheduler.from_config(pipe.scheduler.config)
+
+ # load aggregator weights
+ pretrained_state_dict = torch.load(instantir_path)
+ pipe.aggregator.load_state_dict(pretrained_state_dict)
+ pipe.aggregator.to(dtype=torch.float16)
+ ```
+
+ Then, you can restore your broken images with:
+
+ ```python
+ # load a broken image
+ image = Image.open('path/to/your-image').convert("RGB")
+
+ # InstantIR restoration
+ image = pipe(
+     prompt='',
+     image=image,
+     ip_adapter_image=[image],
+     negative_prompt='',
+     guidance_scale=7.0,
+     previewer_scheduler=lcm_scheduler,
+     return_dict=False,
+ )[0]
+ ```
+
+ For more details including text-guided enhancement/editing, please refer to our [GitHub repository](https://github.com/JY-Joy/InstantIR).
+
+ <!-- ## Usage Tips
+ 1. If you're not satisfied with the similarity, try to increase the weight of "IdentityNet Strength" and "Adapter Strength".
+ 2. If you feel that the saturation is too high, first decrease the Adapter strength. If it is still too high, then decrease the IdentityNet strength.
+ 3. If you find that text control is not as expected, decrease Adapter strength.
+ 4. If you find that realistic style is not good enough, go for our Github repo and use a more realistic base model. -->
+
+ ## Examples
+
+ <div align="center">
+ <img src='assets/qualitative_real.png'>
+ </div>
+
+ <div align="center">
+ <img src='assets/outdomain_preview.png'>
+ </div>
+
+ ## Disclaimer
+
+ This project is released under the Apache 2.0 License and aims to positively impact the field of AI-driven image generation. Users are free to create images with this tool, but they must comply with local laws and use it responsibly. The developers assume no responsibility for potential misuse by users.
+
+ ## Citation
+ ```bibtex
+ @article{huang2024instantir,
+ title={InstantIR: Blind Image Restoration with Instant Generative Reference},
+ author={Huang, Jen-Yuan and Wang, Haofan and Wang, Qixun and Bai, Xu and Ai, Hao and Xing, Peng and Huang, Jen-Tse},
+ journal={arXiv preprint arXiv:2410.06551},
+ year={2024}
+ }
  ```
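
The loading code in this README reads its weights from `./models/adapter.pt`, `./models/aggregator.pt` and the `./models` directory, so it can help to confirm the files are in place before building the pipeline. The following is a minimal sketch rather than part of InstantIR: the helper names `resolve_download_paths` and `missing_weights` are hypothetical, and it assumes `hf_hub_download` keeps each file's repo-relative path (`models/...`) beneath the chosen `local_dir`.

```python
from pathlib import Path

# Repo-relative filenames, copied from the download step of the README.
WEIGHT_FILES = (
    "models/adapter.pt",
    "models/aggregator.pt",
    "models/previewer_lora_weights.bin",
)


def resolve_download_paths(local_dir, filenames=WEIGHT_FILES):
    """Return where each file would land for a given local_dir.

    Assumption: the repo-relative filename is preserved under local_dir.
    """
    root = Path(local_dir)
    return [root / name for name in filenames]


def missing_weights(local_dir, filenames=WEIGHT_FILES):
    """List the expected weight files that are not present on disk."""
    return [p for p in resolve_download_paths(local_dir, filenames) if not p.is_file()]


if __name__ == "__main__":
    # "." matches the ./models/... paths used by the loading code.
    missing = missing_weights(".")
    if missing:
        print("Missing weights:", ", ".join(str(p) for p in missing))
    else:
        print("All InstantIR weights found.")
```

Under that assumption, `resolve_download_paths("./models")` points at `models/models/adapter.pt`, one directory deeper than the loading code looks, which is a quick way to spot a mismatch between the download directory and the load paths.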