hirosue committed
Commit 2132819 · 1 Parent(s): 86c3d66

Add all files

Files changed (5)
  1. README.md +7 -6
  2. app.py +2364 -0
  3. flavors.jpg +0 -0
  4. requirements.txt +29 -0
  5. taming/modules/autoencoder/lpips/vgg.pth +3 -0
README.md CHANGED
@@ -1,12 +1,13 @@
 ---
-title: VQGAN CLIP
-emoji: 🐢
-colorFrom: gray
-colorTo: pink
+title: VQGAN+CLIP (Hypertron v2)
+emoji: 👁
+colorFrom: red
+colorTo: blue
 sdk: gradio
-sdk_version: 3.1.7
+sdk_version: 2.9.4
 app_file: app.py
 pinned: false
+license: mit
 ---

-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces#reference
app.py ADDED
@@ -0,0 +1,2364 @@
1
+ import sys
2
+ import argparse
3
+ import math
4
+ from pathlib import Path
5
+ import sys
6
+ import pandas as pd
7
+ from base64 import b64encode
8
+ from omegaconf import OmegaConf
9
+ from PIL import Image
10
+ from taming.models import cond_transformer, vqgan
11
+ import torch
12
+ import os
+ from os.path import exists as path_exists
13
+
14
+ torch.cuda.empty_cache()
15
+ from torch import nn
16
+ import torch.optim as optim
17
+ from torch import optim
18
+ from torch.nn import functional as F
19
+ from torchvision import transforms
20
+ from torchvision.transforms import functional as TF
21
+ import torchvision.transforms as T
22
+ from git.repo.base import Repo
23
+
24
+ if not (path_exists(f"CLIP")):
25
+ Repo.clone_from("https://github.com/openai/CLIP", "CLIP")
26
+
27
+ from CLIP import clip
28
+ import gradio as gr
29
+ import kornia.augmentation as K
30
+ import numpy as np
31
+ import subprocess
32
+ import imageio
33
+ from PIL import ImageFile, Image
34
+ import time
35
+ import base64
36
+
37
+ import hashlib
38
+ from PIL.PngImagePlugin import PngImageFile, PngInfo
39
+ import json
40
+ import urllib.request
41
+ from random import randint
42
+ from pathvalidate import sanitize_filename
43
+ from huggingface_hub import hf_hub_download
44
+ import shortuuid
45
+ import gc
46
+
47
+ device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
48
+ print("Using device:", device)
49
+
50
+ vqgan_model = hf_hub_download(repo_id="boris/vqgan_f16_16384", filename="model.ckpt")
51
+ vqgan_config = hf_hub_download(repo_id="boris/vqgan_f16_16384", filename="config.yaml")
52
+
53
+ def load_vqgan_model(config_path, checkpoint_path):
54
+ config = OmegaConf.load(config_path)
55
+ if config.model.target == "taming.models.vqgan.VQModel":
56
+ model = vqgan.VQModel(**config.model.params)
57
+ model.eval().requires_grad_(False)
58
+ model.init_from_ckpt(checkpoint_path)
59
+ elif config.model.target == "taming.models.cond_transformer.Net2NetTransformer":
60
+ parent_model = cond_transformer.Net2NetTransformer(**config.model.params)
61
+ parent_model.eval().requires_grad_(False)
62
+ parent_model.init_from_ckpt(checkpoint_path)
63
+ model = parent_model.first_stage_model
64
+ elif config.model.target == "taming.models.vqgan.GumbelVQ":
65
+ model = vqgan.GumbelVQ(**config.model.params)
66
+ # print(config.model.params)
67
+ model.eval().requires_grad_(False)
68
+ model.init_from_ckpt(checkpoint_path)
69
+ else:
70
+ raise ValueError(f"unknown model type: {config.model.target}")
71
+ del model.loss
72
+ return model
73
+ model = load_vqgan_model(vqgan_config, vqgan_model).to(device)
74
+ perceptor = (
75
+ clip.load("ViT-B/32", jit=False)[0]
76
+ .eval()
77
+ .requires_grad_(False)
78
+ .to(device)
79
+ )
80
+ gc.collect()
81
+ torch.cuda.empty_cache()
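+ # The VQGAN (f16, 16384-entry codebook) and the CLIP ViT-B/32 perceptor are loaded once
+ # at module level here and reused by every call to run_all() below.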
82
+
83
+ def run_all(user_input, width, height, template, num_steps, flavor):
84
+ import random
85
+ import torch
86
+ gc.collect()
87
+ torch.cuda.empty_cache()
88
+
89
+ #if uploaded_file is not None:
90
+ #uploaded_folder = f"{DefaultPaths.root_path}/uploaded"
91
+ #if not path_exists(uploaded_folder):
92
+ # os.makedirs(uploaded_folder)
93
+ #image_data = uploaded_file.read()
94
+ #f = open(f"{uploaded_folder}/{uploaded_file.name}", "wb")
95
+ #f.write(image_data)
96
+ #f.close()
97
+ #image_path = f"{uploaded_folder}/{uploaded_file.name}"
98
+ #pass
99
+ #else:
100
+ image_path = None
101
+ url = shortuuid.uuid()
102
+ args2 = argparse.Namespace(
103
+ prompt=user_input,
104
+ seed=int(random.randint(0, 2147483647)),
105
+ sizex=width,
106
+ sizey=height,
107
+ flavor=flavor,
108
+ iterations=num_steps,
109
+ mse=True,
110
+ update=100,
111
+ template=template,
112
+ vqgan_model='ImageNet 16384',
113
+ seed_image=image_path,
114
+ image_file=f"{url}.png",
115
+ #frame_dir=intermediary_folder,
116
+ )
117
+ if args2.seed is not None:
118
+ import torch
119
+
120
+ import numpy as np
121
+
122
+ np.random.seed(args2.seed)
123
+ import random
124
+
125
+ random.seed(args2.seed)
126
+ # next line forces deterministic random values, but causes other issues with resampling (uncomment to see)
127
+ torch.manual_seed(args2.seed)
128
+ torch.cuda.manual_seed(args2.seed)
129
+ torch.cuda.manual_seed_all(args2.seed)
130
+ torch.backends.cudnn.deterministic = True
131
+ torch.backends.cudnn.benchmark = False
132
+ device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
133
+ print("Using device:", device)
134
+
135
+ def noise_gen(shape, octaves=5):
136
+ n, c, h, w = shape
137
+ noise = torch.zeros([n, c, 1, 1])
138
+ max_octaves = int(min(octaves, math.log(h) / math.log(2), math.log(w) / math.log(2)))
139
+ for i in reversed(range(max_octaves)):
140
+ h_cur, w_cur = h // 2**i, w // 2**i
141
+ noise = F.interpolate(
142
+ noise, (h_cur, w_cur), mode="bicubic", align_corners=False
143
+ )
144
+ noise += torch.randn([n, c, h_cur, w_cur]) / 5
145
+ return noise
146
+
147
+ def sinc(x):
148
+ return torch.where(
149
+ x != 0, torch.sin(math.pi * x) / (math.pi * x), x.new_ones([])
150
+ )
151
+
152
+ def lanczos(x, a):
153
+ cond = torch.logical_and(-a < x, x < a)
154
+ out = torch.where(cond, sinc(x) * sinc(x / a), x.new_zeros([]))
155
+ return out / out.sum()
156
+
157
+ def ramp(ratio, width):
158
+ n = math.ceil(width / ratio + 1)
159
+ out = torch.empty([n])
160
+ cur = 0
161
+ for i in range(out.shape[0]):
162
+ out[i] = cur
163
+ cur += ratio
164
+ return torch.cat([-out[1:].flip([0]), out])[1:-1]
165
+
166
+ def resample(input, size, align_corners=True):
167
+ n, c, h, w = input.shape
168
+ dh, dw = size
169
+
170
+ input = input.view([n * c, 1, h, w])
171
+
172
+ if dh < h:
173
+ kernel_h = lanczos(ramp(dh / h, 2), 2).to(input.device, input.dtype)
174
+ pad_h = (kernel_h.shape[0] - 1) // 2
175
+ input = F.pad(input, (0, 0, pad_h, pad_h), "reflect")
176
+ input = F.conv2d(input, kernel_h[None, None, :, None])
177
+
178
+ if dw < w:
179
+ kernel_w = lanczos(ramp(dw / w, 2), 2).to(input.device, input.dtype)
180
+ pad_w = (kernel_w.shape[0] - 1) // 2
181
+ input = F.pad(input, (pad_w, pad_w, 0, 0), "reflect")
182
+ input = F.conv2d(input, kernel_w[None, None, None, :])
183
+
184
+ input = input.view([n, c, h, w])
185
+ return F.interpolate(input, size, mode="bicubic", align_corners=align_corners)
186
+
187
+ def lerp(a, b, f):
188
+ return (a * (1.0 - f)) + (b * f)
189
+
190
+ class ReplaceGrad(torch.autograd.Function):
191
+ @staticmethod
192
+ def forward(ctx, x_forward, x_backward):
193
+ ctx.shape = x_backward.shape
194
+ return x_forward
195
+
196
+ @staticmethod
197
+ def backward(ctx, grad_in):
198
+ return None, grad_in.sum_to_size(ctx.shape)
199
+
200
+ replace_grad = ReplaceGrad.apply
201
+
202
+ class ClampWithGrad(torch.autograd.Function):
203
+ @staticmethod
204
+ def forward(ctx, input, min, max):
205
+ ctx.min = min
206
+ ctx.max = max
207
+ ctx.save_for_backward(input)
208
+ return input.clamp(min, max)
209
+
210
+ @staticmethod
211
+ def backward(ctx, grad_in):
212
+ (input,) = ctx.saved_tensors
213
+ return (
214
+ grad_in * (grad_in * (input - input.clamp(ctx.min, ctx.max)) >= 0),
215
+ None,
216
+ None,
217
+ )
218
+
219
+ clamp_with_grad = ClampWithGrad.apply
220
+
221
+ def vector_quantize(x, codebook):
222
+ d = (
223
+ x.pow(2).sum(dim=-1, keepdim=True)
224
+ + codebook.pow(2).sum(dim=1)
225
+ - 2 * x @ codebook.T
226
+ )
227
+ indices = d.argmin(-1)
228
+ x_q = F.one_hot(indices, codebook.shape[0]).to(d.dtype) @ codebook
229
+ return replace_grad(x_q, x)
230
+
231
+ class Prompt(nn.Module):
232
+ def __init__(self, embed, weight=1.0, stop=float("-inf")):
233
+ super().__init__()
234
+ self.register_buffer("embed", embed)
235
+ self.register_buffer("weight", torch.as_tensor(weight))
236
+ self.register_buffer("stop", torch.as_tensor(stop))
237
+
238
+ def forward(self, input):
239
+ input_normed = F.normalize(input.unsqueeze(1), dim=2)
240
+ embed_normed = F.normalize(self.embed.unsqueeze(0), dim=2)
241
+ dists = (
242
+ input_normed.sub(embed_normed).norm(dim=2).div(2).arcsin().pow(2).mul(2)
243
+ )
244
+ dists = dists * self.weight.sign()
245
+ return (
246
+ self.weight.abs()
247
+ * replace_grad(dists, torch.maximum(dists, self.stop)).mean()
248
+ )
249
+
250
+ def parse_prompt(prompt):
251
+ if prompt.startswith("http://") or prompt.startswith("https://"):
252
+ vals = prompt.rsplit(":", 1)
253
+ vals = [vals[0] + ":" + vals[1], *vals[2:]]
254
+ else:
255
+ vals = prompt.rsplit(":", 1)
256
+ vals = vals + ["", "1", "-inf"][len(vals) :]
257
+ return vals[0], float(vals[1]), float(vals[2])
258
+
259
+ def one_sided_clip_loss(input, target, labels=None, logit_scale=100):
260
+ input_normed = F.normalize(input, dim=-1)
261
+ target_normed = F.normalize(target, dim=-1)
262
+ logits = input_normed @ target_normed.T * logit_scale
263
+ if labels is None:
264
+ labels = torch.arange(len(input), device=logits.device)
265
+ return F.cross_entropy(logits, labels)
266
+
267
+ class EMATensor(nn.Module):
268
+ """implemented by Katherine Crowson"""
269
+
270
+ def __init__(self, tensor, decay):
271
+ super().__init__()
272
+ self.tensor = nn.Parameter(tensor)
273
+ self.register_buffer("biased", torch.zeros_like(tensor))
274
+ self.register_buffer("average", torch.zeros_like(tensor))
275
+ self.decay = decay
276
+ self.register_buffer("accum", torch.tensor(1.0))
277
+ self.update()
278
+
279
+ @torch.no_grad()
280
+ def update(self):
281
+ if not self.training:
282
+ raise RuntimeError("update() should only be called during training")
283
+
284
+ self.accum *= self.decay
285
+ self.biased.mul_(self.decay)
286
+ self.biased.add_((1 - self.decay) * self.tensor)
287
+ self.average.copy_(self.biased)
288
+ self.average.div_(1 - self.accum)
289
+
290
+ def forward(self):
291
+ if self.training:
292
+ return self.tensor
293
+ return self.average
294
+
295
+ class MakeCutoutsCustom(nn.Module):
296
+ def __init__(self, cut_size, cutn, cut_pow, augs):
297
+ super().__init__()
298
+ self.cut_size = cut_size
299
+ # tqdm.write(f"cut size: {self.cut_size}")
300
+ self.cutn = cutn
301
+ self.cut_pow = cut_pow
302
+ self.noise_fac = 0.1
303
+ self.av_pool = nn.AdaptiveAvgPool2d((self.cut_size, self.cut_size))
304
+ self.max_pool = nn.AdaptiveMaxPool2d((self.cut_size, self.cut_size))
305
+ self.augs = nn.Sequential(
306
+ K.RandomHorizontalFlip(p=Random_Horizontal_Flip),
307
+ K.RandomSharpness(Random_Sharpness, p=Random_Sharpness_P),
308
+ K.RandomGaussianBlur(
309
+ (Random_Gaussian_Blur),
310
+ (Random_Gaussian_Blur_W, Random_Gaussian_Blur_W),
311
+ p=Random_Gaussian_Blur_P,
312
+ ),
313
+ K.RandomGaussianNoise(p=Random_Gaussian_Noise_P),
314
+ K.RandomElasticTransform(
315
+ kernel_size=(
316
+ Random_Elastic_Transform_Kernel_Size_W,
317
+ Random_Elastic_Transform_Kernel_Size_H,
318
+ ),
319
+ sigma=(Random_Elastic_Transform_Sigma),
320
+ p=Random_Elastic_Transform_P,
321
+ ),
322
+ K.RandomAffine(
323
+ degrees=Random_Affine_Degrees,
324
+ translate=Random_Affine_Translate,
325
+ p=Random_Affine_P,
326
+ padding_mode="border",
327
+ ),
328
+ K.RandomPerspective(Random_Perspective, p=Random_Perspective_P),
329
+ K.ColorJitter(
330
+ hue=Color_Jitter_Hue,
331
+ saturation=Color_Jitter_Saturation,
332
+ p=Color_Jitter_P,
333
+ ),
334
+ )
335
+ # K.RandomErasing((0.1, 0.7), (0.3, 1/0.4), same_on_batch=True, p=0.2),)
336
+
337
+ def set_cut_pow(self, cut_pow):
338
+ self.cut_pow = cut_pow
339
+
340
+ def forward(self, input):
341
+ sideY, sideX = input.shape[2:4]
342
+ max_size = min(sideX, sideY)
343
+ min_size = min(sideX, sideY, self.cut_size)
344
+ cutouts = []
345
+ cutouts_full = []
346
+ noise_fac = 0.1
347
+
348
+ min_size_width = min(sideX, sideY)
349
+ lower_bound = float(self.cut_size / min_size_width)
350
+
351
+ for ii in range(self.cutn):
352
+
353
+ # size = int(torch.rand([])**self.cut_pow * (max_size - min_size) + min_size)
354
+ randsize = (
355
+ torch.zeros(
356
+ 1,
357
+ )
358
+ .normal_(mean=0.8, std=0.3)
359
+ .clip(lower_bound, 1.0)
360
+ )
361
+ size_mult = randsize**self.cut_pow
362
+ size = int(
363
+ min_size_width * (size_mult.clip(lower_bound, 1.0))
364
+ ) # replace .5 with a result for 224 the default large size is .95
365
+ # size = int(min_size_width*torch.zeros(1,).normal_(mean=.9, std=.3).clip(lower_bound, .95)) # replace .5 with a result for 224 the default large size is .95
366
+
367
+ offsetx = torch.randint(0, sideX - size + 1, ())
368
+ offsety = torch.randint(0, sideY - size + 1, ())
369
+ cutout = input[:, :, offsety : offsety + size, offsetx : offsetx + size]
370
+ cutouts.append(resample(cutout, (self.cut_size, self.cut_size)))
371
+
372
+ cutouts = torch.cat(cutouts, dim=0)
373
+ cutouts = clamp_with_grad(cutouts, 0, 1)
374
+
375
+ # if args.use_augs:
376
+ cutouts = self.augs(cutouts)
377
+ if self.noise_fac:
378
+ facs = cutouts.new_empty([cutouts.shape[0], 1, 1, 1]).uniform_(
379
+ 0, self.noise_fac
380
+ )
381
+ cutouts = cutouts + facs * torch.randn_like(cutouts)
382
+ return cutouts
383
+
384
+ class MakeCutoutsJuu(nn.Module):
385
+ def __init__(self, cut_size, cutn, cut_pow, augs):
386
+ super().__init__()
387
+ self.cut_size = cut_size
388
+ self.cutn = cutn
389
+ self.cut_pow = cut_pow
390
+ self.augs = nn.Sequential(
391
+ # K.RandomGaussianNoise(mean=0.0, std=0.5, p=0.1),
392
+ K.RandomHorizontalFlip(p=0.5),
393
+ K.RandomSharpness(0.3, p=0.4),
394
+ K.RandomAffine(degrees=30, translate=0.1, p=0.8, padding_mode="border"),
395
+ K.RandomPerspective(0.2, p=0.4),
396
+ K.ColorJitter(hue=0.01, saturation=0.01, p=0.7),
397
+ K.RandomGrayscale(p=0.1),
398
+ )
399
+ self.noise_fac = 0.1
400
+
401
+ def forward(self, input):
402
+ sideY, sideX = input.shape[2:4]
403
+ max_size = min(sideX, sideY)
404
+ min_size = min(sideX, sideY, self.cut_size)
405
+ cutouts = []
406
+ for _ in range(self.cutn):
407
+ size = int(
408
+ torch.rand([]) ** self.cut_pow * (max_size - min_size) + min_size
409
+ )
410
+ offsetx = torch.randint(0, sideX - size + 1, ())
411
+ offsety = torch.randint(0, sideY - size + 1, ())
412
+ cutout = input[:, :, offsety : offsety + size, offsetx : offsetx + size]
413
+ cutouts.append(resample(cutout, (self.cut_size, self.cut_size)))
414
+ batch = self.augs(torch.cat(cutouts, dim=0))
415
+ if self.noise_fac:
416
+ facs = batch.new_empty([self.cutn, 1, 1, 1]).uniform_(0, self.noise_fac)
417
+ batch = batch + facs * torch.randn_like(batch)
418
+ return batch
419
+
420
+ class MakeCutoutsMoth(nn.Module):
421
+ def __init__(self, cut_size, cutn, cut_pow, augs, skip_augs=False):
422
+ super().__init__()
423
+ self.cut_size = cut_size
424
+ self.cutn = cutn
425
+ self.cut_pow = cut_pow
426
+ self.skip_augs = skip_augs
427
+ self.augs = T.Compose(
428
+ [
429
+ T.RandomHorizontalFlip(p=0.5),
430
+ T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),
431
+ T.RandomAffine(degrees=15, translate=(0.1, 0.1)),
432
+ T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),
433
+ T.RandomPerspective(distortion_scale=0.4, p=0.7),
434
+ T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),
435
+ T.RandomGrayscale(p=0.15),
436
+ T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),
437
+ # T.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),
438
+ ]
439
+ )
440
+
441
+ def forward(self, input):
442
+ input = T.Pad(input.shape[2] // 4, fill=0)(input)
443
+ sideY, sideX = input.shape[2:4]
444
+ max_size = min(sideX, sideY)
445
+
446
+ cutouts = []
447
+ for ch in range(self.cutn):
448
+ if ch > self.cutn - self.cutn // 4:
449
+ cutout = input.clone()
450
+ else:
451
+ size = int(
452
+ max_size
453
+ * torch.zeros(
454
+ 1,
455
+ )
456
+ .normal_(mean=0.8, std=0.3)
457
+ .clip(float(self.cut_size / max_size), 1.0)
458
+ )
459
+ offsetx = torch.randint(0, abs(sideX - size + 1), ())
460
+ offsety = torch.randint(0, abs(sideY - size + 1), ())
461
+ cutout = input[
462
+ :, :, offsety : offsety + size, offsetx : offsetx + size
463
+ ]
464
+
465
+ if not self.skip_augs:
466
+ cutout = self.augs(cutout)
467
+ cutouts.append(resample(cutout, (self.cut_size, self.cut_size)))
468
+ del cutout
469
+
470
+ cutouts = torch.cat(cutouts, dim=0)
471
+ return cutouts
472
+
473
+ class MakeCutoutsAaron(nn.Module):
474
+ def __init__(self, cut_size, cutn, cut_pow, augs):
475
+ super().__init__()
476
+ self.cut_size = cut_size
477
+ self.cutn = cutn
478
+ self.cut_pow = cut_pow
479
+ self.augs = augs
480
+ self.av_pool = nn.AdaptiveAvgPool2d((self.cut_size, self.cut_size))
481
+ self.max_pool = nn.AdaptiveMaxPool2d((self.cut_size, self.cut_size))
482
+
483
+ def set_cut_pow(self, cut_pow):
484
+ self.cut_pow = cut_pow
485
+
486
+ def forward(self, input):
487
+ sideY, sideX = input.shape[2:4]
488
+ max_size = min(sideX, sideY)
489
+ min_size = min(sideX, sideY, self.cut_size)
490
+ cutouts = []
491
+ cutouts_full = []
492
+
493
+ min_size_width = min(sideX, sideY)
494
+ lower_bound = float(self.cut_size / min_size_width)
495
+
496
+ for ii in range(self.cutn):
497
+ size = int(
498
+ min_size_width
499
+ * torch.zeros(
500
+ 1,
501
+ )
502
+ .normal_(mean=0.8, std=0.3)
503
+ .clip(lower_bound, 1.0)
504
+ ) # replace .5 with a result for 224 the default large size is .95
505
+
506
+ offsetx = torch.randint(0, sideX - size + 1, ())
507
+ offsety = torch.randint(0, sideY - size + 1, ())
508
+ cutout = input[:, :, offsety : offsety + size, offsetx : offsetx + size]
509
+ cutouts.append(resample(cutout, (self.cut_size, self.cut_size)))
510
+
511
+ cutouts = torch.cat(cutouts, dim=0)
512
+
513
+ return clamp_with_grad(cutouts, 0, 1)
514
+
515
+ class MakeCutoutsCumin(nn.Module):
516
+ # from https://colab.research.google.com/drive/1ZAus_gn2RhTZWzOWUpPERNC0Q8OhZRTZ
517
+ def __init__(self, cut_size, cutn, cut_pow, augs):
518
+ super().__init__()
519
+ self.cut_size = cut_size
520
+ # tqdm.write(f"cut size: {self.cut_size}")
521
+ self.cutn = cutn
522
+ self.cut_pow = cut_pow
523
+ self.noise_fac = 0.1
524
+ self.av_pool = nn.AdaptiveAvgPool2d((self.cut_size, self.cut_size))
525
+ self.max_pool = nn.AdaptiveMaxPool2d((self.cut_size, self.cut_size))
526
+ self.augs = nn.Sequential(
527
+ # K.RandomHorizontalFlip(p=0.5),
528
+ # K.RandomSharpness(0.3,p=0.4),
529
+ # K.RandomGaussianBlur((3,3),(10.5,10.5),p=0.2),
530
+ # K.RandomGaussianNoise(p=0.5),
531
+ # K.RandomElasticTransform(kernel_size=(33, 33), sigma=(7,7), p=0.2),
532
+ K.RandomAffine(degrees=15, translate=0.1, p=0.7, padding_mode="border"),
533
+ K.RandomPerspective(0.7, p=0.7),
534
+ K.ColorJitter(hue=0.1, saturation=0.1, p=0.7),
535
+ K.RandomErasing((0.1, 0.4), (0.3, 1 / 0.3), same_on_batch=True, p=0.7),
536
+ )
537
+
538
+ def set_cut_pow(self, cut_pow):
539
+ self.cut_pow = cut_pow
540
+
541
+ def forward(self, input):
542
+ sideY, sideX = input.shape[2:4]
543
+ max_size = min(sideX, sideY)
544
+ min_size = min(sideX, sideY, self.cut_size)
545
+ cutouts = []
546
+ cutouts_full = []
547
+ noise_fac = 0.1
548
+
549
+ min_size_width = min(sideX, sideY)
550
+ lower_bound = float(self.cut_size / min_size_width)
551
+
552
+ for ii in range(self.cutn):
553
+
554
+ # size = int(torch.rand([])**self.cut_pow * (max_size - min_size) + min_size)
555
+ randsize = (
556
+ torch.zeros(
557
+ 1,
558
+ )
559
+ .normal_(mean=0.8, std=0.3)
560
+ .clip(lower_bound, 1.0)
561
+ )
562
+ size_mult = randsize**self.cut_pow
563
+ size = int(
564
+ min_size_width * (size_mult.clip(lower_bound, 1.0))
565
+ ) # replace .5 with a result for 224 the default large size is .95
566
+ # size = int(min_size_width*torch.zeros(1,).normal_(mean=.9, std=.3).clip(lower_bound, .95)) # replace .5 with a result for 224 the default large size is .95
567
+
568
+ offsetx = torch.randint(0, sideX - size + 1, ())
569
+ offsety = torch.randint(0, sideY - size + 1, ())
570
+ cutout = input[:, :, offsety : offsety + size, offsetx : offsetx + size]
571
+ cutouts.append(resample(cutout, (self.cut_size, self.cut_size)))
572
+
573
+ cutouts = torch.cat(cutouts, dim=0)
574
+ cutouts = clamp_with_grad(cutouts, 0, 1)
575
+
576
+ # if args.use_augs:
577
+ cutouts = self.augs(cutouts)
578
+ if self.noise_fac:
579
+ facs = cutouts.new_empty([cutouts.shape[0], 1, 1, 1]).uniform_(
580
+ 0, self.noise_fac
581
+ )
582
+ cutouts = cutouts + facs * torch.randn_like(cutouts)
583
+ return cutouts
584
+
585
+ class MakeCutoutsHolywater(nn.Module):
586
+ def __init__(self, cut_size, cutn, cut_pow, augs):
587
+ super().__init__()
588
+ self.cut_size = cut_size
589
+ # tqdm.write(f"cut size: {self.cut_size}")
590
+ self.cutn = cutn
591
+ self.cut_pow = cut_pow
592
+ self.noise_fac = 0.1
593
+ self.av_pool = nn.AdaptiveAvgPool2d((self.cut_size, self.cut_size))
594
+ self.max_pool = nn.AdaptiveMaxPool2d((self.cut_size, self.cut_size))
595
+ self.augs = nn.Sequential(
596
+ # K.RandomGaussianNoise(mean=0.0, std=0.5, p=0.1),
597
+ K.RandomHorizontalFlip(p=0.5),
598
+ K.RandomSharpness(0.3, p=0.4),
599
+ K.RandomAffine(degrees=30, translate=0.1, p=0.8, padding_mode="border"),
600
+ K.RandomPerspective(0.2, p=0.4),
601
+ K.ColorJitter(hue=0.01, saturation=0.01, p=0.7),
602
+ K.RandomGrayscale(p=0.1),
603
+ )
604
+
605
+ def set_cut_pow(self, cut_pow):
606
+ self.cut_pow = cut_pow
607
+
608
+ def forward(self, input):
609
+ sideY, sideX = input.shape[2:4]
610
+ max_size = min(sideX, sideY)
611
+ min_size = min(sideX, sideY, self.cut_size)
612
+ cutouts = []
613
+ cutouts_full = []
614
+ noise_fac = 0.1
615
+ min_size_width = min(sideX, sideY)
616
+ lower_bound = float(self.cut_size / min_size_width)
617
+
618
+ for ii in range(self.cutn):
619
+ size = int(
620
+ torch.rand([]) ** self.cut_pow * (max_size - min_size) + min_size
621
+ )
622
+ randsize = (
623
+ torch.zeros(
624
+ 1,
625
+ )
626
+ .normal_(mean=0.8, std=0.3)
627
+ .clip(lower_bound, 1.0)
628
+ )
629
+ size_mult = randsize**self.cut_pow * ii + size
630
+ size1 = int(
631
+ (min_size_width) * (size_mult.clip(lower_bound, 1.0))
632
+ ) # replace .5 with a result for 224 the default large size is .95
633
+ size2 = int(
634
+ (min_size_width)
635
+ * torch.zeros(
636
+ 1,
637
+ )
638
+ .normal_(mean=0.9, std=0.3)
639
+ .clip(lower_bound, 0.95)
640
+ ) # replace .5 with a result for 224 the default large size is .95
641
+ offsetx = torch.randint(0, sideX - size1 + 1, ())
642
+ offsety = torch.randint(0, sideY - size2 + 1, ())
643
+ cutout = input[
644
+ :, :, offsety : offsety + size2 + ii, offsetx : offsetx + size1 + ii
645
+ ]
646
+ cutouts.append(resample(cutout, (self.cut_size, self.cut_size)))
647
+
648
+ cutouts = torch.cat(cutouts, dim=0)
649
+ cutouts = clamp_with_grad(cutouts, 0, 1)
650
+ cutouts = self.augs(cutouts)
651
+ facs = cutouts.new_empty([cutouts.shape[0], 1, 1, 1]).uniform_(
652
+ 0, self.noise_fac
653
+ )
654
+ cutouts = cutouts + facs * torch.randn_like(cutouts)
655
+ return cutouts
656
+
657
+ class MakeCutoutsOldHolywater(nn.Module):
658
+ def __init__(self, cut_size, cutn, cut_pow, augs):
659
+ super().__init__()
660
+ self.cut_size = cut_size
661
+ # tqdm.write(f"cut size: {self.cut_size}")
662
+ self.cutn = cutn
663
+ self.cut_pow = cut_pow
664
+ self.noise_fac = 0.1
665
+ self.av_pool = nn.AdaptiveAvgPool2d((self.cut_size, self.cut_size))
666
+ self.max_pool = nn.AdaptiveMaxPool2d((self.cut_size, self.cut_size))
667
+ self.augs = nn.Sequential(
668
+ # K.RandomHorizontalFlip(p=0.5),
669
+ # K.RandomSharpness(0.3,p=0.4),
670
+ # K.RandomGaussianBlur((3,3),(10.5,10.5),p=0.2),
671
+ # K.RandomGaussianNoise(p=0.5),
672
+ # K.RandomElasticTransform(kernel_size=(33, 33), sigma=(7,7), p=0.2),
673
+ K.RandomAffine(
674
+ degrees=180, translate=0.5, p=0.2, padding_mode="border"
675
+ ),
676
+ K.RandomPerspective(0.6, p=0.9),
677
+ K.ColorJitter(hue=0.03, saturation=0.01, p=0.1),
678
+ K.RandomErasing((0.1, 0.7), (0.3, 1 / 0.4), same_on_batch=True, p=0.2),
679
+ )
680
+
681
+ def set_cut_pow(self, cut_pow):
682
+ self.cut_pow = cut_pow
683
+
684
+ def forward(self, input):
685
+ sideY, sideX = input.shape[2:4]
686
+ max_size = min(sideX, sideY)
687
+ min_size = min(sideX, sideY, self.cut_size)
688
+ cutouts = []
689
+ cutouts_full = []
690
+ noise_fac = 0.1
691
+
692
+ min_size_width = min(sideX, sideY)
693
+ lower_bound = float(self.cut_size / min_size_width)
694
+
695
+ for ii in range(self.cutn):
696
+
697
+ # size = int(torch.rand([])**self.cut_pow * (max_size - min_size) + min_size)
698
+ randsize = (
699
+ torch.zeros(
700
+ 1,
701
+ )
702
+ .normal_(mean=0.8, std=0.3)
703
+ .clip(lower_bound, 1.0)
704
+ )
705
+ size_mult = randsize**self.cut_pow
706
+ size = int(
707
+ min_size_width * (size_mult.clip(lower_bound, 1.0))
708
+ ) # replace .5 with a result for 224 the default large size is .95
709
+ # size = int(min_size_width*torch.zeros(1,).normal_(mean=.9, std=.3).clip(lower_bound, .95)) # replace .5 with a result for 224 the default large size is .95
710
+
711
+ offsetx = torch.randint(0, sideX - size + 1, ())
712
+ offsety = torch.randint(0, sideY - size + 1, ())
713
+ cutout = input[:, :, offsety : offsety + size, offsetx : offsetx + size]
714
+ cutouts.append(resample(cutout, (self.cut_size, self.cut_size)))
715
+
716
+ cutouts = torch.cat(cutouts, dim=0)
717
+ cutouts = clamp_with_grad(cutouts, 0, 1)
718
+
719
+ # if args.use_augs:
720
+ cutouts = self.augs(cutouts)
721
+ if self.noise_fac:
722
+ facs = cutouts.new_empty([cutouts.shape[0], 1, 1, 1]).uniform_(
723
+ 0, self.noise_fac
724
+ )
725
+ cutouts = cutouts + facs * torch.randn_like(cutouts)
726
+ return cutouts
727
+
728
+ class MakeCutoutsGinger(nn.Module):
729
+ def __init__(self, cut_size, cutn, cut_pow, augs):
730
+ super().__init__()
731
+ self.cut_size = cut_size
732
+ # tqdm.write(f"cut size: {self.cut_size}")
733
+ self.cutn = cutn
734
+ self.cut_pow = cut_pow
735
+ self.noise_fac = 0.1
736
+ self.av_pool = nn.AdaptiveAvgPool2d((self.cut_size, self.cut_size))
737
+ self.max_pool = nn.AdaptiveMaxPool2d((self.cut_size, self.cut_size))
738
+ self.augs = augs
739
+ """
740
+ nn.Sequential(
741
+ K.RandomHorizontalFlip(p=0.5),
742
+ K.RandomSharpness(0.3,p=0.4),
743
+ K.RandomGaussianBlur((3,3),(10.5,10.5),p=0.2),
744
+ K.RandomGaussianNoise(p=0.5),
745
+ K.RandomElasticTransform(kernel_size=(33, 33), sigma=(7,7), p=0.2),
746
+ K.RandomAffine(degrees=30, translate=0.1, p=0.8, padding_mode='border'), # padding_mode=2
747
+ K.RandomPerspective(0.2,p=0.4, ),
748
+ K.ColorJitter(hue=0.01, saturation=0.01, p=0.7),)
749
+ """
750
+
751
+ def set_cut_pow(self, cut_pow):
752
+ self.cut_pow = cut_pow
753
+
754
+ def forward(self, input):
755
+ sideY, sideX = input.shape[2:4]
756
+ max_size = min(sideX, sideY)
757
+ min_size = min(sideX, sideY, self.cut_size)
758
+ cutouts = []
759
+ cutouts_full = []
760
+ noise_fac = 0.1
761
+
762
+ min_size_width = min(sideX, sideY)
763
+ lower_bound = float(self.cut_size / min_size_width)
764
+
765
+ for ii in range(self.cutn):
766
+
767
+ # size = int(torch.rand([])**self.cut_pow * (max_size - min_size) + min_size)
768
+ randsize = (
769
+ torch.zeros(
770
+ 1,
771
+ )
772
+ .normal_(mean=0.8, std=0.3)
773
+ .clip(lower_bound, 1.0)
774
+ )
775
+ size_mult = randsize**self.cut_pow
776
+ size = int(
777
+ min_size_width * (size_mult.clip(lower_bound, 1.0))
778
+ ) # replace .5 with a result for 224 the default large size is .95
779
+ # size = int(min_size_width*torch.zeros(1,).normal_(mean=.9, std=.3).clip(lower_bound, .95)) # replace .5 with a result for 224 the default large size is .95
780
+
781
+ offsetx = torch.randint(0, sideX - size + 1, ())
782
+ offsety = torch.randint(0, sideY - size + 1, ())
783
+ cutout = input[:, :, offsety : offsety + size, offsetx : offsetx + size]
784
+ cutouts.append(resample(cutout, (self.cut_size, self.cut_size)))
785
+
786
+ cutouts = torch.cat(cutouts, dim=0)
787
+ cutouts = clamp_with_grad(cutouts, 0, 1)
788
+
789
+ # if args.use_augs:
790
+ cutouts = self.augs(cutouts)
791
+ if self.noise_fac:
792
+ facs = cutouts.new_empty([cutouts.shape[0], 1, 1, 1]).uniform_(
793
+ 0, self.noise_fac
794
+ )
795
+ cutouts = cutouts + facs * torch.randn_like(cutouts)
796
+ return cutouts
797
+
798
+ class MakeCutoutsZynth(nn.Module):
799
+ def __init__(self, cut_size, cutn, cut_pow, augs):
800
+ super().__init__()
801
+ self.cut_size = cut_size
802
+ # tqdm.write(f"cut size: {self.cut_size}")
803
+ self.cutn = cutn
804
+ self.cut_pow = cut_pow
805
+ self.noise_fac = 0.1
806
+ self.av_pool = nn.AdaptiveAvgPool2d((self.cut_size, self.cut_size))
807
+ self.max_pool = nn.AdaptiveMaxPool2d((self.cut_size, self.cut_size))
808
+ self.augs = nn.Sequential(
809
+ K.RandomHorizontalFlip(p=0.5),
810
+ # K.RandomSolarize(0.01, 0.01, p=0.7),
811
+ K.RandomSharpness(0.3, p=0.4),
812
+ K.RandomAffine(degrees=30, translate=0.1, p=0.8, padding_mode="border"),
813
+ K.RandomPerspective(0.2, p=0.4),
814
+ K.ColorJitter(hue=0.01, saturation=0.01, p=0.7),
815
+ )
816
+
817
+ def set_cut_pow(self, cut_pow):
818
+ self.cut_pow = cut_pow
819
+
820
+ def forward(self, input):
821
+ sideY, sideX = input.shape[2:4]
822
+ max_size = min(sideX, sideY)
823
+ min_size = min(sideX, sideY, self.cut_size)
824
+ cutouts = []
825
+ cutouts_full = []
826
+ noise_fac = 0.1
827
+
828
+ min_size_width = min(sideX, sideY)
829
+ lower_bound = float(self.cut_size / min_size_width)
830
+
831
+ for ii in range(self.cutn):
832
+
833
+ # size = int(torch.rand([])**self.cut_pow * (max_size - min_size) + min_size)
834
+ randsize = (
835
+ torch.zeros(
836
+ 1,
837
+ )
838
+ .normal_(mean=0.8, std=0.3)
839
+ .clip(lower_bound, 1.0)
840
+ )
841
+ size_mult = randsize**self.cut_pow
842
+ size = int(
843
+ min_size_width * (size_mult.clip(lower_bound, 1.0))
844
+ ) # replace .5 with a result for 224 the default large size is .95
845
+ # size = int(min_size_width*torch.zeros(1,).normal_(mean=.9, std=.3).clip(lower_bound, .95)) # replace .5 with a result for 224 the default large size is .95
846
+
847
+ offsetx = torch.randint(0, sideX - size + 1, ())
848
+ offsety = torch.randint(0, sideY - size + 1, ())
849
+ cutout = input[:, :, offsety : offsety + size, offsetx : offsetx + size]
850
+ cutouts.append(resample(cutout, (self.cut_size, self.cut_size)))
851
+
852
+ cutouts = torch.cat(cutouts, dim=0)
853
+ cutouts = clamp_with_grad(cutouts, 0, 1)
854
+
855
+ # if args.use_augs:
856
+ cutouts = self.augs(cutouts)
857
+ if self.noise_fac:
858
+ facs = cutouts.new_empty([cutouts.shape[0], 1, 1, 1]).uniform_(
859
+ 0, self.noise_fac
860
+ )
861
+ cutouts = cutouts + facs * torch.randn_like(cutouts)
862
+ return cutouts
863
+
864
+ class MakeCutoutsWyvern(nn.Module):
865
+ def __init__(self, cut_size, cutn, cut_pow, augs):
866
+ super().__init__()
867
+ self.cut_size = cut_size
868
+ # tqdm.write(f"cut size: {self.cut_size}")
869
+ self.cutn = cutn
870
+ self.cut_pow = cut_pow
871
+ self.noise_fac = 0.1
872
+ self.av_pool = nn.AdaptiveAvgPool2d((self.cut_size, self.cut_size))
873
+ self.max_pool = nn.AdaptiveMaxPool2d((self.cut_size, self.cut_size))
874
+ self.augs = augs
875
+
876
+ def forward(self, input):
877
+ sideY, sideX = input.shape[2:4]
878
+ max_size = min(sideX, sideY)
879
+ min_size = min(sideX, sideY, self.cut_size)
880
+ cutouts = []
881
+ for _ in range(self.cutn):
882
+ size = int(
883
+ torch.rand([]) ** self.cut_pow * (max_size - min_size) + min_size
884
+ )
885
+ offsetx = torch.randint(0, sideX - size + 1, ())
886
+ offsety = torch.randint(0, sideY - size + 1, ())
887
+ cutout = input[:, :, offsety : offsety + size, offsetx : offsetx + size]
888
+ cutouts.append(resample(cutout, (self.cut_size, self.cut_size)))
889
+ return clamp_with_grad(torch.cat(cutouts, dim=0), 0, 1)
890
+
891
+
892
+ import PIL
893
+
894
+ def resize_image(image, out_size):
895
+ ratio = image.size[0] / image.size[1]
896
+ area = min(image.size[0] * image.size[1], out_size[0] * out_size[1])
897
+ size = round((area * ratio) ** 0.5), round((area / ratio) ** 0.5)
898
+ return image.resize(size, PIL.Image.LANCZOS)
899
+
900
+ class GaussianBlur2d(nn.Module):
901
+ def __init__(self, sigma, window=0, mode="reflect", value=0):
902
+ super().__init__()
903
+ self.mode = mode
904
+ self.value = value
905
+ if not window:
906
+ window = max(math.ceil((sigma * 6 + 1) / 2) * 2 - 1, 3)
907
+ if sigma:
908
+ kernel = torch.exp(
909
+ -((torch.arange(window) - window // 2) ** 2) / 2 / sigma**2
910
+ )
911
+ kernel /= kernel.sum()
912
+ else:
913
+ kernel = torch.ones([1])
914
+ self.register_buffer("kernel", kernel)
915
+
916
+ def forward(self, input):
917
+ n, c, h, w = input.shape
918
+ input = input.view([n * c, 1, h, w])
919
+ start_pad = (self.kernel.shape[0] - 1) // 2
920
+ end_pad = self.kernel.shape[0] // 2
921
+ input = F.pad(
922
+ input, (start_pad, end_pad, start_pad, end_pad), self.mode, self.value
923
+ )
924
+ input = F.conv2d(input, self.kernel[None, None, None, :])
925
+ input = F.conv2d(input, self.kernel[None, None, :, None])
926
+ return input.view([n, c, h, w])
927
+
928
+ BUF_SIZE = 65536
929
+
930
+ def get_digest(path, alg=hashlib.sha256):
931
+ hash = alg()
932
+ # print(path)
933
+ with open(path, "rb") as fp:
934
+ while True:
935
+ data = fp.read(BUF_SIZE)
936
+ if not data:
937
+ break
938
+ hash.update(data)
939
+ return b64encode(hash.digest()).decode("utf-8")
940
+
941
+ flavordict = {
942
+ "cumin": MakeCutoutsCumin,
943
+ "holywater": MakeCutoutsHolywater,
944
+ "old_holywater": MakeCutoutsOldHolywater,
945
+ "ginger": MakeCutoutsGinger,
946
+ "zynth": MakeCutoutsZynth,
947
+ "wyvern": MakeCutoutsWyvern,
948
+ "aaron": MakeCutoutsAaron,
949
+ "moth": MakeCutoutsMoth,
950
+ "juu": MakeCutoutsJuu,
951
+ "custom": MakeCutoutsCustom,
952
+ }
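+ # Each "flavor" name selects one of the cutout/augmentation classes defined above; it is
+ # looked up via flavordict[flavor] when the cutout generator is built in setup_model().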
953
+
954
+ @torch.jit.script
955
+ def gelu_impl(x):
956
+ """OpenAI's gelu implementation."""
957
+ return (
958
+ 0.5
959
+ * x
960
+ * (1.0 + torch.tanh(0.7978845608028654 * x * (1.0 + 0.044715 * x * x)))
961
+ )
962
+
963
+ def gelu(x):
964
+ return gelu_impl(x)
965
+
966
+ class MSEDecayLoss(nn.Module):
967
+ def __init__(self, init_weight, mse_decay_rate, mse_epoches, mse_quantize):
968
+ super().__init__()
969
+
970
+ self.init_weight = init_weight
971
+ self.has_init_image = False
972
+ self.mse_decay = init_weight / mse_epoches if init_weight else 0
973
+ self.mse_decay_rate = mse_decay_rate
974
+ self.mse_weight = init_weight
975
+ self.mse_epoches = mse_epoches
976
+ self.mse_quantize = mse_quantize
977
+
978
+ @torch.no_grad()
979
+ def set_target(self, z_tensor, model):
980
+ z_tensor = z_tensor.detach().clone()
981
+ if self.mse_quantize:
982
+ z_tensor = vector_quantize(
983
+ z_tensor.movedim(1, 3), model.quantize.embedding.weight
984
+ ).movedim(
985
+ 3, 1
986
+ ) # z.average
987
+ self.z_orig = z_tensor
988
+
989
+ def forward(self, i, z):
990
+ if self.is_active(i):
991
+ return F.mse_loss(z, self.z_orig) * self.mse_weight / 2
992
+ return 0
993
+
994
+ def is_active(self, i):
995
+ if not self.init_weight:
996
+ return False
997
+ if i <= self.mse_decay_rate and not self.has_init_image:
998
+ return False
999
+ return True
1000
+
1001
+ @torch.no_grad()
1002
+ def step(self, i):
1003
+
1004
+ if (
1005
+ i % self.mse_decay_rate == 0
1006
+ and i != 0
1007
+ and i < self.mse_decay_rate * self.mse_epoches
1008
+ ):
1009
+
1010
+ if (
1011
+ self.mse_weight - self.mse_decay > 0
1012
+ and self.mse_weight - self.mse_decay >= self.mse_decay
1013
+ ):
1014
+ self.mse_weight -= self.mse_decay
1015
+ else:
1016
+ self.mse_weight = 0
1017
+ # print(f"updated mse weight: {self.mse_weight}")
1018
+
1019
+ return True
1020
+
1021
+ return False
1022
+
1023
+ class TVLoss(nn.Module):
1024
+ def forward(self, input):
1025
+ input = F.pad(input, (0, 1, 0, 1), "replicate")
1026
+ x_diff = input[..., :-1, 1:] - input[..., :-1, :-1]
1027
+ y_diff = input[..., 1:, :-1] - input[..., :-1, :-1]
1028
+ diff = x_diff**2 + y_diff**2 + 1e-8
1029
+ return diff.mean(dim=1).sqrt().mean()
1030
+
1031
+ class MultiClipLoss(nn.Module):
1032
+ def __init__(
1033
+ self, clip_models, text_prompt, cutn, cut_pow=1.0, clip_weight=1.0
1034
+ ):
1035
+ super().__init__()
1036
+
1037
+ # Load Clip
1038
+ self.perceptors = []
1039
+ for cm in clip_models:
1040
+ sys.stdout.write(f"Loading {cm[0]} ...\n")
1041
+ sys.stdout.flush()
1042
+ c = (
1043
+ clip.load(cm[0], jit=False)[0]
1044
+ .eval()
1045
+ .requires_grad_(False)
1046
+ .to(device)
1047
+ )
1048
+ self.perceptors.append(
1049
+ {
1050
+ "res": c.visual.input_resolution,
1051
+ "perceptor": c,
1052
+ "weight": cm[1],
1053
+ "prompts": [],
1054
+ }
1055
+ )
1056
+ self.perceptors.sort(key=lambda e: e["res"], reverse=True)
1057
+
1058
+ # Make Cutouts
1059
+ self.max_cut_size = self.perceptors[0]["res"]
1060
+ # self.make_cuts = flavordict[flavor](self.max_cut_size, cutn, cut_pow)
1061
+ # cutouts = flavordict[flavor](self.max_cut_size, cutn, cut_pow=cut_pow, augs=args.augs)
1062
+
1063
+ # Get Prompt Embeddings
1064
+ # texts = [phrase.strip() for phrase in text_prompt.split("|")]
1065
+ # if text_prompt == ['']:
1066
+ # texts = []
1067
+ texts = text_prompt
1068
+ self.pMs = []
1069
+ for prompt in texts:
1070
+ txt, weight, stop = parse_prompt(prompt)
1071
+ clip_token = clip.tokenize(txt).to(device)
1072
+ for p in self.perceptors:
1073
+ embed = p["perceptor"].encode_text(clip_token).float()
1074
+ embed_normed = F.normalize(embed.unsqueeze(0), dim=2)
1075
+ p["prompts"].append(
1076
+ {
1077
+ "embed_normed": embed_normed,
1078
+ "weight": torch.as_tensor(weight, device=device),
1079
+ "stop": torch.as_tensor(stop, device=device),
1080
+ }
1081
+ )
1082
+
1083
+ # Prep Augments
1084
+ self.normalize = transforms.Normalize(
1085
+ mean=[0.48145466, 0.4578275, 0.40821073],
1086
+ std=[0.26862954, 0.26130258, 0.27577711],
1087
+ )
1088
+
1089
+ self.augs = nn.Sequential(
1090
+ K.RandomHorizontalFlip(p=0.5),
1091
+ K.RandomSharpness(0.3, p=0.1),
1092
+ K.RandomAffine(
1093
+ degrees=30, translate=0.1, p=0.8, padding_mode="border"
1094
+ ), # padding_mode=2
1095
+ K.RandomPerspective(
1096
+ 0.2,
1097
+ p=0.4,
1098
+ ),
1099
+ K.ColorJitter(hue=0.01, saturation=0.01, p=0.7),
1100
+ K.RandomGrayscale(p=0.15),
1101
+ )
1102
+ self.noise_fac = 0.1
1103
+
1104
+ self.clip_weight = clip_weight
1105
+
1106
+ def prepare_cuts(self, img):
1107
+ cutouts = self.make_cuts(img)
1108
+ cutouts = self.augs(cutouts)
1109
+ if self.noise_fac:
1110
+ facs = cutouts.new_empty([cutouts.shape[0], 1, 1, 1]).uniform_(
1111
+ 0, self.noise_fac
1112
+ )
1113
+ cutouts = cutouts + facs * torch.randn_like(cutouts)
1114
+ cutouts = self.normalize(cutouts)
1115
+ return cutouts
1116
+
1117
+ def forward(self, i, img):
1118
+ cutouts = checkpoint(self.prepare_cuts, img)
1119
+ loss = []
1120
+
1121
+ current_cuts = cutouts
1122
+ currentres = self.max_cut_size
1123
+ for p in self.perceptors:
1124
+ if currentres != p["res"]:
1125
+ current_cuts = resample(cutouts, (p["res"], p["res"]))
1126
+ currentres = p["res"]
1127
+
1128
+ iii = p["perceptor"].encode_image(current_cuts).float()
1129
+ input_normed = F.normalize(iii.unsqueeze(1), dim=2)
1130
+ for prompt in p["prompts"]:
1131
+ dists = (
1132
+ input_normed.sub(prompt["embed_normed"])
1133
+ .norm(dim=2)
1134
+ .div(2)
1135
+ .arcsin()
1136
+ .pow(2)
1137
+ .mul(2)
1138
+ )
1139
+ dists = dists * prompt["weight"].sign()
1140
+ l = (
1141
+ prompt["weight"].abs()
1142
+ * replace_grad(
1143
+ dists, torch.maximum(dists, prompt["stop"])
1144
+ ).mean()
1145
+ )
1146
+ loss.append(l * p["weight"])
1147
+
1148
+ return loss
1149
+
1150
+ class ModelHost:
1151
+ def __init__(self, args):
1152
+ self.args = args
1153
+ self.model, self.perceptor = None, None
1154
+ self.make_cutouts = None
1155
+ self.alt_make_cutouts = None
1156
+ self.imageSize = None
1157
+ self.prompts = None
1158
+ self.opt = None
1159
+ self.normalize = None
1160
+ self.z, self.z_orig, self.z_min, self.z_max = None, None, None, None
1161
+ self.metadata = None
1162
+ self.mse_weight = 0
1163
+ self.normal_flip_optim = None
1164
+ self.usealtprompts = False
1165
+
1166
+ def setup_metadata(self, seed):
1167
+ metadata = {k: v for k, v in vars(self.args).items()}
1168
+ del metadata["max_iterations"]
1169
+ del metadata["display_freq"]
1170
+ metadata["seed"] = seed
1171
+ if metadata["init_image"]:
1172
+ path = metadata["init_image"]
1173
+ digest = get_digest(path)
1174
+ metadata["init_image"] = (path, digest)
1175
+ if metadata["image_prompts"]:
1176
+ prompts = []
1177
+ for prompt in metadata["image_prompts"]:
1178
+ path = prompt
1179
+ digest = get_digest(path)
1180
+ prompts.append((path, digest))
1181
+ metadata["image_prompts"] = prompts
1182
+ self.metadata = metadata
1183
+
1184
+ def setup_model(self, x):
1185
+ i = x
1186
+ device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
1187
+
1188
+ #perceptor = (
1189
+ # clip.load(args.clip_model, jit=False)[0]
1190
+ # .eval()
1191
+ # .requires_grad_(False)
1192
+ # .to(device)
1193
+ #)
1194
+
1195
+ cut_size = perceptor.visual.input_resolution
1196
+
1197
+ if self.args.is_gumbel:
1198
+ e_dim = model.quantize.embedding_dim
1199
+ else:
1200
+ e_dim = model.quantize.e_dim
1201
+
1202
+ f = 2 ** (model.decoder.num_resolutions - 1)
1203
+
1204
+ make_cutouts = flavordict[flavor](
1205
+ cut_size, args.mse_cutn, cut_pow=args.mse_cut_pow, augs=args.augs
1206
+ )
1207
+
1208
+ # make_cutouts = MakeCutouts(cut_size, args.mse_cutn, cut_pow=args.mse_cut_pow,augs=args.augs)
1209
+ if args.altprompts:
1210
+ self.usealtprompts = True
1211
+ self.alt_make_cutouts = flavordict[flavor](
1212
+ cut_size,
1213
+ args.mse_cutn,
1214
+ cut_pow=args.alt_mse_cut_pow,
1215
+ augs=args.altaugs,
1216
+ )
1217
+ # self.alt_make_cutouts = MakeCutouts(cut_size, args.mse_cutn, cut_pow=args.alt_mse_cut_pow,augs=args.altaugs)
1218
+
1219
+ if self.args.is_gumbel:
1220
+ n_toks = model.quantize.n_embed
1221
+ else:
1222
+ n_toks = model.quantize.n_e
1223
+
1224
+ toksX, toksY = args.size[0] // f, args.size[1] // f
1225
+ sideX, sideY = toksX * f, toksY * f
1226
+
1227
+ if self.args.is_gumbel:
1228
+ z_min = model.quantize.embed.weight.min(dim=0).values[
1229
+ None, :, None, None
1230
+ ]
1231
+ z_max = model.quantize.embed.weight.max(dim=0).values[
1232
+ None, :, None, None
1233
+ ]
1234
+ else:
1235
+ z_min = model.quantize.embedding.weight.min(dim=0).values[
1236
+ None, :, None, None
1237
+ ]
1238
+ z_max = model.quantize.embedding.weight.max(dim=0).values[
1239
+ None, :, None, None
1240
+ ]
1241
+
1242
+ from PIL import Image
1243
+ import cv2
1244
+
1245
+ # -------
1246
+ working_dir = self.args.folder_name
1247
+
1248
+ if self.args.init_image != "":
1249
+ img_0 = cv2.imread(init_image)
1250
+ z, *_ = model.encode(
1251
+ TF.to_tensor(img_0).to(device).unsqueeze(0) * 2 - 1
1252
+ )
1253
+ elif not os.path.isfile(f"{working_dir}/steps/{i:04d}.png"):
1254
+ one_hot = F.one_hot(
1255
+ torch.randint(n_toks, [toksY * toksX], device=device), n_toks
1256
+ ).float()
1257
+ if self.args.is_gumbel:
1258
+ z = one_hot @ model.quantize.embed.weight
1259
+ else:
1260
+ z = one_hot @ model.quantize.embedding.weight
1261
+ z = z.view([-1, toksY, toksX, e_dim]).permute(0, 3, 1, 2)
1262
+ else:
1263
+ center = (1 * img_0.shape[1] // 2, 1 * img_0.shape[0] // 2)
1264
+ trans_mat = np.float32([[1, 0, 10], [0, 1, 10]])
1265
+ rot_mat = cv2.getRotationMatrix2D(center, 10, 20)
1266
+
1267
+ trans_mat = np.vstack([trans_mat, [0, 0, 1]])
1268
+ rot_mat = np.vstack([rot_mat, [0, 0, 1]])
1269
+ transformation_matrix = np.matmul(rot_mat, trans_mat)
1270
+
1271
+ img_0 = cv2.warpPerspective(
1272
+ img_0,
1273
+ transformation_matrix,
1274
+ (img_0.shape[1], img_0.shape[0]),
1275
+ borderMode=cv2.BORDER_WRAP,
1276
+ )
1277
+ z, *_ = model.encode(
1278
+ TF.to_tensor(img_0).to(device).unsqueeze(0) * 2 - 1
1279
+ )
1280
+
1281
+ def save_output(i, img, suffix="zoomed"):
1282
+ filename = f"{working_dir}/steps/{i:04}{'_' + suffix if suffix else ''}.png"
1283
+ imageio.imwrite(filename, np.array(img))
1284
+
1285
+ save_output(i, img_0)
1286
+ # -------
1287
+ if args.init_image:
1288
+ pil_image = Image.open(args.init_image).convert("RGB")
1289
+ pil_image = pil_image.resize((sideX, sideY), Image.LANCZOS)
1290
+ z, *_ = model.encode(
1291
+ TF.to_tensor(pil_image).to(device).unsqueeze(0) * 2 - 1
1292
+ )
1293
+ else:
1294
+ one_hot = F.one_hot(
1295
+ torch.randint(n_toks, [toksY * toksX], device=device), n_toks
1296
+ ).float()
1297
+ if self.args.is_gumbel:
1298
+ z = one_hot @ model.quantize.embed.weight
1299
+ else:
1300
+ z = one_hot @ model.quantize.embedding.weight
1301
+ z = z.view([-1, toksY, toksX, e_dim]).permute(0, 3, 1, 2)
1302
+ z = EMATensor(z, args.ema_val)
1303
+
1304
+ if args.mse_with_zeros and not args.init_image:
1305
+ z_orig = torch.zeros_like(z.tensor)
1306
+ else:
1307
+ z_orig = z.tensor.clone()
1308
+ z.requires_grad_(True)
1309
+ # opt = optim.AdamW(z.parameters(), lr=args.mse_step_size, weight_decay=0.00000000)
1310
+ print("Step size inside:", args.step_size)
1311
+ if self.normal_flip_optim == True:
1312
+ if randint(1, 2) == 1:
1313
+ opt = torch.optim.AdamW(
1314
+ z.parameters(), lr=args.step_size, weight_decay=0.00000000
1315
+ )
1316
+ # opt = Ranger21(z.parameters(), lr=args.step_size, weight_decay=0.00000000)
1317
+ else:
1318
+ opt = optim.DiffGrad(
1319
+ z.parameters(), lr=args.step_size, weight_decay=0.00000000
1320
+ )
1321
+ else:
1322
+ opt = torch.optim.AdamW(
1323
+ z.parameters(), lr=args.step_size, weight_decay=0.00000000
1324
+ )
1325
+
1326
+ self.cur_step_size = args.mse_step_size
1327
+
1328
+ normalize = transforms.Normalize(
1329
+ mean=[0.48145466, 0.4578275, 0.40821073],
1330
+ std=[0.26862954, 0.26130258, 0.27577711],
1331
+ )
1332
+
1333
+ pMs = []
1334
+ altpMs = []
1335
+
1336
+ for prompt in args.prompts:
1337
+ txt, weight, stop = parse_prompt(prompt)
1338
+ embed = perceptor.encode_text(clip.tokenize(txt).to(device)).float()
1339
+ pMs.append(Prompt(embed, weight, stop).to(device))
1340
+
1341
+ for prompt in args.altprompts:
1342
+ txt, weight, stop = parse_prompt(prompt)
1343
+ embed = perceptor.encode_text(clip.tokenize(txt).to(device)).float()
1344
+ altpMs.append(Prompt(embed, weight, stop).to(device))
1345
+
1346
+ from PIL import Image
1347
+
1348
+ for prompt in args.image_prompts:
1349
+ path, weight, stop = parse_prompt(prompt)
1350
+ img = resize_image(Image.open(path).convert("RGB"), (sideX, sideY))
1351
+ batch = make_cutouts(TF.to_tensor(img).unsqueeze(0).to(device))
1352
+ embed = perceptor.encode_image(normalize(batch)).float()
1353
+ pMs.append(Prompt(embed, weight, stop).to(device))
1354
+
1355
+ for seed, weight in zip(args.noise_prompt_seeds, args.noise_prompt_weights):
1356
+ gen = torch.Generator().manual_seed(seed)
1357
+ embed = torch.empty([1, perceptor.visual.output_dim]).normal_(
1358
+ generator=gen
1359
+ )
1360
+ pMs.append(Prompt(embed, weight).to(device))
1361
+ if self.usealtprompts:
1362
+ altpMs.append(Prompt(embed, weight).to(device))
1363
+
1364
+ self.model, self.perceptor = model, perceptor
1365
+ self.make_cutouts = make_cutouts
1366
+ self.imageSize = (sideX, sideY)
1367
+ self.prompts = pMs
1368
+ self.altprompts = altpMs
1369
+ self.opt = opt
1370
+ self.normalize = normalize
1371
+ self.z, self.z_orig, self.z_min, self.z_max = z, z_orig, z_min, z_max
1372
+ self.setup_metadata(args2.seed)
1373
+ self.mse_weight = self.args.init_weight
1374
+
1375
+ def synth(self, z):
1376
+ if self.args.is_gumbel:
1377
+ z_q = vector_quantize(
1378
+ z.movedim(1, 3), self.model.quantize.embed.weight
1379
+ ).movedim(3, 1)
1380
+ else:
1381
+ z_q = vector_quantize(
1382
+ z.movedim(1, 3), self.model.quantize.embedding.weight
1383
+ ).movedim(3, 1)
1384
+ return clamp_with_grad(self.model.decode(z_q).add(1).div(2), 0, 1)
1385
+
1386
+ def add_metadata(self, path, i):
1387
+ imfile = PngImageFile(path)
1388
+ meta = PngInfo()
1389
+ step_meta = {"iterations": i}
1390
+ step_meta.update(self.metadata)
1391
+ # meta.add_itxt('vqgan-params', json.dumps(step_meta), zip=True)
1392
+ imfile.save(path, pnginfo=meta)
1393
+ # Hey you. This one's for Glooperpogger#7353 on Discord (Gloop has a gun), they are a nice snek
1394
+
1395
+ @torch.no_grad()
1396
+ def checkin(self, i, losses, x):
1397
+ out = self.synth(self.z.average)
1398
+
1399
+ batchpath = "./"
1400
+ TF.to_pil_image(out[0].cpu()).save(args2.image_file)
1401
+
1402
+ def unique_index(self, batchpath):
1403
+ i = 0
1404
+ while i < 10000:
1405
+ if os.path.isfile(batchpath + "/" + str(i) + ".png"):
1406
+ i = i + 1
1407
+ else:
1408
+ return batchpath + "/" + str(i) + ".png"
1409
+
1410
+ def ascend_txt(self, i):
1411
+ out = self.synth(self.z.tensor)
1412
+ iii = self.perceptor.encode_image(
1413
+ self.normalize(self.make_cutouts(out))
1414
+ ).float()
1415
+
1416
+ result = []
1417
+ if self.args.init_weight and self.mse_weight > 0:
1418
+ result.append(
1419
+ F.mse_loss(self.z.tensor, self.z_orig) * self.mse_weight / 2
1420
+ )
1421
+
1422
+ for prompt in self.prompts:
1423
+ result.append(prompt(iii))
1424
+
1425
+ if self.usealtprompts:
1426
+ iii = self.perceptor.encode_image(
1427
+ self.normalize(self.alt_make_cutouts(out))
1428
+ ).float()
1429
+ for prompt in self.altprompts:
1430
+ result.append(prompt(iii))
1431
+
1432
+ return result
1433
+
1434
+ def train(self, i, x):
1435
+ self.opt.zero_grad()
1436
+ mse_decay = self.args.mse_decay
1437
+ mse_decay_rate = self.args.mse_decay_rate
1438
+ lossAll = self.ascend_txt(i)
1439
+
1440
+ sys.stdout.write("Iteration {}".format(i) + "\n")
1441
+ sys.stdout.flush()
1442
+ if i % (args2.iterations-2) == 0:
1443
+ self.checkin(i, lossAll, x)
1444
+
1445
+ loss = sum(lossAll)
1446
+ loss.backward()
1447
+ self.opt.step()
1448
+ with torch.no_grad():
1449
+ if (
1450
+ self.mse_weight > 0
1451
+ and self.args.init_weight
1452
+ and i > 0
1453
+ and i % mse_decay_rate == 0
1454
+ ):
1455
+ if self.args.is_gumbel:
1456
+ self.z_orig = vector_quantize(
1457
+ self.z.average.movedim(1, 3),
1458
+ self.model.quantize.embed.weight,
1459
+ ).movedim(3, 1)
1460
+ else:
1461
+ self.z_orig = vector_quantize(
1462
+ self.z.average.movedim(1, 3),
1463
+ self.model.quantize.embedding.weight,
1464
+ ).movedim(3, 1)
1465
+ if self.mse_weight - mse_decay > 0:
1466
+ self.mse_weight = self.mse_weight - mse_decay
1467
+ # print(f"updated mse weight: {self.mse_weight}")
1468
+ else:
1469
+ self.mse_weight = 0
1470
+ self.make_cutouts = flavordict[flavor](
1471
+ self.perceptor.visual.input_resolution,
1472
+ args.cutn,
1473
+ cut_pow=args.cut_pow,
1474
+ augs=args.augs,
1475
+ )
1476
+ if self.usealtprompts:
1477
+ self.alt_make_cutouts = flavordict[flavor](
1478
+ self.perceptor.visual.input_resolution,
1479
+ args.cutn,
1480
+ cut_pow=args.alt_cut_pow,
1481
+ augs=args.altaugs,
1482
+ )
1483
+ self.z = EMATensor(self.z.average, args.ema_val)
1484
+ self.new_step_size = args.step_size
1485
+ self.opt = torch.optim.AdamW(
1486
+ self.z.parameters(),
1487
+ lr=args.step_size,
1488
+ weight_decay=0.00000000,
1489
+ )
1490
+ # print(f"updated mse weight: {self.mse_weight}")
1491
+ if i > args.mse_end:
1492
+ if (
1493
+ args.step_size != args.final_step_size
1494
+ and args.max_iterations > 0
1495
+ ):
1496
+ progress = (i - args.mse_end) / (args.max_iterations)
1497
+ self.cur_step_size = lerp(step_size, final_step_size, progress)
1498
+ for g in self.opt.param_groups:
1499
+ g["lr"] = self.cur_step_size
1500
+
1501
+ def run(self, x):
1502
+ j = 0
1503
+ try:
1504
+ print("Step size: ", args.step_size)
1505
+ print("Step MSE size: ", args.mse_step_size)
1506
+ before_start_time = time.perf_counter()
1507
+ total_steps = int(args.max_iterations + args.mse_end) - 1
1508
+ for _ in range(total_steps):
1509
+ self.train(j, x)
1510
+ if j > 0 and j % args.mse_decay_rate == 0 and self.mse_weight > 0:
1511
+ self.z = EMATensor(self.z.average, args.ema_val)
1512
+ self.opt = torch.optim.AdamW(
1513
+ self.z.parameters(),
1514
+ lr=args.mse_step_size,
1515
+ weight_decay=0.00000000,
1516
+ )
1517
+ if j >= total_steps:
1518
+ break
1519
+ self.z.update()
1520
+ j += 1
1521
+ time_past_seconds = time.perf_counter() - before_start_time
1522
+ iterations_per_second = j / time_past_seconds
1523
+ time_left = (total_steps - j) / iterations_per_second
1524
+ percentage = round((j / (total_steps + 1)) * 100)
1525
+
1526
+ import shutil
1527
+ import os
1528
+
1529
+ #image_data = Image.open(args2.image_file)
1530
+ #os.remove(args2.image_file)
1531
+ #return(image_data)
1532
+
1533
+ except KeyboardInterrupt:
1534
+ pass
1535
+
1536
+ def add_noise(img):
1537
+
1538
+ # Getting the dimensions of the image
1539
+ row, col = img.shape
1540
+
1541
+ # Randomly pick some pixels in the
1542
+ # image for coloring them white
1543
+ # Pick a random number between 300 and 10000
1544
+ number_of_pixels = random.randint(300, 10000)
1545
+ for i in range(number_of_pixels):
1546
+
1547
+ # Pick a random y coordinate
1548
+ y_coord = random.randint(0, row - 1)
1549
+
1550
+ # Pick a random x coordinate
1551
+ x_coord = random.randint(0, col - 1)
1552
+
1553
+ # Color that pixel to white
1554
+ img[y_coord][x_coord] = 255
1555
+
1556
+ # Randomly pick some pixels in
1557
+ # the image for coloring them black
1558
+ # Pick a random number between 300 and 10000
1559
+ number_of_pixels = random.randint(300, 10000)
1560
+ for i in range(number_of_pixels):
1561
+
1562
+ # Pick a random y coordinate
1563
+ y_coord = random.randint(0, row - 1)
1564
+
1565
+ # Pick a random x coordinate
1566
+ x_coord = random.randint(0, col - 1)
1567
+
1568
+ # Color that pixel to black
1569
+ img[y_coord][x_coord] = 0
1570
+
1571
+ return img
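+ # Usage sketch (mirrors how add_noise is called further below): read the
+ # image as grayscale so img is a 2-D array, then write the noisy copy back:
+ #   img = cv2.imread("Init_Img/Image.png", 0)
+ #   cv2.imwrite("Init_Img/Image.png", add_noise(img))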
1572
+
1573
+ import io
1574
+ import base64
1575
+
1576
+ def image_to_data_url(img, ext):
1577
+ img_byte_arr = io.BytesIO()
1578
+ img.save(img_byte_arr, format=ext)
1579
+ img_byte_arr = img_byte_arr.getvalue()
1580
+ # ext = filename.split('.')[-1]
1581
+ prefix = f"data:image/{ext};base64,"
1582
+ return prefix + base64.b64encode(img_byte_arr).decode("utf-8")
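+ # Illustrative usage sketch (not called anywhere in this script; the names
+ # below are made up for the example):
+ #   thumb = Image.new("RGB", (8, 8), "red")
+ #   data_url = image_to_data_url(thumb, "PNG")  # -> "data:image/PNG;base64,..."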
1583
+
1584
+ import torch
1585
+ import math
1586
+
1587
+ device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
1588
+
1589
+ def rand_perlin_2d(
1590
+ shape, res, fade=lambda t: 6 * t**5 - 15 * t**4 + 10 * t**3
1591
+ ):
1592
+ delta = (res[0] / shape[0], res[1] / shape[1])
1593
+ d = (shape[0] // res[0], shape[1] // res[1])
1594
+
1595
+ grid = (
1596
+ torch.stack(
1597
+ torch.meshgrid(
1598
+ torch.arange(0, res[0], delta[0]), torch.arange(0, res[1], delta[1])
1599
+ ),
1600
+ dim=-1,
1601
+ )
1602
+ % 1
1603
+ )
1604
+ angles = 2 * math.pi * torch.rand(res[0] + 1, res[1] + 1)
1605
+ gradients = torch.stack((torch.cos(angles), torch.sin(angles)), dim=-1)
1606
+
1607
+ tile_grads = (
1608
+ lambda slice1, slice2: gradients[
1609
+ slice1[0] : slice1[1], slice2[0] : slice2[1]
1610
+ ]
1611
+ .repeat_interleave(d[0], 0)
1612
+ .repeat_interleave(d[1], 1)
1613
+ )
1614
+ dot = lambda grad, shift: (
1615
+ torch.stack(
1616
+ (
1617
+ grid[: shape[0], : shape[1], 0] + shift[0],
1618
+ grid[: shape[0], : shape[1], 1] + shift[1],
1619
+ ),
1620
+ dim=-1,
1621
+ )
1622
+ * grad[: shape[0], : shape[1]]
1623
+ ).sum(dim=-1)
1624
+
1625
+ n00 = dot(tile_grads([0, -1], [0, -1]), [0, 0])
1626
+ n10 = dot(tile_grads([1, None], [0, -1]), [-1, 0])
1627
+ n01 = dot(tile_grads([0, -1], [1, None]), [0, -1])
1628
+ n11 = dot(tile_grads([1, None], [1, None]), [-1, -1])
1629
+ t = fade(grid[: shape[0], : shape[1]])
1630
+ return math.sqrt(2) * torch.lerp(
1631
+ torch.lerp(n00, n10, t[..., 0]), torch.lerp(n01, n11, t[..., 0]), t[..., 1]
1632
+ )
1633
+
1634
+ def rand_perlin_2d_octaves(desired_shape, octaves=1, persistence=0.5):
1635
+ shape = torch.tensor(desired_shape)
1636
+ shape = 2 ** torch.ceil(torch.log2(shape))
1637
+ shape = shape.type(torch.int)
1638
+
1639
+ max_octaves = int(
1640
+ min(
1641
+ octaves,
1642
+ math.log(shape[0]) / math.log(2),
1643
+ math.log(shape[1]) / math.log(2),
1644
+ )
1645
+ )
1646
+ res = torch.floor(shape / 2**max_octaves).type(torch.int)
1647
+
1648
+ noise = torch.zeros(list(shape))
1649
+ frequency = 1
1650
+ amplitude = 1
1651
+ for _ in range(max_octaves):
1652
+ noise += amplitude * rand_perlin_2d(
1653
+ shape, (frequency * res[0], frequency * res[1])
1654
+ )
1655
+ frequency *= 2
1656
+ amplitude *= persistence
1657
+
1658
+ return noise[: desired_shape[0], : desired_shape[1]]
1659
+
1660
+ def rand_perlin_rgb(desired_shape, amp=0.1, octaves=6):
1661
+ r = rand_perlin_2d_octaves(desired_shape, octaves)
1662
+ g = rand_perlin_2d_octaves(desired_shape, octaves)
1663
+ b = rand_perlin_2d_octaves(desired_shape, octaves)
1664
+ rgb = (torch.stack((r, g, b)) * amp + 1) * 0.5
1665
+ return rgb.unsqueeze(0).clip(0, 1).to(device)
1666
+
1667
+ def pyramid_noise_gen(shape, octaves=5, decay=1.0):
1668
+ n, c, h, w = shape
1669
+ noise = torch.zeros([n, c, 1, 1])
1670
+ max_octaves = int(min(math.log(h) / math.log(2), math.log(w) / math.log(2)))
1671
+ if octaves is not None and 0 < octaves:
1672
+ max_octaves = min(octaves, max_octaves)
1673
+ for i in reversed(range(max_octaves)):
1674
+ h_cur, w_cur = h // 2**i, w // 2**i
1675
+ noise = F.interpolate(
1676
+ noise, (h_cur, w_cur), mode="bicubic", align_corners=False
1677
+ )
1678
+ noise += (torch.randn([n, c, h_cur, w_cur]) / max_octaves) * decay ** (
1679
+ max_octaves - (i + 1)
1680
+ )
1681
+ return noise
1682
+
1683
+ def rand_z(model, toksX, toksY):
1684
+ e_dim = model.quantize.e_dim
1685
+ n_toks = model.quantize.n_e
1686
+ z_min = model.quantize.embedding.weight.min(dim=0).values[None, :, None, None]
1687
+ z_max = model.quantize.embedding.weight.max(dim=0).values[None, :, None, None]
1688
+
1689
+ one_hot = F.one_hot(
1690
+ torch.randint(n_toks, [toksY * toksX], device=device), n_toks
1691
+ ).float()
1692
+ z = one_hot @ model.quantize.embedding.weight
1693
+ z = z.view([-1, toksY, toksX, e_dim]).permute(0, 3, 1, 2)
1694
+
1695
+ return z
1696
+
1697
+ def make_rand_init(
1698
+ mode,
1699
+ model,
1700
+ perlin_octaves,
1701
+ perlin_weight,
1702
+ pyramid_octaves,
1703
+ pyramid_decay,
1704
+ toksX,
1705
+ toksY,
1706
+ f,
1707
+ ):
1708
+
1709
+ if mode == "VQGAN ZRand":
1710
+ return rand_z(model, toksX, toksY)
1711
+ elif mode == "Perlin Noise":
1712
+ rand_init = rand_perlin_rgb(
1713
+ (toksY * f, toksX * f), perlin_weight, perlin_octaves
1714
+ )
1715
+ z, *_ = model.encode(rand_init * 2 - 1)
1716
+ return z
1717
+ elif mode == "Pyramid Noise":
1718
+ rand_init = pyramid_noise_gen(
1719
+ (1, 3, toksY * f, toksX * f), pyramid_octaves, pyramid_decay
1720
+ ).to(device)
1721
+ rand_init = (rand_init * 0.5 + 0.5).clip(0, 1)
1722
+ z, *_ = model.encode(rand_init * 2 - 1)
1723
+ return z
1724
+
1725
+ ##################### JUICY MESS ###################################
1726
+ import os
1727
+
1728
+ imagenet_1024 = False # @param {type:"boolean"}
1729
+ imagenet_16384 = True # @param {type:"boolean"}
1730
+ gumbel_8192 = False # @param {type:"boolean"}
1731
+ sber_gumbel = False # @param {type:"boolean"}
1732
+ # imagenet_cin = False #@param {type:"boolean"}
1733
+ coco = False # @param {type:"boolean"}
1734
+ coco_1stage = False # @param {type:"boolean"}
1735
+ faceshq = False # @param {type:"boolean"}
1736
+ wikiart_1024 = False # @param {type:"boolean"}
1737
+ wikiart_16384 = False # @param {type:"boolean"}
1738
+ wikiart_7mil = False # @param {type:"boolean"}
1739
+ sflckr = False # @param {type:"boolean"}
1740
+
1741
+ ##@markdown Experimental models (these probably won't work; if you know how to make them work, go ahead :D):
1742
+ # celebahq = False #@param {type:"boolean"}
1743
+ # ade20k = False #@param {type:"boolean"}
1744
+ # drin = False #@param {type:"boolean"}
1745
+ # gumbel = False #@param {type:"boolean"}
1746
+ # gumbel_8192 = False #@param {type:"boolean"}
1747
+
1748
+ # Configure and run the model"""
1749
+
1750
+ # Commented out IPython magic to ensure Python compatibility.
1751
+ # @title <font color="lightgreen" size="+3">←</font> <font size="+2">🏃‍♂️</font> **Configure & Run** <font size="+2">🏃‍♂️</font>
1752
+
1753
+ import os
1754
+ import random
1755
+ import cv2
1756
+
1757
+ # from google.colab import drive
1758
+ from PIL import Image
1759
+ from importlib import reload
1760
+
1761
+ reload(PIL.TiffTags)
1762
+ # %cd /content/
1763
+ # @markdown >`prompts` is the list of prompts to give to the AI, separated by `|`. With more than one, it will attempt to mix them together. You can add weights to different parts of the prompt by adding a `p:x` at the end of a prompt (before a `|`) where `p` is the prompt and `x` is the weight.
1764
+
1765
+ # prompts = "A fantasy landscape, by Greg Rutkowski. A lush mountain.:1 | Trending on ArtStation, unreal engine. 4K HD, realism.:0.63" #@param {type:"string"}
1766
+
1767
+ prompts = args2.prompt
1768
+
1769
+ width = args2.sizex # @param {type:"number"}
1770
+ height = args2.sizey # @param {type:"number"}
1771
+
1772
+ # model = "ImageNet 16384" #@param ['ImageNet 16384', 'ImageNet 1024', "Gumbel 8192", "Sber Gumbel", 'WikiArt 1024', 'WikiArt 16384', 'WikiArt 7mil', 'COCO-Stuff', 'COCO 1 Stage', 'FacesHQ', 'S-FLCKR']
1773
+ #model = args2.vqgan_model
1774
+
1775
+ #if model == "Gumbel 8192" or model == "Sber Gumbel":
1776
+ # is_gumbel = True
1777
+ #else:
1778
+ # is_gumbel = False
1779
+ is_gumbel = False
1780
+ ##@markdown The flavor affects the output greatly. Each has its own characteristics, and depending on what you choose you'll get a widely different result with the same prompt and seed. Ginger is the default, nothing special. Cumin results in more of a painting, while Holywater makes everything super funky and/or colorful. Custom is a custom flavor; use the utilities above.
1781
+ # Type "old_holywater" to use the old holywater flavor from Hypertron V1
1782
+ flavor = (
1783
+ args2.flavor
1784
+ ) #'ginger' #@param ["ginger", "cumin", "holywater", "zynth", "wyvern", "aaron", "moth", "juu", "custom"]
1785
+ template = (
1786
+ args2.template
1787
+ ) # @param ["none", "----------Parameter Tweaking----------", "Balanced", "Detailed", "Consistent Creativity", "Realistic", "Smooth", "Subtle MSE", "Hyper Fast Results", "----------Complete Overhaul----------", "flag", "planet", "creature", "human", "----------Sizes----------", "Size: Square", "Size: Landscape", "Size: Poster", "----------Prompt Modifiers----------", "Better - Fast", "Better - Slow", "Movie Poster", "Negative Prompt", "Better Quality"]
1788
+ ##@markdown To use initial or target images, upload them on the left in the file browser. You can also use a previous output by putting its path below, e.g. `batch_01/0.png`. If your previous output is saved to Drive, you can use the checkbox so you don't have to type the whole path.
1789
+ init = "default noise" # @param ["default noise", "image", "random image", "salt and pepper noise", "salt and pepper noise on init image"]
1790
+
1791
+ if args2.seed_image is None:
1792
+ init_image = "" # args2.seed_image #""#@param {type:"string"}
1793
+ else:
1794
+ init_image = args2.seed_image # ""#@param {type:"string"}
1795
+
1796
+ if init == "random image":
1797
+ url = (
1798
+ "https://picsum.photos/"
1799
+ + str(width)
1800
+ + "/"
1801
+ + str(height)
1802
+ + "?blur="
1803
+ + str(random.randrange(5, 10))
1804
+ )
1805
+ urllib.request.urlretrieve(url, "Init_Img/Image.png")
1806
+ init_image = "Init_Img/Image.png"
1807
+ elif init == "random image clear":
1808
+ url = "https://source.unsplash.com/random/" + str(width) + "x" + str(height)
1809
+ urllib.request.urlretrieve(url, "Init_Img/Image.png")
1810
+ init_image = "Init_Img/Image.png"
1811
+ elif init == "random image clear 2":
1812
+ url = "https://loremflickr.com/" + str(width) + "/" + str(height)
1813
+ urllib.request.urlretrieve(url, "Init_Img/Image.png")
1814
+ init_image = "Init_Img/Image.png"
1815
+ elif init == "salt and pepper noise":
1816
+ urllib.request.urlretrieve(
1817
+ "https://i.stack.imgur.com/olrL8.png", "Init_Img/Image.png"
1818
+ )
1819
+ import cv2
1820
+
1821
+ img = cv2.imread("Init_Img/Image.png", 0)
1822
+ cv2.imwrite("Init_Img/Image.png", add_noise(img))
1823
+ init_image = "Init_Img/Image.png"
1824
+ elif init == "salt and pepper noise on init image":
1825
+ img = cv2.imread(init_image, 0)
1826
+ cv2.imwrite("Init_Img/Image.png", add_noise(img))
1827
+ init_image = "Init_Img/Image.png"
1828
+ elif init == "perlin noise":
1829
+ # For some reason Colab started crashing from this
1830
+ import noise
1831
+ import numpy as np
1832
+ from PIL import Image
1833
+
1834
+ shape = (width, height)
1835
+ scale = 100
1836
+ octaves = 6
1837
+ persistence = 0.5
1838
+ lacunarity = 2.0
1839
+ seed = np.random.randint(0, 100000)
1840
+ world = np.zeros(shape)
1841
+ for i in range(shape[0]):
1842
+ for j in range(shape[1]):
1843
+ world[i][j] = noise.pnoise2(
1844
+ i / scale,
1845
+ j / scale,
1846
+ octaves=octaves,
1847
+ persistence=persistence,
1848
+ lacunarity=lacunarity,
1849
+ repeatx=1024,
1850
+ repeaty=1024,
1851
+ base=seed,
1852
+ )
1853
+ Image.fromarray(prep_world(world)).convert("L").save("Init_Img/Image.png")
1854
+ init_image = "Init_Img/Image.png"
1855
+ elif init == "black and white":
1856
+ url = "https://www.random.org/bitmaps/?format=png&width=300&height=300&zoom=1"
1857
+ urllib.request.urlretrieve(url, "Init_Img/Image.png")
1858
+ init_image = "Init_Img/Image.png"
1859
+
1860
+ seed = args2.seed # @param {type:"number"}
1861
+ # @markdown >`iterations` excludes the iterations spent during the MSE phase, if it is used. The total number of iterations will be higher if `mse_decay_rate` is more than 0.
1862
+ iterations = args2.iterations # @param {type:"number"}
1863
+ transparent_png = False # @param {type:"boolean"}
1864
+
1865
+ # @markdown <font size="+3">⚠</font> **ADVANCED SETTINGS** <font size="+3">⚠</font>
1866
+ # @markdown ---
1867
+ # @markdown ---
1868
+
1869
+ # @markdown >If you want to make multiple images with different prompts, use this. Separate the prompts for the different images with a `~` (example: `prompt1~prompt2~prompt3`). Iter is the number of iterations you want each image to run for. If you use MSE, I'd type a pretty low number (about 10).
1870
+ multiple_prompt_batches = False # @param {type:"boolean"}
1871
+ multiple_prompt_batches_iter = 300 # @param {type:"number"}
1872
+
1873
+ # @markdown >`folder_name` is the name of the folder you want to output your result(s) to. Previous outputs will NOT be overwritten. By default, it will be saved to the colab's root folder, but the `save_to_drive` checkbox will save it to `MyDrive\VQGAN_Output` instead.
1874
+ folder_name = "" # @param {type:"string"}
1875
+ save_to_drive = False # @param {type:"boolean"}
1876
+ prompt_experiment = "None" # @param ['None', 'Fever Dream', 'Philipuss’s Basement', 'Vivid Turmoil', 'Mad Dad', 'Platinum', 'Negative Energy']
1877
+ if prompt_experiment == "Fever Dream":
1878
+ prompts = "<|startoftext|>" + prompts + "<|endoftext|>"
1879
+ elif prompt_experiment == "Vivid Turmoil":
1880
+ prompts = prompts.replace(" ", "¡")
1881
+ prompts = "¬" + prompts + "®"
1882
+ elif prompt_experiment == "Mad Dad":
1883
+ prompts = prompts.replace(" ", "\\s+")
1884
+ elif prompt_experiment == "Platinum":
1885
+ prompts = "~!" + prompts + "!~"
1886
+ prompts = prompts.replace(" ", "</w>")
1887
+ elif prompt_experiment == "Philipuss’s Basement":
1888
+ prompts = "<|startoftext|>" + prompts
1889
+ prompts = prompts.replace(" ", "<|endoftext|><|startoftext|>")
1890
+ elif prompt_experiment == "Lowercase":
1891
+ prompts = prompts.lower()
1892
+
1893
+
1894
+ # @markdown >Target images work like prompts; write the name of the image file. You can add multiple target images by separating them with a `|`.
1895
+ target_images = "" # @param {type:"string"}
1896
+
1897
+ # @markdown ><font size="+2">☢</font> Advanced values. Values of cut_pow below 1 prioritize structure over detail, and vice versa for above 1. Step_size affects how wild the change between iterations is, and if final_step_size is not 0, step_size will interpolate towards it over time.
1898
+ # @markdown >Cutn affects 'creativity': fewer cutouts lead to more random/creative results, sometimes barely readable, while higher values (90+) lead to very stable, photo-like outputs
1899
+ cutn = 130 # @param {type:"number"}
1900
+ cut_pow = 1 # @param {type:"number"}
1901
+ # @markdown >Step_size is like weirdness. Lower: more accurate/realistic, slower; Higher: less accurate/more funky, faster.
1902
+ step_size = 0.1 # @param {type:"number"}
1903
+ # @markdown >Start_step_size is a temporary step_size that will be active only in the first 10 iterations. It (sometimes) helps with speed. If it's set to 0, it won't be used.
1904
+ start_step_size = 0 # @param {type:"number"}
1905
+ # @markdown >Final_step_size is a goal step_size which the AI will try and reach. If set to 0, it won't be used.
1906
+ final_step_size = 0 # @param {type:"number"}
1907
+ if start_step_size <= 0:
1908
+ start_step_size = step_size
1909
+ if final_step_size <= 0:
1910
+ final_step_size = step_size
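+ # Illustrative sketch of the interpolation applied after the MSE phase
+ # (mirrors the lerp inside ModelHost.train; the numbers are just an example):
+ #   progress = (i - mse_end) / max_iterations      # e.g. (150 - 0) / 300 = 0.5
+ #   lr = step_size + (final_step_size - step_size) * progress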
1911
+
1912
+ # @markdown ---
1913
+
1914
+ # @markdown >EMA maintains a moving average of trained parameters. The number below is the rate of decay (higher means slower).
1915
+ ema_val = 0.98 # @param {type:"number"}
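+ # For intuition, the usual exponential-moving-average update looks like this
+ # (assuming EMATensor follows the standard formula; ema_val = 0.98 means the
+ # average moves slowly):
+ #   average = ema_val * average + (1 - ema_val) * current_value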
1916
+
1917
+ # @markdown >If you want to keep starting from the same point, set `gen_seed` to a positive number. `-1` will make it random every time.
1918
+ gen_seed = -1 # @param {type:'number'}
1919
+
1920
+ init_image_in_drive = False # @param {type:"boolean"}
1921
+ if init_image_in_drive and init_image:
1922
+ init_image = "/content/drive/MyDrive/VQGAN_Output/" + init_image
1923
+
1924
+ images_interval = args2.update # @param {type:"number"}
1925
+
1926
+ # I think you should give "Free Thoughts on the Proceedings of the Continental Congress" a read, really funny and actually well-written, Hamilton presented it in a bad light IMO.
1927
+
1928
+ batch_size = 1 # @param {type:"number"}
1929
+
1930
+ # @markdown ---
1931
+
1932
+ # @markdown <font size="+1">🔮</font> **MSE Regulization** <font size="+1">🔮</font>
1933
+ # Based off of this notebook: https://colab.research.google.com/drive/1gFn9u3oPOgsNzJWEFmdK-N9h_y65b8fj?usp=sharing - already in credits
1934
+ use_mse = args2.mse # @param {type:"boolean"}
1935
+ mse_images_interval = images_interval
1936
+ mse_init_weight = 0.2 # @param {type:"number"}
1937
+ mse_decay_rate = 160 # @param {type:"number"}
1938
+ mse_epoches = 10 # @param {type:"number"}
1939
+ ##@param {type:"number"}
1940
+
1941
+ # @markdown >Overwrites the usual values during the mse phase if included. If any value is 0, its normal counterpart is used instead.
1942
+ mse_with_zeros = True # @param {type:"boolean"}
1943
+ mse_step_size = 0.87 # @param {type:"number"}
1944
+ mse_cutn = 42 # @param {type:"number"}
1945
+ mse_cut_pow = 0.75 # @param {type:"number"}
1946
+
1947
+ # @markdown >normal_flip_optim flips between two optimizers during the normal (not MSE) phase. It can improve quality, but it's kind of experimental, use at your own risk.
1948
+ normal_flip_optim = True # @param {type:"boolean"}
1949
+ ##@markdown >Adding some TV may make the image blurrier but also helps to get rid of noise. A good value to try might be 0.1.
1950
+ # tv_weight = 0.1 #@param {type:'number'}
1951
+ # @markdown ---
1952
+
1953
+ # @markdown >`altprompts` is a set of prompts that take in a different augmentation pipeline, and can have their own cut_pow. At the moment, the default "alt augment" settings flip the picture cutouts upside down before evaluating. This can be good for optical illusion images. If either cut_pow value is 0, it will use the same value as the normal prompts.
1954
+ altprompts = "" # @param {type:"string"}
1955
+ altprompt_mode = "flipped"
1956
+ ##@param ["normal" , "flipped", "sideways"]
1957
+ alt_cut_pow = 0 # @param {type:"number"}
1958
+ alt_mse_cut_pow = 0 # @param {type:"number"}
1959
+ # altprompt_type = "upside-down" #@param ['upside-down', 'as']
1960
+
1961
+ ##@markdown ---
1962
+ ##@markdown <font size="+1">💫</font> **Zooming and Moving** <font size="+1">💫</font>
1963
+ zoom = False
1964
+ ##@param {type:"boolean"}
1965
+ zoom_speed = 100
1966
+ ##@param {type:"number"}
1967
+ zoom_frequency = 20
1968
+ ##@param {type:"number"}
1969
+
1970
+ # @markdown ---
1971
+ # @markdown On an unrelated note, if you get any errors while running this, restart the runtime and run the first cell again. If that doesn't work either, message me on Discord (Philipuss#4066).
1972
+
1973
+ model_names = {
1974
+ "vqgan_imagenet_f16_16384": "vqgan_imagenet_f16_16384",
1975
+ "ImageNet 1024": "vqgan_imagenet_f16_1024",
1976
+ "Gumbel 8192": "gumbel_8192",
1977
+ "Sber Gumbel": "sber_gumbel",
1978
+ "imagenet_cin": "imagenet_cin",
1979
+ "WikiArt 1024": "wikiart_1024",
1980
+ "WikiArt 16384": "wikiart_16384",
1981
+ "COCO-Stuff": "coco",
1982
+ "FacesHQ": "faceshq",
1983
+ "S-FLCKR": "sflckr",
1984
+ "WikiArt 7mil": "wikiart_7mil",
1985
+ "COCO 1 Stage": "coco_1stage",
1986
+ }
1987
+
1988
+ if template == "Better - Fast":
1989
+ prompts = prompts + ". Detailed artwork. ArtStationHQ. unreal engine. 4K HD."
1990
+ elif template == "Better - Slow":
1991
+ prompts = (
1992
+ prompts
1993
+ + ". Detailed artwork. Trending on ArtStation. unreal engine. | Rendered in Maya. "
1994
+ + prompts
1995
+ + ". 4K HD."
1996
+ )
1997
+ elif template == "Movie Poster":
1998
+ prompts = prompts + ". Movie poster. Rendered in unreal engine. ArtStationHQ."
1999
+ width = 400
2000
+ height = 592
2001
+ elif template == "flag":
2002
+ prompts = (
2003
+ "A photo of a flag of the country "
2004
+ + prompts
2005
+ + " | Flag of "
2006
+ + prompts
2007
+ + ". White background."
2008
+ )
2009
+ # import cv2
2010
+ # img = cv2.imread('templates/flag.png', 0)
2011
+ # cv2.imwrite('templates/final_flag.png', add_noise(img))
2012
+ init_image = "templates/flag.png"
2013
+ transparent_png = True
2014
+ elif template == "planet":
2015
+ import cv2
2016
+
2017
+ img = cv2.imread("templates/planet.png", 0)
2018
+ cv2.imwrite("templates/final_planet.png", add_noise(img))
2019
+ prompts = (
2020
+ "A photo of the planet "
2021
+ + prompts
2022
+ + ". Planet in the middle with black background. | The planet of "
2023
+ + prompts
2024
+ + ". Photo of a planet. Black background. Trending on ArtStation. | Colorful."
2025
+ )
2026
+ init_image = "templates/final_planet.png"
2027
+ elif template == "creature":
2028
+ # import cv2
2029
+ # img = cv2.imread('templates/planet.png', 0)
2030
+ # cv2.imwrite('templates/final_planet.png', add_noise(img))
2031
+ prompts = (
2032
+ "A photo of a creature with "
2033
+ + prompts
2034
+ + ". Animal in the middle with white background. | The creature has "
2035
+ + prompts
2036
+ + ". Photo of a creature/animal. White background. Detailed image of a creature. | White background."
2037
+ )
2038
+ init_image = "templates/creature.png"
2039
+ # transparent_png = True
2040
+ elif template == "Detailed":
2041
+ prompts = (
2042
+ prompts
2043
+ + ", by Puer Udger. Detailed artwork, trending on artstation. 4K HD, realism."
2044
+ )
2045
+ flavor = "cumin"
2046
+ elif template == "human":
2047
+ init_image = "/content/templates/human.png"
2048
+ elif template == "Realistic":
2049
+ cutn = 200
2050
+ step_size = 0.03
2051
+ cut_pow = 0.2
2052
+ flavor = "holywater"
2053
+ elif template == "Consistent Creativity":
2054
+ flavor = "cumin"
2055
+ cut_pow = 0.01
2056
+ cutn = 136
2057
+ step_size = 0.08
2058
+ mse_step_size = 0.41
2059
+ mse_cut_pow = 0.3
2060
+ ema_val = 0.99
2061
+ normal_flip_optim = False
2062
+ elif template == "Smooth":
2063
+ flavor = "wyvern"
2064
+ step_size = 0.10
2065
+ cutn = 120
2066
+ normal_flip_optim = False
2067
+ tv_weight = 10
2068
+ elif template == "Subtle MSE":
2069
+ mse_init_weight = 0.07
2070
+ mse_decay_rate = 130
2071
+ mse_step_size = 0.2
2072
+ mse_cutn = 100
2073
+ mse_cut_pow = 0.6
2074
+ elif template == "Balanced":
2075
+ cutn = 130
2076
+ cut_pow = 1
2077
+ step_size = 0.16
2078
+ final_step_size = 0
2079
+ ema_val = 0.98
2080
+ mse_init_weight = 0.2
2081
+ mse_decay_rate = 130
2082
+ mse_with_zeros = True
2083
+ mse_step_size = 0.9
2084
+ mse_cutn = 50
2085
+ mse_cut_pow = 0.8
2086
+ normal_flip_optim = True
2087
+ elif template == "Size: Square":
2088
+ width = 450
2089
+ height = 450
2090
+ elif template == "Size: Landscape":
2091
+ width = 480
2092
+ height = 336
2093
+ elif template == "Size: Poster":
2094
+ width = 336
2095
+ height = 480
2096
+ elif template == "Negative Prompt":
2097
+ prompts = prompts.replace(":", ":-")
2098
+ prompts = prompts.replace(":--", ":")
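+ # The two replaces above flip the sign of every prompt weight, e.g.
+ # "castle:1 | blurry:-1" becomes "castle:-1 | blurry:1".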
2099
+ elif template == "Hyper Fast Results":
2100
+ step_size = 1
2101
+ ema_val = 0.3
2102
+ cutn = 30
2103
+ elif template == "Better Quality":
2104
+ prompts = (
2105
+ prompts + ":1 | Watermark, blurry, cropped, confusing, cut, incoherent:-1"
2106
+ )
2107
+
2108
+ mse_decay = 0
2109
+
2110
+ if not use_mse:
2111
+ mse_init_weight = 0.0
2112
+ else:
2113
+ mse_decay = mse_init_weight / mse_epoches
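+ # Worked example with the defaults above: 0.2 / 10 = 0.02, so the MSE weight
+ # drops by 0.02 every mse_decay_rate (160) iterations until it reaches 0.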
2114
+
2115
+
2116
+ if seed == -1:
2117
+ seed = None
2118
+ if init_image == "None":
2119
+ init_image = None
2120
+ if target_images == "None" or not target_images:
2121
+ target_images = []
2122
+ else:
2123
+ target_images = target_images.split("|")
2124
+ target_images = [image.strip() for image in target_images]
2125
+
2126
+ prompts = [phrase.strip() for phrase in prompts.split("|")]
2127
+ if prompts == [""]:
2128
+ prompts = []
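+ # Example of the split above: "the milky way:1 | oil on canvas:0.5" becomes
+ # ["the milky way:1", "oil on canvas:0.5"]; an empty string becomes [].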
2129
+
2130
+ altprompts = [phrase.strip() for phrase in altprompts.split("|")]
2131
+ if altprompts == [""]:
2132
+ altprompts = []
2133
+
2134
+ if mse_images_interval == 0:
2135
+ mse_images_interval = images_interval
2136
+ if mse_step_size == 0:
2137
+ mse_step_size = step_size
2138
+ if mse_cutn == 0:
2139
+ mse_cutn = cutn
2140
+ if mse_cut_pow == 0:
2141
+ mse_cut_pow = cut_pow
2142
+ if alt_cut_pow == 0:
2143
+ alt_cut_pow = cut_pow
2144
+ if alt_mse_cut_pow == 0:
2145
+ alt_mse_cut_pow = mse_cut_pow
2146
+
2147
+ augs = nn.Sequential(
2148
+ K.RandomHorizontalFlip(p=0.5),
2149
+ K.RandomSharpness(0.3, p=0.4),
2150
+ K.RandomGaussianBlur((3, 3), (4.5, 4.5), p=0.3),
2151
+ # K.RandomGaussianNoise(p=0.5),
2152
+ # K.RandomElasticTransform(kernel_size=(33, 33), sigma=(7,7), p=0.2),
2153
+ K.RandomAffine(
2154
+ degrees=30, translate=0.1, p=0.8, padding_mode="border"
2155
+ ), # padding_mode=2
2156
+ K.RandomPerspective(
2157
+ 0.2,
2158
+ p=0.4,
2159
+ ),
2160
+ K.ColorJitter(hue=0.01, saturation=0.01, p=0.7),
2161
+ K.RandomGrayscale(p=0.1),
2162
+ )
2163
+
2164
+ if altprompt_mode == "normal":
2165
+ altaugs = nn.Sequential(
2166
+ K.RandomRotation(degrees=90.0, return_transform=True),
2167
+ K.RandomHorizontalFlip(p=0.5),
2168
+ K.RandomSharpness(0.3, p=0.4),
2169
+ K.RandomGaussianBlur((3, 3), (4.5, 4.5), p=0.3),
2170
+ # K.RandomGaussianNoise(p=0.5),
2171
+ # K.RandomElasticTransform(kernel_size=(33, 33), sigma=(7,7), p=0.2),
2172
+ K.RandomAffine(
2173
+ degrees=30, translate=0.1, p=0.8, padding_mode="border"
2174
+ ), # padding_mode=2
2175
+ K.RandomPerspective(
2176
+ 0.2,
2177
+ p=0.4,
2178
+ ),
2179
+ K.ColorJitter(hue=0.01, saturation=0.01, p=0.7),
2180
+ K.RandomGrayscale(p=0.1),
2181
+ )
2182
+ elif altprompt_mode == "flipped":
2183
+ altaugs = nn.Sequential(
2184
+ K.RandomHorizontalFlip(p=0.5),
2185
+ # K.RandomRotation(degrees=90.0),
2186
+ K.RandomVerticalFlip(p=1),
2187
+ K.RandomSharpness(0.3, p=0.4),
2188
+ K.RandomGaussianBlur((3, 3), (4.5, 4.5), p=0.3),
2189
+ # K.RandomGaussianNoise(p=0.5),
2190
+ # K.RandomElasticTransform(kernel_size=(33, 33), sigma=(7,7), p=0.2),
2191
+ K.RandomAffine(
2192
+ degrees=30, translate=0.1, p=0.8, padding_mode="border"
2193
+ ), # padding_mode=2
2194
+ K.RandomPerspective(
2195
+ 0.2,
2196
+ p=0.4,
2197
+ ),
2198
+ K.ColorJitter(hue=0.01, saturation=0.01, p=0.7),
2199
+ K.RandomGrayscale(p=0.1),
2200
+ )
2201
+ elif altprompt_mode == "sideways":
2202
+ altaugs = nn.Sequential(
2203
+ K.RandomHorizontalFlip(p=0.5),
2204
+ # K.RandomRotation(degrees=90.0),
2205
+ K.RandomVerticalFlip(p=1),
2206
+ K.RandomSharpness(0.3, p=0.4),
2207
+ K.RandomGaussianBlur((3, 3), (4.5, 4.5), p=0.3),
2208
+ # K.RandomGaussianNoise(p=0.5),
2209
+ # K.RandomElasticTransform(kernel_size=(33, 33), sigma=(7,7), p=0.2),
2210
+ K.RandomAffine(
2211
+ degrees=30, translate=0.1, p=0.8, padding_mode="border"
2212
+ ), # padding_mode=2
2213
+ K.RandomPerspective(
2214
+ 0.2,
2215
+ p=0.4,
2216
+ ),
2217
+ K.ColorJitter(hue=0.01, saturation=0.01, p=0.7),
2218
+ K.RandomGrayscale(p=0.1),
2219
+ )
2220
+
2221
+ if multiple_prompt_batches:
2222
+ prompts_all = str(prompts).split("~")
2223
+ else:
2224
+ prompts_all = prompts
2225
+ multiple_prompt_batches_iter = iterations
2226
+
2227
+ if multiple_prompt_batches:
2228
+ mtpl_prmpts_btchs = len(prompts_all)
2229
+ else:
2230
+ mtpl_prmpts_btchs = 1
2231
+
2232
+ # print(mtpl_prmpts_btchs)
2233
+
2234
+ steps_path = "./"
2235
+ zoom_path = "./"
2236
+
2237
+ path = "./"
2238
+
2239
+ iterations = multiple_prompt_batches_iter
2240
+
2241
+ for pr in range(0, mtpl_prmpts_btchs):
2242
+ # print(prompts_all[pr].replace('[\'', '').replace('\']', ''))
2243
+ if multiple_prompt_batches:
2244
+ prompts = prompts_all[pr].replace("['", "").replace("']", "")
2245
+
2246
+ if zoom:
2247
+ mdf_iter = round(iterations / zoom_frequency)
2248
+ else:
2249
+ mdf_iter = 2
2250
+ zoom_frequency = iterations
2251
+
2252
+ for iter in range(1, mdf_iter):
2253
+ if zoom:
2254
+ if iter != 0:
2255
+ image = Image.open("progress.png")
2256
+ area = (0, 0, width - zoom_speed, height - zoom_speed)
2257
+ cropped_img = image.crop(area)
2258
+ cropped_img.show()
2259
+
2260
+ new_image = cropped_img.resize((width, height))
2261
+ new_image.save("zoom.png")
2262
+ init_image = "zoom.png"
2263
+
2264
+ args = argparse.Namespace(
2265
+ prompts=prompts,
2266
+ altprompts=altprompts,
2267
+ image_prompts=target_images,
2268
+ noise_prompt_seeds=[],
2269
+ noise_prompt_weights=[],
2270
+ size=[width, height],
2271
+ init_image=init_image,
2272
+ png=transparent_png,
2273
+ init_weight=mse_init_weight,
2274
+ #vqgan_model=model_names[model],
2275
+ step_size=step_size,
2276
+ start_step_size=start_step_size,
2277
+ final_step_size=final_step_size,
2278
+ cutn=cutn,
2279
+ cut_pow=cut_pow,
2280
+ mse_cutn=mse_cutn,
2281
+ mse_cut_pow=mse_cut_pow,
2282
+ mse_step_size=mse_step_size,
2283
+ display_freq=images_interval,
2284
+ mse_display_freq=mse_images_interval,
2285
+ max_iterations=zoom_frequency,
2286
+ mse_end=0,
2287
+ seed=seed,
2288
+ folder_name=folder_name,
2289
+ save_to_drive=save_to_drive,
2290
+ mse_decay_rate=mse_decay_rate,
2291
+ mse_decay=mse_decay,
2292
+ mse_with_zeros=mse_with_zeros,
2293
+ normal_flip_optim=normal_flip_optim,
2294
+ ema_val=ema_val,
2295
+ augs=augs,
2296
+ altaugs=altaugs,
2297
+ alt_cut_pow=alt_cut_pow,
2298
+ alt_mse_cut_pow=alt_mse_cut_pow,
2299
+ is_gumbel=is_gumbel,
2300
+ gen_seed=gen_seed,
2301
+ )
2302
+ mh = ModelHost(args)
2303
+ x = 0
2304
+
2305
+ #for x in range(batch_size):
2306
+ mh.setup_model(x)
2307
+ mh.run(x)
2308
+ image_data = Image.open(args2.image_file)
2309
+ os.remove(args2.image_file)
2310
+ return(image_data)
2311
+ #return(last_iter)
2312
+ #x = x + 1
2313
+
2314
+ if zoom:
2315
+ files = os.listdir(steps_path)
2316
+ for index, file in enumerate(files):
2317
+ os.rename(
2318
+ os.path.join(steps_path, file),
2319
+ os.path.join(
2320
+ steps_path,
2321
+ "".join([str(index + 1 + zoom_frequency * iter), ".png"]),
2322
+ ),
2323
+ )
2324
+ index = index + 1
2325
+
2326
+ from pathlib import Path
2327
+ import shutil
2328
+
2329
+ src_path = steps_path
2330
+ trg_path = zoom_path
2331
+
2332
+ for src_file in range(1, mdf_iter):
2333
+ shutil.move(os.path.join(src_path, str(src_file) + ".png"), trg_path)  # frames were renamed to "<n>.png" above
2334
+
2335
+ ##################### START GRADIO HERE ############################
2336
+ image = gr.outputs.Image(type="pil", label="Your result")
2337
+ #def cvt_2_base64(file_name):
2338
+ # with open(file_name , "rb") as image_file :
2339
+ # data = base64.b64encode(image_file.read())
2340
+ # return data.decode('utf-8')
2341
+ #base64image = "data:image/jpg;base64,"+cvt_2_base64('flavors.jpg')
2342
+ #markdown = gr.Markdown("<img src='"+base64image+"' />")
2343
+ #def test(raw_input):
2344
+ # pass
2345
+ #setattr(markdown, "requires_permissions", False)
2346
+ #setattr(markdown, "label", "Flavors")
2347
+ #setattr(markdown, "preprocess", test)
2348
+ iface = gr.Interface(
2349
+ fn=run_all,
2350
+ inputs=[
2351
+ gr.inputs.Textbox(label="Prompt - try adding modifiers to your prompt such as 'oil on canvas', 'a painting', 'a book cover'",default="the milky way in a milk bottle"),
2352
+ gr.inputs.Slider(label="Width", default=256, minimum=32, step=32, maximum=512),
2353
+ gr.inputs.Slider(label="Height", default=256, minimum=32, step=32, maximum=512),
2354
+ gr.inputs.Dropdown(label="Style - Hyper Fast Results is fast but compromises a bit of the quality",choices=["Default","Balanced","Detailed","Consistent Creativity","Realistic","Smooth","Subtle MSE","Hyper Fast Results"],default="Hyper Fast Results"),
2355
+ gr.inputs.Slider(label="Steps - more steps can increase quality but will take longer to generate. All styles that are not Hyper Fast need at least 200 steps",default=50,maximum=300,minimum=1,step=1),
2356
+ gr.inputs.Dropdown(label="Flavor - pick a flavor for the style of the images",choices=["ginger", "cumin", "holywater", "zynth", "wyvern", "aaron", "moth", "juu"]),
2357
+ #markdown
2358
+ ],
2359
+ outputs=image,
2360
+ title="Generate images from text with VQGAN+CLIP (Hypertron v2)",
2361
+ description="<div>By typing a prompt and pressing submit you can generate images based on this prompt. <a href='https://arxiv.org/abs/2204.08583' target='_blank'>VQGAN+CLIP</a> is a combination of a GAN and CLIP, as explained in the linked paper. This approach inaugurated the open source AI art scene, and the Hypertron v2 implementation compiles many improvements.<br>This Spaces UI for the model was assembled by <a style='color: rgb(99, 102, 241);font-weight:bold' href='https://twitter.com/multimodalart' target='_blank'>@multimodalart</a>, keep up with the <a style='color: rgb(99, 102, 241);' href='https://multimodal.art/news' target='_blank'>latest multimodal ai art news here</a> and consider <a style='color: rgb(99, 102, 241);' href='https://www.patreon.com/multimodalart' target='_blank'>supporting us on Patreon</a>",
2362
+ article="<h4 style='font-size: 110%;margin-top:.5em'>Biases acknowledgment</h4><div>Despite how impressive being able to turn text into image is, be aware that this model may output content that reinforces or exacerbates societal biases. According to the <a href='https://arxiv.org/abs/2112.10752' target='_blank'>Latent Diffusion paper</a>:<i> \"Deep learning modules tend to reproduce or exacerbate biases that are already present in the data\"</i>. The model was trained on both the ImageNet dataset and an undisclosed dataset by OpenAI.</div><h4 style='font-size: 110%;margin-top:1em'>Who owns the images produced by this demo?</h4><div>Definitely not me! Probably you do. I say probably because the copyright discussion about AI generated art is ongoing. So <a href='https://www.theverge.com/2022/2/21/22944335/us-copyright-office-reject-ai-generated-art-recent-entrance-to-paradise' target='_blank'>it may be the case that everything produced here falls automatically into the public domain</a>. But in any case it is either yours or is in the public domain.</div>"
2363
+ )
2364
+ iface.launch(enable_queue=True)
flavors.jpg ADDED
requirements.txt ADDED
@@ -0,0 +1,29 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ -e git+https://github.com/CompVis/taming-transformers.git#egg=taming-transformers
2
+ -e git+https://github.com/openai/CLIP/#egg=CLIP
3
+ gitpython
4
+ ftfy
5
+ regex
6
+ pandas
7
+ omegaconf
8
+ pytorch-lightning
9
+ torch-fidelity
10
+ transformers
11
+ einops
12
+ noise
13
+ gputil
14
+ gradio
15
+ torch
16
+ numpy
17
+ tqdm
18
+ torchvision
19
+ Pillow
20
+ autokeras
21
+ huggingface_hub
22
+ kornia
23
+ imageio
24
+ pathvalidate
25
+ stegano
26
+ imgtag
27
+ timm
28
+ python-xmp-toolkit
29
+ shortuuid
taming/modules/autoencoder/lpips/vgg.pth ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a78928a0af1e5f0fcb1f3b9e8f8c3a2a5a3de244d830ad5c1feddc79b8432868
3
+ size 7289