A new idea: style browsing library

#22
by andykoko - opened

I recently found an artist-style prompt library. It works well, does not require complex prompts, and is suitable for most mainstream models.
Project website: https://github.com/SupaGruen/StableDiffusion-CheatSheet
I think it would be great to integrate its database into Guernika.app.

It uses a simple JSON format to store data, like this:

{"Type":"1","Name":"Abbey, Edwin Austin","Born":"1852","Death":"1911","Prompt":"style of Edwin Austin Abbey","NPrompt":"","Category":"Illustration, Painting, Oil, Pastel, Ink, USA, 19th Century","Checkpoint":"Deliberate 2.0","Extrainfo":"","Image":"Edwin-Austin-Abbey.webp","Creation":"202306200852"}
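
A minimal sketch of how a record in this format could be loaded and filtered by category. The `Tags` key and the helper names `parse_style` and `filter_by_tag` are assumptions made for illustration; they are not part of the cheat sheet project or of Guernika.

```python
import json

# One record, copied from the example above.
record_json = '{"Type":"1","Name":"Abbey, Edwin Austin","Born":"1852","Death":"1911","Prompt":"style of Edwin Austin Abbey","NPrompt":"","Category":"Illustration, Painting, Oil, Pastel, Ink, USA, 19th Century","Checkpoint":"Deliberate 2.0","Extrainfo":"","Image":"Edwin-Austin-Abbey.webp","Creation":"202306200852"}'


def parse_style(raw: str) -> dict:
    """Parse one record and split its comma-separated Category into tags."""
    style = json.loads(raw)
    # "Tags" is a hypothetical extra key added here for easier filtering.
    style["Tags"] = [tag.strip() for tag in style["Category"].split(",") if tag.strip()]
    return style


def filter_by_tag(styles: list[dict], tag: str) -> list[dict]:
    """Return the styles whose tags contain `tag` (case-insensitive)."""
    wanted = tag.lower()
    return [s for s in styles if wanted in (t.lower() for t in s["Tags"])]


styles = [parse_style(record_json)]
matches = filter_by_tag(styles, "oil")
print([s["Prompt"] for s in matches])
```

Splitting `Category` into tags up front would let a style browser filter by medium, country, or period without re-parsing the string on every search.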

I made a simple interface mockup, and I hope it can be adopted if possible.

1.jpg

Guernika org

Hey @andykoko! Sorry for the late response. I have been working on a new update that will, hopefully, improve the creation UI in Guernika and add support for Stable Diffusion XL :D and multiple ControlNets. Here are a couple of images in case you want a sneak peek:

Screenshot 2023-07-12 at 20.40.37.png
Screenshot 2023-07-12 at 20.45.21.png

I would love your feedback on this and any other suggestions you have for this screen. I wanna focus on this one at the moment and improve the collection manager in a different update.

When I first saw this idea I thought this might not work, because some artists/styles may not be available in every model and it could be frustrating if you try to use one and it's not working. But the data does seem really great, and at worst it can give people some inspiration on what to try.

The thing is, would it make sense to be able to access this directly from the Create tab instead? I'm not sure how but maybe that would be the best way to work on a prompt, maybe this could take over the left side of the screen.

If not, maybe this could be a companion app or do you think it's worth it to package it together?

@GuiyeC
The new function is great. I saw a button under the prompt that looks like it opens the prompt history?
Maybe you could put a style button next to it to support SDXL.

5Jj8wTkqVDjcS_QC8vYqQ.jpg

Guernika org

That will be the prompt history yes, it will show the last 10 prompts used, and below the field is a row with the most used "keywords".
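
As a rough sketch of the behavior described above (the last 10 prompts plus a ranking of the most used keywords), something like the following would do. The `PromptHistory` class is hypothetical, and treating keywords as comma-separated pieces of the prompt is an assumption; this is not Guernika's actual code.

```python
from collections import Counter, deque


class PromptHistory:
    """Keeps the most recent prompts and counts comma-separated keywords."""

    def __init__(self, max_prompts: int = 10) -> None:
        # A bounded deque drops the oldest prompt automatically.
        self.recent = deque(maxlen=max_prompts)
        self.keyword_counts = Counter()

    def add(self, prompt: str) -> None:
        self.recent.append(prompt)
        # Assumption: a "keyword" is one comma-separated piece of the prompt.
        keywords = [k.strip().lower() for k in prompt.split(",")]
        self.keyword_counts.update(k for k in keywords if k)

    def last_prompts(self) -> list[str]:
        """Most recent prompt first."""
        return list(reversed(self.recent))

    def top_keywords(self, n: int = 5) -> list[str]:
        return [k for k, _ in self.keyword_counts.most_common(n)]


history = PromptHistory()
for i in range(12):
    history.add(f"photographic, garden, subject {i}")

print(len(history.last_prompts()))
print(history.top_keywords(2))
```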

The problem with that is that the style library is huge! Maybe having that and a separate tab to explore it 🤔 I first have to explore what data is actually available.

@GuiyeC For multiple ControlNets, I think it is necessary to provide a way to adjust the scale and position of the input images. For example, I want to make the walking characters in the sample smaller than the cat. That way, multiple input pictures can be combined better.

I'm a photographer, and I'm looking forward to features like an infinite canvas that integrates inpainting and outpainting. That is where the real productivity gain is (as in the latest version of PS).

Guernika org

@andykoko that would be the goal. My idea is to have a new "Canvas" tab where you just run whatever model you want wherever you want and then export the result. It won't be in this update, but hopefully not long from now.

Guernika 6 should be out now with a few nice things, I'll keep working on it!

@GuiyeC Thank you for your work. Once again, you are ahead of Apple's official implementation. 👍

I briefly tried version 6.0.

Bug:

  1. Whenever the window height is adjusted, the app crashes, every time.
  2. The new painting function is great. It supports all models and does not affect the unselected area, but it does not automatically crop input pictures that do not match the required size. Instead, the following error is shown:

For input feature 'z', the provided shape 1 × 3 × 1536 × 1024 is not compatible with the model's feature description.

Suggestion:

2023-07-18 19.09.45.png

  1. The settings area UI of version 5.0 seems neater and more attractive.

  2. apple/ml-stable-diffusion has a new PR that seems to denoise the preview picture without adding time.
    https://github.com/apple/ml-stable-diffusion/pull/210

Other:

I used the latest GuernikaModelConverter 5.0 to convert SDXL 0.9 Base, but with the same number of steps, resolution and keywords, the generated pictures are not on the same level as those from dreamstudio.ai, and I don't know why. Photographic keywords are interpreted as comics, the picture quality is very poor, and there are many mosaic artifacts.

2023-07-18 20.10.47.jpg
2023-07-18 20.18.42.jpg

Update

I found that the random seeds used by dreamstudio.ai are basically 6 digits, so I set the seeds of Guernika to 6 digits and got similar pictures, but the picture quality is still very poor.

IMG_00010.png

Guernika org
edited Jul 18, 2023

@andykoko thanks for the response :)

Bugs:

  • Do you mean the main window? What tab are you in when it crashes? I'm not able to reproduce this. I will check whether I got a crash report from Apple.
  • This should also be working. I will take a look at models with different sizes. Does img2img work for you with that model?

Suggestions:

  • How would you arrange it? do you have anything in mind?
  • This will be in the next update, I had to add it to a few more samplers

Stable Diffusion XL vs DreamStudio:

As far as I can tell the outputs of the Python implementation and Guernika are pretty similar. Here is the same image generated with both; they look almost the same:

IMG_00198.png

SDXL_girl.png

This is using the inputs I saw in your picture with the PNDM sampler. Some samplers generate bigger differences; I will try to take a look at those.

It could be that DreamStudio already has an updated model or that they are running the refiner on every generation. The refiner model should work in Guernika too as a basic img2img model. It could also have something to do with the sampler being used. Is that configurable in DreamStudio?

@GuiyeC Img2img automatically crops input images with the wrong resolution, but inpaint does not crop automatically and reports an error.
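
For reference, the automatic cropping asked for here could be a centered crop to the model's aspect ratio followed by a resize. Below is a minimal sketch of the crop-box arithmetic only. The function name and the 1024 × 1024 target are assumptions, since the thread does not state the size the model expects or how Guernika crops for img2img.

```python
def center_crop_box(width: int, height: int,
                    target_width: int, target_height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered region
    of a width x height image that has the target aspect ratio."""
    if min(width, height, target_width, target_height) <= 0:
        raise ValueError("all dimensions must be positive")
    # Compare aspect ratios by cross-multiplying to stay in integers.
    if width * target_height > height * target_width:
        # Source is too wide: keep the full height and trim the sides.
        crop_w = height * target_width // target_height
        crop_h = height
    else:
        # Source is too tall (or already matches): keep the full width.
        crop_w = width
        crop_h = width * target_height // target_width
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


# Hypothetical case: a 1024 wide, 1536 tall input for a 1024 x 1024 model.
print(center_crop_box(1024, 1536, 1024, 1024))
```

The cropped region would still need to be resized to the exact target size when the source is smaller or larger than the model input.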

here is a crash video:

@GuiyeC It seems to be a problem with the sampler. PNDM gets good results, while the picture quality of DDIM and DPM-Solver++ is very poor. It seems that PNDM requires 40-50 steps.

The following is the picture from my new test (no refiner). The picture quality and color are quite good. I think 6-digit random numbers are more suitable as SDXL seeds. You can test it.
prompt: Photographic, Beautiful girl standing in the garden
negative prompt: blurry, grainy, low-resolution
steps: 40
seed: 810145
guidance scale: 5

IMG_00015.png

After testing, DPM-Solver++ works normally on sd_xl_refiner, but not on sd_xl_base. Even with more steps, the picture quality is not normal. Here are some simple tests:

1.png
2.png

@GuiyeC I noticed that Guernika v6 shows the remaining time in two places, but the estimate has been very inaccurate, especially for img2img, so it is not useful as a reference. I think it would be better to replace it with the elapsed duration?

That would make it easy to analyze the impact of different parameter settings on the generation time.
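
To illustrate the kind of comparison meant here, a small sketch that records the elapsed duration per run. `fake_generate` is a hypothetical stand-in for a real generation call, not a Guernika function.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def fake_generate(steps: int) -> None:
    """Hypothetical stand-in for an image generation call."""
    for _ in range(steps):
        time.sleep(0.001)


durations: dict[str, float] = {}
for steps in (10, 20):
    with timed(f"{steps} steps", durations):
        fake_generate(steps)

for label, seconds in durations.items():
    print(f"{label}: {seconds:.3f} s")
```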

@GuiyeC
I made an Apple Shortcut that generates Stable Diffusion XL styles. If there is no plan to add style presets, this is also a good option.
https://github.com/czkoko/SDXL-Style-Presets-shortcut

Guernika org

@andykoko that is awesome! where did you get the list of styles for SDXL?

@GuiyeC I'm not sure if these prompts are official, but at present ComfyUI and InvokeAI use the same method, and according to the SDXL announcement, the official team seems to be making presets in the form of textual inversion.

Guernika org

@andykoko but where is the list of prompts? I could easily add that next to the prompt history button

Guernika org

I found this

@GuiyeC This is the list. Some of the prompt words are actually unnecessary; the corresponding style can still be achieved after deleting them.

@GuiyeC I found a project, though I don't know if it can help improve Guernika's performance. DrawThings relies on it to improve its performance.
https://github.com/philipturner/metal-flash-attention

Guernika org

@andykoko how about something like this?
Screenshot 2023-07-29 at 01.25.31.png

That basically takes whatever prompt the user enters and applies the style by using the prompt template and the negative prompt in that list.
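
A minimal sketch of that idea, assuming each style entry has a prompt template with a `{prompt}` placeholder and a negative prompt. The placeholder syntax, the field names, and the example style texts are assumptions made for illustration, not the actual list.

```python
# Hypothetical style entries; the names and texts are made up for illustration.
STYLES = {
    "Photographic": {
        "prompt": "photograph of {prompt}, sharp focus, natural light",
        "negative_prompt": "drawing, painting, cartoon",
    },
    "None": {"prompt": "{prompt}", "negative_prompt": ""},
}


def apply_style(style_name: str, prompt: str, negative_prompt: str = "") -> tuple[str, str]:
    """Insert the user's prompt into the style template and merge the negatives."""
    style = STYLES[style_name]
    # str.replace instead of str.format so braces typed by the user are left alone.
    styled_prompt = style["prompt"].replace("{prompt}", prompt.strip())
    negatives = [n.strip() for n in (style["negative_prompt"], negative_prompt) if n.strip()]
    return styled_prompt, ", ".join(negatives)


positive, negative = apply_style(
    "Photographic", "Beautiful girl standing in the garden", "blurry, grainy"
)
print(positive)
print(negative)
```

Keeping the user's text separate from the template means switching styles never modifies what the user typed.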

Guernika org

About that project, I did see it, but I think they are using a completely different implementation. I will have to take a better look, but I don't think it will be applicable to these models.

πŸ‘ It would be more beautiful if the corresponding icon could be added.
https://github.com/Stability-AI/StableStudio/tree/main/packages/stablestudio-ui/public/presets

πŸ‘ It would be more beautiful if the corresponding icon could be added.
https://github.com/Stability-AI/StableStudio/tree/main/packages/stablestudio-ui/public/presets

Dang, that is dope! Though these aren't really suitable as icons given the fidelity of the image. At small sizes it would be hard to distinguish the difference in styles, meaning the list would have to be pretty big to show visually the differences between each style. Perhaps a preview image showing the style next to the dropdown list might be better, but this would also cause a bit more over complication of the UI.

@Gomeo Yes. For adding icons, I also made a mockup earlier that might be feasible.

A8-2GS8X0iqpD-45gPzca.jpeg

My bad, I see you posted that mockup 16 days ago. That does indeed look and work a lot better.

Guernika org

@andykoko @Gomeo How about this? I added the option to collapse it in case someone doesn't want to have it at all times there.

Screenshot 2023-07-29 at 02.13.16.png

@GuiyeC That was really fast. It looks good. Don't forget to include the style's prompt words in the displayed prompt word count.
I think img2img and ControlNet are better suited to the layout below, as before.

@GuiyeC that works too, though at a glance the mockup by @andykoko allows the user to view all the options at once and choose the one they like. If there end up being lots of styles (say, 40+), that would perhaps work better.

However for what we have right now, that works great and is functioning as intended. I like the ability to expand/collapse that field too; it means the UI will be less visually cluttered / overwhelming. Nice work!

Guernika org

@andykoko what do you mean? Having a row for Img2Img and then a row for each ControlNet?

The problem with that is that you can expand the right column to be able to work on longer prompts comfortably but the images have an aspect ratio that is not suitable for this, I think it makes sense to have them in a row below.

The picture below is what I mean. It's a little similar to the previous V5.0 layout. Of course, the current one is also good.

2023-07-29 08.58.43.jpg

I think it may be because of the display resolution. You are probably using 1080p, where there is no blank area in the middle. I scale from 4K to 2K. Without scaling, it looks like this:

2023-07-29 09.31.48.jpg

Waow, I missed the party, sorry for asking the same thing everywhere else,
@andykoko kind of opened up my eyes with the cheatsheet, and I'm enjoying going back to the basics.

I like the "Style Picker" @GuiyeC added for SDXL,
Could it exist as a similar, customizable picker for SD1.5 models? I'd like to pin multiple tokens to the textbox at once.
A search field would be great in the style picker too !

Those ideas are very cool, makes my Mac shine !
