Size is the limit?

#1
by Gen2-AI - opened

I'm particularly interested in the storyboard generation ability. Right now it seems that, with your model, FLUX can draw multiple correlated scenes onto the same canvas while keeping the story, style, and even the characters consistent, which is pretty fascinating. The limitation is quite obvious as well, though: not all stories consist of only three scenes. If we want to exceed this limit, we either increase the latent size so the model has more room for more scenes, or draw more scenes onto the same-size latent image, which will probably give results that lack detail.
From what I've heard, Black Forest Labs has released a newer model that generates images as large as 4K. So I guess that if the FLUX text encoder allows very long prompts, we might actually be able to generate six or even nine scenes at once on a single canvas, which would be the real power of "in-context" image generation.
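
To put rough numbers on that trade-off, here is a minimal sketch of how many pixels each scene gets when one canvas is split into a grid, and how the latent token count grows with canvas size. The helper names are hypothetical, and the 8x VAE downsampling with 2x2 latent patching is an assumption about the model, not something confirmed in this thread.

```python
def panel_size(canvas_w, canvas_h, cols, rows):
    """Pixels available to each scene when the canvas is split into a cols x rows grid."""
    return canvas_w // cols, canvas_h // rows


def latent_tokens(width, height, downsample=8, patch=2):
    """Approximate number of image tokens the transformer attends over.

    Assumption: the VAE downsamples by `downsample` per side and the latent
    is then grouped into `patch` x `patch` blocks. Adjust for the real model.
    """
    step = downsample * patch
    return (width // step) * (height // step)


# Nine scenes on a 1024x1024 canvas leave about 341x341 pixels per scene.
print(panel_size(1024, 1024, 3, 3))   # (341, 341)

# The same 3x3 grid on a 4096x4096 canvas gives each scene 1365x1365 pixels,
print(panel_size(4096, 4096, 3, 3))   # (1365, 1365)

# but the token count grows 16x under the assumptions above.
print(latent_tokens(1024, 1024))      # 4096
print(latent_tokens(4096, 4096))      # 65536
```

So a bigger canvas keeps per-scene detail, but the sequence the model has to attend over grows with the canvas area, which is where the cost shows up.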

He's just offering everyone an idea; he isn't saying there can only be three.
