metadata
title: FLUX.1 Dev ControlNet Union Pro
emoji: 🖼
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: other

FLUX.1-dev ControlNet Union Pro: Advanced Image Generation with Multiple Control Modes

This application generates images with FLUX.1-dev and ControlNet Union Pro. It offers several control modes, each of which constrains generation to the structure or style of a reference image, so users can produce high-quality images that follow a given layout, pose, or tonal pattern.

Key Features

1. Multiple Control Modes

  • Canny: Edge-based control using Canny edge detection
  • Depth: 3D depth information guidance using Depth Anything V2
  • OpenPose: Human pose-based generation
  • Grayscale: Luminance-based control
  • Blur: Gaussian blur for soft guidance
  • Tile: Resolution-independent tiling control
  • LowQuality: Noise-based control for enhancement tasks
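A union ControlNet usually selects the active mode through an integer index passed alongside the control image. The sketch below keeps that mapping in one place. The index values follow the ordering published for FLUX.1-dev ControlNet Union checkpoints (canny 0, tile 1, depth 2, blur 3, pose 4, gray 5, low quality 6); treat them as an assumption and verify them against the model card of the checkpoint you load. `ControlMode` and `mode_index` are illustrative names, not identifiers taken from app.py.

```python
from enum import IntEnum


class ControlMode(IntEnum):
    """Mode indices (assumed; verify against your checkpoint's model card)."""
    CANNY = 0
    TILE = 1
    DEPTH = 2
    BLUR = 3
    OPENPOSE = 4
    GRAYSCALE = 5
    LOW_QUALITY = 6


# Alternative spellings that a UI label might use (hypothetical).
_ALIASES = {"LOWQUALITY": "LOW_QUALITY", "POSE": "OPENPOSE", "GRAY": "GRAYSCALE"}


def mode_index(name: str) -> int:
    """Map a UI label such as 'OpenPose' or 'LowQuality' to its integer index."""
    key = name.strip().replace(" ", "_").upper()
    key = _ALIASES.get(key, key)
    try:
        return int(ControlMode[key])
    except KeyError:
        raise ValueError(f"unknown control mode: {name!r}") from None
```

Keeping the mapping in an enum means a typo in a mode name fails loudly instead of silently selecting the wrong control type.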

2. Flexible Input Options

  • Direct upload of pre-processed control images
  • Automatic extraction of control conditions from reference images
  • Support for various image formats and resolutions
  • Intelligent image resizing and preprocessing
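The README does not say how images are resized. One common approach, sketched here as an assumption, is to cap the longer side and round both sides down to a multiple of 16, since FLUX pipelines generally expect dimensions divisible by 16. The function name `fit_dimensions` and the 1024-pixel cap are hypothetical.

```python
def fit_dimensions(width: int, height: int,
                   max_side: int = 1024, multiple: int = 16) -> tuple[int, int]:
    """Shrink (width, height) so the longer side is at most max_side, keeping
    the aspect ratio, then round each side down to a multiple of `multiple`."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic avoids floating-point rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    # Never return a side smaller than one block.
    new_w = max(multiple, width // multiple * multiple)
    new_h = max(multiple, height // multiple * multiple)
    return new_w, new_h
```

For example, a 1920x1080 reference image maps to 1024x576, and images already under the cap are only snapped to the nearest lower multiple of 16.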

3. Advanced Generation Parameters

  • Control Strength (0.0-1.0): Adjust how strongly the control image influences generation
  • Inference Steps (1-50): Balance between quality and speed
  • Guidance Scale (1-10): Control prompt adherence
  • Seed Control: Reproducible results with manual or random seeds
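The ranges above can be enforced before any GPU work starts. The following is a minimal sketch of such a check; the class name, field names, default values, and the seed upper bound are assumptions for illustration, and only the ranges come from the list above.

```python
import random
from dataclasses import dataclass

MAX_SEED = 2**32 - 1  # assumed upper bound for seeds


@dataclass(frozen=True)
class GenerationParams:
    control_strength: float = 0.7   # assumed default
    inference_steps: int = 24       # assumed default
    guidance_scale: float = 3.5     # assumed default
    seed: int = 42                  # assumed default
    randomize_seed: bool = False

    def __post_init__(self) -> None:
        # Reject values outside the documented slider ranges.
        if not 0.0 <= self.control_strength <= 1.0:
            raise ValueError("control_strength must be within 0.0-1.0")
        if not 1 <= self.inference_steps <= 50:
            raise ValueError("inference_steps must be within 1-50")
        if not 1.0 <= self.guidance_scale <= 10.0:
            raise ValueError("guidance_scale must be within 1-10")
        if not 0 <= self.seed <= MAX_SEED:
            raise ValueError(f"seed must be within 0-{MAX_SEED}")

    def resolved_seed(self) -> int:
        """Return a fresh random seed if requested, else the manual seed."""
        return random.randint(0, MAX_SEED) if self.randomize_seed else self.seed
```

Reusing the value returned by `resolved_seed()` for a later run is what makes a randomly seeded result reproducible.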

4. Technical Architecture

  • Based on FLUX.1-dev diffusion model
  • Multi-ControlNet support for combined control modes
  • Depth Anything V2 (Large) for accurate depth estimation
  • GPU-accelerated processing with CUDA support
  • Memory-optimized with VAE tiling and CPU offloading
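The components listed above could be assembled with the diffusers library roughly as follows. This is a sketch under stated assumptions, not the Space's actual app.py: the ControlNet repository id is a guess at the checkpoint in use, bfloat16 is an assumed dtype, and `load_pipeline` is a hypothetical helper. Running it requires `torch`, `diffusers`, `accelerate`, a CUDA GPU, and access to the gated FLUX.1-dev weights.

```python
BASE_MODEL = "black-forest-labs/FLUX.1-dev"
# Assumed checkpoint; the README does not name the repository it loads.
CONTROLNET_MODEL = "Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro"


def load_pipeline(base_model: str = BASE_MODEL,
                  controlnet_model: str = CONTROLNET_MODEL):
    """Build a FLUX ControlNet pipeline with VAE tiling and CPU offloading."""
    # Imported lazily so this module can be imported without the GPU stack.
    import torch
    from diffusers import (
        FluxControlNetModel,
        FluxControlNetPipeline,
        FluxMultiControlNetModel,
    )

    union = FluxControlNetModel.from_pretrained(
        controlnet_model, torch_dtype=torch.bfloat16
    )
    # The multi-ControlNet wrapper accepts lists of control images and modes.
    controlnet = FluxMultiControlNetModel([union])
    pipe = FluxControlNetPipeline.from_pretrained(
        base_model, controlnet=controlnet, torch_dtype=torch.bfloat16
    )
    pipe.vae.enable_tiling()         # decode large latents tile by tile
    pipe.enable_model_cpu_offload()  # keep idle sub-models in CPU memory
    return pipe
```

With model CPU offloading enabled, the pipeline moves each sub-model to the GPU only while it is needed, which trades some speed for a lower peak memory footprint.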

How It Works

  1. Control Image Input: Either upload a pre-processed control image or let the system extract it from a reference image
  2. Control Mode Selection: Choose the appropriate control type for your use case
  3. Prompt Input: Describe the desired output (defaults to "Highest Quality")
  4. Parameter Tuning: Adjust control strength and generation settings
  5. Generation: The model creates an image following both the prompt and control guidance
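The five steps above can be sketched as a single function that turns UI inputs into keyword arguments for a multi-ControlNet pipeline call. The keyword names follow the diffusers `FluxControlNetPipeline` call signature; `build_request` and `extractor` are hypothetical names, and the images are treated as opaque objects so the sketch stays independent of any imaging library.

```python
from typing import Any, Callable, Optional

DEFAULT_PROMPT = "Highest Quality"  # default stated in step 3


def build_request(prompt: Optional[str], mode_index: int, strength: float,
                  steps: int, guidance: float,
                  control_image: Any = None,
                  reference_image: Any = None,
                  extractor: Optional[Callable[[Any], Any]] = None) -> dict:
    """Resolve the control image and assemble pipeline keyword arguments."""
    # Step 1: prefer an uploaded control image, else extract one.
    if control_image is None:
        if reference_image is None or extractor is None:
            raise ValueError("provide a control image, or a reference image "
                             "together with an extractor")
        control_image = extractor(reference_image)
    return {
        # Step 3: fall back to the default prompt when the field is empty.
        "prompt": (prompt or "").strip() or DEFAULT_PROMPT,
        # Steps 1-2: lists, because a multi-ControlNet takes one entry per control.
        "control_image": [control_image],
        "control_mode": [mode_index],
        # Step 4: tuning parameters.
        "controlnet_conditioning_scale": [strength],
        "num_inference_steps": steps,
        "guidance_scale": guidance,
    }
```

The returned dictionary can be passed to the pipeline with `pipe(**kwargs)`, and the resolved control image in `kwargs["control_image"][0]` is what a UI would display next to the generated result.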

Use Cases

  • Image Enhancement: Use LowQuality mode to enhance degraded images
  • Style Transfer: Apply artistic styles while preserving structure (Canny/Depth)
  • Pose-Guided Generation: Create images with specific human poses
  • Consistent Character Design: Maintain structural consistency across variations
  • Architectural Visualization: Use depth control for accurate spatial representations
  • Texture Synthesis: Tile mode for seamless pattern generation

The interface shows both the generated result and the preprocessed control image, so users can see exactly what the model was conditioned on and refine their control inputs accordingly.

