---
license: mit
base_model:
  - stabilityai/stable-diffusion-xl-base-1.0
---

- Requires a custom training notebook that will be provided soon.
- Distills SDXL with T5 attention masking, teaching SDXL's CLIP_L and CLIP_G text encoders to expect the T5 attention mask (see the sketch after this list).
- Additional finetuning, interpolation, and distillation are required for full cohesion.
- An ongoing training effort that interpolates T5 into SDXL via a teacher/student process.
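
The exact masking scheme lives in the forthcoming notebook. As a minimal sketch of the idea, the caption is encoded by T5 and the resulting T5 attention mask is passed straight to the CLIP text encoder, so the student learns to honor the same padding pattern. The model IDs, the shared 77-token padding length, and the use of the public CLIP_L checkpoint in place of SDXL's bundled encoder are all illustrative assumptions (CLIP_G would be handled the same way).

```python
import torch
from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5TokenizerFast

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Assumed stand-ins: any T5 variant and the public CLIP_L checkpoint.
t5_tok = T5TokenizerFast.from_pretrained("google/flan-t5-base")
t5_enc = T5EncoderModel.from_pretrained("google/flan-t5-base").to(device)
clip_tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
clip_l = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14").to(device)

caption = "a photo of a red fox in the snow"

# T5 side: hidden states plus the attention mask the student is taught to expect.
t5_in = t5_tok(caption, max_length=77, padding="max_length",
               truncation=True, return_tensors="pt").to(device)
with torch.no_grad():
    t5_states = t5_enc(**t5_in).last_hidden_state

# CLIP side: pad to the same 77 tokens, then substitute the T5 mask for CLIP's own.
clip_in = clip_tok(caption, max_length=77, padding="max_length",
                   truncation=True, return_tensors="pt").to(device)
clip_out = clip_l(input_ids=clip_in.input_ids,
                  attention_mask=t5_in.attention_mask)
```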

Training configuration for the distillation run:

```python
import torch

config = {
    "epochs": 10,
    "batch_size": 64,
    "learning_rate": 1e-6,           # Lower learning rate for stability
    "save_interval_steps": 10,       # Save checkpoint every 10 training steps
    "test_save_interval_steps": 10,  # Save test images every 10 training steps
    "checkpoint_dir": "./checkpoints",       # Full diffusers checkpoint folder
    "compact_model_dir": "./compact_model",  # For final compact model (not used for caching)
    "baseline_test_dir": "./baseline_test",  # For baseline images & captions
    "cache_dir": "./cache",          # Folder for caching T5 outputs and teacher features
    "num_generated_captions": 128,   # Number of captions to generate for training
    "model_id": "stabilityai/stable-diffusion-xl-base-1.0",
    "model_name": "my_interpolative_distillation",  # Folder name for checkpoints
    "seed": 420,
    "device": torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu"),
    "inference_steps": 50,
    "height": 1024,
    "width": 1024,
    "guidance_scale": 7.5,
    "inference_interval": 10,
    "max_caption_length": 512,
    # Batch size for teacher feature caching (set very low to reduce VRAM usage)
    "cache_teacher_batch_size": 64,
}
```
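
For context, here is how these values might plug into a standard diffusers pipeline when producing the baseline images; a minimal sketch rather than the released notebook, with a placeholder caption and the assumption of a CUDA device for fp16.

```python
import os
import torch
from diffusers import StableDiffusionXLPipeline

os.makedirs(config["baseline_test_dir"], exist_ok=True)

# Load the base model named in the config (fp16 assumes a CUDA device).
pipe = StableDiffusionXLPipeline.from_pretrained(
    config["model_id"], torch_dtype=torch.float16
).to(config["device"])

generator = torch.Generator(device=config["device"]).manual_seed(config["seed"])

image = pipe(
    "a photo of a red fox in the snow",  # placeholder caption
    num_inference_steps=config["inference_steps"],
    height=config["height"],
    width=config["width"],
    guidance_scale=config["guidance_scale"],
    generator=generator,
).images[0]
image.save(os.path.join(config["baseline_test_dir"], "baseline_0.png"))
```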