Hiroshige-SDXL-LoRA

This is a standard PEFT LoRA derived from stabilityai/stable-diffusion-xl-base-1.0.

The main validation prompt used during training was:

hshge, hamster

Validation settings

CFG: 4.2
CFG Rescale: 0.0
Steps: 20
Sampler: None
Seed: 42
Resolution: 1024x1024

Note: The validation settings are not necessarily the same as the training settings.
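
As a rough sketch, the validation settings above can be written as keyword arguments for a diffusers pipeline call. The argument names mirror those used in the Inference section of this card. The `VALIDATION_SETTINGS` dict, `VALIDATION_SEED` constant, and `validation_kwargs` helper are hypothetical names introduced only for illustration.

```python
# Validation settings from this card, keyed by the diffusers argument
# names used in the Inference section. The dict and helper names are
# illustrative assumptions, not part of the training tooling.
VALIDATION_SETTINGS = {
    "guidance_scale": 4.2,      # CFG
    "guidance_rescale": 0.0,    # CFG Rescale
    "num_inference_steps": 20,  # Steps
    "width": 1024,              # Resolution
    "height": 1024,
}
VALIDATION_SEED = 42


def validation_kwargs(prompt, negative_prompt="blurry, cropped, ugly"):
    """Return pipeline keyword arguments for one validation prompt."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        **VALIDATION_SETTINGS,
    }


kwargs = validation_kwargs("hshge, hamster")
print(kwargs["guidance_scale"], kwargs["num_inference_steps"])
```

The resulting dict can be unpacked into a pipeline call together with a `torch.Generator` seeded from `VALIDATION_SEED`.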

You can find some example images in the following gallery:

Prompt
unconditional (blank prompt)

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, Mount Fuji viewed from a distance, with cherry blossoms in the foreground. A small village nestles at the base of the mountain.

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, Hamster

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, A scene from the Tokaido road, with travelers crossing a wooden bridge. A misty mountain landscape in the background.

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, A busy fish market in Edo. Vendors display their catch while customers browse. Boats visible in the nearby harbor.

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, People caught in a sudden rainstorm on a city street, rushing for cover with umbrellas. A large bridge spans the background.

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, A serene temple complex under a full moon. Lanterns illuminate the path, with silhouettes of pine trees against the night sky.

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, A traditional Japanese garden in winter. Snow-covered trees and a small bridge over a frozen pond. A figure in a kimono walks along a path.

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, The modern Tokyo Skytree towering over traditional low-rise buildings. Cherry blossoms frame the view.

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, A sleek bullet train speeding past Mount Fuji. Rice fields and a small town visible in the middle ground.

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, The bustling Times Square in New York, with bright billboards and crowds of people. A view reminiscent of Hiroshige's busy street scenes.

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, A futuristic Mars colony with dome habitats and space vehicles. The red Martian landscape stretches to the horizon.

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, An imaginary underwater city with Japanese-style architecture. Fish and sea creatures swim among the buildings.

Negative Prompt
blurry, cropped, ugly

Prompt
hshge, People wearing VR headsets in a modern cafe. Traditional Japanese elements mix with futuristic technology in the decor.

Negative Prompt
blurry, cropped, ugly

The text encoder was not trained. You may reuse the base model text encoder for inference.
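
One way to confirm that an adapter carries no text encoder weights is to inspect the key names in its state dict. The sketch below assumes diffusers-style key prefixes (`unet.`, `text_encoder.`, `text_encoder_2.`). The helper name and the sample keys are hypothetical and are not read from this adapter's files.

```python
# Prefixes assumed for diffusers-style LoRA state-dict keys.
TEXT_ENCODER_PREFIXES = ("text_encoder.", "text_encoder_2.")


def has_text_encoder_weights(keys):
    """Return True if any key belongs to a text encoder adapter."""
    return any(key.startswith(TEXT_ENCODER_PREFIXES) for key in keys)


# Hypothetical key names, for illustration only.
unet_only = [
    "unet.down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_q.lora_A.weight",
]
with_text = unet_only + [
    "text_encoder.text_model.encoder.layers.0.self_attn.q_proj.lora_A.weight",
]

print(has_text_encoder_weights(unet_only))  # False
print(has_text_encoder_weights(with_text))  # True
```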

Training settings

Training epochs: 7
Training steps: 10000
Learning rate: 8e-05
Effective batch size: 8
- Micro-batch size: 8
- Gradient accumulation steps: 1
- Number of GPUs: 1
Prediction type: epsilon
Rescaled betas zero SNR: False
Optimizer: adamw_bf16
Precision: Pure BF16
Quantised: Yes (int8-quanto)
Xformers: Not used
LoRA Rank: 64
LoRA Alpha: None
LoRA Dropout: 0.1
LoRA initialisation style: default
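
Two pieces of arithmetic sit behind the settings above: how the effective batch size follows from its three factors, and how many parameters a rank-64 LoRA pair adds to a single linear layer. The following is a minimal sketch. The function names are illustrative, and the 1280 x 1280 layer shape is a hypothetical example rather than a value taken from this training run.

```python
def effective_batch_size(micro_batch, grad_accum_steps, num_gpus):
    """Samples contributing to each optimizer step."""
    return micro_batch * grad_accum_steps * num_gpus


def lora_params_per_linear(in_features, out_features, rank):
    """Parameters added by one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


batch = effective_batch_size(micro_batch=8, grad_accum_steps=1, num_gpus=1)
samples_seen = batch * 10_000  # training steps listed on this card

# 1280 x 1280 is a hypothetical layer shape chosen for illustration.
extra = lora_params_per_linear(1280, 1280, rank=64)

print(batch)         # 8
print(samples_seen)  # 80000
print(extra)         # 163840
```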

Datasets

hiroshige-sdxl-512

Repeats: 10
Total number of images: 219
Total number of aspect buckets: 10
Resolution: 0.262144 megapixels
Cropped: False
Crop style: None
Crop aspect: None

hiroshige-sdxl-1024

Repeats: 10
Total number of images: 219
Total number of aspect buckets: 16
Resolution: 1.048576 megapixels
Cropped: False
Crop style: None
Crop aspect: None

hiroshige-sdxl-512-crop

Repeats: 10
Total number of images: 219
Total number of aspect buckets: 1
Resolution: 0.262144 megapixels
Cropped: True
Crop style: random
Crop aspect: square

hiroshige-sdxl-1024-crop

Repeats: 10
Total number of images: 219
Total number of aspect buckets: 1
Resolution: 1.048576 megapixels
Cropped: True
Crop style: random
Crop aspect: square
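
The megapixel figures above are pixel areas. For the square-cropped datasets they correspond directly to 512x512 and 1024x1024 images; for the uncropped datasets the area is presumably a target that each aspect bucket approximates with a different shape. A small sketch of the conversion, using a hypothetical helper name:

```python
import math


def square_side(megapixels):
    """Side length in pixels of a square image with the given area."""
    pixels = round(megapixels * 1_000_000)
    return math.isqrt(pixels)


for name, mp in [
    ("hiroshige-sdxl-512-crop", 0.262144),
    ("hiroshige-sdxl-1024-crop", 1.048576),
]:
    side = square_side(mp)
    print(f"{name}: {side}x{side}")
```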

Inference

import torch
from diffusers import DiffusionPipeline

model_id = 'stabilityai/stable-diffusion-xl-base-1.0'
adapter_id = 'davidrd123/Hiroshige-SDXL-LoRA'

# Load the base SDXL pipeline, then attach the LoRA weights on top of it.
pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.load_lora_weights(adapter_id)

prompt = "hshge, hamster"
negative_prompt = 'blurry, cropped, ugly'

# Pick the best available device once and reuse it for the pipeline and the generator.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)

# Note: the validation runs listed above used seed 42; this example uses a different seed.
image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(1641421826),
    width=1024,
    height=1024,
    guidance_scale=4.2,
    guidance_rescale=0.0,
).images[0]
image.save("output.png", format="PNG")
