What is this?

This is not a permanently released model yet. It's more like an open alpha work in progress.

Latest News: 2025/03/07

(Currently up to epoch 11)

SD1.5 base, with the SDXL VAE tacked on, and then retrained to actually WORK.

I think this has reached parity with the existing SD1.5 base for human output. I'm going to keep training, but I wanted to share what I have so far.

How to use

Use it like any other SD1.5 model
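
For example, with diffusers, something like this should work (an untested sketch; it assumes the repo is in diffusers format, otherwise point StableDiffusionPipeline.from_single_file() at the checkpoint file instead):

```python
# Minimal sketch of loading and sampling, assuming a diffusers-format repo.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "opendiffusionai/xlsd32-alpha1",  # repo id from this page
    torch_dtype=torch.float32,        # the weights are fp32
)
pipe = pipe.to("cuda")

image = pipe("photo of a woman standing on a beach").images[0]
image.save("sample.png")
```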

How it was trained

This is an fp32 model, trained at full fp32 precision and finetuned on a single 4090, starting from the SD1.5 base with the SDXL VAE swapped in.
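
For reference, the "straight merge" starting point is just stock SD1.5 with the SDXL VAE in place of the original one. Roughly like this in diffusers (the repo ids here are the usual public ones, which is an assumption; this untrained combination is what the "Epoch 0 step 0" sample below shows):

```python
# Sketch of the untrained starting point: SD1.5 with the SDXL VAE swapped in.
# Repo ids are the standard public checkpoints, used here as an assumption.
import torch
from diffusers import AutoencoderKL, StableDiffusionPipeline

# Load the SDXL VAE on its own...
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae", torch_dtype=torch.float32)

# ...and drop it into a stock SD1.5 pipeline.
pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    vae=vae,
    torch_dtype=torch.float32,
)
```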

The current version was trained on:

  • opendiffusionai/laion2b-45ish-1120px, moondream captions
  • opendiffusionai/laion2b-45ish-1120px, wd14 captions
  • opendiffusionai/laion2b-squareish-1024px, moondream captions
  • opendiffusionai/laion2b-squareish-1536px, wd14 captions

Using the 1024px set was a mistake: I meant to use the 1536px one, and the 1024px dataset is 3x as large! But by the time I noticed, the run was two days in, and I thought, "what the heck, let it run".

Recreating the model

All the datasets are mentioned above; use img2dataset to download them. When you want to duplicate the 45ish images with wd14 captions, you don't need to redownload: just duplicate the directories using the method of your choice (I like using lndir on Linux) and then remove the redundant txt files.
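
If you don't have lndir handy, a rough Python equivalent of that duplicate-and-prune step would be something like this (directory names are placeholders):

```python
# Rough stand-in for the lndir trick: mirror the downloaded dataset as
# symlinks, skipping the old .txt captions so the second caption set can
# be extracted into the copy. Directory names below are placeholders.
import os

src = "laion2b-45ish-1120px-moondream"   # original download (placeholder)
dst = "laion2b-45ish-1120px-wd14"        # duplicate for the wd14 captions

for root, dirs, files in os.walk(src):
    rel = os.path.relpath(root, src)
    os.makedirs(os.path.join(dst, rel), exist_ok=True)
    for name in files:
        if name.endswith(".txt"):
            continue  # leave the redundant caption files behind
        os.symlink(
            os.path.abspath(os.path.join(root, name)),
            os.path.join(dst, rel, name),
        )
```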

Once that is done, you can use https://github.com/ppbrown/vlm-utils/blob/main/dataset_scripts/extracttxtfromjsonl.py to extract txt files from the jsonl.gz file
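
That script is the real tool; the gist of what it does is roughly this (the record field names here are illustrative assumptions, not necessarily what the actual jsonl uses):

```python
# Illustrative only: the linked extracttxtfromjsonl.py is the actual tool.
# Assumes each JSONL record has a "key" matching the image filename stem
# and a caption field; both field names are assumptions.
import gzip
import json
import os

def extract_captions(jsonl_path, out_dir, caption_field="caption"):
    os.makedirs(out_dir, exist_ok=True)
    with gzip.open(jsonl_path, "rt", encoding="utf-8") as f:
        for line in f:
            rec = json.loads(line)
            txt_path = os.path.join(out_dir, rec["key"] + ".txt")
            with open(txt_path, "w", encoding="utf-8") as out:
                out.write(rec[caption_field])

extract_captions("00000.jsonl.gz", "captions")  # placeholder filenames
```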

Finally, copy in the OneTrainer-XLsd32-phase1-LaionPlusWD.json config file, define the "concept" files in OneTrainer, and start the training session.

Training samples

Here are some training samples.

I'm taking samples every 2000 steps. The samples here give the impression of a somewhat linear progression, but it is definitely NOT linear. During one epoch, the output tends to cycle between various aspects of the dataset, so I have deliberately cherry-picked samples that have looped back to a common root image.

Epoch 0 step 0

What the straight merge looks like, with no training:

e0s0

Epoch 0 step 14000

e0s14000

Epoch 4 step 24532

e4s154000

Epoch 5 step 22165

e5s22154

Epoch 7 step 23431

e7s23431
