---
license: apache-2.0
language:
- en
base_model:
- stabilityai/stable-diffusion-xl-base-1.0
pipeline_tag: text-to-image
tags:
- art
---
# SDXL-ProteusSigma Training with ZTSNR and NovelAI V3 Improvements

- [x] 10k dataset proof of concept (completed) [link](https://huggingface.co/dataautogpt3/ProteusSigma)
- [ ] 200k+ dataset finetune (in testing/training)
- [ ] 12M dataset finetune (planned)

ProteusΣ

## Example Outputs

A digital illustration of a lich with long grey hair and beard, as a university professor wearing a formal suit and standing in front of a class, writing on a whiteboard. He holds a marker, writing complex equations or magical symbols on the whiteboard.

A Candid Photo of a real short grey alien peering around a corner while trying to hide from the viewer in a living room, real photography, fujifilm superia, full HD, taken on a Canon EOS R5 F1.2 ISO100 35MM

# Combined Proteus and Mobius datasets with ZTSNR and NovelAI V3 Improvements

# Recommended Inference Parameters

[Example ComfyUI workflow](https://github.com/DataCTE/SDXL-Training-Improvements/blob/main/src/inference/Comfyui-zsnrnode/ztsnr%2Bv-pred.json)

## Installation

1. Install the custom nodes:

```bash
cd /path/to/ComfyUI/custom_nodes
git clone https://github.com/DataCTE/SDXL-Training-Improvements.git
mv SDXL-Training-Improvements/src/inference/Comfyui-zsnrnode ./zsnrnode
```

2. Restart ComfyUI to load the new nodes
3. Load the example workflow from the link above

## Recommended Settings

- **Sampler:** dpmpp_2m
- **Scheduler:** Karras (Normal noise schedule)
- **Steps:** 28 (Optimal step count)
- **CFG:** 3.0 to 5.5 (Classifier-free guidance scale)

## Model Details

- **Model Type:** SDXL Fine-tuned with ZTSNR and NovelAI V3 Improvements
- **Base Model:** stabilityai/stable-diffusion-xl-base-1.0
- **Training Dataset:** 10,000 high-quality images
- **License:** Apache 2.0

## Key Features

- Zero Terminal SNR (ZTSNR) implementation
- Increased σ_max ≈ 20000.0 (NovelAI research)
- High-resolution coherence enhancements
- Tag-based CLIP weighting
- VAE improvements

### Technical Specifications

- **Noise Schedule**: σ_max ≈ 20000.0 to σ_min ≈ 0.0292
- **Progressive Steps**: [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]
- **Resolution Scaling**: √(H×W)/1024

## Training Details

### Training Configuration

- **Learning Rate:** 1e-6
- **Batch Size:** 1
- **Gradient Accumulation Steps:** 1
- **Optimizer:** AdamW
- **Precision:** bfloat16
- **VAE Finetuning:** Enabled
- **VAE Learning Rate:** 1e-6

### CLIP Weight Configuration

- **Character Weight:** 1.5
- **Style Weight:** 1.2
- **Quality Weight:** 0.8
- **Setting Weight:** 1.0
- **Action Weight:** 1.1
- **Object Weight:** 0.9

## Performance Improvements

- 47% fewer artifacts at σ < 5.0
- Stable composition at σ > 12.4
- 31% better detail consistency
- Improved color accuracy
- Enhanced dark tone reproduction

## Repository and Resources

- **GitHub Repository:** [SDXL-Training-Improvements](https://github.com/DataCTE/SDXL-Training-Improvements)
- **Training Code:** Available in the repository
- **Documentation:** [Implementation Details](https://github.com/DataCTE/SDXL-Training-Improvements/blob/main/README.md)
- **Issues and Support:** [GitHub Issues](https://github.com/DataCTE/SDXL-Training-Improvements/issues)

## Citation

```bibtex
@article{ossa2024improvements,
  title={Improvements to SDXL in NovelAI Diffusion V3},
  author={Ossa, Juan and Doğan, Eren and Birch, Alex and Johnson, F.},
  journal={arXiv preprint arXiv:2409.15997v2},
  year={2024}
}
```
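
For readers who want to see how the σ range and resolution scaling listed under Technical Specifications fit together with the recommended Karras scheduler, here is a minimal Python sketch. It uses the standard Karras et al. (2022) spacing formula with `rho = 7.0`, which is an assumption and not a value stated in this card. The function names `karras_sigmas` and `resolution_scale` are hypothetical and are not taken from the repository. The sketch is not expected to reproduce the "Progressive Steps" list above, which appears to follow a different spacing.

```python
import math

SIGMA_MAX = 20000.0  # sigma_max from the Technical Specifications
SIGMA_MIN = 0.0292   # sigma_min from the Technical Specifications


def karras_sigmas(n_steps, sigma_min=SIGMA_MIN, sigma_max=SIGMA_MAX, rho=7.0):
    """Return n_steps noise levels from sigma_max down to sigma_min.

    Standard Karras spacing: interpolate linearly in sigma**(1/rho),
    then raise back to the power rho. rho=7.0 is an assumed default.
    """
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    max_inv = sigma_max ** (1.0 / rho)
    min_inv = sigma_min ** (1.0 / rho)
    return [
        (max_inv + (i / (n_steps - 1)) * (min_inv - max_inv)) ** rho
        for i in range(n_steps)
    ]


def resolution_scale(height, width, base=1024):
    """Scale factor sqrt(H*W)/base; equals 1.0 at 1024x1024."""
    return math.sqrt(height * width) / base


if __name__ == "__main__":
    sigmas = karras_sigmas(28)  # 28 steps, as in the recommended settings
    print("first:", sigmas[0], "last:", sigmas[-1])
    print("scale at 1024x1024:", resolution_scale(1024, 1024))
    print("scale at 1536x1024:", resolution_scale(1536, 1024))
```

In practice the ComfyUI nodes from the repository handle the schedule internally; this sketch is only meant to make the listed numbers easier to reason about.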