---
license: apache-2.0
language:
- en
base_model:
- stabilityai/stable-diffusion-xl-base-1.0
pipeline_tag: text-to-image
tags:
- art
---
# SDXL-ProteusSigma Training with ZTSNR and NovelAI V3 Improvements

- [x] 10k dataset proof of concept (completed) [link](https://huggingface.co/dataautogpt3/ProteusSigma)
- [ ] 200k+ dataset finetune (in testing/training)
- [ ] 12M dataset finetune (planned)

<style>
.logo-container {
  position: relative;
  text-align: center;
  margin: 40px 0;
}
.text-layer {
  font-family: 'Arial Black', 'Helvetica', sans-serif;
  font-size: 72px;
  font-weight: bold;
  white-space: nowrap;
}
.text-base {
  position: relative;
  color: #ff71ce;
  text-shadow: 2px 2px 0 #ff00ff;
}
.text-overlay {
  position: absolute;
  left: 50%;
  top: 50%;
  transform: translate(-49%, -47%); /* Slightly offset */
  color: #01cdfe;
  text-shadow: -2px -2px 0 #00ffff;
  opacity: 0.8;
  mix-blend-mode: screen;
}
.sigma {
  color: #00ffff;
  text-shadow: 2px 2px 0 #ff00ff, -2px -2px 0 #00ffff;
}
</style>

<div class="logo-container">
  <div class="text-layer text-overlay">
    Proteus<span class="sigma">Σ</span>
  </div>
  <div class="text-layer text-base">
    Proteus<span class="sigma">Σ</span>
  </div>
</div>

## Example Outputs

<style>
.gallery {
  display: flex;
  flex-direction: row;
  flex-wrap: wrap;
  gap: 10px;
  justify-content: center;
  align-items: center;
  width: 100%;
  padding: 10px;
}
.gallery-item {
  flex: 0 0 300px;
  margin: 0;
  position: relative;
}
.gallery-item.large { /* New class for larger item */
  flex: 0 0 340px;
}
.gallery img {
  width: 300px;
  cursor: pointer;
  transition: transform 0.2s;
  border-radius: 8px;
}
.gallery-item.large img { /* Larger size for last image */
  width: 512px;
}
.gallery img:hover {
  transform: scale(1.05);
}
.caption {
  position: absolute;
  bottom: 0;
  left: 0;
  right: 0;
  background: rgba(0, 0, 0, 0.4);
  color: white;
  padding: 8px;
  font-size: 11px;
  border-bottom-left-radius: 8px;
  border-bottom-right-radius: 8px;
  opacity: 0.7;
  transition: opacity 0.3s ease;
}
.gallery-item:hover .caption
{ opacity: 0.2; }
.modal {
  display: none;
  position: fixed;
  z-index: 1000;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
  background-color: rgba(0,0,0,0.9);
  padding: 20px;
  box-sizing: border-box;
}
.modal img {
  max-width: 90%;
  max-height: 90vh;
  margin: auto;
  display: block;
  position: relative;
  top: 50%;
  transform: translateY(-50%);
}
.modal.active {
  display: block;
}
</style>

<div class="gallery">
  <div class="gallery-item">
    <img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example.png" alt="Example Output 1" onclick="showImage(this.src)"/>
    <div class="caption">A digital illustration of a lich with long grey hair and beard, as a university professor wearing a formal suit and standing in front of a class, writing on a whiteboard. He holds a marker, writing complex equations or magical symbols on the whiteboard.</div>
  </div>
  <div class="gallery-item">
    <img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example2.png" alt="Example Output 2" onclick="showImage(this.src)"/>
    <div class="caption">A Candid Photo of a real short grey alien peering around a corner while trying to hide from the viewer in a living room, real photography, fujifilm superia, full HD, taken on a Canon EOS R5 F1.2 ISO100 35MM</div>
  </div>
  <div class="gallery-item">
    <img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example3.png" alt="Example Output 3" onclick="showImage(this.src)"/>
  </div>
  <div class="gallery-item">
    <img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example4.png" alt="Example Output 4" onclick="showImage(this.src)"/>
  </div>
  <div class="gallery-item large"> <!-- Added 'large' class -->
    <img src="https://huggingface.co/dataautogpt3/ProteusSigma/resolve/main/example5.png" alt="Example Output 5" onclick="showImage(this.src)"/>
  </div>
</div>

<div class="modal" onclick="this.classList.remove('active')">
  <img id="modal-img" src="" alt="Full size image"/>
</div>

<script>
function showImage(src) {
  document.getElementById('modal-img').src = src;
  document.querySelector('.modal').classList.add('active');
}
</script>

# Combined Proteus and Mobius datasets with ZTSNR and NovelAI V3 Improvements

# Recommended Inference Parameters

[Example ComfyUI workflow](https://github.com/DataCTE/SDXL-Training-Improvements/blob/main/src/inference/Comfyui-zsnrnode/ztsnr%2Bv-pred.json)

## Installation

1. Install the custom nodes:

   ```bash
   cd /path/to/ComfyUI/custom_nodes
   git clone https://github.com/DataCTE/SDXL-Training-Improvements.git
   mv SDXL-Training-Improvements/src/inference/Comfyui-zsnrnode ./zsnrnode
   ```

2. Restart ComfyUI to load the new nodes.
3. Load the example workflow from the link above.

## Recommended Settings

- **Sampler:** dpmpp_2m
- **Scheduler:** Karras (Normal noise schedule)
- **Steps:** 28 (optimal step count)
- **CFG:** 3.0 to 5.5 (classifier-free guidance scale)

## Model Details

- **Model Type:** SDXL Fine-tuned with ZTSNR and NovelAI V3 Improvements
- **Base Model:** stabilityai/stable-diffusion-xl-base-1.0
- **Training Dataset:** 10,000 high-quality images
- **License:** Apache 2.0

## Key Features

- Zero Terminal SNR (ZTSNR) implementation
- Increased σ_max ≈ 20000.0 (NovelAI research)
- High-resolution coherence enhancements
- Tag-based CLIP weighting
- VAE improvements

### Technical Specifications

- **Noise Schedule**: σ_max ≈ 20000.0 to σ_min ≈ 0.0292
- **Progressive Steps**: [20000, 17.8, 12.4, 9.2, 7.2, 5.4, 3.9, 2.1, 0.9, 0.0292]
- **Resolution Scaling**: √(H×W)/1024

## Training Details

### Training Configuration

- **Learning Rate:** 1e-6
- **Batch Size:** 1
- **Gradient Accumulation Steps:** 1
- **Optimizer:** AdamW
- **Precision:** bfloat16
- **VAE Finetuning:** Enabled
- **VAE Learning Rate:** 1e-6

### CLIP Weight Configuration

- **Character Weight:** 1.5
- **Style Weight:** 1.2
- **Quality Weight:** 0.8
- **Setting Weight:** 1.0
- **Action Weight:** 1.1
- **Object Weight:** 0.9

## Performance Improvements

- 47% fewer artifacts at σ < 5.0
- Stable composition at σ > 12.4
- 31%
better detail consistency
- Improved color accuracy
- Enhanced dark tone reproduction

## Repository and Resources

- **GitHub Repository:** [SDXL-Training-Improvements](https://github.com/DataCTE/SDXL-Training-Improvements)
- **Training Code:** Available in the repository
- **Documentation:** [Implementation Details](https://github.com/DataCTE/SDXL-Training-Improvements/blob/main/README.md)
- **Issues and Support:** [GitHub Issues](https://github.com/DataCTE/SDXL-Training-Improvements/issues)

## Citation

```bibtex
@article{ossa2024improvements,
  title={Improvements to SDXL in NovelAI Diffusion V3},
  author={Ossa, Juan and Doğan, Eren and Birch, Alex and Johnson, F.},
  journal={arXiv preprint arXiv:2409.15997v2},
  year={2024}
}
```
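
## Noise Schedule Sketch

The Technical Specifications list σ_max ≈ 20000.0 and σ_min ≈ 0.0292, and the recommended scheduler is Karras. As a rough illustration (not the repository's actual code), the sketch below computes a Karras-style sigma schedule between those two endpoints for the recommended 28 steps, along with the √(H×W)/1024 resolution factor. The exponent `rho = 7.0` and the names `karras_sigmas` and `resolution_scale` are assumptions chosen for illustration; the model card does not state them, and it does not specify how the resolution factor is applied.

```python
import math

SIGMA_MAX = 20000.0  # from the model card
SIGMA_MIN = 0.0292   # from the model card
RHO = 7.0            # assumption: a commonly used Karras exponent, not stated in the card


def karras_sigmas(n_steps, sigma_min=SIGMA_MIN, sigma_max=SIGMA_MAX, rho=RHO):
    """Return n_steps noise levels, decreasing from sigma_max to sigma_min.

    Interpolates linearly in sigma**(1/rho) space, then raises back to rho.
    """
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    max_inv = sigma_max ** (1.0 / rho)
    min_inv = sigma_min ** (1.0 / rho)
    return [
        (max_inv + (i / (n_steps - 1)) * (min_inv - max_inv)) ** rho
        for i in range(n_steps)
    ]


def resolution_scale(height, width, base=1024):
    """Resolution factor sqrt(H * W) / 1024 as listed in the card."""
    return math.sqrt(height * width) / base


if __name__ == "__main__":
    sigmas = karras_sigmas(28)  # 28 is the recommended step count
    print("first sigmas:", [round(s, 2) for s in sigmas[:3]])
    print("last sigma:", sigmas[-1])
    print("scale at 1536x1024:", resolution_scale(1536, 1024))
```

With such a large σ_max, the first few sigmas fall off very quickly, so most of the 28 steps are spent at low noise levels where detail is resolved.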