---
license: other
license_name: stabilityai-ai-community
license_link: >-
  https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md
language:
- en
base_model:
- stabilityai/stable-diffusion-3.5-large
---
![eyecatch](eyecatch.jpg)
# SD 3.5 Large Modern Anime Full Model Card
This is an experimental model: a full fine-tune of SD 3.5 Large, trained with quality tuning only.

# Usage
- ComfyUI
  1. Download the [model](sd3_5_large_modern_anime_full.safetensors) or the [fp8 version](sd3_5_large_modern_anime_full_fp8.safetensors).
  2. Enjoy! (trigger word: `modern anime style, `)
- diffusers
  1. Run the code:
```python
import torch
from diffusers import SD3Transformer2DModel, StableDiffusion3Pipeline

# Load the fine-tuned transformer weights from the single-file checkpoint.
transformer = SD3Transformer2DModel.from_single_file(
    "https://huggingface.co/alfredplpl/sd3-5-large-modern-anime-full/blob/main/sd3_5_large_modern_anime_full.safetensors",
    torch_dtype=torch.bfloat16,
)
# Take the remaining components (text encoders, VAE, scheduler) from the base model.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
# Offload idle components to the CPU to reduce GPU memory use.
pipe.enable_model_cpu_offload()
# The prompt starts with the trigger word "modern anime style, ".
image = pipe("modern anime style, A close-up shot of a girl's face in the center, looking directly at the viewer. Autumn maple trees with red leaves frame both the left and right sides of the background, with the sky visible in the middle.").images[0]
image.save("sd35.png")
```
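
The example prompt above places the trigger word first. As a minimal sketch, a small helper can prepend it to any prompt that does not already start with it. The name `with_trigger` is hypothetical and not part of diffusers or this repository.

```python
TRIGGER = "modern anime style"  # trigger word given in the Usage section


def with_trigger(prompt: str, trigger: str = TRIGGER) -> str:
    """Prepend the trigger word unless the prompt already starts with it.

    Hypothetical helper for illustration only.
    """
    stripped = prompt.strip()
    if stripped.lower().startswith(trigger.lower()):
        return stripped
    return f"{trigger}, {stripped}"


print(with_trigger("A girl standing under autumn maple trees."))
```

The returned string can be passed to `pipe(...)` in place of the hand-written prompt.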

# How to Make

## Prerequisites
- GPU: A6000 x1 (48 GB)
- Private dataset: 3000 images (collected manually)

## Procedure
I used sd-scripts. The dataset config is as follows:
```toml
[general]
enable_bucket = true                             # whether to use Aspect Ratio Bucketing

[[datasets]]
resolution = 1024                                # training resolution
batch_size = 4                                   # batch size

  [[datasets.subsets]]
  image_dir = '/mnt/NVM2/manual_now'             # folder containing the training images
  metadata_file = 'manual_dcap2.json'            # metadata file name

```
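
To illustrate what `enable_bucket = true` does conceptually, here is a simplified sketch of aspect ratio bucketing. It is not the sd-scripts implementation. The 64-pixel step is an assumption, and the 512 and 2048 bounds mirror the `--min_bucket_reso` and `--max_bucket_reso` flags in the training command. The function names are hypothetical.

```python
def make_buckets(max_area=1024 * 1024, min_side=512, max_side=2048, step=64):
    """Enumerate (width, height) buckets whose area stays within max_area."""
    buckets = set()
    width = min_side
    while width <= max_side:
        # Largest height (a multiple of step) that keeps the area within budget.
        height = min(max_side, (max_area // width) // step * step)
        if height >= min_side:
            buckets.add((width, height))
            buckets.add((height, width))  # mirrored orientation
        width += step
    return sorted(buckets)


def pick_bucket(image_width, image_height, buckets):
    """Choose the bucket whose aspect ratio is closest to the image's."""
    aspect = image_width / image_height
    return min(buckets, key=lambda wh: abs(wh[0] / wh[1] - aspect))


buckets = make_buckets()
print(pick_bucket(1920, 1080, buckets))
```

Each image is resized and cropped to its bucket, so a batch contains images of one shape while the pixel count stays close to the training resolution.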

I ran the command:
```bash
accelerate launch --num_cpu_threads_per_process 1 sd3_train.py \
  --pretrained_model_name_or_path='/mnt/NVM2/sd3_5/sd3.5_large.safetensors' \
  --output_dir='/mnt/NVM2/sd3_5' \
  --output_name=modern_anime \
  --dataset_config=manual_dcap2.toml \
  --save_model_as=safetensors \
  --learning_rate=5e-6 \
  --sdpa \
  --gradient_checkpointing \
  --mixed_precision=bf16 \
  --full_bf16 \
  --max_train_epochs=10 \
  --min_bucket_reso=512 \
  --max_bucket_reso=2048 \
  --clip_l='/mnt/NVM2/sd3_5/clip_l.safetensors' \
  --clip_g='/mnt/NVM2/sd3_5/clip_g.safetensors' \
  --gradient_accumulation_steps=1 \
  --t5xxl='/mnt/NVM2/sd3_5/t5xxl_fp16.safetensors' \
  --optimizer_type=Lion8bit \
  --cache_text_encoder_outputs_to_disk \
  --cache_text_encoder_outputs \
  --cache_latents \
  --cache_latents_to_disk \
  --save_every_n_epochs=1
```
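
For a rough sense of the training length, the settings above (3000 images, batch size 4, 10 epochs, gradient accumulation of 1) can be turned into an approximate optimizer step count. This is a back-of-the-envelope sketch. With bucketing enabled, batches are formed per bucket, so the real count may be somewhat higher. `estimate_steps` is a hypothetical helper.

```python
import math


def estimate_steps(num_images, batch_size, epochs, grad_accum=1):
    """Approximate optimizer steps, ignoring per-bucket batch rounding."""
    batches_per_epoch = math.ceil(num_images / batch_size)
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return steps_per_epoch * epochs


# Values taken from the dataset config and training command above.
print(estimate_steps(num_images=3000, batch_size=4, epochs=10))  # 7500
```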