Model card auto-generated by SimpleTuner
Browse files
README.md
CHANGED
@@ -119,7 +119,7 @@ A photo-realistic image of a cat
|
|
119 |
```
|
120 |
|
121 |
## Validation settings
|
122 |
-
- CFG: `
|
123 |
- CFG Rescale: `0.0`
|
124 |
- Steps: `20`
|
125 |
- Sampler: `None`
|
@@ -143,8 +143,8 @@ You may reuse the base model text encoder for inference.
|
|
143 |
- Training steps: 100
|
144 |
- Learning rate: 0.0001
|
145 |
- Max grad norm: 0.01
|
146 |
-
- Effective batch size:
|
147 |
-
- Micro-batch size:
|
148 |
- Gradient accumulation steps: 1
|
149 |
- Number of GPUs: 3
|
150 |
- Prediction type: flow-matching
|
@@ -159,20 +159,20 @@ You may reuse the base model text encoder for inference.
|
|
159 |
"bypass_mode": true,
|
160 |
"algo": "lokr",
|
161 |
"multiplier": 1.0,
|
|
|
162 |
"linear_dim": 10000,
|
163 |
"linear_alpha": 1,
|
164 |
"factor": 12,
|
165 |
"apply_preset": {
|
166 |
"target_module": [
|
167 |
-
"Attention",
|
168 |
-
"FeedForward"
|
169 |
],
|
170 |
"module_algo_map": {
|
171 |
-
"Attention": {
|
172 |
-
"factor": 12
|
173 |
-
},
|
174 |
"FeedForward": {
|
175 |
"factor": 6
|
|
|
|
|
|
|
176 |
}
|
177 |
}
|
178 |
}
|
@@ -181,33 +181,6 @@ You may reuse the base model text encoder for inference.
|
|
181 |
|
182 |
## Datasets
|
183 |
|
184 |
-
### reg-512
|
185 |
-
- Repeats: 0
|
186 |
-
- Total number of images: ~288
|
187 |
-
- Total number of aspect buckets: 3
|
188 |
-
- Resolution: 0.262144 megapixels
|
189 |
-
- Cropped: False
|
190 |
-
- Crop style: None
|
191 |
-
- Crop aspect: None
|
192 |
-
- Used for regularisation data: Yes
|
193 |
-
### reg-1024
|
194 |
-
- Repeats: 0
|
195 |
-
- Total number of images: ~291
|
196 |
-
- Total number of aspect buckets: 9
|
197 |
-
- Resolution: 1.048576 megapixels
|
198 |
-
- Cropped: False
|
199 |
-
- Crop style: None
|
200 |
-
- Crop aspect: None
|
201 |
-
- Used for regularisation data: Yes
|
202 |
-
### cheechandchong-uncropped-512
|
203 |
-
- Repeats: 10
|
204 |
-
- Total number of images: ~24
|
205 |
-
- Total number of aspect buckets: 5
|
206 |
-
- Resolution: 0.262144 megapixels
|
207 |
-
- Cropped: False
|
208 |
-
- Crop style: None
|
209 |
-
- Crop aspect: None
|
210 |
-
- Used for regularisation data: No
|
211 |
### cheechandchong-cropped-512
|
212 |
- Repeats: 10
|
213 |
- Total number of images: ~24
|
@@ -217,15 +190,6 @@ You may reuse the base model text encoder for inference.
|
|
217 |
- Crop style: None
|
218 |
- Crop aspect: None
|
219 |
- Used for regularisation data: No
|
220 |
-
### cheechandchong-uncropped-1024
|
221 |
-
- Repeats: 10
|
222 |
-
- Total number of images: ~24
|
223 |
-
- Total number of aspect buckets: 7
|
224 |
-
- Resolution: 1.048576 megapixels
|
225 |
-
- Cropped: False
|
226 |
-
- Crop style: None
|
227 |
-
- Crop aspect: None
|
228 |
-
- Used for regularisation data: No
|
229 |
### cheechandchong-cropped-1024
|
230 |
- Repeats: 10
|
231 |
- Total number of images: ~24
|
@@ -261,7 +225,7 @@ image = pipeline(
|
|
261 |
generator=torch.Generator(device='cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu').manual_seed(1641421826),
|
262 |
width=1024,
|
263 |
height=1024,
|
264 |
-
guidance_scale=
|
265 |
).images[0]
|
266 |
image.save("output.png", format="PNG")
|
267 |
```
|
|
|
119 |
```
|
120 |
|
121 |
## Validation settings
|
122 |
+
- CFG: `4.0`
|
123 |
- CFG Rescale: `0.0`
|
124 |
- Steps: `20`
|
125 |
- Sampler: `None`
|
|
|
143 |
- Training steps: 100
|
144 |
- Learning rate: 0.0001
|
145 |
- Max grad norm: 0.01
|
146 |
+
- Effective batch size: 3
|
147 |
+
- Micro-batch size: 1
|
148 |
- Gradient accumulation steps: 1
|
149 |
- Number of GPUs: 3
|
150 |
- Prediction type: flow-matching
|
|
|
159 |
"bypass_mode": true,
|
160 |
"algo": "lokr",
|
161 |
"multiplier": 1.0,
|
162 |
+
"full_matrix": true,
|
163 |
"linear_dim": 10000,
|
164 |
"linear_alpha": 1,
|
165 |
"factor": 12,
|
166 |
"apply_preset": {
|
167 |
"target_module": [
|
168 |
+
"JointTransformerBlock"
|
|
|
169 |
],
|
170 |
"module_algo_map": {
|
|
|
|
|
|
|
171 |
"FeedForward": {
|
172 |
"factor": 6
|
173 |
+
},
|
174 |
+
"JointTransformerBlock": {
|
175 |
+
"factor": 12
|
176 |
}
|
177 |
}
|
178 |
}
|
|
|
181 |
|
182 |
## Datasets
|
183 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
184 |
### cheechandchong-cropped-512
|
185 |
- Repeats: 10
|
186 |
- Total number of images: ~24
|
|
|
190 |
- Crop style: None
|
191 |
- Crop aspect: None
|
192 |
- Used for regularisation data: No
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
193 |
### cheechandchong-cropped-1024
|
194 |
- Repeats: 10
|
195 |
- Total number of images: ~24
|
|
|
225 |
generator=torch.Generator(device='cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu').manual_seed(1641421826),
|
226 |
width=1024,
|
227 |
height=1024,
|
228 |
+
guidance_scale=4.0,
|
229 |
).images[0]
|
230 |
image.save("output.png", format="PNG")
|
231 |
```
|