xiaozaa committed
Commit f610e83 · 1 Parent(s): b633718

add fid score

Files changed (2)
  1. README.md +14 -2
  2. script/fid_eval.py +43 -0
README.md CHANGED
@@ -1,8 +1,19 @@
  # catvton-flux

- An advanced virtual try-on solution that combines the power of [CATVTON](https://arxiv.org/abs/2407.15886) (Contrastive Appearance and Topology Virtual Try-On) with Flux fill inpainting model for realistic and accurate clothing transfer.
+ A state-of-the-art virtual try-on solution that combines the power of [CATVTON](https://arxiv.org/abs/2407.15886) (Contrastive Appearance and Topology Virtual Try-On) with Flux fill inpainting model for realistic and accurate clothing transfer.
  Also inspired by [In-Context LoRA](https://arxiv.org/abs/2410.23775) for prompt engineering.

+ ## Update
+ [![SOTA](https://img.shields.io/badge/SOTA-FID%205.59-brightgreen)](https://drive.google.com/file/d/1T2W5R1xH_uszGVD8p6UUAtWyx43rxGmI/view?usp=sharing)
+ [![Dataset](https://img.shields.io/badge/Dataset-VITON--HD-blue)](https://github.com/shadow2496/VITON-HD)
+
+ ---
+ **Latest Achievement** (2024/11/24):
+ - Released FID score and gradio demo
+ - CatVton-Flux-Alpha achieved **SOTA** performance with FID: `5.593255043029785` on the VITON-HD dataset. Test configuration: scale 30, step 30. My VITON-HD test inference results are available [here](https://drive.google.com/file/d/1T2W5R1xH_uszGVD8p6UUAtWyx43rxGmI/view?usp=sharing)
+
+ ---
+
  ## Showcase
  | Original | Garment | Result |
  |----------|---------|---------|
@@ -41,9 +52,10 @@ python app.py


  ## TODO:
- - [ ] Release the FID score
+ - [x] Release the FID score
  - [x] Add gradio demo
  - [ ] Release updated weights with better performance
+ - [ ] Train a smaller model

  ## Citation

script/fid_eval.py ADDED
@@ -0,0 +1,43 @@
+ from PIL import Image
+ import os
+ import numpy as np
+ from torchvision.transforms import functional as F
+ import torch
+ from torchmetrics.image.fid import FrechetInceptionDistance
+
+
+ # Paths setup
+ generated_dataset_path = "output/tryon_results"
+ original_dataset_path = "data/VITON-HD/test/image"  # Replace with your actual original dataset path
+
+ # Get generated images
+ image_paths = sorted([os.path.join(generated_dataset_path, x) for x in os.listdir(generated_dataset_path)])
+ generated_images = [np.array(Image.open(path).convert("RGB")) for path in image_paths]
+
+ # Get corresponding original images
+ original_images = []
+ for gen_path in image_paths:
+     # Map "tryon_XXXXXX.jpg" to the original file name "XXXXXX.jpg"
+     base_name = os.path.basename(gen_path)  # get filename from path
+     original_id = base_name.replace("tryon_", "")  # remove "tryon_" prefix
+
+     # Construct original image path
+     original_path = os.path.join(original_dataset_path, original_id)
+     original_images.append(np.array(Image.open(original_path).convert("RGB")))
+
+ # NOTE: center_crop takes (height, width). For images 768 wide by 1024 tall,
+ # (768, 1024) crops the height and zero-pads the width; (1024, 768) keeps the full image.
+ def preprocess_image(image):
+     image = torch.tensor(image).unsqueeze(0)  # HWC uint8 array -> 1 x H x W x C tensor
+     image = image.permute(0, 3, 1, 2) / 255.0  # NCHW, float values in [0, 1]
+     return F.center_crop(image, (768, 1024))
+
+ real_images = torch.cat([preprocess_image(image) for image in original_images])
+ fake_images = torch.cat([preprocess_image(image) for image in generated_images])
+ print(real_images.shape, fake_images.shape)
+
+ fid = FrechetInceptionDistance(normalize=True)  # normalize=True: inputs are floats in [0, 1]
+ fid.update(real_images, real=True)
+ fid.update(fake_images, real=False)
+
+ print(f"FID: {float(fid.compute())}")
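
For context on what the script's final `fid.compute()` call reports: FID is the Fréchet distance between two Gaussians fitted to Inception features of the real and generated images, `||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 (C_r C_f)^(1/2))`. The sketch below shows only that distance in plain NumPy, assuming the features have already been extracted. The function name `frechet_distance` and the random stand-in features are illustrative and not part of this commit, and torchmetrics' own implementation may differ in numerical details.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two (N, D) feature arrays."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    # Tr((cov_r cov_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f, which are real and non-negative when
    # both covariances are positive semi-definite.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_f) - 2.0 * tr_sqrt)


rng = np.random.default_rng(0)
real = rng.normal(size=(2000, 8))       # hypothetical stand-in for Inception features
shifted = real + 1.0                    # same covariance, mean shifted by 1 in each dimension
print(frechet_distance(real, real))     # close to 0
print(frechet_distance(real, shifted))  # close to 8, the squared norm of the shift
```

Identical feature sets give a distance near zero, and a pure mean shift contributes its squared norm, which is why lower FID indicates generated images whose feature statistics sit closer to the real ones.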