Model Details
Model Description
This model is fine-tuned from stable-diffusion-v1-5 on 110,000 image-text pairs from the MIMIC dataset using the Norm-tuning PEFT method. Under this strategy, only the normalization weights in the U-Net are fine-tuned, while everything else is kept frozen.
- Developed by: Raman Dutt
- Shared by: Raman Dutt
- Model type: Stable Diffusion fine-tuned using Parameter-Efficient Fine-Tuning (PEFT)
- Finetuned from model: stable-diffusion-v1-5
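The Norm-tuning idea described above can be sketched on a toy network. This is a minimal illustration, not the authors' training code: the helper name `freeze_all_but_norms`, the list of normalization layer types, and the choice to unfreeze both the scale and shift parameters of each normalization layer are assumptions made for the example.

```python
import torch.nn as nn

# Layer types treated as "normalization" in this sketch (an assumption).
NORM_TYPES = (nn.GroupNorm, nn.LayerNorm, nn.BatchNorm2d)


def freeze_all_but_norms(model: nn.Module) -> int:
    """Freeze every parameter, then re-enable gradients for normalization layers.

    Returns the number of parameters left trainable.
    """
    for param in model.parameters():
        param.requires_grad = False

    n_trainable = 0
    for module in model.modules():
        if isinstance(module, NORM_TYPES):
            # recurse=False: only the parameters owned directly by this layer
            for param in module.parameters(recurse=False):
                param.requires_grad = True
                n_trainable += param.numel()
    return n_trainable


# Toy stand-in for a U-Net block: conv -> group norm -> activation -> conv
toy = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.GroupNorm(num_groups=4, num_channels=8),
    nn.SiLU(),
    nn.Conv2d(8, 3, kernel_size=3, padding=1),
)

trainable = freeze_all_but_norms(toy)
total = sum(p.numel() for p in toy.parameters())
print(f"trainable parameters: {trainable} of {total}")
```

In principle the same helper could be applied to `pipe.unet` from the usage example before building an optimizer over the parameters that still have `requires_grad=True`.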
Model Sources
- Paper: Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity
- Demo: MIMIC-SD-PEFT-Demo
Direct Use
This model can be directly used to generate realistic medical images from text prompts.
How to Get Started with the Model
import os
from safetensors.torch import load_file
from diffusers.pipelines import StableDiffusionPipeline
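# sd_folder_path is a placeholder: point it at your copy of stable-diffusion-v1-5
sd_folder_path = "path/to/stable-diffusion-v1-5"
pipe = StableDiffusionPipeline.from_pretrained(sd_folder_path, revision="fp16")
# Path to the fine-tuned U-Net weights, relative to where this repository was downloaded
exp_path = os.path.join('unet', 'diffusion_pytorch_model.safetensors')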
state_dict = load_file(exp_path)
# Load the adapted U-Net
pipe.unet.load_state_dict(state_dict, strict=False)
pipe.to('cuda:0')
# Generate images with text prompts
TEXT_PROMPT = "No acute cardiopulmonary abnormality."
GUIDANCE_SCALE = 4
INFERENCE_STEPS = 75
result_image = pipe(
    prompt=TEXT_PROMPT,
    height=224,
    width=224,
    guidance_scale=GUIDANCE_SCALE,
    num_inference_steps=INFERENCE_STEPS,
)
result_pil_image = result_image["images"][0]
Training Details
Training Data
This model has been fine-tuned on 110K image-text pairs from the MIMIC dataset.
Training Procedure
The training procedure is described in detail in Section 4.3 of the paper listed under Model Sources.
Metrics
This model has been evaluated on the MIMIC dataset using the Fréchet Inception Distance (FID). Lower FID scores indicate that generated images are closer to real ones.
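For reference, the standard definition of FID is given below. The symbols are introduced here for illustration and do not appear in the original card: $\mu_r, \Sigma_r$ denote the mean and covariance of Inception features computed on real images, and $\mu_g, \Sigma_g$ the same statistics for generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

The score is zero when the two feature distributions have identical means and covariances, and it grows as they diverge.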
Results
| Fine-Tuning Strategy | FID Score |
|---|---|
| Full FT | 58.74 |
| Attention | 52.41 |
| Bias | 20.81 |
| Norm | 29.84 |
| Bias+Norm+Attention | 35.93 |
| LoRA | 439.65 |
| SV-Diff | 23.59 |
| DiffFit | 42.50 |
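To make the table easier to compare, the short sketch below ranks the strategies by FID and expresses each score relative to full fine-tuning. The FID values are copied from the table above. The relative change is computed here only for illustration and is not a metric reported in the card.

```python
# FID scores copied from the results table (lower is better)
fid = {
    "Full FT": 58.74,
    "Attention": 52.41,
    "Bias": 20.81,
    "Norm": 29.84,
    "Bias+Norm+Attention": 35.93,
    "LoRA": 439.65,
    "SV-Diff": 23.59,
    "DiffFit": 42.50,
}

baseline = fid["Full FT"]
best = min(fid, key=fid.get)

# Percentage reduction in FID relative to full fine-tuning
# (positive means better than Full FT, negative means worse)
reduction_pct = {
    name: 100 * (baseline - score) / baseline for name, score in fid.items()
}

for name, score in sorted(fid.items(), key=lambda item: item[1]):
    print(f"{name:<22} FID={score:7.2f}  reduction vs Full FT: {reduction_pct[name]:+.1f}%")

print(f"Lowest FID: {best}")
```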
Environmental Impact
Parameter-Efficient Fine-Tuning potentially causes less harm to the environment, since only a small fraction of the model's parameters is fine-tuned. This substantially reduces compute and hardware requirements.
Citation
BibTeX:
@article{dutt2023parameter,
  title={Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity},
  author={Dutt, Raman and Ericsson, Linus and Sanchez, Pedro and Tsaftaris, Sotirios A and Hospedales, Timothy},
  journal={arXiv preprint arXiv:2305.08252},
  year={2023}
}
APA:
Dutt, R., Ericsson, L., Sanchez, P., Tsaftaris, S. A., & Hospedales, T. (2023). Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity. arXiv preprint arXiv:2305.08252.
Model Card Authors