---
tags:
- autoencoder
- image-colorization
- pytorch
- pytorch_model_hub_mixin
license: apache-2.0
datasets:
- flwrlabs/celeba
language:
- en
metrics:
- mse
pipeline_tag: image-to-image
---
# Model Colorization Autoencoder
## Model Description
This autoencoder model is designed for image colorization. It takes grayscale images as input and outputs colorized versions of those images. The model architecture consists of an encoder-decoder structure, where the encoder compresses the input image into a latent representation, and the decoder reconstructs the image in color.
### Architecture
- **Encoder**: Three convolutional layers, each followed by max pooling, a ReLU activation, and batch normalization, ending with a flattening layer and a fully connected layer that produces a 4000-dimensional latent vector.
- **Decoder**: Mirrors the encoder, using a linear layer followed by transposed convolutional layers with ReLU activations and batch normalization. The final layer outputs a 3-channel color image through a sigmoid activation, constraining pixel values to [0, 1].
The architecture details are as follows:
```python
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


class ModelColorization(nn.Module, PyTorchModelHubMixin):
    def __init__(self):
        super().__init__()
        # Encoder: three conv blocks (conv -> max pool -> ReLU -> batch norm)
        # that downsample the input, then flatten to a 4000-dim latent vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.Conv2d(64, 32, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.Conv2d(32, 16, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Flatten(),
            nn.Linear(16 * 45 * 45, 4000),
        )
        # Decoder: project the latent vector back to a 16x45x45 feature map,
        # then upsample with three transposed convolutions to a 3-channel image.
        self.decoder = nn.Sequential(
            nn.Linear(4000, 16 * 45 * 45),
            nn.ReLU(),
            nn.Unflatten(1, (16, 45, 45)),
            nn.ConvTranspose2d(16, 32, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.ConvTranspose2d(32, 64, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.ConvTranspose2d(64, 3, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x
```
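The `16 * 45 * 45` flatten size implies 360×360 grayscale inputs, since three 2×2 max-pooling steps reduce 360 to 45. As a quick sanity check (an illustrative snippet, not part of the original card):

```python
import torch

model = ModelColorization()
gray = torch.randn(1, 1, 360, 360)  # one 360x360 single-channel image
color = model(gray)
print(color.shape)  # torch.Size([1, 3, 360, 360])
```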
### Training Details
The model was trained with PyTorch for 5 epochs. The training and validation losses observed during training were:

| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1     | 0.0063        | 0.0042          |
| 2     | 0.0036        | 0.0035          |
| 3     | 0.0032        | 0.0032          |
| 4     | 0.0030        | 0.0030          |
| 5     | 0.0029        | 0.0030          |

Training loss decreased steadily across all five epochs, while validation loss fell until plateauing at 0.0030 from epoch 4 onward.
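The training script itself is not included in the card. A minimal sketch of such a loop, assuming MSE loss (consistent with the `mse` metric above), the Adam optimizer, and a hypothetical `train_loader` yielding (grayscale, color) pairs:

```python
import torch
import torch.nn as nn

model = ModelColorization()
criterion = nn.MSELoss()  # mse, per the model card metrics
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # optimizer and lr are assumptions

for epoch in range(5):
    model.train()
    running_loss = 0.0
    for gray, color in train_loader:  # hypothetical DataLoader of (grayscale, color) pairs
        optimizer.zero_grad()
        output = model(gray)
        loss = criterion(output, color)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"Epoch {epoch + 1}: Training Loss: {running_loss / len(train_loader):.4f}")
```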
### Usage
Because the model subclasses `PyTorchModelHubMixin`, it is loaded through the class's own `from_pretrained` method rather than `transformers.AutoModel`. First, ensure the necessary dependencies are installed:
```bash
pip install torch huggingface_hub
```
Then, with the `ModelColorization` class definition from above in scope, load the weights from the Hugging Face Hub:
```python
model = ModelColorization.from_pretrained("sebastiansarasti/AutoEncoderImageColorization")
```
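For inference, an illustrative end-to-end snippet follows. This is a sketch: the file names, the 360×360 resize implied by the flatten size, and the torchvision preprocessing are assumptions, not part of the original card.

```python
import torch
from PIL import Image
from torchvision import transforms

# Preprocess: single-channel tensor in [0, 1] at the 360x360 input size
# implied by the architecture.
preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.Resize((360, 360)),
    transforms.ToTensor(),
])

image = Image.open("input.jpg")        # hypothetical input file
gray = preprocess(image).unsqueeze(0)  # add batch dimension: (1, 1, 360, 360)

model.eval()
with torch.no_grad():
    color = model(gray)                # (1, 3, 360, 360), values in [0, 1]

transforms.ToPILImage()(color.squeeze(0)).save("colorized.jpg")
```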