|
--- |
|
library_name: keras |
|
license: mit |
|
language: |
|
- en |
|
pipeline_tag: image-to-image |
|
tags: |
|
- art |
|
- pixel_art |
|
- character_sprite |
|
- missing_data_imputation |
|
- image_to_image |
|
--- |
|
|
|
## Model description |
|
|
|
The MDIGAN-Characters model was proposed at SBGames 2024 ([paper on arXiv][paper-arxiv], [page][paper-page], [demo][paper-demo]). |
|
It is a model trained for the task of generating characters in a missing pose: for instance, |
|
given images of a character facing back, left, and right, it can generate the character facing front (missing data imputation task). |
|
![](https://i.imgur.com/s5ONl9Q.png) |
|
|
|
The model's architecture is based on that of [CollaGAN][paper-collagan], a model trained to impute images in missing domains |
|
in a multi-domain scenario. In our case, the domains are the sides a character might face, i.e., back, left, front, and right. |
|
|
|
We tested the model with 3 input images to generate the missing one, and we also evaluated the quality of the generated |
|
images when the model receives only 2 or 1 input images. |
|
|
|
The inputs to the model are the target (missing) domain and 4 image-like tensors with size 64x64x4 in the order |
|
back, left, front, and right. The input images should be floating point tensors in the range of [-1, 1]. |
|
In place of the missing image(s), we must provide a tensor with shape 64x64x4 filled with zeros. |
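
The input layout described above can be sketched with NumPy as follows. This is only an illustration: `to_model_range` and `build_inputs` are hypothetical helpers, not part of the released code, and representing the target domain as an integer index in the order back, left, front, right is an assumption (the exact encoding and call signature expected by the model should be checked against the repository).

```python
import numpy as np

# Domain order stated in this model card.
DOMAINS = ("back", "left", "front", "right")
IMAGE_SHAPE = (64, 64, 4)  # height, width, RGBA channels


def to_model_range(image_uint8):
    """Scale an 8-bit RGBA image from [0, 255] to floats in [-1, 1]."""
    return image_uint8.astype(np.float32) / 127.5 - 1.0


def build_inputs(available, target):
    """Hypothetical helper: return (target_index, images) for one character.

    `available` maps a domain name to a float image in [-1, 1].
    The target domain and any other missing domain are filled with zeros.
    """
    if target not in DOMAINS:
        raise ValueError(f"unknown target domain: {target!r}")
    images = []
    for domain in DOMAINS:
        image = available.get(domain)
        if domain == target or image is None:
            # Missing images are replaced by an all-zero tensor.
            image = np.zeros(IMAGE_SHAPE, dtype=np.float32)
        if image.shape != IMAGE_SHAPE:
            raise ValueError(f"{domain} image must have shape {IMAGE_SHAPE}")
        images.append(image.astype(np.float32))
    # Assumption: the target is passed as an index into DOMAINS.
    return DOMAINS.index(target), np.stack(images)


# Usage with random stand-in sprites (real inputs would be character images).
rng = np.random.default_rng(0)
sprites = {
    name: to_model_range(rng.integers(0, 256, IMAGE_SHAPE, dtype=np.uint8))
    for name in ("back", "left", "right")
}
target_index, images = build_inputs(sprites, "front")
print(target_index, images.shape)  # 2 (4, 64, 64, 4)
```

Leaving out more entries from `available` gives the 2-input and 1-input settings, since every absent domain is zero-filled in the same way.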
|
|
|
|
|
[paper-collagan]: https://www.computer.org/csdl/proceedings-article/cvpr/2019/329300c482/1gys5gg67QY |
|
[paper-arxiv]: https://arxiv.org/abs/2409.10721 |
|
[paper-page]: https://fegemo.github.io/mdigan-characters |
|
[paper-demo]: https://fegemo.github.io/interactive-generator |
|
|
|
## Intended uses & limitations |
|
|
|
This model can be used for research purposes only. The quality of the generated images varies a lot, and a |
|
post-processing step to quantize the colors of the generated image to the intended palette is beneficial. |
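
One possible form of the color quantization step is a nearest-color lookup, sketched below. It is not necessarily the procedure used in the paper: `extract_palette` and `quantize_to_palette` are hypothetical helpers, and both taking the palette from the input images and measuring distance as squared Euclidean distance over RGBA are assumptions.

```python
import numpy as np


def extract_palette(images):
    """Hypothetical helper: unique RGBA colors found in the given images."""
    pixels = np.concatenate([image.reshape(-1, image.shape[-1]) for image in images])
    return np.unique(pixels, axis=0)


def quantize_to_palette(image, palette):
    """Replace every pixel with its nearest palette color.

    `image` has shape (height, width, channels), `palette` has shape
    (n_colors, channels). Distance is squared Euclidean over all channels.
    """
    pixels = image.reshape(-1, 1, image.shape[-1]).astype(np.float32)
    colors = palette.astype(np.float32)[np.newaxis, :, :]
    distances = np.sum((pixels - colors) ** 2, axis=-1)  # (n_pixels, n_colors)
    nearest = np.argmin(distances, axis=1)
    return palette[nearest].reshape(image.shape)


# Usage: a two-color palette and a tiny 1x2 "generated" image in [-1, 1].
palette = np.array([[-1, -1, -1, 1], [1, 1, 1, 1]], dtype=np.float32)
generated = np.array(
    [[[-0.9, -0.8, -1.0, 0.9], [0.7, 0.95, 0.8, 1.0]]], dtype=np.float32
)
quantized = quantize_to_palette(generated, palette)
```

Here the first pixel snaps to the first palette color and the second pixel to the second one. For a real sprite, the palette could be built with `extract_palette` from the images given to the model.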
|
|
|
|
|
## Training and evaluation data |
|
|
|
The model was trained with the [PAC dataset][pac], which features 12,074 paired images of pixel art characters |
|
in 4 directions: back, left, front, and right. Compared to StarGAN and Pix2Pix-based baselines, the MDIGAN-Characters |
|
model yielded much better images when it received 3 input images, and still good images when only 2 were provided. |
|
|
|
[pac]: https://github.com/plucksquire/pac/ |