Model description
This is an image clustering model trained with the Semantic Clustering by Adopting Nearest neighbors (SCAN) algorithm (Van Gansbeke et al., 2020).
The training procedure follows the example on keras.io by Khalid Salama.
The algorithm consists of two phases:
- Self-supervised visual representation learning of images, for which we use the SimCLR technique.
- Clustering of the learned visual representation vectors to maximize the agreement between the cluster assignments of neighboring vectors.
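The second phase can be illustrated with a minimal NumPy sketch of the clustering objective as formulated in the SCAN paper: a consistency term that rewards an image and its neighbour for receiving the same cluster assignment, and an entropy term that discourages collapsing everything into one cluster. The function name, `entropy_weight`, and `eps` are illustrative assumptions; the TensorFlow implementation in the keras.io example may differ in detail.

```python
import numpy as np

def scan_clustering_loss(anchor_probs, neighbour_probs, entropy_weight=1.0, eps=1e-9):
    """Sketch of a SCAN-style clustering loss (hypothetical helper).

    anchor_probs, neighbour_probs: arrays of shape (batch, n_clusters) holding
    soft cluster assignments (each row sums to 1) for images and their neighbours.
    """
    # Consistency: the dot product of two assignment vectors approaches 1
    # when both put their mass on the same cluster.
    similarity = np.sum(anchor_probs * neighbour_probs, axis=1)
    consistency_loss = -np.mean(np.log(similarity + eps))

    # Entropy of the mean assignment: high when clusters are used evenly.
    mean_probs = np.mean(anchor_probs, axis=0)
    entropy = -np.sum(mean_probs * np.log(mean_probs + eps))

    # Minimise disagreement while maximising the spread over clusters.
    return consistency_loss - entropy_weight * entropy


balanced = np.array([[1.0, 0.0], [0.0, 1.0]])
collapsed = np.array([[1.0, 0.0], [1.0, 0.0]])
swapped = balanced[:, ::-1]

loss_agree = scan_clustering_loss(balanced, balanced)      # neighbours agree, clusters balanced
loss_collapsed = scan_clustering_loss(collapsed, collapsed)  # neighbours agree, one cluster only
loss_disagree = scan_clustering_loss(balanced, swapped)    # neighbours disagree
```

In this toy setup the loss is lowest when neighbours agree and both clusters are used, higher when all images collapse into one cluster, and highest when neighbours disagree.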
Intended uses & limitations
The model is intended to show the effective use of self-supervised learning combined with nearest neighbours for (semantic) image clustering.
You can use these clusters to retrieve images of the same class.
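As a rough sketch of such a retrieval, assuming the model outputs one row of cluster probabilities per image, the most confident members of a cluster can be selected as follows. `top_images_per_cluster` and its parameters are hypothetical names, not part of the published model.

```python
import numpy as np

def top_images_per_cluster(cluster_probs, cluster_id, top_n=10):
    """Return indices of the most confident images assigned to `cluster_id`.

    cluster_probs: array of shape (n_images, n_clusters), assumed to be the
    model's softmax output for each image.
    """
    assignments = np.argmax(cluster_probs, axis=1)   # hard cluster per image
    confidences = np.max(cluster_probs, axis=1)      # probability of that cluster
    members = np.flatnonzero(assignments == cluster_id)
    order = np.argsort(-confidences[members])        # most confident first
    return members[order[:top_n]]


# Toy probabilities for four images and two clusters.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.55, 0.45]])
cluster0_top2 = top_images_per_cluster(probs, cluster_id=0, top_n=2)
cluster1_all = top_images_per_cluster(probs, cluster_id=1)
```

The returned indices can then be used to look up the corresponding images in the dataset.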
Limitations
This model is not meant to outperform supervised image classification; it serves as a proof of concept (POC) that unsupervised learning can cluster similar images together without any labels.
Possible Improvements:
As suggested by the original author on keras.io, the following steps can be taken to improve the accuracy further:
- increase the number of epochs in the representation learning and the clustering phases;
- allow the encoder weights to be tuned during the clustering phase;
- perform a final fine-tuning step through self-labeling, as described in the original SCAN paper.
Training and evaluation data
Training Data
The model was trained on the CIFAR-10 dataset. For training, the images were scaled to (32, 32, 3).
Hyperparameters
The following parameters were used for training:
- Feature Vector Dimension: 512
- Projection Units of Head: 128
- Number of Clusters: 20
- K-Neighbours: 5
The encoder was not tuned during clustering.
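To illustrate the role of the K-Neighbours setting, here is a minimal sketch that finds the 5 nearest neighbours of each feature vector by cosine similarity. It assumes the encoder output is available as a non-zero NumPy array of shape (n_images, 512); `nearest_neighbours` is a hypothetical helper, and the similarity measure used during the actual training run may differ.

```python
import numpy as np

def nearest_neighbours(features, k=5):
    """Return, for each row, the indices of its k most similar other rows.

    features: array of shape (n_images, feature_dim) with non-zero rows.
    """
    # L2-normalise so that the dot product equals cosine similarity.
    normalized = features / np.linalg.norm(features, axis=1, keepdims=True)
    similarities = normalized @ normalized.T
    # Exclude each vector from its own neighbour list.
    np.fill_diagonal(similarities, -np.inf)
    # Sort by descending similarity and keep the first k columns.
    return np.argsort(-similarities, axis=1)[:, :k]


# Toy check with two tight pairs of 2-D vectors.
toy = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
toy_neighbours = nearest_neighbours(toy, k=1)

# Random stand-in for 512-dimensional encoder features.
rng = np.random.default_rng(0)
random_neighbours = nearest_neighbours(rng.normal(size=(20, 512)), k=5)
```

Computing the full similarity matrix is fine for a sketch but grows quadratically with the number of images, so larger datasets are usually processed in batches.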
Evaluation
Visualization of highest confidence cluster picks
Clusters and their respective labels, accuracy and size
Cluster | Label | Accuracy | Size |
---|---|---|---|
cluster 0 | frog | 31.6 % | 3582 |
cluster 1 | frog | 19.76 % | 2348 |
cluster 2 | horse | 26.82 % | 2983 |
cluster 3 | bird | 29.7 % | 1532 |
cluster 4 | airplane | 39.16 % | 3575 |
cluster 5 | ship | 22.38 % | 2207 |
cluster 6 | automobile | 26.41 % | 4365 |
cluster 7 | dog | 21.09 % | 5049 |
cluster 8 | automobile | 21.94 % | 4093 |
cluster 9 | truck | 29.66 % | 4639 |
cluster 10 | bird | 23.02 % | 1455 |
cluster 11 | truck | 17.78 % | 3937 |
cluster 12 | deer | 30.36 % | 2635 |
cluster 13 | dog | 22.62 % | 1950 |
cluster 14 | frog | 22.64 % | 4391 |
cluster 15 | airplane | 26.89 % | 2838 |
cluster 16 | ship | 34.7 % | 2213 |
cluster 17 | ship | 17.59 % | 1785 |
cluster 18 | cat | 16.57 % | 1997 |
cluster 19 | deer | 27.25 % | 2426 |
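
A table of this kind can be produced by assigning each cluster its most frequent ground-truth label and reporting the share of cluster members that carry it. The sketch below assumes that this is how the label and accuracy columns were derived, as in the keras.io example; `cluster_report` is a hypothetical helper and the data is made up for illustration.

```python
from collections import Counter, defaultdict

def cluster_report(cluster_ids, true_labels):
    """Map each cluster to (majority label, accuracy in percent, size)."""
    labels_by_cluster = defaultdict(list)
    for cluster_id, label in zip(cluster_ids, true_labels):
        labels_by_cluster[cluster_id].append(label)

    report = {}
    for cluster_id, labels in sorted(labels_by_cluster.items()):
        # The majority label is the most common ground-truth label in the cluster.
        majority_label, count = Counter(labels).most_common(1)[0]
        accuracy = 100.0 * count / len(labels)
        report[cluster_id] = (majority_label, accuracy, len(labels))
    return report


# Made-up assignments for five images.
example_report = cluster_report(
    cluster_ids=[0, 0, 0, 1, 1],
    true_labels=["frog", "frog", "cat", "ship", "ship"],
)
```

Because the number of clusters (20) exceeds the number of CIFAR-10 classes (10), several clusters can share the same majority label, as seen for frog, ship, and others in the table.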
Model Plot