---
datasets:
- CIFAR-10
library_name: tf-keras
tags:
- clustering
---

## Model description

This is an image clustering model trained with the [**Semantic Clustering by Adopting Nearest neighbors (SCAN)**](https://arxiv.org/abs/2005.12320) algorithm (Van Gansbeke et al., 2020).

The training procedure follows the example on <a href='https://keras.io/examples/vision/semantic_image_clustering/' target='_blank'>**keras.io**</a> by [Khalid Salama](https://www.linkedin.com/in/khalid-salama-24403144/).

The algorithm consists of two phases:

1. Self-supervised visual representation learning of images, in which we use the SimCLR technique.
2. Clustering of the learned visual representation vectors to maximize the agreement between the cluster assignments of neighboring vectors.

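The second phase can be illustrated with a minimal sketch of a SCAN-style clustering objective: a consistency term that rewards an image and its neighbours for sharing a cluster, plus an entropy term that discourages collapsing everything into one cluster. This is plain Python for readability, not the tf-keras code used to train this model; the function name `scan_clustering_loss` and the `entropy_weight` default are assumptions made for illustration.

```python
import math


def scan_clustering_loss(anchor_probs, neighbour_probs, entropy_weight=1.0):
    """Sketch of a SCAN-style loss (hypothetical helper, not the trained model's code).

    anchor_probs: list of cluster-probability vectors, one per image.
    neighbour_probs: for each image, a list of probability vectors of its neighbours.
    entropy_weight: assumed weighting of the entropy term.
    """
    eps = 1e-8  # avoids log(0)

    # Consistency term: -log of the dot product between the cluster
    # probabilities of an image and each of its neighbours.
    consistency, pairs = 0.0, 0
    for p, neighbours in zip(anchor_probs, neighbour_probs):
        for q in neighbours:
            agreement = sum(pi * qi for pi, qi in zip(p, q))
            consistency -= math.log(max(agreement, eps))
            pairs += 1
    consistency /= pairs

    # Entropy term: negative entropy of the mean cluster assignment.
    # It is lowest when images are spread evenly over the clusters.
    num_clusters = len(anchor_probs[0])
    mean_probs = [
        sum(p[c] for p in anchor_probs) / len(anchor_probs)
        for c in range(num_clusters)
    ]
    negative_entropy = sum(m * math.log(max(m, eps)) for m in mean_probs)

    return consistency + entropy_weight * negative_entropy


# Two images that agree with their neighbours and use both clusters ...
balanced = scan_clustering_loss([[1.0, 0.0], [0.0, 1.0]],
                                [[[1.0, 0.0]], [[0.0, 1.0]]])
# ... versus two images that all fall into the same cluster.
collapsed = scan_clustering_loss([[1.0, 0.0], [1.0, 0.0]],
                                 [[[1.0, 0.0]], [[1.0, 0.0]]])
print(balanced < collapsed)
```

In this toy case the balanced assignment scores lower than the collapsed one, which is the behaviour the entropy term is meant to encourage.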
## Intended uses & limitations

The model is intended to show the effective use of self-supervised learning combined with nearest neighbours for (semantic) image clustering.

You can use these clusters to retrieve images of the same class.

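As a rough sketch of cluster-based retrieval, the snippet below groups image indices by their most probable cluster and orders each group by confidence. It assumes you already have one vector of cluster probabilities per image; `group_by_cluster` and `retrieve_similar` are hypothetical helper names, not part of this repository.

```python
from collections import defaultdict


def group_by_cluster(cluster_probs):
    """Map each cluster id to its image indices, most confident first."""
    members = defaultdict(list)
    for index, probs in enumerate(cluster_probs):
        # Assign the image to the cluster with the highest probability.
        cluster_id = max(range(len(probs)), key=probs.__getitem__)
        members[cluster_id].append((probs[cluster_id], index))
    return {
        cluster_id: [i for _, i in sorted(scored, key=lambda s: -s[0])]
        for cluster_id, scored in members.items()
    }


def retrieve_similar(query_index, cluster_probs):
    """Return the other images assigned to the same cluster as the query."""
    for indices in group_by_cluster(cluster_probs).values():
        if query_index in indices:
            return [i for i in indices if i != query_index]
    return []


# Toy probabilities for three images and two clusters.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
print(group_by_cluster(probs))
print(retrieve_similar(0, probs))
```

Given the per-cluster accuracies reported in this card, images retrieved this way will often belong to different classes, so treat the result as a candidate set.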
### Limitations

This model is not meant to outperform supervised image classification. It serves as a proof of concept (POC) that unsupervised learning can cluster similar images together without any labels.

### Possible Improvements

As noted by the original author on keras.io, the following steps can improve the accuracy further:

1. Increase the number of epochs in the representation learning and the clustering phases.
2. Allow the encoder weights to be tuned during the clustering phase.
3. Perform a final fine-tuning step through self-labeling, as described in the original SCAN paper.

## Training and evaluation data

### Training Data

The model was trained using the [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html). For training, the images were scaled to a shape of (32, 32, 3).

### Hyperparameters

The following parameters were used for training:

- Feature Vector Dimension: 512
- Projection Units of Head: 128
- Number of Clusters: 20
- K-Neighbours: 5

The encoder was not tuned during clustering.

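To make the K-Neighbours setting concrete, here is a minimal sketch of a nearest-neighbour lookup over feature vectors using cosine similarity. Cosine similarity, the exclusion of each image from its own neighbour list, and the name `k_nearest_neighbours` are assumptions of this sketch; the actual training code may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def k_nearest_neighbours(features, k=5):
    """For each feature vector, return the indices of its k most similar others."""
    neighbours = []
    for i, anchor in enumerate(features):
        # Score every other vector; the anchor itself is skipped (assumption).
        scored = [
            (cosine_similarity(anchor, other), j)
            for j, other in enumerate(features)
            if j != i
        ]
        scored.sort(key=lambda s: s[0], reverse=True)
        neighbours.append([j for _, j in scored[:k]])
    return neighbours


# Toy 2-dimensional features; the real model uses 512-dimensional vectors.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(k_nearest_neighbours(features, k=1))
```

This brute-force loop is quadratic in the number of images and is only meant to show the idea on a handful of vectors.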
### Evaluation

#### Visualization of highest confidence cluster picks

![Sample Cluster Picks](./cluster_images.png)

#### Clusters and their respective labels, accuracy and size

| Cluster | Label | Accuracy | Size |
|:---------|:-------------:|-----:| -----:|
|cluster 0| frog | 31.6 %|3582|
|cluster 1| frog | 19.76 %|2348|
|cluster 2| horse | 26.82 %|2983|
|cluster 3| bird | 29.7 %|1532|
|cluster 4| airplane | 39.16 %|3575|
|cluster 5| ship | 22.38 %|2207|
|cluster 6| automobile | 26.41 %|4365|
|cluster 7| dog | 21.09 %|5049|
|cluster 8| automobile | 21.94 %|4093|
|cluster 9| truck | 29.66 %|4639|
|cluster 10| bird | 23.02 %|1455|
|cluster 11| truck | 17.78 %|3937|
|cluster 12| deer | 30.36 %|2635|
|cluster 13| dog | 22.62 %|1950|
|cluster 14| frog | 22.64 %|4391|
|cluster 15| airplane | 26.89 %|2838|
|cluster 16| ship | 34.7 %|2213|
|cluster 17| ship | 17.59 %|1785|
|cluster 18| cat | 16.57 %|1997|
|cluster 19| deer | 27.25 %|2426|

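This card does not state how the Label and Accuracy columns were derived. One common reading, assumed in the sketch below, is that each cluster is labeled with its most frequent ground-truth class and that accuracy is the share of cluster members carrying that class. The helper name `summarize_clusters` is hypothetical.

```python
from collections import Counter


def summarize_clusters(cluster_ids, true_labels):
    """Majority label, accuracy in percent, and size for every cluster."""
    members = {}
    for cluster_id, label in zip(cluster_ids, true_labels):
        members.setdefault(cluster_id, []).append(label)

    summary = {}
    for cluster_id, labels in sorted(members.items()):
        # The most frequent ground-truth label names the cluster (assumption).
        label, count = Counter(labels).most_common(1)[0]
        summary[cluster_id] = {
            "label": label,
            "accuracy": 100.0 * count / len(labels),
            "size": len(labels),
        }
    return summary


# Toy example: five images assigned to two clusters.
cluster_ids = [0, 0, 0, 1, 1]
true_labels = ["frog", "frog", "cat", "ship", "ship"]
for cluster_id, row in summarize_clusters(cluster_ids, true_labels).items():
    print(cluster_id, row["label"], round(row["accuracy"], 2), row["size"])
```

Under this reading, several clusters can share a label, which matches the repeated labels in the table.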
## Model Plot

<details>
<summary>View Model Plot</summary>

![Model Image](./model.png)

</details>