---
library_name: keras
tags:
- clustering
datasets:
- CIFAR-10
---

## Model description

This is an image clustering model trained following the [**Semantic Clustering by Adopting Nearest neighbors (SCAN)**](https://arxiv.org/abs/2005.12320) (Van Gansbeke et al., 2020) algorithm.

The training procedure follows the example on <a href='https://keras.io/examples/vision/semantic_image_clustering/' target='_blank'>**keras.io**</a> by [Khalid Salama](https://www.linkedin.com/in/khalid-salama-24403144/).

The algorithm consists of two phases:

1. Self-supervised visual representation learning of the images, using the SimCLR technique.
2. Clustering of the learned visual representation vectors so as to maximize the agreement between the cluster assignments of neighboring vectors (see the sketch below).
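
The clustering phase optimizes a SCAN-style objective that rewards agreement between an image's cluster assignment and those of its nearest neighbours, plus an entropy term that spreads assignments across all clusters. The snippet below is a minimal TensorFlow sketch of that objective, not the exact code from the keras.io example; the function name and the `entropy_weight` value are assumptions.

```python
import tensorflow as tf

def scan_clustering_loss(anchor_probs, neighbor_probs, entropy_weight=5.0):
    """Sketch of the SCAN clustering objective for one batch.

    `anchor_probs` and `neighbor_probs` are the softmax cluster assignments
    of an image and one of its nearest neighbours, shape (batch, num_clusters).
    """
    # Consistency term: push the dot product (agreement) between an image's
    # assignment and its neighbour's assignment towards 1.
    agreement = tf.reduce_sum(anchor_probs * neighbor_probs, axis=1)
    consistency_loss = -tf.reduce_mean(tf.math.log(agreement + 1e-8))

    # Entropy term: encourage the batch-averaged assignment to be uniform,
    # so that no cluster collapses to zero usage.
    mean_probs = tf.reduce_mean(anchor_probs, axis=0)
    entropy = -tf.reduce_sum(mean_probs * tf.math.log(mean_probs + 1e-8))

    # SCAN maximizes the entropy, so it is subtracted from the loss.
    return consistency_loss - entropy_weight * entropy
```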

## Intended uses & limitations

The model is intended to demonstrate the effective use of self-supervised learning combined with nearest neighbours for (semantic) image clustering.

You can use these clusters to retrieve images of the same class.
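
For example, the cluster assignments can be used for retrieval. The snippet below is a hypothetical sketch; `clustering_model` is assumed to be the trained clustering head that maps images to softmax cluster probabilities, as in the keras.io example.

```python
import numpy as np

def retrieve_same_cluster(clustering_model, images, query_index=0):
    """Return all images assigned to the same cluster as the query image."""
    probs = clustering_model.predict(images)    # shape: (N, num_clusters)
    cluster_ids = np.argmax(probs, axis=1)      # hard cluster assignment
    query_cluster = cluster_ids[query_index]    # cluster of the query image
    return images[cluster_ids == query_cluster]
```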

### Limitations
This model is not meant to demonstrate any superiority over supervised image classification; it is a proof of concept that unsupervised learning can cluster similar images together without any labels.
### Possible Improvements:
As suggested by the original author on keras.io, the following steps can be taken to improve the accuracy further:
1. Increase the number of epochs in the representation learning and the clustering phases.
2. Allow the encoder weights to be tuned during the clustering phase.
3. Perform a final fine-tuning step through self-labeling, as described in the original SCAN paper.

## Training and evaluation data

### Training Data
The model was trained using the [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html). For training, the images were scaled to (32, 32, 3).
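
A minimal sketch of loading the data with the standard Keras CIFAR-10 loader (labels are discarded, since clustering is unsupervised; variable names are assumptions):

```python
import numpy as np
import tensorflow as tf

# Load CIFAR-10; the labels are not used for (unsupervised) clustering.
(x_train, _), (x_test, _) = tf.keras.datasets.cifar10.load_data()

# Combine the splits and cast to float; each image has shape (32, 32, 3).
images = np.concatenate([x_train, x_test], axis=0).astype("float32")
print(images.shape)  # (60000, 32, 32, 3)
```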

### Hyperparameters
For training, the following parameters were used:
- Feature Vector Dimension: 512
- Projection Units of Head: 128
- Number of Clusters: 20
- K-Neighbours: 5

The encoder was not tuned during clustering.
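
Expressed as code, these settings might look like the following (a hedged sketch; the constant names are illustrative, not identifiers from the training script):

```python
# Hyperparameters mirroring the values listed above (names are illustrative).
REPRESENTATION_DIM = 512    # dimension of the encoder's feature vector
PROJECTION_UNITS = 128      # output units of the projection head (SimCLR phase)
NUM_CLUSTERS = 20           # clusters produced by the clustering head
K_NEIGHBOURS = 5            # nearest neighbours considered per image
TUNE_ENCODER_DURING_CLUSTERING = False  # encoder weights stay frozen in phase 2
```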
### Evaluation
#### Visualization of highest confidence cluster picks
![Visualization of highest confidence cluster picks](clusters.png)

#### Clusters and their respective labels, accuracy and size

| Cluster | Label | Accuracy | Size | 
|:---------|:-------------:|-----:| -----:|
|cluster 0| frog  | 31.6 %|3582|
|cluster 1| frog  | 19.76 %|2348|
|cluster 2| horse  | 26.82 %|2983|
|cluster 3| bird  | 29.7 %|1532|
|cluster 4| airplane  | 39.16 %|3575|
|cluster 5| ship  | 22.38 %|2207|
|cluster 6| automobile  | 26.41 %|4365|
|cluster 7| dog  | 21.09 %|5049|
|cluster 8| automobile  | 21.94 %|4093|
|cluster 9| truck  | 29.66 %|4639|
|cluster 10| bird  | 23.02 %|1455|
|cluster 11| truck  | 17.78 %|3937|
|cluster 12| deer  | 30.36 %|2635|
|cluster 13| dog  | 22.62 %|1950|
|cluster 14| frog  | 22.64 %|4391|
|cluster 15| airplane  | 26.89 %|2838|
|cluster 16| ship  | 34.7 %|2213|
|cluster 17| ship  | 17.59 %|1785|
|cluster 18| cat  | 16.57 %|1997|
|cluster 19| deer  | 27.25 %|2426|
## Model Plot

<details>
<summary>View Model Plot</summary>

![Model Image](./model.png)

</details>