Johannes Kolbe committed on
Commit · 8ceac63
1 Parent(s): dafd73b
update model card
README.md CHANGED
@@ -2,20 +2,76 @@
library_name: keras
tags:
- clustering
---

## Model description

-

## Intended uses & limitations

-

## Training and evaluation data

-

## Model Plot

<details>
library_name: keras
tags:
- clustering
+ datasets:
+ - CIFAR-10
---
## Model description

+ This is an image clustering model trained with the [**Semantic Clustering by Adopting Nearest neighbors (SCAN)**](https://arxiv.org/abs/2005.12320) (Van Gansbeke et al., 2020) algorithm.
+
+ Training followed the example on <a href='https://keras.io/examples/vision/semantic_image_clustering/' target='_blank'>**keras.io**</a> by [Khalid Salama](https://www.linkedin.com/in/khalid-salama-24403144/).
+
+ The algorithm consists of two phases:
+
+ 1. Self-supervised visual representation learning of images, using the SimCLR technique.
+ 2. Clustering of the learned visual representation vectors to maximize the agreement between the cluster assignments of neighboring vectors.
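
The clustering phase can be illustrated with a minimal sketch of a SCAN-style objective, written in plain Python for readability. It assumes every image already has a soft cluster assignment (a probability vector over the clusters). The function name `scan_loss`, the `entropy_weight` default and the `eps` constant are illustrative assumptions, not values taken from this model's training code.

```python
import math


def scan_loss(anchor_probs, neighbour_probs, entropy_weight=1.0, eps=1e-8):
    """SCAN-style loss for a batch of (anchor, neighbour) soft assignments.

    anchor_probs / neighbour_probs: lists of probability vectors, one per pair.
    entropy_weight is a hypothetical setting chosen for illustration only.
    """
    n = len(anchor_probs)
    num_clusters = len(anchor_probs[0])

    # Consistency term: an image and its neighbour should receive the same
    # cluster, so the dot product of their assignments should be close to 1.
    consistency = 0.0
    for p, q in zip(anchor_probs, neighbour_probs):
        agreement = sum(pi * qi for pi, qi in zip(p, q))
        consistency -= math.log(agreement + eps)
    consistency /= n

    # Entropy term: sum_c m[c] * log(m[c]) is the negative entropy of the mean
    # assignment m; minimising it discourages collapsing into one cluster.
    mean_probs = [sum(p[c] for p in anchor_probs) / n for c in range(num_clusters)]
    neg_entropy = sum(m * math.log(m + eps) for m in mean_probs)

    return consistency + entropy_weight * neg_entropy
```

In this sketch the loss is lowest when neighbours agree and the batch is spread evenly over the clusters; disagreeing neighbours or a batch that collapses into a single cluster both raise it.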

## Intended uses & limitations

+ The model is intended to show the effective use of self-supervised learning combined with nearest neighbours for (semantic) image clustering.
+
+ You can use these clusters to retrieve images of the same class.
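
Retrieval by cluster could look roughly like the sketch below. It assumes `cluster_probs` holds one probability vector per image, as a clustering head with a softmax output would produce. The helper names `group_by_cluster` and `retrieve_similar` are hypothetical and not part of the model repository.

```python
from collections import defaultdict


def group_by_cluster(cluster_probs):
    """Map each cluster id to (image_index, confidence) pairs, most confident first."""
    clusters = defaultdict(list)
    for index, probs in enumerate(cluster_probs):
        # Assign the image to its most probable cluster.
        cluster_id = max(range(len(probs)), key=probs.__getitem__)
        clusters[cluster_id].append((index, probs[cluster_id]))
    for members in clusters.values():
        members.sort(key=lambda item: item[1], reverse=True)
    return dict(clusters)


def retrieve_similar(query_index, cluster_probs, top_k=3):
    """Return up to top_k other image indices from the query image's cluster."""
    for members in group_by_cluster(cluster_probs).values():
        indices = [i for i, _ in members]
        if query_index in indices:
            return [i for i in indices if i != query_index][:top_k]
    return []
```

Sorting by confidence means the first results are the images the model assigns to the cluster most decisively.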
+
+ ### Limitations
+ This model is not meant to outperform supervised image classification. It is a proof of concept (POC) showing that unsupervised learning can cluster similar images together without any labels.
+ ### Possible Improvements
+ As suggested by the original author on keras.io, the following steps can improve the accuracy further:
+ 1) increase the number of epochs in the representation learning and the clustering phases;
+ 2) allow the encoder weights to be tuned during the clustering phase;
+ 3) perform a final fine-tuning step through self-labeling, as described in the original SCAN paper.

## Training and evaluation data

+ ### Training Data
+ The model was trained on the [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html). For training, the images were scaled to (32, 32, 3).
+
+ ### Hyperparameters
+ For training, the following parameters were used:
+ - Feature Vector Dimension: 512
+ - Projection Units of Head: 128
+ - Number of Clusters: 20
+ - K-Neighbours: 5
+
+ The encoder was not tuned during clustering.
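
To make the K-Neighbours setting concrete, here is a minimal sketch of a nearest-neighbour lookup over feature vectors. It assumes cosine similarity on non-zero vectors and uses a brute-force search in plain Python. The function names are hypothetical, and the actual training code may compute neighbours differently, for example with batched matrix products.

```python
import math


def normalise(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def k_nearest_neighbours(feature_vectors, k=5):
    """Return, for every vector, the indices of its k most similar other vectors."""
    unit = [normalise(v) for v in feature_vectors]
    neighbours = []
    for i, v in enumerate(unit):
        # Cosine similarity equals the dot product of unit-length vectors.
        sims = [
            (sum(a * b for a, b in zip(v, w)), j)
            for j, w in enumerate(unit)
            if j != i  # an image is not its own neighbour
        ]
        sims.sort(key=lambda item: item[0], reverse=True)
        neighbours.append([j for _, j in sims[:k]])
    return neighbours
```

With 512-dimensional feature vectors and `k=5`, each image would contribute five (anchor, neighbour) pairs to the clustering phase.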
+ ### Evaluation
+ #### Visualization of highest confidence cluster picks
|
49 |
+

+
+ #### Clusters and their respective labels, accuracy and size

+ | Cluster | Label | Accuracy | Size |
+ |:--------|:-----:|---------:|-----:|
+ | cluster 0 | frog | 31.6 % | 3582 |
+ | cluster 1 | frog | 19.76 % | 2348 |
+ | cluster 2 | horse | 26.82 % | 2983 |
+ | cluster 3 | bird | 29.7 % | 1532 |
+ | cluster 4 | airplane | 39.16 % | 3575 |
+ | cluster 5 | ship | 22.38 % | 2207 |
+ | cluster 6 | automobile | 26.41 % | 4365 |
+ | cluster 7 | dog | 21.09 % | 5049 |
+ | cluster 8 | automobile | 21.94 % | 4093 |
+ | cluster 9 | truck | 29.66 % | 4639 |
+ | cluster 10 | bird | 23.02 % | 1455 |
+ | cluster 11 | truck | 17.78 % | 3937 |
+ | cluster 12 | deer | 30.36 % | 2635 |
+ | cluster 13 | dog | 22.62 % | 1950 |
+ | cluster 14 | frog | 22.64 % | 4391 |
+ | cluster 15 | airplane | 26.89 % | 2838 |
+ | cluster 16 | ship | 34.7 % | 2213 |
+ | cluster 17 | ship | 17.59 % | 1785 |
+ | cluster 18 | cat | 16.57 % | 1997 |
+ | cluster 19 | deer | 27.25 % | 2426 |
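
A table of this shape could be produced along the lines of the sketch below. It assumes that each cluster is labelled with its most frequent ground-truth class and that accuracy is the share of that class within the cluster. The function name `summarise_clusters` and the toy inputs are hypothetical.

```python
from collections import Counter, defaultdict


def summarise_clusters(cluster_ids, true_labels):
    """Return {cluster_id: (majority_label, accuracy_percent, size)}."""
    members = defaultdict(list)
    for cluster_id, label in zip(cluster_ids, true_labels):
        members[cluster_id].append(label)

    summary = {}
    for cluster_id, labels in sorted(members.items()):
        # The cluster takes the label of its most frequent ground-truth class.
        majority_label, count = Counter(labels).most_common(1)[0]
        accuracy = round(100.0 * count / len(labels), 2)
        summary[cluster_id] = (majority_label, accuracy, len(labels))
    return summary


if __name__ == "__main__":
    toy = summarise_clusters([0, 0, 0, 1, 1], ["frog", "frog", "cat", "ship", "ship"])
    for cluster_id, (label, accuracy, size) in toy.items():
        print(f"cluster {cluster_id} | {label} | {accuracy} % | {size}")
```

Under this definition several clusters can share one label, which matches the repeated labels in the table above.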
## Model Plot

<details>
clusters.png ADDED
logs/{train → train_clustering}/events.out.tfevents.1655108829.7c9e25180606.72.1.v2 RENAMED
File without changes
logs/{train → train_encoder}/events.out.tfevents.1655106318.7c9e25180606.72.0.v2 RENAMED
File without changes