DivClust: Controlling Diversity in Deep Clustering
Abstract
Clustering has been a major research topic in the field of machine learning, one to which deep learning has recently been applied with significant success. However, one aspect of clustering that existing deep clustering methods do not address is efficiently producing multiple, diverse partitionings of a given dataset. This is particularly important because a diverse set of base clusterings is necessary for consensus clustering, which has been found to produce better and more robust results than relying on a single clustering. To address this gap, we propose DivClust, a diversity-controlling loss that can be incorporated into existing deep clustering frameworks to produce multiple clusterings with the desired degree of diversity. We conduct experiments with multiple datasets and deep clustering frameworks and show that: a) our method effectively controls diversity across frameworks and datasets with very small additional computational cost, b) the sets of clusterings learned by DivClust include solutions that significantly outperform single-clustering baselines, and c) using an off-the-shelf consensus clustering algorithm, DivClust produces consensus clustering solutions that consistently outperform single-clustering baselines, effectively improving the performance of the base deep clustering framework.
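To make the idea of a diversity-controlling loss concrete, the following is a minimal NumPy sketch, not the paper's actual formulation: it measures similarity between pairs of soft clusterings via cosine similarity of their cluster-membership columns, and applies a hinge penalty only when a pair exceeds a user-chosen similarity upper bound `d` (lower `d` demands more diversity). The similarity measure and hinge form are illustrative assumptions.

```python
import numpy as np

def inter_clustering_similarity(p_a, p_b):
    """Aggregate similarity between two soft clusterings.

    p_a, p_b: (N, K) soft cluster-assignment matrices (rows sum to 1).
    Similarity here is the mean, over clusters of p_a, of the best
    cosine match against the clusters of p_b. This is an illustrative
    choice, not necessarily the measure used by DivClust.
    """
    a = p_a / (np.linalg.norm(p_a, axis=0, keepdims=True) + 1e-12)
    b = p_b / (np.linalg.norm(p_b, axis=0, keepdims=True) + 1e-12)
    sim = a.T @ b  # (K, K) cosine similarities between cluster columns
    return sim.max(axis=1).mean()

def diversity_loss(assignments, d):
    """Hinge penalty on clustering pairs whose similarity exceeds d.

    assignments: list of (N, K) soft assignment matrices
                 (e.g., outputs of multiple clustering heads).
    d: allowed upper bound on inter-clustering similarity in [0, 1].
    """
    loss, pairs = 0.0, 0
    for i in range(len(assignments)):
        for j in range(i + 1, len(assignments)):
            s = inter_clustering_similarity(assignments[i], assignments[j])
            loss += max(0.0, s - d)  # only over-similar pairs are penalized
            pairs += 1
    return loss / max(pairs, 1)
```

In a deep clustering framework, a term like this would be added to the framework's own clustering objective, so the heads remain good clusterings individually while being pushed below the similarity bound pairwise.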