Document

Adversarial Deep Embedded Clustering: On a Better Trade-off Between Feature Randomness and Feature Drift

Author
Linked Agent
Bouguessa, M, Author
Ksantini, R, Author
Title of Periodical
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Country of Publication
Kingdom of Bahrain
Place Published
Sakhir, Bahrain
Publisher
University of Bahrain
Date Issued
2022
Language
English
English Abstract
To overcome the absence of concrete supervisory signals, deep clustering models construct their own labels based on self-supervision and pseudo-supervision. However, applying these techniques can cause Feature Randomness and Feature Drift. In this paper, we formally characterize these two new concepts. On one hand, Feature Randomness takes place when a considerable portion of the pseudo-labels is deemed to be random; in this case, the trained model can learn non-representative features. On the other hand, Feature Drift takes place when the pseudo-supervised and reconstruction losses are jointly minimized. While penalizing the reconstruction loss aims to preserve all the inherent data information, optimizing the embedded-clustering objective drops the latent between-cluster variances. Because of this compromise, the clustering-friendly representations can easily drift. In this context, we propose ADEC (Adversarial Deep Embedded Clustering), a novel autoencoder-based clustering model that relies on a discriminator network to reduce random features while avoiding the drifting effect. Our new metrics, DFR and DFD, allow us to assess the levels of Feature Randomness and Feature Drift, respectively. We empirically demonstrate the suitability of our model for handling these problems using benchmark real-world datasets. Experimental results validate that our model outperforms state-of-the-art autoencoder-based clustering methods.
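The drifting effect described in the abstract arises when a reconstruction term and a pseudo-supervised clustering term are minimized jointly. As a rough illustration only (not ADEC's actual objective, which additionally uses a discriminator network), a DEC-style joint loss of this kind can be sketched in NumPy as follows; the function names, the Student's t soft assignment, and the trade-off weight `gamma` are all assumptions for the sketch:

```python
import numpy as np

def soft_assignments(z, centers, alpha=1.0):
    # Student's t-kernel soft assignment of embeddings z to cluster centers
    # (DEC-style; an assumption, the paper's exact formulation may differ).
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def joint_loss(x, x_rec, z, centers, gamma=0.1):
    # Reconstruction term: tries to preserve all inherent data information.
    rec = ((x - x_rec) ** 2).mean()

    # Pseudo-supervised clustering term: KL divergence between a sharpened
    # target distribution p and the soft assignments q. Minimizing it
    # reshapes the latent space, which can drift the representations that
    # the reconstruction term tries to preserve.
    q = soft_assignments(z, centers)
    f = q.sum(axis=0)                                  # soft cluster frequencies
    p = (q ** 2 / f)
    p = p / p.sum(axis=1, keepdims=True)               # target distribution
    kl = (p * np.log(p / q)).sum(axis=1).mean()

    # gamma balances the two competing objectives.
    return rec + gamma * kl

# Toy usage: 8 points, 4-dim input, 2-dim latent space, 3 clusters.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
x_rec = x + 0.1 * rng.normal(size=(8, 4))             # imperfect reconstruction
z = rng.normal(size=(8, 2))
centers = rng.normal(size=(3, 2))
loss = joint_loss(x, x_rec, z, centers)
```

Both terms are non-negative, so the sketch makes the tension concrete: driving the KL term down concentrates points around centers, while the reconstruction term resists any latent reshaping that loses input information.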
Member of
Identifier
https://digitalrepository.uob.edu.bh/id/5c0967bb-cf31-4075-94c2-9c3db3ce2d95
Author's Work