Please use this identifier to cite or link to this item: https://olympias.lib.uoi.gr/jspui/handle/123456789/39586
Full metadata record
DC Field: Value [Language]
dc.contributor.author: Vardakas, Georgios [en]
dc.date.accessioned: 2025-10-30T08:28:52Z
dc.date.available: 2025-10-30T08:28:52Z
dc.identifier.uri: https://olympias.lib.uoi.gr/jspui/handle/123456789/39586
dc.rights: CC0 1.0 Universal
dc.rights.uri: http://creativecommons.org/publicdomain/zero/1.0/
dc.subject: Data clustering [en]
dc.subject: Deep learning [en]
dc.subject: Unimodality testing [en]
dc.subject: Estimating the number of clusters [en]
dc.subject: Machine learning [en]
dc.title: Clustering Methods based on Deep Learning and Unimodality Testing [en]
dc.type: doctoralThesis [en]
heal.type: doctoralThesis [el]
heal.type.en: Doctoral thesis [en]
heal.type.el: Διδακτορική διατριβή [el]
heal.classification: Machine Learning [en]
heal.dateAvailable: 2025-10-30T08:29:54Z
heal.language: en [el]
heal.access: free [el]
heal.recordProvider: Πανεπιστήμιο Ιωαννίνων. Πολυτεχνική Σχολή (University of Ioannina, School of Engineering) [el]
heal.publicationDate: 2025-10-22
heal.abstract: Data clustering is the process of partitioning a dataset into a finite set of groups, or clusters, such that data points within each cluster exhibit intra-cluster similarity, while those belonging to different clusters are characterized by inter-cluster dissimilarity. Clustering remains a challenging task due to the inherent complexity of uncovering meaningful structures within data. Revealing these hidden structures provides valuable insights and facilitates a deeper understanding of the underlying patterns. This thesis concerns the development, implementation, and evaluation of novel clustering methodologies focused mainly on three important problems: i) partitional clustering in both Euclidean and kernel spaces, ii) unimodality-based clustering, which incorporates the concept of unimodality into the clustering process, and iii) deep clustering, which leverages the representational power of deep learning methods. We first introduce global k-means++, a method developed to address the initialization challenges inherent in the standard k-means algorithm. The approach integrates the incremental strategy of global k-means with the probabilistic center selection mechanism of k-means++, effectively combining the strengths of both techniques. The resulting synergy delivers high-quality clustering solutions while significantly reducing the computational cost typically associated with global k-means. Furthermore, we extend this concept from Euclidean to kernel space by proposing global kernel k-means++, an algorithm specifically designed to overcome the initialization problem in kernel k-means. The optimization effectiveness of both global k-means variants is thoroughly validated through extensive experimental evaluation.
Afterwards, we present UniForCE, a clustering method that simultaneously partitions the data and estimates the number of clusters k. UniForCE introduces a novel notion of locally unimodal clusters, focusing on unimodality in local regions of the data density rather than in the entire cluster. By identifying unimodal pairs of neighboring subclusters, the method aggregates them into larger, statistically coherent structures via a unimodality graph. This flexible formulation enables the discovery of arbitrarily shaped clusters. A statistical test determines unimodal pairs, and clustering is achieved with automatic estimation of k by counting the connected components of the unimodality graph. Extensive experiments on synthetic and real datasets validate both the conceptual soundness of the method and its practical effectiveness. Furthermore, we introduce the soft silhouette score, a generalization of the widely used silhouette measure that accommodates probabilistic cluster assignments. Building on this differentiable measure, we develop an autoencoder-based deep clustering method that uses the soft silhouette score as its objective. Our method guides the learned latent representations to form clusters that are both compact and well-separated. This property is crucial in real-world applications, since simultaneously ensuring compactness and separability guarantees that clusters are not only densely packed but also clearly distinct from each other. We evaluate our method on a variety of benchmark datasets against state-of-the-art methods and demonstrate that it outperforms established deep clustering approaches, highlighting the effectiveness of the soft silhouette score as a principled objective for improving the quality of learned latent representations.
Finally, we present neural implicit maximum likelihood clustering, a neural-network-based approach that frames clustering as a generative task within the Implicit Maximum Likelihood Estimation framework. By adapting ideas from ClusterGAN, our method avoids several well-known shortcomings of GAN-based clustering while maintaining a simple and stable training objective. The method performs particularly well on small datasets, with experimental comparisons against both deep and conventional clustering algorithms underscoring its competitive potential. A notable strength of our method is its ability to capture diverse cluster geometries without requiring hyperparameter tuning. Experiments on synthetic datasets show that the method can successfully cluster both cloud-shaped and ring-shaped data. [en]
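The abstract's description of global k-means++ (growing the center set incrementally, as in global k-means, while drawing candidate next centers with the k-means++ squared-distance sampling rule) can be sketched as follows. This is an illustrative reading of the idea, not the thesis's implementation: the function names, the candidate count, and the use of plain Lloyd iterations to refine each trial solution are all assumptions made here.

```python
import numpy as np

def lloyd(X, centers, n_iter=20):
    """Run a few Lloyd iterations; return refined centers and total squared error."""
    centers = centers.copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        for j in range(len(centers)):
            pts = X[labels == j]
            if len(pts):
                centers[j] = pts.mean(0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return centers, d2.min(1).sum()

def global_kmeans_pp(X, k_max, n_candidates=5, seed=0):
    """Incremental clustering sketch: the best (k-1)-center solution is extended
    to k centers by trying candidate points drawn with k-means++ D^2 sampling
    and keeping the trial with the lowest clustering error."""
    rng = np.random.default_rng(seed)
    centers = X.mean(0, keepdims=True)  # best 1-cluster solution: the data mean
    solutions = {1: centers}
    for k in range(2, k_max + 1):
        # squared distance of each point to its nearest current center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).min(1)
        probs = d2 / d2.sum()  # k-means++ sampling distribution
        cand_idx = rng.choice(len(X), size=min(n_candidates, len(X)),
                              replace=False, p=probs)
        best = None
        for i in cand_idx:
            trial = np.vstack([centers, X[i]])  # add candidate as the k-th center
            refined, err = lloyd(X, trial)
            if best is None or err < best[1]:
                best = (refined, err)
        centers = best[0]
        solutions[k] = centers
    return solutions
```

Because every candidate is sampled in proportion to its squared distance from the existing centers, the k-th center tends to land in poorly covered regions, which is what makes the combination cheaper than exhaustively trying every point as global k-means does.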
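UniForCE's final step, estimating k as the number of connected components of the unimodality graph, can be illustrated with a small union-find sketch. The graph construction and the statistical unimodality test on unions of neighboring subclusters are abstracted away here as a given edge list; the function name and interface are hypothetical, not taken from the thesis.

```python
def estimate_k(n_subclusters, unimodal_pairs):
    """Count connected components of the unimodality graph.

    Each subcluster is a node; an edge (a, b) means the pair of neighboring
    subclusters passed the unimodality test. The estimated number of clusters
    k is the number of connected components, found here with union-find.
    """
    parent = list(range(n_subclusters))

    def find(x):
        # find the root representative, with path halving for efficiency
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in unimodal_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb  # merge the two components

    return len({find(i) for i in range(n_subclusters)})
```

For example, with five subclusters and unimodal pairs (0, 1) and (1, 2), subclusters 0, 1, 2 merge into one cluster while 3 and 4 remain singletons, giving k = 3.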
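One plausible way to form a probability-weighted silhouette in the spirit of the soft silhouette score described above is sketched below; the thesis's exact definition may differ. In this sketch, the membership matrix P weights both the per-cluster mean distances and the per-cluster silhouette terms, and as a simplification each point's zero self-distance is kept inside its own-cluster term.

```python
import numpy as np

def soft_silhouette(X, P, eps=1e-12):
    """Probability-weighted silhouette sketch (an assumed formulation).

    P[i, j] is the probability that point i belongs to cluster j. For each
    point i and cluster j, d[i, j] is the membership-weighted mean distance
    from i to cluster j; the point is scored as if assigned to j, and the
    per-cluster scores are averaged with weights P[i, j].
    """
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))  # pairwise dists
    w = P / (P.sum(0, keepdims=True) + eps)  # column-normalized membership weights
    d = D @ w                                # d[i, j]: weighted mean dist to cluster j
    n, k = P.shape
    s = np.zeros(n)
    for j in range(k):
        a = d[:, j]                          # cohesion if i belongs to cluster j
        b = np.delete(d, j, axis=1).min(1)   # separation: closest other cluster
        s += P[:, j] * (b - a) / (np.maximum(a, b) + eps)
    return s.mean()
```

Since every operation here is differentiable in P (and in X, away from the max and min tie points), a score of this shape can serve as a training objective for an autoencoder's latent space, which is the role the abstract assigns to the soft silhouette: hard, well-separated assignments push the score toward 1, while uninformative uniform assignments pull it toward 0.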
heal.advisorName: Likas, Aristidis [en]
heal.committeeMemberName: Likas, Aristidis [en]
heal.committeeMemberName: Blekas, Konstantinos [en]
heal.committeeMemberName: Nikou, Christophoros [en]
heal.committeeMemberName: Tefas, Anastasios [en]
heal.committeeMemberName: Vouros, George [en]
heal.committeeMemberName: Skianis, Konstantinos [en]
heal.committeeMemberName: Voulodimos, Athanasios [en]
heal.academicPublisher: Πανεπιστήμιο Ιωαννίνων. Πολυτεχνική Σχολή. Τμήμα Μηχανικών Ηλεκτρονικών Υπολογιστών και Πληροφορικής (University of Ioannina, School of Engineering, Department of Computer Science and Engineering) [el]
heal.academicPublisherID: uoi [el]
heal.fullTextAvailability: true
Appears in Collections: Διδακτορικές Διατριβές - ΜΗΥΠ (Doctoral Theses, Department of Computer Science and Engineering)

Files in This Item:
File: PhD_Thesis.pdf
Size: 52.27 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License (CC0 1.0 Universal).