Clustering methods based on reinforcement learning

Pachi, Eleni

Please use this identifier to cite or link to this item: https://olympias.lib.uoi.gr/jspui/handle/123456789/29715

Full metadata record

DC Field	Value	Language
dc.contributor.author	Pachi, Eleni	el
dc.date.accessioned	2020-03-09T09:39:31Z	-
dc.date.available	2020-03-09T09:39:31Z	-
dc.identifier.uri	https://olympias.lib.uoi.gr/jspui/handle/123456789/29715	-
dc.identifier.uri	http://dx.doi.org/10.26268/heal.uoi.9712	-
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject	Clustering	en
dc.subject	Reinforcement learning	en
dc.subject	Machine learning	en
dc.title	Clustering methods based on reinforcement learning	en
dc.title	Μέθοδοι ομαδοποίησης βασισμένες σε ενισχυτική μάθηση	el
heal.type	masterThesis	-
heal.type.en	Master thesis	en
heal.type.el	Μεταπτυχιακή εργασία	el
heal.classification	Reinforcement learning	-
heal.dateAvailable	2020-03-09T09:40:31Z	-
heal.language	en	-
heal.access	free	-
heal.recordProvider	Πανεπιστήμιο Ιωαννίνων. Πολυτεχνική Σχολή. Τμήμα Μηχανικών Ηλεκτρονικών Υπολογιστών και Πληροφορικής	el
heal.publicationDate	2019	-
heal.bibliographicCitation	Βιβλιογραφία: σ. 44-46	el
heal.abstract	Clustering is one of the most popular problems in machine learning and data mining. It belongs to the category of unsupervised learning problems since no label information is provided to assist in partitioning the data points into coherent groups. Although clustering is an unsupervised problem, it is possible to view clustering from a reinforcement learning perspective. In reinforcement learning, an agent learns an action policy that solves a sequential decision problem using reinforcement signals provided by the environment. In reinforcement-based clustering, the clustering system learns through reinforcements to follow the desired clustering policy. The previous method of this type (RGCL algorithm) trains a team of binary stochastic units to perform on-line clustering. Each unit corresponds to a cluster and the weights of each stochastic unit correspond to a representative point (centroid) of the respective cluster. The team of stochastic units is trained to perform clustering using the REINFORCE algorithm by exploiting properly defined reinforcement signals provided by the environment. In this thesis we propose two extensions of the RGCL algorithm based on the use of a single stochastic multinomial unit instead of a team of binary stochastic units. In the first method the unit is trained on-line based on the REINFORCE framework using immediate reinforcement signals. In the second method the stochastic multinomial unit is trained in a batch mode based on the REINFORCE framework using delayed reinforcement signals. In both cases the weight update equations are derived so that the weight updates lead to the stochastic minimization of the well-known k-means clustering error. An experimental study has been conducted using synthetic and real datasets to assess the performance of the proposed methods. The experimental results indicate that improved clustering results are obtained in the majority of cases.	en
heal.abstract	Η ομαδοποίηση των δεδομένων είναι ένα από τα πιο δημοφιλή προβλήματα της μηχανικής μάθησης καθώς και της εξόρυξης δεδομένων. Ανήκει στην κατηγορία των προβλημάτων μάθησης χωρίς επίβλεψη, αφού η μόνη πληροφορία που παρέχεται για τον διαχωρισμό των δεδομένων σε ομάδες, είναι τα ίδια τα δεδομένα και όχι ετικέτες αυτών. Παρόλο που η ομαδοποίηση είναι πρόβλημα χωρίς επίβλεψη, είναι εφικτό να την προσεγγίσουμε και σαν ένα πρόβλημα ενισχυτικής μάθησης. Στην ενισχυτική μάθηση, το σύστημα μαθαίνει μια στρατηγική η οποία λύνει ένα πρόβλημα διαδοχικών αποφάσεων χρησιμοποιώντας σήματα ενίσχυσης που παρέχονται από το περιβάλλον. Στην ομαδοποίηση με ενισχυτική μάθηση, το σύστημα μαθαίνει μέσα από σήματα ενίσχυσης να ακολουθήσει την επιθυμητή στρατηγική ομαδοποίησης. Σύμφωνα με την ιδέα αυτή μια προηγούμενη μέθοδος (αλγόριθμος RGCL), βασίζεται στην εκπαίδευση ενός συνόλου από στοχαστικές δυαδικές μονάδες και υλοποιεί ένα σειριακό αλγόριθμο ομαδοποίησης. Στη μέθοδο αυτή κάθε στοχαστική μονάδα αντιστοιχεί σε μια ομάδα και οι παράμετροι της στοχαστικής μονάδας αντιστοιχούν στον αντιπρόσωπο της ομάδας. Το σύνολο των στοχαστικών μονάδων εκπαιδεύεται με τη βοήθεια των REINFORCE τεχνικών και με κατάλληλα ορισμένα σήματα ενίσχυσης που στέλνονται έτσι ώστε να υλοποιείται η ομαδοποίηση των δεδομένων. Στην εργασία αυτή προτείνουμε δυο νέες μεθόδους που επεκτείνουν τον αλγόριθμο RGCL και χρησιμοποιούν μια μόνο στοχαστική πολυωνυμική μονάδα, αντί για ένα σύνολο από στοχαστικές δυαδικές μονάδες. Στην πρώτη μέθοδο η στοχαστική μονάδα εκπαιδεύεται σειριακά, με την εκπαίδευση να βασίζεται στο REINFORCE αλγόριθμο χρησιμοποιώντας σήματα άμεσης ενίσχυσης. Στη δεύτερη μέθοδο η στοχαστική πολυωνυμική μονάδα εκπαιδεύεται σε ομάδες παραδειγμάτων (batches) και βασίζεται πάλι στο REINFORCE αλγόριθμο, με τη διαφορά ότι λαμβάνουμε υπόψη και παρελθοντικά σήματα ενίσχυσης. Και στις δυο περιπτώσεις παρουσιάζουμε τις εξισώσεις που προκύπτουν για την ενημέρωση των παραμέτρων. Η ενημέρωση των παραμέτρων γίνεται με τέτοιο τρόπο ώστε το γνωστό σφάλμα ομαδοποίησης του k-Means να ελαχιστοποείται στοχαστικά. Διάφορα πειράματα σε τεχνητά και πραγματικά σύνολα δεδομένων διεξήχθηκαν για να μελετήσουμε την επίδοση των προτεινόμενων μεθόδων, όπου στις περισσότερες περιπτώσεις τα πειραματικά αποτελέσματα έδειξαν ότι οδηγούμαστε σε καλύτερες λύσεις ομαδοποίησης.	el
heal.advisorName	Λύκας, Αριστείδης	el
heal.committeeMemberName	Λύκας, Αριστείδης	el
heal.committeeMemberName	Μπλέκας, Κωνσταντίνος	el
heal.committeeMemberName	Βλάχος, Κωνσταντίνος	el
heal.academicPublisher	Πανεπιστήμιο Ιωαννίνων. Πολυτεχνική Σχολή. Τμήμα Μηχανικών Ηλεκτρονικών Υπολογιστών και Πληροφορικής	el
heal.academicPublisherID	uoi	-
heal.numberOfPages	62 σ.	-
heal.fullTextAvailability	true	-
Appears in Collections:	Διατριβές Μεταπτυχιακής Έρευνας (Masters) - ΜΗΥΠ

Show simple item record

Files in This Item:

File	Description	Size	Format
Μ.Ε. PACHI ELENI 2019.pdf		1.17 MB	Adobe PDF	View/Open

Show simple item record

This item is licensed under a Creative Commons License

Repository of UOI "Olympias"