Autonomous navigation of an over-actuated marine platform using reinforcement learning

Τζιορτζιώτης, Κωνσταντίνος

Please use this identifier to cite or link to this item: https://olympias.lib.uoi.gr/jspui/handle/123456789/27788

Full metadata record

DC Field	Value	Language
dc.contributor.author	Τζιορτζιώτης, Κωνσταντίνος	el
dc.date.accessioned	2017-01-09T11:53:34Z	-
dc.date.available	2017-01-09T11:53:34Z	-
dc.identifier.uri	https://olympias.lib.uoi.gr/jspui/handle/123456789/27788	-
dc.identifier.uri	http://dx.doi.org/10.26268/heal.uoi.1803	-
dc.rights	Default License	-
dc.subject	Ενισχυτική μάθηση	el
dc.subject	Ρομποτική θαλάσσια πλατφόρμα	el
dc.subject	Αντίστροφη ενισχυτική μάθηση	el
dc.subject	Reinforcement learning	en
dc.subject	Robotic marine platform	en
dc.subject	Inverse reinforcement learning	en
dc.title	Αυτόνομη πλοήγηση θαλάσσιας ρομποτικής πλατφόρμας με χρήση μεθόδων ενισχυτικής μάθησης	el
dc.title	Autonomous navigation of an over-actuated marine platform using reinforcement learning	en
heal.type	masterThesis	-
heal.type.en	Master thesis	en
heal.type.el	Μεταπτυχιακή εργασία	el
heal.classification	Ρομποτική	el
heal.language	el	-
heal.access	free	-
heal.recordProvider	Πανεπιστήμιο Ιωαννίνων. Σχολή Θετικών Επιστημών. Τμήμα Μηχανικών Η/Υ & Πληροφορικής	el
heal.publicationDate	2016	-
heal.bibliographicCitation	Βιβλιογραφία : σ. 64-66	el
heal.abstract	Η παρούσα εργασία πραγματεύεται την αυτόνομη πλοήγηση μιας θαλάσσιας ρομποτικής πλατφόρμας - Delta Berenike - μέσω μεθόδων ενισχυτικής μάθησης (reinforcement learning), δηλ. της βέλτιστης κίνησης της πλατφόρμας με στόχο τον εντοπισμό μιας συγκεκριμένης θέσης-στόχου και με ταυτόχρονη αποφυγή εμποδίων και συγκρούσεων. Μερικές βασικές ιδιαιτερότητες της συγκεκριμένης θαλάσσιας πλατφόρμας που σχετίζονται άμεσα με το σύστημα ελέγχου αποτελούν οι διαταραχές εξαιτίας του υδροδυναμικού μοντέλου και του σύνθετου δυναμικού μοντέλου των επενεργητών, όπως επίσης τα σφάλματα που προέρχονται από την ανατροφοδότηση των μεταβλητών κατάστασης (τρέχουσα θέση, προσανατολισμός και ταχύτητα) από τους αισθητήρες κίνησης. Το πλαίσιο της ενισχυτικής μάθησης προσεγγίζει το πρόβλημα της πλοήγησης ως ένα πρόβλημα ανακάλυψης της βέλτιστης πολιτικής ενός πράκτορα (agent) ο οποίος κινείται σε ένα στοχαστικό Μαρκοβιανό χώρο καταστάσεων. Κατά την διάρκεια της αλληλεπίδρασης του πράκτορα με το περιβάλλον σημαντικό ρόλο στη διαδικασία μάθησης αποτελεί η συνάρτηση ανταμοιβής (reward function), η οποία καθορίζει την μορφή απεικόνισης του χώρου καταστάσεων με τις ενέργειες (συνάρτηση αξίας action value function). Η συνήθης διαδικασία είναι ότι η συνάρτηση ανταμοιβής είναι εκ των προτέρων γνωστή με βάση την εμπειρία του προβλήματος ελέγχου. Γενικά το πρόβλημα της εκτίμησης της συνάρτησης ανταμοιβής ορίζεται ως ένα πρόβλημα αντίστροφης ενισχυτικής μάθησης (inverse reinforcement learning). Στην εργασία αυτή προτείνεται ένα πλαίσιο ενισχυτικής μάθησης για τον έλεγχο της θαλάσσιας πλατφόρμας, το οποίο εκτιμά τη βέλτιστη πολιτική και ταυτόχρονα τη συνάρτηση ανταμοιβής. Η διαδικασία μάθησης είναι on-line και επικεντρώνεται στο γραμμικό μαθηματικό μοντέλο (linear model) για την περιγραφή των συναρτήσεων αξιών και ανταμοιβών, χρησιμοποιώντας ένα περιγραφικό χώρο καταστάσεων μέσω κατάλληλων συναρτήσεων βάσης (basis functions).
Η προτεινόμενη μέθοδος αξιολογήθηκε πειραματικά σε προσομοιωμένα θαλάσσια στοχαστικά περιβάλλοντα στα οποία επιδρούν διάφορες μορφές περιβαλλοντικών διαταραχών, όπως ο άνεμος, τα θαλάσσια ρεύματα καθώς και τα κύματα. Τέλος, πραγματοποιήθηκε η σύγκριση της μεθόδου με δύο γνωστές τεχνικές ενισχυτικής μάθησης, τον αλγόριθμο Q-Learning και τον LSPI.	el
heal.abstract	This master's thesis examines the autonomous navigation of an over-actuated marine platform, “Delta Berenike”, using reinforcement learning (RL) methods, that is, moving the platform optimally to reach a specific target position while avoiding obstacles and collisions. Several peculiarities of this platform bear directly on the control system: disturbances arising from the hydrodynamic model and from the complex dynamic model of the actuators, as well as errors in the feedback of the state variables (current position, orientation and speed) from the motion sensors. The RL framework treats navigation as the problem of finding the optimal policy of an agent that moves in a stochastic Markov state space. During the agent’s interaction with the environment, the reward function plays an important role in the learning process, since it determines the form of the mapping from the state space to actions (the action-value function). Usually the reward function is assumed to be known in advance, based on experience with the control problem. In general, the problem of estimating the reward function is defined as an inverse reinforcement learning problem. In this study we propose a reinforcement learning framework for controlling the marine platform that estimates the optimal policy and, at the same time, the reward function. The learning process is online and relies on a linear model to describe the value and reward functions, using a descriptive state space built from appropriate basis functions. The proposed method was evaluated experimentally in two simulated stochastic marine environments subject to several environmental disturbances, such as wind, sea currents and waves. Finally, the method was compared with two well-known reinforcement learning techniques, the Q-Learning and LSPI algorithms. The results are promising.	en
heal.advisorName	Μπλέκας, Κωνσταντίνος	el
heal.committeeMemberName	Μπλέκας, Κωνσταντίνος	el
heal.committeeMemberName	Λύκας, Αριστείδης	el
heal.committeeMemberName	Βλάχος, Κωνσταντίνος	el
heal.academicPublisher	Πανεπιστήμιο Ιωαννίνων. Σχολή Θετικών Επιστημών. Τμήμα Μηχανικών Η/Υ & Πληροφορικής	el
heal.academicPublisherID	uoi	-
heal.numberOfPages	66 σ.	-
heal.fullTextAvailability	true	-
Appears in Collections:	Διατριβές Μεταπτυχιακής Έρευνας (Masters) - ΜΥ

Files in This Item:

File	Description	Size	Format
Μ.Ε. ΤΖΙΟΡΤΖΙΩΤΗΣ ΚΩΝΣΤΑΝΤΙΝΟΣ 2016.pdf		1.57 MB	Adobe PDF

This item is licensed under a Creative Commons License

Repository of UOI "Olympias"