Abstraction in reinforcement learning in partially observable environments
Date
2014
Author
Çilden, Erkin
Item Usage Stats
194 views
101 downloads
Reinforcement learning is a prominent family of unsupervised machine learning methods from the perspective of autonomous agents. The Markov decision process model provides a solid formal basis for reinforcement learning algorithms. Temporal abstraction mechanisms can be built on top of reinforcement learning to achieve significant performance gains. If the full observability assumption of the Markov decision process model is relaxed, the resulting model is the partially observable Markov decision process, which constitutes a more realistic but more difficult problem setting. Reinforcement learning research on partial observability focuses on techniques that reduce the negative impact of perceptual aliasing and of a huge state space. In the broadest sense, these studies can be divided into two categories. Model-based approaches assume that the state transition model is available to the agent. In model-free approaches, states are completely hidden from the agent. In this thesis, we propose methods to generalize a known sequence-based automatic temporal abstraction technique, namely the extended sequence tree method, to partial observability. We attack the problem with both model-based and model-free approaches, showing that our methods accelerate well-known representatives of each perspective. The effectiveness of our methods is demonstrated by experiments on widely accepted benchmark problems.
Subject Keywords
Reinforcement learning, Machine learning, Sequential analysis, Markov processes
URI
http://etd.lib.metu.edu.tr/upload/12616815/index.pdf
https://hdl.handle.net/11511/23288
Collections
Graduate School of Natural and Applied Sciences, Thesis
Suggestions
Direct abstraction for partially observable reinforcement learning (Kısmi gözlemlenebilir takviye öğrenme için dolaysız soyutlama)
Çilden, Erkin; Polat, Faruk; Şahin, Coşkun (2015)
Reinforcement learning is a prominent family of unsupervised machine learning methods from the perspective of autonomous agents. The Markov decision process model provides a solid formal basis for reinforcement learning algorithms. Temporal abstraction mechanisms can be built on top of reinforcement learning to achieve significant performance gains. If the full observability assumption of the Markov decision process model is relaxed, the resulting model is the partially observable Markov decision process, which constitute...
Simple and complex behavior learning using behavior hidden Markov Model and CobART
Seyhan, Seyit Sabri; Alpaslan, Ferda Nur; Department of Computer Engineering (2013)
In this thesis, behavior learning and generation models are proposed for simple and complex behaviors of robots using unsupervised learning methods. Simple behaviors are modeled by simple-behavior learning model (SBLM) and complex behaviors are modeled by complex-behavior learning model (CBLM) which uses previously learned simple or complex behaviors. Both models have common phases named behavior categorization, behavior modeling, and behavior generation. Sensory data are categorized using correlation based...
Automatic identification of transitional bottlenecks in reinforcement learning under partial observability
Aydın, Hüseyin; Polat, Faruk; Department of Computer Engineering (2017)
Instance-based methods are proven tools for solving reinforcement learning problems with hidden states. Nearest Sequence Memory (NSM) is a widely known instance-based approach based mainly on the k-Nearest Neighbor algorithm. NSM keeps track of the raw history of action-observation-reward instances within a fixed-length (or ideally unlimited) memory. It calculates the neighborhood for the current state through a recursive comparison of the matching action-observation-reward tuples with the previous ones. The ones with...
Abstraction in Model Based Partially Observable Reinforcement Learning using Extended Sequence Trees
Cilden, Erkin; Polat, Faruk (2012-12-07)
The extended sequence tree is a direct method for the automatic generation of useful abstractions in reinforcement learning, designed for problems that can be modelled as a Markov decision process. This paper proposes a method to extend the extended sequence tree method to cover partial observability, formalized as a partially observable Markov decision process, through the belief state formalism. This expansion requires a reasonable approximation of the information state. Inspired by statistical ran...
Simple and complex behavior learning using behavior hidden Markov model and CobART
Seyhan, Seyit Sabri; Alpaslan, Ferda Nur; Yavaş, Mustafa (2013-03-01)
This paper proposes behavior learning and generation models for simple and complex behaviors of robots using unsupervised learning methods. While the simple behaviors are modeled by simple-behavior learning model (SBLM), complex behaviors are modeled by complex-behavior learning model (CBLM) which uses previously learned simple or complex behaviors. Both models include behavior categorization, behavior modeling, and behavior generation phases. In the behavior categorization phase, sensory data are categoriz...
Citation Formats
IEEE
E. Çilden, “Abstraction in reinforcement learning in partially observable environments,” Ph.D. - Doctoral Program, Middle East Technical University, 2014.