Abstraction in reinforcement learning in partially observable environments

Download
2014
Çilden, Erkin
Reinforcement learning defines a prominent family of unsupervised machine learning methods in autonomous agents perspective. Markov decision process model provides a solid formal basis for reinforcement learning algorithms. Temporal abstraction mechanisms can be built on reinforcement learning and significant performance gain can be achieved. If the full observability assumption of Markov decision process model is relaxed, the resulting model is partially observable Markov decision process, which constitutes a more realistic but difficult problem setting. Reinforcement learning research for partial observability focuses on techniques to reduce negative impact of perceptual aliasing and huge state-space. In the broadest sense, these studies can be divided into two categories. Model based approaches assume that the state transition model is available to the agent. In the model free approaches, states are completely hidden from the agent. In this thesis, we propose methods to generalize a known sequence based automatic temporal abstraction technique -namely, extended sequence tree method- to partial observability. We attack the problem in both model based and model free approaches, showing that our methods accelerate well known representatives of each perspective. Effectiveness of our methods are demonstrated by conducting experimentation on widely accepted benchmark problems.

Suggestions

Kısmi gözlemlenebilir takviye öğrenme için dolaysız soyutlama
Çilden, Erkin; Polat, Faruk; Şahin, Coşkun(2015)
Reinforcement learning defines a prominent family of unsupervised machine learning methods in autonomous agents perspective. Markov decision process model provides a solid formal basis for reinforcement learning algorithms. Temporal abstraction mechanisms can be built on reinforcement learning and significant performance gain can be achieved. If the full observability assumption of Markov decision process model is relaxed, the resulting model is partially observable Markov decision process, which constitute...
Simple and complex behavior learning using behavior hidden Markov Model and CobART
Seyhan, Seyit Sabri; Alpaslan, Ferda Nur; Department of Computer Engineering (2013)
In this thesis, behavior learning and generation models are proposed for simple and complex behaviors of robots using unsupervised learning methods. Simple behaviors are modeled by simple-behavior learning model (SBLM) and complex behaviors are modeled by complex-behavior learning model (CBLM) which uses previously learned simple or complex behaviors. Both models have common phases named behavior categorization, behavior modeling, and behavior generation. Sensory data are categorized using correlation based...
Automatic identification of transitional bottlenecks in reinforcement learning under partial observability
Aydın, Hüseyin; Polat, Faruk; Department of Computer Engineering (2017)
Instance-based methods are proven tools to solve reinforcement learning problems with hidden states. Nearest Sequence Memory (NSM) is a widely known instance-based approach mainly based on k-Nearest Neighbor algorithm. NSM keeps track of raw history of action-observation-reward instances within a fixed length (or ideally unlimited) memory. It calculates the neighborhood for the current state through a recursive comparison of the matching action-observation-reward tuples with the previous ones. The ones with...
Abstraction in Model Based Partially Observable Reinforcement Learning using Extended Sequence Trees
Cilden, Erkin; Polat, Faruk (2012-12-07)
Extended sequence tree is a direct method for automatic generation of useful abstractions in reinforcement learning, designed for problems that can be modelled as Markov decision process. This paper proposes a method to expand the extended sequence tree method over reinforcement learning to cover partial observability formalized via partially observable Markov decision process through belief state formalism. This expansion requires a reasonable approximation of information state. Inspired by statistical ran...
Simple and complex behavior learning using behavior hidden Markov model and CobART
Seyhan, Seyit Sabri; Alpaslan, Ferda Nur; Yavaş, Mustafa (2013-03-01)
This paper proposes behavior learning and generation models for simple and complex behaviors of robots using unsupervised learning methods. While the simple behaviors are modeled by simple-behavior learning model (SBLM), complex behaviors are modeled by complex-behavior learning model (CBLM) which uses previously learned simple or complex behaviors. Both models include behavior categorization, behavior modeling, and behavior generation phases. In the behavior categorization phase, sensory data are categoriz...
Citation Formats
E. Çilden, “Abstraction in reinforcement learning in partially observable environments,” Ph.D. - Doctoral Program, Middle East Technical University, 2014.