Toward Generalization of Automated Temporal Abstraction to Partially Observable Reinforcement Learning
Date: 2015-08-01
Authors: Cilden, Erkin; Polat, Faruk
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Item Usage Stats: 183 views, 0 downloads
Temporal abstraction for reinforcement learning (RL) aims to decrease learning time by making use of repeated sub-policy patterns in the learning task. Automatic extraction of abstractions during the RL process is difficult and poses challenges such as dealing with the curse of dimensionality. Various studies have explored the subject under the assumption that the problem domain is fully observable by the learning agent. Learning abstractions for partially observable RL is a relatively less explored area. In this paper, we adapt an existing automatic abstraction method, namely the extended sequence tree, originally designed for fully observable problems. The modified method covers a certain family of model-based partially observable RL settings. We also introduce belief state discretization methods that can be used with this new abstraction mechanism. The effectiveness of the proposed abstraction method is shown empirically by experimenting on well-known benchmark problems.
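The abstract mentions belief state discretization for model-based partially observable RL. As background, the sketch below shows the standard Bayes-filter belief update used in POMDPs, followed by a naive grid discretization that maps nearby beliefs to one discrete label. The transition model `T`, observation model `O`, and the rounding resolution are illustrative assumptions, not the paper's actual discretization methods.

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Bayes-filter update: b'(s') is proportional to
    O(a, s', o) * sum_s T(s, a, s') * b(s)."""
    predicted = b @ T[:, a, :]        # sum_s b(s) T(s, a, s')
    updated = O[a, :, o] * predicted  # weight by observation likelihood
    return updated / updated.sum()    # normalize to a distribution

def discretize(b, resolution=0.25):
    """Map a continuous belief vector onto a coarse grid so nearby
    beliefs share one discrete label (a naive illustrative scheme)."""
    return tuple(np.round(b / resolution).astype(int))

# Tiny 2-state, 1-action, 2-observation example (made-up numbers).
T = np.array([[[0.9, 0.1]],
              [[0.2, 0.8]]])          # T[s, a, s']
O = np.array([[[0.8, 0.2],
               [0.3, 0.7]]])          # O[a, s', o]

b = np.array([0.5, 0.5])              # uniform initial belief
b = belief_update(b, a=0, o=0, T=T, O=O)
print(discretize(b))
```

Discretizing the belief lets a tabular method such as the extended sequence tree treat the continuous belief space as a finite set of information states, at the cost of aliasing beliefs that fall into the same cell.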
Subject Keywords: Learning abstractions, Partially observable Markov decision process (POMDP), Reinforcement learning (RL)
URI: https://hdl.handle.net/11511/46018
Journal: IEEE TRANSACTIONS ON CYBERNETICS
DOI: https://doi.org/10.1109/tcyb.2014.2352038
Collections: Department of Computer Engineering, Article
Suggestions
Abstraction in Model Based Partially Observable Reinforcement Learning using Extended Sequence Trees
Cilden, Erkin; Polat, Faruk (2012-12-07)
Extended sequence tree is a direct method for automatic generation of useful abstractions in reinforcement learning, designed for problems that can be modelled as Markov decision process. This paper proposes a method to expand the extended sequence tree method over reinforcement learning to cover partial observability formalized via partially observable Markov decision process through belief state formalism. This expansion requires a reasonable approximation of information state. Inspired by statistical ran...
Recursive Compositional Reinforcement Learning for Continuous Control (Sürekli Kontrol Uygulamaları için Özyinelemeli Bileşimsel Pekiştirmeli Öğrenme)
Tanik, Guven Orkun; Ertekin Bolelli, Şeyda (2022-01-01)
Compositional and temporal abstraction is the key to improving learning and planning in reinforcement learning. Modern real-world control problems call for continuous control domains and robust, sample efficient and explainable control frameworks. We are presenting a framework for recursively composing control skills to solve compositional and progressively complex tasks. The framework promotes reuse of skills, and as a result is quickly adaptable to new tasks. The decision-tree can be observed, providing insi...
Improving reinforcement learning using distinctive clues of the environment
Demir, Alper; Polat, Faruk; Department of Computer Engineering (2019)
Effective decomposition and abstraction has been shown to improve the performance of Reinforcement Learning. An agent can use the clues from the environment to either partition the problem into sub-problems or get informed about its progress in a given task. In a fully observable environment such clues may come from subgoals, while in a partially observable environment they may be provided by unique experiences. The contribution of this thesis is twofold; first improvements over automatic subgoal identifica...
Tracing Teacher Learning through Shifts in Discourses: The Case of a Mathematics Teacher
Ilhan, Emine Gul Celebi; Erbaş, Ayhan Kürşat (2017-06-01)
This study presents a methodology for investigating teacher learning in and from practice based on discourses that are in constant flux and transformation. Conceptualizing teacher learning as a frame of meaning based on knowing and doing discourses, the ideas are illustrated through data collected from a secondary mathematics teacher conducting an inquiry of self-practice. Narrative analysis of the data from the teacher interviews was conducted along with classroom observations of the teacher's mathematical...
Investigation of Students’ Cognitive Processes in Computer Programming: A Cognitive Ethnography Study
Doğan, Sibel; Aslan, Orhan; Yıldırım, İbrahim Soner (2019-01-01)
The aim of the current study is to investigate how cognitive processes of students categorized as novice, semi-expert and expert differ in terms of creating pseudocode for a given programming task. To conduct this aim, cognitive ethnography research design was employed to reveal the cognitive process of the participants behind the specified task. In the study, three undergraduate students from a Computer Education and Instructional Technology (CEIT) department were included as participants. These students w...
Citation Formats
IEEE
E. Cilden and F. Polat, “Toward Generalization of Automated Temporal Abstraction to Partially Observable Reinforcement Learning,” IEEE TRANSACTIONS ON CYBERNETICS, pp. 1414–1425, 2015, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/46018.