Direct abstraction for partially observable reinforcement learning
Date
2015
Author
Çilden, Erkin
Polat, Faruk
Şahin, Coşkun
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Reinforcement learning defines a prominent family of unsupervised machine learning methods from the autonomous agent perspective. The Markov decision process model provides a solid formal basis for reinforcement learning algorithms. Temporal abstraction mechanisms can be built on top of reinforcement learning, yielding significant performance gains. If the full observability assumption of the Markov decision process model is relaxed, the resulting model is the partially observable Markov decision process, which constitutes a more realistic but more difficult problem setting. Reinforcement learning research for partial observability focuses on techniques that reduce the negative impact of perceptual aliasing and huge state spaces. In the broadest sense, these studies fall into two categories: model-based approaches assume that the state transition model is available to the agent, whereas in model-free approaches the states are completely hidden from the agent. In this project, we propose methods to generalize a known sequence-based automatic temporal abstraction technique, namely the extended sequence tree method, to partial observability. We attack the problem in both the model-based and the model-free setting, showing that our methods accelerate well-known representatives of each perspective. The effectiveness of our methods is demonstrated by experiments on widely accepted benchmark problems.
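To make the two ideas in the abstract concrete, the sketch below shows a generic temporally extended action (an "option") executed in a toy environment where the agent receives only aliased observations rather than the true state. It is a minimal illustration under assumed names (ToyCorridor, Option, run_option are hypothetical), not the extended sequence tree method developed in this project.

```python
# Illustrative sketch only: a generic option (temporally extended action) executed
# under partial observability. The environment, option, and helper names below are
# hypothetical stand-ins, not the project's extended sequence tree method.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Option:
    """A temporally extended action: an inner policy plus a termination condition."""
    policy: Callable[[int], int]       # maps an observation to a primitive action
    terminate: Callable[[int], bool]   # whether to stop, given the current observation

class ToyCorridor:
    """A 1-D corridor where the agent only observes whether it is at either end.
    All interior cells yield the same observation (perceptual aliasing)."""
    def __init__(self, length: int = 8):
        self.length = length
        self.state = 0                 # true state, hidden from the agent

    def observe(self) -> int:
        if self.state == 0:
            return 0                   # left end
        if self.state == self.length - 1:
            return 2                   # right end (goal)
        return 1                       # every interior cell looks identical

    def step(self, action: int) -> int:
        # actions: 0 = move left, 1 = move right
        delta = 1 if action == 1 else -1
        self.state = max(0, min(self.length - 1, self.state + delta))
        return self.observe()

def run_option(env: ToyCorridor, option: Option, max_steps: int = 50) -> int:
    """Execute an option as a single macro-action; return the primitive steps used."""
    steps, obs = 0, env.observe()
    while not option.terminate(obs) and steps < max_steps:
        obs = env.step(option.policy(obs))
        steps += 1
    return steps

if __name__ == "__main__":
    env = ToyCorridor()
    go_right = Option(policy=lambda obs: 1, terminate=lambda obs: obs == 2)
    print("primitive steps taken by the macro-action:", run_option(env, go_right))
```

From the learning agent's point of view, the whole option behaves like one action, which is the sense in which temporal abstraction can shorten effective decision horizons; the aliased interior observations illustrate why partial observability makes such abstractions harder to discover automatically.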
Subject Keywords
Reinforcement Learning, Partially Observable Markov Decision Process, Temporal Abstraction, Extended Sequence Tree
URI
https://app.trdizin.gov.tr/publication/project/detail/TVRRMU5qYzM
https://hdl.handle.net/11511/49848
Collections
Department of Computer Engineering, Project and Design