Abstraction in Reinforcement Learning
Date
2007
Author
Girgin, Serhat
Item Usage Stats
231 views, 75 downloads
Abstract
Reinforcement learning is the problem faced by an agent that must learn behavior through trial-and-error interactions with a dynamic environment. Generally, the problem to be solved contains subtasks that repeat at different regions of the state space. Without any guidance, an agent has to learn the solutions of all subtask instances independently, which degrades learning performance. In this thesis, we propose two approaches that build connections between different regions of the search space, leading to better utilization of gained experience and accelerated learning. In the first approach, we extend the existing work of McGovern and formalize stochastic conditionally terminating sequences, which have higher representational power. We then describe how to efficiently discover and employ useful abstractions during learning based on such sequences. The method constructs a tree structure to keep track of frequently used action sequences together with the states visited; this tree is then used to select the actions to be executed at each step. In the second approach, we propose a novel method to identify states with similar sub-policies and show how they can be integrated into the reinforcement learning framework to improve learning performance. The method uses an efficient data structure to find common action sequences starting from observed states, and defines a similarity function between states based on the number of such sequences. Using this similarity function, updates to the action-value function of a state are reflected onto all similar states, which allows experience acquired during learning to be applied in a broader context. The effectiveness of both approaches is demonstrated empirically through extensive experiments on various domains.
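To make the second approach concrete, here is a minimal sketch of sharing action-value updates among similar states, assuming a tabular Q-learning setting. All names and parameters (`Q`, `similarity`, `q_update`, `SIM_THRESHOLD`, `ALPHA`, `GAMMA`) are illustrative assumptions, and the Jaccard overlap over recorded action sequences stands in for the thesis's sequence-count-based similarity function.

```python
# Minimal sketch, NOT the thesis's implementation: tabular Q-learning in
# which an update to one state's action value is reflected onto states
# that share many observed action sequences with it. The similarity
# measure (Jaccard overlap) and all constants are assumptions.
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.95   # assumed learning rate and discount factor
SIM_THRESHOLD = 0.8        # assumed minimum similarity for update sharing

Q = defaultdict(float)        # action-value table keyed by (state, action)
sequences = defaultdict(set)  # state -> action sequences observed from it

def record_sequence(state, action_seq):
    """Record an action sequence observed to start from `state`.
    (The thesis uses an efficient data structure for this; a plain
    set of tuples keeps the sketch simple.)"""
    sequences[state].add(tuple(action_seq))

def similarity(s1, s2):
    """Overlap of the action sequences observed from two states."""
    a, b = sequences[s1], sequences[s2]
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def q_update(s, a, r, s_next, actions, known_states):
    """Standard Q-learning update on (s, a), then reflected onto all
    sufficiently similar states, scaled by the similarity score."""
    target = r + GAMMA * max(Q[(s_next, a2)] for a2 in actions)
    delta = ALPHA * (target - Q[(s, a)])
    Q[(s, a)] += delta
    for s2 in known_states:
        w = similarity(s, s2)
        if s2 != s and w >= SIM_THRESHOLD:
            Q[(s2, a)] += w * delta
```

Scanning all known states on every update is quadratic and serves only to illustrate the idea; the point of the efficient data structure described in the abstract is precisely to make finding states with common action sequences cheap.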
Subject Keywords
Computer Engineering, Computer Science
URI
http://etd.lib.metu.edu.tr/upload/12608257/index.pdf
https://hdl.handle.net/11511/17216
Collections
Graduate School of Natural and Applied Sciences, Thesis
Suggestions
Resource based plan revision in dynamic multi-agent environments
Erdoğdu, Utku; Polat, Faruk; Department of Computer Engineering (2004)
The planning framework is commonly used to represent intelligent agents effectively and to model complex behavior. Within this framework, the resource-based perspective is interesting in the sense that, in a multi-agent environment, the exchange of resources can form a cooperative interaction. In resource-based plan coordination, each agent constructs an individual plan; the plans are then examined by a central plan-revision unit for possibilities of removing actions. The domain of this work is the classical postmen domain that...
A comparison of subspace based face recognition methods
Gönder, Özkan; Halıcı, Uğur; Department of Electrical and Electronics Engineering (2004)
Different approaches to face recognition are studied in this thesis: PCA (Eigenface), Kernel Eigenface, and Fisher LDA. Principal component analysis extracts the most important information contained in the face to construct a computational model that best describes it. In the Eigenface approach, variations between the face images are described by using a set of characteristic face images in order to find the eigenvectors (Eigenfaces) of the covariance matrix of the distribution,...
Action recognition through action generation
Akgün, Barış; Şahin, Erol; Department of Computer Engineering (2010)
This thesis investigates how a robot can use action generation mechanisms to recognize the action of an observed actor in an on-line manner, i.e., before the completion of the action. Towards this end, Dynamic Movement Primitives (DMP), an action generation method proposed for imitation, are modified to recognize the actions of an actor. Specifically, a human actor performed three different reaching actions to two different objects. Three DMPs, each corresponding to a different reaching action, were trained...
Reinforcement learning using potential field for role assignment in a multi-robot two-team game
Fidan, Özgül; Erkmen, İsmet; Department of Electrical and Electronics Engineering (2004)
In this work, reinforcement learning algorithms are studied with the help of potential field methods, using robosoccer simulators as test beds. Reinforcement Learning (RL) is a framework for general problem solving in which an agent can learn through experience. The soccer game is selected as the problem domain, as a way of experimenting with multi-agent team behaviors, because of its popularity and complexity.
Anomaly detection from personal usage patterns in web applications
Vural, Gürkan; Yöndem (Turhan), Meltem; Department of Computer Engineering (2006)
The anomaly detection task is to recognize the presence of an unusual (and potentially hazardous) state within the behaviors or activities of a computer user, system, or network with respect to some model of normal behavior which may be either hard-coded or learned from observation. An anomaly detection agent faces many learning problems including learning from streams of temporal data, learning from instances of a single class, and adaptation to a dynamically changing concept. The domain is complicated by ...
Citation Formats
IEEE
S. Girgin, “Abstraction in Reinforcement Learning,” Ph.D. - Doctoral Program, Middle East Technical University, 2007.