Landmark Based Reward Shaping in Reinforcement Learning with Hidden States

Date

2019-01-01

Author

Demir, Alper
Cilden, Erkin
Polat, Faruk

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Most work on reward shaping focuses on fully observable problems, and very few studies couple reward shaping with partial observability. Moreover, for problems with hidden states, where there is no prior information about the underlying states, reward shaping opportunities remain unexplored. In this paper, we show that landmarks can be used to shape the rewards in reinforcement learning with hidden states. The proposed approach is empirically shown to improve learning performance in terms of speed and quality.
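As a rough illustration of the general idea of shaping rewards around landmarks, the sketch below applies standard potential-based reward shaping, with potentials defined only on landmark observations. It is not the method from the paper: the landmark names, the potential values, the discount factor, and the function names are all hypothetical, and giving every non-landmark observation a potential of zero is an assumption made for simplicity.

```python
from typing import Dict, Hashable

GAMMA = 0.95  # discount factor (assumed value, not from the paper)

# Hypothetical potentials for landmark observations (illustrative values only).
# A larger value means the landmark is assumed to be closer to the goal.
LANDMARK_POTENTIAL: Dict[Hashable, float] = {
    "door": 1.0,
    "corridor_end": 2.0,
    "goal": 3.0,
}


def potential(observation: Hashable) -> float:
    """Return the potential of a landmark, or 0.0 for any other observation."""
    return LANDMARK_POTENTIAL.get(observation, 0.0)


def shaped_reward(
    reward: float,
    obs: Hashable,
    next_obs: Hashable,
    gamma: float = GAMMA,
) -> float:
    """Potential-based shaping: r + gamma * phi(next_obs) - phi(obs)."""
    return reward + gamma * potential(next_obs) - potential(obs)
```

Under these assumptions, a transition between two non-landmark observations leaves the environment reward unchanged, while moving from one landmark to a higher-potential landmark adds a bonus.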

Subject Keywords

Computing methodologies, Machine learning, Learning paradigms, Machine learning approaches, Reinforcement learning, Partially-observable Markov decision processes

URI

https://hdl.handle.net/11511/53047

Conference Name

AAMAS '19: International Conference on Autonomous Agents and Multiagent Systems

Collections

Department of Computer Engineering, Conference / Seminar

Citation Formats

A. Demir, E. Cilden, and F. Polat, “Landmark Based Reward Shaping in Reinforcement Learning with Hidden States,” presented at the AAMAS ’19: International Conference on Autonomous Agents and Multiagent Systems, Montreal QC Canada, 2019. [Online]. Available: https://hdl.handle.net/11511/53047.