Improving reinforcement learning using distinctive clues of the environment

2019
Demir, Alper
Effective decomposition and abstraction has been shown to improve the performance of Reinforcement Learning. An agent can use the clues from the environment to either partition the problem into sub-problems or get informed about its progress in a given task. In a fully observable environment such clues may come from subgoals while in a partially observable environment they may be provided by unique experiences. The contribution of this thesis is two fold; first improvements over automatic subgoal identification and option generation in fully observable environments is proposed, then an automatic landmark identification and an anchor based guiding mechanism in partially observable environments is introduced. Moreover, for both type of problems, the thesis proposes an overall framework that is shown to outperform baseline learning algorithms on several benchmark domains.

Suggestions

Automatic identification of transitional bottlenecks in reinforcement learning under partial observability
Aydın, Hüseyin; Polat, Faruk; Department of Computer Engineering (2017)
Instance-based methods are proven tools to solve reinforcement learning problems with hidden states. Nearest Sequence Memory (NSM) is a widely known instance-based approach mainly based on k-Nearest Neighbor algorithm. NSM keeps track of raw history of action-observation-reward instances within a fixed length (or ideally unlimited) memory. It calculates the neighborhood for the current state through a recursive comparison of the matching action-observation-reward tuples with the previous ones. The ones with...
Effective subgoal discovery and option generation in reinforcement learning
Demir, Alper; Polat, Faruk; Department of Computer Engineering (2016)
Subgoal discovery is proven to be a practical way to cope with large state spaces in Reinforcement Learning. Subgoals are natural hints to partition the problem into sub-problems, allowing the agent to solve each sub-problem separately. Identification of such subgoal states in the early phases of the learning process increases the learning speed of the agent. In a problem modeled as a Markov Decision Process, subgoal states possess key features that distinguish them from the ordinary ones. A learning agent ...
A Concept Filtering Approach for Diverse Density to Discover Subgoals in Reinforcement Learning
DEMİR, ALPER; Cilden, Erkin; Polat, Faruk (2017-11-08)
In the reinforcement learning context, subgoal discovery methods aim to find bottlenecks in problem state space so that the problem can naturally be decomposed into smaller subproblems. In this paper, we propose a concept filtering method that extends an existing subgoal discovery method, namely diverse density, to be used for both fully and partially observable RL problems. The proposed method is successful in discovering useful subgoals with the help of multiple instance learning. Compared to the original...
EMDD-RL: faster subgoal identification with diverse density in reinforcement learning
Sunel, Saim; Polat, Faruk; Department of Computer Engineering (2021-1-15)
Diverse Density (DD) algorithm is a well-known multiple instance learning method, also known to be effective to automatically identify sub-goals and improve Reinforcement Learning (RL). Expectation-Maximization Diverse Density (EMDD) improves DD in terms of both speed and accuracy. This study adapts EMDD to automatically identify subgoals for RL which is shown to perform significantly faster (3 to 10 times) than its predecessor, without sacrificing solution quality. The performance of the proposed method na...
Using chains of bottleneck transitions to decompose and solve reinforcement learning tasks with hidden states
Aydın, Hüseyin; Çilden, Erkin; Polat, Faruk (2022-08-01)
Reinforcement learning is known to underperform in large and ambiguous problem domains under partial observability. In such cases, a proper decomposition of the task can improve and accelerate the learning process. Even ambiguous and complex problems that are not solvable by conventional methods turn out to be easier to handle by using a convenient problem decomposition, followed by the incorporation of machine learning methods for the sub-problems. Like in most real-life problems, the decomposition of a ta...
Citation Formats
A. Demir, “Improving reinforcement learning using distinctive clues of the environment,” Thesis (Ph.D.) -- Graduate School of Natural and Applied Sciences. Computer Engineering., Middle East Technical University, 2019.