Using Transitional Bottlenecks to Improve Learning in Nearest Sequence Memory Algorithm

2017-11-08
Instance-based methods are proven tools to solve reinforcement learning problems with hidden states. Nearest Sequence Memory (NSM) is a widely known instance-based approach mainly based on k-Nearest Neighbor algorithm. It keeps the history of the agent in terms of action-observation-reward tuples and uses it to vote for the best upcoming action. In this work, an improving heuristic is proposed for the NSM algorithm which provides the agent an additional prior information, namely transitional bottlenecks, on the way to goal. Additionally, a tuple extension pattern is shown to further improve the heuristic by means of ambiguity reduction due to the nature of transitional bottlenecks, thus increase the learning speed. Empirical results indicate a significant improvement in learning performance, in terms of number of steps to goal.

Suggestions

Automatic identification of transitional bottlenecks in reinforcement learning under partial observability
Aydın, Hüseyin; Polat, Faruk; Department of Computer Engineering (2017)
Instance-based methods are proven tools to solve reinforcement learning problems with hidden states. Nearest Sequence Memory (NSM) is a widely known instance-based approach mainly based on k-Nearest Neighbor algorithm. NSM keeps track of raw history of action-observation-reward instances within a fixed length (or ideally unlimited) memory. It calculates the neighborhood for the current state through a recursive comparison of the matching action-observation-reward tuples with the previous ones. The ones with...
Using chains of bottleneck transitions to decompose and solve reinforcement learning tasks with hidden states
Aydın, Hüseyin; Çilden, Erkin; Polat, Faruk (2022-08-01)
Reinforcement learning is known to underperform in large and ambiguous problem domains under partial observability. In such cases, a proper decomposition of the task can improve and accelerate the learning process. Even ambiguous and complex problems that are not solvable by conventional methods turn out to be easier to handle by using a convenient problem decomposition, followed by the incorporation of machine learning methods for the sub-problems. Like in most real-life problems, the decomposition of a ta...
Using Frequencies of Transitions to Improve Reinforcement Learning with Hidden States
Aydın, Hüseyin; Polat, Faruk; Çilden, Erkin; Department of Computer Engineering (2022-8)
Reinforcement learning problems with hidden states suffer from the ambiguity of the environment, since the ambiguity in the agent's perception may prevent the agent from estimating its current state correctly. Therefore, constructing a solution without using an external memory may be extremely difficult or even impossible sometimes. In an ambiguous environment, frequencies of the transitions can provide more reliable information and hence it may lead us to construct more efficient and effective memory inst...
A Concept Filtering Approach for Diverse Density to Discover Subgoals in Reinforcement Learning
DEMİR, ALPER; Cilden, Erkin; Polat, Faruk (2017-11-08)
In the reinforcement learning context, subgoal discovery methods aim to find bottlenecks in problem state space so that the problem can naturally be decomposed into smaller subproblems. In this paper, we propose a concept filtering method that extends an existing subgoal discovery method, namely diverse density, to be used for both fully and partially observable RL problems. The proposed method is successful in discovering useful subgoals with the help of multiple instance learning. Compared to the original...
Compact Frequency Memory for Reinforcement Learning with Hidden States.
Polat, Faruk; Cilden, Erkin (2019-10-28)
Memory-based reinforcement learning approaches keep track of past experiences of the agent in environments with hidden states. This may require extensive use of memory that limits the practice of these methods in a real-life problem. The motivation behind this study is the observation that less frequent transitions provide more reliable information about the current state of the agent in ambiguous environments. In this work, a selective memory approach based on the frequencies of transitions is proposed to ...
Citation Formats
H. Aydın and F. Polat, “Using Transitional Bottlenecks to Improve Learning in Nearest Sequence Memory Algorithm,” 2017, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/41896.