Using Transitional Bottlenecks to Improve Learning in Nearest Sequence Memory Algorithm

Instance-based methods are proven tools to solve reinforcement learning problems with hidden states. Nearest Sequence Memory (NSM) is a widely known instance-based approach mainly based on k-Nearest Neighbor algorithm. It keeps the history of the agent in terms of action-observation-reward tuples and uses it to vote for the best upcoming action. In this work, an improving heuristic is proposed for the NSM algorithm which provides the agent an additional prior information, namely transitional bottlenecks, on the way to goal. Additionally, a tuple extension pattern is shown to further improve the heuristic by means of ambiguity reduction due to the nature of transitional bottlenecks, thus increase the learning speed. Empirical results indicate a significant improvement in learning performance, in terms of number of steps to goal.