Option discovery in reinforcement learning using frequent common subsequences of actions

Girgin, Sertan
Polat, Faruk
Temporally abstract actions, or options, facilitate learning in large and complex domains by exploiting sub-tasks and the hierarchical structure of the problem formed by these sub-tasks. In this paper, we study the automatic generation of options using common sub-sequences derived from the state transition histories collected as learning progresses. The standard Q-learning algorithm is extended to use the generated options transparently, and the effectiveness of the method is demonstrated in Dietterich's Taxi domain.
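As a rough illustration of the idea of mining common action sub-sequences from collected histories, the following minimal sketch counts contiguous action sub-sequences that recur across several episodes and keeps the frequent ones as candidate options. It is not the method of the paper: the function name `frequent_action_subsequences`, the parameters `min_len`, `max_len`, and `min_support`, the restriction to contiguous sub-sequences, and the toy histories are all assumptions made for illustration.

```python
from collections import Counter


def frequent_action_subsequences(histories, min_len=2, max_len=4, min_support=2):
    """Return contiguous action sub-sequences found in at least
    `min_support` histories, mapped to the number of histories containing them.

    Hypothetical helper: the names and thresholds are illustrative assumptions.
    """
    support = Counter()
    for actions in histories:
        # Collect each distinct sub-sequence once per history, so that
        # support counts histories rather than raw occurrences.
        seen = set()
        for n in range(min_len, max_len + 1):
            for i in range(len(actions) - n + 1):
                seen.add(tuple(actions[i:i + n]))
        support.update(seen)
    return {seq: count for seq, count in support.items() if count >= min_support}


# Toy action histories (assumed data, loosely styled after a grid-world task).
histories = [
    ["north", "north", "pickup", "south"],
    ["east", "north", "north", "pickup"],
    ["north", "pickup", "west"],
]

candidates = frequent_action_subsequences(histories, min_support=2)
for seq, count in sorted(candidates.items()):
    print(seq, count)
```

Each surviving sub-sequence could then be treated as a candidate option, that is, a macro-action that a Q-learning agent may select alongside the primitive actions. How such candidates are represented and invoked in the actual method is not specified here.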