Option discovery in reinforcement learning using frequent common subsequences of actions

2005-11-30
Girgin, Sertan
Polat, Faruk
Temporally abstract actions, or options, facilitate learning in large and complex domains by exploiting the sub-tasks of a problem and the hierarchical structure those sub-tasks form. In this paper, we study the automatic generation of options from common sub-sequences derived from state transition histories collected as learning progresses. The standard Q-learning algorithm is extended to use the generated options transparently, and the effectiveness of the method is demonstrated in Dietterich's Taxi domain.
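The record does not include the authors' algorithm, but as a minimal sketch of the core idea, the snippet below mines frequent common action sub-sequences (contiguous n-grams) from episode histories; such recurring sequences are the natural candidates for macro-actions/options. All names and the min_len/max_len/min_support parameters are illustrative assumptions, not the paper's implementation:

```python
from collections import Counter

def frequent_action_subsequences(histories, min_len=2, max_len=4, min_support=5):
    """Count contiguous action sub-sequences (n-grams) across episode
    histories and return those occurring at least min_support times.

    histories: list of episodes, each a list of primitive action ids.
    All parameters here are illustrative assumptions.
    """
    counts = Counter()
    for episode in histories:
        for n in range(min_len, max_len + 1):
            for i in range(len(episode) - n + 1):
                counts[tuple(episode[i:i + n])] += 1
    return [seq for seq, c in counts.items() if c >= min_support]

# Example: the action pair (1, 2) recurs across episodes and would be
# proposed as an option candidate.
histories = [
    [0, 1, 2, 3, 1, 2],
    [1, 2, 0, 1, 2, 3],
    [3, 1, 2, 1, 2, 0],
]
print(frequent_action_subsequences(histories, min_support=4))  # [(1, 2)]
```

Per the abstract, the discovered sequences would then be offered to a standard Q-learning agent as additional (temporally extended) actions, learned and selected transparently alongside the primitive ones.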
Citation Formats
S. Girgin and F. Polat, “Option discovery in reinforcement learning using frequent common subsequences of actions,” presented at the International Conference on Computational Intelligence for Modelling, Control and Automation / International Conference on Intelligent Agents, Web Technologies and International Commerce, Vienna, Austria, 2005, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/55280.