Option discovery in reinforcement learning using frequent common subsequences of actions

2005-11-30
Girgin, Sertan
Polat, Faruk
Temporally abstract actions, or options, facilitate learning in large and complex domains by exploiting sub-tasks and the hierarchical structure of the problem formed by these sub-tasks. In this paper, we study the automatic generation of options using common sub-sequences derived from the state transition histories collected as learning progresses. The standard Q-learning algorithm is extended to use the generated options transparently, and the effectiveness of the method is demonstrated in Dietterich's Taxi domain.
International Conference on Computational Intelligence for Modelling, Control and Automation/International Conference on Intelligent Agents Web Technologies and International Commerce
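
The idea in the abstract of mining common action sub-sequences from collected histories can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes each episode history has already been reduced to a list of action names, it only considers contiguous sub-sequences, and the function name `frequent_action_subsequences` and its parameters (`min_len`, `max_len`, `min_support`) are hypothetical names introduced here for illustration.

```python
from collections import Counter


def frequent_action_subsequences(episodes, min_len=2, max_len=4, min_support=0.5):
    """Return contiguous action sub-sequences that occur in at least a
    `min_support` fraction of the episodes (hypothetical helper, not from the paper)."""
    counts = Counter()
    for actions in episodes:
        seen = set()  # count each sub-sequence at most once per episode
        for n in range(min_len, max_len + 1):
            for i in range(len(actions) - n + 1):
                seen.add(tuple(actions[i:i + n]))
        counts.update(seen)
    threshold = min_support * len(episodes)
    return {seq: c for seq, c in counts.items() if c >= threshold}


# Toy action histories loosely modelled on Taxi-style actions (made-up data).
episodes = [
    ["north", "north", "pickup", "south"],
    ["east", "north", "north", "pickup"],
    ["north", "pickup", "west"],
]

# Sub-sequences shared by every episode become candidate options.
candidates = frequent_action_subsequences(episodes, min_support=1.0)
print(candidates)
```

In this sketch, each returned sub-sequence could be treated as a candidate option, that is, a macro-action a learner may select alongside primitive actions; how the paper actually builds and uses options within Q-learning is not specified in the abstract.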

Citation Formats
S. Girgin and F. Polat, “Option discovery in reinforcement learning using frequent common subsequences of actions,” presented at the International Conference on Computational Intelligence for Modelling, Control and Automation/International Conference on Intelligent Agents Web Technologies and International Commerce, Vienna, Austria, 2005, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/55280.