Learning cooperation in hunter-prey problem via state abstraction

2009
İşçen, Atıl
The Hunter-Prey (or Prey-Pursuit) problem is a common toy domain for Reinforcement Learning, but the size of its state space is exponential in parameters such as the size of the grid and the number of agents. Because the size of the state space makes flat Q-learning infeasible across different scenarios, this thesis presents an approach that keeps the state space at a constant size by producing agents that reuse previously learned knowledge to perform in larger scenarios containing more agents. Inspired by hierarchical reinforcement learning (HRL) methods, the method consists of a parallel subtask schema that divides the task into choices among simpler subtasks, a state representation technique suited to this schema, and an extension of that representation to larger grids. Experimental results show that the proposed method yields agents that perform close to hand-coded agents while using a constant-sized state space that is independent of the domain parameters.
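To illustrate the growth described in the abstract, the following minimal Python sketch counts joint states for a flat tabular learner. It is not taken from the thesis: the function name `flat_state_count` and the encoding (one absolute grid cell per agent, agents allowed to share a cell) are assumptions chosen only to make the arithmetic concrete.

```python
def flat_state_count(grid_size: int, num_hunters: int, num_prey: int = 1) -> int:
    """Count joint position configurations on a square grid.

    Assumption (not from the thesis): the flat state is the tuple of
    absolute cell positions of every hunter and every prey, and agents
    may occupy the same cell.
    """
    cells = grid_size * grid_size
    return cells ** (num_hunters + num_prey)


if __name__ == "__main__":
    # Show how quickly the table a flat Q-learner would need grows.
    for grid_size, num_hunters in [(5, 2), (10, 2), (10, 4), (20, 4)]:
        count = flat_state_count(grid_size, num_hunters)
        print(f"grid={grid_size}x{grid_size}, hunters={num_hunters}: {count} states")
```

Under this assumed encoding, each additional agent multiplies the number of states by the number of grid cells, which is why a representation whose size does not depend on these parameters is attractive.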

Suggestions

A new offline path search algorithm for computer games that considers damage as a feasibility criterion
Bayılı, Serhat; Polat, Faruk; Department of Computer Engineering (2008)
Pathfinding algorithms used in today’s computer games consider path length or a similar criterion as the only measure of optimality. However, these games usually involve opposing parties, whose agents can inflict damage on those of the others’. Therefore, the shortest path in such games may not always be the safest one. Consequently, a new suboptimal offline path search algorithm that takes the threat sources into consideration was developed, based on the A* algorithm. Given an upper bound value as the tole...
Using semantic web services for data integration in banking domain
Okat, Çağlar; Doğru, Ali Hikmet; Department of Computer Engineering (2010)
A semantic model oriented transformation mechanism is developed for the centralization of intra-enterprise data integration. Such a mechanism is especially crucial in the banking domain which is selected in this study. A new domain ontology is constructed to provide basis for annotations. A bottom-up approach is preferred for semantic annotations to utilize existing web service definitions. Transformations between syntactic web service XML responses and semantic model concepts are defined in transformation ...
Tracking of ground targets with interacting multiple model estimator
Acar, Duygu; Baykal, Buyurman; Department of Electrical and Electronics Engineering (2011)
The Interacting Multiple Model (IMM) estimator is used extensively to estimate the trajectories of maneuvering targets in cluttered environments. Standard tracking methods assume that the movement of the target conforms to a certain model and that the target can be monitored using the status predictions of that model. However, targets can make different maneuvering movements, in which case describing the target dynamics with only one model can be insufficient. In IMM approach, target dynamic mode...
A pattern classification approach boosted with genetic algorithms
Yalabık, İsmet; Yarman Vural, Fatoş Tunay; Department of Computer Engineering (2007)
Ensemble learning is a multiple-classifier machine learning approach which combines, produces collections and ensembles statistical classifiers to build up more accurate classifier than the individual classifiers. Bagging, boosting and voting methods are the basic examples of ensemble learning. In this thesis, a novel boosting technique targeting to solve partial problems of AdaBoost, a well-known boosting algorithm, is proposed. The proposed systems find an elegant way of boosting a bunch of classifiers successively t...
Natural language query processing in ontology based multimedia databases
Aygül, Filiz Alaca; Çiçekli, Fehime Nihan; Department of Computer Engineering (2010)
In this thesis a natural language query interface is developed for semantic and spatio-temporal querying of MPEG-7 based domain ontologies. The underlying ontology is created by attaching domain ontologies to the core Rhizomik MPEG-7 ontology. The user can pose concept, complex concept (objects connected with an “AND” or “OR” connector), spatial (left, right . . . ), temporal (before, after, at least 10 minutes before, 5 minutes after . . . ), object trajectory and directional trajectory (east, west, southe...
Citation Formats
A. İşçen, “Learning cooperation in hunter-prey problem via state abstraction,” M.S. - Master of Science, Middle East Technical University, 2009.