Learning sequences of compatible actions among agents

2002-03-01
Action coordination in multiagent systems is a difficult task, especially in dynamic environments. When the environment additionally imposes cooperation, minimal-communication, action-incompatibility, and local-information constraints, the task becomes even more difficult. This work studies the learning of compatible action sequences to achieve a designated goal under these constraints. Two new multiagent learning algorithms, called QACE and NoCommQACE, are developed. To improve the performance of the QACE and NoCommQACE algorithms, four heuristics are developed: state iteration, means-ends analysis, decreasing reward, and do-nothing. The proposed algorithms are tested on the blocks world domain and the performance results are reported.
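The abstract does not detail QACE's update rule, so as a rough illustration only, here is a minimal sketch of a tabular Q-learning-style agent whose action choice is filtered by a compatibility check, loosely mirroring the idea of learning compatible action sequences. All function names, the compatibility predicate, and the reward scheme are hypothetical; QACE's actual mechanism may differ.

```python
import random
from collections import defaultdict

def compatible(action, others_actions):
    # Hypothetical incompatibility constraint: an action conflicts if another
    # agent already chose it this step (e.g. two agents grabbing one block).
    return action not in others_actions

def choose_action(Q, state, actions, others_actions, epsilon=0.1):
    # Epsilon-greedy selection restricted to actions compatible with the
    # other agents' choices.
    legal = [a for a in actions if compatible(a, others_actions)]
    if not legal:
        return "wait"  # fall back to idling (cf. the "do-nothing" heuristic)
    if random.random() < epsilon:
        return random.choice(legal)
    return max(legal, key=lambda a: Q[(state, a)])

def q_update(Q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    # Standard one-step Q-learning backup over the agent's local view.
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next
                                   - Q[(state, action)])

# Illustrative use in a blocks-world-like step:
Q = defaultdict(float)
actions = ["pickup", "stack", "wait"]
a = choose_action(Q, "s0", actions, others_actions={"pickup"}, epsilon=0.0)
q_update(Q, "s0", a, reward=1.0, next_state="s1", actions=actions)
```

The compatibility filter is the point of the sketch: each agent learns values only over actions that do not clash with its teammates' concurrent choices, which is one plausible way to read "compatible action sequences".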
ARTIFICIAL INTELLIGENCE REVIEW

Suggestions

Tracking of ground targets with interacting multiple model estimator
Acar, Duygu; Baykal, Buyurman; Department of Electrical and Electronics Engineering (2011)
The Interacting Multiple Model (IMM) estimator is used extensively to estimate trajectories of maneuvering targets in cluttered environments. Standard tracking methods assume that the target's motion conforms to a certain model and that the target can be tracked using the state predictions of that model. However, targets can make different maneuvering movements, in which case a single model may be insufficient to express the target dynamics. In the IMM approach, the target dynamic mode...
Message scheduling algorithms for networked control systems with time slotted communication protocols
Senol, Sinan; Leblebicioğlu, Mehmet Kemal; Schmidt, Ece Gueran (2008-04-22)
Feedback control systems whose control loops are closed over a real-time communication network are called Networked Control Systems (NCSs). In this study, the message scheduling problem for NCSs with time slotted communication protocols is tackled. First, an algorithmic approach is presented to check the necessary and sufficient conditions for a given message schedule such that the real-time requirements of the system are satisfied. Second, starting from an initial schedule, Simulated Annealing is used to...
Voluntary Behavior on Cortical Learning Algorithm Based Agents
Sungur, Ali Kaan; Sürer, Elif (2016-09-23)
Operating autonomous agents inside a 3D workspace is a challenging real-time problem domain in dynamic environments, since it involves online interaction with ever-changing decision constraints. This study proposes a neuroscience-inspired architecture to simulate autonomous agents with interaction capabilities inside a 3D virtual world. The environment stimulates the operating agents based on their place and course of action. They are expected to form a life cycle composed of behavior chunks inside this ...
Improving reinforcement learning using distinctive clues of the environment
Demir, Alper; Polat, Faruk; Department of Computer Engineering (2019)
Effective decomposition and abstraction have been shown to improve the performance of Reinforcement Learning. An agent can use clues from the environment either to partition the problem into sub-problems or to get informed about its progress in a given task. In a fully observable environment such clues may come from subgoals, while in a partially observable environment they may be provided by unique experiences. The contribution of this thesis is twofold: first, improvements over automatic subgoal identifica...
Dynamic constraint satisfaction algorithm for online feature model reconfiguration
Oğuztüzün, Mehmet Halit S.; Entekhabi, Sina (2019-07-01)
Dynamically reconfigurable systems are able to respond to changes in their operational environments by reconfiguring themselves automatically. Dynamic software product lines are dynamically reconfigurable systems with an explicit variability model that guides the reconfiguration. In this work, feature models are used as the variability model. An emerging situation in the environment can lead to some relevant changes to the current configuration: some features must be activated, and some must be deactivated....
Citation Formats
F. Polat, “Learning sequences of compatible actions among agents,” ARTIFICIAL INTELLIGENCE REVIEW, pp. 21–37, 2002, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/44700.