Population-based exploration in reinforcement learning through repulsive reward shaping using eligibility traces

Date

2024-01-01

Author

Bal, Melis Ilayda
İyigün, Cem
Polat, Faruk
Aydın, Hüseyin

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Abstract

Efficient exploration is key to improving the learning speed and sample efficiency of reinforcement learning (RL) agents. In this paper, we propose a population-based repulsive reward shaping framework that uses eligibility traces to explore the state space more efficiently in tabular reinforcement learning. The framework is a hierarchy of RL agents in which a higher-level repulsive-reward-shaper agent (RRS-Agent) coordinates the exploration of its population of sub-agents, applying repulsion when the necessary conditions on their eligibility traces are met. Empirical results on well-known benchmark domains show that the framework explores efficiently and significantly improves learning performance and state-space coverage. Furthermore, the framework is transparent: the decisions the agents in the hierarchy make to explore the state space in a coordinated manner can be explained, which supports interpretability.
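The abstract does not state the exact conditions under which repulsion is applied, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes accumulating eligibility traces kept per sub-agent in a tabular state space, and a hypothetical rule in which a sub-agent receives a negative shaping reward for entering a state whose trace for any other sub-agent exceeds a threshold. The function names, `threshold`, `penalty`, and the decay parameters are all illustrative assumptions.

```python
import numpy as np

def decay_and_mark(trace, state, gamma=0.99, lam=0.9):
    """Accumulating eligibility trace: decay every entry, then bump the visited state."""
    trace *= gamma * lam
    trace[state] += 1.0
    return trace

def repulsive_bonus(traces, agent, state, threshold=0.5, penalty=1.0):
    """Hypothetical shaping term (an assumption, not taken from the paper):
    penalise `agent` for entering a state that another sub-agent visited
    recently, i.e. whose trace there is above `threshold`."""
    others = np.delete(traces[:, state], agent)
    return -penalty if others.max(initial=0.0) > threshold else 0.0

n_agents, n_states = 2, 5
traces = np.zeros((n_agents, n_states))  # one trace vector per sub-agent

decay_and_mark(traces[0], state=3)       # sub-agent 0 visits state 3
r_repelled = repulsive_bonus(traces, agent=1, state=3)  # sub-agent 1 follows it
r_free = repulsive_bonus(traces, agent=1, state=4)      # sub-agent 1 goes elsewhere
```

In this sketch, the shaping term would be added to the environment reward of the sub-agent before its tabular value update, so that sub-agents are pushed away from regions their peers have just explored.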

Subject Keywords

Coordinated agents, Eligibility traces, Population-based exploration, Reinforcement learning, Reward shaping

URI

https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85182712212&origin=inward
https://hdl.handle.net/11511/108575

Journal

Annals of Operations Research

DOI

https://doi.org/10.1007/s10479-023-05798-1

Collections

Department of Industrial Engineering, Article

Citation Formats

M. I. Bal, C. İyigün, F. Polat, and H. Aydın, “Population-based exploration in reinforcement learning through repulsive reward shaping using eligibility traces,” Annals of Operations Research, 2024. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85182712212&origin=inward.