Potential-based reward shaping using state–space segmentation for efficiency in reinforcement learning
Date
2024-08-01
Author
Bal, Melis İlayda
Aydın, Hüseyin
İyigün, Cem
Polat, Faruk
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Abstract
Reinforcement Learning (RL) algorithms encounter slow learning in environments with sparse explicit reward structures due to the limited feedback available on the agent's behavior. This problem is exacerbated particularly in complex tasks with large state and action spaces. To address this inefficiency, in this paper, we propose a novel approach based on potential-based reward shaping using state–space segmentation to decompose the task and to provide more frequent feedback to the agent. Our approach involves extracting state–space segments by formulating the problem as a minimum cut problem on a transition graph, constructed using the agent's experiences during interactions with the environment via the Extended Segmented Q-Cut algorithm. Subsequently, these segments are leveraged in the agent's learning process through potential-based reward shaping. Our experimentation on benchmark problem domains with sparse rewards demonstrated that our proposed method effectively accelerates the agent's learning without compromising computation time while upholding the policy invariance principle.
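The shaping term in potential-based reward shaping takes the standard form F(s, s') = γΦ(s') − Φ(s), which is what guarantees the policy invariance mentioned above. A minimal sketch of how segment-derived potentials could feed into this term is shown below; the segment assignments and potential values here are hypothetical stand-ins for illustration, not the paper's Extended Segmented Q-Cut construction.

```python
GAMMA = 0.99  # discount factor

# Hypothetical segmentation: each state belongs to a segment, and
# segments nearer the goal are assigned a higher potential.
segment_of = {"s0": 0, "s1": 0, "s2": 1, "s3": 1, "goal": 2}
segment_potential = {0: 0.0, 1: 0.5, 2: 1.0}

def phi(state):
    """Potential of a state = potential of its segment."""
    return segment_potential[segment_of[state]]

def shaped_reward(state, next_state, env_reward):
    """Sparse environment reward plus the potential-based shaping
    term F(s, s') = gamma * phi(s') - phi(s)."""
    return env_reward + GAMMA * phi(next_state) - phi(state)

# Crossing into a higher-potential segment yields positive feedback
# even when the environment reward is 0 (sparse):
print(shaped_reward("s1", "s2", 0.0))  # 0.99 * 0.5 - 0.0 = 0.495
# Moving within the same segment adds no shaping signal:
print(shaped_reward("s0", "s1", 0.0))  # 0.0
```

Because the shaping term is a telescoping difference of potentials, it densifies the reward signal along segment boundaries without changing which policies are optimal.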
Subject Keywords
Potential-based reward shaping, Reinforcement learning, Reward shaping, Sparse rewards, State–space segmentation
URI
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85190141334&origin=inward
https://hdl.handle.net/11511/109361
Journal
Future Generation Computer Systems
DOI
https://doi.org/10.1016/j.future.2024.03.057
Collections
Department of Computer Engineering, Article
Citation (IEEE)
M. İ. Bal, H. Aydın, C. İyigün, and F. Polat, “Potential-based reward shaping using state–space segmentation for efficiency in reinforcement learning,” Future Generation Computer Systems, vol. 157, pp. 469–484, 2024, Accessed: 00, 2024. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85190141334&origin=inward.