Improving reinforcement learning by using sequence trees

Date

2010-12-01

Author

Girgin, Sertan
Polat, Faruk
Alhajj, Reda

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

This paper proposes a novel approach for discovering options in the form of stochastic conditionally terminating sequences and shows how such sequences can be integrated into the reinforcement learning framework to improve learning performance. The method uses stored histories of possible optimal policies to construct a specialized tree structure during learning. This tree makes it easier to identify frequently used action sequences, together with the states visited while those sequences are executed. The tree is updated continually and is used to run the corresponding options implicitly. The effectiveness of the method is demonstrated empirically through extensive experiments on various domains with different properties.
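
The abstract does not specify how the tree is built, updated, or used to run options, so the following is only a minimal Python sketch of the general idea: a trie over action sequences that counts how often each sequence occurs in stored histories and records the states seen along the way. The names `SequenceNode`, `SequenceTree`, `add_history`, `frequent_sequences`, and the parameters `max_len` and `min_visits` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class SequenceNode:
    """One node of a hypothetical action-sequence tree (a trie keyed by action)."""

    def __init__(self):
        self.children = {}              # action -> SequenceNode
        self.visits = 0                 # number of stored windows passing through this node
        self.states = defaultdict(int)  # state in which the action was taken -> count


class SequenceTree:
    """Illustrative trie of action sequences; not the paper's data structure."""

    def __init__(self):
        self.root = SequenceNode()

    def add_history(self, history, max_len=3):
        """Insert every window of up to max_len (state, action) steps of one episode."""
        for start in range(len(history)):
            node = self.root
            for state, action in history[start:start + max_len]:
                node = node.children.setdefault(action, SequenceNode())
                node.visits += 1
                node.states[state] += 1

    def frequent_sequences(self, min_visits):
        """Return sorted (action_tuple, visits) pairs seen at least min_visits times."""
        found = []
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for action, child in node.children.items():
                # A child is never visited more often than its parent,
                # so infrequent branches can be skipped entirely.
                if child.visits >= min_visits:
                    seq = prefix + (action,)
                    found.append((seq, child.visits))
                    stack.append((seq, child))
        return sorted(found)
```

Sequences returned by `frequent_sequences` could serve as candidate options in such a sketch, with the per-node `states` counts indicating where each candidate was observed. How the paper itself selects, updates, and terminates options is not stated in the abstract.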

Subject Keywords

Software, Artificial Intelligence

URI

https://hdl.handle.net/11511/36002

Journal

Machine Learning

DOI

https://doi.org/10.1007/s10994-010-5182-y

Collections

Department of Computer Engineering, Article

Citation Formats

S. Girgin, F. Polat, and R. Alhajj, “Improving reinforcement learning by using sequence trees,” Machine Learning, pp. 283–331, 2010. Accessed: 2020. [Online]. Available: https://hdl.handle.net/11511/36002.