Optimization of reward shaping function based on genetic algorithm applied to a cross validated deep deterministic policy gradient in a powered landing guidance problem

2023-04-01
Nugroho, Larasmoyo
Andiarti, Rika
Akmeliawati, Rini
Kutay, Ali Türker
Larasati, Diva Kartika
Wijaya, Sastra Kusuma
A Deep Reinforcement Learning (DRL) agent's ability to control a specific vehicle in an environment without any prior knowledge rests on decision-making driven by a well-designed reward shaping function (RSF). The RSF is an important but little-studied factor that can significantly alter the training reward score and performance outcomes. To maximize the control efficacy of a DRL algorithm, an optimized RSF and a solid hyperparameter combination are essential. To achieve optimal control during the powered descent guidance (PDG) landing phase of a reusable launch vehicle, this paper uses the Deep Deterministic Policy Gradient (DDPG) algorithm, combined with a genetic algorithm (GA), to discover the best shape of the RSF. Although DDPG is quite capable of managing complex environments and producing actions in continuous spaces, its state and action performance could still be improved. A reference DDPG agent with the original RSF and a PID controller were compared side by side with the GA-DDPG agent using the GA-optimized RSF. With the help of the RSF found by the potential-based GA (PbGA) search, the best GA-DDPG individual maximizes overall rewards and minimizes state errors, and it maintains the highest fitness score among all individuals after being cross-validated and retested extensively in Monte Carlo experiments.
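The idea of searching for a potential-based RSF with a GA can be illustrated with a minimal Python sketch. This is not the paper's implementation. The potential form (a negative weighted sum of absolute state errors), the two-component state, the GA operators (elitism, averaging crossover, Gaussian mutation), and every function name below are assumptions made for illustration. In the paper's setting the fitness of an individual would presumably come from training and evaluating a DDPG agent with that RSF; here `toy_fitness` is a cheap hypothetical stand-in so that the example runs on its own.

```python
import random
from typing import Callable, List, Sequence


def potential(state: Sequence[float], weights: Sequence[float]) -> float:
    """Phi(s): negative weighted sum of absolute state errors (assumed form)."""
    return -sum(w * abs(x) for w, x in zip(weights, state))


def shaped_reward(base: float, state: Sequence[float], next_state: Sequence[float],
                  weights: Sequence[float], gamma: float = 0.99) -> float:
    """Base reward plus the potential-based term gamma * Phi(s') - Phi(s)."""
    return base + gamma * potential(next_state, weights) - potential(state, weights)


def evolve_weights(fitness: Callable[[List[float]], float], n_weights: int,
                   pop_size: int = 20, generations: int = 30,
                   sigma: float = 0.1, seed: int = 0) -> List[float]:
    """Simple GA over non-negative RSF weights (illustrative operators only)."""
    rng = random.Random(seed)
    population = [[rng.uniform(0.0, 1.0) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[: max(2, pop_size // 4)]
        # Elitism: the best individual is carried over unchanged.
        children = [list(elite[0])]
        while len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Averaging crossover followed by Gaussian mutation, clipped at zero.
            child = [max(0.0, (x + y) / 2.0 + rng.gauss(0.0, sigma))
                     for x, y in zip(a, b)]
            children.append(child)
        population = children
    return max(population, key=fitness)


def toy_fitness(weights: List[float]) -> float:
    """Hypothetical stand-in for 'train an agent with this RSF and score it'."""
    target = (0.8, 0.3)  # arbitrary illustrative optimum, not from the paper
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


if __name__ == "__main__":
    best = evolve_weights(toy_fitness, n_weights=2)
    # Hypothetical state: (altitude error, vertical-speed error).
    print(best, shaped_reward(0.0, (10.0, -3.0), (6.0, -2.0), best))
```

With a potential-based term, the shaping contributions along a trajectory telescope when `gamma` is 1, so the shaping changes how quickly the agent receives feedback without adding reward that depends on the path taken. Keeping the fitness function as a parameter means the toy stand-in could be swapped for a real training-and-evaluation loop without touching the GA.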
Engineering Applications of Artificial Intelligence

Suggestions

Planning unmanned aerial vehicle's path for maximum information collection using evolutionary algorithms
Ergezer, Halit; Leblebicioğlu, Mehmet Kemal (2011-01-01)
Path planning is a problem of designing the path the vehicle is supposed to follow in such a way that a certain objective is optimized. In our study the objective is to maximize collected amount of information from Desired Regions (DR), meanwhile flying over the Forbidden Regions is avoided. In this paper, the path planning problem for single unmanned air vehicle (UAV) is studied with the proposal of novel evolutionary operators; Pull-to-Desired-Region (PTDR), Push-From-Forbidden-Region (PFFR), Pull-to-Fini...
An Approach for System Identification in Unmanned Surface Vehicles
Erunsal, Izzet Kagan; Ahiska, Kenan; Kumru, Murat; Leblebicioğlu, Mehmet Kemal (2017-10-21)
In this study, a system identification methodology is introduced to determine the model parameters of unmanned surface vehicles. The proposed identification scheme is based on sequencing the experiments according to their capabilities to identify the model parameters. In each experiment, the parameters to be found are updated and the results are validated before ascertaining the final value. A procedure to complete the identification work in an experiment, namely the required post-processing, the optimizati...
Characterization of Driver Neuromuscular Dynamics for Human-Automation Collaboration Design of Automated Vehicles
Lv, Chen; Wang, Huaji; Cao, Dongpu; Zhao, Yifan; Auger, Daniel J.; Sullman, Mark; Matthias, Rebecca; Skrypchuk, Lee; Mouzakitis, Alexandros (Institute of Electrical and Electronics Engineers (IEEE), 2018-12-01)
In order to design an advanced human-automation collaboration system for highly automated vehicles, research into the driver's neuromuscular dynamics is needed. In this paper, a dynamic model of drivers' neuromuscular interaction with a steering wheel is first established. The transfer function and the natural frequency of the systems are analyzed. In order to identify the key parameters of the driver-steering-wheel interacting system and investigate the system properties under different situations, experim...
Design, construction and control of an electrohydraulic load simulator for testing hydraulic drives
Akova, Hayrettin Ulaş; Platin, Bülent Emre; Balkan, Raif Tuna; Department of Mechanical Engineering (2014)
In this thesis, an electro-hydraulic load simulator is designed, constructed, and controlled in order to carry out the stability and performance tests of newly developed hydraulic drive and control systems. It is an experimental loading system capable of applying the desired test loads onto the actuator of the hydraulic drive system under test in a laboratory environment. The primary aim of this study is to support the research activities related to the development of hydraulic drive and control systems so ...
Optimization of types, numbers and locations of sensors and actuators used in modal analysis of aircraft structures using genetic algorithm
Pedramasl, Nima; Şahin, Melin; Acar, Erdem; Department of Aerospace Engineering (2017)
Aircraft structures are exposed to dynamic loads under service conditions and therefore, it is necessary to determine their dynamic characteristics. Dynamic characteristics of a structure can be determined using simulation-based methods such as finite element analysis (FEA) or test-based methods such as experimental modal analysis (EMA). In order to perform an EMA with reliable and high quality results, test equipment must be lightweight and have high accuracy. In addition, the sensors and actuators must be...
Citation Formats
L. Nugroho, R. Andiarti, R. Akmeliawati, A. T. Kutay, D. K. Larasati, and S. K. Wijaya, “Optimization of reward shaping function based on genetic algorithm applied to a cross validated deep deterministic policy gradient in a powered landing guidance problem,” Engineering Applications of Artificial Intelligence, vol. 120, 2023. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146696067&origin=inward.