Optimization of reward shaping function based on genetic algorithm applied to a cross validated deep deterministic policy gradient in a powered landing guidance problem
Date
2023-04-01
Author
Nugroho, Larasmoyo
Andiarti, Rika
Akmeliawati, Rini
Kutay, Ali Türker
Larasati, Diva Kartika
Wijaya, Sastra Kusuma
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
A Deep Reinforcement Learning (DRL) agent can control a vehicle in an environment without prior knowledge, but its decision-making depends on a well-designed reward shaping function. The reward shaping function is an important but little-studied factor that can significantly alter the training reward score and performance outcomes. To maximize the control efficacy of a DRL algorithm, an optimized reward shaping function and a solid hyperparameter combination are essential. To achieve optimal control during the powered descent guidance (PDG) landing phase of a reusable launch vehicle, this paper uses the Deep Deterministic Policy Gradient (DDPG) algorithm to discover the best shape of the reward shaping function (RSF). Although DDPG is quite capable of managing complex environments and producing actions for continuous spaces, its state and action performance could still be improved. A reference DDPG agent with the original reward shaping function and a PID controller were compared side by side with a GA-DDPG agent that uses a genetic algorithm (GA)-optimized RSF. With the RSF found by the potential-based GA (PbGA) search, the best GA-DDPG individual maximizes overall rewards and minimizes state errors, and it maintains the highest fitness score among all individuals after being cross-validated and extensively retested in Monte Carlo experiments.
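The idea of searching for a reward shaping function with a genetic algorithm can be illustrated with a minimal sketch. This is not the authors' implementation. The potential function (a negative weighted sum of absolute state errors), the toy `fitness` function, the discount factor, and all names and parameter values below are hypothetical. In the paper, fitness comes from training and evaluating DDPG agents, which is replaced here by a cheap stand-in so the example runs on its own.

```python
import random

GAMMA = 0.99  # assumed discount factor, not taken from the paper


def potential(state, weights):
    """Hypothetical potential: negative weighted sum of absolute state errors."""
    return -sum(w * abs(e) for w, e in zip(weights, state))


def shaped_reward(base_reward, state, next_state, weights, gamma=GAMMA):
    """Potential-based shaping: r + gamma * Phi(s') - Phi(s)."""
    return (base_reward
            + gamma * potential(next_state, weights)
            - potential(state, weights))


def fitness(weights):
    """Toy stand-in for 'train an agent with these weights and score it'."""
    target = (1.0, 0.5, 2.0)  # arbitrary optimum, for illustration only
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


def ga_search(fitness_fn, n_genes=3, pop_size=20, generations=40,
              sigma=0.2, seed=0):
    """Small elitist GA over non-negative shaping weights."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0.0, 3.0) for _ in range(n_genes)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness_fn, reverse=True)
        elite = pop[: pop_size // 4]  # the best quarter survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clipped so weights stay non-negative.
            child = [max(0.0, g + rng.gauss(0.0, sigma)) for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness_fn)


if __name__ == "__main__":
    best = ga_search(fitness)
    print("best weights:", best)
    print("shaped reward for one step:",
          shaped_reward(0.0, (2.0, 1.0, 0.5), (1.0, 0.5, 0.2), best))
```

In this sketch each GA individual is a vector of shaping weights, and elitism guarantees that the best fitness never decreases between generations. Replacing `fitness` with a score obtained from training runs would be far more expensive, which is one reason a small population and few generations are used here.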
Subject Keywords
DDPG, Fitness, GA-search, Reusable launch vehicle, Reward shaping function
URI
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146696067&origin=inward
https://hdl.handle.net/11511/102311
Journal
Engineering Applications of Artificial Intelligence
DOI
https://doi.org/10.1016/j.engappai.2022.105798
Collections
Department of Aerospace Engineering, Article
Citation (IEEE)
L. Nugroho, R. Andiarti, R. Akmeliawati, A. T. Kutay, D. K. Larasati, and S. K. Wijaya, “Optimization of reward shaping function based on genetic algorithm applied to a cross validated deep deterministic policy gradient in a powered landing guidance problem,” Engineering Applications of Artificial Intelligence, vol. 120, 2023. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85146696067&origin=inward.