Reinforcement learning control for helicopter landing in autorotation

2018-01-01
Kopsa, Kadircan
Kutay, Ali Türker
This study presents an application of an actor-critic reinforcement learning method to the nonlinear problem of helicopter guidance during autorotation in order to achieve a safe landing following engine power loss. A point-mass model of an OH-58A helicopter in autorotation was built to simulate autorotation dynamics. The point-mass model includes the equations of motion in the vertical plane. The states of the point-mass model are the horizontal and vertical velocities, the horizontal and vertical positions, the rotor angular speed, and the horizontal and vertical components of the rotor thrust coefficient. The inputs to the model were chosen to be the rates of change of the horizontal and vertical components of the rotor thrust coefficient. A reinforcement learning agent was trained by a model-free asynchronous actor-critic algorithm, where training episodes were parallelized on a multi-core CPU. The objective of the training was defined as achieving near-zero horizontal and vertical kinetic energies at the instant of touchdown. Training episodes were defined as the autorotative flight from an initial equilibrium flight condition to touchdown. During each training episode, the agent was presented with a reward at each discrete time step according to a multi-conditional reward function. Constraints on the rotor angular speed, the rotor disk orientation, and the rotor thrust coefficient were implemented by structuring the reward function accordingly. At touchdown, the reward function was programmed to output a weighted sum of the squared vertical and horizontal velocities. The majority of the reinforcement signal came from this touchdown reward, as it is a measure of the agent's success in accomplishing the safe autorotation landing task. The agent consists of two separate neural network function approximators, namely the actor and the critic. The critic approximates the value of a set of states.
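As a rough illustration of the reward structure described above, the following sketch shows one possible form of a touchdown reward and a per-step constraint penalty. It is not the authors' implementation: the function names, the weights, the penalty value, and the negative sign (chosen here so that a softer landing yields a larger reward) are all assumptions made for this example.

```python
def touchdown_reward(u, w, weight_u=1.0, weight_w=1.0):
    """Terminal reward from the touchdown velocities.

    u, w: horizontal and vertical velocity at touchdown.
    weight_u, weight_w: hypothetical weights, not taken from the paper.
    The result is negated (an assumption) so that lower touchdown
    velocities give a higher reward.
    """
    return -(weight_u * u ** 2 + weight_w * w ** 2)


def rotor_speed_penalty(rotor_speed, speed_min, speed_max, penalty=1.0):
    """Per-step reward term for a rotor angular speed constraint.

    Returns 0.0 inside the allowed band and -penalty outside it.
    The band limits and the penalty size are hypothetical.
    """
    inside = speed_min <= rotor_speed <= speed_max
    return 0.0 if inside else -penalty
```

A perfect landing with zero velocities gives `touchdown_reward(0.0, 0.0) == 0.0`, the largest value this sketch can return.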
The actor generates a set of actions given a set of states; the actions are sampled from a Gaussian distribution whose mean values are the outputs of the actor network. Updates to the parameters of both networks were calculated by an n-step returns scheme, which accumulates gradients from individual time steps into large, once-per-episode updates to improve training stability. The RMSProp algorithm was used for optimization. After training was complete, the agent was tested against different initial conditions both inside and outside the height-velocity (H-V) avoidance region of the standard OH-58A helicopter at maximum gross weight. The results achieved by the agent indicate that the method is well suited for guiding the helicopter safely to the ground in a closed-loop manner over a large space of initial conditions. Controls generated by the reinforcement learning agent were found to be similar to a helicopter pilot's technique during autorotative flight. The study demonstrates that a helicopter's H-V restriction zone can be significantly reduced using the presented reinforcement learning method for autonomous landing of a helicopter in autorotation.
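The Gaussian action sampling and the n-step returns mentioned in the abstract can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the paper's code: the function names, the fixed standard deviation `sigma`, the discount factor `gamma`, and the use of a critic estimate as `bootstrap_value` are illustrative choices that the abstract does not specify.

```python
import random


def sample_action(means, sigma, rng):
    """Sample one action per control input from a Gaussian policy.

    means: mean values, standing in for the actor network outputs.
    sigma: standard deviation shared by all inputs (an assumption).
    rng: a random.Random instance, passed in for reproducibility.
    """
    return [rng.gauss(m, sigma) for m in means]


def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Compute discounted returns for every step of a rollout.

    rewards: rewards collected over the rollout, in time order.
    bootstrap_value: value estimate for the state after the last
        reward (0.0 if the episode ended at touchdown).
    gamma: discount factor (hypothetical default).
    """
    returns = []
    running = bootstrap_value
    # Work backwards so each return reuses the one that follows it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns
```

In an actor-critic update, the difference between each return and the critic's value estimate for the same state would typically serve as the advantage that scales the actor's gradient.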
44th European Rotorcraft Forum 2018 (18-21 September)

Suggestions

Reinforcement learning control for autorotation of a simple point-mass helicopter model
Kopşa, Kadircan; Kutay, Ali Türker; Department of Aerospace Engineering (2018)
This study presents an application of an actor-critic reinforcement learning method to a simple point-mass model helicopter guidance problem during autorotation. A point-mass model of an OH-58A helicopter in autorotation was built. A reinforcement learning agent was trained by a model-free asynchronous actor-critic algorithm, where training episodes were parallelized on a multi-core CPU. The objective of the training was defined as achieving near-zero horizontal and vertical kinetic energies at the instant of t...
COMPARATIVE STRUCTURAL OPTIMIZATION STUDY OF COMPOSITE AND ALUMINUM HORIZONTAL TAIL PLANE OF A HELICOPTER
Arpacıoğlu, Bertan; Kayran, Altan (2019-11-11)
This work presents structural optimization studies of aluminum and composite material horizontal tail plane of a helicopter by using MSC.NASTRAN SOL200 optimization capabilities. Structural design process starts from conceptual design phase, and structural layout design is performed by using CATIA. In the preliminary design phase, study focuses on the minimum weight optimization with multiple design variables and similar constraints for both materials. Aerodynamic load calculation is performed using ANSYS ...
Varying mass missile dynamics, guidance & control
Günbatar, Yakup; Leblebicioğlu, Mehmet Kemal; Department of Electrical and Electronics Engineering (2007)
The focus of this study is to be able to control the air-to-surface missile throughout the entire flight, with emphasis on the propulsion phase to increase the impact range of the missile. A major difficulty in controlling the missile during the propulsion phase is the important change in mass of the missile. This results in sliding the center of gravity (cg) point and changing inertias. Moreover, aerodynamic coefficients and stability derivatives are not assumed to be constant at predetermined ranges; conv...
Nonlinear Dynamic Inversion Autopilot Design for an Air Defense System with Aerodynamic and Thrust Vector Control
Bıyıklı, Rabiya; Yavrucuk, İlkay; Tekin, Raziye; Department of Aerospace Engineering (2022-2)
The study proposes complete attitude and acceleration autopilots in all three channels of a highly agile air defense missile by utilizing a subcategory of nonlinear feedback linearization methods Nonlinear Dynamic Inversion (NDI). The autopilot design includes cross-coupling effects enabling bank-to-turn (BTT) maneuvers and a rarely touched topic of control in the boost phase with hybrid control which consists of both aerodynamic fin control and thrust vector control. This piece of work suggests solut...
Numerical investigations of lateral jets for missile aerodynamics
Ağsarlıoğlu, Ekin; Albayrak, Kahraman; Department of Mechanical Engineering (2011)
In this thesis, effects of sonic lateral jets on aerodynamics of missiles and missile-like geometries are investigated numerically by commercial Computational Fluid Dynamics (CFD) software FLUENT. The study consists of two parts. In the first part, two generic missile-like geometries with lateral jets, of which experimental data are available in literature, are analyzed by the software for validation studies. As the result of this study, experimental data and CFD results are in good agreement with each other...
Citation Formats
K. Kopsa and A. T. Kutay, “Reinforcement learning control for helicopter landing in autorotation,” Delft, Netherlands, 2018, vol. 2, p. 917, Accessed: 00, 2021. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85069444891&origin=inward.