Bipedal Robot Walking by Reinforcement Learning in Partially Observed Environment

Özalp, Uğurcan
Deep Reinforcement Learning methods for mechanical control have been applied successfully in many environments and have replaced traditional optimal and adaptive control methods for some complex problems. However, Deep Reinforcement Learning algorithms still face challenges. One is control in partially observable environments: when an agent is not fully informed about the environment, it must recover the missing information from past observations. In this thesis, walking in the Bipedal Walker Hardcore (OpenAI Gym) environment, which is partially observable, is studied with two continuous actor-critic reinforcement learning algorithms: Twin Delayed Deep Deterministic Policy Gradient and Soft Actor-Critic. Several neural architectures are implemented. The first is a Residual Feed Forward Neural Network under a fully observable environment assumption, while the second and third are a Long Short-Term Memory network and a Transformer, which take the observation history as input to recover the information hidden by partial observability.
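The recurrent and Transformer variants described above both consume a window of past observations rather than a single frame. A minimal stdlib-only sketch of that observation-history idea is shown below; the class name and zero-padding convention are illustrative assumptions, not taken from the thesis itself.

```python
from collections import deque


class ObservationHistory:
    """Fixed-length buffer of past observations, zero-padded until full.

    Illustrative sketch: an LSTM or Transformer policy would consume
    such a history to recover state hidden by partial observability.
    """

    def __init__(self, history_len, obs_dim):
        self.obs_dim = obs_dim
        # Pre-fill with zero observations so the history always has
        # a fixed shape, even at the start of an episode.
        self.buffer = deque(
            [[0.0] * obs_dim for _ in range(history_len)],
            maxlen=history_len,
        )

    def push(self, obs):
        """Append the newest observation; the oldest is dropped."""
        self.buffer.append(list(obs))

    def stacked(self):
        """Return the history as one flat list, oldest first."""
        return [x for obs in self.buffer for x in obs]
```

For example, with `history_len=3` and 2-dimensional observations, pushing a single observation `[1.0, 2.0]` yields `[0.0, 0.0, 0.0, 0.0, 1.0, 2.0]` from `stacked()`: four padding zeros followed by the newest frame.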


Improving reinforcement learning using distinctive clues of the environment
Demir, Alper; Polat, Faruk; Department of Computer Engineering (2019)
Effective decomposition and abstraction have been shown to improve the performance of Reinforcement Learning. An agent can use clues from the environment either to partition the problem into sub-problems or to track its progress in a given task. In a fully observable environment such clues may come from subgoals, while in a partially observable environment they may be provided by unique experiences. The contribution of this thesis is twofold: first, improvements over automatic subgoal identifica...
Mobile Robot Heading Adjustment Using Radial Basis Function Neural Networks Controller and Reinforcement Learning
BAYAR, GÖKHAN; Konukseven, Erhan İlhan; Koku, Ahmet Buğra (2008-10-28)
This paper proposes a radial basis function neural network approach to the solution of mobile robot heading adjustment using reinforcement learning. In order to control the heading of the mobile robot, a neural network control system has been constructed and implemented. The neural controller has been charged with enhancing the control system by adding some degree of strength. It has been shown that the neural network system can learn the relationship between the desired directional heading and the error posi...
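The radial basis function controller described in this abstract can be sketched as a weighted sum of Gaussian basis functions evaluated at the heading error. The centers, widths, and weights below are illustrative placeholders, not values from the paper; in a reinforcement learning setting the weights would be the quantities updated online.

```python
import math


def rbf_output(error, centers, widths, weights):
    """Weighted sum of Gaussian radial basis functions at `error`.

    Linear in the weights, which suits incremental (e.g. reward-driven)
    weight updates. `centers`, `widths`, and `weights` are parallel lists.
    """
    return sum(
        w * math.exp(-((error - c) ** 2) / (2.0 * s ** 2))
        for c, s, w in zip(centers, widths, weights)
    )
```

With a single basis function centered at zero, an error of exactly zero returns the full weight (the Gaussian evaluates to 1), and the output decays smoothly as the heading error grows.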
Effective subgoal discovery and option generation in reinforcement learning
Demir, Alper; Polat, Faruk; Department of Computer Engineering (2016)
Subgoal discovery has proven to be a practical way to cope with large state spaces in Reinforcement Learning. Subgoals are natural hints for partitioning the problem into sub-problems, allowing the agent to solve each sub-problem separately. Identification of such subgoal states in the early phases of the learning process increases the learning speed of the agent. In a problem modeled as a Markov Decision Process, subgoal states possess key features that distinguish them from ordinary ones. A learning agent ...
Visual Object Tracking with Autoencoder Representations
Besbinar, Beril; Alatan, Abdullah Aydın (2016-05-19)
Deep learning is the discipline of training computational models composed of multiple layers, and these methods have recently improved the state of the art in many areas by virtue of large labeled datasets, increases in the computational power of current hardware, and unsupervised training methods. Although such a dataset may not be available for many application areas, the representations obtained by well-designed networks that have a large representation capacity and are trained with enough dat...
Using Generative Adversarial Nets on Atari Games for Feature Extraction in Deep Reinforcement Learning
Aydın, Ayberk; Sürer, Elif (2020-04-01)
Deep Reinforcement Learning (DRL) has been successfully applied in several research domains such as robot navigation and automated video game playing. However, these methods require excessive computation and interaction with the environment, so enhancements in sample efficiency are required. The main reason for this requirement is that sparse and delayed rewards do not provide effective supervision for representation learning of deep neural networks. In this study, Proximal Policy...
Citation Formats
U. Özalp, “Bipedal Robot Walking by Reinforcement Learning in Partially Observed Environment,” M.S. - Master of Science, Middle East Technical University, 2021.