Bipedal Robot Walking by Reinforcement Learning in Partially Observed Environment
Download: ugurcanozalp_mscthesis_1509.pdf
Date: 2021-8-27
Author: Özalp, Uğurcan
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Item Usage Stats: 843 views, 1467 downloads
Deep Reinforcement Learning methods for mechanical control have been applied successfully in many environments and have replaced traditional optimal and adaptive control methods for some complex problems. However, Deep Reinforcement Learning algorithms still face challenges. One is control in partially observable environments: when an agent is not fully informed about the environment, it must recover the missing information from past observations. In this thesis, walking in the Bipedal Walker Hardcore environment (OpenAI Gym), which is partially observable, is studied with two continuous actor-critic reinforcement learning algorithms, Twin Delayed Deep Deterministic Policy Gradient and Soft Actor-Critic. Several neural architectures are implemented: the first is a Residual Feed Forward Neural Network under a fully observable environment assumption, while the second and third, a Long Short Term Memory network and a Transformer, take the observation history as input to recover the information hidden by partial observability.
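The recurrent approach described in the abstract can be pictured with a short sketch. The snippet below is not the thesis implementation; it is a minimal illustration, assuming the classic OpenAI Gym reset/step API in use around 2021 and arbitrary layer sizes, of an LSTM actor whose hidden state carries information recovered from past observations while acting in BipedalWalkerHardcore-v3. In TD3 or SAC such an actor would be trained together with one or two critics; the training loop is omitted here.

```python
# Illustrative sketch only (not the thesis code): an LSTM actor that consumes
# the observation history of the partially observable BipedalWalkerHardcore-v3.
# Layer sizes and the classic gym API are assumptions.
import gym
import torch
import torch.nn as nn

class LSTMActor(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden_dim=128):
        super().__init__()
        # The LSTM state summarises past observations, recovering information
        # the current observation alone does not provide.
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, act_dim), nn.Tanh(),  # actions in [-1, 1]
        )

    def forward(self, obs_seq, state=None):
        # obs_seq: (batch, time, obs_dim)
        out, state = self.lstm(obs_seq, state)
        return self.head(out[:, -1]), state

env = gym.make("BipedalWalkerHardcore-v3")
actor = LSTMActor(env.observation_space.shape[0], env.action_space.shape[0])

obs, state, done = env.reset(), None, False
while not done:
    obs_t = torch.as_tensor(obs, dtype=torch.float32).view(1, 1, -1)
    with torch.no_grad():
        action, state = actor(obs_t, state)  # carry the recurrent state across steps
    obs, reward, done, info = env.step(action.squeeze(0).numpy())
env.close()
```

A Transformer variant would instead re-encode a window of recent observations at every step rather than carrying a recurrent state; both serve the same purpose of compensating for partial observability.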
Subject Keywords: deep reinforcement learning, partial observability, robot control, actor-critic methods, long short term memory, transformer
URI: https://hdl.handle.net/11511/92170
Collections: Graduate School of Applied Mathematics, Thesis
Suggestions
Recursive Compositional Reinforcement Learning for Continuous Control (Sürekli Kontrol Uygulamaları için Özyinelemeli Bileşimsel Pekiştirmeli Öğrenme)
Tanik, Guven Orkun; Ertekin Bolelli, Şeyda (2022-01-01)
Compositional and temporal abstraction is the key to improving learning and planning in reinforcement learning. Modern real-world control problems call for continuous control domains and robust, sample-efficient and explainable control frameworks. We present a framework for recursively composing control skills to solve compositional and progressively complex tasks. The framework promotes reuse of skills and, as a result, adapts quickly to new tasks. The decision-tree can be observed, providing insi...
Multi-time-scale input approaches for hourly-scale rainfall-runoff modeling based on recurrent neural networks
Ishida, Kei; Kiyama, Masato; Ercan, Ali; Amagasaki, Motoki; Tu, Tongbi (2021-11-01)
This study proposes two effective approaches to reduce the required computational time of the training process for time-series modeling through a recurrent neural network (RNN) using multi-time-scale time-series data as input. One approach provides coarse and fine temporal resolutions of the input time-series data to RNN in parallel. The other concatenates the coarse and fine temporal resolutions of the input time-series data over time before considering them as the input to RNN. In both approaches, first, ...
Reward Shaping for Efficient Exploration and Acceleration of Learning in Reinforcement Learning
Bal, Melis İlayda; İyigün, Cem; Polat, Faruk; Department of Operational Research (2022-7-21)
In a Reinforcement Learning task, a learning agent needs to extract useful information about its uncertain environment in an efficient way during the interaction process to successfully complete the task. Through strategic exploration, the agent acquires sufficient information to adjust its behavior to act intelligently as it interacts with the environment. Therefore, efficient exploration plays a key role in the learning efficiency of Reinforcement Learning tasks. Due to the delayed-feedback nature of Rein...
Improving reinforcement learning using distinctive clues of the environment
Demir, Alper; Polat, Faruk; Department of Computer Engineering (2019)
Effective decomposition and abstraction have been shown to improve the performance of Reinforcement Learning. An agent can use clues from the environment either to partition the problem into sub-problems or to get informed about its progress in a given task. In a fully observable environment such clues may come from subgoals, while in a partially observable environment they may be provided by unique experiences. The contribution of this thesis is twofold; first, improvements over automatic subgoal identifica...
Mobile Robot Heading Adjustment Using Radial Basis Function Neural Networks Controller and Reinforcement Learning
BAYAR, GÖKHAN; Konukseven, Erhan İlhan; Koku, Ahmet Buğra (2008-10-28)
This paper proposes a radial basis function neural networks approach to the solution of mobile robot heading adjustment using reinforcement learning. In order to control the heading of the mobile robot, a neural network control system has been constructed and implemented. The neural controller has been charged to enhance the control system by adding some degrees of strength. It has been achieved that the neural network system can learn the relationship between the desired directional heading and the error posi...
Citation Formats
IEEE
U. Özalp, “Bipedal Robot Walking by Reinforcement Learning in Partially Observed Environment,” M.S. - Master of Science, Middle East Technical University, 2021.