Bipedal Robot Walking by Reinforcement Learning in Partially Observed Environment
Download
ugurcanozalp_mscthesis_1509.pdf
Date
2021-08-27
Author
Özalp, Uğurcan
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Item Usage Stats
303 views, 659 downloads
Abstract
Deep Reinforcement Learning methods for mechanical control have been applied successfully in many environments and have replaced traditional optimal and adaptive control methods for some complex problems. However, Deep Reinforcement Learning algorithms still face several challenges. One is control in partially observable environments: when an agent is not fully informed about the environment, it must recover the hidden information from past observations. In this thesis, walking in the Bipedal Walker Hardcore environment (OpenAI Gym), which is partially observable, is studied with two continuous-action actor-critic reinforcement learning algorithms: Twin Delayed Deep Deterministic Policy Gradient and Soft Actor-Critic. Several neural architectures are implemented. The first is a Residual Feed-Forward Neural Network under a full-observability assumption, while the second and third, a Long Short-Term Memory network and a Transformer, take the observation history as input to recover the information hidden by partial observability.
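One of the two algorithms studied, Twin Delayed Deep Deterministic Policy Gradient (TD3), is characterized by its clipped double-Q target: the Bellman target uses the minimum of two target critics evaluated at a noise-smoothed target action. The sketch below illustrates only that target computation, with toy scalar functions standing in for the neural-network critics and policy; it is not the thesis's implementation, and all function names are illustrative.

```python
import random

def td3_target(reward, done, next_state, q1, q2, policy,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0):
    """Clipped double-Q target of TD3 (toy sketch, scalar action).

    q1, q2 : target critic functions Q(s, a) -> float (stand-ins for networks)
    policy : target policy mu(s) -> action (stand-in for a network)
    """
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = max(-noise_clip, min(noise_clip, random.gauss(0.0, noise_std)))
    next_action = max(-act_limit, min(act_limit, policy(next_state) + noise))
    # Clipped double Q-learning: take the minimum of the two target critics
    # to reduce overestimation bias.
    q_min = min(q1(next_state, next_action), q2(next_state, next_action))
    return reward + gamma * (1.0 - done) * q_min
```

Taking the minimum of two independently trained critics counteracts the overestimation bias that a single maximizing critic accumulates, which is one of TD3's main improvements over DDPG.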
Subject Keywords
deep reinforcement learning, partial observability, robot control, actor-critic methods, long short term memory, transformer
URI
https://hdl.handle.net/11511/92170
Collections
Graduate School of Applied Mathematics, Thesis
Suggestions
Improving reinforcement learning using distinctive clues of the environment
Demir, Alper; Polat, Faruk; Department of Computer Engineering (2019)
Effective decomposition and abstraction have been shown to improve the performance of Reinforcement Learning. An agent can use clues from the environment either to partition the problem into sub-problems or to get informed about its progress in a given task. In a fully observable environment such clues may come from subgoals, while in a partially observable environment they may be provided by unique experiences. The contribution of this thesis is twofold; first, improvements over automatic subgoal identifica...
Mobile Robot Heading Adjustment Using Radial Basis Function Neural Networks Controller and Reinforcement Learning
BAYAR, GÖKHAN; Konukseven, Erhan İlhan; Koku, Ahmet Buğra (2008-10-28)
This paper proposes a radial basis function neural network approach to the solution of mobile robot heading adjustment using reinforcement learning. In order to control the heading of the mobile robot, a neural network control system has been constructed and implemented. The neural controller has been charged with enhancing the control system by adding some degrees of strength. It has been shown that the neural network system can learn the relationship between the desired directional heading and the error posi...
Effective subgoal discovery and option generation in reinforcement learning
Demir, Alper; Polat, Faruk; Department of Computer Engineering (2016)
Subgoal discovery is proven to be a practical way to cope with large state spaces in Reinforcement Learning. Subgoals are natural hints to partition the problem into sub-problems, allowing the agent to solve each sub-problem separately. Identification of such subgoal states in the early phases of the learning process increases the learning speed of the agent. In a problem modeled as a Markov Decision Process, subgoal states possess key features that distinguish them from the ordinary ones. A learning agent ...
Visual Object Tracking with Autoencoder Representations
Besbinar, Beril; Alatan, Abdullah Aydın (2016-05-19)
Deep learning is the discipline of training computational models that are composed of multiple layers, and these methods have recently improved the state of the art in many areas by virtue of large labeled datasets, increases in the computational power of current hardware, and unsupervised training methods. Although such a dataset may not be available for many application areas, the representations obtained by well-designed networks that have a large representation capacity and are trained with enough dat...
Using Generative Adversarial Nets on Atari Games for Feature Extraction in Deep Reinforcement Learning
Aydın, Ayberk; Sürer, Elif (2020-04-01)
Deep Reinforcement Learning (DRL) has been successfully applied in several research domains such as robot navigation and automated video game playing. However, these methods require excessive computation and interaction with the environment, so enhancements on sample efficiency are required. The main reason for this requirement is that sparse and delayed rewards do not provide an effective supervision for representation learning of deep neural networks. In this study, Proximal Policy...
Citation Formats
U. Özalp, “Bipedal Robot Walking by Reinforcement Learning in Partially Observed Environment,” M.S. - Master of Science, Middle East Technical University, 2021.