Using Generative Adversarial Nets on Atari Games for Feature Extraction in Deep Reinforcement Learning

2020-10-05
Deep Reinforcement Learning (DRL) has been successfully applied in several research domains, such as robot navigation and automated video game playing. However, these methods require excessive computation and interaction with the environment, so improvements in sample efficiency are needed. The main reason is that sparse and delayed rewards do not provide effective supervision for the representation learning of deep neural networks. In this study, the Proximal Policy Optimization (PPO) algorithm is augmented with Generative Adversarial Networks (GANs) to increase sample efficiency by forcing the network to learn efficient representations without depending on sparse and delayed rewards as supervision. The results show that improved performance can be obtained by jointly training a DRL agent with a GAN discriminator.
2020 28th Signal Processing and Communications Applications Conference (SIU)
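
The joint training described in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: the abstract does not state the network architecture, how the two losses are combined, or any weighting between them. The sketch assumes the standard PPO clipped surrogate loss and a standard binary cross-entropy discriminator loss, summed with a hypothetical coefficient `gan_weight`. All function and parameter names are illustrative.

```python
import numpy as np


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss, returned as a value to minimize."""
    ratio = np.exp(new_logp - old_logp)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))


def discriminator_loss(real_logits, fake_logits):
    """Binary cross-entropy for a GAN discriminator, computed from raw logits.

    logaddexp(0, -x) equals -log(sigmoid(x)) and logaddexp(0, x) equals
    -log(1 - sigmoid(x)), which keeps the computation numerically stable.
    """
    real_term = np.mean(np.logaddexp(0.0, -real_logits))  # real frames -> label 1
    fake_term = np.mean(np.logaddexp(0.0, fake_logits))   # generated frames -> label 0
    return real_term + fake_term


def joint_loss(new_logp, old_logp, advantages,
               real_logits, fake_logits, gan_weight=1.0):
    """Hypothetical combined objective: PPO loss plus a weighted GAN term.

    gan_weight is an assumption made for this sketch, not a value from the paper.
    """
    rl_term = ppo_clip_loss(new_logp, old_logp, advantages)
    gan_term = discriminator_loss(real_logits, fake_logits)
    return rl_term + gan_weight * gan_term
```

In a full agent, the policy and the discriminator would presumably share feature-extraction layers, so that gradients from the discriminator term shape the representation even when rewards are sparse. That sharing is an assumption of this sketch and is not shown in the code.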

Citation Formats
A. Aydın and E. Sürer, “Using Generative Adversarial Nets on Atari Games for Feature Extraction in Deep Reinforcement Learning,” presented at the 2020 28th Signal Processing and Communications Applications Conference (SIU), İstanbul, Türkiye, 2020, Accessed: 00, 2021. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9302454.