Deep Reinforcement Learning in Keras
Deep-RL-Keras is a project that provides a modular implementation of several popular deep reinforcement learning (DRL) algorithms using the Keras library. The project focuses on two main categories of algorithms: Actor-Critic Algorithms and Deep Q-Learning Algorithms. It aims to be a helpful resource for researchers and practitioners interested in deep reinforcement learning.
Overview of Algorithms
Actor-Critic Algorithms
- N-step Advantage Actor Critic (A2C):
  - A2C is a model-free, on-policy algorithm with two components: the actor, which outputs a policy over actions, and the critic, which estimates state values to evaluate those actions (a minimal sketch of the update follows this list).
  - It uses N-step returns to improve stability and adds entropy regularization to encourage exploration during training.
  - While A2C is efficient, it can become computationally expensive for complex environments such as Atari games.
- N-step Asynchronous Advantage Actor Critic (A3C):
  - A3C builds on A2C by introducing asynchronous weight updates: multiple worker agents interact with their own copies of the environment in parallel, which speeds up training.
  - A3C is noted for its performance in environments such as Atari Breakout.
- Deep Deterministic Policy Gradient (DDPG):
  - DDPG is designed for continuous action spaces and uses a deterministic policy with target networks for better learning stability.
  - It includes experience replay and parameter noise for exploration, and is tested on environments such as LunarLander.
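To make the actor-critic ideas above concrete, here is a minimal sketch of an N-step advantage actor-critic update written with tf.keras. The network sizes, hyperparameters, and function names are illustrative assumptions and do not mirror the project's actual code.

# Minimal N-step advantage actor-critic sketch (illustrative only; the actual
# Deep-RL-Keras implementation differs). Assumes a small discrete action space.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

n_states, n_actions = 4, 2          # CartPole-like dimensions (assumption)
gamma, entropy_beta = 0.99, 0.01

# Shared trunk with separate actor (policy) and critic (value) heads.
inputs = layers.Input(shape=(n_states,))
hidden = layers.Dense(64, activation="relu")(inputs)
policy = layers.Dense(n_actions, activation="softmax")(hidden)   # actor head
value = layers.Dense(1)(hidden)                                  # critic head
model = tf.keras.Model(inputs, [policy, value])
optimizer = tf.keras.optimizers.Adam(1e-3)

def n_step_returns(rewards, bootstrap_value):
    # Discounted N-step returns, bootstrapped from the critic's estimate of
    # the state reached after the last stored reward.
    returns, running = [], bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    return np.array(returns[::-1], dtype=np.float32)

def train_step(states, actions, returns):
    states = tf.convert_to_tensor(states, dtype=tf.float32)
    actions = tf.convert_to_tensor(actions, dtype=tf.int32)
    returns = tf.convert_to_tensor(returns, dtype=tf.float32)
    with tf.GradientTape() as tape:
        probs, values = model(states)
        values = tf.squeeze(values, axis=-1)
        advantages = returns - values                    # A = R_n - V(s)
        chosen = tf.reduce_sum(probs * tf.one_hot(actions, n_actions), axis=-1)
        # Actor: policy-gradient loss; advantages are constants for the actor.
        actor_loss = -tf.reduce_mean(
            tf.math.log(chosen + 1e-8) * tf.stop_gradient(advantages))
        # Critic: regress V(s) toward the N-step return.
        critic_loss = tf.reduce_mean(tf.square(advantages))
        # Entropy bonus discourages premature collapse to a deterministic policy.
        entropy = -tf.reduce_mean(
            tf.reduce_sum(probs * tf.math.log(probs + 1e-8), axis=-1))
        loss = actor_loss + 0.5 * critic_loss - entropy_beta * entropy
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))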
Deep Q-Learning Algorithms
- Double Deep Q-Network (DDQN):
  - An enhancement of the standard DQN algorithm, DDQN uses two separate neural networks: an online network for action selection and a target network that provides more stable target values.
  - It uses Experience Replay to randomly sample past experiences, which reduces data correlation and improves learning stability.
- Double Deep Q-Network with Prioritized Experience Replay (DDQN + PER):
  - This variant improves upon DDQN by incorporating PER, which prioritizes replaying transitions with large temporal-difference (TD) error so the model focuses on the most informative samples.
- Dueling Double Deep Q-Network (Dueling DDQN):
  - Dueling DDQN enhances the Q-learning architecture by separating the estimation of state value and advantage, leading to improved learning efficiency and performance (a combined sketch of these Q-learning variants follows this list).
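A compact sketch of the three ideas above using tf.keras: a dueling network head, the Double DQN target computation, and proportional prioritized sampling. Names, layer sizes, and hyperparameters are assumptions for illustration, not the project's actual code.

# Illustrative DQN-family sketch (not the project's code): dueling head,
# Double DQN targets, and proportional prioritized sampling.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

n_states, n_actions, gamma = 4, 2, 0.99

def build_dueling_q_network():
    # Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)
    inputs = layers.Input(shape=(n_states,))
    hidden = layers.Dense(64, activation="relu")(inputs)
    state_value = layers.Dense(1)(hidden)            # V(s)
    advantages = layers.Dense(n_actions)(hidden)     # A(s, a)
    q_values = layers.Lambda(
        lambda va: va[0] + va[1] - tf.reduce_mean(va[1], axis=1, keepdims=True)
    )([state_value, advantages])
    return tf.keras.Model(inputs, q_values)

online_net = build_dueling_q_network()   # trained every step, selects actions
target_net = build_dueling_q_network()   # periodically synced copy for stable targets
target_net.set_weights(online_net.get_weights())

def double_dqn_targets(rewards, next_states, dones):
    # Double DQN: the online network picks the best next action, the target
    # network evaluates it, reducing the overestimation of vanilla DQN.
    next_q_online = online_net.predict(next_states, verbose=0)
    next_q_target = target_net.predict(next_states, verbose=0)
    best_actions = np.argmax(next_q_online, axis=1)
    next_values = next_q_target[np.arange(len(rewards)), best_actions]
    return rewards + gamma * (1.0 - dones) * next_values

def per_sample(td_errors, batch_size, alpha=0.6):
    # Prioritized Experience Replay (proportional variant): transitions with
    # larger TD error are sampled more often.
    priorities = (np.abs(td_errors) + 1e-6) ** alpha
    probs = priorities / priorities.sum()
    return np.random.choice(len(td_errors), size=batch_size, p=probs)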
Getting Started
To get started with Deep-RL-Keras, install Keras 2.1.6 and OpenAI Gym. The implementations support a range of environments, including CartPole, Atari games, and LunarLander.
$ pip install gym keras==2.1.6
Running the Algorithms
Each algorithm can be run with a command specifying the type, environment, and additional parameters:
$ python3 main.py --type A2C --env CartPole-v1
$ python3 main.py --type A3C --env BreakoutNoFrameskip-v4 --is_atari --nb_episodes 10000 --n_threads 16
$ python3 main.py --type DDPG --env LunarLanderContinuous-v2
Visualization & Monitoring
The project includes tools for visualizing and monitoring models using Tensorboard and Plotly, so users can follow training progress in real time and analyze results afterwards.
- Model Visualization: Saved models can be loaded and visualized in their training environments.
- Tensorboard: Provides a graphical interface for monitoring training scores (an example command follows this list).
- Results Plotting: Training results can be plotted with Plotly for detailed examination.
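For example, once a run has written Tensorboard logs (the exact output directory depends on how the run is configured), they can be viewed with:
$ tensorboard --logdir=<path_to_tensorboard_logs>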
Acknowledgments
The project stands on several open-source works, including templates and environment wrappers from other contributors, showcasing a collaborative effort in the DRL community.
References
The project draws on various seminal papers, providing a strong theoretical backbone for the implemented algorithms. These papers are linked within the documentation for those interested in the underlying research.
Deep-RL-Keras makes advanced DRL techniques accessible to a broader audience by simplifying complex algorithms into usable code with detailed guidance and support.