RL-Agents: An Introduction to Reinforcement Learning Agents
RL-Agents is a collection of reinforcement learning (RL) agent implementations covering a wide range of planning and value-based algorithms. The project is a useful resource for both newcomers and experienced practitioners in reinforcement learning.
Installation
Getting started with RL-Agents is straightforward. To install the project, simply run the following command in your terminal:
pip install --user git+https://github.com/eleurent/rl-agents
Usage
RL-Agents provides scripts for running experiments. Most experiments are launched from the scripts directory through experiments.py, where you select the environment and agent to train or test:
# Basic usage for training or testing an agent
Usage:
experiments evaluate <environment> <agent> (--train|--test)
[--episodes <count>]
[--seed <str>]
[--analyze]
experiments benchmark <benchmark> (--train|--test)
[--processes <count>]
[--episodes <count>]
[--seed <str>]
experiments -h | --help
This utility can train and test a single agent on an environment, or benchmark several agents over specified environments, with optional parallel execution across processes.
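For example, a training run could look like the command below; the configuration file paths are only illustrative (environment and agent configurations are typically stored under the scripts/configs directory of the repository, and your paths may differ):
python3 experiments.py evaluate configs/CartPoleEnv/env.json configs/CartPoleEnv/DQNAgent.json --train --episodes=200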
Monitoring
The project includes multiple monitoring tools to assess agent performance:
- Run Metadata: Saves the configurations of environments and agents to ensure reproducibility.
- Gym Monitor: Logs the main statistics of each run, which can be visualized with the analyze.py script.
- Logging: Uses Python's logging library to record progress and results with adjustable verbosity.
- TensorBoard: Records and visualizes metrics using TensorBoard for more detailed inspection of runs.
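For instance, assuming your runs are written to an out/ directory (the actual location may differ in your setup), TensorBoard can be pointed at it directly:
tensorboard --logdir out/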
Agents
RL-Agents implements a diverse array of RL strategies, categorized into different types:
Planning
- Value Iteration (VI): Computes state-action values by dynamic programming, assuming access to a finite MDP model of the environment (a minimal sketch is given after this list).
- Cross-Entropy Method (CEM): A sampling-based approach that iteratively refines a Gaussian distribution over action sequences by keeping the best-performing samples.
- Monte-Carlo Tree Search (MCTS): Uses a model of the environment's transitions to build and explore a search tree of action sequences through repeated simulation.
Specific variants of MCTS include:
- Upper Confidence bounds applied to Trees (UCT)
- Optimistic Planning for Deterministic systems (OPD)
- Open Loop Optimistic Planning (OLOP)
- Trailblazer
- PlaTγPOOS
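To give a flavour of the planning agents, here is a minimal value-iteration sketch on a toy finite MDP. It illustrates only the Bellman backup that VI performs; it is not the repository's implementation, and the example MDP and all names are invented for illustration.

import numpy as np

def value_iteration(transition, reward, gamma=0.9, iterations=100):
    """Compute state-action values Q for a finite MDP.

    transition: array of shape (S, A, S), transition[s, a, s2] = P(s2 | s, a)
    reward:     array of shape (S, A), immediate reward for taking action a in state s
    """
    n_states, n_actions, _ = transition.shape
    value = np.zeros(n_states)
    q = np.array(reward, dtype=float)
    for _ in range(iterations):
        # Bellman optimality backup: Q(s, a) = r(s, a) + gamma * E[max_a' Q(s', a')]
        q = reward + gamma * transition @ value
        value = q.max(axis=1)
    return q

# Toy 2-state, 2-action MDP, entirely made up for this example.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.0, 1.0], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
print("Greedy action per state:", value_iteration(P, R).argmax(axis=1))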
Safe Planning
These agents target environments with uncertain dynamics and plan with respect to worst-case outcomes (a minimal worst-case backup is sketched after this list):
- Robust Value Iteration (RVI)
- Discrete Robust Optimistic Planning (DROP)
- Interval-based Robust Planning (IRP)
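As an illustration of the robust planning idea, the sketch below performs value iteration with a worst-case backup over a finite set of candidate transition models (the ambiguity set). This is a schematic view under that assumption, not the repository's RVI implementation, and it reuses the toy-MDP conventions from the earlier sketch.

import numpy as np

def robust_value_iteration(models, reward, gamma=0.9, iterations=100):
    """Worst-case value iteration over a finite ambiguity set of transition models.

    models: array of shape (M, S, A, S), one candidate transition matrix per model
    reward: array of shape (S, A)
    """
    value = np.zeros(models.shape[1])
    q = np.array(reward, dtype=float)
    for _ in range(iterations):
        # Standard Bellman backup under each candidate model...
        q_per_model = reward + gamma * models @ value   # shape (M, S, A)
        # ...then keep the worst case over the ambiguity set.
        q = q_per_model.min(axis=0)
        value = q.max(axis=1)
    return q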
Value-Based
Agents designed to estimate optimal policies through state-action values:
- Deep Q-Network (DQN): Uses a neural network to approximate the optimal state-action value function (a minimal sketch of the idea follows this list).
- Fitted-Q (FTQ): Fits a Q-function model by regression over a batch of collected transitions.
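As a minimal, self-contained sketch of the DQN idea, the code below trains a small network to regress Bellman targets computed from a batch of transitions and a frozen target network. The architecture, names, and hyperparameters are arbitrary choices for illustration and are not those used by RL-Agents.

import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Small MLP mapping a state vector to one Q-value per action."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state):
        return self.net(state)

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    """One gradient step on a batch of (states, actions, rewards, next_states, dones) tensors."""
    states, actions, rewards, next_states, dones = batch
    # Q-values of the actions that were actually taken.
    q_values = q_net(states).gather(1, actions.long().unsqueeze(1)).squeeze(1)
    # Bellman targets computed with the frozen target network.
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + gamma * (1 - dones) * next_q
    loss = nn.functional.mse_loss(q_values, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()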
Safe Value-Based
- Budgeted Fitted-Q (BFTQ): Extends FTQ to compute policies whose expected cost remains within a specified budget.
Citing
If RL-Agents proves helpful in your research, consider citing it as follows:
@misc{rl-agents,
author = {Leurent, Edouard},
title = {rl-agents: Implementations of Reinforcement Learning algorithms},
year = {2018},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/eleurent/rl-agents}},
}
In summary, RL-Agents offers a flexible toolkit for exploring reinforcement learning, with a comprehensive suite of algorithms and supporting tools for training, evaluation, and monitoring. Whether you are researching, learning, or deploying RL models, the project provides a solid starting point.