OmniSafe: A Comprehensive Framework for Safe Reinforcement Learning
OmniSafe is an advanced infrastructural framework designed to expedite research in safe reinforcement learning (SafeRL). It serves a dual purpose: providing a solid benchmark for SafeRL algorithms and a modular toolkit for researchers aiming to develop algorithms that reduce the risk of unintended harm or unsafe behavior.
Overview of OmniSafe
As the first unified learning framework specifically tailored for safe reinforcement learning, OmniSafe promotes the growth of the SafeRL community. The framework is characterized by several key features:
- Modular Framework: OmniSafe offers a highly modular design with an extensive range of algorithms tailored to diverse safe reinforcement learning applications. Its user-friendly interface adopts the Adapter and Wrapper design patterns to facilitate seamless interaction between components, which permits easy extension and customization and makes OmniSafe a valuable tool for developers.
- High-Performance Computing: Built on torch.distributed, OmniSafe accelerates the learning process via parallel computing. It supports asynchronous parallelism at the environment level as well as asynchronous agent learning, improving both training stability and speed.
- Out-of-the-Box Toolkits: OmniSafe provides ready-to-use toolkits for training, benchmarking, analysis, and rendering. With tutorials and a user-friendly API, it caters to both beginners and experts, helping them work efficiently without getting bogged down in complex code. A minimal quick-start sketch follows this list.
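To make the high-level API concrete, here is a minimal quick-start sketch based on the project's documented usage. It assumes OmniSafe is already installed (see the next section); the algorithm name and environment ID are illustrative choices, not the only options.

```python
# Minimal quick-start sketch: train a Lagrangian PPO agent on a
# Safety-Gymnasium navigation task using OmniSafe's high-level API.
import omnisafe

env_id = 'SafetyPointGoal1-v0'      # a Safety-Gymnasium navigation task
agent = omnisafe.Agent('PPOLag', env_id)
agent.learn()                       # trains with the algorithm's default configuration
```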
Installation
OmniSafe requires Python 3.8 or higher and PyTorch 1.10 or later. Here is a brief guide on how to install it:
- From Source:
  git clone https://github.com/PKU-Alignment/omnisafe.git
  cd omnisafe
  conda env create --file conda-recipe.yaml
  conda activate omnisafe
  pip install -e .
- From PyPI: simply run pip install omnisafe to get the latest released version.
OmniSafe supports multiple operating systems: it is officially tested on Linux and also supports macOS on M1 and M2 chips. Windows support is not official, but contributions are welcome.
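After installation, a quick import check confirms that the package and its PyTorch dependency resolve correctly. The __version__ attribute is an assumption and is guarded with a fallback in case the installed release does not expose it.

```python
# Sanity check after installation: import OmniSafe and report versions.
import torch
import omnisafe

print('torch:', torch.__version__)
print('omnisafe:', getattr(omnisafe, '__version__', 'unknown'))  # attribute name assumed
```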
Implemented Algorithms
OmniSafe includes a broad suite of algorithms spanning On-Policy, Off-Policy, Model-Based, and Offline SafeRL. Notable algorithms include the following (a brief configuration sketch follows the list):
- On-Policy Algorithms: such as PPO-Lag, TRPO-Lag, and CPO, which adapt popular on-policy methods to safe reinforcement learning.
- Off-Policy Algorithms: such as DDPG-Lag and TD3-Lag, which add Lagrangian safety constraints to widely used off-policy actor-critic methods.
- Model-Based Algorithms: including SafeLOOP and CAPPETS, which plan with learned dynamics models to improve decision-making under uncertainty while respecting safety constraints.
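Switching algorithms is a one-line change in the high-level API, and default hyperparameters can be overridden with a nested configuration dictionary. The sketch below assumes the custom_cfgs keyword and the specific key names shown; both should be verified against the default configuration files shipped with the installed version.

```python
# Hedged sketch: pick a Lagrangian off-policy algorithm and override a few
# defaults. The config key names below are assumptions; check them against the
# algorithm's default configuration in your OmniSafe installation.
import omnisafe

custom_cfgs = {
    'train_cfgs': {'total_steps': 1_000_000},  # assumed key: total environment steps
    'logger_cfgs': {'use_wandb': False},       # assumed key: disable external logging
}

agent = omnisafe.Agent('TD3Lag', 'SafetyPointGoal1-v0', custom_cfgs=custom_cfgs)
agent.learn()
```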
Supported Environments
OmniSafe is compatible with tasks from Safety-Gymnasium. It supports a wide range of agents, such as Point, Car, Racecar, and Ant, in safe navigation and velocity tasks, as well as complex manipulation with ShadowHand in the Safe Isaac Gym tasks.
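Safety-Gymnasium environment IDs combine the agent, the task, and a difficulty level. The sketch below instantiates a few such tasks directly to inspect their spaces; it assumes the safety-gymnasium package is available (it backs OmniSafe's environments), and the exact ID strings and version suffixes should be checked against the installed release.

```python
# Sketch: instantiate a few Safety-Gymnasium tasks to inspect their spaces.
# Assumes the safety-gymnasium package is installed; ID strings are examples.
import safety_gymnasium

for env_id in ['SafetyPointGoal1-v0', 'SafetyCarButton1-v0', 'SafetyAntVelocity-v1']:
    env = safety_gymnasium.make(env_id)
    print(env_id, env.observation_space.shape, env.action_space.shape)
    env.close()
```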
Customization and Usability
Users can customize environments without altering OmniSafe's source code, allowing for personalized training scenarios. The framework also supports logging environment-specific information, with easy setup through guided tutorials available online; a generic sketch of the underlying idea follows.
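OmniSafe's own registration mechanism for custom environments is covered in its tutorials; the common requirement across SafeRL tooling is that each step exposes a scalar cost alongside the reward. The Gymnasium wrapper below is a generic sketch of that idea, not OmniSafe's API, and the cost rule and the 'cost' info key are illustrative assumptions.

```python
# Generic sketch (not OmniSafe-specific): wrap a Gymnasium environment so each
# step also reports a scalar safety cost via the info dict. The cost rule here
# (penalizing large actions) is purely illustrative.
import gymnasium as gym
import numpy as np


class CostInfoWrapper(gym.Wrapper):
    """Adds an illustrative per-step safety cost to the step info."""

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        # Hypothetical safety rule: large-magnitude actions count as unsafe.
        info['cost'] = float(np.linalg.norm(action) > 1.0)
        return obs, reward, terminated, truncated, info


env = CostInfoWrapper(gym.make('Pendulum-v1'))
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
print('cost this step:', info['cost'])
```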
Community Contribution and Development
OmniSafe encourages contributions from the research community. Users can engage in discussions, raise issues, or even submit pull requests to improve the project. If used for research, citing OmniSafe is recommended to acknowledge and support the ongoing efforts to enhance safe reinforcement learning methodologies.
OmniSafe is not just a toolkit but a comprehensive ecosystem supporting the continued advancement of safe reinforcement learning. With its robust framework and ease of use, it is set to become an essential part of the research community's efforts to make RL safer and more dependable.