#reinforcement learning
ml-agents
The Unity ML-Agents Toolkit is an open-source project for training intelligent agents inside game and simulation environments. Using algorithms such as reinforcement learning and imitation learning, it serves both game developers and researchers through a flexible Python API. The toolkit supports single- and multi-agent training scenarios, and trained models can be embedded in games via Unity Sentis for cross-platform inference. Features include curriculum learning, custom training algorithms, and wrappers for the Gym and PettingZoo APIs. It helps developers build better NPC behaviors and lets researchers evaluate game designs and run automated game testing.
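A minimal sketch of the low-level `mlagents_envs` Python API, assuming a Unity scene with at least one agent behavior is open in the Editor (passing `file_name=None` connects to it):

```python
from mlagents_envs.environment import UnityEnvironment

# Connect to a running Unity Editor instance (or pass a built game's path).
env = UnityEnvironment(file_name=None)
env.reset()

behavior_name = list(env.behavior_specs)[0]
spec = env.behavior_specs[behavior_name]

# Agents waiting for a decision vs. agents whose episode just ended.
decision_steps, terminal_steps = env.get_steps(behavior_name)

# Sample random actions for every agent awaiting a decision.
actions = spec.action_spec.random_action(len(decision_steps))
env.set_actions(behavior_name, actions)
env.step()
env.close()
```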
dopamine
Dopamine is a flexible framework for rapid prototyping of reinforcement learning algorithms, with an emphasis on experimentation and reproducibility. It provides concise JAX implementations of proven agents such as DQN, C51, and Rainbow. Compatible with Atari and MuJoCo environments, Dopamine can be installed via pip or from source, with a source installation allowing direct modification of the code. Comprehensive documentation and setup guidelines help users customize and run experiments efficiently.
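A sketch of the runner pattern from Dopamine's documentation; the output directory and gin config path are placeholders to adapt:

```python
from dopamine.discrete_domains import run_experiment

base_dir = "/tmp/dopamine_dqn"                           # output directory
gin_files = ["dopamine/jax/agents/dqn/configs/dqn.gin"]  # a bundled config

# Gin bindings configure the agent, environment, and training schedule.
run_experiment.load_gin_configs(gin_files, gin_bindings=[])
runner = run_experiment.create_runner(base_dir)
runner.run_experiment()
```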
stable-baselines3-contrib
SB3-Contrib provides experimental reinforcement learning algorithms and utilities that extend stable-baselines3, including implementations such as Augmented Random Search, Quantile Regression DQN (QR-DQN), and Recurrent PPO (PPO with an LSTM policy). Aimed at researchers, it collects niche tools that support a broad range of RL research and applications, making it easy to integrate and explore newer enhancements in reinforcement learning.
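Because SB3-Contrib algorithms follow the stable-baselines3 interface, swapping one in is a one-line change; a quick sketch with two of them:

```python
from sb3_contrib import QRDQN, RecurrentPPO

# Recurrent PPO: PPO with an LSTM policy for partially observable tasks.
model = RecurrentPPO("MlpLstmPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=5_000)

# Quantile Regression DQN uses the same learn/predict interface.
model = QRDQN("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=5_000)
```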
rl-plotter
The rl-plotter tool streamlines the visualization of learning curves in reinforcement learning projects. Compatible with OpenAI Spinning Up and Baselines, it enhances performance tracking with customizable plotting styles and loggers for additional data. Its command-line options accommodate specific visualization needs such as filtering, averaging, and shaded regions for data plots, and it installs easily via pip or from source.
lerobot
The open-source project LeRobot brings AI models, datasets, and tools for real-world robotics to PyTorch. It seeks to lower entry barriers so that anyone can contribute to and benefit from shared datasets and pretrained models. Emphasizing imitation learning and reinforcement learning for real-world application, it offers a variety of pretrained models, datasets, and simulation environments. Planned expansions aim to support more cost-effective robotics hardware. Hosted on Hugging Face, LeRobot's community encourages exploration and collaboration in ongoing robotics advancements.
pbdl-book
This book serves as a detailed guide to combining deep learning with physical simulations, featuring practical Jupyter notebooks. It focuses on solving PDE problems using deep learning while integrating existing physical knowledge and numerical methods. The latest version adds extensive content on differentiable physics training and innovative learning techniques. Highlights include hybrid fluid flow solvers, Bayesian Neural Networks for RANS flow with uncertainty predictions, and reinforcement learning for PDE control, making it a valuable asset for enhancing AI-driven physical modeling.
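The core trick behind differentiable-physics training is to backpropagate through an unrolled simulator; a toy PyTorch illustration (not taken from the book) that fits a decay rate by differentiating through twenty explicit-Euler steps:

```python
import torch

# Fit the decay rate k of dx/dt = -k * x so the rollout ends near a target.
k = torch.tensor(0.5, requires_grad=True)
dt, steps, target = 0.1, 20, torch.tensor(0.3)

x = torch.tensor(1.0)
for _ in range(steps):
    x = x - dt * k * x        # one Euler step, kept in the autograd graph

loss = (x - target) ** 2
loss.backward()               # d(loss)/dk flows back through all 20 steps
print(k.grad)                 # usable with any optimizer to update k
```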
AutoWebGLM
AutoWebGLM enhances web navigation with the ChatGLM3-6B model, featuring HTML simplification and hybrid AI-human training for better browsing comprehension. It employs reinforcement learning to optimize real-world tasks, supported by the AutoWebBench bilingual benchmark. Open evaluation tools offer robust frameworks for testing and improving the agent's efficiency in web interactions.
DI-engine
DI-engine offers a versatile platform for reinforcement learning, integrating asynchronous-native task abstractions and core decision-making components such as Environment, Policy, and Model. Utilizing PyTorch and JAX, it supports a wide range of RL algorithms including basic, multi-agent, and model-based types. Targeted at both academic research and prototype development, it delivers modular tools and training resources, optimized for large-scale reinforcement learning tasks across various environments.
neurojs
Neurojs, a JavaScript framework for deep learning in the browser, focuses on reinforcement learning, with illustrative demos such as a 2D self-driving car. Though now superseded by more mature frameworks like TensorFlow.js, it offers a full-stack neural-network architecture with features such as priority replay buffers and deep-Q networks, and it remains well suited to hands-on experimentation with neural networks directly in the browser.
qlib
Qlib is an open-source platform for AI-driven quantitative investment research, covering data handling, model training, backtesting, and strategy development, and facilitating automated factor mining with RD-Agent and a range of machine learning models. Recent enhancements add LLM-driven automation and new model integrations.
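A minimal initialization sketch following Qlib's documented quick start; the data path assumes the CN dataset has already been downloaded with the project's data scripts:

```python
import qlib
from qlib.constant import REG_CN
from qlib.data import D

# Point provider_uri at a local Qlib dataset (path is illustrative).
qlib.init(provider_uri="~/.qlib/qlib_data/cn_data", region=REG_CN)

# Query the trading calendar as a quick smoke test.
print(D.calendar(start_time="2020-01-01", end_time="2020-12-31", freq="day")[:5])
```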
TextRL
TextRL uses reinforcement learning to improve text-generation models, building on Hugging Face Transformers, PFRL, and OpenAI Gym. Its customizable components support various text-generation frameworks, with examples for models such as GPT-2 and FLAN-T5. The project includes comprehensive documentation on installation, model training, and reward-function setup, offering adaptable solutions for controllable text generation across different AI models.
trl
TRL is a library for post-training foundation models with methods such as Supervised Fine-Tuning (SFT), Proximal Policy Optimization (PPO), and Direct Preference Optimization (DPO). Built on Hugging Face Transformers, it supports multiple model architectures and scales from a single GPU to multi-node clusters. Its CLI allows fine-tuning without writing code, and dedicated trainers for each method make efficient use of hardware.
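A supervised fine-tuning sketch in the style of TRL's quick start; the model and dataset identifiers are examples to swap for your own:

```python
from datasets import load_dataset
from trl import SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")

# SFTTrainer accepts a model id directly and handles tokenization internally.
trainer = SFTTrainer(model="Qwen/Qwen2.5-0.5B", train_dataset=dataset)
trainer.train()
```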
CAGrad
CAGrad (conflict-averse gradient descent) is a multitask learning method, published at NeurIPS 2021, that optimizes the average loss while explicitly managing conflicts among per-task gradients: it seeks an update direction that maximizes the worst-case task improvement within a ball around the average gradient. The repository also includes FAMO, a follow-up method that adapts task weights dynamically without computing all task gradients, reducing the cost of each update. Experiments on NYU-v2, CityScapes, and Metaworld illustrate its effectiveness in image-to-image prediction and multitask reinforcement learning, helping researchers optimize multitask objectives with modest resource usage.
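A simplified two-task sketch of the CAGrad update, following the dual form in the paper (a grid search stands in for the paper's scalar optimization; `c` controls how far the update may deviate from the average gradient):

```python
import numpy as np

def cagrad_two_task(g1, g2, c=0.5, n_grid=201):
    """Conflict-averse combination of two task gradients.

    Minimizes the dual  g_w . g0 + c*||g0||*||g_w||  over w in [0, 1],
    then returns d = g0 + (c*||g0|| / ||g_w||) * g_w.
    """
    g0 = 0.5 * (g1 + g2)                       # average gradient
    g0_norm = np.linalg.norm(g0)
    best_w, best_val = 0.0, np.inf
    for w in np.linspace(0.0, 1.0, n_grid):    # search the 1-simplex
        gw = w * g1 + (1.0 - w) * g2
        val = gw @ g0 + c * g0_norm * np.linalg.norm(gw)
        if val < best_val:
            best_w, best_val = w, val
    gw = best_w * g1 + (1.0 - best_w) * g2
    lam = c * g0_norm / (np.linalg.norm(gw) + 1e-8)
    return g0 + lam * gw                       # conflict-averse update

# Two conflicting task gradients: the result trades off both objectives.
print(cagrad_two_task(np.array([1.0, 0.0]), np.array([-0.5, 1.0])))
```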
tensorforce
Tensorforce, an open-source library built on TensorFlow, provides a modular architecture for deep reinforcement learning suited to both research and practical applications. Its component-based design decouples algorithms from environments, improving versatility and accessibility. Supporting a variety of network architectures, policy distributions, and optimization strategies, it facilitates the development of agents such as DQN and PPO. The library ships practical example configurations and thorough documentation; note, however, that the project is no longer actively maintained.
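A minimal act/observe loop in the style of Tensorforce's quickstart:

```python
from tensorforce import Agent, Environment

environment = Environment.create(
    environment="gym", level="CartPole-v1", max_episode_timesteps=500
)
agent = Agent.create(agent="ppo", environment=environment, batch_size=10)

# One episode: the agent acts, the environment executes, the agent observes.
states = environment.reset()
terminal = False
while not terminal:
    actions = agent.act(states=states)
    states, terminal, reward = environment.execute(actions=actions)
    agent.observe(terminal=terminal, reward=reward)
```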
motif
Explore Motif's unique method of using a Large Language Model to define reward functions for AI agent training in NetHack. This method features a three-phase process: dataset annotation, reward training, and reinforcement learning, transforming LLM preferences into intrinsic agent motivation. Discover intuitive, human-aligned AI behaviors guided by customizable prompts and gain insights into Motif's capabilities for feedback-driven intrinsic rewards in reinforcement learning.
Miniworld
MiniWorld is a minimalist 3D simulator for reinforcement learning and robotics research, focusing on basic interior layouts. Written entirely in Python, it is easy to customize and well suited to educational use. Key features include domain randomization, in-world text display, and RGB-D depth observations. Although limited in graphics and physics fidelity, it is efficient and light on resources. Numerous academic publications demonstrate its utility in studies of reinforcement learning and procedural environment exploration. It requires only Python 3.7+, Gymnasium, and NumPy, and installation is uncomplicated.
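MiniWorld environments register with Gymnasium on import; a short sketch using one example environment id:

```python
import gymnasium as gym
import miniworld  # noqa: F401 -- importing registers the MiniWorld-* ids

env = gym.make("MiniWorld-OneRoom-v0")
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
env.close()
```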
lab
DeepMind Lab is a dynamic 3D testing environment for AI, centered on deep reinforcement learning. It presents varied tasks that challenge learning agents in navigation and problem-solving. Built on the ioquake3 engine, this platform consolidates open-source resources, providing a robust testbed for researchers. Comprehensive setup guides, Python API support, and Lua-configured tasks enhance its utility as a tool for AI development.
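A short sketch of the Python API following the repository's examples; the level name and observation key come from its bundled levels:

```python
import numpy as np
import deepmind_lab

env = deepmind_lab.Lab(
    "seekavoid_arena_01", ["RGB_INTERLEAVED"],
    config={"width": "96", "height": "72"},   # config values are strings
)
env.reset()

# Actions are integer intensities over 7 control axes (look, strafe,
# move, etc.); all zeros is a no-op, repeated here for 4 frames.
action = np.zeros((7,), dtype=np.intc)
reward = env.step(action, num_steps=4)
rgb = env.observations()["RGB_INTERLEAVED"]
```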
gymfc
GymFC is a framework designed for tuning UAV flight control systems, with a focus on attitude control. It allows synthesizing neuro-flight and traditional controllers and is utilized in the Neuroflight firmware. GymFC is adaptable to different aircraft types through sensor and actuator configurations and supports multiple Gazebo versions. For more technical details, refer to associated manuscripts and Wil Koch's thesis.
gym-ignition
Explore a flexible framework for building robotics environments for reinforcement learning. Built on ScenarIO, gym-ignition provides key abstractions such as Task and Runtime so developers can focus on environment logic. While no longer maintained, it simplifies simulation setup and supports domain randomization. It suits custom RL environment creation; although it ships no pre-built environments, ample examples are available.
brax
Brax, built with JAX, is a high-performance physics engine for fast simulations in robotics, human perception, and reinforcement learning. Optimized for accelerator hardware, it scales simulations across multiple devices. Brax includes training algorithms such as PPO and SAC, and its differentiable simulator enables analytical policy gradients. With four physics pipelines, including MuJoCo XLA (MJX) and Spring, Brax adapts to diverse simulation needs. It offers Colab notebooks, integrates with frameworks like PyTorch, and installs via pip, Conda, or source to fit different computational environments.
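A rollout sketch following Brax's documented environment API; "ant" is one bundled environment, and `backend` selects the physics pipeline:

```python
import jax
from brax import envs

env = envs.create(env_name="ant", backend="positional")
state = jax.jit(env.reset)(jax.random.PRNGKey(0))

# Step with a zero action; reset and step are jit-compiled pure functions.
action = jax.numpy.zeros(env.action_size)
state = jax.jit(env.step)(state, action)
```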
tensorlayer-chinese
The TensorLayer library, grounded in TensorFlow, offers extensive Chinese documentation and active community forums. Aimed at facilitating AI development, it equips researchers and engineers with diverse neural network tools. Engage with its dynamic user communities on platforms like QQ, WeChat, and Slack to collaborate and innovate in AI solutions.
ViZDoom
ViZDoom enables AI development for playing Doom using visual inputs, serving as a crucial research tool in visual machine learning and deep reinforcement learning. Available for Linux, macOS, and Windows, it provides Python and C++ APIs. Key features include custom scenario creation, multi-platform support, and adjustable settings, supporting both asynchronous and synchronous multiplayer modes. Gymnasium environment wrappers enhance its utility in reinforcement learning studies, making it suitable for learning from demonstrations and other advanced learning techniques.
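A minimal episode loop with one of the bundled scenarios:

```python
import os
import vizdoom as vzd

game = vzd.DoomGame()
game.load_config(os.path.join(vzd.scenarios_path, "basic.cfg"))
game.init()

game.new_episode()
while not game.is_episode_finished():
    state = game.get_state()               # screen buffer + game variables
    reward = game.make_action([1, 0, 0])   # press the first defined button
game.close()
```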
omnisafe
OmniSafe is a framework that facilitates safe reinforcement learning research. It offers a modular, extensible design that integrates advanced algorithms with high-performance parallel computing to improve training efficiency. Its toolkits cover a variety of tasks, including benchmarking, and suit both new and experienced researchers. The platform prioritizes risk reduction and safety, making it applicable across many areas of the SafeRL community.
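Training follows a compact two-line pattern in the project's examples; the algorithm and environment ids below follow its documented usage and may vary by version:

```python
import omnisafe

# Lagrangian-constrained PPO on a Safety-Gymnasium navigation task.
agent = omnisafe.Agent("PPOLag", "SafetyPointGoal1-v0")
agent.learn()
```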
alpha-zero-general
Alpha Zero General provides an adaptable implementation of self-play reinforcement learning based on the AlphaGo Zero model. It is compatible with any two-player turn-based game and supports various deep learning frameworks, making it a useful asset for developers. The project includes examples for games like Othello, GoBang, and TicTacToe using PyTorch and Keras. Its design allows for easy customization through subclassing game and neural network templates. Key features include a training loop, Monte Carlo Tree Search, and flexible neural network parameter settings. Setup can be streamlined with nvidia-docker for a Jupyter environment.
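Adding a new game means subclassing the repository's `Game` template; a hypothetical stub showing some of the methods the framework expects (see the bundled Othello example for a complete implementation):

```python
import numpy as np
from Game import Game  # base template shipped with the repository

class TicTacToeLikeGame(Game):
    """Hypothetical stub for a 3x3 two-player game."""

    def getInitBoard(self):
        return np.zeros((3, 3), dtype=np.int8)   # empty board

    def getBoardSize(self):
        return (3, 3)

    def getActionSize(self):
        return 9                                  # one action per cell

    def getNextState(self, board, player, action):
        b = board.copy()
        b[action // 3][action % 3] = player
        return b, -player                         # next board, next player

    # ...plus getValidMoves, getGameEnded, getCanonicalForm,
    # getSymmetries, and stringRepresentation.
```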
Gym.NET
Gym.NET is a C# adaptation of the OpenAI Gym, offering a framework for creating and testing reinforcement learning algorithms. It includes environments like CartPole and LunarLander, with rendering options via WinForm and Avalonia. Future plans involve adding support for more environments and improving compatibility. Installation is straightforward, making it accessible for reinforcement learning exploration in .NET.
ravens
The Ravens project uses PyBullet to simulate vision-based robotic manipulation, featuring 10 tabletop tasks with scripted expert demonstrations and reward functions. Its Gym-like API supports generalization to unseen objects and multi-step tasks with closed-loop feedback. Its Transporter Network architecture improves sample efficiency without assuming explicit object representations. Experiments show fast learning and stronger generalization than prior baselines, illustrating advances in robotic manipulation.
jumanji
Explore 22 scalable reinforcement learning environments written in JAX, ranging from basic games to NP-hard combinatorial problems, supporting research applications in both academia and industry. Jumanji integrates with popular interfaces such as OpenAI Gym and DeepMind's dm_env via wrappers and provides practical examples for easy adoption, suiting novice and experienced users alike.
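A rollout sketch following the project's README; note that depending on the installed version, `action_spec` may be a method or a cached property:

```python
import jax
import jumanji

env = jumanji.make("Snake-v1")       # one of the registered environment ids
key = jax.random.PRNGKey(0)

# reset and step are pure functions of the state, so they can be jitted.
state, timestep = jax.jit(env.reset)(key)
action = env.action_spec().generate_value()
state, timestep = jax.jit(env.step)(state, action)
```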
DI-engine-docs
DI-engine Docs, produced by OpenDILab, offers educational resources on Decision Intelligence and reinforcement learning. It covers DI-engine's introduction, reinforcement learning concepts, algorithm classifications, environment examples, and API documentation. Available in English and Chinese, it serves global audiences seeking to enhance understanding in these fields.
Gymnasium
Gymnasium, the maintained successor to OpenAI's Gym, provides a consistent API for reinforcement learning together with a diverse collection of reference environments, from Classic Control to Atari. Flexible installation options and companion libraries such as CleanRL round out the toolkit.
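The canonical interaction loop:

```python
import gymnasium as gym

env = gym.make("CartPole-v1")
observation, info = env.reset(seed=42)
for _ in range(200):
    action = env.action_space.sample()     # a random policy
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()
env.close()
```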
DRLX
Discover DRLX, a library for distributed diffusion-model training with reinforcement learning. It integrates with Hugging Face Diffusers and uses Accelerate for scalable multi-GPU and multi-node setups, with the DDPO algorithm supported across pipelines such as Stable Diffusion. Documentation covers installation and the project's latest experiments.
Feedback Email: [email protected]