#Reinforcement Learning

ML-Course-Notes
Access comprehensive lecture notes and resources from leading courses like Andrew Ng's Machine Learning and MIT's Deep Learning. Ideal for AI enthusiasts seeking insights into advanced and foundational AI topics.
AI-Optimizer
AI-Optimizer provides a wide array of algorithm libraries for reinforcement learning, covering both model-free and model-based methods. It is designed for single-agent and multi-agent setups and features a distributed framework for efficient training. Highlighted areas of innovation include multi-agent reinforcement learning, offline RL, self-supervised learning, and model-based RL, tackling issues such as scalability and sample efficiency. It offers accessible implementations for researchers and practitioners working on complex real-world scenarios.
PaLM-rlhf-pytorch
The project demonstrates the implementation of Reinforcement Learning with Human Feedback (RLHF) on the PaLM infrastructure, enabling researchers to explore open-source systems similar to ChatGPT. It provides guidelines on using the PaLM framework, training reward models with human input, and integrating RLHF for improved performance. The contributions of CarperAI and support from Hugging Face are acknowledged, as well as potential enhancements like Direct Preference Optimization.
ML-YouTube-Courses
This repository features a curated selection of machine learning courses from YouTube, spanning topics like basics, deep learning, NLP, computer vision, and reinforcement learning. Compiled by DAIR.AI, it includes courses from prestigious institutions such as Caltech, Stanford, and MIT, offering educational resources for professionals and enthusiasts. It provides access to advanced courses on modern techniques and practical applications, serving both beginners and experienced learners in AI and machine learning.
rl-baselines-zoo
Discover a versatile array of reinforcement learning agents, designed to showcase adaptable practices across multiple environments. Built on Stable Baselines, this toolkit supports hyperparameter tuning and offers user-friendly interfaces for training, testing, and deploying agents. It suits both novices and expert developers, whether for educational exploration or development. For the latest advancements, consider moving to RL-Baselines3 Zoo.
MedicalGPT
This page details the methodologies used in training medical language models using GPT techniques, including pretraining, supervised fine-tuning, reinforcement learning from human feedback (RLHF), and direct preference optimization (DPO). By utilizing extensive multilingual datasets, MedicalGPT enhances performance in medical Q&A systems and supports various architectures such as Llama and Vicuna. The platform provides practical scripts and demo interfaces for ease of integration, serving as a significant resource for the development of contemporary medical AI applications.
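As an illustration of the direct preference optimization (DPO) step named above, here is a minimal sketch of the per-pair DPO loss. It is not MedicalGPT's code: the function name, argument names, and the default `beta=0.1` are assumptions chosen for the example, and the inputs are plain summed log-probabilities rather than model outputs.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss from summed log-probabilities (illustrative only).

    beta=0.1 is an assumed default, not a value taken from MedicalGPT.
    """
    # How much more the policy likes each response than the frozen reference does
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), written in a numerically stable form
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))
```

When the policy and the reference agree exactly, the loss equals log 2; it falls below that as the policy shifts probability toward the preferred response.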
SolidUI
Convert text to graphics using AI technology. SolidUI offers 2D and 3D graphic models and scenes by merging natural language processing with computer graphics. Its text-to-graphic language model uses reinforcement learning for improved accuracy. The platform supports containerized deployment, various data sources, Hugging Face collaboration, and plug-in robotics for enhanced visualization tool development.
open-chatgpt
Investigate an open-source framework designed for crafting AI models akin to ChatGPT with straightforward processes. Utilize a system that optimizes limited computational capabilities for training through RLHF and advanced distributed training solutions. The project supports expansive language models, incorporates fine-tuning with LoRA, and ensures compatibility with DeepSpeed for enhanced scalability. Access a complete toolkit to create instruction-following models, featuring diverse datasets for multilingual and task-specific uses.
OneTrainer
OneTrainer serves as a comprehensive hub for Stable Diffusion model training, compatible with an array of models from Stable Diffusion 1.5 to 3.5 and SDXL, among others. It incorporates training strategies including full fine-tuning, LoRA, and embeddings, alongside features like masked training and image augmentation. Users benefit from automatic backups, Tensorboard tracking, and tools for multi-resolution training. The toolkit also streamlines model management with format conversion and real-time model sampling, and offers both CLI and GUI modes.
rllte
Discover a versatile toolkit tailored for advancing reinforcement learning by providing a full ecosystem for designing tasks, evaluating models, and deploying systems. It facilitates modular algorithm development and supports hardware acceleration across multiple devices like GPU and NPU. Featuring intrinsic reward modules and a language model-enhanced assistant, this platform enables efficient implementation and testing of RL algorithms, serving as a valuable resource for researchers and developers aiming to utilize advanced technologies and reusable benchmarks to improve RL applications.
gym-sokoban
Discover the challenges of Sokoban with dynamic room creation, a valuable tool for testing Reinforcement Learning algorithms. This environment aids AI research by varying puzzles to avoid overfitting, offering diverse gameplay options and configurations. Ideal for AI development, it supports multiple modes and adaptations, providing a practical solution for enhancing learning algorithms.
gymnax
Gymnax integrates JAX acceleration into gym APIs, enhancing speed and efficiency in reinforcement learning environments. Supporting various settings from classic control to bsuite, it employs JAX primitives like 'jit', 'vmap', and 'pmap' for high-throughput experiments. The project offers control over environments, beneficial for meta reinforcement learning and evolutionary optimization, including implementing the Anakin sub-architecture. Speed tests on NVIDIA A100 GPUs illustrate its capabilities, making it suitable for scalable RL experiments. Tutorial resources are available for users to start exploring its features.
rl-agents
This project provides a diverse set of reinforcement learning agents specializing in planning, safe exploration, and value-based strategies. Featuring implementations like Value Iteration, Monte-Carlo Tree Search, and Deep Q-Networks, it supports both deterministic and stochastic environments and integrates seamlessly with OpenAI's Gym. Equipped with monitoring tools such as Gym Monitor and Tensorboard, this toolkit facilitates efficient experimentation with various configurations, offering a valuable resource for AI research and development.
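Value Iteration, one of the planners listed above, can be sketched in a few lines. This is a generic textbook version, not the rl-agents implementation; the `transitions` layout, the `gamma` and `tol` defaults, and the three-state chain are all assumptions made for the example.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """Return optimal state values for a small tabular MDP.

    transitions[state][action] is a list of (probability, next_state, reward)
    tuples; a state with no actions is treated as terminal. This layout is a
    hypothetical one chosen for the sketch.
    """
    values = {state: 0.0 for state in transitions}
    while True:
        delta = 0.0
        for state, actions in transitions.items():
            if not actions:  # terminal state keeps value 0
                continue
            # Bellman optimality backup: best expected one-step return
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            return values

# Hypothetical three-state chain: reaching state 2 from state 1 pays reward 1.
chain = {
    0: {"stay": [(1.0, 0, 0.0)], "right": [(1.0, 1, 0.0)]},
    1: {"left": [(1.0, 0, 0.0)], "right": [(1.0, 2, 1.0)]},
    2: {},
}
optimal_values = value_iteration(chain)
```

With `gamma=0.9`, state 1 is worth 1.0 and state 0 is worth 0.9, since the reward is one step further away.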
h2o-llmstudio
H2O LLM Studio is a no-code platform for fine-tuning large language models using an intuitive GUI. It features cutting-edge techniques like Low-Rank Adaptation, supports multiple hyperparameters, and offers model performance tracking through Neptune and W&B integration. Recent enhancements provide robust training and optimization methods.
deep-neuroevolution
The repository provides distributed implementations of neuroevolution algorithms for training deep neural networks in reinforcement learning, including Deep Genetic Algorithm and Evolution Strategies. It supports local and AWS cloud execution, and offers tools like Visual Inspector for NeuroEvolution and a GPU-optimized implementation. Tutorials and guides ensure smooth setup and experimentation, equipping researchers with effective AI development tools.
Practical_RL
Practical_RL is an open course on reinforcement learning that emphasizes practical, real-world solutions. Offered at HSE and YSDA, it supports English- and Russian-speaking students, both on-campus and online. The curriculum encourages curiosity-driven exploration, accepts improvements through Git contributions, and uses platforms like Google Colab and Azure Notebooks for an interactive learning experience. Covering key topics like value-based methods, model-free RL, policy gradients, POMDPs, and model-based RL, it provides both theoretical insight and practical skill development.
deep-learning-roadmap
This project offers a structured collection of deep learning resources. It's designed to guide both developers and researchers through complex topics. Resources are categorized and include papers, models, and real-world applications. Additionally, it provides access to a Python machine learning book and a Slack community for continuous learning. As an open-source initiative, it invites global collaboration and contributions.
Awesome-ChatGPT
Access detailed resources on AI and ChatGPT, highlighting recent developments, model evolutions like GPT-4, and their applications. Explore updates, studies, and technical insights, including the influence of technologies such as Baidu Wenxin Yiyan on search engines. A valuable tool for researchers, developers, and AI enthusiasts interested in AI advancements.
DeepRL
This project features a modular implementation of deep reinforcement learning algorithms using PyTorch. It seamlessly transitions from simple tasks to complex games, incorporating methods like Double DQN, A2C, and PPO. With efficient data generation and hardware optimization, it's suitable for scalable deep learning research. Support is available for robust testing environments such as Breakout and Mujoco. Discover innovative algorithmic insights and performance metrics visualized through detailed learning curves.
Online-RLHF
This project offers a detailed guide to Online Iterative RLHF, a method reported to be more effective than offline methods. The open-source workflow allows reproduction of advanced LLMs using only open-source data, achieving results on par with or better than LLaMA3-8B-instruct. It includes comprehensive setup instructions covering fine-tuning, reward modeling, data generation, and iterative training.
stable-baselines3
Stable Baselines3 provides reliable reinforcement learning implementations in PyTorch, facilitating replication and refinement in research and industry. It features advanced RL methods, customizable setups, and integrations with Tensorboard and Hugging Face. Suitable for those with reinforcement learning knowledge, it supports both beginners and experts with detailed documentation and community involvement. Ideal for boosting RL projects and experimentation.
openai_lab
Discover a comprehensive reinforcement learning framework leveraging OpenAI Gym and TensorFlow. This system offers a unified interface, essential RL algorithm implementations, and automated analytics, optimizing algorithm development. Suitable for extensive experimentation, hypothesis testing, and hyperparameter optimization, with settings stored for reproducibility. Evaluate algorithm performance across various environments using the Fitness Matrix. Start developing RL agents with provided components and look forward to future support for PyTorch.
QuantResearch
Discover resources for enhancing quantitative finance through backtesting, machine learning, and reinforcement learning. Access notebooks and blogs on topics including linear regression, Kalman filters, PCA, ARIMA, and GARCH. Leverage practical demos and tools for live trading and learn advanced techniques like hidden Markov models for option pricing, providing hands-on experience in algorithmic trading and market analysis.
Awesome-World-Model
Delve into a curated selection of papers on world models specifically for autonomous driving. This resource sheds light on the predictive features of these models, which are crucial for anticipating upcoming scenarios in autonomous systems. Open collaboration is encouraged to expand their use in practical driving environments. The repository also details participation in workshops and challenges to drive the advancement of these models, ultimately improving decision-making and interaction in autonomous vehicles.
Minigrid
Minigrid provides versatile and adaptive grid-world environments ideal for reinforcement learning, compatible with the Gymnasium API. Featuring both Minigrid and BabyAI settings, the library facilitates navigation and language instruction tasks, optimized for curriculum learning. It's available for Python on Linux and macOS, with community-driven Windows support.
DI-star
Discover a platform dedicated to the training of AI agents for StarCraft II, featuring both supervised and reinforcement learning options. It supports resource-limited training environments and offers interactive demos. This framework caters to both grandmaster level AI progression and accessible development processes, with detailed setup and test guidance. Suitable for developers aiming to enhance game AI with solid models and structured frameworks.
TensorLayer
Explore TensorLayer, a versatile deep learning library crafted for both researchers and engineers. It emphasizes flexibility, simplicity, and performance, and is compatible with frameworks like TensorFlow and PyTorch. TensorLayer offers an extensive array of neural layers, comprehensive tutorials, and applications powered by a vibrant community acknowledged by the ACM Multimedia Society. It facilitates swift development of complex AI models, supports diverse hardware, and offers both high-level and professional APIs, with rich multilingual documentation and numerous examples available.
reflexion
Investigate a novel AI approach that uses verbal reinforcement learning to boost reasoning and decision-making capabilities in language agents. The project offers source code, demonstrations, and thorough setup guidance for executing experiments related to reasoning in HotPotQA and decision-making in AlfWorld. Learn about different agent types, reflexion strategies, and explore resources like LeetcodeHardGym. Though developer access might be limited due to GPT-4 constraints and API expenses, extensive experiment logs have been made accessible for analysis. This project by Noah Shinn and his team is featured in a NeurIPS 2023 publication.
rl
TorchRL is a Python-first, efficient, and modular open-source Reinforcement Learning (RL) library specifically designed for seamless integration with the PyTorch ecosystem. It offers a versatile architecture with tools like distributed data collectors, replay buffers, and `TensorDict`, which optimize RL research. The library supports major environment libraries and provides extensive tutorials, documentation, and real-world application examples, making it an invaluable resource for implementing RL solutions across various domains.
LLM-Agent-Paper-List
This survey presents an overview of large language model (LLM) based agents in AI, examining their development and role in pursuing artificial general intelligence. It outlines a framework including brain, perception, and action components, with applications ranging from single to multi-agent settings, and human-agent interactions. The review also covers social behaviors and dynamics within LLM-based agent societies, shedding light on current advancements and future challenges. It is a useful resource for those exploring the development and future direction of LLM-based agents.
Super-mario-bros-PPO-pytorch
The project applies the Proximal Policy Optimization (PPO) algorithm to train an AI agent to play Super Mario Bros, completing 31 out of 32 levels. Building on the A3C method, this shows marked performance improvements. It allows training and testing of models with customizable learning rates for optimal results. A Dockerfile facilitates a seamless setup for training and testing, although there may be rendering issues. This framework is ideal for those exploring AI-centered game development and performance optimization.
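The core of PPO is its clipped surrogate objective, sketched below in plain Python. This is a generic illustration rather than code from the repository; the function name and the `clip_eps=0.2` default are assumptions, and a real training loop would compute the same quantity on PyTorch tensors.

```python
import math

def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximised).

    clip_eps=0.2 is an assumed default, not a value taken from the project.
    """
    terms = []
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the updated policy and the old policy
        ratio = math.exp(new_lp - old_lp)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum removes any incentive to move the ratio
        # outside the [1 - eps, 1 + eps] band
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)
```

For a positive advantage the objective stops growing once the ratio exceeds `1 + clip_eps`, which is what keeps each policy update small.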
awesome-deeplearning-resources
Explore a curated list of recent deep learning and reinforcement learning papers, organized by time to highlight the latest developments. This resource also features model zoos, pretrained models, educational courses, key software, and applications. It's an essential tool for researchers and enthusiasts to stay updated with important and popular papers in the field. Expand your knowledge with tutorials, projects, and diverse corpora to apply deep learning methods effectively.
gym-electric-motor
Use this Python toolbox to simulate and control electric motors with classical and reinforcement learning methods. It supports precise modeling of distinct motor types and control scenarios, making it suitable for engineering and research applications. Installation is straightforward, interactive notebooks are provided, and the included motor models cover PMSM, SynRM, SCIM, and DFIM.
genrl
GenRL is an actively developed PyTorch library facilitating reproducible and accessible reinforcement learning research. It features modular implementations, unified interfaces, and over 20 tutorials, all designed to support reliable algorithm development and benchmarking, seamlessly integrating with OpenAI Gym.
AlphaZero_Gomoku
The AlphaZero_Gomoku project uses the AlphaZero algorithm to train an AI for Gomoku through self-play. Because Gomoku is much simpler than Go or chess, a model can be trained quickly on a single PC. The implementation supports both TensorFlow and PyTorch, offering training flexibility. Basic setup requires Python and Numpy, while training from scratch requires additional configuration for the chosen framework. Trained model examples illustrate AI performance across various setups, and scripts can be modified for experimentation and evaluation.
cartpole
This project presents a reinforcement learning application that uses Deep Q-Learning (DQN) to solve the CartPole task. The agent must apply force to a cart on a frictionless track to keep the attached pole balanced. Details include specific hyperparameters such as a learning rate of 0.001, a batch size of 20, and the use of experience replay. The task counts as solved when the average reward reaches 195.0 over 100 trials. Learn more about performance metrics and see example trials of successful execution.
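The experience replay and the success criterion mentioned above can be sketched as follows. The batch size of 20 and the 195.0-over-100-trials threshold come from the description; the class name, the function name, and the buffer capacity are assumptions for illustration and are not taken from the project.

```python
import random
from collections import deque

BATCH_SIZE = 20        # batch size quoted in the description
SOLVED_SCORE = 195.0   # average reward threshold quoted in the description
SOLVED_WINDOW = 100    # number of trials averaged

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done)."""

    def __init__(self, capacity=10_000):  # capacity is an assumed value
        self.memory = deque(maxlen=capacity)  # oldest transitions drop off

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size=BATCH_SIZE):
        # Uniform sampling without replacement decorrelates consecutive steps
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)

def is_solved(episode_rewards):
    """True once the mean of the last 100 episode rewards reaches 195.0."""
    if len(episode_rewards) < SOLVED_WINDOW:
        return False
    recent = episode_rewards[-SOLVED_WINDOW:]
    return sum(recent) / SOLVED_WINDOW >= SOLVED_SCORE
```

A training loop would call `push` after every environment step, start sampling once `len(buffer) >= BATCH_SIZE`, and stop when `is_solved` returns `True`.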
ai_quant_trade
Discover a robust AI-driven stock trading system that offers tools and strategies for both institutions and individual investors. This platform utilizes advanced techniques, including machine learning, deep learning, reinforcement learning, and high-frequency trading. Participate in stock factor mining and sentiment analysis, while deploying real-time strategies locally or online. Includes market monitoring and stock recommendations, with support for C++/CPU/GPU deployment. Features like StructBERT market sentiment analysis aim to enhance performance and potential returns.
SMARTS
SMARTS is a simulation platform designed for multi-agent reinforcement learning and autonomous driving research. Created by Huawei Noah's Ark Lab, it emphasizes realistic and varied interactions, forming part of the broader XingTian RL platform suite. Suitable for researchers and developers focusing on autonomous driving advancements, SMARTS facilitates extensive experiments and learning in complex settings. Being open-source, it encourages community involvement and innovation. Detailed documentation is available for further insights into its features and application.
dm_control
Google DeepMind's dm_control provides essential tools for building physics-based simulations and reinforcement learning tasks using MuJoCo. It features Python bindings, customizable environments, and multi-agent tasks like soccer simulations. The package installs from PyPI, although editable-mode installation is not supported, and it works with multiple OpenGL rendering backends. Suitable for developers aiming to explore complex control tasks.
annotated_deep_learning_paper_implementations
Discover a detailed set of annotated PyTorch implementations of neural networks and deep learning algorithms. The resource is regularly updated and documented with readable notes, providing practical insights into models like Transformers, GANs, and Diffusion Models, as well as reinforcement learning methods. It suits developers interested in architectures and optimization strategies who want to deepen their understanding of deep learning.
awesome-AI-books
Discover an extensive collection of AI books and resources, spanning foundational theories to advanced topics like deep learning and quantum AI. Ideal for self-paced learning, these materials also include reinforcement learning tools. Contributions to expand this knowledge base are welcome.
openrl
OpenRL is an adaptable and efficient open-source framework meant for reinforcement learning research, encompassing tasks like single-agent, multi-agent, offline RL, and natural language processing, using PyTorch for straightforwardness and adaptability. It features a unified interface, supports various algorithms, and offers training enhancement methods. The framework connects seamlessly with vital tools like DeepSpeed and Hugging Face and caters to environments like Gymnasium and StarCraft II. Suitable for both academic and practical applications, it ensures straightforward model and data management, detailed documentation, and offers a vibrant community for engagement.
rlcard
RLCard is an open-source toolkit designed to facilitate reinforcement learning in diverse card game environments. It features user-friendly interfaces for implementing a range of algorithms. Developed by DATA Lab at Rice University and Texas A&M University, RLCard bridges the gap between AI and imperfect information games. With updated Jupyter Notebook tutorials and integration with PettingZoo, it serves as a community-driven resource for AI researchers and developers. RLCard allows for customization of game environments and algorithm training, making it a valuable tool for machine learning projects.
reinforcement-learning-an-introduction
The project provides a Python implementation based on 'Reinforcement Learning: An Introduction (2nd Edition)' by Sutton & Barto. It includes key algorithms and techniques demonstrated in the book, supported by detailed visual examples and code, covering subjects like bandit problems and gridworld. It uses Python tools such as numpy and matplotlib to deliver hands-on learning experiences for understanding reinforcement learning concepts.
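To give a flavour of the bandit material, here is a minimal epsilon-greedy agent with sample-average value estimates. It is a sketch written for this listing, not code from the repository; the class name, method names, and defaults are assumptions.

```python
import random

class EpsilonGreedyBandit:
    """Epsilon-greedy action selection with incremental sample averages."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon               # exploration probability (assumed default)
        self.q = [0.0] * n_arms              # estimated value of each arm
        self.counts = [0] * n_arms           # number of pulls per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit the best estimate
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda arm: self.q[arm])

    def update(self, arm, reward):
        # Incremental mean: Q_new = Q_old + (R - Q_old) / n
        self.counts[arm] += 1
        self.q[arm] += (reward - self.q[arm]) / self.counts[arm]
```

The incremental update keeps the running average without storing past rewards, so memory use stays constant per arm.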
Awesome-LLM-related-Papers-Comprehensive-Topics
Delve into a comprehensive collection of papers and resources on advanced LLM topics. This curated repository offers in-depth discussions on Prompt Engineering, Zero-shot Learning, Visual Language Models, and more, making it a valuable tool for researchers seeking to deepen their understanding and application of LLM advancements. Explore dynamically via our interactive Notion table.
rl-baselines3-zoo
This framework facilitates training and evaluation of RL agents with Stable Baselines3, offering scripts for hyperparameter tuning, performance evaluation, and utilizing pre-trained models. It supports benchmarking across algorithms with a comprehensive set of tuned settings for diverse environments, providing clear documentation for setup, training, and integration with tools such as Weights & Biases. Suitable for all levels of RL practitioners.
UAV-DDPG
This research presents a method for improving mobile edge computing systems by employing UAVs to offload tasks, thereby reducing processing delay. Using Deep Deterministic Policy Gradient, the study optimizes user scheduling, task offloading, and UAV dynamics. The proposed reinforcement learning algorithm addresses the problem’s complexity and offers optimal solutions in dynamic settings. Published in Wireless Networks, the research highlights measurable improvements compared to standard methods.
FinGPT
FinGPT provides open-source financial language models that support cost-efficiency and scalability in finance. It democratizes data access and allows for timely updates through fine-tuning, with reinforcement learning to improve personalized insights, setting it apart from models like BloombergGPT. Recent advancements include the FinGPT-Forecaster for improved predictive modeling in robo-advisory and sentiment analysis with high performance, showcasing its comprehensive data processing and financial market application capabilities.
awesome-decision-transformer
This repository provides a thorough and current collection of research papers on Decision Transformers, covering significant advancements from major conferences like ICML, ICLR, and NeurIPS. The Decision Transformer model, introduced by Chen et al., recasts offline reinforcement learning as sequence modeling: it conditions on the desired return instead of bootstrapping value estimates, which is intended to ease long-term credit assignment and discourage short-sighted actions while building on scalable transformer architectures. It serves as a useful resource for researchers and practitioners exploring transformer integration in reinforcement learning.
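The sequence-modeling view can be made concrete with the returns-to-go that a Decision Transformer conditions on. The sketch below only prepares the input sequence for one trajectory; the function names and the tuple-based token layout are assumptions for illustration, and no transformer is involved.

```python
def returns_to_go(rewards):
    """Undiscounted sum of rewards from each timestep to the end."""
    rtg = []
    running = 0.0
    for reward in reversed(rewards):
        running += reward
        rtg.append(running)
    rtg.reverse()
    return rtg

def build_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) per timestep.

    The ("kind", value) tuples are a hypothetical stand-in for embeddings.
    """
    tokens = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        tokens.extend([("return", g), ("state", s), ("action", a)])
    return tokens

sequence = build_sequence(["s0", "s1"], ["a0", "a1"], [1.0, 2.0])
```

At inference time the first return-to-go is set to the target return, and each observed reward is subtracted from it as the episode proceeds.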
flashbax
Flashbax offers optimized replay buffers for JAX, supporting both academic and industrial reinforcement learning applications. Its framework includes diverse buffer types, such as Flat, Trajectory, and Prioritised Buffers, emphasizing efficient memory and prioritisation. Ideal for algorithms using recurrent networks, Flashbax integrates effortlessly into various projects, enhancing speed and performance in RL environments. Discover detailed examples and benchmarks to effectively utilize Flashbax in your reinforcement learning projects.
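To show what a prioritised buffer does conceptually, here is a small pure-Python sketch of proportional prioritised sampling. It does not use or mirror the Flashbax API (which is built on JAX); the class, its methods, and the `alpha=0.6` default are all hypothetical names and values chosen for the example.

```python
import random

class PrioritisedBuffer:
    """Circular buffer that samples items in proportion to priority ** alpha."""

    def __init__(self, capacity, alpha=0.6):  # alpha=0.6 is an assumed default
        self.capacity = capacity
        self.alpha = alpha
        self.items = []
        self.priorities = []
        self.next_index = 0  # slot that the next add() writes to

    def add(self, item, priority=1.0):
        if len(self.items) < self.capacity:
            self.items.append(item)
            self.priorities.append(priority)
        else:
            # Buffer is full: overwrite the oldest slot
            self.items[self.next_index] = item
            self.priorities[self.next_index] = priority
        self.next_index = (self.next_index + 1) % self.capacity

    def probabilities(self):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        return [s / total for s in scaled]

    def sample(self, batch_size, rng=random):
        # Sampling with replacement, weighted by the scaled priorities
        indices = rng.choices(range(len(self.items)),
                              weights=self.probabilities(), k=batch_size)
        return indices, [self.items[i] for i in indices]

    def update_priority(self, index, priority):
        self.priorities[index] = priority
```

Setting `alpha=0` recovers uniform sampling, while larger values focus sampling on high-priority transitions. Practical implementations typically replace the linear scan with a sum-tree for efficiency.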