Safe-Policy-Optimization
Safe Policy Optimization (SafePO) offers a unified framework for assessing algorithms across a range of safe reinforcement learning settings. It emphasizes correctness and reliability, and it is designed so that new algorithms can be added easily. Features include detailed logging and visualization through TensorBoard and WandB, along with comprehensive documentation. SafePO is intended for researchers who need to benchmark algorithms in both single-agent and multi-agent scenarios, and the project encourages ethical use.