# Autonomous Driving
DriveLM
This article explores the application of Graph Visual Question Answering (Graph VQA) to autonomous driving systems, particularly for the project's 2024 challenge. It uses datasets built on nuScenes and CARLA to develop a VLM-based baseline that combines Graph VQA with end-to-end driving. The project seeks to mimic human reasoning while driving, offering a holistic framework for perception, prediction, and planning. By merging language models with autonomous systems, it enables explainable planning and improved decision-making in self-driving vehicles, and the article covers the project's methodology and its impact on the field.
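To make the graph-structured QA idea concrete, here is a minimal, self-contained sketch (plain Python, hypothetical field names) of how perception, prediction, and planning questions can be chained as nodes in a reasoning graph. It illustrates the general formulation only, not DriveLM's actual data schema.

```python
from dataclasses import dataclass, field

@dataclass
class QANode:
    """One question-answer pair tied to a driving stage (hypothetical schema)."""
    qid: str
    stage: str            # "perception" | "prediction" | "planning"
    question: str
    answer: str
    children: list = field(default_factory=list)  # downstream reasoning steps

def build_scene_graph():
    # Toy example: perception feeds prediction, which feeds planning.
    q1 = QANode("q1", "perception", "What objects are ahead?", "A pedestrian at the crosswalk.")
    q2 = QANode("q2", "prediction", "Will the pedestrian cross?", "Yes, within a couple of seconds.")
    q3 = QANode("q3", "planning", "What should the ego vehicle do?", "Slow down and yield.")
    q1.children.append(q2)
    q2.children.append(q3)
    return q1

def traverse(node, depth=0):
    # Depth-first walk mimics the chained reasoning a VLM baseline would follow.
    print("  " * depth + f"[{node.stage}] {node.question} -> {node.answer}")
    for child in node.children:
        traverse(child, depth + 1)

if __name__ == "__main__":
    traverse(build_scene_graph())
```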
UniTR
UniTR, a unified multi-modal transformer, advances 3D object detection and BEV map segmentation on the nuScenes dataset. It employs an effective weight-sharing scheme that integrates camera and LiDAR data seamlessly for autonomous driving applications. Notable for its minimal dependencies and low inference latency, it sets new standards in 3D perception.
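The weight-sharing idea can be illustrated with a toy PyTorch block in which a single transformer layer processes camera and LiDAR tokens distinguished only by a learned modality embedding. The class name, dimensions, and layer choices below are assumptions for illustration and do not reflect UniTR's actual implementation.

```python
import torch
import torch.nn as nn

class SharedModalityBlock(nn.Module):
    """Minimal sketch of weight sharing: one transformer block processes both
    camera and LiDAR tokens (names and sizes are illustrative, not UniTR's)."""
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        # Learned modality embeddings tell the shared block which sensor a token came from.
        self.modality_embed = nn.Embedding(2, dim)  # 0 = camera, 1 = LiDAR

    def forward(self, cam_tokens, lidar_tokens):
        dev = cam_tokens.device
        cam = cam_tokens + self.modality_embed(torch.zeros(cam_tokens.shape[:2], dtype=torch.long, device=dev))
        lid = lidar_tokens + self.modality_embed(torch.ones(lidar_tokens.shape[:2], dtype=torch.long, device=dev))
        fused = torch.cat([cam, lid], dim=1)          # single sequence, single set of weights
        return self.block(fused)

# Usage: batch of 2, 100 camera tokens and 200 LiDAR tokens, 128-d features.
cam = torch.randn(2, 100, 128)
lidar = torch.randn(2, 200, 128)
print(SharedModalityBlock()(cam, lidar).shape)  # torch.Size([2, 300, 128])
```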
Mamba-in-CV
This collection of Mamba-focused computer vision projects highlights recent developments including human activity recognition, anomaly detection, and autonomous driving. It explores the capabilities of state space models as alternatives to transformers, providing links to detailed papers and code. Ideal for researchers and practitioners interested in visual state space models.
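For readers new to state space models, the sketch below shows the basic discretized recurrence they build on. Real Mamba layers use input-dependent (selective) parameters and a hardware-aware parallel scan, so this sequential PyTorch loop is only a conceptual simplification with illustrative dimensions.

```python
import torch

def ssm_scan(u, A, B, C):
    """Minimal discrete state space recurrence x_t = A x_{t-1} + B u_t, y_t = C x_t.
    Simplified: actual Mamba uses selective, input-dependent parameters and a
    parallel scan; this loop just shows the underlying idea."""
    batch, length, _ = u.shape
    d_state = A.shape[0]
    x = torch.zeros(batch, d_state)
    ys = []
    for t in range(length):
        x = x @ A.T + u[:, t, :] @ B.T      # state update
        ys.append(x @ C.T)                  # readout
    return torch.stack(ys, dim=1)           # (batch, length, d_out)

# Toy usage: 4-d state, 8-d input, 8-d output, sequence of 16 steps.
A = 0.9 * torch.eye(4)
B = torch.randn(4, 8) * 0.1
C = torch.randn(8, 4) * 0.1
print(ssm_scan(torch.randn(2, 16, 8), A, B, C).shape)  # torch.Size([2, 16, 8])
```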
ChatSim
Discover how collaborative LLM agents enhance autonomous driving simulation by enabling editable scene rendering. This project leverages 3D Gaussian splatting to drastically speed up background rendering, while an improved Blender pipeline allows rapid foreground rendering. Supported by OpenAI and NVIDIA AI, the project delivers improved rendering quality and speed, making it a pivotal tool for autonomous driving development.
Awesome-World-Model
Delve into a curated selection of papers on world models specifically for autonomous driving. This resource sheds light on the predictive features of these models, which are crucial for anticipating upcoming scenarios in autonomous systems. Open collaboration is encouraged to expand their use in practical driving environments. The repository also details participation in workshops and challenges to drive the advancement of these models, ultimately improving decision-making and interaction in autonomous vehicles.
SMARTS
SMARTS is a simulation platform designed for multi-agent reinforcement learning and autonomous driving research. Created by Huawei Noah's Ark Lab, it emphasizes realistic and varied interactions, forming part of the broader XingTian RL platform suite. Suitable for researchers and developers focusing on autonomous driving advancements, SMARTS facilitates extensive experiments and learning in complex settings. Being open-source, it encourages community involvement and innovation. Detailed documentation is available for further insights into its features and application.
Vision-Centric-BEV-Perception
This article offers a detailed examination of vision-centric bird's-eye-view (BEV) perception technologies for autonomous driving. It discusses geometry-based, depth-based, and network-based approaches, including multi-layer perceptron (MLP) and transformer-based methods, and their use in object detection and semantic segmentation. The survey covers datasets, benchmarking outcomes, and the historical development of the field. It also addresses multi-task learning and fusion strategies, emphasizing advances in multi-modality fusion for improved 3D object detection, providing valuable insight into current BEV perception technologies.
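As a rough illustration of the depth-based family of methods, the sketch below back-projects camera pixels into 3D using per-pixel depth and camera intrinsics, then rasterizes them into a BEV grid. Production approaches splat learned features and depth distributions rather than raw point counts, so treat this as a minimal conceptual example; the ranges and resolution are assumptions.

```python
import numpy as np

def pixels_to_bev(depth, K, x_range=(-50, 50), y_range=(0, 100), res=0.5):
    """Bare-bones depth-based lifting: back-project each pixel using its depth
    and the camera intrinsics K, then rasterize the points into a BEV occupancy
    grid. Real methods splat learned features and depth distributions; this
    sketch only counts points per cell."""
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel()
    # Back-project: p_cam = depth * K^{-1} [u, v, 1]^T
    pix = np.stack([us.ravel() * z, vs.ravel() * z, z], axis=0)
    pts = np.linalg.inv(K) @ pix                      # (3, N) camera-frame points
    x, y = pts[0], pts[2]                             # lateral, forward (camera z = depth)
    nx = int((x_range[1] - x_range[0]) / res)
    ny = int((y_range[1] - y_range[0]) / res)
    bev = np.zeros((ny, nx))
    ix = ((x - x_range[0]) / res).astype(int)
    iy = ((y - y_range[0]) / res).astype(int)
    keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    np.add.at(bev, (iy[keep], ix[keep]), 1)
    return bev

# Toy usage with a flat 20 m depth map and a simple pinhole camera.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
print(pixels_to_bev(np.full((480, 640), 20.0), K).shape)  # (200, 200)
```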
S3Gaussian
Discover S3Gaussian's approach to self-supervised street scene modeling for autonomous driving, which represents scenes with 3D Gaussians and dispenses with 3D bounding box annotations. It features a hexplane-based encoder and a multi-head Gaussian decoder for high-quality scene rendering. Compatible with Ubuntu, Python, and PyTorch, this open-source project provides extensive tools for training and visualization, demonstrating advances in modeling dynamic environments without extra supervision.
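The multi-head decoder idea can be sketched as a small PyTorch module that maps a queried spatio-temporal feature (for example, one looked up from a hexplane encoder) to per-Gaussian attributes. The head names and dimensions below are assumptions for illustration and are not S3Gaussian's actual interface.

```python
import torch
import torch.nn as nn

class GaussianDecoderSketch(nn.Module):
    """Illustrative multi-head decoder producing per-Gaussian attributes from a
    feature vector; head names and sizes are assumptions, not S3Gaussian's API."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.pos_head = nn.Linear(feat_dim, 3)        # xyz offset for scene dynamics
        self.opacity_head = nn.Linear(feat_dim, 1)    # opacity logit
        self.color_head = nn.Linear(feat_dim, 3)      # RGB
        self.scale_head = nn.Linear(feat_dim, 3)      # anisotropic scale (log-space)

    def forward(self, feats):
        return {
            "delta_xyz": self.pos_head(feats),
            "opacity": torch.sigmoid(self.opacity_head(feats)),
            "rgb": torch.sigmoid(self.color_head(feats)),
            "scale": torch.exp(self.scale_head(feats)),
        }

# Usage: features for 1,000 Gaussians queried at one timestep.
out = GaussianDecoderSketch()(torch.randn(1000, 64))
print({k: tuple(v.shape) for k, v in out.items()})
```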
InterFuser
InterFuser enhances autonomous driving safety through interpretable sensor fusion, integrating multi-modal sensor data for a comprehensive view of the scene. The method exposes intermediate interpretable semantics that are used to constrain actions within safe bounds, setting new standards on the CARLA AD Leaderboard. The repository offers complete setup guidance, including dataset configuration, model training, and evaluation instructions, with pre-trained weights available for immediate use.
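The "constrain actions to safe bounds" idea can be illustrated with a toy safety clamp that caps throttle when a predicted object-density map shows obstacles ahead of the ego vehicle. This is a hypothetical illustration of the concept, not InterFuser's actual controller; the function name, thresholds, and window are assumptions.

```python
import numpy as np

def constrain_throttle(throttle, object_density, danger_threshold=0.3, max_safe=0.2):
    """Illustrative safety clamp (not InterFuser's controller): if the predicted
    object-density map shows obstacles in a rough window ahead of the ego
    vehicle, cap the planner's throttle command."""
    h, w = object_density.shape
    ahead = object_density[: h // 2, w // 3 : 2 * w // 3]   # crude "in front of ego" region
    if ahead.max() > danger_threshold:
        return min(throttle, max_safe)
    return throttle

# Toy usage: an empty scene vs. one with a predicted object straight ahead.
empty = np.zeros((20, 20))
blocked = empty.copy()
blocked[2, 10] = 0.9
print(constrain_throttle(0.8, empty), constrain_throttle(0.8, blocked))  # 0.8 0.2
```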
Autonomous-Driving-in-Carla-using-Deep-Reinforcement-Learning
This project uses CARLA simulation and Deep Reinforcement Learning with Proximal Policy Optimization to improve autonomous driving capabilities. It trains agents in hyper-realistic urban environments, leveraging a Variational Autoencoder for efficient learning. By focusing on continuous state and action spaces, it aims to provide reliable autonomous navigation on predetermined routes, offering a comprehensive end-to-end driving solution.
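Below is a minimal PyTorch sketch of that pipeline, under assumed sizes and layer choices rather than the project's exact architecture: a VAE encoder compresses a camera frame into a compact latent state, and a Gaussian policy head over continuous steer and throttle provides the action distribution that PPO would optimize.

```python
import torch
import torch.nn as nn

class VAEEncoder(nn.Module):
    """Toy convolutional encoder producing a compact latent state from a camera
    frame (sizes are illustrative, not the project's exact architecture)."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.mu = nn.Linear(32, latent_dim)
        self.logvar = nn.Linear(32, latent_dim)

    def forward(self, img):
        h = self.conv(img)
        mu, logvar = self.mu(h), self.logvar(h)
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterised sample

class ContinuousPolicy(nn.Module):
    """Gaussian policy over [steer, throttle], as PPO needs for continuous actions."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 64), nn.Tanh(), nn.Linear(64, 2))
        self.log_std = nn.Parameter(torch.zeros(2))

    def forward(self, latent):
        mean = torch.tanh(self.net(latent))           # keep action means in [-1, 1]
        return torch.distributions.Normal(mean, self.log_std.exp())

# Usage: one 128x128 RGB observation -> latent -> sampled continuous action.
frame = torch.randn(1, 3, 128, 128)
latent = VAEEncoder()(frame)
action = ContinuousPolicy()(latent).sample()
print(action.shape)  # torch.Size([1, 2])
```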
mahalanobis_3d_multi_object_tracking
The project presents a probabilistic approach to 3D multi-object tracking aimed at improving accuracy in autonomous driving systems. Using MEGVII detection inputs, the method surpasses the AB3DMOT baseline and earned first place in the nuScenes Tracking Challenge. It combines Kalman filter covariance estimation with Mahalanobis-distance data association, improving tracking accuracy, particularly for small objects such as pedestrians. Open-source code and setup guidelines are available for developers interested in replicating or extending these results.
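The data-association step can be illustrated with a simplified NumPy/SciPy sketch that computes squared Mahalanobis distances from each track's covariance and matches tracks to detections with a gated Hungarian assignment. The state layout, covariances, and gate value are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def mahalanobis_cost(track_means, track_covs, detections):
    """Squared Mahalanobis distance between each predicted track state and each
    detection, using the track's covariance S (a simplified stand-in for the
    full Kalman innovation covariance)."""
    cost = np.zeros((len(track_means), len(detections)))
    for i, (mean, S) in enumerate(zip(track_means, track_covs)):
        S_inv = np.linalg.inv(S)
        for j, det in enumerate(detections):
            diff = det - mean
            cost[i, j] = diff @ S_inv @ diff
    return cost

def associate(track_means, track_covs, detections, gate=9.49):
    """Hungarian matching on the Mahalanobis cost with a chi-square gate
    (9.49 is roughly the 95% quantile for 4 DoF; the gate value is illustrative)."""
    cost = mahalanobis_cost(track_means, track_covs, detections)
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < gate]

# Toy usage: two tracks and two detections in a (x, y, z, yaw) state.
tracks = [np.array([0.0, 0, 0, 0]), np.array([10.0, 5, 0, 1.5])]
covs = [np.eye(4) * 0.5, np.eye(4) * 0.5]
dets = [np.array([9.8, 5.1, 0, 1.4]), np.array([0.2, -0.1, 0, 0.05])]
print(associate(tracks, covs, dets))  # [(0, 1), (1, 0)]
```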
Feedback Email: [email protected]