GPUDrive: Multi-Agent Reinforcement Learning Driving Simulator

Ran multi-agent driving simulations using GPUDrive, training IPPO agents and comparing CPU vs GPU performance on the Waymo Mini dataset.

Screenshot of the simulator (left) and training time comparison between GPUDrive and Nocturne (right).

Reinforcement learning termination conditions illustration (left) and GPUDrive simulator video (right).

GPU vs CPU: Steps per second vs number of worlds. GPU performance is exponentially better and was able to handle 125 worlds compared to 50 in CPU.

GPU vs CPU: Success Rate vs number of worlds. GPU success rate is a bit better, but was able to handle 125 worlds compared to 50 in CPU.

GPU vs CPU: Real elapsed time in seconds vs number of worlds, comparison graph.

GPUDrive Project Details

GPUDrive is a high-performance, data-driven simulator designed to support large-scale multi-agent autonomous driving research by combining real-world driving data with massively parallel simulation. The framework is capable of running over one million simulation steps per second while maintaining a lightweight memory footprint, enabling thousands of environments and hundreds of agents to be simulated simultaneously. In the original paper, agents are trained using Independent Proximal Policy Optimization (IPPO), where each agent learns its own policy independently without communication, allowing efficient scaling across large datasets such as the Waymo Open Motion Dataset. GPUDrive addresses common simulation bottlenecks through techniques such as polyline decimation, bounding volume hierarchies for collision filtering, and dynamic memory allocation.

In this project, we reproduced and experimented with GPUDrive by configuring the simulator using the Waymo Mini Dataset and implementing a custom Python training pipeline. We trained agents using IPPO and evaluated performance under varying levels of parallelism by changing the number of simulated worlds, ranging from small-scale setups to 100 parallel environments. Our experiments compared CPU and GPU execution to measure resource trade-offs, focusing on steps per second, training time, and success rates. Throughout the setup process, we encountered challenges related to limited documentation, unresolved GitHub issues, and CUDA compatibility, particularly on Windows-based systems, ultimately achieving stable execution on Ubuntu.

GPUDrive achieves a 200 - 300x training speedup, solving 10 scenarios in less than 3 minutes compared to approximately 10 hours in Nocturne.

from the GPUDrive paper

GPUDrive: Data-Driven, Multi-Agent Driving Simulation at 1 Million FPS

The results showed that GPU execution significantly outperformed CPU execution, with steps per second increasing rapidly as parallel environments scaled. While success rates were only slightly higher on GPU, the dramatic reduction in training time made GPU-based simulation far more efficient overall.

Project information

Category Research
Technologies Python, C++, CUDA Toolkit, CMake, Ubuntu/Linux, Waymo Open Motion Dataset, Gymnasium (Torch & JAX environments)
Skills GPU computing and benchmarking, multi‑agent reinforcement learning simulation, environment configuration and debugging, performance analysis, experimental evaluation
Project date November 2025 - December 2025
Final Presentation Final Presentation Download
Project URL GitHub Link