RealDriveSim: A Realistic Multi-Modal Multi-Task Synthetic Dataset for Autonomous Driving

Arpit Jadon*, Haoran Wang, Phillip Thomas, Michael Stanley, S. Nathaniel Cibik, Rachel Laurat, Omar Maher, Lukas Hoyer, Ozan Unal*, Dengxin Dai


Abstract

As perception models continue to develop, the need for large-scale datasets increases. However, data annotation remains far too expensive to scale effectively and meet the demand. Synthetic datasets offer a way to boost model performance at substantially reduced cost. Yet current synthetic datasets remain limited in scope and realism and are designed for specific tasks and applications. In this work, we present RealDriveSim, a realistic multi-modal synthetic dataset for autonomous driving that supports not only popular 2D computer vision applications but also their LiDAR counterparts, providing fine-grained annotations for up to 64 classes. We extensively evaluate our dataset across a wide range of applications and domains, demonstrating state-of-the-art results compared to existing synthetic benchmarks. The dataset is publicly available at https://realdrivesim.github.io/

Comparison with Existing Synthetic Datasets

Dataset Camera LiDAR
Camera columns: Adverse-W, 2D Seg., Sem-Cls., 2D Det., 3D Det., Depth, Optical-Flow, MOT
LiDAR columns: 3D Det., 3D Seg., Sem-Cls., Scene-Flow, MOT, SLAM
SYNTHIA 22 - - - - - -
GTA-V 19 - - - - - -
VIPER 32 - - - - - -
Synscapes 19 - - - - - -
SHIFT 23 -
PreSIL 12 12
SynLIDAR - - - - - - - - 32
RealDriveSim 61 64

Download Dataset
Label Mappings

Normal Weather - Full

Modality Download (File Size)
2D Modalities
Images (RGB) 367.42 GB
2D Object Detection 1.78 GB
2D Semantic Segmentation 7.75 GB
2D Instance Segmentation 2.54 GB
Depth Maps 605.66 GB
2D Motion Vectors / Optical Flow 91.54 GB
3D Modalities
3D Point Clouds 430.33 GB
3D Object Detection 62 GB
3D Semantic Segmentation 3.53 GB
3D Instance Segmentation 1.29 GB
3D Motion Vectors / Scene Flow 96.95 GB
Other
Calibration Data 6 MB

Normal Weather - Sampled
(Every 5th Frame Uniformly Sampled from Full Sequences)

Modality Download (File Size)
2D Modalities
Images (RGB) 73.52 GB
2D Object Detection 367.51 MB
2D Semantic Segmentation 1.55 GB
2D Instance Segmentation 522.26 MB
Depth Maps 121.13 GB
2D Motion Vectors / Optical Flow 19.18 GB
3D Modalities
3D Point Clouds 86.09 GB
3D Object Detection 12.40 GB
3D Semantic Segmentation 725.52 MB
3D Instance Segmentation 268.09 MB
3D Motion Vectors / Scene Flow 19.60 GB
Other
Calibration Data 6 MB

Adverse Weather [Batch 1]

Modality Download (File Size)
2D Modalities
Images (RGB) 11.54 GB
2D Object Detection 74.15 MB
2D Semantic Segmentation 323.89 MB
2D Instance Segmentation 107.68 MB
Depth Maps 24.38 GB
2D Motion Vectors / Optical Flow 3.75 GB
3D Modalities
3D Point Clouds 6.16 GB
3D Object Detection 2.35 GB
3D Semantic Segmentation 75.96 MB
3D Instance Segmentation 28.14 MB
3D Motion Vectors / Scene Flow 1.46 GB
Other
Calibration Data 248 KB

Adverse Weather [Batch 2]

Modality Download (File Size)
2D Modalities
Images (RGB) 4.07 GB
2D Object Detection 25.62 MB
2D Semantic Segmentation 119.61 MB
2D Instance Segmentation 37.79 MB
Depth Maps 8.57 GB
2D Motion Vectors / Optical Flow 1.42 GB
3D Modalities
3D Point Clouds 2.24 GB
3D Object Detection 820.03 MB
3D Semantic Segmentation 27.68 MB
3D Instance Segmentation 9.66 MB
3D Motion Vectors / Scene Flow 554.15 MB
Other
Calibration Data 88 KB
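
Before downloading, it can help to estimate the disk space a given experiment needs. The following is a minimal sketch, not an official tool: it sums the listed file sizes for the image and 2D semantic segmentation archives across the three subsets that make up the combined dataset. The helper names `parse_size` and `total_bytes` are hypothetical, and decimal units (1 GB = 10^9 bytes) are an assumption, since the page does not say which convention the sizes follow.

```python
# Assumption: decimal units. The page does not state whether GB means 10^9 or 2^30 bytes.
UNIT_BYTES = {"KB": 10**3, "MB": 10**6, "GB": 10**9}


def parse_size(text):
    """Convert a size string such as '323.89 MB' into a number of bytes."""
    value, unit = text.split()
    return float(value) * UNIT_BYTES[unit]


# File sizes copied from the download tables (only two modalities shown here).
SIZES = {
    "Normal Weather - Sampled": {
        "Images (RGB)": "73.52 GB",
        "2D Semantic Segmentation": "1.55 GB",
    },
    "Adverse Weather [Batch 1]": {
        "Images (RGB)": "11.54 GB",
        "2D Semantic Segmentation": "323.89 MB",
    },
    "Adverse Weather [Batch 2]": {
        "Images (RGB)": "4.07 GB",
        "2D Semantic Segmentation": "119.61 MB",
    },
}


def total_bytes(sizes, modalities):
    """Sum the sizes of the chosen modalities over every subset in `sizes`."""
    return sum(
        parse_size(per_subset[modality])
        for per_subset in sizes.values()
        for modality in modalities
    )


total = total_bytes(SIZES, ["Images (RGB)", "2D Semantic Segmentation"])
print(f"Estimated download: {total / 10**9:.2f} GB")
```

Adding further modalities only requires extending the `SIZES` dictionary with the corresponding rows from the tables.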

Dataset Summary

Dataset Name No. of Sequences No. of Frames Remarks
Normal Weather – Full 6,343 126,680 Contains clear-weather images (including cloudy and overcast conditions) taken at different times of day.
Normal Weather – Sampled 6,343 25,372 Smaller version of the full normal weather dataset with every 5th frame uniformly sampled from each full sequence.
Adverse Weather – Batch 1 258 5,160 Contains foggy, night, and rainy scenes
Adverse Weather – Batch 2 90 1,800 Contains foggy, night, and rainy scenes
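
The sampled set keeps every 5th frame of each sequence, which works out to 25,372 / 6,343 = 4 frames per sequence. The sketch below illustrates that kind of uniform subsampling. It is an illustration under assumptions rather than the authors' script: the function name `sample_every_nth`, the frame names, and the starting offset of 0 are all hypothetical, since the page does not say which frame the sampling starts from. The 20-frame example length matches the adverse-weather batches (5,160 / 258 = 20 frames per sequence).

```python
def sample_every_nth(frames, step=5, offset=0):
    """Return every `step`-th item of an ordered frame list, starting at `offset`.

    Assumption: `offset=0` (keep the first frame); the page does not specify this.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(frames)[offset::step]


# Hypothetical frame names for a single 20-frame sequence.
sequence = [f"frame_{i:04d}" for i in range(20)]
sampled = sample_every_nth(sequence)
print(sampled)  # 4 of the 20 frames are kept
```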

Note 1: The experiments in our paper were conducted using a combined dataset consisting of Normal Weather – Sampled Dataset, Adverse Weather [Batch 1], and Adverse Weather [Batch 2]. If the full Normal Weather dataset is too large for your needs, we recommend using the same sampled version as used in our experiments.

Note 2: Compared to the dataset used in the paper, we have added one additional sequence each to Adverse Weather Batch 1 and Batch 2.

Note 3: PD-SDK is no longer available or needed; the mappings above include all classes and IDs, so please use them directly.

RealDriveSim Label Names → Label IDs

Label Name Label ID
Animal 0
Bicycle 1
Bicyclist 2
Building 3
Bus 4
Car 5
Caravan/RV 6
ConstructionVehicle 7
CrossWalk 8
Fence 9
HorizontalPole 10
LaneMarking 11
LimitLine 12
Motorcycle 13
Motorcyclist 14
OtherDriveableSurface 15
OtherFixedStructure 16
OtherMovable 17
OtherRider 18
Overpass/Bridge/Tunnel 19
OwnCar(EgoCar) 20
ParkingMeter 21
Pedestrian 22
Railway 23
Road 24
RoadBarriers 25
RoadBoundary(Curb) 26
RoadMarking 27
SideWalk 28
Sky 29
TemporaryConstructionObject 30
Terrain 31
TowedObject 32
TrafficLight 33
TrafficSign 34
Train 35
Truck 36
Vegetation 37
VerticalPole 38
WheeledSlow 39
LaneMarkingOther 40
LaneMarkingGap 41
Fence(Transparent) 42
StaticObject(Trashcan) 43
Vegetation(Bush) 44
OtherPole 45
Powerline 46
SchoolBus 47
ParkingLot 48
RoadMarkingSpeed 49
Vegetation(GroundCover) 50
Vegetation(Grass) 51
Vegetation(Tree) 52
Debris 53
RoadBoundary(CurbFlat) 54
LaneMarking(Parking) 55
LaneMarking(ParkingIndicator) 56
RoadMarkingArrows 57
RoadMarkingBottsDots 58
StopLine 59
ChannelizingDevice 60
LaneMarkingSpan 61
StaticObject(BikeRack) 62
ParkingSpot 63
RoadBoundary(CurbTop) 64
RoadBoundary(CurbSide) 65
RoadBoundary(CurbRoadLevel) 66
Water 67
GuardRail 68
Wall 69
WheelStopper 70
ParkingMarker 71
Van 103
ConstructionVehicle(Truck) 104
DiffusionPrimitive 200
Custom 201
Parking Gap 202
Multipath(Noise) 225
ThermalNoise(Noise) 226
Fog(Noise) 227
Rain(Noise) 228
TireSplash(Noise) 229
Void 255
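
Because the label IDs are sparse (0 to 71, then 103, 104, 200 to 202, 225 to 229, and 255), training code typically needs to remap them to a contiguous range. The sketch below shows one possible way to do this with a 256-entry lookup table. It rests on several assumptions that the page does not confirm: that label masks are available as one byte per pixel or point, that the grouping in `TRAIN_CLASSES` suits your task (it is purely hypothetical), and that 255 serves as the ignore index. Only the label names and IDs are taken from the table above.

```python
# Subset of the label table above (name -> ID); extend as needed.
REALDRIVESIM_IDS = {
    "Building": 3, "Bus": 4, "Car": 5, "Pedestrian": 22, "Road": 24,
    "SideWalk": 28, "Sky": 29, "Vegetation": 37, "Vegetation(Bush)": 44,
    "SchoolBus": 47, "Vegetation(Tree)": 52, "Van": 103, "Void": 255,
}

# Hypothetical training taxonomy: source labels merged into each training class.
TRAIN_CLASSES = {
    "road": ["Road"],
    "sidewalk": ["SideWalk"],
    "building": ["Building"],
    "vegetation": ["Vegetation", "Vegetation(Bush)", "Vegetation(Tree)"],
    "sky": ["Sky"],
    "person": ["Pedestrian"],
    "car": ["Car", "Van"],
    "bus": ["Bus", "SchoolBus"],
}

IGNORE_INDEX = 255  # assumption: unmapped labels are ignored during training


def build_lookup(train_classes, source_ids, ignore_index=IGNORE_INDEX):
    """Build a 256-byte table mapping each source ID to a training ID."""
    table = bytearray([ignore_index] * 256)
    for train_id, names in enumerate(train_classes.values()):
        for name in names:
            table[source_ids[name]] = train_id
    return bytes(table)


def remap(mask_bytes, table):
    """Remap a flat mask of one-byte label IDs through the lookup table."""
    return mask_bytes.translate(table)


table = build_lookup(TRAIN_CLASSES, REALDRIVESIM_IDS)
# Road, Car, Van, Void, and Animal (which is not in the taxonomy).
example_mask = bytes([24, 5, 103, 255, 0])
print(list(remap(example_mask, table)))
```

Any label left out of `TRAIN_CLASSES`, including the noise classes 225 to 229, falls through to the ignore index in this sketch.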
