Catalogue Search | MBRL

FTO-Sim: an open-source simulation framework for evaluating cooperative perception in urban areas

by Bogenberger, Klaus; Ilic, Mario; Niels, Tanja in Automotive Engineering, Civil Engineering, Classification

2026

In urban areas, static and dynamic occlusions frequently obstruct the field of view and impair the reliable detection of vulnerable road users (VRUs), posing a major challenge to the safe deployment of connected and automated vehicles in complex urban traffic. To address this, this paper presents an open-source simulation framework for evaluating cooperative perception under realistic occlusion conditions, tailored to assessing VRU safety in urban traffic. The framework uses SUMO for microscopic traffic simulation and a Python-based ray-tracing module, enabling explicit modeling, visualization, and evaluation of occlusion effects without requiring complex co-simulation frameworks. In addition to conventional floating car observers, the framework introduces a novel observer type, the floating bike observer, extending the scope of cooperative perception studies. Several evaluation metrics are implemented, including relative visibility, level of visibility, and VRU-specific detection rates, which enable the systematic assessment of spatial perception coverage, detection reliability, and safety in critical road user interactions. The framework is fully open-source and complemented by simulation examples that replicate published studies, ensuring transparent validation of previous results while providing a basis for adapting it to new research questions in cooperative perception.

Journal Article

Learning Ego-Centric BEV Representations from a Perspective-Privileged View: Cross-View Supervision for Online HD Map Construction

by Bogenberger, Klaus; Lengerer, Daniel; Pechinger, Mathias in Cameras, High definition, Inference

2026

Bird's-eye-view (BEV) representations derived from multi-camera input have become a central interface for online high-definition (HD) map construction. However, most approaches rely solely on ego-centric supervision, requiring large-scale scene structure to be inferred from incomplete observations, occlusions, and diminishing information density at long range, where perspective effects and spatial sparsity hinder consistent structural reasoning. We introduce Cross-View Supervision (CVS), a representation learning paradigm that transfers geometric and topological priors from an ego-aligned overhead perspective into camera-based BEV encoders. Rather than adding auxiliary semantic losses, CVS aligns representations in a shared BEV feature space and distills globally consistent structural knowledge from a perspective-privileged teacher into the ego-centric backbone. This supervision enhances structural coherence without modifying the inference architecture or requiring overhead input at test time. Experiments on nuScenes using ego-aligned aerial imagery from the AID4AD cross-view extension demonstrate consistent improvements over StreamMapNet while maintaining identical camera-only inference. CVS yields +3.9 mAP in the standard 60 × 30 m region and +9.9 mAP in the extended 100 × 50 m setting, corresponding to a 44% relative gain at long range. These results highlight perspective-privileged structural supervision as a promising training principle for improving BEV representation learning in HD map construction.

Paper

AID4AD: Aerial Image Data for Automated Driving Perception

by Bogenberger, Klaus; Lengerer, Daniel; Pechinger, Mathias in Aerial photography, Alignment, Automation

2025

This work investigates the integration of spatially aligned aerial imagery into perception tasks for automated vehicles (AVs). As a central contribution, we present AID4AD, a publicly available dataset that augments the nuScenes dataset with high-resolution aerial imagery precisely aligned to its local coordinate system. The alignment is performed using SLAM-based point cloud maps provided by nuScenes, establishing a direct link between the aerial data and the nuScenes local coordinate system. To ensure spatial fidelity, we propose an alignment workflow that corrects for localization and projection distortions. A manual quality control process further refines the dataset by identifying a set of high-quality alignments, which we publish as ground truth to support future research on automated registration. We demonstrate the practical value of AID4AD in two representative tasks: in online map construction, aerial imagery serves as a complementary input that improves the mapping process; in motion prediction, it functions as a structured environmental representation that replaces high-definition maps. Experiments show that aerial imagery leads to a 15-23% improvement in map construction accuracy and a 2% gain in trajectory prediction performance. These results highlight the potential of aerial imagery as a scalable and adaptable source of environmental context in automated vehicle systems, particularly in scenarios where high-definition maps are unavailable, outdated, or costly to maintain. AID4AD, along with evaluation code and pretrained models, is publicly released to foster further research in this direction: https://github.com/DriverlessMobility/AID4AD.

Paper

TUM2TWIN: Introducing the Large-Scale Multimodal Urban Digital Twin Benchmark Dataset

by Holst, Christoph; Zhu, Jingwei; Wang, Jiapan in Benchmarks, Data acquisition, Data integration

2025

Urban Digital Twins (UDTs) have become essential for managing cities and integrating complex, heterogeneous data from diverse sources. Creating UDTs involves challenges at multiple process stages, including acquiring accurate 3D source data, reconstructing high-fidelity 3D models, keeping models up to date, and ensuring seamless interoperability with downstream tasks. Current datasets are usually limited to one part of the processing chain, hampering comprehensive UDT validation. To address these challenges, we introduce the first comprehensive multimodal Urban Digital Twin benchmark dataset: TUM2TWIN. This dataset includes georeferenced, semantically aligned 3D models and networks along with various terrestrial, mobile, aerial, and satellite observations, comprising 32 data subsets over roughly 100,000 m² and currently 767 GB of data. By ensuring georeferenced indoor-outdoor acquisition, high accuracy, and multimodal data integration, the benchmark supports robust analysis of sensors and the development of advanced reconstruction methods. Additionally, we explore downstream tasks demonstrating the potential of TUM2TWIN, including novel view synthesis with NeRF and Gaussian Splatting, solar potential analysis, point cloud semantic segmentation, and LoD3 building reconstruction. We are convinced this contribution lays a foundation for overcoming current limitations in UDT creation, fostering new research directions and practical solutions for smarter, data-driven urban environments. The project is available under: https://tum2t.win

Paper
