
Session D6: Navigation Using Environmental Features

Robust 3D Map-Matching with Visual Environment Features for Neural City Maps
Mira Partha, Daniel Neamati, Shubh Gupta, and Grace Gao, Stanford University
Alternate Number 1

In GNSS-degraded environments such as urban canyons, vision-based localization systems can be a critical method for obtaining accurate and robust positioning solutions. Typically, visual localization systems compute vehicle position estimates by comparing images collected by cameras onboard the vehicle to a map of the environment. The choice of map representation is thus very important for safe and accurate localization. In our previous work, we proposed photorealistic 3D representations of urban environments built atop Neural Radiance Fields (NeRFs), a paradigm which we termed ‘Neural City Maps.’ However, due to the complex and highly varying nature of urban environments, many challenges remain in developing successful visual localization systems using Neural City Maps. The core component for any visual localization system is map-matching, that is, the method for matching real-world camera images to a map of the environment. In this work, we develop a novel grid sampling-based method for characterizing the optimization landscapes of different visual environment features used for neural map-matching. The ability to determine and evaluate these optimization landscapes is an essential step toward achieving robust localization. We demonstrate our proposed method by evaluating a diverse set of image similarity metrics, including both deterministic and learning-based methods, for their ability to match to Neural City Maps using real-world camera data captured in diverse urban environments. Our results demonstrate that embedding distances derived from deep learning models such as VGG16 and ResNet produce optimization landscapes that are more amenable to localization with Neural City Maps than traditional feature matching. More broadly, our work affirms the necessity of considering higher-level visual features and semantic context for robust neural map-matching.
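The idea of grid sampling an optimization landscape can be illustrated with a minimal sketch. This is not the authors' implementation: `render_toy_map` is a hypothetical stand-in for rendering a Neural City Map at a candidate camera offset, the 2D pixel offsets stand in for a full camera pose, and mean squared error stands in for the image similarity metrics the abstract evaluates (an embedding distance from a network such as VGG16 or ResNet could presumably be passed as `metric` instead).

```python
import numpy as np

def render_toy_map(dx, dy, size=32):
    """Hypothetical stand-in for a neural map render: a Gaussian blob
    whose centre shifts with the camera offset (dx, dy), in pixels."""
    ys, xs = np.mgrid[0:size, 0:size]
    cx, cy = size / 2 + dx, size / 2 + dy
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * 4.0 ** 2))

def mse(a, b):
    """Placeholder similarity metric (lower is better)."""
    return float(np.mean((a - b) ** 2))

def sample_landscape(query, offsets, metric=mse):
    """Evaluate metric(query, render) at every (dx, dy) on a regular grid."""
    landscape = np.empty((len(offsets), len(offsets)))
    for i, dy in enumerate(offsets):
        for j, dx in enumerate(offsets):
            landscape[i, j] = metric(query, render_toy_map(dx, dy))
    return landscape

# Candidate offsets from -6 to 6 pixels in steps of 1 (assumed grid).
offsets = np.linspace(-6.0, 6.0, 13)

# Synthetic "camera image" taken at a known offset, so the grid minimum
# can be checked against ground truth.
query = render_toy_map(2.0, -3.0)

landscape = sample_landscape(query, offsets)
i, j = np.unravel_index(np.argmin(landscape), landscape.shape)
best = (float(offsets[j]), float(offsets[i]))  # (dx, dy) at the minimum
```

Under these assumptions, the sampled `landscape` array can be inspected for properties that matter to an optimizer, such as whether the minimum sits at the true offset and how wide the surrounding basin is; comparing such arrays across metrics is the kind of evaluation the abstract describes.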


