Robust and Low-cost Precise Vehicular Positioning in Urban Environments
Murrian, Matthew J., Narula, Lakshay, Wooten, Michael, Humphreys, Todd E., University of Texas at Austin
This paper develops a technique for fusing PPP-RTK with visual Simultaneous Localization and Mapping (SLAM). Specifically, it offers a means of building a globally-referenced sparse map of an urban roadway by combining visible-light camera images with PPP-RTK measurements. Prior work on visual odometry for vehicular positioning has approached the problem from a few different directions. One trend is to assume that GNSS is unavailable or unreliable during vehicular positioning and to treat the setting as GNSS-denied, as in [1]. Other work has taken on the challenge of fusing GNSS with visual techniques (e.g., [2] and [3]), but such fusion is often limited to code-phase-derived observables, with a focus on identifying and compensating for unavailable or aberrant GNSS solutions. In contrast to prior work, this paper “closes the loop” between visual SLAM and PPP-RTK by using visual SLAM to directly aid integer-ambiguity resolution.
Vehicle perception systems for both advanced driver-assistance systems (ADAS) and autonomous driving vary widely in the sensor types used and in their overall approach to the perception problem. Tesla’s Autopilot system uses forward-facing cameras to identify the location and curvature of roadway lane markings and thereby determine the vehicle’s relative position on the roadway. The Tesla system also uses several ultrasonic sensors and a forward-facing radar for object detection and collision avoidance. Notably, the Tesla system focuses on immediate perception of the environment and does not rely on locating the vehicle within a pre-existing map. Google’s self-driving cars, in contrast, use LiDAR as their primary perception sensor. LiDAR provides rich and accurate 360-degree, three-dimensional measurements of surrounding contours, which Google uses to localize the vehicle within a pre-existing map to an accuracy of 10 cm. LiDAR, however, remains far too expensive for consumer vehicles and suffers some performance degradation in inclement weather.
Pre-existing maps are compelling for vehicle perception systems. One powerful technique for building sparse maps suitable for localization is SLAM, which, as the name suggests, enables a platform vehicle to build a map of an unknown environment while simultaneously localizing within that map. However, pre-existing maps must first be constructed and thereafter continuously maintained. If these maps could be created and maintained with low-cost visual sensors, then crowd-sourcing, as opposed to a dedicated fleet of mapping vehicles, could conceivably accomplish these tasks. For the smaller maps generated by individual crowd-sourcing users to be merged into, or used to update, a larger map, these maps should be referenced to a common frame (e.g., ECEF). Further, an independent measurement of location is necessary to improve the absolute accuracy of the map. GNSS is an obvious candidate sensor for providing position measurements in an absolute reference frame.
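One standard way to reference a locally-built SLAM map to a common global frame is a least-squares similarity transform (Umeyama’s method) between SLAM keyframe positions and their corresponding globally-referenced antenna positions. The sketch below is illustrative only, not the paper’s pipeline; the function name and the use of a scale factor (relevant for monocular SLAM’s scale ambiguity) are assumptions:

```python
import numpy as np

def umeyama_align(src, dst):
    """Least-squares similarity transform (s, R, t) mapping src -> dst.

    src, dst: (N, 3) arrays of corresponding points, e.g. SLAM keyframe
    positions (local frame) and precise GNSS antenna positions (ECEF).
    Returns scale s, rotation R (3x3), translation t (3,) such that
    dst ~= s * R @ src_i + t for each point.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)               # cross-covariance matrix
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                          # enforce a proper rotation
    R = U @ S @ Vt
    var_src = (xs ** 2).sum() / len(src)      # variance of centered src
    s = np.trace(np.diag(D) @ S) / var_src    # optimal isotropic scale
    t = mu_d - s * R @ mu_s
    return s, R, t
```

Given such a transform, every landmark and keyframe in the local map can be re-expressed in ECEF, which is what allows many small crowd-sourced maps to be merged.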
Standard code-based positioning alone cannot reliably determine a vehicle’s position with sub-lane-level accuracy in the challenging environments a vehicle will encounter. Precise carrier-phase positioning (e.g., PPP-RTK) can satisfy the accuracy requirements of ADAS and autonomous vehicle systems, but it is fragile, especially in urban environments, where GNSS receivers experience frequent and severe multipath as well as non-line-of-sight (NLOS) signals. With limited and infrequent views of the horizon, the effective elevation mask for available GNSS signals can be substantially higher than in an open-sky environment. Further, frequent cycle slips place a greater burden on cycle-slip detection if integer-ambiguity filtering techniques (as opposed to single-epoch resolution) are to be used effectively.
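For reference, the integer-ambiguity and cycle-slip issues above stem from the standard carrier-phase observation model (a textbook form, not the specific formulation of any particular PPP-RTK engine):

```latex
\Phi = \rho + c\,(\delta t_r - \delta t^s) + T - I + \lambda N + \epsilon_{\Phi}
```

Here $\Phi$ is the carrier-phase measurement (in meters), $\rho$ the geometric range, $c$ the speed of light, $\delta t_r$ and $\delta t^s$ the receiver and satellite clock offsets, $T$ and $I$ the tropospheric and ionospheric delays (the ionospheric term enters with opposite sign relative to the code observable), $\lambda$ the carrier wavelength, $N$ the integer ambiguity, and $\epsilon_{\Phi}$ noise plus multipath. A cycle slip manifests as a sudden jump in $N$, which is why filtering-based ambiguity resolution depends so heavily on reliable cycle-slip detection.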
While PPP-RTK alone will not suffice for autonomous vehicles, fusion with other sensors (e.g., inertial measurement units) has been shown to substantially improve the robustness and availability of accurate navigation solutions [4]. However, the best results have been obtained with rather expensive IMUs which, like LiDAR, are unlikely to appear in consumer vehicles without substantial cost reductions.
The visible-light camera is already found in some consumer vehicles and offers a compelling low-cost sensor for odometry and mapping. This paper studies the fusion of PPP-RTK with visual SLAM. Specifically, it builds a globally-referenced sparse map of an urban roadway by combining visual SLAM and PPP-RTK measurements. Given the relatively low availability anticipated for correct PPP-RTK fixes (i.e., fixed-integer solutions) in this environment, the first stage of analysis will build the map from a conservatively selected subset of the available solutions. During periods of “GNSS outage,” absolute accuracy will rely on odometry information from the visual sensor together with endpoint constraints from the available GNSS solutions. This globally-referenced map will then be fed back into the PPP-RTK engine to provide a priori state estimates during second and subsequent processing iterations. At each iteration, the visual-SLAM-generated map will again be updated to include the new PPP-RTK measurements made available by the improved a priori state estimates. Thus, one aspect of this study will be to assess the utility of globally-referenced visual SLAM in improving the robustness and availability of fixed-integer PPP-RTK solutions in what is comparable to a batch-estimation approach. The next stage of analysis will make use of a second dataset collected along the same urban corridor so that the previously constructed map can again provide a priori state estimates to the PPP-RTK engine. This stage is important for assessing the absolute accuracy of the map, where the first stage may merely demonstrate self-consistency.
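One ingredient of the first-stage mapping, bridging a GNSS outage with visual odometry subject to endpoint constraints from fixed solutions, can be sketched as follows. This is a minimal illustration assuming the simplest possible constraint scheme (linear distribution of the odometry misclosure across the gap); the function name is hypothetical and the actual estimator would weight the correction by odometry uncertainty:

```python
import numpy as np

def bridge_outage(p_start, p_end, odo_steps):
    """Bridge a GNSS outage with visual-odometry increments, constrained
    so the dead-reckoned path meets the fixed PPP-RTK solutions at both ends.

    p_start, p_end: fixed (integer-resolved) positions at the outage edges.
    odo_steps: (N, 3) relative displacements from visual odometry.
    Returns (N+1, 3) positions with the accumulated odometry drift
    (the endpoint misclosure) distributed linearly across the gap.
    """
    p_start = np.asarray(p_start, dtype=float)
    p_end = np.asarray(p_end, dtype=float)
    # Dead-reckon forward from the last fixed solution.
    raw = p_start + np.cumsum(np.vstack([np.zeros(3), odo_steps]), axis=0)
    misclosure = p_end - raw[-1]              # drift accumulated over the gap
    w = np.linspace(0.0, 1.0, len(raw))[:, None]
    return raw + w * misclosure               # endpoints now match exactly
```

The corrected positions then serve as globally-referenced anchors for map points observed during the outage.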
While past work has explored fusing visual odometry techniques with GNSS positioning, none to our knowledge uses visual techniques to specifically aid integer-ambiguity resolution. We anticipate demonstrating improved availability and robustness of vehicle positioning in an urban environment using low-cost GNSS and visible-light sensors as compared to GNSS alone. Moreover, we anticipate showing absolute-positioning performance approaching that required for autonomous vehicles using only low-cost sensors.
[1] A. Gupta, H. Chang, and A. Yilmaz, “GPS-Denied Geo-Localisation using Visual Odometry,” 23rd Congress of the International Society for Photogrammetry and Remote Sensing, Prague, Czech Republic, July 2016.
[2] M. Lhuillier, “Incremental Fusion of Structure-from-Motion and GPS Using Constrained Bundle Adjustments,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 12, December 2012.
[3] A. Vu et al., “Real-Time Computer Vision/DGPS-Aided Inertial Navigation System for Lane-Level Vehicle Navigation,” IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 2, pp. 899–913, 2012.
[4] B. M. Scherzinger, “Precise Robust Positioning with Inertially Aided RTK,” Navigation, vol. 53, no. 2, pp. 73–83, 2006.