
Session C4: Positioning Technologies and Machine Learning

Evaluation of (Un-)Supervised Machine Learning Methods for GNSS Interference Classification with Real-World Data Discrepancies
Lucas Heublein, Nisha L. Raichur, Tobias Feigl, Tobias Brieger, Fraunhofer Institute for Integrated Circuits IIS; Fin Heuer, Lennart Asbach, German Aerospace Center (DLR); Alexander Rügamer, Felix Ott, Fraunhofer Institute for Integrated Circuits IIS
Date/Time: Thursday, Sep. 19, 4:00 p.m.

ABSTRACT
The accuracy and reliability of vehicle localization on roads are crucial for applications such as self-driving cars, toll systems, and digital tachographs. To achieve accurate positioning, vehicles typically use global navigation satellite system (GNSS) receivers to validate their absolute positions. However, GNSS-based positioning can be compromised by interference signals, necessitating the identification, classification, determination of purpose, and localization of such interference to mitigate or eliminate it. Recent approaches based on machine learning (ML) have shown superior performance in monitoring interference. However, their feasibility in real-world applications and environments has yet to be assessed. Effective implementation of ML techniques requires training datasets that incorporate realistic interference signals, including real-world noise and potential multipath effects that may occur between transmitter, receiver, and satellite in the operational area. Additionally, these datasets require reference labels. Creating such datasets is often challenging due to legal restrictions, as causing interference to GNSS sources is strictly prohibited. Consequently, the performance of ML-based methods in practical applications remains unclear. To address this gap, we describe a series of large-scale measurement campaigns conducted in real-world settings at two highway locations in Germany and the Seetal Alps in Austria, and in large-scale controlled indoor environments. We evaluate the latest supervised ML-based methods to report on their performance in real-world settings and present the applicability of pseudo-labeling for unsupervised learning. We demonstrate the challenges of combining datasets due to data discrepancies and evaluate outlier detection, domain adaptation, and data augmentation techniques to present the models' capabilities to adapt to changes in the datasets.
Datasets and source code: https://gitlab.cc-asp.fraunhofer.de/darcy_gnss
OBJECTIVES
Motivation and Problem Statement. Interference signals impact the processing chain of the GNSS and degrade its localization accuracy. As a result, it is essential to mitigate interference signals or eliminate a transmitter, i.e., jammer. Nevertheless, to eliminate an interference source, it is crucial to initially detect, classify, and localize it. Accurate classification of the waveform of an interference signal, which serves as a unique fingerprint of a jammer, facilitates determining its purpose and, consequently, simplifies its localization and mitigation. Currently, there exist numerous techniques to achieve this, including classic [1][2][3] and ML-based methods [4][5][6][7][9][10]. To our understanding, there is currently no publicly accessible (reference) dataset that facilitates researching, developing, evaluating, and validating effective and reliable techniques that support practical GNSS interference management in a real-world environment.

Background. Data-driven techniques directly learn a mapping approximation from data that implicitly represents non-deterministic or non-linear functions, eliminating the need for additional modeling effort. Thus, recently, snapshot-based data-driven techniques such as support vector machines (SVMs) and convolutional neural networks (CNNs) [1][2] surpassed classic model-based approaches such as pattern recognition and mathematical formulations, as they return high classification accuracy even in challenging scenarios. To accurately (high F2-score) detect, classify, and (optionally) localize sources of interference, we adapted supervised methods, i.e., random forest (RF), CNN, and UNet [5][6][7][8], semi-supervised methods, i.e., Siamese and triplet neural networks [4] and quadruplet neural networks [10], and fully unsupervised methods, i.e., variational autoencoders (VAEs) [4], from other scientific domains to the problem of interference analysis in GNSS. To evaluate the reliability (low variance or uncertainty, σ) of these methods, we combine them in ensembles or with Monte Carlo dropout and BALD [6]. We showed that supervised methods detect sources of interference in SDR vectors (F2=0.982) best and classify up to 32 different interference types (F2=0.953). The fusion with our multimodal learning (MML) technique, incorporating time-sensitive features (i.e., C/N0, AGC) and spatial (i.e., SDR) data, significantly improves the classification (F2=0.801) from a single sensor. The classification accuracy increases significantly when data from high-rate low-cost sensors (time-sensitive matrix of raw IQ samples) and low-rate medium-end sensors (spatial images of spectrum data) are combined (F2=0.9532) [6]. In addition, we showed that evaluating the feature importance of Android smartphone-based features (i.e., C/N0, AGC, elevation) with SHAP, in combination with UNet, classifies the sources of interference accurately (F2=0.912) [5].
We even showed that simple but efficient ML methods such as RF accurately localize (MAE=1.72m) a jammer using Android-based features (i.e., C/N0, AGC, elevation) of four phones that are statically placed in the propagation environment. The localization accuracy of RF increases significantly when we input SDR vectors from low-cost sensors (MAE=0.62m from one sensor up to MAE=0.19m for four sensors) [5]. These supervised methods [5][6][7][8] outperform state-of-the-art techniques w.r.t. accuracy of interference detection, classification, and localization in GNSS signals. In [10], we proposed a semi-supervised few-shot learning method utilizing an uncertainty-based quadruplet selection approach to learn an optimal representation between different jammer types. However, these methods require reference labels, which greatly increases the data collection effort and makes them impractical for many everyday applications.
We have therefore examined unsupervised learning methods [4] to detect, classify, and localize even completely unknown sources of interference without any reference data. Our key finding is that semi-supervised methods based on similarity learning, such as Siamese and triplet neural networks, trained on few reference data (<300 data points), detect (F2=0.954; σ=5%) and classify (F2=0.908; σ=8%) 32 different interference types accurately and with low variance. We showed that an oracle (i.e., human expert) continuously improves the reliable detection and classification of unknown, new sources of interference. We also showed that these semi-supervised methods accurately classify (F2=0.891) GNSS interferences even under multipath effects between transmitter and receiver and with dynamically moving jammers, and that they can also represent the distance between transmitter and receiver in latent space (V-measure). In contrast, we found that fully unsupervised methods that do not require reference labels, such as variational autoencoders (VAEs) with different priors such as Gaussian (F2=0.683) and categorical (F2=0.671), performed worst in all our experiments and therefore offer a lot of research potential [4].
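The F2-scores quoted in this background are instances of the general F-beta score with beta=2, which weights recall more heavily than precision, so a missed interference costs more than a false alarm. The following is a minimal sketch of that metric; the function name and the confusion counts are illustrative assumptions and are not taken from the cited works.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta score from confusion counts; beta=2 yields the F2-score.

    tp, fp, fn are true positives, false positives, and false negatives.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = beta**2 * precision + recall
    if denom == 0.0:
        return 0.0  # no true positives at all
    return (1 + beta**2) * precision * recall / denom


# Hypothetical detector: 90 interferences found, 30 false alarms, 10 missed.
f2 = f_beta(tp=90, fp=30, fn=10)            # recall-weighted score
f1 = f_beta(tp=90, fp=30, fn=10, beta=1.0)  # balanced score for comparison
```

With these hypothetical counts, recall (0.90) exceeds precision (0.75), so the F2-score comes out higher than the F1-score.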
Contributions. The performance of ML-based techniques in actual operational environments remains unclear. This is primarily because the use of (hardware) sources of interference in public spaces is illegal in Europe, and no public datasets are available. Consequently, ML-assisted systems cannot be evaluated in practical applications. To address this issue, we conduct advanced measurement campaigns in the real world, incorporating about 40 different interference sources that interfere with the GNSS band. This allows us to evaluate the impact of interference on localization accuracy and to include interference types that do not affect localization accuracy. We publish the new reference datasets that we utilize to evaluate the detection, classification, and localization performance of state-of-the-art ML-based methods from different fields, including supervised, semi-supervised, and unsupervised approaches. Our goal is to examine the suitability of these methods for practical applications. The GNSS community has shown an increasing interest in ML-based methods for localization and interference management (such as detection, classification, localization, and mitigation of jamming and spoofing). However, the evaluation of modern ML-based methods in the literature primarily relies on synthetic data. To address this issue, we conducted preliminary research analyzing various ML methods in a laboratory environment. However, it remains uncertain how accurately and robustly these (un-)supervised methods perform in practical everyday applications. To the best of our knowledge, our study is the first to investigate (un-)supervised ML methods using real data in realistic everyday applications.

METHODOLOGY
The pipeline was first introduced by van der Merwe et al. [7] and Brieger et al. [6]. Raichur et al. [5] extended the pipeline to work on Android-based GNSS receivers. Hansen et al. [8] evaluated its potential and Jdidi et al. [4] transformed the pipeline to an unsupervised variant. In general, data flows from the left (input to our model) to the right (predictions of our model). First, the sensor nodes receive the signals (with or without interference) in real-time. The signals are collected, synchronized, and stored in a cloud database. The multi-stage framework pre-processes features from multi-modal input streams. Entropy, kurtosis, C/N0, AGC, and elevation features are extracted from raw GNSS data. High-rate (1.0Hz) sampled snapshots of both spatial features and time series data thereof are fed to the detection and classification components as spectrograms or as time series. The database pushes new signals to the detection component. The classification component further categorizes only the detected interference signals. In a post-processing step, Monte Carlo dropout [6] is applied to the fused components to assess the uncertainty of each estimate. The F2-score evaluates the performance and efficiency of the framework. Next, a detection event is stored (interference and no interference), along with its category, the uncertainty values thereof, and a compressed version of samples used in the database. From there, the database handles requests for the visualization component. We visualize selected listeners at known locations (see the blue antenna symbols in Figure 1), and the appearance of interference signals (see the red antenna symbol), when a listener detects an interference. We also handle the history of events, thus visualizing a heat-map of interference signals along fixed deterministic trajectories (roads). Red indicates a high number of detected interferences, and green represents no detected interference.
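As an illustration of the feature extraction stage described above, the sketch below computes two of the named features, entropy and kurtosis, from a single complex IQ snapshot. The abstract does not specify the exact definitions, so the choice of spectral entropy (in bits) and sample kurtosis of the in-phase component, the function name, and the synthetic test signals are all assumptions.

```python
import numpy as np


def snapshot_features(iq: np.ndarray) -> dict:
    """Spectral entropy (bits) and kurtosis of one complex IQ snapshot."""
    # Normalise the power spectrum so it can be treated as a distribution.
    power = np.abs(np.fft.fft(iq)) ** 2
    p = power / power.sum()
    entropy = float(-np.sum(p * np.log2(p + 1e-12)))
    # Non-excess kurtosis of the in-phase component (3.0 for Gaussian noise).
    x = iq.real - iq.real.mean()
    kurtosis = float(np.mean(x**4) / np.mean(x**2) ** 2)
    return {"entropy": entropy, "kurtosis": kurtosis}


# Synthetic snapshots (assumed): unit-power noise with and without a strong tone.
rng = np.random.default_rng(0)
n = 4096
noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
tone = 10.0 * np.exp(2j * np.pi * 0.125 * np.arange(n))

clean = snapshot_features(noise)
jammed = snapshot_features(noise + tone)
```

In this toy case the tone concentrates the spectrum in one bin, so the entropy of the jammed snapshot drops well below that of the noise-only snapshot, and its kurtosis moves away from the Gaussian value of 3.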
In this paper, we extend our pipeline. We fine-tune and adapt both our supervised [5][6][7] and unsupervised [4] techniques to realistic data in practical real-world environments. Therefore, we assess and verify the accuracy and reliability of our detection, classification [4][6][7], and localization [5] mechanisms. Furthermore, we adapt to new, unseen interference classes [10]. In addition, we obtain reference values that indicate whether an interference affects the GNSS-based localization, i.e., a potential interference, or whether there is an interference that does not affect the localization engine at all and thus, needs no further management. We collect datasets in a large-scale measurement campaign in the Seetal Alps in Austria. To verify our measurement campaign and its data, we also carry out a baseline measurement campaign in an anechoic room and a long-term test on a highway in Germany.
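The Monte Carlo dropout post-processing step of the pipeline can be sketched as follows: the classifier is run several times with dropout kept active, and the spread of the resulting class probabilities is turned into uncertainty values. The sketch assumes the stochastic forward passes have already produced softmax outputs; the function name, array shapes, and example values are hypothetical, and BALD is computed in its standard form as predictive entropy minus expected entropy.

```python
import numpy as np


def mc_dropout_uncertainty(probs: np.ndarray) -> dict:
    """Summarise T stochastic forward passes for one sample.

    probs has shape (T, C): softmax outputs over C classes from T passes.
    """
    eps = 1e-12
    mean = probs.mean(axis=0)
    # Entropy of the averaged prediction (total uncertainty, in nats).
    predictive_entropy = float(-np.sum(mean * np.log(mean + eps)))
    # Average entropy of the individual passes (data uncertainty).
    expected_entropy = float(
        -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    )
    return {
        "predicted_class": int(np.argmax(mean)),
        "predictive_entropy": predictive_entropy,
        # BALD: mutual information between prediction and model weights.
        "bald": predictive_entropy - expected_entropy,
    }


# Passes that agree give a BALD score near zero; passes that disagree do not.
consistent = mc_dropout_uncertainty(np.array([[0.9, 0.1]] * 3))
conflicting = mc_dropout_uncertainty(np.array([[1.0, 0.0], [0.0, 1.0]]))
```

A high BALD score flags a sample on which the dropout passes disagree, which is the kind of estimate a human expert could be asked to review.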

Hardware Setup. We describe our GNSS interference monitoring system (DARCY) in preliminary publications [4][5][6][7][8][9][10] with respect to its hardware, ML-based detection, classification, and localization algorithms. Our DARCY system consists of various sensors, i.e., low-cost (LC), medium-end (ME), high-end (HE), and smartphones, and a central server that collects the raw data and performs the ML-based characterization and visualization of detected interferences. The LC low-bandwidth, i.e., 2.5MHz, receiver justifies the deployment of more receiver nodes compared to more expensive state-of-the-art platforms, resulting in a significantly higher probability of interception (POI). The high-cost ME high-bandwidth, i.e., 50MHz, dual-band receiver features high-resolution ADCs. DARCY also employs military grade high-cost HE reference receivers such as Septentrio MOSAIC and Novatel OEM7. Finally, DARCY also employs Android-based smartphones that only provide GNSS data observations as defined by the Google navigation subsystem. We feed all different DARCY sensors by the same GNSS antenna to benchmark their performance.
Real-world Large-scale Measurement Campaigns. In our previous publications [4][5][6][7][8][9][10] the different elements of DARCY and in particular some preliminary tests in anechoic chambers and with indoor emulation environments were presented and successfully validated. Consequently, we repeat some of these tests in actual scenarios in the real world. We investigate and validate our state-of-the-art (un-)supervised ML methods for detecting and classifying interferers in GNSS signals [4][5][6]. We also investigate whether and how disturbances in the GNSS signal affect the receiver and which disturbances specifically have an impact. As part of a test campaign within a military test site in Austria, the DARCY sensor stations with their different sensor types are validated against real-world disruptions. The DARCY station is set up in the field and several drive-by tests are carried out with real privacy protection devices (PPD) as well as with artificial but controlled sources of interference from signal generators. Using the controlled interference conditions in a real outdoor scenario, we will evaluate the performance of DARCY. In addition, we assess the impact of interference that may not be relevant to GNSS. While traditional frequency domain methods can estimate the power spectrum density of the perturbation and calculate the spectral separation coefficients to assess the impact of the perturbation on the C/N0, this no longer works with the ML approaches. Therefore, we use the insights of this campaign to extend our methods in such a way that the ML approaches also consider interferences that may not be relevant for GNSS and do not negatively affect the accuracy.
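The traditional frequency-domain assessment mentioned above can be illustrated with a simplified sketch. It computes the spectral separation coefficient (SSC) as the integral of the product of the unit-area signal and interference power spectral densities, and then uses the common simplification (C/N0)_eff = 1 / (1/(C/N0) + (J/S)·SSC). The BPSK(1)-like sinc² signal spectrum, the narrowband Gaussian interferer, the omission of front-end filtering, and all numeric values are assumptions for illustration, not parameters of the campaign.

```python
import numpy as np


def ssc(freqs_hz: np.ndarray, psd_signal: np.ndarray, psd_interf: np.ndarray) -> float:
    """Spectral separation coefficient (1/Hz) on a uniform frequency grid."""
    df = freqs_hz[1] - freqs_hz[0]
    gs = psd_signal / (psd_signal.sum() * df)  # normalise to unit area
    gi = psd_interf / (psd_interf.sum() * df)
    return float(np.sum(gs * gi) * df)


def effective_cn0_dbhz(cn0_dbhz: float, js_db: float, kappa: float) -> float:
    """Effective C/N0 for an interference-to-signal power ratio J/S (dB)."""
    cn0 = 10.0 ** (cn0_dbhz / 10.0)
    js = 10.0 ** (js_db / 10.0)
    return float(10.0 * np.log10(1.0 / (1.0 / cn0 + js * kappa)))


# Assumed spectra: sinc^2 main lobe of a 1.023 Mcps code, +/-10 MHz span.
freqs = np.linspace(-10e6, 10e6, 20001)
gps_like = np.sinc(freqs / 1.023e6) ** 2


def narrowband(center_hz: float, width_hz: float = 10e3) -> np.ndarray:
    """Gaussian-shaped narrowband interferer PSD (hypothetical)."""
    return np.exp(-0.5 * ((freqs - center_hz) / width_hz) ** 2)


# Same interferer power, once at the band centre and once 5 MHz away.
eff_center = effective_cn0_dbhz(45.0, 30.0, ssc(freqs, gps_like, narrowband(0.0)))
eff_edge = effective_cn0_dbhz(45.0, 30.0, ssc(freqs, gps_like, narrowband(5e6)))
```

Under these assumptions the centred interferer lowers the effective C/N0 substantially, while the same power placed 5 MHz off centre leaves it almost unchanged, which mirrors the distinction between interference that is and is not relevant to GNSS.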
After completing this real-world validation, we deploy three of these stations for a long-term test at three test sites on the German highway to detect sources of interference mounted in moving vehicles under real operating conditions. As part of our long-term test campaign, we determine for each recording whether an interference really affects GNSS-based localization. For instance, out-of-band interferers, e.g., sinusoidal signals at the edge of the band, do not affect localization but are recognized by ML methods and confuse the evaluation. On the other hand, smartphone receivers are narrow-band and are often unaffected, whereas high-end broadband sensors are affected by interference signals. Consequently, no single ML model works accurately and reliably for each of the four sensor types.
ANTICIPATED RESULTS
In contrast to the state of the art, we already examined and compared classic and (un-)supervised methods for detecting, classifying, and localizing GNSS interference sources [4][5][6][7][8][9][10]. Our first results showed that (un-)supervised ML-based methods accurately and reliably detect and classify synthetic and deterministic realistic data in laboratory environments on smartphone, low-cost, as well as high-end receiver hardware. Thus, we expect that ML-based methods also yield accurate and reliable results with realistic data in real-world applications. We expect to report the findings of our long-term measurement campaigns on the German highway. Furthermore, we expect to provide our datasets to enable independent research, development, and evaluation of such novel techniques for GNSS interference analysis. In contrast to previous studies, we also show how to adapt to new interference classes from only a few real-world samples [10].
REFERENCES
[1] Ruben M. Ferre, Alberto de la Fuente, and Elena S. Lohan. Jammer Classification in GNSS Bands via Machine Learning Algorithms. In MDPI Sensors, November 2019.
[2] Carolyn J. Swinney and John C. Woods. GNSS Jamming Classification via CNN, Transfer Learning & the Novel Concatenation of Signal Representations. In CyberSA, Dublin, Ireland, June 2021.
[3] Jason N. Gross and Todd E. Humphreys. GNSS Spoofing, Jamming, and Multipath Interference Classification Using a Maximum-Likelihood Multi-tap Multipath Estimator. In ION GNSS+, Monterey, CA, January 2017.
[4] Dorsaf Jdidi, Tobias Brieger, Tobias Feigl, David C. Franco, J. Rossouw van der Merwe, Alexander Rügamer, Jochen Seitz, and Wolfgang Felber. Machine Learning Compression for GNSS Interference Analysis. In ION GNSS+, Denver, CO, September 2022.
[5] Nisha L. Raichur, Tobias Brieger, Dorsaf Jdidi, Carlo Schmitt, Birendra Ghimire, Felix Ott, Tobias Feigl, J. Rossouw van der Merwe, Alexander Rügamer, and Wolfgang Felber. Machine Learning-assisted GNSS Interference Monitoring Through Crowdsourcing. In ION GNSS+, Denver, CO, September 2022.
[6] Tobias Brieger, Nisha L. Raichur, Dorsaf Jdidi, Felix Ott, Tobias Feigl, J. Rossouw van der Merwe, Alexander Rügamer, and Wolfgang Felber. Multimodal Learning for Reliable Interference Classification in GNSS Signals. In ION GNSS+, Denver, CO, September 2022.
[7] J. Rossouw van der Merwe, David C. Franco, Dorsaf Jdidi, Tobias Feigl, Alexander Rügamer, and Wolfgang Felber. Low-cost COTS GNSS Interference Detection and Classification Platform: Initial Results. In ICL-GNSS, Tampere, Finland, June 2022.
[8] Jonathan Hansen, J. Rossouw van der Merwe, David C. Franco, Tobias Feigl, Tobias Brieger, Dorsaf Jdidi, Alexander Rügamer, and Wolfgang Felber. Initial Results of a Low-Cost GNSS Interference Monitoring Network. In POSNAV, Berlin, Germany, November 2022.
[9] J. Rossouw van der Merwe, David Contreras Franco, Jonathan Hansen, Tobias Brieger, Tobias Feigl, Felix Ott, Dorsaf Jdidi, Alexander Rügamer, and Wolfgang Felber. Low-Cost COTS GNSS Interference Monitoring, Detection, and Classification System. In MDPI Sensors, March 2023.
[10] Felix Ott, Lucas Heublein, Nisha L. Raichur, Tobias Feigl, Jonathan Hansen, Alexander Rügamer, and Christopher Mutschler. Few-Shot Learning with Uncertainty-based Quadruplet Selection for Interference Detection in GNSS Data. In arXiv:2402.09466, February 2024.


