### Session F1: Advanced Software and Hardware Technologies for GNSS Receivers

**Multipath Parameter Estimation Based on Reinforcement Learning**

*Xin Qi and Bing Xu, Department of Aeronautical and Aviation Engineering, The Hong Kong Polytechnic University*

**Date/Time:** Wednesday, Sep. 18, 10:40 a.m.

Peer Reviewed

Background and objectives

Multipath has long been recognized as one of the major error sources in urban GNSS positioning, and it is difficult to eliminate because it depends strongly on the environment. Baseband signal processing techniques can mitigate multipath effectively without relying on external aiding. Typical solutions include advanced correlator or discriminator designs, such as narrow correlators [1], strobe correlators [2], and high resolution correlators (HRC) [3], and multipath parameter estimation techniques, such as the multipath estimating delay-lock loop (MEDLL) [4], the vision correlator [5], and the multipath mitigation technique (MMT) [6]. With the widespread adoption of machine learning, learning-based algorithms have also been applied to multipath mitigation, such as NNDLL [7] and deep neural network correlators [8]. In our previous work, we proposed a random forest (RF)-based multipath parameter estimator that achieved better robustness than MEDLL [9]. However, labeled real data are difficult to obtain as a training set for supervised learning-based algorithms, and training on simulated data may cause the algorithm to perform poorly on real data. Moreover, the above multipath mitigation techniques are effective for long-delay multipath but have limited effect on short-delay multipath. This work proposes a reinforcement learning (RL)-based multipath parameter estimator to address these issues.

Key innovative steps

Unlike supervised learning, reinforcement learning learns by interacting with the environment and does not require pre-training. This avoids the performance degradation caused by a mismatch between the training data and the real data. The algorithm is implemented in three key steps: problem modeling, parameter optimization based on reinforcement learning, and parameter setting for short multipath problems.

1) Problem modeling

The maximum likelihood estimate of the multipath parameters can be obtained by minimizing the mean square error (MSE) between the original autocorrelation function (ACF) and the ACF reconstructed from the parameter estimates. The minimum MSE problem is then treated as a parameter optimization problem that can be solved by an iterative algorithm. The parameter optimization is modeled as a Markov decision process (MDP) as follows: the state is defined by the value of the MSE of the ACF samples; the action is defined as an update of the multipath parameters; and the reward is defined as the negative of the difference between the states at the current and previous steps.
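
The cost function described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an ideal triangular code ACF (infinite bandwidth), a single in-phase reflection, and noise-free samples, and the names `ideal_acf`, `composite_acf`, and `acf_mse` are hypothetical.

```python
def ideal_acf(tau):
    """Ideal triangular code ACF (assumed infinite bandwidth); tau in chips."""
    return max(0.0, 1.0 - abs(tau))


def composite_acf(tau, los_delay, mp_amp, mp_delay):
    """Line-of-sight peak plus one in-phase reflection (assumed signal model).

    mp_amp is the reflection amplitude relative to the direct path and
    mp_delay is its extra delay in chips relative to the direct path.
    """
    return ideal_acf(tau - los_delay) + mp_amp * ideal_acf(tau - los_delay - mp_delay)


def acf_mse(measured, taus, params):
    """MSE between measured ACF samples and the ACF reconstructed from params."""
    los_delay, mp_amp, mp_delay = params
    errors = [m - composite_acf(t, los_delay, mp_amp, mp_delay)
              for m, t in zip(measured, taus)]
    return sum(e * e for e in errors) / len(errors)


# 21 correlator taps from -1 to +1 chip (0.1-chip spacing)
taus = [-1.0 + 0.1 * k for k in range(21)]
true_params = (0.0, 0.5, 0.3)  # hypothetical scenario
measured = [composite_acf(t, *true_params) for t in taus]

print(acf_mse(measured, taus, true_params))      # 0.0 at the true parameters
print(acf_mse(measured, taus, (0.0, 0.0, 0.0)))  # positive when the reflection is ignored
```

In this toy setting the MSE vanishes only when the reconstructed ACF matches the measured samples, which is what makes it usable as the quantity the agent tries to reduce.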

2) Parameter optimization based on reinforcement learning

The input to the algorithm is a set of ACF sampling points obtained with a multi-correlator, and the output is the estimates of the multipath parameters. The parameter optimization proceeds as follows: first, initial values of the parameters are selected; then, in each episode, the RL agent reduces the MSE of the reconstructed ACF sampling points by optimizing the multipath parameters. The optimization repeats until the stopping criterion is satisfied. The RL agent is implemented using the Q-learning-based algorithm proposed in [10]. To improve the search efficiency, this study makes two modifications to Q-learning: 1) an adaptive episode termination strategy: if the minimum MSE of the ACF sample points in the current step drops too much from the previous one, the episode is terminated; 2) an optimal initial state strategy: the current lowest MSE is used as the initial state of each episode.
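
The loop described above might look roughly like the following tabular Q-learning sketch. It is an illustration under stated assumptions, not the algorithm of [10]: the binary state (whether the last update reduced the MSE), the step sizes, the parameter bounds, the `drop_ratio` threshold, and all function names are hypothetical, and the ACF model is an ideal triangle with one in-phase reflection.

```python
import random


def ideal_acf(tau):
    """Ideal triangular code ACF (assumed); tau in chips."""
    return max(0.0, 1.0 - abs(tau))


def mse(measured, taus, params):
    """MSE between measured samples and the reconstructed two-path ACF."""
    d0, a1, d1 = params  # LOS delay, reflection amplitude, reflection delay
    return sum((m - ideal_acf(t - d0) - a1 * ideal_acf(t - d0 - d1)) ** 2
               for m, t in zip(measured, taus)) / len(taus)


STEPS = (0.01, 0.02, 0.01)                      # hypothetical step per parameter
BOUNDS = ((-0.5, 0.5), (0.0, 1.0), (0.0, 1.5))  # hypothetical search limits
ACTIONS = [(i, sign) for i in range(3) for sign in (1, -1)]


def apply_action(params, action):
    """Move one parameter up or down by its step, clamped to its bounds."""
    i, sign = ACTIONS[action]
    lo, hi = BOUNDS[i]
    new = list(params)
    new[i] = min(hi, max(lo, new[i] + sign * STEPS[i]))
    return tuple(new)


def q_learning_estimate(measured, taus, init, episodes=200, steps=30,
                        alpha=0.5, gamma=0.9, epsilon=0.2, drop_ratio=0.5, seed=0):
    rng = random.Random(seed)
    q = {}  # Q-table keyed by (state, action index)
    best_p = tuple(init)
    best_mse = mse(measured, taus, best_p)
    for _ in range(episodes):
        # Optimal initial state strategy: restart from the lowest MSE so far.
        p, cur, state = best_p, best_mse, 0
        for _ in range(steps):
            if rng.random() < epsilon:  # epsilon-greedy exploration
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda k: q.get((state, k), 0.0))
            new_p = apply_action(p, a)
            new = mse(measured, taus, new_p)
            reward = -(new - cur)        # positive when the MSE decreases
            nxt = 1 if new < cur else 0  # coarse stand-in for the MSE-based state
            target = reward + gamma * max(q.get((nxt, k), 0.0)
                                          for k in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (target - old)
            p, cur, state = new_p, new, nxt
            if cur < best_mse:
                # Adaptive termination: end the episode after a large drop
                # in the running minimum MSE (one reading of the strategy).
                big_drop = best_mse - cur > drop_ratio * best_mse
                best_p, best_mse = p, cur
                if big_drop:
                    break
    return best_p, best_mse


taus = [-1.0 + 0.05 * k for k in range(41)]
measured = [ideal_acf(t) + 0.5 * ideal_acf(t - 0.15) for t in taus]
init = (0.0, 0.0, 0.0)
estimate, final_mse = q_learning_estimate(measured, taus, init)
print(estimate, final_mse)
```

Because each episode restarts from the best parameters found so far, the returned MSE can never exceed the MSE of the initial guess. How close it gets to the true parameters depends on the hypothetical settings above.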

3) Parameter setting for short multipath problems

Because this is a minimum MSE estimation problem, more ACF sampling points are expected to give better parameter estimation performance. To explore the effect of the number of sampling points on the performance of our algorithm, we tested its multipath mitigation performance against multipath with different delays, using various numbers of correlators, in the constructive (in-phase) case. The algorithm is tested with ACF sampling points spanning -1 chip to +1 chip at sample spacings of 0.2, 0.1, 0.05, and 0.025 chips as inputs, which require 11, 21, 41, and 81 correlators, respectively. A normalized early-minus-late discriminator with 1-chip spacing is employed for code discrimination. The results show that the number of ACF sampling points affects the algorithm's mitigation performance only for short multipath with a delay of less than 0.2 chips, and that more sampling points give better short multipath mitigation. With a sampling spacing of 0.025 chips, the RL-based estimator reduces the multipath error to less than 3 meters when the influence of the carrier-to-noise ratio (CNR) is not considered.
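
The correlator counts quoted above follow directly from the tap spacing over the -1 to +1 chip span, as the short sketch below checks. The discriminator shown is the common textbook form of a normalized early-minus-late discriminator for 1-chip early-late spacing, which is assumed here rather than taken from the abstract. The triangular ACF and the helper names are likewise hypothetical.

```python
def ideal_acf(tau):
    """Ideal triangular code ACF (assumed); tau in chips."""
    return max(0.0, 1.0 - abs(tau))


def correlator_taps(spacing, span=1.0):
    """Tap offsets in chips from -span to +span at the given spacing."""
    n = round(2 * span / spacing) + 1
    return [-span + k * spacing for k in range(n)]


def neml(early, late):
    """Normalized early-minus-late discriminator for 1-chip early-late spacing.

    With an ideal triangular ACF and no multipath, the output equals the
    code tracking error in chips for errors below half a chip.
    """
    return 0.5 * (early - late) / (early + late)


for spacing in (0.2, 0.1, 0.05, 0.025):
    print(spacing, len(correlator_taps(spacing)))
# prints 0.2 11, 0.1 21, 0.05 41, 0.025 81 (one pair per line)

eps = 0.1  # hypothetical tracking error in chips
print(neml(ideal_acf(eps - 0.5), ideal_acf(eps + 0.5)))  # close to 0.1
```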

Preliminary results

The reinforcement learning-based multipath parameter estimator shows strong multipath mitigation performance in the tests. The more correlators used, the better the mitigation performance for short multipath. When 81 correlators are used, our algorithm suppresses the multipath error below 3 meters.

Conclusion

This work proposes a multipath parameter estimator based on reinforcement learning, which avoids the performance degradation caused by the mismatch between training data and real data in supervised learning-based algorithms. The proposed algorithm achieves strong multipath mitigation performance even for very short-delay multipath. With a 1-chip spacing normalized early-minus-late discriminator, it reduces the multipath error to less than 3 meters when the influence of CNR is not considered.

[1] A. Van Dierendonck, P. Fenton, and T. Ford, "Theory and performance of narrow correlator spacing in a GPS receiver," Navigation, vol. 39, no. 3, pp. 265-283, 1992.

[2] L. Garin, F. van Diggelen, and J.-M. Rousseau, "Strobe & edge correlator multipath mitigation for code," in Proceedings of the 9th International Technical Meeting of the Satellite Division of the Institute of Navigation (ION GPS 1996), 1996, pp. 657-664.

[3] G. A. McGraw and M. S. Braasch, "GNSS multipath mitigation using gated and high resolution correlator concepts," in Proceedings of the 1999 National Technical Meeting of The Institute of Navigation, 1999, pp. 333-342.

[4] R. D. Van Nee, J. Siereveld, P. C. Fenton, and B. R. Townsend, "The multipath estimating delay lock loop: approaching theoretical accuracy limits," in Proceedings of 1994 IEEE Position, Location and Navigation Symposium-PLANS'94, 1994: IEEE, pp. 246-251.

[5] P. C. Fenton and J. Jones, "The theory and performance of NovAtel Inc.'s vision correlator," in Proceedings of the 18th International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GNSS 2005), 2005, pp. 2178-2186.

[6] L. R. Weill, "Multipath mitigation using modernized GPS signals: how good can it get?," in Proceedings of the 15th International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GPS 2002), 2002, pp. 493-505.

[7] M. Orabi, J. Khalife, A. A. Abdallah, Z. M. Kassas, and S. S. Saab, "A machine learning approach for GPS code phase estimation in multipath environments," in 2020 IEEE/ION Position, Location and Navigation Symposium (PLANS), 2020: IEEE, pp. 1224-1229.

[8] H. Li, P. Borhani-Darian, P. Wu, and P. Closas, "Deep neural network correlators for GNSS multipath mitigation," IEEE Transactions on Aerospace and Electronic Systems, 2022.

[9] X. Qi and B. Xu, "Machine learning assisted multipath signal parameter estimation and its evaluation under weak signal environment," in 2023 IEEE/ION Position, Location and Navigation Symposium (PLANS), 2023: IEEE, pp. 1019-1026.

[10] X. Qi and B. Xu, "Hyperparameter optimization of neural networks based on Q-learning," Signal, Image and Video Processing, vol. 17, no. 4, pp. 1669-1676, 2023.
