Improvements in Pedestrian Movement Prediction by Considering Multiple Intentions in a Multi-Hypotheses Filter
Florian Particke, Markus Hiller, Jörn Thielecke, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Germany; Christian Feist, AUDI AG, Germany
In the future, fully automated vehicles and mobile robots will have to operate in environments shared with pedestrians. To minimize the risk for pedestrians and to optimize motion planning for robots, it is essential to track pedestrians and predict their future movement precisely. To improve the accuracy of both tracking and prediction, all available context information should be included in the fusion and prediction process.
A crucial source of context information is the intention of the pedestrian. In the last decade, extensive research has been performed to estimate the future behavior of pedestrians. In this context, the term “intention” has often been used, but it has been interpreted in different ways. Examples are arm gestures in (Owoyemi2017), gaze estimation in (Yamazoe2008) or pedestrian movement modeled by motion probability grids in (Thompson2009). A closely related research area is the incorporation of intention in the motion planning process of a robot. Examples are (Bandyopadhyay2013), where the path planning process is based on mixed observability Markov decision processes, and (Bai2015), which uses partially observable Markov decision processes for path planning.
This paper is organized as follows. In the first part, the generalized potential field approach is summarized. Afterwards, the pedestrian prediction model is derived, which is based on an available set of possible intentions. To include this additional information in the fusion process, the Multi-Hypotheses filter is proposed. Subsequently, this model is used for the prediction of pedestrian movements. The prediction is evaluated for different values of the standard deviation of the position measurements and for different time horizons. The proposed model is evaluated against an identically parametrized Kalman filter without intention information. It is shown that the accuracy of the position estimation is significantly improved. Finally, the results are summarized and further steps are outlined.
In this paper, the term “intention” is always defined by a target area that the pedestrian tries to reach. The focus of this paper is the incorporation of an unknown intention belonging to a set of possible intentions. It is motivated by the practical scenario in which the goal of the person is unknown, which is usually the case, but different discrete hypotheses for the intention of the pedestrian can be assumed based on previous observations or machine learning. One example is a parking scenario, where a fully automated vehicle has to interact with manually driven cars. The goals of the pedestrians stepping out of the manually driven cars are limited to certain target areas, e.g. ticket machines, exits of the car parking area or their cars. As accurate prediction of pedestrian movement is necessary for successful motion planning of the fully automated vehicles, the proposed approach has to be able to track and predict the pedestrians precisely.
For the fusion of the additional context information source “intention”, a generalized potential field approach (Particke2017) for characterizing pedestrian movements is used, which is based on the social force model (Helbing1995). The generalized potential field approach describes every human as interacting with different potential fields. These potential fields are defined by other pedestrians, the surrounding environment and the intention of the predicted pedestrians. In comparison to other fusion approaches, the number of parameters is reduced and the parameters can be intuitively understood. The generalized potential field approach delivers an acceleration vector, which is used as an additional control input in the tracking algorithm. For each hypothesis of the set of intentions, a unique potential field is built and an acceleration vector is derived.
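How one acceleration vector per intention hypothesis can arise is illustrated by the following minimal sketch. It is not the generalized potential field approach itself: as a stand-in, it uses only the goal-driving term of the social force model (Helbing1995), a = (v0 * e - v) / tau, and ignores other pedestrians and the environment. The function name, the goal names, the desired speed and the relaxation time are assumptions chosen for illustration.

```python
import math

def goal_acceleration(position, velocity, goal, desired_speed=1.3, relaxation_time=0.5):
    """Acceleration pulling a pedestrian toward a goal point (2D).

    Stand-in for the potential field of one intention: the driving term
    of the social force model, a = (v0 * e - v) / tau, where e is the
    unit vector toward the goal. desired_speed (m/s) and relaxation_time (s)
    are illustrative values, not parameters taken from the paper.
    """
    dx = goal[0] - position[0]
    dy = goal[1] - position[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        # Already at the goal: only the braking part of the term remains.
        ex, ey = 0.0, 0.0
    else:
        ex, ey = dx / dist, dy / dist
    ax = (desired_speed * ex - velocity[0]) / relaxation_time
    ay = (desired_speed * ey - velocity[1]) / relaxation_time
    return ax, ay

# One acceleration vector per intention hypothesis (goal names are hypothetical).
goals = {"ticket_machine": (10.0, 0.0), "exit": (0.0, 10.0)}
accelerations = {
    name: goal_acceleration((0.0, 0.0), (1.0, 0.0), goal)
    for name, goal in goals.items()
}
```

A pedestrian already walking toward the ticket machine receives only a small acceleration for that hypothesis, while the exit hypothesis demands a strong turn; this difference is what later allows the measurements to discriminate between the hypotheses.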
The discrete set of intentions (possible discrete goals of the pedestrians) is fused by a Multi-Hypotheses tracking filter. The Multi-Hypotheses tracking filter fuses the intention information, encoded by the generalized potential field approach, with the pedestrian position measurements of a camera. Each hypothesis corresponds to a different intention and hence to a different potential field.
In this paper, the Multi-Hypotheses filter is used to predict the movement of the pedestrians. The Multi-Hypotheses filter is a Rao-Blackwellized filter (Schon2005), which consists of particles (hypotheses), each representing a possible intention. Each particle incorporates a potential field representing the intention of the pedestrian and a Kalman filter with a constant velocity model (DeNicolao2007). The acceleration vector of the potential field is fused in the Kalman filter as a control input. Every hypothesis (a Kalman filter combined with the generalized potential field approach, GPFA) is weighted based on the measurements according to the well-known methods of particle filtering (Grisetti2007).
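The structure described above can be sketched in a few lines. The following is a simplified illustration under stated assumptions, not the filter evaluated in this paper: it is one-dimensional, the potential field is replaced by a simple goal-driving acceleration, the process noise is a diagonal matrix q * I, and the weights are updated with the Gaussian likelihood of the innovation. All names and numerical values are hypothetical.

```python
import math

class Hypothesis:
    """One intention hypothesis: a 1D constant-velocity Kalman filter plus a weight."""

    def __init__(self, goal, weight):
        self.goal = goal                    # hypothetical 1D target position (m)
        self.weight = weight
        self.p, self.v = 0.0, 0.0           # state: position, velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance

def control_acceleration(h, desired_speed=1.3, tau=0.5):
    # Stand-in for the potential field: accelerate toward the goal (assumption).
    direction = math.copysign(1.0, h.goal - h.p)
    return (desired_speed * direction - h.v) / tau

def predict(h, a, dt, q):
    # x' = F x + B a  with F = [[1, dt], [0, 1]] and B = [dt^2 / 2, dt]
    h.p += dt * h.v + 0.5 * dt * dt * a
    h.v += dt * a
    (p00, p01), (p10, p11) = h.P
    # P' = F P F^T + Q  with Q = q * I
    h.P = [[p00 + dt * (p01 + p10) + dt * dt * p11 + q, p01 + dt * p11],
           [p10 + dt * p11, p11 + q]]

def update(h, z, r):
    # Position-only measurement with noise variance r.
    y = z - h.p                              # innovation
    s = h.P[0][0] + r                        # innovation variance
    k0, k1 = h.P[0][0] / s, h.P[1][0] / s    # Kalman gain
    h.p += k0 * y
    h.v += k1 * y
    (p00, p01), (p10, p11) = h.P
    # P = (I - K H) P'
    h.P = [[(1.0 - k0) * p00, (1.0 - k0) * p01],
           [p10 - k1 * p00, p11 - k1 * p01]]
    # The Gaussian likelihood of the innovation re-weights the hypothesis.
    h.weight *= math.exp(-0.5 * y * y / s) / math.sqrt(2.0 * math.pi * s)

def step(hypotheses, z, dt=0.1, q=0.01, r=0.04):
    for h in hypotheses:
        predict(h, control_acceleration(h), dt, q)
        update(h, z, r)
    total = sum(h.weight for h in hypotheses)
    for h in hypotheses:
        h.weight /= total                    # normalize the weights

# Two intentions: a goal at +10 m and a goal at -10 m.
hypotheses = [Hypothesis(goal=10.0, weight=0.5), Hypothesis(goal=-10.0, weight=0.5)]
for k in range(1, 21):
    # Noise-free measurements of a pedestrian walking toward +10 m at 1.3 m/s.
    step(hypotheses, z=1.3 * 0.1 * k)
weights = [h.weight for h in hypotheses]
```

Because the control input of the second hypothesis keeps pushing its prediction away from the observed positions, its innovations stay larger and its weight decays relative to the first hypothesis. A prediction over a longer horizon would repeat only the `predict` step for each hypothesis.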
The proposed approach is evaluated using real camera data from a simple scenario in the Edinburgh Informatics Forum. All results are examined as a function of the measurement quality, which is modeled by additive white Gaussian noise (AWGN) of increasing standard deviation on the position measurements. Prediction results for different time horizons ranging from one to ten seconds are studied. The Multi-Hypotheses filter is compared to a Kalman filter with a constant velocity model and identical parametrization (equal process noise covariance and measurement noise covariance matrices) but without intention information. Both models are evaluated by the root mean square error and the entropy of the error between the predicted trajectory and the ground truth trajectory. For both metrics, the Multi-Hypotheses filter outperforms the Kalman filter over the whole range of standard deviations and over the whole range of predicted time intervals. Especially for long predicted time intervals, the Multi-Hypotheses filter significantly improves the prediction accuracy (by up to two meters). The results confirm that intention information can significantly improve the quality of pedestrian prediction. Therefore, the incorporation of intention should be considered in critical surveillance scenarios. The proposed model is a promising way to achieve this incorporation.
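The root mean square error between a predicted and a ground truth trajectory can be computed as in the following sketch. It assumes that both trajectories are sampled at the same time instants and that the error is the Euclidean distance between positions; the function name and the sample values are illustrative and are not data from the evaluation. The entropy metric is not sketched because its exact definition is not given here.

```python
import math

def rmse(predicted, ground_truth):
    """Root mean square of the Euclidean position error over a 2D trajectory."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        (px - gx) ** 2 + (py - gy) ** 2
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Illustrative values only: the prediction drifts sideways from the true path.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
prediction = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
error = rmse(prediction, truth)
```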
Conclusions and Outlook
This contribution proposed a Multi-Hypotheses tracking filter, which incorporates GPFA, an adaptation of the social force model. The Multi-Hypotheses tracking filter was used for the prediction of pedestrian trajectories. The results were evaluated against a standard Kalman filter for a simple scenario in a forum, without a priori information about the pedestrian's intention. The Multi-Hypotheses filter outperformed the Kalman filter over the whole range of standard deviations and over the whole range of predicted time intervals. Especially for long predicted time intervals, the accuracy is significantly improved, which is a very promising result.
In future work, GPFA will be extended to higher levels of planning (tactical or strategic planning) for more complex scenarios, which should further increase the gain in prediction accuracy. Additionally, multi-target tracking scenarios with false alarms and missed detections will be considered.
[Bandyopadhyay2013] Bandyopadhyay, Tirthankar, et al. "Intention-aware motion planning." Algorithmic Foundations of Robotics X. Springer, Berlin, Heidelberg, 2013. 475-491.
[Bai2015] Bai, Haoyu, et al. "Intention-aware online POMDP planning for autonomous driving in a crowd." Robotics and Automation (ICRA), 2015 IEEE International Conference on. IEEE, 2015.
[DeNicolao2007] De Nicolao, Giuseppe, A. Ferrara, and L. Giacomini. "Onboard sensor-based collision risk assessment to improve pedestrians' safety." IEEE transactions on vehicular technology 56.5 (2007): 2405-2413.
[Grisetti2007] Grisetti, Giorgio, et al. "Improved techniques for grid mapping with rao-blackwellized particle filters." IEEE transactions on Robotics 23.1 (2007): 34-46.
[Helbing1995] Helbing, Dirk, and P. Molnar. "Social force model for pedestrian dynamics." Physical review E 51.5 (1995): 4282.
[Owoyemi2017] Owoyemi and K. Hashimoto, "Learning human motion intention with 3D Convolutional Neural Network," 2017 IEEE International Conference on Mechatronics and Automation (ICMA), Takamatsu, Japan, 2017, pp. 1810-1815.
[Particke2017] Particke, Florian, et al. "Pedestrian Tracking using a Generalized Potential Field Approach." Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications. 2017.
[Schon2005] Schon, Thomas, F. Gustafsson, and P-J. Nordlund. "Marginalized particle filters for mixed linear/nonlinear state-space models." IEEE Transactions on Signal Processing 53.7 (2005): 2279-2289.
[Thompson2009] Thompson, Simon, T. Horiuchi, and S. Kagami. "A probabilistic model of human motion and navigation intent for mobile robot path planning." Autonomous Robots and Agents, 2009. ICARA 2009. 4th International Conference on. IEEE, 2009.
[Yamazoe2008] Yamazoe, Hirotake, et al. "Remote gaze estimation with a single camera based on facial-feature tracking without special calibration actions." Proceedings of the 2008 symposium on Eye tracking research & applications. ACM, 2008.