Behavioral Cloning for Driverless Cars using Transfer Learning
Xin Zhang, Xingqun Zhan, School of Aeronautics and Astronautics, Shanghai Jiao Tong University, China
Several Convolutional Neural Networks (CNNs) are adapted to map raw pixels from front-facing cameras directly to steering signals, and the results are reported here. This transfer learning approach proved successful: with minimal effort in building a CNN from scratch and a modest amount of human involvement in labeling the images in the dataset, the algorithm learns to drive in traffic on local roads with or without lane markings and on motorways. It also succeeds in some areas where visual guidance, e.g. traffic signs, is unclear or the illumination varies quickly.
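The pixels-to-steering mapping can be illustrated with a deliberately tiny forward pass. This is a toy sketch, not the paper's network: one hand-rolled convolution, a ReLU, crude global pooling, and a linear output producing a single steering angle. All layer shapes and parameter names here are hypothetical.

```python
import numpy as np

def conv2d(image, kernel):
    """Naive 'valid' 2-D convolution (cross-correlation), single channel."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def steering_from_frame(frame, kernel, weights, bias):
    """Toy forward pass: conv -> ReLU -> global average -> linear -> angle."""
    fmap = np.maximum(conv2d(frame, kernel), 0.0)  # ReLU nonlinearity
    feat = fmap.mean()                             # crude global average pooling
    return float(weights * feat + bias)            # scalar steering angle
```

A real network stacks many such convolutional layers with learned kernels; training adjusts them by regressing the predicted angle against the recorded human steering angle (e.g. with a mean-squared-error loss).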
In the dataset, the labels consist solely of steering angles. As a black-box approach, the network automatically learns internal representations of most of the key processing steps, such as detecting lines, shapes, and other road features, in each layer of the network, even though it is never explicitly designed or ‘asked’ to do so.
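A dataset labeled only with steering angles typically reduces to a driving log pairing camera frames with one scalar per sample. The CSV layout below (three image paths plus an angle) is a hypothetical format for illustration, not necessarily the one used here:

```python
import csv
import io

# Hypothetical driving-log format: center/left/right image paths + steering angle.
log = io.StringIO(
    "center_001.jpg,left_001.jpg,right_001.jpg,0.05\n"
    "center_002.jpg,left_002.jpg,right_002.jpg,-0.12\n"
)

samples = []
for center, left, right, angle in csv.reader(log):
    # The steering angle is the ONLY label; no lane, sign, or depth annotations.
    samples.append({"images": (center, left, right), "steering": float(angle)})
```

Everything else the network knows about lanes, edges, and road geometry must be inferred from these image/angle pairs alone.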
Compared with traditional methods built on computer-vision algorithms (e.g. lane detection, path planning, and control), this transfer learning method tends to optimize all processing steps simultaneously and should therefore lead to better performance while maintaining a high performance-to-cost ratio. Performance can be further improved with different CNN architectures, and several CNNs that have prevailed in image-processing contests in recent years will be evaluated in terms of accuracy.
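The essence of the transfer learning setup can be caricatured in a few lines: keep a pretrained feature extractor frozen and fit only a new task-specific head on the target data. The "backbone" below is a random projection standing in for pretrained convolutional features, and all data are synthetic; this is a conceptual sketch, not the paper's training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen 'backbone': a fixed random projection standing in for pretrained
# convolutional features (never updated during training).
W_frozen = rng.normal(size=(64, 8))

def backbone(x):
    return np.maximum(x @ W_frozen, 0.0)  # frozen features + ReLU

# Synthetic target-task data: stand-ins for flattened images and steering angles.
X = rng.normal(size=(200, 64))
y = rng.normal(size=200) * 0.3

# Transfer learning step: fit ONLY the new regression head on frozen features.
F = backbone(X)
w_head, *_ = np.linalg.lstsq(F, y, rcond=None)
pred = F @ w_head
```

In practice the head is a small dense network trained by gradient descent, and the backbone may later be fine-tuned at a low learning rate; the frozen-then-fit split is what makes building on an existing CNN cheaper than training from scratch.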
The paper is structured as follows. First, the combined problem of guidance, navigation, and control (GNC) in driverless cars will be summarized, clarifying why behavioral cloning is needed. Second, several (3-5) CNN architectures previously used with high success rates in image classification will be explored from a theoretical viewpoint, in an attempt to understand the key structural characteristics and parameter sets required for their successful application, now in a driverless-car setting rather than in general image classification. Sections three through five focus on the simulation tests. In section three, a software simulator will be introduced for data collection in two settings, a simple one and a difficult one. The sensing system consists solely of three front-facing cameras: left, right, and center. The reason Lidar and other sensors are not ‘installed’ is two-fold: first, we aim at a low-cost solution to the perception problems of driverless cars; second, CNNs are extremely useful, but only for image processing, so at this initial stage of the project we resort to a camera-only solution and leave out other sensors. In section four, combined data-collection strategies will be explored. These strategies ensure certain driving characteristics in the trained behavior. For example, the car should swing back to the center of the road if it strays and gets close to the side of the road; another example is that the car should drive smoothly along the lane instead of zigzagging. Section five compares the test accuracies of the different trained models. Though most of them differ by only 0.1-0.7% in accuracy, some insights can be obtained regarding network structures and parameter settings. The last section concludes the paper. The datasets and source code are available on GitHub.
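The three-camera setup and the two desired behaviors (recovering toward the lane center, and steering smoothly) suggest two standard tricks, sketched below. The correction offset of 0.2 and the smoothing constant are hypothetical values for illustration, not figures from this paper:

```python
STEERING_CORRECTION = 0.2  # assumed offset, tuned empirically in practice

def augment_sample(center_angle):
    """Derive training labels for all three cameras from one recorded angle.

    The left camera sees the road as if the car had drifted left, so its label
    steers right (+); the right camera steers left (-). This teaches recovery
    behavior without manually recording dangerous swerving maneuvers.
    """
    return [
        ("center", center_angle),
        ("left", center_angle + STEERING_CORRECTION),
        ("right", center_angle - STEERING_CORRECTION),
    ]

def smooth(angles, alpha=0.3):
    """Exponential moving average over predicted angles to damp zigzagging."""
    out, prev = [], 0.0
    for a in angles:
        prev = alpha * a + (1 - alpha) * prev
        out.append(prev)
    return out
```

The side-camera correction effectively synthesizes off-center viewpoints with recovery labels, while the output filter trades a little responsiveness for the smooth lane-keeping behavior described above.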