Multi-sensor localization strategies have recently been gaining attention in the railway domain. The current trend is to reduce or remove the trackside equipment deployed for positioning and to provide the same functionality with on-board sensors. Although GNSS is one of the main resources for this task, its performance degrades sharply in the presence of local hazards such as multipath, shadowing, and blockage. For this reason, multi-sensor positioning methods are under study. Among them, methods that detect landmarks consisting of georeferenced trackside infrastructure elements, such as rail signs, and estimate the position of the train relative to them are rather promising. In this paper we therefore focus on constructing the section of a Rail Digital Map that describes these infrastructure elements, by fusing the outputs of a stereo video camera and a LIDAR. In particular, we discuss the algorithms for object detection, single-epoch landmark position estimation, and landmark tracking. Results of a performance assessment based on Monte Carlo simulations are also reported.