Performance evaluation of similarity measures for dense multimodal stereovision

Yaman M., Kalkan S.

JOURNAL OF ELECTRONIC IMAGING, vol. 25, 2016 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 25
  • Publication Date: 2016
  • DOI: 10.1117/1.JEI.25.3.033013
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Keywords: dense multimodal stereovision, similarity measures, Kinect device, information-based registration, automatic registration, image registration, maximization
  • Middle East Technical University Affiliated: Yes


Multimodal imaging systems have recently been drawing attention in fields such as medical imaging, remote sensing, and video surveillance. In such systems, estimating depth has become possible owing to the promising progress of multimodal matching techniques. We perform a systematic performance evaluation of similarity measures frequently used in the literature for dense multimodal stereovision. The evaluated measures include mutual information (MI), sum of squared distances, normalized cross-correlation, census transform, local self-similarity (LSS), as well as descriptors adapted to multimodal settings, such as scale invariant feature transform (SIFT), speeded-up robust features (SURF), histogram of oriented gradients (HOG), binary robust independent elementary features (BRIEF), and fast retina keypoint (FREAK). We evaluate the measures over datasets we generated, compiled, and provide as a benchmark, and compare the performances using the Winner Takes All (WTA) method. The datasets are (1) four synthetically modified popular pairs from the Middlebury Stereo Dataset (namely, Tsukuba, Venus, Cones, and Teddy) and (2) our own multimodal image pairs acquired using the infrared and electro-optical cameras of a Kinect device. The results show that MI and HOG provide promising results for multimodal imagery, and that FREAK, SURF, SIFT, and LSS can be considered as alternatives depending on the multimodality level and the computational complexity requirements of the intended application. (C) 2016 SPIE and IS&T
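As a concrete illustration of the kind of pipeline evaluated here (a minimal NumPy sketch, not the paper's actual implementation), the snippet below computes two of the listed measures: the census transform, whose matching cost is the Hamming distance between census codes under a Winner-Takes-All disparity selection, and a patch-level mutual information score built from a joint histogram. Window size, disparity range, and histogram bin count are illustrative choices, not values from the paper.

```python
import numpy as np

def census_transform(img, win=3):
    """Census transform: encode each pixel as a bit string obtained by
    comparing it with its neighbors in a win x win window."""
    r = win // 2
    h, w = img.shape
    codes = np.zeros((h, w), dtype=np.uint64)
    padded = np.pad(img, r, mode='edge')
    for dy in range(win):
        for dx in range(win):
            if dy == r and dx == r:
                continue  # skip the center pixel itself
            bit = (padded[dy:dy + h, dx:dx + w] < img).astype(np.uint64)
            codes = (codes << np.uint64(1)) | bit
    return codes

def wta_disparity(left, right, max_disp=16, win=3):
    """Winner-Takes-All: for each left-image pixel, pick the disparity
    with the lowest cost (Hamming distance between census codes)."""
    cl = census_transform(left, win)
    cr = census_transform(right, win)
    h, w = left.shape
    cost = np.full((h, w, max_disp + 1), np.inf)
    for d in range(max_disp + 1):
        xor = cl[:, d:] ^ cr[:, :w - d] if d else cl ^ cr
        # popcount of the XOR gives the Hamming distance per pixel
        hamming = np.array([bin(v).count('1') for v in xor.ravel()],
                           dtype=np.float64).reshape(xor.shape)
        cost[:, d:, d] = hamming
    return cost.argmin(axis=2)

def mutual_information(a, b, bins=16):
    """MI between two patches, estimated from their joint histogram."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0  # avoid log(0) on empty histogram cells
    return (pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum()
```

The census/Hamming cost is robust to the monotonic intensity changes typical of multimodal pairs because it encodes only local intensity ordering, while MI captures arbitrary statistical dependence between the two modalities at the price of needing enough samples per patch to populate the joint histogram.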