Performance evaluation of similarity measures for dense multimodal stereovision


Yaman M., Kalkan S.

JOURNAL OF ELECTRONIC IMAGING, vol. 25, 2016 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 25
  • Publication Date: 2016
  • DOI: 10.1117/1.jei.25.3.033013
  • Journal Name: JOURNAL OF ELECTRONIC IMAGING
  • Indexed In: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Keywords: dense multimodal stereovision, similarity measures, Kinect device, INFORMATION-BASED REGISTRATION, AUTOMATIC REGISTRATION, IMAGE REGISTRATION, MAXIMIZATION
  • Middle East Technical University Affiliated: Yes

Abstract

Multimodal imaging systems have recently been drawing attention in fields such as medical imaging, remote sensing, and video surveillance. In such systems, depth estimation has become possible thanks to the promising progress of multimodal matching techniques. We perform a systematic performance evaluation of similarity measures frequently used in the literature for dense multimodal stereovision. The evaluated measures include mutual information (MI), sum of squared distances (SSD), normalized cross-correlation (NCC), census transform, and local self-similarity (LSS), as well as descriptors adapted to multimodal settings, such as scale-invariant feature transform (SIFT), speeded-up robust features (SURF), histogram of oriented gradients (HOG), binary robust independent elementary features (BRIEF), and fast retina keypoint (FREAK). We evaluate the measures over datasets that we generated, compiled, and provide as a benchmark, and compare the performances using the Winner-Takes-All (WTA) method. The datasets are (1) synthetically modified versions of four popular pairs from the Middlebury Stereo Dataset (namely, Tsukuba, Venus, Cones, and Teddy) and (2) our own multimodal image pairs acquired using the infrared and electro-optical cameras of a Kinect device. The results show that MI and HOG provide promising results for multimodal imagery, while FREAK, SURF, SIFT, and LSS can be considered as alternatives depending on the multimodality level and the computational complexity requirements of the intended application. (C) 2016 SPIE and IS&T
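To illustrate the kind of pipeline the abstract describes, the following is a minimal sketch (not the paper's implementation) of dense stereo matching with Winner-Takes-All disparity selection, using SSD as the per-pixel similarity measure; the function names, the single-pixel (unaggregated) cost, and the rectified grayscale input assumption are all simplifications for illustration.

```python
import numpy as np

def ssd_cost_volume(left, right, max_disp):
    """Build an SSD cost volume for a rectified grayscale pair.

    left, right: 2-D float arrays of identical shape.
    Returns cost[d, y, x] = squared difference between left pixel (y, x)
    and right pixel (y, x - d). A single-pixel cost is used here for
    brevity; practical systems aggregate costs over a support window,
    and the multimodal measures evaluated in the paper (MI, census,
    HOG, ...) would replace this SSD term.
    """
    h, w = left.shape
    # Disparities that shift past the image border get infinite cost.
    cost = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        diff = left[:, d:] - right[:, : w - d]
        cost[d, :, d:] = diff ** 2
    return cost

def winner_takes_all(cost):
    """Pick, per pixel, the disparity with the minimal matching cost."""
    return np.argmin(cost, axis=0)

# Toy usage: the left image is the right image shifted by 2 pixels,
# so WTA should recover a disparity of 2 in the valid region.
rng = np.random.default_rng(0)
right = rng.random((8, 20))
left = np.empty_like(right)
left[:, 2:] = right[:, :-2]
left[:, :2] = right[:, :2]

disparity = winner_takes_all(ssd_cost_volume(left, right, max_disp=4))
```

WTA is the simplest disparity-selection strategy: it has no smoothness term, so the ranking of similarity measures it produces reflects the raw discriminative power of each measure, which is presumably why the paper uses it for comparison.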