Multi-modal stereo-vision using infrared/visible camera pairs


Thesis Type: Doctoral (Ph.D.)

Institution: Middle East Technical University, Faculty of Engineering, Department of Computer Engineering, Turkey

Thesis Approval Date: 2014

Student: MUSTAFA YAMAN

Advisor: SİNAN KALKAN

Abstract:

In this thesis, a novel method is introduced for computing disparity maps from a multi-modal stereo-vision system composed of an infrared-visible camera pair. The method uses mutual information (MI) as the basic similarity measure, where a segmentation-based adaptive windowing mechanism is proposed along with a novel mutual information computation surface that greatly enhances the results. In addition, the method incorporates joint prior probabilities into the cost matrix together with the negative mutual information measures. A novel adaptive cost aggregation method based on computed cost confidences is also proposed, and planes are fitted in segments to the resulting minimum-cost disparities that are sufficiently confident. The segments are refined by iterative splitting and merging according to the fitted confident disparities, which helps reduce the dependence of the disparity computation on the initial segmentation. Finally, all the steps are repeated iteratively, where more accurate joint probabilities are calculated using the previous iteration's disparity map. Two multi-modal stereo image datasets are generated for evaluating the method against the state-of-the-art methods in the literature: synthetically altered image pairs from the Middlebury Stereo Evaluation Dataset, and our own dataset of Kinect device infrared-visible camera image pairs, which can serve as a benchmark for multi-modal stereo-vision methods. On these datasets, it is shown that (i) the proposed method improves the quality of the existing MI formulation, (ii) the proposed method outperforms state-of-the-art methods in the literature, and (iii) the proposed method can provide depth comparable in quality to Kinect depth data.
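To illustrate the core idea the abstract builds on, the sketch below shows a standard mutual-information matching cost used in a simple fixed-window, winner-take-all disparity search. This is only a minimal baseline under assumed conditions (rectified 8-bit grayscale images, a hypothetical `disparity_by_mi` helper, and conventional histogram-based MI); it is not the thesis method itself, which additionally uses segmentation-based adaptive windows, joint prior probabilities, confidence-aware cost aggregation, segment-wise plane fitting, and iterative refinement.

```python
import numpy as np

def mutual_information(patch_a, patch_b, bins=32):
    """Mutual information between two equally sized image patches.

    Intensities are binned into a joint histogram; MI is computed from the
    joint and marginal distributions (a conventional formulation, not
    necessarily the exact one used in the thesis).
    """
    joint, _, _ = np.histogram2d(patch_a.ravel(), patch_b.ravel(),
                                 bins=bins, range=[[0, 256], [0, 256]])
    p_ab = joint / joint.sum()               # joint probability P(a, b)
    p_a = p_ab.sum(axis=1, keepdims=True)    # marginal P(a)
    p_b = p_ab.sum(axis=0, keepdims=True)    # marginal P(b)
    nz = p_ab > 0                            # avoid log(0)
    return np.sum(p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz]))

def disparity_by_mi(ir_img, vis_img, max_disp=32, win=11):
    """Winner-take-all disparity search using negative MI as the cost.

    Brute-force fixed-window baseline for rectified infrared/visible pairs;
    the thesis replaces the fixed window with segmentation-based adaptive
    windows and refines the result iteratively.
    """
    h, w = ir_img.shape
    half = win // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            ref = ir_img[y - half:y + half + 1, x - half:x + half + 1]
            costs = [-mutual_information(
                         ref,
                         vis_img[y - half:y + half + 1,
                                 x - d - half:x - d + half + 1])
                     for d in range(max_disp)]
            disp[y, x] = int(np.argmin(costs))
    return disp
```

Negative MI is used as the cost so that the most statistically dependent window pair yields the minimum; this is what makes the measure usable across modalities, since infrared and visible intensities are related but not linearly correlated.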