Structure-from-motion for systems with perspective and omnidirectional cameras


Tezin Türü: Doktora

Tezin Yürütüldüğü Kurum: Orta Doğu Teknik Üniversitesi, Enformatik Enstitüsü, Modelleme ve Simülasyon Anabilim Dalı, Türkiye

Tezin Onay Tarihi: 2009

Öğrenci: YALIN BAŞTANLAR

Danışman: ALPTEKİN TEMİZEL

Özet:

In this thesis, a pipeline for structure-from-motion with mixed camera types is described and methods for the steps of this pipeline to make it effective and automatic are proposed. These steps can be summarized as calibration, feature point matching, epipolar geometry and pose estimation, triangulation and bundle adjustment. We worked with catadioptric omnidirectional and perspective cameras and employed the sphere camera model, which encompasses single-viewpoint catadioptric systems as well as perspective cameras. For calibration of the sphere camera model, a new technique that has the advantage of linear and automatic parameter initialization is proposed. The projection of 3D points on a catadioptric image is represented linearly with a 6x10 projection matrix using lifted coordinates. This projection matrix is computed with an adequate number of 3D-2D correspondences and decomposed to obtain intrinsic and extrinsic parameters. Then, a non-linear optimization is performed to refine the parameters. For feature point matching between hybrid camera images, scale invariant feature transform (SIFT) is employed and a method is proposed to improve the SIFT matching output. With the proposed approach, omnidirectional-perspective matching performance significantly increases to enable automatic point matching. In addition, the use of virtual camera plane (VCP) images is evaluated, which are perspective images produced by unwarping the corresponding region in the omnidirectional image. The hybrid epipolar geometry is estimated using random sample consensus (RANSAC) and alternatives of pose estimation methods are evaluated. A weighting strategy for iterative linear triangulation which improves the structure estimation accuracy is proposed. Finally, multi-view structure-from-motion (SfM) is performed by employing the approach of adding views to the structure one by one. To refine the structure estimated with multiple views, sparse bundle adjustment method is employed with a modification to use the sphere camera model. Experiments on simulated and real images for the proposed approaches are conducted. Also, the results of hybrid multi-view SfM with real images are demonstrated, emphasizing the cases where it is advantageous to use omnidirectional cameras with perspective cameras.