Thesis Type: Doctorate
Institution: Middle East Technical University, Faculty of Engineering, Department of Computer Engineering, Turkey
Thesis Approval Date: 2006
Thesis Language: English
Student: Mutlu Uysal
Advisor: FATOŞ TUNAY YARMAN VURAL
Abstract: This thesis proposes an object localization and image retrieval framework that trains a discriminative feature set for each object class. For this purpose, a hierarchical learning architecture, together with a Neighborhood Tree, is introduced for object labeling. Initially, a large variety of features is extracted from the regions of the pre-segmented images. These features are then fed to the training module, which selects the "best set of representative features" for each class and suppresses the relatively less important ones. In this study, we address various problems of current image retrieval and classification systems, including feature space design, normalization, and the curse of dimensionality. Above all, we elaborate on the semantic gap problem in comparison to the human visual system. The proposed system emulates the eye-brain channel in two layers. The first layer combines relatively simple classifiers with low-level, low-dimensional features. The second layer then implements Adaptive Resonance Theory, which extracts higher-level information from the first layer. This two-layer architecture reduces the curse of dimensionality and diminishes the normalization problem. The concept of the Neighborhood Tree is introduced for identifying the whole object from the over-segmented image regions. The Neighborhood Tree has nodes corresponding to the neighboring regions as children, and it merges the regions through a search algorithm. Experiments are performed on a set of images from the Corel database, using MPEG-7, Haar, and Gabor features, in order to observe the strengths and weaknesses of the proposed system. The "Best Representative Features" are found in the training phase using Fuzzy ARTMAP [1], Feature-based AdaBoost [2], Descriptor-based AdaBoost, Best Representative Descriptor [3], majority voting, and the proposed hierarchical learning architecture. During the