Tezin Türü: Yüksek Lisans
Tezin Yürütüldüğü Kurum: Orta Doğu Teknik Üniversitesi, Mühendislik Fakültesi, Bilgisayar Mühendisliği Bölümü, Türkiye
Tezin Onay Tarihi: 2009
Öğrenci: EBRU DOĞAN
Danışman: ADNAN YAZICI
Özet:The audio signals can provide rich semantic cues for analyzing multimedia content, so audio information has been recently used for content-based multimedia indexing and retrieval. Due to growing amount of audio data, demand for efficient retrieval techniques is increasing. In this thesis work, we propose a complete, scalable and extensible audio based content management and retrieval system for news broadcasts. The proposed system considers classification, segmentation, analysis and retrieval of an audio stream. In the sound classification and segmentation stage, a sound stream is segmented by classifying each sub segment into silence, pure speech, music, environmental sound, speech over music, and speech over environmental sound in multiple steps. Support Vector Machines and Hidden Markov Models are employed for classification and these models are trained by using different sets of MPEG-7 features. In the analysis and retrieval stage, two alternatives exist for users to query audio data. The first of these isolates user from main acoustic classes by providing semantic domain based fuzzy classes. The latter offers users to query audio by giving an audio sample in order to find out the similar segments or by requesting expressive summary of the content directly. Additionally, a series of tests was conducted on audio tracks of TRECVID news broadcasts to evaluate the performance of the proposed solution.