Video segmentation based on audio feature extraction.

Neriman Atar

Thesis Type: Master's

Institution Where the Thesis Was Conducted: Orta Doğu Teknik Üniversitesi (Middle East Technical University), Faculty of Engineering, Department of Electrical and Electronics Engineering, Turkey

Thesis Approval Date: 2009

Thesis Language: English

Student: Neriman Atar

Advisor: GÖZDE AKAR

Open Archive Collection: AVESİS Open Access Collection

Abstract:

This study presents an automatic video segmentation and classification system based on audio features. Video sequences are divided into segments labeled “speech”, “music”, “crowd”, or “silence”. Segments that do not belong to any of these classes are left as “unclassified”. Silence segments are detected by comparing the short-time energy of the embedded audio sequence against a simple threshold. The “speech”, “music”, and “crowd” segments are detected with a multiclass classification scheme. For this purpose, three audio feature sets were formed: the first consists purely of MPEG-7 audio features, the second consists of the audio features used in [31], and the third is the combination of these two sets. A histogram comparison method was used to choose the best features. The audio segmentation system was trained and tested with each of these feature sets. The evaluation results show that Feature Set 3, the combination of the other two feature sets, gives the best performance for the audio classification system. The output of the classification system is an XML file that contains MPEG-7 audio segment descriptors for the video sequence. An application scenario is also given in which the audio segmentation results are combined with visual analysis results to obtain audio-visual video segments.
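
As an illustration of the silence-detection step described in the abstract, the following is a minimal sketch of threshold comparison on short-time energy. It is not the thesis implementation: the function names, the frame length, the hop size, the threshold value, and the synthetic test signal are all assumptions chosen for the example.

```python
import math


def short_time_energy(samples, frame_len, hop):
    """Return the mean squared amplitude of each full frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def silence_mask(samples, frame_len=400, hop=200, threshold=1e-4):
    """Mark a frame as silent when its energy falls below the threshold.

    frame_len, hop, and threshold are illustrative values (assumptions),
    not parameters taken from the thesis.
    """
    return [e < threshold for e in short_time_energy(samples, frame_len, hop)]


# Synthetic signal (assumption): 0.5 s of a 440 Hz tone, then 0.5 s of zeros,
# sampled at 8 kHz.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 2)]
signal = tone + [0.0] * (rate // 2)

mask = silence_mask(signal)
print(sum(mask), "of", len(mask), "frames flagged as silent")
```

In practice the threshold would need to be tuned to the recording level and background noise of the material; frames not flagged as silent would then be passed on to the multiclass “speech”/“music”/“crowd” classifier.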