Semantic classification and retrieval system for environmental sounds


Thesis Type: Postgraduate

Institution Of The Thesis: Orta Doğu Teknik Üniversitesi, Faculty of Engineering, Department of Computer Engineering, Turkey

Approval Date: 2012

Student: ÇİĞDEM OKUYUCU

Consultant: ADNAN YAZICI

Abstract:

The growth of multimedia content in recent years motivated the research on audio classification and content retrieval area. In this thesis, a general environmental audio classification and retrieval approach is proposed in which higher level semantic classes (outdoor, nature, meeting and violence) are obtained from lower level acoustic classes (emergency alarm, car horn, gun-shot, explosion, automobile, motorcycle, helicopter, wind, water, rain, applause, crowd and laughter). In order to classify an audio sample into acoustic classes, MPEG-7 audio features, Mel Frequency Cepstral Coefficients (MFCC) feature and Zero Crossing Rate (ZCR) feature are used with Hidden Markov Model (HMM) and Support Vector Machine (SVM) classifiers. Additionally, a new classification method is proposed using Genetic Algorithm (GA) for classification of semantic classes. Query by Example (QBE) and keyword-based query capabilities are implemented for content retrieval.