Structural and semantic modeling of audio for content-based querying and browsing

Sert, Mustafa; Baykal, Buyurman; Yazici, Adnan

FLEXIBLE QUERY ANSWERING SYSTEMS, PROCEEDINGS, vol.4027, pp.319-330, 2006 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 4027
Publication Date: 2006
Journal Name: FLEXIBLE QUERY ANSWERING SYSTEMS, PROCEEDINGS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, EMBASE, MathSciNet, Philosopher's Index, zbMATH
Pages: pp.319-330
Affiliated with Middle East Technical University: Yes

Abstract

A typical content-based audio management system deals with three aspects: audio segmentation and classification, audio analysis, and content-based retrieval of audio. In this paper, we integrate these three aspects into a single framework and propose an efficient method for flexible querying and browsing of auditory data. More specifically, we use two robust feature sets, MPEG-7 Audio Spectrum Flatness (ASF) and Mel Frequency Cepstral Coefficients (MFCC), as the underlying features to improve content-based retrieval accuracy, since each feature has advantages for distinct types of audio (e.g., music and speech). The proposed system supports a wide range of ways to query and browse audio data by content, such as querying and browsing for a chorus section or sound effects, and query-by-example. In addition, clients can express their queries as point, range, and k-nearest neighbor queries, which are particularly significant in the multimedia domain.
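
To make the three query forms named in the abstract concrete, the following is a minimal sketch of point, range, and k-nearest neighbor queries over feature vectors. It is not the authors' implementation: the abstract does not state the distance measure or index structure, so Euclidean distance and a brute-force scan are assumed here, and the `DATABASE` contents, clip names, and function names are hypothetical stand-ins for real ASF or MFCC features.

```python
import heapq
import math

# Hypothetical toy database: clip name -> feature vector.
# A real system would store ASF or MFCC features; these values are made up.
DATABASE = {
    "speech_01": [0.10, 0.80, 0.30],
    "speech_02": [0.12, 0.78, 0.33],
    "music_01": [0.90, 0.20, 0.50],
    "music_02": [0.85, 0.25, 0.55],
    "effect_01": [0.40, 0.40, 0.95],
}


def distance(a, b):
    """Euclidean distance (an assumed measure, not taken from the paper)."""
    return math.dist(a, b)


def point_query(db, query, tol=1e-9):
    """Return clips whose features equal the query vector, within a tiny tolerance."""
    return sorted(name for name, vec in db.items() if distance(vec, query) <= tol)


def range_query(db, query, radius):
    """Return clips within `radius` of the query, nearest first."""
    hits = sorted((distance(vec, query), name) for name, vec in db.items())
    return [name for d, name in hits if d <= radius]


def knn_query(db, query, k):
    """Return the k clips closest to the query, nearest first."""
    nearest = heapq.nsmallest(
        k, db.items(), key=lambda item: distance(item[1], query)
    )
    return [name for name, _ in nearest]
```

In this sketch a query-by-example would extract a feature vector from the example clip and pass it to `knn_query`. The brute-force scan is chosen only for clarity; a larger collection would normally call for a multidimensional index.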