Audio Feature and Classifier Analysis for Efficient Recognition of Environmental Sounds


Okuyucu C., SERT M., YAZICI A.

15th IEEE International Symposium on Multimedia (ISM), California, United States, 9-11 December 2013, pp. 125-132

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI Number: 10.1109/ism.2013.29
  • City of Publication: California
  • Country of Publication: United States
  • Page Numbers: pp. 125-132
  • Keywords: environmental sound classification, MPEG-7 audio, MFCC, HMM, SVM, SEGMENTATION
  • Affiliated with Middle East Technical University: Yes

Abstract

Environmental sounds (ES) have distinctive characteristics, such as an unstructured nature and typically noise-like, flat spectra, which make the recognition task more difficult than for speech or music. Here, we perform an exhaustive feature and classifier analysis for the recognition of considerably similar ES categories and propose a best representative feature set to yield higher recognition accuracy. In the experiments, thirteen (13) ES categories, namely emergency alarm, car horn, gun, explosion, automobile, helicopter, water, wind, rain, applause, crowd, and laughter, are detected and tested based on eleven (11) audio features (MPEG-7 family, ZCR, MFCC, and combinations) using HMM and SVM classifiers. Extensive experiments have been conducted to demonstrate the effectiveness of these joint features for ES classification. Our experiments show that the joint feature set ASFCS-H (Audio Spectrum Flatness, Centroid, Spread, and Audio Harmonicity) is the best representative feature set, with an average F-measure of 80.6%.
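
As a rough illustration of three of the descriptors named in the abstract (spectrum flatness, centroid, and spread), the following minimal sketch computes simplified versions of them for a single audio frame. It is not the authors' implementation and does not follow the exact MPEG-7 definitions, which work on band-wise and log-frequency representations; here a plain linear-frequency power spectrum is assumed. The function names `power_spectrum` and `spectral_descriptors`, the frame length, and the `eps` constant are hypothetical choices made for this example.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_descriptors(frame, sample_rate):
    """Return (flatness, centroid_hz, spread_hz) for one frame.

    Simplified, linear-frequency variants; an assumption of this sketch,
    not the MPEG-7 normative definitions.
    """
    power = power_spectrum(frame)
    n = len(frame)
    freqs = [k * sample_rate / n for k in range(len(power))]
    eps = 1e-12  # hypothetical guard against log(0) and division by zero
    total = sum(power) + eps

    # Flatness: geometric mean over arithmetic mean of the power spectrum.
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    flatness = math.exp(log_mean) / (total / len(power))

    # Centroid: power-weighted mean frequency.
    centroid = sum(f * p for f, p in zip(freqs, power)) / total

    # Spread: power-weighted standard deviation around the centroid.
    variance = sum((f - centroid) ** 2 * p for f, p in zip(freqs, power)) / total
    spread = math.sqrt(variance)
    return flatness, centroid, spread


if __name__ == "__main__":
    # A 1 kHz tone sampled at 8 kHz; 64 samples place it exactly on a DFT bin.
    tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(64)]
    print(spectral_descriptors(tone, 8000))
```

A pure tone concentrates its power in one bin, so its flatness is close to zero, whereas a noise-like frame spreads power across bins and yields a much higher flatness. This is the property that makes flatness informative for the noise-like sounds the abstract describes.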