Vocal tract resonances tracking based on voiced and unvoiced speech classification using dynamic programming and fixed interval Kalman smoother


Oezbek I. Y., Demirekler M.

33rd IEEE International Conference on Acoustics, Speech and Signal Processing, Nevada, United States of America, 30 March - 04 April 2008, pp.4217-4220

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI Number: 10.1109/icassp.2008.4518585
  • City of Publication: Nevada
  • Country of Publication: United States of America
  • Page Numbers: pp.4217-4220
  • Affiliated with Middle East Technical University: Yes

Abstract

This paper presents a systematic framework for accurately estimating vocal tract resonances (formants) that requires neither training data nor a phonetic transcription. In the proposed method, the speech signal is segmented into voiced and unvoiced parts; for each part, the resonance frequencies of the vocal tract are estimated by dynamic programming and then refined by Kalman filtering/smoothing. The proposed method is compared with three other methods: a baseline, WaveSurfer [10], and MSR [5]. It reduces the overall estimation error rate for the vocal tract resonances (F1, F2, and F3) by 35%, 39.6%, and 2.74% relative to the baseline, WaveSurfer, and MSR methods, respectively.
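The smoothing stage mentioned in the abstract can be illustrated with a minimal sketch of a fixed-interval (Rauch-Tung-Striebel) Kalman smoother applied separately to each voiced or unvoiced segment. This is not the authors' implementation: the scalar random-walk state model, the function names `kalman_smooth` and `smooth_per_segment`, and all noise variances are assumptions chosen only for illustration, and the dynamic programming stage is not shown (the input track stands in for its output).

```python
from itertools import groupby


def kalman_smooth(observations, q=1e4, r=1e4, p0=1e6):
    """Fixed-interval (RTS) smoother for an assumed scalar random-walk model.

    Assumed model: x[t] = x[t-1] + w, w ~ N(0, q);  y[t] = x[t] + v, v ~ N(0, r).
    """
    n = len(observations)
    if n == 0:
        return []
    x_pred, p_pred = [0.0] * n, [0.0] * n  # one-step predictions
    x_filt, p_filt = [0.0] * n, [0.0] * n  # filtered estimates

    # Initialise the state from the first observation (an assumption).
    x_prev, p_prev = observations[0], p0
    for t, y in enumerate(observations):
        # Predict: a random walk keeps the mean and inflates the variance.
        x_pred[t] = x_prev
        p_pred[t] = p_prev + q
        # Update with the current observation.
        gain = p_pred[t] / (p_pred[t] + r)
        x_filt[t] = x_pred[t] + gain * (y - x_pred[t])
        p_filt[t] = (1.0 - gain) * p_pred[t]
        x_prev, p_prev = x_filt[t], p_filt[t]

    # Backward pass: refine each estimate using all later observations.
    x_smooth = x_filt[:]
    for t in range(n - 2, -1, -1):
        c = p_filt[t] / p_pred[t + 1]
        x_smooth[t] = x_filt[t] + c * (x_smooth[t + 1] - x_pred[t + 1])
    return x_smooth


def smooth_per_segment(track, voiced, q_voiced=1e3, q_unvoiced=1e4):
    """Smooth each run of voiced/unvoiced frames independently.

    `track` is one raw resonance track in Hz; `voiced` is a per-frame flag.
    The per-class process variances are hypothetical values.
    """
    smoothed, start = [], 0
    for is_voiced, run in groupby(voiced):
        length = len(list(run))
        q = q_voiced if is_voiced else q_unvoiced
        smoothed.extend(kalman_smooth(track[start:start + length], q=q))
        start += length
    return smoothed


if __name__ == "__main__":
    raw_f1 = [510.0, 495.0, 900.0, 505.0, 500.0, 1500.0, 1480.0, 1520.0]
    flags = [True] * 5 + [False] * 3
    print(smooth_per_segment(raw_f1, flags))
```

Restarting the smoother at each voicing boundary keeps estimates in one segment from leaking into the next, which is one simple way to read the abstract's "for each part".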