Vocal tract resonances tracking based on voiced and unvoiced speech classification using dynamic programming and fixed interval Kalman smoother

33rd IEEE International Conference on Acoustics, Speech and Signal Processing, Nevada, United States, 30 March - 4 April 2008, pp. 4217-4220

Publication Type: Conference Paper / Full-Text Paper
DOI Number: 10.1109/icassp.2008.4518585
City of Publication: Nevada
Country of Publication: United States
Page Numbers: pp. 4217-4220
Affiliated with Middle East Technical University: Yes

Abstract

This paper presents a systematic framework for accurate estimation of vocal tract resonances (formants) that requires neither training data nor a phonetic transcription. In the proposed method, the speech signal is segmented into voiced and unvoiced parts, and the resonance frequencies of the vocal tract are estimated by dynamic programming and then refined by Kalman filtering/smoothing, separately for each part. The performance of the proposed method is compared with that of three other methods: a baseline, WaveSurfer [10] and MSR [5]. The proposed method reduces the overall estimation error rate for the vocal tract resonances (F1, F2 and F3) by 35%, 39.6% and 2.74% relative to the baseline, WaveSurfer and MSR methods, respectively.
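
The abstract does not give the state-space model used in the paper, so the following is only a minimal sketch of the general idea of fixed-interval Kalman smoothing applied separately to voiced and unvoiced parts. It assumes a scalar random-walk model for a single formant track, with the raw per-frame estimates (for example, from a dynamic programming stage) as noisy observations. The function names `rts_smooth` and `smooth_per_segment`, and the variance values, are hypothetical and chosen for illustration only.

```python
from itertools import groupby


def rts_smooth(observations, process_var=1e4, meas_var=4e4):
    """Fixed-interval (Rauch-Tung-Striebel) smoother, scalar random walk.

    Assumed model (not taken from the paper):
        x[t] = x[t-1] + w,  w ~ N(0, process_var)
        y[t] = x[t]   + v,  v ~ N(0, meas_var)
    """
    n = len(observations)
    if n == 0:
        return []

    filt_mean, filt_var = [0.0] * n, [0.0] * n
    pred_mean, pred_var = [0.0] * n, [0.0] * n

    # Initialise the filter from the first observation.
    filt_mean[0] = pred_mean[0] = float(observations[0])
    filt_var[0] = pred_var[0] = meas_var

    # Forward pass: standard Kalman filter.
    for t in range(1, n):
        pred_mean[t] = filt_mean[t - 1]
        pred_var[t] = filt_var[t - 1] + process_var
        gain = pred_var[t] / (pred_var[t] + meas_var)
        filt_mean[t] = pred_mean[t] + gain * (observations[t] - pred_mean[t])
        filt_var[t] = (1.0 - gain) * pred_var[t]

    # Backward pass: refine each estimate using the frames that follow it.
    smoothed = filt_mean[:]
    for t in range(n - 2, -1, -1):
        c = filt_var[t] / pred_var[t + 1]
        smoothed[t] = filt_mean[t] + c * (smoothed[t + 1] - pred_mean[t + 1])
    return smoothed


def smooth_per_segment(track, voiced, **kwargs):
    """Smooth each run of consecutive voiced or unvoiced frames on its own."""
    out, start = [], 0
    for _, run in groupby(voiced):
        length = len(list(run))
        out.extend(rts_smooth(track[start:start + length], **kwargs))
        start += length
    return out


# Hypothetical F1 estimates in Hz with one outlier in the voiced run.
track = [500, 500, 900, 500, 500, 1500, 1500]
voiced = [True, True, True, True, True, False, False]
smoothed_track = smooth_per_segment(track, voiced)
```

Smoothing each run independently keeps the smoother from blending estimates across a voiced/unvoiced boundary, which is one plausible motivation for the segmentation step described in the abstract.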