Matrix quantization and mixed excitation based linear predictive speech coding at very low bit rates


Ozaydin S., Baykal B.

SPEECH COMMUNICATION, vol.41, pp.381-392, 2003 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 41
  • Publication Date: 2003
  • DOI: 10.1016/s0167-6393(03)00009-8
  • Journal Name: SPEECH COMMUNICATION
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.381-392
  • Keywords: very low bit rate, LSF matrix quantization, LSF vector quantization, mixed excitation, MELP, ARMA prediction, PARAMETERS, DESIGN
  • Middle East Technical University Affiliated: No

Abstract

A matrix quantization scheme and a very low bit rate vocoder are developed to obtain good quality speech for low capacity communication links. The new matrix quantization method operates at bit rates between 400 and 800 bps; using a 25 ms linear predictive coding (LPC) analysis frame, a spectral distortion of about 1 dB is achieved at 800 bps. Techniques for improving performance in very low bit rate vocoding include quantization of residual line spectral frequency (LSF) vectors, multistage matrix quantization, joint quantization of pitch and voiced/unvoiced/mixed decisions, and a technique to obtain voiced/unvoiced/mixed decisions. In the new matrix quantization based mixed excitation (MQME) vocoder, the residual LSF vectors for two consecutive frames are obtained using autoregressive moving average (ARMA) prediction, then grouped into a superframe and jointly quantized. The other speech parameters are quantized in each frame. The residual LSF vector quantization yields a bit rate reduction in the vocoder. For the MQME vocoder, listening tests have shown that efficient, high quality coding is achieved at a bit rate of 1200 bps. Test results are compared with the mixed excitation based 2400 bps MELP vocoder, which was chosen as the new federal standard, and it is observed that the degradation in speech quality is tolerable and that the performance is near that of the 2400 bps MELP vocoder, particularly in quiet environments. (C) 2003 Elsevier B.V. All rights reserved.
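
The superframe idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a superframe spans two 25 ms frames (50 ms) and that the 400 to 800 bps range covers only the LSF parameters. The ARMA predictor is not modeled, so the predicted LSF vectors are passed in as given values. The plain squared-error distance, the single-stage search, and the names `bits_per_superframe`, `residual`, and `quantize_superframe` are all hypothetical choices made for illustration.

```python
from typing import List, Sequence

# One row per frame, one column per LSF coefficient.
Matrix = List[List[float]]

FRAME_MS = 25               # LPC analysis frame length stated in the abstract
FRAMES_PER_SUPERFRAME = 2   # two consecutive frames are quantized jointly


def bits_per_superframe(bit_rate_bps: int) -> float:
    """Bits available per superframe at the given bit rate (assumed LSF-only)."""
    return bit_rate_bps * FRAME_MS * FRAMES_PER_SUPERFRAME / 1000


def residual(lsf: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Residual LSF vector: actual LSFs minus an externally supplied prediction."""
    return [x - p for x, p in zip(lsf, predicted)]


def quantize_superframe(residuals: Matrix, codebook: Sequence[Matrix]) -> int:
    """Return the index of the codebook matrix nearest to the residual matrix.

    Assumption: unweighted squared error over all entries of the matrix.
    """
    def dist(a: Matrix, b: Matrix) -> float:
        return sum((x - y) ** 2
                   for row_a, row_b in zip(a, b)
                   for x, y in zip(row_a, row_b))

    return min(range(len(codebook)), key=lambda i: dist(residuals, codebook[i]))
```

Under these assumptions, 800 bps corresponds to 40 bits per 50 ms superframe and 400 bps to 20 bits. A codebook addressed directly with that many bits would be far too large to search exhaustively, which is one common motivation for the multistage structure the abstract mentions.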