Tezin Türü: Yüksek Lisans
Tezin Yürütüldüğü Kurum: Orta Doğu Teknik Üniversitesi, Mühendislik Fakültesi, Elektrik ve Elektronik Mühendisliği Bölümü, Türkiye
Tezin Onay Tarihi: 2005
Öğrenci: ALİ ORKAN BAYER
Danışman: TOLGA ÇİLOĞLU
Özet:This study focuses on large vocabulary Turkish continuous speech recognition. Continuous speech recognition for Turkish cannot be performed accurately because of the agglutinative nature of the language. The agglutinative nature decreases the performance of the classical language models that are used in the area. In this thesis firstly, acoustic models using different parameters are constructed and tested. Then, three types of n-gram language models are built. These involve class-based models, stem-based models, and stem-end-based models. Two pass recognition is performed using the Hidden Markov Modeling Toolkit (HTK) for testing the system first with the bigram models and then with the trigram models. At the end of the study, it is found that trigram models over stems and endings give better results, since their coverage of the vocabulary is better.