IEEE 12th Signal Processing and Communications Applications Conference, Kusadasi, Turkey, 28 - 30 April 2004, pp.280-283
This paper describes the Turkish telephone speech database created within the framework of Orientel (IST-2000-28373), a 5th Framework Programme project. Orientel aims to collect telephone speech data for 21 languages. The Turkish database was completed successfully in 16 months. The work comprises the recording, annotation and documentation of 1700 recording sessions. The speaker distribution has been balanced with respect to criteria such as age, sex, dialect, calling environment and network. The database contains digits, numbers, time and date expressions, words and sentences. It is the first Turkish speech database of its size, and the first to be prepared and validated in such a detailed, systematic manner.