ResearchSpace

Pronunciation modelling of foreign words for Sepedi ASR


dc.contributor.author Modipa, T
dc.contributor.author Davel, MH
dc.date.accessioned 2010-12-23T10:00:21Z
dc.date.available 2010-12-23T10:00:21Z
dc.date.issued 2010-11
dc.identifier.citation Modipa, T and Davel, MH. 2010. Pronunciation modelling of foreign words for Sepedi ASR. 21st Annual Symposium of the Pattern Recognition Association of South Africa (PRASA), Stellenbosch, South Africa, 22-23 November 2010, pp 185-189 en
dc.identifier.isbn 978-0-7992-2470-2
dc.identifier.uri http://hdl.handle.net/10204/4715
dc.description 21st Annual Symposium of the Pattern Recognition Association of South Africa (PRASA), Stellenbosch, South Africa, 22-23 November 2010 en
dc.description.abstract This study focuses on the effective pronunciation modelling of words from different languages encountered during the development of a Sepedi automatic speech recognition (ASR) system. While the speech corpus used for training the ASR system consists mostly of Sepedi utterances, many words from English (and other South African languages) are embedded within the Sepedi sentences. To model these words effectively, different approaches to pronunciation dictionary development are investigated, specifically: (1) using language-specific letter-to-sound rules to predict the pronunciation of each word (based on the language of the word) and mapping foreign phonemes to Sepedi phonemes using linguistically motivated mappings, (2) experimenting with data-driven foreign-to-Sepedi phoneme mappings, and (3) using Sepedi letter-to-sound rules to predict the pronunciation of all words irrespective of language. We find that the data-driven phoneme mappings are more accurate than the initial linguistically motivated mappings evaluated, and (by a slight margin) obtain our best result using Sepedi letter-to-sound rules across all words in the speech corpus. en
dc.language.iso en en
dc.publisher PRASA 2010 en
dc.relation.ispartofseries Conference Paper en
dc.subject Sepedi en
dc.subject Automatic speech recognition en
dc.subject Pronunciation modelling en
dc.subject Pattern recognition en
dc.subject PRASA 2010 en
dc.title Pronunciation modelling of foreign words for Sepedi ASR en
dc.type Conference Presentation en
dc.identifier.apacitation Modipa, T., & Davel, M. H. (2010). Pronunciation modelling of foreign words for Sepedi ASR. PRASA 2010. http://hdl.handle.net/10204/4715 en_ZA
dc.identifier.chicagocitation Modipa, T, and MH Davel. "Pronunciation modelling of foreign words for Sepedi ASR." (2010): http://hdl.handle.net/10204/4715 en_ZA
dc.identifier.vancouvercitation Modipa T, Davel MH. Pronunciation modelling of foreign words for Sepedi ASR. PRASA 2010; 2010. http://hdl.handle.net/10204/4715 en_ZA
dc.identifier.ris TY - Conference Presentation AU - Modipa, T AU - Davel, MH AB - This study focuses on the effective pronunciation modelling of words from different languages encountered during the development of a Sepedi automatic speech recognition (ASR) system. While the speech corpus used for training the ASR system consists mostly of Sepedi utterances, many words from English (and other South African languages) are embedded within the Sepedi sentences. To model these words effectively, different approaches to pronunciation dictionary development are investigated, specifically: (1) using language-specific letter-to-sound rules to predict the pronunciation of each word (based on the language of the word) and mapping foreign phonemes to Sepedi phonemes using linguistically motivated mappings, (2) experimenting with data-driven foreign-to-Sepedi phoneme mappings, and (3) using Sepedi letter-to-sound rules to predict the pronunciation of all words irrespective of language. We find that the data-driven phoneme mappings are more accurate than the initial linguistically motivated mappings evaluated, and (by a slight margin) obtain our best result using Sepedi letter-to-sound rules across all words in the speech corpus. DA - 2010-11 DB - ResearchSpace DP - CSIR KW - Sepedi KW - Automatic speech recognition KW - Pronunciation modelling KW - Pattern recognition KW - PRASA 2010 LK - https://researchspace.csir.co.za PY - 2010 SM - 978-0-7992-2470-2 T1 - Pronunciation modelling of foreign words for Sepedi ASR TI - Pronunciation modelling of foreign words for Sepedi ASR UR - http://hdl.handle.net/10204/4715 ER - en_ZA

