ResearchSpace

Collecting and evaluating speech recognition corpora for nine Southern Bantu languages


dc.contributor.author Badenhorst, JAC
dc.contributor.author Van Heerden, C
dc.contributor.author Davel, M
dc.contributor.author Barnard, E
dc.date.accessioned 2010-07-23T14:36:05Z
dc.date.available 2010-07-23T14:36:05Z
dc.date.issued 2009-03
dc.identifier.citation Badenhorst, JAC, Van Heerden, C, Davel, M et al. 2009. Collecting and evaluating speech recognition corpora for nine Southern Bantu languages. EACL Workshop on Language Technologies for African Languages, Athens, Greece, 31 March 2009, pp 1-8 en
dc.identifier.isbn 1932432256
dc.identifier.uri http://hdl.handle.net/10204/4128
dc.description EACL Workshop on Language Technologies for African Languages, Athens, Greece, 31 March 2009 en
dc.description.abstract The authors describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively small compared to major corpora in world languages, and we report on our investigation of the stability of the ASR models derived from the corpus. We also report on phoneme distance measures across languages, and describe initial phone recognisers that were developed using this data. en
dc.language.iso en en
dc.publisher Association for Computational Linguistics en
dc.subject Automated telephony systems en
dc.subject Lwazi corpus en
dc.subject Automatic speech recognition system en
dc.subject ASR en
dc.subject Rural areas en
dc.subject African languages en
dc.subject Speech corpus en
dc.subject Southern Bantu languages en
dc.subject Language technologies en
dc.subject Computational linguistics en
dc.title Collecting and evaluating speech recognition corpora for nine Southern Bantu languages en
dc.type Conference Presentation en
dc.identifier.apacitation Badenhorst, J., Van Heerden, C., Davel, M., & Barnard, E. (2009). Collecting and evaluating speech recognition corpora for nine Southern Bantu languages. Association for Computational Linguistics. http://hdl.handle.net/10204/4128 en_ZA
dc.identifier.chicagocitation Badenhorst, JAC, C Van Heerden, M Davel, and E Barnard. "Collecting and evaluating speech recognition corpora for nine Southern Bantu languages." (2009): http://hdl.handle.net/10204/4128 en_ZA
dc.identifier.vancouvercitation Badenhorst J, Van Heerden C, Davel M, Barnard E. Collecting and evaluating speech recognition corpora for nine Southern Bantu languages; Association for Computational Linguistics; 2009. http://hdl.handle.net/10204/4128 en_ZA
dc.identifier.ris TY - Conference Presentation AU - Badenhorst, JAC AU - Van Heerden, C AU - Davel, M AU - Barnard, E AB - The authors describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively small compared to major corpora in world languages, and we report on our investigation of the stability of the ASR models derived from the corpus. We also report on phoneme distance measures across languages, and describe initial phone recognisers that were developed using this data. DA - 2009-03 DB - ResearchSpace DP - CSIR KW - Automated telephony systems KW - Lwazi corpus KW - Automatic speech recognition system KW - ASR KW - Rural areas KW - African languages KW - Speech corpus KW - Southern Bantu languages KW - Language technologies KW - Computational linguistics LK - https://researchspace.csir.co.za PY - 2009 SM - 1932432256 T1 - Collecting and evaluating speech recognition corpora for nine Southern Bantu languages TI - Collecting and evaluating speech recognition corpora for nine Southern Bantu languages UR - http://hdl.handle.net/10204/4128 ER - en_ZA

