Collecting and evaluating speech recognition corpora for nine Southern Bantu languages

Badenhorst, JAC; Van Heerden, C; Davel, M; Barnard, E

dc.contributor.author	Badenhorst, JAC
dc.contributor.author	Van Heerden, C
dc.contributor.author	Davel, M
dc.contributor.author	Barnard, E
dc.date.accessioned	2010-07-23T14:36:05Z
dc.date.available	2010-07-23T14:36:05Z
dc.date.issued	2009-03
dc.identifier.citation	Badenhorst, JAC, Van Heerden, C, Davel, M et al. 2009. Collecting and evaluating speech recognition corpora for nine Southern Bantu languages. EACL Workshop on Language Technologies for African Languages, Athens, Greece, 31 March 2009, pp 1-8	en
dc.identifier.isbn	1932432256
dc.identifier.uri	http://hdl.handle.net/10204/4128
dc.description	EACL Workshop on Language Technologies for African Languages, Athens, Greece, 31 March 2009	en
dc.description.abstract	The authors describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively small compared to major corpora in world languages, and we report on our investigation of the stability of the ASR models derived from the corpus. We also report on phoneme distance measures across languages, and describe initial phone recognisers that were developed using this data.	en
dc.language.iso	en	en
dc.publisher	Association for Computational Linguistics	en
dc.subject	Automated telephony systems	en
dc.subject	Lwazi corpus	en
dc.subject	Automatic speech recognition system	en
dc.subject	ASR	en
dc.subject	Rural areas	en
dc.subject	African languages	en
dc.subject	Speech corpus	en
dc.subject	Southern Bantu languages	en
dc.subject	Language technologies	en
dc.subject	Computational linguistics	en
dc.title	Collecting and evaluating speech recognition corpora for nine Southern Bantu languages	en
dc.type	Conference Presentation	en
dc.identifier.apacitation	Badenhorst, J., Van Heerden, C., Davel, M., & Barnard, E. (2009). Collecting and evaluating speech recognition corpora for nine Southern Bantu languages. Association for Computational Linguistics. http://hdl.handle.net/10204/4128	en_ZA
dc.identifier.chicagocitation	Badenhorst, JAC, C Van Heerden, M Davel, and E Barnard. "Collecting and evaluating speech recognition corpora for nine Southern Bantu languages." (2009): http://hdl.handle.net/10204/4128	en_ZA
dc.identifier.vancouvercitation	Badenhorst J, Van Heerden C, Davel M, Barnard E. Collecting and evaluating speech recognition corpora for nine Southern Bantu languages; Association for Computational Linguistics; 2009. http://hdl.handle.net/10204/4128	en_ZA
dc.identifier.ris	TY - Conference Presentation AU - Badenhorst, JAC AU - Van Heerden, C AU - Davel, M AU - Barnard, E AB - The authors describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively small compared to major corpora in world languages, and we report on our investigation of the stability of the ASR models derived from the corpus. We also report on phoneme distance measures across languages, and describe initial phone recognisers that were developed using this data. DA - 2009-03 DB - ResearchSpace DP - CSIR KW - Automated telephony systems KW - Lwazi corpus KW - Automatic speech recognition system KW - ASR KW - Rural areas KW - African languages KW - Speech corpus KW - Southern Bantu languages KW - Language technologies KW - Computational linguistics LK - https://researchspace.csir.co.za PY - 2009 SM - 1932432256 T1 - Collecting and evaluating speech recognition corpora for nine Southern Bantu languages TI - Collecting and evaluating speech recognition corpora for nine Southern Bantu languages UR - http://hdl.handle.net/10204/4128 ER -	en_ZA

Files in this item

Name: Badenhorst_2009.pdf

Size: 594.9Kb

Format: PDF

This item appears in the following Collection(s)

Conference Publications
