Collecting and evaluating speech recognition corpora for nine Southern Bantu languages

Badenhorst, JAC; Van Heerden, C; Davel, M; Barnard, E

http://hdl.handle.net/10204/4128

Abstract:

The authors describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus that includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively small compared to major corpora in world languages, and the authors report on their investigation of the stability of the ASR models derived from the corpus. They also report on phoneme distance measures across languages and describe initial phone recognisers that were developed using this data.

Reference:

Badenhorst, JAC, Van Heerden, C, Davel, M et al. 2009. Collecting and evaluating speech recognition corpora for nine Southern Bantu languages. EACL Workshop on Language Technologies for African Languages, Athens, Greece, 31 March 2009, pp 1-8

Badenhorst, J., Van Heerden, C., Davel, M., & Barnard, E. (2009). Collecting and evaluating speech recognition corpora for nine Southern Bantu languages. Association for Computational Linguistics. http://hdl.handle.net/10204/4128

Badenhorst, JAC, C Van Heerden, M Davel, and E Barnard. "Collecting and evaluating speech recognition corpora for nine Southern Bantu languages." (2009): http://hdl.handle.net/10204/4128

Badenhorst J, Van Heerden C, Davel M, Barnard E. Collecting and evaluating speech recognition corpora for nine Southern Bantu languages. Association for Computational Linguistics; 2009. http://hdl.handle.net/10204/4128

EACL Workshop on Language Technologies for African Languages, Athens, Greece, 31 March 2009

This item appears in the following Collection(s)

Conference Publications
