dc.contributor.author |
Davel, M
|
|
dc.contributor.author |
Barnard, E
|
|
dc.date.accessioned |
2007-07-02T12:56:16Z |
|
dc.date.available |
2007-07-02T12:56:16Z |
|
dc.date.issued |
2006-07 |
|
dc.identifier.citation |
Davel, M and Barnard, E. 2006. Bootstrapping pronunciation models. South African Journal of Science, Vol. 102(7/8), pp 322-328 |
en |
dc.identifier.issn |
0038-2353 |
|
dc.identifier.uri |
http://hdl.handle.net/10204/841
|
|
dc.description |
Copyright: 2006 Acad Science South Africa A S S AF |
en |
dc.description.abstract |
Bootstrapping techniques can accelerate the development of language technology for resource-scarce languages. We define a framework for the analysis of a general bootstrapping process whereby a model is improved through a controlled series of increments, at each stage using the previous model to generate the next. We apply this framework to the task of creating pronunciation models for resource-scarce languages, iteratively combining machine learning and human knowledge in a way that minimizes the human intervention required during this process. We analyse the effectiveness of such an approach when developing a medium-sized (5000–10 000-word) pronunciation lexicon. We develop such an electronic pronunciation lexicon in Afrikaans, one of South Africa’s official languages, and provide initial results obtained for similar lexicons developed in Zulu and Sepedi, two other South African languages. We derive a mathematical model that can be used to predict the amount of time required for the development of a pronunciation lexicon of a given size, demonstrate the various tools that can accelerate the bootstrapping process, and evaluate the efficiency of these tools in practice. |
en |
dc.language.iso |
en |
en |
dc.publisher |
Acad Science South Africa A S S AF |
en |
dc.subject |
Bootstrapping |
en |
dc.subject |
Pronunciation models |
en |
dc.subject |
Bootstrapping techniques |
en |
dc.subject |
Electronic pronunciation lexicon |
en |
dc.subject |
Human language technologies |
en |
dc.subject |
HLT |
en |
dc.title |
Bootstrapping pronunciation models |
en |
dc.type |
Article |
en |
dc.identifier.apacitation |
Davel, M., & Barnard, E. (2006). Bootstrapping pronunciation models. http://hdl.handle.net/10204/841 |
en_ZA |
dc.identifier.chicagocitation |
Davel, M, and E Barnard. "Bootstrapping pronunciation models." (2006). http://hdl.handle.net/10204/841 |
en_ZA |
dc.identifier.vancouvercitation |
Davel M, Barnard E. Bootstrapping pronunciation models. 2006; http://hdl.handle.net/10204/841. |
en_ZA |
dc.identifier.ris |
TY - Article
AU - Davel, M
AU - Barnard, E
AB - Bootstrapping techniques can accelerate the development of language technology for resource-scarce languages. We define a framework for the analysis of a general bootstrapping process whereby a model is improved through a controlled series of increments, at each stage using the previous model to generate the next. We apply this framework to the task of creating pronunciation models for resource-scarce languages, iteratively combining machine learning and human knowledge in a way that minimizes the human intervention required during this process. We analyse the effectiveness of such an approach when developing a medium-sized (5000–10 000-word) pronunciation lexicon. We develop such an electronic pronunciation lexicon in Afrikaans, one of South Africa’s official languages, and provide initial results obtained for similar lexicons developed in Zulu and Sepedi, two other South African languages. We derive a mathematical model that can be used to predict the amount of time required for the development of a pronunciation lexicon of a given size, demonstrate the various tools that can accelerate the bootstrapping process, and evaluate the efficiency of these tools in practice.
DA - 2006-07
DB - ResearchSpace
DP - CSIR
KW - Bootstrapping
KW - Pronunciation models
KW - Bootstrapping techniques
KW - Electronic pronunciation lexicon
KW - Human language technologies
KW - HLT
LK - https://researchspace.csir.co.za
PY - 2006
SM - 0038-2353
T1 - Bootstrapping pronunciation models
TI - Bootstrapping pronunciation models
UR - http://hdl.handle.net/10204/841
ER -
|
en_ZA |