Bootstrapping techniques can accelerate the development of language technology for new languages. The authors define a framework for the analysis of a general bootstrapping process whereby a model is improved through a controlled series of increments, at each stage utilising the previous model to generate the next. This framework is applied to the task of creating pronunciation models for new languages, iteratively combining machine learning and human knowledge in a way that minimises the human intervention required during this process. The authors analyse the effectiveness of this approach when developing a medium-sized pronunciation lexicon in Afrikaans, one of South Africa's official languages, and provide initial results obtained for similar lexicons developed in isiZulu and Sepedi.
Reference:
Davel, M. and Barnard, E. 2006. Bootstrapping pronunciation models: a South African case study. CSIR Research and Innovation Conference: 1st CSIR Biennial Conference, CSIR International Convention Centre, Pretoria, 27-28 February 2006, pp 19. http://hdl.handle.net/10204/2727