Various automated techniques can be used to generalise from phonemic lexicons through the extraction of grapheme-to-phoneme rule sets. These techniques are particularly useful when developing pronunciation models for previously unmodelled languages, a frequent requirement when developing multilingual speech processing systems. However, many of the learning algorithms (such as Dynamically Expanding Context or Default and Refine) have difficulty accommodating alternate pronunciations that occur in the training lexicon. In this paper, the authors propose an approach for incorporating phonemic variants into a typical instance-based learning algorithm, Default and Refine. They investigate the use of a combined ‘pseudo-phoneme’ associated with a set of ‘generation restriction rules’ to model those phonemes that are consistently realised as two or more variants in the training lexicon. They evaluate the effectiveness of this approach using the Oxford Advanced Learner's Dictionary, a publicly available English pronunciation lexicon. They find that phonemic variation exhibits sufficient regularity to be modelled through extracted rules, and that acceptable variants may be underrepresented in the studied lexicon. The proposed method is applicable to many approaches besides the Default and Refine algorithm, and provides a simple but effective technique for including phonemic variants in grapheme-to-phoneme rule extraction frameworks.
Reference:
Davel, M. and Barnard, E. 2006. Extracting pronunciation rules for phonemic variants. ISCA technical and research workshop, Stellenbosch, April 2006, pp 5. http://hdl.handle.net/10204/1160