This paper reports on the automatic alignment of audiobooks in Afrikaans. An existing Afrikaans pronunciation dictionary and a corpus of Afrikaans speech data are used to generate baseline acoustic models. The baseline system achieves an average duration-independent overlap rate of 0.977 on the first three chapters of an audio version of “Ruiter in die Nag”, an Afrikaans book by Mikro. The average duration-independent overlap rate increases to 0.990 when the speech data from the audiobook are used to perform maximum a posteriori (MAP) adaptation on the baseline models. The corresponding value for models trained only on the audiobook data is 0.996. An automatic measure of alignment accuracy is also introduced and compared with accuracies measured relative to a gold standard.
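The abstract does not define the duration-independent overlap rate. As a rough illustration only, the sketch below assumes a commonly used definition: for each segment, the duration shared by the reference and automatic alignments is divided by the total duration that the two segments span together. The function names and the (start, end) tuple format are hypothetical and are not taken from the paper.

```python
def overlap_rate(ref, auto):
    """Overlap rate for one segment (assumed definition, not from the paper).

    ref and auto are (start, end) times in seconds for the same segment.
    """
    ref_start, ref_end = ref
    auto_start, auto_end = auto
    # Duration shared by the two segments (zero if they do not overlap).
    common = max(0.0, min(ref_end, auto_end) - max(ref_start, auto_start))
    # Total duration covered by either segment.
    total = (ref_end - ref_start) + (auto_end - auto_start) - common
    return common / total if total > 0 else 0.0


def average_overlap_rate(ref_segments, auto_segments):
    """Mean overlap rate over paired reference and automatic segments."""
    if len(ref_segments) != len(auto_segments) or not ref_segments:
        raise ValueError("expected two non-empty lists of equal length")
    rates = [overlap_rate(r, a) for r, a in zip(ref_segments, auto_segments)]
    return sum(rates) / len(rates)
```

Because each segment's score is normalised by its own duration, a 50 ms boundary error costs a short segment more than a long one, which is what makes the averaged value comparable across segments of different lengths.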
Reference:
Van Heerden, CJ, De Wet, F and Davel, MH. 2012. Automatic alignment of audiobooks in Afrikaans. PRASA 2012, CSIR International Convention Centre, Pretoria, 29-30 November 2012. http://hdl.handle.net/10204/6436