We investigate modeling strategies for English code-switched words in a Swahili spoken term detection (STD) system. Code switching, where speakers switch languages within a conversation, occurs frequently in multilingual environments and typically degrades STD performance. The analysis is performed in the context of the IARPA Babel program, which focuses on rapid STD system development for under-resourced languages. Our results show that approaches that specifically target the modeling of code-switched words significantly improve the detection performance for these words.
Reference:
Kleynhans, N., Hartman, W., Van Niekerk, D., Van Heerden, C., Schwartz, R., Tsakalidis, S. and Davel, M. 2016. Code-switched English pronunciation modeling for Swahili spoken term detection. In: 5th Workshop on Spoken Language Technology for Under-Resourced Languages, SLTU 2016, 9-12 May 2016, Yogyakarta, Indonesia. http://hdl.handle.net/10204/8916
5th Workshop on Spoken Language Technology for Under-Resourced Languages, SLTU 2016, 9-12 May 2016, Yogyakarta, Indonesia.
Due to copyright restrictions, the attached PDF file only contains the abstract of the full text item. For access to the full text item, please consult the publisher's website.