dc.contributor.author |
Mak, Franco |
|
dc.contributor.author |
Govender, Avashna |
|
dc.contributor.author |
Badenhorst, Jaco |
|
dc.date.accessioned |
2024-06-11T06:49:36Z |
|
dc.date.available |
2024-06-11T06:49:36Z |
|
dc.date.issued |
2024-02 |
|
dc.identifier.citation |
Mak, F., Govender, A. & Badenhorst, J. 2024. Exploring ASR fine-tuning on limited domain specific data for low resource languages. <i>Journal of the Digital Humanities Association of Southern Africa, 5(1).</i> http://hdl.handle.net/10204/13683 |
en_ZA |
dc.identifier.uri |
http://hdl.handle.net/10204/13683 |
|
dc.description.abstract |
The majority of South Africa’s eleven languages are low-resourced, posing a major challenge to Automatic Speech Recognition (ASR) development. Modern ASR systems require an extensive amount of data that is extremely difficult to find for low-resourced languages. In addition, available speech and text corpora for these languages predominantly revolve around government, political and biblical content. Consequently, this hinders the ability of ASR systems developed for these languages to perform well, especially when evaluated on data outside of these domains. To alleviate this problem, the Icefall Kaldi II toolkit introduced new transformer model scripts, facilitating the adaptation of pre-trained models using limited adaptation data. In this paper, we explored the technique of taking ASR models pre-trained in a domain where more data is available (government data) and adapting them to an entirely different domain with limited data (broadcast news data). The objective was to assess whether such techniques can surpass the accuracy of prior ASR models developed for these languages. Our results showed that the Conformer connectionist temporal classification (CTC) model obtained word error rates that were lower by a large margin than those of previous TDNN-F models evaluated on the same datasets. This research signifies a step forward in mitigating data scarcity challenges and enhancing ASR performance for low-resourced languages in South Africa. |
en_US |
dc.format |
Fulltext |
en_US |
dc.language.iso |
en |
en_US |
dc.relation.uri |
https://upjournals.up.ac.za/index.php/dhasa/article/view/5024/4137 |
en_US |
dc.relation.uri |
https://upjournals.up.ac.za/index.php/dhasa |
en_US |
dc.source |
Journal of the Digital Humanities Association of Southern Africa, 5(1) |
en_US |
dc.subject |
Automatic speech recognition |
en_US |
dc.subject |
Fine-tuning |
en_US |
dc.subject |
Low-resource languages |
en_US |
dc.subject |
Data harvesting |
en_US |
dc.subject |
Broadcast news data |
en_US |
dc.title |
Exploring ASR fine-tuning on limited domain specific data for low resource languages |
en_US |
dc.type |
Article |
en_US |
dc.description.pages |
8 |
en_US |
dc.description.note |
Copyright (c) 2024 Franco Mak, Avashna Govender, Jaco Badenhorst. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. |
en_US |
dc.description.cluster |
Next Generation Enterprises & Institutions |
en_US |
dc.identifier.apacitation |
Mak, F., Govender, A., & Badenhorst, J. (2024). Exploring ASR fine-tuning on limited domain specific data for low resource languages. <i>Journal of the Digital Humanities Association of Southern Africa, 5(1)</i>. http://hdl.handle.net/10204/13683
en_ZA |
dc.identifier.chicagocitation |
Mak, Franco, Avashna Govender, and Jaco Badenhorst. "Exploring ASR fine-tuning on limited domain specific data for low resource languages." <i>Journal of the Digital Humanities Association of Southern Africa, 5(1)</i> (2024). http://hdl.handle.net/10204/13683
en_ZA |
dc.identifier.vancouvercitation |
Mak F, Govender A, Badenhorst J. Exploring ASR fine-tuning on limited domain specific data for low resource languages. Journal of the Digital Humanities Association of Southern Africa, 5(1). 2024; http://hdl.handle.net/10204/13683. |
en_ZA |
dc.identifier.ris |
TY - Article
AU - Mak, Franco
AU - Govender, Avashna
AU - Badenhorst, Jaco
AB - The majority of South Africa’s eleven languages are low-resourced, posing a major challenge to Automatic Speech Recognition (ASR) development. Modern ASR systems require an extensive amount of data that is extremely difficult to find for low-resourced languages. In addition, available speech and text corpora for these languages predominantly revolve around government, political and biblical content. Consequently, this hinders the ability of ASR systems developed for these languages to perform well, especially when evaluated on data outside of these domains. To alleviate this problem, the Icefall Kaldi II toolkit introduced new transformer model scripts, facilitating the adaptation of pre-trained models using limited adaptation data. In this paper, we explored the technique of taking ASR models pre-trained in a domain where more data is available (government data) and adapting them to an entirely different domain with limited data (broadcast news data). The objective was to assess whether such techniques can surpass the accuracy of prior ASR models developed for these languages. Our results showed that the Conformer connectionist temporal classification (CTC) model obtained word error rates that were lower by a large margin than those of previous TDNN-F models evaluated on the same datasets. This research signifies a step forward in mitigating data scarcity challenges and enhancing ASR performance for low-resourced languages in South Africa.
DA - 2024-02
DB - ResearchSpace
DP - CSIR
J1 - Journal of the Digital Humanities Association of Southern Africa, 5(1)
KW - Automatic speech recognition
KW - Fine-tuning
KW - Low-resource languages
KW - Data harvesting
KW - Broadcast news data
LK - https://researchspace.csir.co.za
PY - 2024
T1 - Exploring ASR fine-tuning on limited domain specific data for low resource languages
TI - Exploring ASR fine-tuning on limited domain specific data for low resource languages
UR - http://hdl.handle.net/10204/13683
ER -
|
en_ZA |
dc.identifier.worklist |
27440 |
en_US |