Objective measures to improve the selection of training speakers in HMM-based child speech synthesis

Govender, Avashna; De Wet, Febe

DOI: 10.1109/RoboMech.2016.7813193
http://ieeexplore.ieee.org/document/7813193/
http://hdl.handle.net/10204/9179

Abstract:

Building synthetic child voices is considered a difficult task due to the challenges associated with data collection. As a result, speaker adaptation in conjunction with Hidden Markov Model (HMM)-based synthesis has become prevalent in this domain because the approach caters for limited amounts of data. An initial average voice model is trained using data from multiple speakers and adapted to resemble a specific target child speaker. Due to the scarcity of child speech data, initial models used in this approach are mostly trained with adult speech data. However, selection of appropriate training speakers from large corpora is not a trivial task because there is no means, other than conducting exhaustive subjective listening tests, to determine which training speakers will yield the best quality synthetic child voice. Therefore, there is a need to find an objective measure that can be used to easily identify a small set of training speakers that will yield the best quality output. In this paper we investigate whether a relationship exists between objective and subjective voice evaluation measures with regard to the selection of training speakers for an average voice model used in speaker-adaptive HMM child speech synthesis. Results indicate that, if training speakers that are closer to the target speaker are used to train initial models, better quality child voices are generated.
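The selection idea in the abstract, choosing training speakers that are objectively close to the target child speaker, can be illustrated with a minimal sketch. The abstract does not name the objective measure used, so everything concrete here is an assumption made for illustration only: each speaker is summarised by a fixed-length feature vector, closeness is plain Euclidean distance, and the names `euclidean_distance`, `select_training_speakers`, and the speaker IDs and values are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors.

    Assumption: Euclidean distance stands in for the objective measure;
    the abstract does not specify which measure the authors used.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_training_speakers(target_features, candidate_features, k):
    """Return the IDs of the k candidates closest to the target speaker.

    candidate_features maps a speaker ID to that speaker's feature vector.
    Ties in distance are broken by speaker ID so the result is deterministic.
    """
    ranked = sorted(
        candidate_features.items(),
        key=lambda item: (euclidean_distance(target_features, item[1]), item[0]),
    )
    return [speaker_id for speaker_id, _ in ranked[:k]]


# Hypothetical toy data: two-dimensional per-speaker feature vectors.
candidates = {
    "adult_f1": [1.0, 2.0],
    "adult_f2": [4.0, 6.0],
    "adult_m1": [9.0, 9.0],
}
target = [1.0, 1.0]

selected = select_training_speakers(target, candidates, k=2)
print(selected)
```

In this toy example the two candidates nearest to the target would be kept for training the average voice model. Whether a given distance actually predicts subjective voice quality is the question the paper investigates, so the choice of measure here should not be read as the authors' method.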

Reference:

Govender, A. and De Wet, F. 2016. Objective measures to improve the selection of training speakers in HMM-based child speech synthesis. 2016 Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference, 30 November - 2 December 2016, Stellenbosch, South Africa, p. 25-30. DOI: 10.1109/RoboMech.2016.7813193

2016 Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference, 30 November - 2 December 2016, Stellenbosch, South Africa.

Dec 2016

Synthetic child voices
Hidden Markov Model
