The quality of corpus-based text-to-speech (TTS) systems depends strongly on the consistency of boundary placement during phonetic alignment. Expert human transcribers use visually represented acoustic cues to place boundaries consistently at phonetic transitions according to a set of conventions. The authors present some features commonly (and informally) used as an aid when performing manual segmentation and investigate the feasibility of automatically extracting and utilising these features to identify phonetic transitions. The authors then show that a number of features can be used to reliably detect various classes of phonetic transitions.
Reference:
Van Niekerk, DR and Barnard, E. 2008. Acoustic cues identifying phonetic transitions for speech segmentation. Nineteenth Annual Symposium of the Pattern Recognition Association of South Africa (PRASA 2008), Cape Town, South Africa, 27-28 November, pp 55-59
Van Niekerk, D., & Barnard, E. (2008). Acoustic cues identifying phonetic transitions for speech segmentation. PRASA 2008. http://hdl.handle.net/10204/3033