Article details

Research area
Text to speech

Kyoto, Japan


Xu Shao, Vincent Pollet

Refined Statistical Model Tuning for Speech Synthesis


This paper describes a number of approaches to refining and tuning statistical models for speech synthesis. The first approach tunes the sizes of the decision trees for central phonemes in a context. The second approach is a refinement technique for HMMs; a variable number of states for hidden semi-Markov models is emulated. A so-called “hard state-skip” training technique is introduced into the standard forward-backward training. The results show that both the tuning and the refinement techniques lead to increased flexibility for speech synthesis modeling.
