Article details

Research area
Text to speech

In Proceedings of Interspeech 2011, Florence, Italy, August 2011


Alexander Sorin, Slava Shechtman, Vincent Pollet

Uniform speech parameterization for multi-form segment synthesis


In multi-form segment synthesis, speech is constructed by sequencing speech segments of different natures: model segments, i.e. mathematical abstractions of speech, and template segments, i.e. speech waveform fragments. These multi-form segments can have shared, layered, or alternate speech parameterization schemes. This paper introduces an advanced uniform speech parameterization scheme for the statistical model segments and waveform segments employed in our multi-form segment synthesis system. A Mel-Regularized Cepstrum derived from amplitude and phase spectra forms its basic framework. Furthermore, a new adaptive enhancement technique for model segments is presented that reduces the perceived gap in quality and similarity between model and template segments.

