Article details

Research area
Speech enhancement

Intelligent Environments (IE), 2010 Sixth International Conference on


Tobias Herbig, Franz Gerl, Wolfgang Minker

Fast adaptation of speech and speaker characteristics for enhanced speech recognition in adverse intelligent environments


In this paper we present a technique for fast adaptation of speech- and speaker-related information. Fast learning is particularly useful for automatic personalization of speech-controlled devices. Personalizing human-computer interfaces for use in intelligent environments is an important research issue. Speech recognition is enhanced by speaker-specific profiles which are continuously adapted. Fast but robust tracking of speaker characteristics and optimal long-term adaptation are investigated to avoid an extensive enrollment of new speakers. We present an implementation suitable for speaker-specific speech recognition in adverse intelligent environments. As an example, in-car applications such as speech-controlled navigation, hands-free telephony, and infotainment systems are investigated for embedded systems. Results for a subset of the SPEECON database are presented. They validate the benefit of the presented speaker adaptation scheme for speech recognition. Speaker characteristics are captured after very few utterances, and in the long run they are accurately represented. This adaptation scheme might be used to develop an unsupervised speech-controlled system comprising speech recognition and speaker identification. A unified modeling of speech and speaker characteristics is proposed.
