Research area
Speech enhancement

INTERSPEECH 2010 11th Annual Conference of the International Speech Communication Association


Tobias Herbig, Franz Gerl, Wolfgang Minker

Speaker tracking in an unsupervised speech controlled system


In this paper we present a technique to increase the robustness of a self-learning speech-controlled system comprising speech recognition, speaker identification and speaker adaptation. Our goal is the automatic personalization of a speech-controlled device for groups of 5-10 recurring speakers. Speakers should be identified and tracked across speaker turns by their voice patterns alone. Efficient information retrieval and the statistical representation of speaker characteristics have to be combined with reliable and flexible speaker identification. Even with limited adaptation data, e.g. 2-3 command-and-control utterances, speakers have to be tracked reliably to allow continuous adaptation of complex statistical models. We present a novel approach to speaker identification on different time scales based on a unified speech and speaker model. Experiments were carried out on a subset of the SPEECON database.

