DAGA 2017, Kiel
Speech enhancement algorithms are employed in many applications, such as hands-free telephones and speech recognizers, to recover a speech signal recorded in a noisy environment.
In automotive environments, the noise particularly affects the low frequencies that are relevant for voiced speech. Detection of voiced speech sections and estimation of the pitch frequency help to reconstruct the harmonic structure of voiced speech and to enhance the speech signal.
Many algorithms have been introduced to detect voiced speech and to estimate the pitch. Most of them rely on a high spectral resolution, which is achieved by employing long analysis windows. However, some applications, such as in-car communication (ICC) systems, must use short windows in order to reduce computational cost and to ensure low system latency. With short windows, the frequency resolution is too coarse to separate adjacent pitch harmonics, so resolving the pitch is difficult. Spectral refinement techniques have been introduced to increase the spectral resolution by combining multiple consecutive low-resolution spectra. Using these techniques, standard pitch estimation algorithms can be applied even though the resolution of the original spectrum is too low. In this paper, we analyze the performance of pitch estimation using spectral refinement techniques and introduce an alternative approach that explicitly takes into account the short windows of ICC applications.
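The effect of the window length on the spectral resolution can be illustrated with a short calculation. The following Python sketch is only an illustration and not part of the method analyzed in this paper: the sample rate (16 kHz), the two window lengths (256 and 2048 samples), the pitch (100 Hz), and the two-bin separation threshold are all assumed values, and the function names are hypothetical. It uses the standard relation that adjacent DFT bins are spaced by the sample rate divided by the window length.

```python
def bin_spacing_hz(sample_rate_hz: float, window_length: int) -> float:
    """Spacing between adjacent DFT bins for a window of the given length."""
    return sample_rate_hz / window_length


def harmonics_resolved(pitch_hz: float, sample_rate_hz: float,
                       window_length: int,
                       min_bins_between: float = 2.0) -> bool:
    """Rough check whether adjacent pitch harmonics fall into separate bins.

    Adjacent harmonics are pitch_hz apart. The threshold min_bins_between
    is an assumed rule of thumb, not a value taken from the paper.
    """
    bins_between = pitch_hz / bin_spacing_hz(sample_rate_hz, window_length)
    return bins_between >= min_bins_between


# Assumed example values, chosen only for illustration.
SAMPLE_RATE_HZ = 16000.0
PITCH_HZ = 100.0

for window_length in (256, 2048):
    spacing = bin_spacing_hz(SAMPLE_RATE_HZ, window_length)
    resolved = harmonics_resolved(PITCH_HZ, SAMPLE_RATE_HZ, window_length)
    print(f"N = {window_length:4d}: bin spacing = {spacing:.4f} Hz, "
          f"harmonics resolved: {resolved}")
```

Under these assumptions, the short window yields a bin spacing of 62.5 Hz, so the harmonics of a 100 Hz pitch lie less than two bins apart, whereas the long window yields a spacing of about 7.8 Hz and separates them clearly.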