Speech enhancement

Speech enhancement research advances the technologies behind voice-driven interfaces, improving the user experience and giving manufacturers greater application flexibility.

Leading the industry in speech technology
Nuance’s speech enhancement group in Ulm, Germany, has been a research leader in the field of speech and audio signal processing for over 15 years. The team provides custom-tailored solutions for robust speech recognition and enhanced speech communication across many platforms and in a variety of environments. Our core research competencies include:

  • Single-microphone noise reduction approaches
  • Acoustic echo cancellation for voice user interfaces and telecommunication
  • Cutting-edge spatial filtering techniques for beamforming and interference cancellation

Using speech to enhance the user experience in other industries
Our research also covers further speech enhancement topics such as speech reconstruction, bandwidth extension, dynamic signal mixing, and adaptive equalization. These efforts have resulted in an integrated speech enhancement front end, which is deployed in various automotive, home, and mobile applications.
Our technology supports speech interaction with virtual personal assistants, even in the most challenging acoustic environments. As a leading expert in automotive speech enhancement, we also provide solutions for high-quality hands-free telephony as well as systems that improve in-car communication between front and rear passengers in large sedans and vans.

Explore recent publications by Nuance Speech Enhancement researchers.



Selected articles

Improved performance measures for voice activity detection

Voice activity detection is an essential part of many speech processing algorithms. The requirements of the speech application determine the design of voice activity detection.

Effects of resampling in acoustic echo cancellation with static nonlinear loudspeaker distortion

In modern acoustic echo compensation (AEC), nonlinear models are applied to mimic the loudspeaker’s behavior. Many conventional methods disregard the Nyquist criterion when applying the

Advanced speech enhancement with partial speech reconstruction

An advanced speech enhancement algorithm is proposed, which employs partial speech reconstruction of highly disturbed speech. The speech reconstruction algorithms assume the source-filter model of

A dynamic multi-channel speech enhancement system for distributed microphones in a car environment

Supporting multiple active speakers in automotive hands-free or speech dialog applications is an interesting issue not least due to comfort reasons. Therefore, a multi-channel system for

Speech enhancement based on formant estimation

In this contribution, a computationally efficient method for enhancing speech quality and intelligibility by identifying and accentuating formants in speech signals is presented. The proposed

A multi-channel quality assessment setup applied to a distributed microphone speech enhancement system with spectral boosting

For instrumental quality assessment of speech enhancement systems it is common to process the signal components such as speech and noise independently using the filter

Influence of blocking matrix design on microphone array postfilters

In this paper the role of the blocking matrix is investigated in the context of microphone array postfilters. A generic stability criterion, which depends on

Spectro-temporal features for excitation signal quantization in a speech reconstruction system

Models of non-disturbed voiced excitation signals are trained using pitch and short-term spectral features, including dynamic pitch as well as inter-frame and intra-frame phase shift

Enhanced speaker activity detection for distributed microphones by exploitation of signal power ratio patterns

In cars with integrated distributed microphone systems usually each speaker has a dedicated microphone. An often required broadband speaker activity detection can be performed by

Modeling subjectively perceived annoyance of H.264 video as a function of perceived artifact strength

This paper is concerned with the subjective perception of video coding artifacts in H.264/AVC encoded and decoded video. Our objective is to model the perceived
