Speech enhancement

Speech enhancement is focused on evolving the technologies used in voice-driven interfaces for an enhanced user experience and greater application flexibility for manufacturers.



Leading the industry in speech technology
Nuance’s speech enhancement group in Ulm, Germany, has been a research leader in the field of speech and audio signal processing for over 15 years. The team provides custom-tailored solutions for robust speech recognition and enhanced speech communication across many platforms and in a variety of environments. Our core research competencies include:

  • Single-microphone noise reduction approaches
  • Acoustic echo cancellation for voice user interfaces and telecommunication
  • Cutting-edge spatial filtering techniques for beamforming and interference cancellation

Using speech to enhance user experience in other industries
Our research also encompasses more diverse speech enhancement topics such as speech reconstruction, bandwidth extension, dynamic signal mixing, and adaptive equalization. These efforts have resulted in an integrated speech enhancement front-end, which is deployed in various automotive, home, and mobile applications.
Our technology supports speech interaction with virtual personal assistants, even in the most challenging acoustic environments. As a leading expert in automotive speech enhancement, we also provide solutions for high-quality hands-free telephony as well as systems that improve in-car communication between front and rear passengers in large sedans and vans.

Explore recent publications by Nuance Speech Enhancement researchers.



Selected articles

A dynamic multi-channel speech enhancement system for distributed microphones in a car environment

Supporting multiple active speakers in automotive hands-free or speech dialog applications is an interesting issue not least due to comfort reasons. Therefore, a multi-channel system for

Read more

Speaker activity detection for distributed microphone systems in cars

In this contribution a new framework for energy-based acoustic speaker activity detection for distributed microphones in automotive environments is presented. The method relies on the

Read more

Advanced speech enhancement with partial speech reconstruction

An advanced speech enhancement algorithm is proposed, which employs partial speech reconstruction of highly disturbed speech. The speech reconstruction algorithms assume the source-filter model of

Read more

Speech enhancement based on formant estimation

In this contribution, a computationally efficient method for enhancing speech quality and intelligibility by identifying and accentuating formants in speech signals is presented. The proposed

Read more

A multi-channel quality assessment setup applied to a distributed microphone speech enhancement system with spectral boosting

For instrumental quality assessment of speech enhancement systems it is common to process the signal components such as speech and noise independently using the filter

Read more

A morphological approach to single-channel wind-noise suppression

Today, a variety of technical devices deploy spoken language processing technology. In many practical use cases, not only stationary ambient noises but non-stationary interferences, such

Read more

Self-learning speaker identification for enhanced speech recognition

A novel approach for joint speaker identification and speech recognition is presented in this article. Unsupervised speaker tracking and automatic adaptation of the human–computer interface

Read more

Enhanced speaker activity detection for distributed microphones by exploitation of signal power ratio patterns

In cars with integrated distributed microphone systems usually each speaker has a dedicated microphone. An often required broadband speaker activity detection can be performed by

Read more

Entwurf und Analyse von Beamformer-Nachfilter-Systemen (Design and analysis of beamformer-postfilter systems)

PhD Thesis

Spectro-temporal features for excitation signal quantization in a speech reconstruction system

Models of non-disturbed voiced excitation signals are trained using pitch and short-term spectral features, including dynamic pitch as well as inter-frame and intra-frame phase shift

Read more
