
Turning speech into text is at the heart of an amazing variety of products and services that enrich people’s lives. Most of the world’s successful speech solutions today have Nuance speech technology inside.
Our goal: Near-perfect speech recognition for everybody in the world.
Nuance has been a pioneer in speech and language technologies for more than 30 years. Our data centers host billions of speech transactions every month in over 40 languages from hundreds of applications. We continuously expand our research grid to explore this avalanche of data. Our researchers, experts in the fields of speech recognition, statistical modeling, deep machine learning, and linguistics, use these computational and data resources to continuously advance the boundaries of what can be done with speech technology.
Current applications for consumers and companies
We optimize our technology for four main application scenarios.
– Our personal assistant solutions enable people to communicate with their devices on human terms. Our systems understand people’s intentions and provide appropriate responses. Drivers operate their GPS units, make phone calls, and listen to messages using our robust speech solutions. Speech makes these interactions easier and safer. In that sense, speech technology saves lives. We continuously improve our accuracy, latency, and robustness, and extend our models to new domains, accents, languages, and devices.
– Our document creation solution powers Dragon NaturallySpeaking, the Nuance flagship speech recognition product. We develop a highly personalized speech recognition solution for each user without explicit training. Our solution not only accurately transcribes the words people dictate, but also formats the resulting written documents.
– Medical professionals use our dictation solutions to generate millions of reports every day. We offer both “front-end” solutions, where doctors see and correct reports as they dictate, and “back-end” solutions, where users speak into a microphone and are later presented with corrected, formatted reports for signature.
– Most spoken communication takes place between people. Our transcription solution accurately converts the spoken words in conversational speech, particularly voicemails, into text. We focus our research on particular challenges in conversational speech, e.g. sloppy formulation and articulation, difficult recording conditions, multiple speakers, and unpredictable content.
Our solutions are implemented in server-based systems, embedded systems, and hybrid systems that use both server and embedded components. We work closely with our hosted operations and frequently roll out new algorithms and models.
Where we’re headed next
Here are some representative examples of the problems we research:
Explore recent publications by Nuance Speech Recognition researchers.
Spoken language systems, ranging from interactive voice response (IVR) to mixed-initiative conversational systems, make use of a wide range of recognition grammars and vocabularies. The …
In this paper, we adopt an n-best rescoring scheme using pitch-accent patterns to improve automatic speech recognition (ASR) performance. The pitch-accent model is decoupled from …
Most previous approaches to automatic prosodic event detection are based on supervised learning, relying on the availability of a corpus that is annotated with the …
In this paper, we propose a new acoustic confidence measure of ASR hypothesis and compare it to approaches proposed in the literature. This approach takes …
In this communication, we present a method for noise-robust multi-microphone automatic speech recognition (ASR). It is assumed that the speech source to be recognized is …
In this paper we compare two different methods for automatically phonetically labeling a continuous speech database, as usually required for designing a speech recognition or …
This paper intends to summarize some of the robust feature extraction and acoustic modeling technologies used at Multitel, together with their assessment on some of …
Lots of industrial tasks need contacts between operators and a central Information Management System. Permanent contact creates a more effective and efficient workforce with a …
This paper intends to summarize recent developments and experimental results related to Automatic Speech Recognition (ASR) using signals captured with a throat-microphone. Due to the …