
Research area
Natural language & AI

ICSLP 98 Proceedings (5th International Conference on Spoken Language Processing)


Paul van Mulbregt, I. Carp, L. Gillick, S. Lowe, and Jon Yamron

Text Segmentation and Topic Tracking on Broadcast News via a Hidden Markov Model Approach


Continuing progress in the automatic transcription of broadcast speech via speech recognition has raised the possibility of applying information retrieval techniques to the resulting (error-prone) text. In this paper we describe a general methodology, based on Hidden Markov Models and classical language modeling techniques, for automatically inferring story boundaries (segmentation) and for retrieving stories relating to a specific topic (tracking). We present in detail the features and performance of the Segmentation and Tracking systems submitted by Dragon Systems for the 1998 Topic Detection and Tracking evaluation.
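
To give a rough sense of how a Hidden Markov Model can infer story boundaries, the sketch below treats each topic as a hidden state, scores each sentence with a unigram language model for that topic, and uses Viterbi decoding to find the most likely topic sequence. It is a minimal illustration only, not the system described in the paper: the function name `segment`, the toy topic models, the `switch_penalty` value, and the probability `floor` for unseen words are all assumptions made for this example.

```python
import math


def segment(sentences, topic_models, switch_penalty=2.0, floor=1e-6):
    """Viterbi decoding with one hidden topic state per sentence.

    sentences:      list of tokenized sentences (lists of words)
    topic_models:   dict mapping topic name -> {word: probability} (toy unigram LMs)
    switch_penalty: assumed log-domain cost for changing topic between sentences
    floor:          assumed probability for words missing from a topic model
    """
    if not sentences:
        return [], []
    topics = list(topic_models)

    def emit(topic, sentence):
        # Log-probability of the sentence under the topic's unigram model.
        lm = topic_models[topic]
        return sum(math.log(lm.get(word, floor)) for word in sentence)

    def move(prev, cur):
        # Staying in a topic is free; switching pays the penalty.
        return 0.0 if prev == cur else -switch_penalty

    best = {t: emit(t, sentences[0]) for t in topics}
    backpointers = []
    for sentence in sentences[1:]:
        new_best, pointers = {}, {}
        for t in topics:
            prev = max(topics, key=lambda p: best[p] + move(p, t))
            new_best[t] = best[prev] + move(prev, t) + emit(t, sentence)
            pointers[t] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace the best path backwards from the highest-scoring final state.
    state = max(topics, key=lambda t: best[t])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()

    # A story boundary is any sentence index where the decoded topic changes.
    boundaries = [i for i in range(1, len(path)) if path[i] != path[i - 1]]
    return path, boundaries


# Hypothetical toy data, invented for this illustration.
topic_models = {
    "weather": {"rain": 0.4, "storm": 0.4, "the": 0.2},
    "finance": {"stocks": 0.4, "market": 0.4, "the": 0.2},
}
sentences = [
    ["the", "rain"],
    ["storm", "rain"],
    ["the", "market"],
    ["stocks", "market"],
]
path, boundaries = segment(sentences, topic_models)
print(path, boundaries)
```

In this toy setting the switch penalty discourages spurious topic changes, so a boundary is hypothesized only where the word evidence for a new topic outweighs the cost of switching.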
