Article details

Research area
Speech recognition

New York


Raymond Brueckner, Björn Schuller

Be at Odds? – Deep and Hierarchical Neural Networks for Classification and Regression of Conflict in Speech


Conflict is a fundamental phenomenon that inevitably arises in inter-human communication, and it has only recently become a subject of study in the emerging field of computational paralinguistics.
As speech is a predominant carrier of information about the valence and level of conflict, we investigate and demonstrate how deep and hierarchical neural networks, which have become the new mainstream paradigm in automatic speech recognition over the last few years, can be leveraged to automatically classify and predict levels of conflict based purely on audio recordings. For this purpose, we adopt a neural network architecture that we have previously applied successfully to another paralinguistics task. On the Conflict Sub-Challenge data set of the Interspeech 2013 Computational Paralinguistics Challenge (ComParE), we obtain the best results reported so far in the literature on both the classification and the regression task. These results demonstrate that deep neural networks are also well suited to predicting conflict levels, whether the task is framed as classification or as regression.
