Proceedings of the 7th Joint Symposium on Neural Computation at USC, Los Angeles, CA, USA
A number of speech samples recorded at different amplifier gains are subjected to a psychoacoustic loudness model developed for this project. To this end, the recordings are normalized with respect to their loudest frequency components in a short interval at the beginning of each sample. The data are then transformed to the frequency domain, and critical-bandwidth filters are placed around the spectral peaks in every time frame. Using data adapted from the ISO R532B/Zwicker model, the loudness in each band is calculated and integrated over the whole spectrum to obtain a loudness value for that time frame. Compared with other measures of speech intensity, such as power, these loudness values prove to be a better representation of perceived speech volume.
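The processing chain described above can be illustrated with a minimal sketch. It is not the model used in this work: the function names, frame length, hop size, reference interval, and number of peaks are hypothetical choices made for illustration. The critical bandwidth is taken from the analytic approximation of Zwicker and Terhardt, and a Stevens-type power law with exponent 0.3 stands in for the ISO R532B/Zwicker data, which are not reproduced here.

```python
import numpy as np


def critical_bandwidth_hz(f_hz):
    """Critical bandwidth in Hz (Zwicker & Terhardt analytic approximation)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69


def frame_loudness(signal, sample_rate, frame_len=512, hop=256,
                   ref_seconds=0.1, n_peaks=5, exponent=0.3):
    """Return one relative loudness value per time frame (illustrative only)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")

    # Short-time power spectrum with a Hann window.
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)

    # Gain normalization: divide by the strongest spectral component found
    # in a short interval at the start of the recording (assumed: 0.1 s).
    n_ref = min(n_frames, max(1, int(ref_seconds * sample_rate) // hop))
    ref = power[:n_ref].max()
    if ref <= 0.0:
        raise ValueError("reference interval is silent")
    power = power / ref

    loudness = np.zeros(n_frames)
    for t in range(n_frames):
        spec = power[t]
        # Local maxima of the spectrum, strongest n_peaks only.
        is_peak = (spec[1:-1] > spec[:-2]) & (spec[1:-1] >= spec[2:])
        peak_idx = np.flatnonzero(is_peak) + 1
        peak_idx = peak_idx[np.argsort(spec[peak_idx])[::-1][:n_peaks]]
        for k in peak_idx:
            # One critical band centred on the peak frequency.
            half_bw = critical_bandwidth_hz(freqs[k]) / 2.0
            band = np.abs(freqs - freqs[k]) <= half_bw
            # Stand-in for the tabulated band loudness: power-law compression.
            loudness[t] += spec[band].sum() ** exponent
    return loudness
```

Because every frame is scaled by the same reference taken from the start of the recording, multiplying the input by a constant gain leaves the output unchanged, while level changes within the recording are preserved. In this simplified form, bands around neighbouring peaks may overlap, so some spectral energy can be counted more than once.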