Preprint Article Version 1 This version is not peer-reviewed

Speech Emotion Recognition Using Multiscale Global-Local Representation Learning with Feature Pyramid Network

Version 1 : Received: 12 October 2024 / Approved: 14 October 2024 / Online: 14 October 2024 (07:10:55 CEST)

How to cite: Wang, Y.; Huang, J.; Zhao, Z.; Lan, H.; Zhang, X. Speech Emotion Recognition Using Multiscale Global-Local Representation Learning with Feature Pyramid Network. Preprints 2024, 2024101002. https://doi.org/10.20944/preprints202410.1002.v1

Abstract

Speech emotion recognition (SER) is important in facilitating natural human-computer interaction. In speech sequence modeling, a vital challenge is to learn context-aware sentence expression and the temporal dynamics of para-linguistic features to achieve unambiguous emotional semantic understanding. In previous studies, SER methods based on a single-scale cascade feature extraction module could not effectively preserve the temporal structure of speech signals in the deep layers, downgrading sequence modeling performance. In this paper, we propose a novel multi-scale feature pyramid network to mitigate the above limitations. With the aid of the bi-directional feature fusion of the pyramid network, an emotional representation with adequate temporal semantics is obtained. Experiments on the IEMOCAP corpus demonstrate the effectiveness of the proposed method, which achieves competitive results under speaker-independent validation.
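The bi-directional feature fusion described above can be sketched in minimal form. The snippet below is an illustrative assumption, not the authors' exact architecture: it treats each pyramid level as a 1-D temporal feature sequence (finest/longest first, each coarser level half the length), runs a top-down pass that upsamples coarse, global context into finer levels, then a bottom-up pass that average-pools fine, local detail back into coarser levels. The function names and the additive fusion operator are hypothetical placeholders for the learned fusion layers in the paper.

```python
def downsample(seq, factor=2):
    """Average-pool a 1-D feature sequence by `factor` (illustrative stand-in
    for a learned strided convolution)."""
    return [sum(seq[i:i + factor]) / len(seq[i:i + factor])
            for i in range(0, len(seq), factor)]

def upsample(seq, length):
    """Nearest-neighbour upsample to `length` frames (illustrative stand-in
    for a learned upsampling layer)."""
    return [seq[min(i * len(seq) // length, len(seq) - 1)]
            for i in range(length)]

def bidirectional_fuse(pyramid):
    """pyramid: list of temporal feature sequences, finest (longest) first.
    Returns fused sequences of the same lengths."""
    fused = list(pyramid)
    # Top-down pass: propagate coarse, global semantics into finer levels.
    for i in range(len(fused) - 2, -1, -1):
        up = upsample(fused[i + 1], len(fused[i]))
        fused[i] = [a + b for a, b in zip(fused[i], up)]
    # Bottom-up pass: re-inject fine, local temporal detail into coarser levels.
    for i in range(1, len(fused)):
        down = downsample(fused[i - 1])
        fused[i] = [a + b for a, b in zip(fused[i], down)]
    return fused
```

In a real model each level's features would be multi-dimensional frame embeddings and the additions would be replaced by learned fusion blocks; the sketch only shows the two-pass information flow that gives every scale access to both global and local context.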

Keywords

speech emotion recognition; multi-scale feature pyramid network; convolutional self-attention

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning

