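1. School of Electronics and Information Engineering, Faculty of Engineering, South China Normal University
2. School of Physics, South China Normal University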
Print publication: 2024
[1] HUANG Peiyao, CHENG Huihui, TANG Xiaoyu. Speech emotion recognition model with a complementary feature learning framework and an attention feature fusion module[J]. Journal of Xinjiang University (Natural Science Edition in Chinese and English), 2024, 41(01): 52-58. DOI: 10.13568/j.cnki.651094.651316.2023.07.05.0002.
Deep learning feature extraction methods often fail to extract the emotional features in speech comprehensively or to fuse them effectively. To address this, this paper proposes a speech emotion recognition model that integrates a complementary feature learning framework with an attention feature fusion module. The complementary feature learning framework consists of two independent representation extraction branches and one interactive complementary representation extraction branch, so that it covers both the independent and the complementary representations of emotional features. To further improve model performance, an attention feature fusion module is introduced; it assigns each representation a weight according to its contribution to emotion classification, so that the model focuses on the features most useful for emotion recognition. Simulation experiments on two public emotion databases (Emo-DB and IEMOCAP) validate the robustness and effectiveness of the proposed model.
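The idea of weighting representations by their contribution can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the attention scores are computed, so the dot-product scoring, the `score_vector` parameter, and the toy feature values below are all assumptions made for illustration. The sketch takes three equal-length representation vectors (standing in for the two independent branches and the interactive complementary branch), scores each one, converts the scores to softmax weights, and returns the weighted sum.

```python
import math


def softmax(scores):
    """Convert raw scores to weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(representations, score_vector):
    """Fuse equal-length vectors with softmax attention weights.

    Assumption: each representation is scored by a dot product with
    `score_vector` (hypothetical; in a trained model it would be learned).
    """
    scores = [
        sum(r * w for r, w in zip(rep, score_vector))
        for rep in representations
    ]
    weights = softmax(scores)
    dim = len(representations[0])
    fused = [
        sum(w * rep[d] for w, rep in zip(weights, representations))
        for d in range(dim)
    ]
    return fused, weights


# Toy values standing in for the outputs of the three branches.
branch_a = [0.2, 0.1, 0.4]      # first independent branch (hypothetical)
branch_b = [0.5, 0.3, 0.1]      # second independent branch (hypothetical)
interactive = [0.3, 0.6, 0.2]   # interactive complementary branch (hypothetical)
score_vector = [1.0, -0.5, 0.8]  # hypothetical scoring parameters

fused, weights = attention_fuse([branch_a, branch_b, interactive], score_vector)
```

With these toy values the first branch receives the largest weight because its dot-product score is highest; in a trained model the weights would depend on learned parameters and on the input utterance.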
