[1] Zhang Xinran, Zha Cheng, Xu Xinzhou, et al. Speech emotion recognition based on LDA+kernel-KNNFLC[J]. Journal of Southeast University (Natural Science Edition), 2015, 45(1): 5-11. [doi:10.3969/j.issn.1001-0505.2015.01.002] (in Chinese)

Speech emotion recognition based on LDA+kernel-KNNFLC
Journal of Southeast University (Natural Science Edition) [ISSN: 1001-0505 / CN: 32-1178/N]

Volume: 45
Issue: No. 1, 2015
Pages: 5-11
Section: Computer Science and Engineering
Publication date: 2015-01-20

Article Info

Title:
Speech emotion recognition based on LDA+kernel-KNNFLC
Author(s):
Zhang Xinran1,2, Zha Cheng1, Xu Xinzhou1, Song Peng3, Zhao Li1,2
1 Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing 210096, China
2 School of Information Science and Engineering, Southeast University, Nanjing 210096, China
3 Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
Keywords:
speech emotion recognition; K-nearest neighbor; kernel learning; feature line centroid; linear discriminant analysis
CLC number:
TP391.42
DOI:
10.3969/j.issn.1001-0505.2015.01.002
Abstract:
Combining K-nearest neighbor (KNN), kernel learning, the feature line centroid (FLC) method, and linear discriminant analysis (LDA), the LDA+kernel-KNNFLC method is proposed for emotion recognition, tailored to the characteristics of speech emotion features. First, to address the heavy computation caused by the prior sample features, the kernel-learning KNN method is improved by learning sample distances with the centroid criterion. Second, LDA is applied to optimize the emotional feature vectors, which avoids dimensional redundancy while better ensuring stable recognition of emotional information. Finally, by relearning the feature space, kernel-KNNFLC combined with LDA improves the between-class discrimination of the emotional feature vectors, making it well suited to speech emotion recognition (SER). Simulation experiments are run on a speech emotion database with 120-dimensional global statistical features, with multiple comparisons across dimension reduction schemes, emotion classifiers, and dimension parameters. The results show that, under the same conditions, LDA+kernel-KNNFLC yields the most significant performance improvement.
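
The abstract names the ingredients (kernel learning, K nearest neighbors, a centroid criterion) but does not give the decision rule, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes an RBF kernel, assumes the feature vectors have already been reduced upstream (for example by LDA, which is omitted here), and classifies a query by its kernel-space distance to the centroid of its K nearest training samples in each class. All function names, the `gamma` value, and the toy data are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; the kernel type and gamma are illustrative assumptions."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)


def kernel_sq_dist(x, y, kernel):
    """Squared distance between phi(x) and phi(y) in the kernel feature space."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def centroid_sq_dist(x, neighbors, kernel):
    """Squared feature-space distance from phi(x) to the mean of phi(neighbors).

    Expanding ||phi(x) - (1/K) * sum_i phi(n_i)||^2 needs only kernel values.
    """
    k = len(neighbors)
    cross = sum(kernel(x, n) for n in neighbors) / k
    within = sum(kernel(a, b) for a in neighbors for b in neighbors) / (k * k)
    return kernel(x, x) - 2.0 * cross + within


def classify(x, train, k=2, kernel=rbf_kernel):
    """Return the label whose local kernel-space centroid is closest to x.

    train maps each label to a list of feature vectors, assumed to be
    already dimension-reduced (the LDA step is not implemented here).
    """
    best_label, best_dist = None, float("inf")
    for label, samples in train.items():
        # K nearest samples of this class under the kernel-induced distance.
        nearest = sorted(samples, key=lambda s: kernel_sq_dist(x, s, kernel))[:k]
        d = centroid_sq_dist(x, nearest, kernel)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Toy two-class data, invented purely for illustration.
train = {
    "angry": [[2.0, 2.1], [1.8, 2.4], [2.3, 1.9]],
    "calm": [[-1.0, -0.8], [-1.2, -1.1], [-0.7, -1.3]],
}
print(classify([1.9, 2.0], train))
print(classify([-0.9, -1.0], train))
```

Using a per-class centroid of only K neighbors means each query is compared against one summary point per class rather than every training sample, which is one way to read the abstract's concern about computation; the paper itself should be consulted for the actual FLC construction.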

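Memo:
Received: 2014-09-17.
Biographies: Zhang Xinran (born 1987), male, PhD candidate; Zhao Li (corresponding author), male, PhD, professor, doctoral supervisor, zhaoli@seu.edu.cn.
Foundation items: The National Natural Science Foundation of China (No. 61273266, 61231002, 61375028); the Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (No. 20110092130004).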
Last Update: 2015-01-20