[1] Wu Haiyang, Lü Yong, Wu Zhenyang. Speaker recognition based on VQMAP model and AdaBoost learning algorithm[J]. Journal of Southeast University (Natural Science Edition), 2010, 40(3): 476-480. [doi:10.3969/j.issn.1001-0505.2010.03.008]

Speaker recognition based on VQMAP model and AdaBoost learning algorithm

Journal of Southeast University (Natural Science Edition) [ISSN: 1001-0505 / CN: 32-1178/N]

Volume:
40
Issue:
No. 3, 2010
Pages:
476-480
Section:
Information and Communication Engineering
Publication date:
2010-05-20

Article Information

Title:
Speaker recognition based on VQMAP model and AdaBoost learning algorithm
Author(s):
Wu Haiyang, Lü Yong, Wu Zhenyang
School of Information Science and Engineering, Southeast University, Nanjing 210096, China
Keywords:
maximum a posteriori vector quantization model; adaptive boosting; early stopping; speaker recognition
Classification number:
TN912.34
DOI:
10.3969/j.issn.1001-0505.2010.03.008
Abstract:
To address the slower recognition speed and the tendency to overfit that ensemble learning introduces into traditional speaker recognition systems, a speaker recognition system based on the maximum a posteriori vector quantization (VQMAP) model and the adaptive boosting (AdaBoost) learning algorithm is presented. First, the influence of base classifier performance on the generalization error of the ensemble classifier is analyzed for speaker recognition systems. Then, a VQMAP model of suitable precision is constructed according to the number of speaker classes. Finally, this model is boosted into a strong classifier by an AdaBoost learning algorithm that includes an early stopping strategy. Experimental results show that the proposed algorithm recognizes quickly, at 9 times the speed of the maximum a posteriori adapted Gaussian mixture model (GMMMAP). It also effectively controls the overfitting of the AdaBoost learning algorithm in speaker recognition. Its performance is better than that of the VQMAP model alone, and when training data are limited or the number of classes is predictable, it can approach or even exceed that of the GMMMAP model.
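
The last step in the abstract, boosting a base classifier while stopping early to limit overfitting, might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: a weighted nearest-centroid classifier stands in for the VQMAP model, the multi-class SAMME weighting rule stands in for whichever AdaBoost variant the paper uses, and early stopping is done by watching the error on a held-out validation set. The function names, the `patience` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_centroids(X, y, w, n_classes):
    """Weighted mean per class: a crude stand-in for a per-speaker codebook."""
    centroids = np.zeros((n_classes, X.shape[1]))
    for k in range(n_classes):
        mask = y == k  # assumes every class has at least one sample
        centroids[k] = np.average(X[mask], axis=0, weights=w[mask])
    return centroids

def predict_centroids(centroids, X):
    """Label each row of X with the class of its nearest centroid."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def ensemble_predict(models, alphas, X, n_classes):
    """Weighted vote over all base classifiers kept so far."""
    votes = np.zeros((X.shape[0], n_classes))
    rows = np.arange(X.shape[0])
    for c, a in zip(models, alphas):
        votes[rows, predict_centroids(c, X)] += a
    return votes.argmax(axis=1)

def boost_with_early_stopping(X, y, X_val, y_val, n_classes,
                              max_rounds=50, patience=5):
    """SAMME-style boosting; keep the ensemble size with lowest validation error."""
    n = X.shape[0]
    w = np.full(n, 1.0 / n)  # uniform sample weights to start
    models, alphas = [], []
    best_err, best_size = np.inf, 0
    for _ in range(max_rounds):
        c = fit_centroids(X, y, w, n_classes)
        miss = predict_centroids(c, X) != y
        err = float(np.dot(w, miss))  # weighted training error
        if err >= 1.0 - 1.0 / n_classes:  # no better than chance: stop
            break
        err = max(err, 1e-10)  # avoid division by zero
        alpha = np.log((1.0 - err) / err) + np.log(n_classes - 1.0)
        models.append(c)
        alphas.append(alpha)
        w = w * np.exp(alpha * miss)  # up-weight misclassified samples
        w /= w.sum()
        val_pred = ensemble_predict(models, alphas, X_val, n_classes)
        val_err = float(np.mean(val_pred != y_val))
        if val_err < best_err:
            best_err, best_size = val_err, len(models)
        elif len(models) - best_size >= patience:
            break  # validation error has stopped improving
    return models[:best_size], alphas[:best_size], best_err

# Synthetic three-class data standing in for speaker feature vectors.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])

def make_split(n_per_class):
    X = np.vstack([rng.normal(c, 1.0, size=(n_per_class, 2)) for c in centers])
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y

X_train, y_train = make_split(100)
X_val, y_val = make_split(50)
models, alphas, best_err = boost_with_early_stopping(
    X_train, y_train, X_val, y_val, n_classes=3)
print(len(models), best_err)
```

Truncating the ensemble at the round with the lowest validation error also bounds the number of base classifiers evaluated at recognition time, which is one plausible reason an early stopping strategy helps recognition speed as well as overfitting.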


Memo

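Biographies: Wu Haiyang (b. 1983), male, Ph.D. candidate; Wu Zhenyang (corresponding author), male, professor, doctoral supervisor, zhenyang@seu.edu.cn.
Foundation item: National Natural Science Foundation of China (60971098).
Citation format: Wu Haiyang, Lü Yong, Wu Zhenyang. Speaker recognition based on VQMAP model and AdaBoost learning algorithm[J]. Journal of Southeast University (Natural Science Edition), 2010, 40(3): 476-480. [doi:10.3969/j.issn.1001-0505.2010.03.008]
Last Update: 2010-05-20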