[1] Sun Wei,Wu Zhenyang.Robust speech recognition algorithm based on Fletcher-Allen principle[J].Journal of Southeast University (Natural Science Edition),2005,35(4):506-509.[doi:10.3969/j.issn.1001-0505.2005.04.002]

Robust speech recognition algorithm based on Fletcher-Allen principle

Journal of Southeast University (Natural Science Edition) [ISSN:1001-0505/CN:32-1178/N]

Volume:
35
Issue:
No. 4, 2005
Pages:
506-509
Section:
Information and Communication Engineering
Publication date:
2005-07-20

Article Info

Title:
Robust speech recognition algorithm based on Fletcher-Allen principle
Author(s):
Sun Wei, Wu Zhenyang
Department of Radio Engineering, Southeast University, Nanjing 210096, China
Keywords:
speech recognition; hidden Markov model; maximum likelihood
CLC number:
TN912.34
DOI:
10.3969/j.issn.1001-0505.2005.04.002
Abstract:
To improve the performance of speech recognition systems in noisy environments, subband robust speech recognition algorithms based on the Fletcher-Allen principle (independent perception across frequency bands) are studied. These algorithms exploit the differences in how the human ear perceives signals at different frequencies, together with the spectral differences between noise and the recognition targets, and compensate for the effects of noise using linear combination, discriminative combination, a multi-layer perceptron, and subband maximum likelihood estimation, respectively. Experiments show that nonlinear strategies for subband analysis outperform linear ones. Although the subband model based on the Fletcher-Allen principle loses the correlation between bands because of its independence assumption, it can capture the spectral differences between noise and the recognition targets, and it therefore achieves higher robustness than the full-band model in noisy environments.
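The linear combination strategy mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy log-likelihood values, and the reliability weights are all assumptions made for illustration. Under the independence assumption, a class score is a (weighted) sum of per-band log-likelihoods, so a band dominated by noise can be down-weighted.

```python
def combine_subbands(band_loglik, weights):
    """Weighted sum of per-band log-likelihoods for each candidate class.

    band_loglik[b][c] is the log-likelihood of class c in subband b.
    weights[b] is a hypothetical reliability weight for subband b.
    """
    n_classes = len(band_loglik[0])
    scores = [0.0] * n_classes
    for band, w in zip(band_loglik, weights):
        for c, ll in enumerate(band):
            scores[c] += w * ll
    return scores


def recognize(band_loglik, weights):
    """Return the index of the class with the highest combined score."""
    scores = combine_subbands(band_loglik, weights)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy values (invented): 3 subbands, 2 classes; band 2 is assumed noisy.
band_loglik = [
    [-1.0, -3.0],  # band 0 favours class 0
    [-1.5, -2.5],  # band 1 favours class 0
    [-9.0, -2.0],  # band 2 (noisy) strongly favours class 1
]
equal = [1.0, 1.0, 1.0]   # equal weights: plain sum of band scores
reliab = [1.0, 1.0, 0.1]  # assumed weights that distrust band 2

print(recognize(band_loglik, equal))
print(recognize(band_loglik, reliab))
```

With equal weights the corrupted band decides the result; with the assumed reliability weights the two clean bands do. The nonlinear strategies the abstract reports as superior (for example a multi-layer perceptron over the band scores) would replace the weighted sum in `combine_subbands`.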

References:

[1] Gales M,Young S.Cepstral parameter compensation for HMM recognition in noise[J].Speech Communication,1993,12(3):231-239.
[2] Leggetter C J,Woodland P C.Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models[J].Computer Speech and Language,1995,9(2):171-185.
[3] Gales M J F,Woodland P C.Mean and variance adaptation within the MLLR framework[J].Computer Speech and Language,1996,10(4):249-264.
[4] Allen J B.How do humans process and recognize speech[J].IEEE Transactions on Speech and Audio Processing,1994,2(4):567-577.
[5] Sharma S R.Multi stream approach to robust speech recognition[D].Portland,USA:Oregon Graduate Institute of Science and Technology,1999.
[6] Tibrewala S,Hermansky H.Sub-band based recognition of noisy speech[A].In:Proc ICASSP’97[C].Munich,Germany,1997.1255-1258.
[7] Hermansky H,Tibrewala S,Pavel M.Towards ASR on partially corrupted speech[A].In:Proc ICSLP’96[C].Philadelphia,USA,1996.462-465.
[8] Ming J,Smith F J.A probabilistic union model for subband based robust speech recognition[A].In:Proc ICASSP’00[C].Istanbul,Turkey,2000.1787-1790.
[9] Ris C,Dupont S.Assessing local noise level estimation methods:application to noise robust ASR[J].Speech Communication,2001,34:141-158.
[10] Hirsch H G.Estimation of noise spectrum and its application to SNR estimation and speech enhancement(TR-93-012)[R].Berkeley,USA:International Computer Science Institute,1993.
[11] Dempster A P,Laird N M,Rubin D B.Maximum likelihood from incomplete data via the EM algorithm[J].Journal of the Royal Statistical Society,Series B,1977,39(1):1-38.
[12] Mak B.A mathematical relationship between fullband and multiband mel-frequency cepstral coefficients[J].IEEE Signal Processing Letters,2002,9(8):241-244.
[13] Morris A,Hagen A,Glotin H,et al.Multi-stream adaptive evidence combination for noise robust ASR[J].Speech Communication,2001,34:25-40.

Related articles:

[1] Zhao Li,Liu Yilong,Zou Cairong,et al.An Unsupervised Speaker Adaptation Method Based on VQ-HMM[J].Journal of Southeast University (Natural Science Edition),2001,31(2):23.[doi:10.3969/j.issn.1001-0505.2001.02.006]
[2] Ma Xiaohui,Fu Yuqing,Lu Jiren.HMM speech recognition method combined with SOFM distortion[J].Journal of Southeast University (Natural Science Edition),1997,27(1):49.[doi:10.3969/j.issn.1001-0505.1997.01.010]
[3] Lü Yong,Wu Zhenyang.Feature compensation algorithm based on hidden Markov model and parallel model combination[J].Journal of Southeast University (Natural Science Edition),2009,39(5):889.[doi:10.3969/j.issn.1001-0505.2009.05.004]

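Memo

Funding: Supported by the National Natural Science Foundation of China (60272044) and the National Basic Research Program of China (973 Program) (2002CB312102).
Biographies: Sun Wei (born 1974), male, doctoral candidate; Wu Zhenyang (corresponding author), male, professor, doctoral supervisor, zhenyang@seu.edu.cn.
Last Update: 2005-07-20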