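1. School of Information Science and Engineering, Xinjiang University
2. Multilingual Information Technology Laboratory, Xinjiang University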
Print publication: 2011
Huang Hao, Li Binghu, Wushour Silamu. A decision-tree-based method for clustering context-dependent weight parameters in tone model integration for Mandarin speech recognition [J]. Journal of Xinjiang University (Natural Science Edition), 2011, 28(3): 260-266.
Tone model integration is an important task in Mandarin speech recognition. In second-pass decoding, integrating tone models with discriminatively trained weight factors has proven effective, and context-dependent score weighting has also been applied to model combination. One limitation of context-dependent model combination is that it introduces a large number of trainable parameters, which makes weight training prone to overfitting. To address this problem, we propose tying the context-dependent weight parameters by clustering them with phonetic decision trees. The question at each tree node is chosen to minimize the expected recognition error on the training data, and question set pruning is used during node splitting to make tree building efficient. Experiments on Mandarin continuous speech recognition show that, compared with manually selected context-dependent weight parameters, the method clearly reduces the error rate while using far fewer parameters.
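The clustering idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes each training token carries a short N-best list of (base score, tone score, error count) triples, it replaces discriminative weight training with a grid search over a few candidate weights, and it omits question set pruning. All names (`expected_error`, `build_tree`, `lookup`, the toy questions and data) are hypothetical.

```python
import math

# Candidate tone-model weights; a grid search stands in for the
# discriminative weight training described in the abstract (assumption).
WEIGHT_GRID = [0.0, 0.5, 1.0, 1.5, 2.0]

def expected_error(tokens, weight):
    """Expected error count when tone scores are scaled by `weight`."""
    total = 0.0
    for _, hyps in tokens:
        scores = [base + weight * tone for base, tone, _ in hyps]
        top = max(scores)
        probs = [math.exp(s - top) for s in scores]  # unnormalized posteriors
        norm = sum(probs)
        total += sum(p * err for p, (_, _, err) in zip(probs, hyps)) / norm
    return total

def best_weight(tokens):
    """Return (expected error, weight) for the best weight on the grid."""
    return min((expected_error(tokens, w), w) for w in WEIGHT_GRID)

def build_tree(tokens, questions, min_gain=1e-3, min_count=2):
    """Greedily split on the question that most reduces expected error."""
    err, weight = best_weight(tokens)
    best = None
    for name, ask in questions.items():
        yes = [t for t in tokens if ask(t[0])]
        no = [t for t in tokens if not ask(t[0])]
        if len(yes) < min_count or len(no) < min_count:
            continue  # too little data to estimate a separate weight
        split_err = best_weight(yes)[0] + best_weight(no)[0]
        if best is None or split_err < best[0]:
            best = (split_err, name, yes, no)
    if best is None or err - best[0] < min_gain:
        return {"weight": weight}  # leaf: all contexts here share one weight
    _, name, yes, no = best
    return {"question": name,
            "yes": build_tree(yes, questions, min_gain, min_count),
            "no": build_tree(no, questions, min_gain, min_count)}

def lookup(tree, questions, context):
    """Walk the tree to find the tied weight for a context."""
    while "question" in tree:
        branch = "yes" if questions[tree["question"]](context) else "no"
        tree = tree[branch]
    return tree["weight"]

# Toy data: the tone model helps for tone 3 and hurts for tone 1.
helpful = [(0.0, 1.0, 0), (0.2, -1.0, 1)]   # correct hypothesis has higher tone score
harmful = [(0.2, -1.0, 0), (0.0, 1.0, 1)]   # correct hypothesis has lower tone score
tokens = [({"tone": 3, "left": left}, helpful) for left in "aba"]
tokens += [({"tone": 1, "left": left}, harmful) for left in "abb"]

questions = {
    "is_tone3": lambda c: c["tone"] == 3,
    "left_is_a": lambda c: c["left"] == "a",
}

tree = build_tree(tokens, questions)
tone3_weight = lookup(tree, questions, {"tone": 3, "left": "b"})
tone1_weight = lookup(tree, questions, {"tone": 1, "left": "a"})
```

On this toy data a single shared weight cannot help both groups of contexts, so the tree splits them and each leaf keeps its own weight. The number of weights is then the number of leaves rather than the number of distinct contexts, which is the parameter reduction the abstract refers to.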
