《Journal of Applied Acoustics》 2020-02

Towards end-to-end speech recognition for Chinese mandarin using SE-MCNN-CTC

ZHANG Wei; ZHAI Minghao; HUANG Zilong; LI Wei; CAO Yi
School of Mechanical Engineering, Jiangnan University; Suzhou Institute of Industrial Technology
To address the high prediction error rate and poor generalization of traditional convolutional neural networks in Chinese speech recognition, this paper analyzes how different configurations of convolutional, pooling, and fully connected layers affect DCNN-CTC. Building on that model, two acoustic models are proposed: MCNN-CTC and SE-MCNN-CTC. The latter combines the advantages of MCNN and SENet: it reinforces the transmission of deep information, effectively avoids gradient problems, and adaptively recalibrates the extracted feature maps. Experimental results show that, compared with DCNN-CTC, SE-MCNN-CTC achieves a 13.51% relative PER reduction, reaching a final PER of 22.21%, and also effectively improves the generalization performance of the acoustic model.
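The adaptive recalibration of feature maps mentioned in the abstract can be illustrated with a generic squeeze-and-excitation step, as it is commonly described for SENet. This is a minimal sketch and not the paper's architecture: the function name `se_recalibrate`, the tiny input, the hand-picked weights, the omission of bias terms, and the reduction from 2 channels to 1 hidden unit are all assumptions chosen for illustration.

```python
import math


def se_recalibrate(feature_maps, w1, w2):
    """Generic squeeze-and-excitation step (illustrative, biases omitted).

    feature_maps: list of C channels, each an H x W list of rows.
    w1: hidden x C weight matrix (bottleneck layer).
    w2: C x hidden weight matrix (expansion layer).
    Returns the rescaled feature maps and the per-channel scales.
    """
    # Squeeze: global average pooling reduces each channel to one number.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
         for ch in feature_maps]
    # Excitation, part 1: bottleneck fully connected layer with ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, part 2: expansion layer with sigmoid, one weight per channel.
    scale = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Recalibration: multiply every value in a channel by that channel's weight.
    out = [[[v * s for v in row] for row in ch]
           for ch, s in zip(feature_maps, scale)]
    return out, scale


# Hypothetical toy input: 2 channels of size 2 x 2.
x = [
    [[1.0, 3.0], [5.0, 7.0]],  # active channel, mean 4.0
    [[0.0, 0.0], [0.0, 0.0]],  # silent channel, mean 0.0
]
w1 = [[0.5, 0.5]]      # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]   # 1 hidden unit -> 2 channel weights
y, scale = se_recalibrate(x, w1, w2)
```

With these hand-picked weights the first channel receives a scale above 0.5 and the second a scale below 0.5, which shows how channel weights depend on the global statistics of the input rather than being fixed. In a trained network `w1` and `w2` would be learned together with the convolutional layers.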
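【Fund】: National Natural Science Foundation of China (51375209); Jiangsu Province "Six Talent Peaks" Program (ZBZZ-012); Jiangsu Province Postgraduate Innovation Program (KYCX18_0630, KYCX18_1846); Higher Education Discipline Innovation and Talent Introduction Program (B18027)
【Category Index】: TN912.34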