Affective speech synthesis of Chinese children
HU Hangye; WANG Wei; School of Educational Science, Nanjing Normal University
Emotional speech synthesis is of great significance for human-computer interaction. To address the scarcity of Chinese speech data for children's emotional speech synthesis and the long training time of such models, this paper proposes a transfer learning method for Chinese children's emotional speech synthesis. The method first trains a deep learning model on a Chinese speech database to obtain an end-to-end Chinese speech synthesis model, then uses a large, high-quality Chinese emotional corpus to build an emotional speech synthesis model, and finally transfers that model using a small, self-collected corpus of Chinese children's emotional speech to achieve low-resource speech synthesis. In the objective evaluation, the Mel cepstral distortion is 4.91; in the subjective listening evaluation, the two scores are 3.61 and 4.17. The experimental comparison shows that the proposed method performs well for emotional speech synthesis and outperforms existing advanced low-resource emotional speech synthesis methods.
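For readers unfamiliar with the objective metric, Mel cepstral distortion (MCD) is commonly defined per frame as (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2) and then averaged over frames, with lower values indicating closer agreement between synthesized and reference speech. The abstract does not say which variant the authors used, so the following is only a minimal sketch of that common definition. It assumes the two sequences are already time-aligned (for example by dynamic time warping) and that coefficient 0 (energy) is excluded; the function name and input layout are illustrative, not taken from the paper.

```python
import math


def mel_cepstral_distortion(ref_frames, syn_frames):
    """Mean MCD in dB over time-aligned frames.

    Each argument is a list of frames; each frame is a list of
    mel-cepstral coefficients. Coefficient 0 (energy) is skipped,
    which is a common convention and an assumption here.
    """
    if not ref_frames or len(ref_frames) != len(syn_frames):
        raise ValueError("frame sequences must be non-empty and of equal length")

    scale = 10.0 / math.log(10.0)
    total = 0.0
    for ref, syn in zip(ref_frames, syn_frames):
        # Squared Euclidean distance over coefficients 1..D
        squared = sum((r - s) ** 2 for r, s in zip(ref[1:], syn[1:]))
        total += scale * math.sqrt(2.0 * squared)
    return total / len(ref_frames)
```

Under this definition, identical sequences give an MCD of 0 dB, and the value grows with the per-frame distance between the reference and synthesized cepstra.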