Neural Network-based Continuous Mandarin Speech Recognition System
|Keywords:||Modular Recurrent Neural Network (MRNN); Gender RNN|
In this thesis, we extend the modular recurrent neural network (MRNN) based speech recognition approach to speaker-independent, continuous Mandarin speech recognition. The system employs a sophisticated MRNN to attack this complicated task. The MRNN is composed of two gender-dependent sub-MRNNs for the discrimination of 411 base-syllables and a gender classification RNN for combining the outputs of these two sub-MRNNs. Each sub-MRNN can be further divided into three parts: two RNNs for discriminating 100 right-final-dependent initials and 39 context-independent finals, two weighting RNNs that generate dynamic weighting functions for combining the initial and final discriminant functions, and one RNN that detects syllable boundaries to provide timing cues for the recognition search. The whole system is trained by a four-level training scheme comprising sub-syllable-, syllable-, utterance-, and gender-level training. Experimental results show that the proposed method outperforms the conventional HMM method: the base-syllable accuracy rate rose from 59.55% with the HMM method to 63.23% with the proposed MRNN method, an absolute gain of 3.68 percentage points.
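The two combination steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the combination formulas, so the weighted sums below are an assumption made for illustration only; the function names `syllable_score` and `combine_gender_scores` and all numeric values are hypothetical and do not come from the thesis.

```python
from typing import List, Sequence


def syllable_score(initial_score: float, final_score: float,
                   w_initial: float, w_final: float) -> float:
    """Combine an initial and a final discriminant score for one frame.

    ASSUMPTION: a plain weighted sum, with the two weights standing in for
    the outputs of the two weighting RNNs at this frame.
    """
    return w_initial * initial_score + w_final * final_score


def combine_gender_scores(male_scores: Sequence[float],
                          female_scores: Sequence[float],
                          p_male: float) -> List[float]:
    """Blend per-syllable scores from the two gender-dependent sub-MRNNs.

    ASSUMPTION: the gender classification RNN yields a value p_male in
    [0, 1], used here as a convex mixing weight.
    """
    if len(male_scores) != len(female_scores):
        raise ValueError("both sub-MRNNs must score the same syllable set")
    if not 0.0 <= p_male <= 1.0:
        raise ValueError("p_male must lie in [0, 1]")
    return [p_male * m + (1.0 - p_male) * f
            for m, f in zip(male_scores, female_scores)]


if __name__ == "__main__":
    # Toy frame with two candidate syllables (the real system has 411).
    male = [syllable_score(0.9, 0.5, 0.4, 0.6),
            syllable_score(0.2, 0.3, 0.4, 0.6)]
    female = [syllable_score(0.6, 0.4, 0.5, 0.5),
              syllable_score(0.3, 0.7, 0.5, 0.5)]
    blended = combine_gender_scores(male, female, p_male=0.75)
    best = max(range(len(blended)), key=blended.__getitem__)
    print("blended scores:", blended, "best syllable index:", best)
```

In this sketch the weights vary per frame, which is one way to read "dynamic weighting functions"; the syllable-boundary RNN and the recognition search are left out.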
|Appears in Collections:||Thesis|