Title: 主軸空間隱藏式馬可夫模型之開發及其於中文連續數字辨識之應用
Development of Principal Space HMM Algorithm and Its Application to Continuous Mandarin Digits Recognition
Authors: 張紋碩
Wen-Shuo Chang
Chin-Teng Lin
Keywords: hidden Markov models (HMMs); one-state algorithm; principal space hidden Markov models (PSHMMs); PCV parameter; bounded state durations
Issue Date: 2000
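Abstract: The main purpose of this thesis is to study continuous Mandarin speech recognition. Many algorithms have been proposed for continuous speech recognition; one of them is the one-state algorithm. This thesis focuses on two major problems of the one-state algorithm. The first is that the quality of the reference models strongly affects the recognition rate of the one-state algorithm. Because good reference models effectively improve continuous speech recognition, this thesis proposes the principal space hidden Markov model (PSHMM) to improve the recognition performance of the reference models themselves. Experiments show that PSHMMs achieve a recognition rate about 3% higher than conventional hidden Markov models in isolated-word recognition and 6.61% higher in continuous speech recognition. The second problem is that the reference models usually used with the one-state algorithm are hidden Markov models, which cannot by themselves provide adequate modeling of the temporal (transient) structure of speech. This thesis therefore proposes a new parameter, the principal component variance (PCV) parameter, which, combined with the previously proposed bounded state durations method, compensates for errors caused by insufficient temporal information. Experimental results show that adding temporal information to continuous speech recognition raises the recognition rate by 7.94% for speaker-dependent continuous Mandarin digit recognition and by 13.89% for speaker-independent continuous Mandarin digit recognition. Finally, applying all the methods proposed in the thesis to continuous Mandarin digit recognition raises the recognition rate by 20.5% over the original recognition system.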
In this thesis, we investigate automatic speech recognition (ASR) for Mandarin digits. One connected-word pattern-matching method is the one-state algorithm, which has two problems. The first is the selection of the acoustic model. The second is that the temporal structure of the speech signal is considered only in the reference pattern and not in the test pattern. To overcome the first problem, we propose a new acoustic model, the PSHMM, which improves recognition performance at the acoustic level. Compared with conventional HMMs, PSHMMs increase the recognition rate by about 3% for speaker-independent isolated words and by 6.61% for speaker-independent continuous Mandarin digit recognition. To tackle the second problem, we propose a new form of temporal information, the PCV parameter, and combine it with bounded state durations in the one-state algorithm. This combination increases the recognition rate by 13.89% in speaker-independent recognition and by 7.94% in speaker-dependent recognition. Finally, the proposed methods are applied to build a high-performance speaker-independent continuous Mandarin digit recognition system. The final experimental results show that, without grammar or lexical rules, the new methods raise the recognition rate for continuous Mandarin digits from 55.35% to 75.85%.
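The abstract does not give the thesis's formulation of bounded state durations, so the following is only a generic sketch of the idea: Viterbi-style segmentation of one left-to-right model in which each state must occupy between a minimum and a maximum number of frames. The function name `bounded_duration_viterbi`, the inputs `loglik`, `dmin`, and `dmax`, and the omission of transition probabilities are all assumptions made for illustration. It does not implement the PCV parameter, PSHMMs, or the full one-state algorithm over connected digits.

```python
import math

def bounded_duration_viterbi(loglik, dmin, dmax):
    """Best segmentation of T frames into N left-to-right states.

    loglik[t][j] : log-likelihood of frame t under state j (hypothetical input)
    dmin[j], dmax[j] : allowed duration bounds for state j, with dmin[j] >= 1
    Returns (best_score, durations), or (-inf, None) if no segmentation
    satisfies the bounds. Transition probabilities are ignored (assumption).
    """
    T, N = len(loglik), len(dmin)
    NEG = -math.inf

    # prefix[j][t] = sum of loglik[0..t-1][j], so a segment score is a difference.
    prefix = [[0.0] * (T + 1) for _ in range(N)]
    for j in range(N):
        for t in range(T):
            prefix[j][t + 1] = prefix[j][t] + loglik[t][j]

    # best[j][t] = best score with states 0..j covering frames 0..t-1.
    best = [[NEG] * (T + 1) for _ in range(N)]
    back = [[0] * (T + 1) for _ in range(N)]
    for j in range(N):
        for t in range(1, T + 1):
            for d in range(dmin[j], min(dmax[j], t) + 1):
                if j == 0:
                    prev = 0.0 if t - d == 0 else NEG
                else:
                    prev = best[j - 1][t - d]
                if prev == NEG:
                    continue
                score = prev + prefix[j][t] - prefix[j][t - d]
                if score > best[j][t]:
                    best[j][t] = score
                    back[j][t] = d

    if best[N - 1][T] == NEG:
        return NEG, None

    # Trace back the chosen duration of each state, last state first.
    durations, t = [], T
    for j in range(N - 1, -1, -1):
        d = back[j][t]
        durations.append(d)
        t -= d
    durations.reverse()
    return best[N - 1][T], durations


# Toy example: 4 frames, 2 states; frame 0 fits state 0, the rest fit state 1.
toy = [[0, -5], [-5, 0], [-5, 0], [-5, 0]]
loose = bounded_duration_viterbi(toy, dmin=[1, 1], dmax=[4, 4])
tight = bounded_duration_viterbi(toy, dmin=[2, 1], dmax=[4, 4])
```

In the toy example, tightening the lower bound of the first state forces it to absorb a poorly matching frame, which illustrates how duration bounds can rule out segmentations that an unconstrained search would accept.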
Appears in Collections: Thesis