Title: Spectro-temporal Smoothed Auditory Spectra for Robust Speaker Identification (時域-頻域上的聽覺頻譜平滑化之強健性語者辨識)
Authors: 林廷翰
冀泰石
Institute of Communications Engineering
Keywords: auditory model; speaker identification
Issue Date: 2009
Abstract: The performance of conventional speaker recognition systems is severely degraded by interference such as additive and convolutional noise. This is because conventional features capture only the lowest-level cues of an utterance, whereas higher-level cues have been shown to be more robust to noise. This thesis proposes auditory-model-based spectral features, auditory cepstral coefficients (ACCs), together with a spectro-temporal modulation filtering (STMF) process, to capture high-level information for robust speaker recognition. Text-independent, closed-set speaker recognition experiments are conducted on the TIMIT and GRID corpora to evaluate the robustness of ACCs and the benefits of the STMF process. Experimental results show that ACCs yield significantly higher recognition rates than conventional MFCCs in all SNR conditions, and that STMF outperforms the recently developed ANTCCs in low-SNR conditions.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079713534
http://hdl.handle.net/11536/44552
Appears in Collections: Thesis


Files in This Item:

  1. 353401.pdf