|Title:||Design of Pitch Based Noise Reduction Adopting Low Latency Quasi ANSI S1.11 1/3 Octave Filter Bank and VAD-based Wide Dynamic Range Compression for Mandarin Digital Hearing Aid System|
|Keywords:||hearing aids; noise reduction; dynamic range compression; voice activity detection; Mandarin; pitch; non-stationary background environment|
|Abstract:||In this thesis, we propose a pitch-based noise reduction (NR) system and a VAD-based wide dynamic range compression (WDRC) scheme. Both adopt a quasi-ANSI 1/3-octave filter bank with low group delay so that they can be implemented in practical hearing aid (HA) systems. The proposed pitch-based NR comprises pitch-based voice activity detection (VAD) and onset-dependent noise attenuation (ONA). It exploits characteristics of speech such as pitch and its corresponding harmonics, onsets, and the duration of a monosyllabic word. Because the quasi-ANSI filter bank has low resolution, the proposed pitch-based VAD integrates the pitch and onset features with flexible harmonics detection to improve VAD accuracy, and the proposed ONA is designed to overcome the poor resolution of the filter bank. In addition, an update mechanism for the long-term average magnitude is employed to improve detection of the onset feature. Simulation results show that the proposed pitch-based NR performs well in both stationary background noise (the user is still) and highly dynamic background noise (the user is moving). The accuracy of the proposed pitch-based VAD is comparable to that of a pitch-based VAD adopting the high-resolution ANSI filter bank: its average accuracy is about 83.70% in stationary noise and 85.70% in dynamic noise. The proposed ONA improves the segmental signal-to-noise ratio (SNRseg) and the signal-to-noise ratio (SNR) by an average of 5.95 dB and 9.12 dB, respectively, in stationary noise, and by 6.49 dB and 9.47 dB in dynamic noise. Moreover, the average improvement in speech quality (PESQ) is 0.19 in stationary noise and 0.22 in dynamic noise. The proposed VAD-based WDRC enhances the energy difference between speech and noise.
WDRC algorithms are usually developed for clean-speech scenarios without considering background noise. Because WDRC applies less gain to high-level signals than to low-level ones, high-energy speech may be suppressed more than low-energy background noise. This causes an undesired interaction when NR and WDRC are connected: the WDRC block can degrade the performance of the NR block, so the energy difference between speech and noise decreases and speech intelligibility suffers. With the VAD information from the NR block, WDRC can process speech regions and noise regions differently and thereby increase speech intelligibility. Simulation results show that the proposed VAD-based WDRC helps reduce the undesired interaction between NR and WDRC. The computational complexity of the proposed pitch-based NR and VAD-based WDRC is low, and the modest cost of the modifications yields a substantial performance gain. Finally, the total latency of the proposed algorithm, including the quasi-ANSI filter bank, is only 11.3 ms, which meets the latency requirement of HA systems and makes it suitable for HA applications.
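
The interaction described above can be illustrated with a minimal sketch. It is not the thesis's algorithm: the static compression curve, the threshold, ratio, and gain values, and the rule used for noise frames (capping the gain at a fixed value) are all assumptions chosen only to show why a VAD flag can preserve the level difference between speech and noise. The function names are hypothetical.

```python
def wdrc_gain_db(level_db, threshold_db=45.0, ratio=3.0, linear_gain_db=20.0):
    """Static WDRC curve (assumed): full gain up to the threshold, compressed above it."""
    if level_db <= threshold_db:
        return linear_gain_db
    # Above the threshold the output rises by only 1/ratio dB per input dB,
    # so the gain falls by (1 - 1/ratio) dB per input dB.
    return linear_gain_db - (level_db - threshold_db) * (1.0 - 1.0 / ratio)


def vad_wdrc_gain_db(level_db, is_speech, noise_gain_db=0.0):
    """VAD-gated gain (assumed rule): noise frames never get more than noise_gain_db."""
    gain = wdrc_gain_db(level_db)
    if is_speech:
        return gain
    return min(gain, noise_gain_db)


def output_contrast_db(speech_db, noise_db, use_vad):
    """Output level difference between a speech frame and a noise frame."""
    if use_vad:
        speech_out = speech_db + vad_wdrc_gain_db(speech_db, is_speech=True)
        noise_out = noise_db + vad_wdrc_gain_db(noise_db, is_speech=False)
    else:
        speech_out = speech_db + wdrc_gain_db(speech_db)
        noise_out = noise_db + wdrc_gain_db(noise_db)
    return speech_out - noise_out


if __name__ == "__main__":
    # Hypothetical frame levels: speech at 65 dB, noise at 45 dB (20 dB apart).
    print("plain WDRC contrast (dB):", output_contrast_db(65.0, 45.0, use_vad=False))
    print("VAD-gated contrast (dB): ", output_contrast_db(65.0, 45.0, use_vad=True))
```

With these assumed parameters, plain WDRC shrinks the 20 dB input difference to roughly 6.7 dB because the quieter noise frame receives the full linear gain, while the gated version keeps the difference above 20 dB. A practical design would also need attack and release smoothing and per-band processing, which this sketch omits.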