Title: 基於多秩訊號模型之語音純化波束形成與多通道後濾波
Beamforming and multi-channel post-filtering for speech enhancement based on multi-rank signal models
Authors: 李明唐
Lee, Ming-Tang
胡竹生
Hu, Jwu-Sheng
Institute of Electrical and Control Engineering
Keywords: multi-rank signal model; beamforming; post-filtering; speech enhancement
Issue Date: 2013
Abstract: Beamforming and multi-channel post-filtering are the two major techniques in microphone-array speech enhancement. Conventional algorithms usually describe the spatial relationship between a source and the microphones by the delays or relative transfer functions (RTFs) between microphone pairs, which restricts the source model to a rank-1 spatial correlation. In real sound propagation, however, the spatial correlation is typically multi-rank because of local scattering, wavefront fluctuation, or reverberation. This dissertation therefore proposes beamforming and multi-channel post-filtering algorithms based on multi-rank signal models. The proposed algorithms alleviate self-cancellation of the desired source during noise reduction and thereby improve speech quality. For beamforming, multi-rank signal models and a norm constraint are introduced into the minimum variance distortionless response (MVDR) problem to reduce the sensitivity of the design to array uncertainties. Using the pseudo-observation method, the beamforming problems are cast in state-space form and solved with first- and second-order extended Kalman filters (EKF) and the unscented Kalman filter (UKF). The selection of the norm constraint value is also analyzed in detail. Simulations show that the norm constraint is more robust to unknown signal powers and model errors than diagonal loading (DL). For multi-channel post-filtering, a novel spatial coherence measure is defined, and multi-rank signal models are introduced into post-filter development for the first time. The measure evaluates the similarity between two sound fields through their power spectral density (PSD) matrices, and a multi-channel post-filter is designed from it. Under this measure, the bias caused by the similarity between the desired signal field and the noise field is analyzed, and a bias-compensated post-filter is proposed. The compensated solution is shown to be equivalent to the optimal Wiener filter when the bias or the noise PSD matrix is perfectly estimated. Theoretical and experimental results demonstrate that, with a more accurate signal model, the proposed bias-compensated post-filter provides better speech quality.
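The self-cancellation that the abstract refers to can be illustrated with the conventional rank-1 MVDR baseline. The sketch below is not the dissertation's multi-rank or Kalman-filter method. It only assumes a hypothetical 4-microphone array, a delay-only steering vector, and made-up power levels, and compares plain MVDR with diagonal loading when the assumed look direction is slightly wrong. All function names and parameter values are illustrative assumptions.

```python
import cmath


def solve(A, b):
    """Solve A x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def steering_vector(num_mics, phase_step):
    """Rank-1 (delay-only) model: microphone m sees a phase shift of m * phase_step."""
    return [cmath.exp(-1j * m * phase_step) for m in range(num_mics)]


def covariance(a, power, noise):
    """power * a a^H + noise * I: one point source plus spatially white noise."""
    n = len(a)
    return [[power * a[i] * a[j].conjugate() + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def mvdr_weights(R, a, loading=0.0):
    """w = (R + loading*I)^{-1} a / (a^H (R + loading*I)^{-1} a)."""
    n = len(a)
    R_loaded = [[R[i][j] + (loading if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    u = solve(R_loaded, a)
    denom = sum(a[i].conjugate() * u[i] for i in range(n))
    return [ui / denom for ui in u]


def response(w, a):
    """Beamformer response w^H a to a source with spatial signature a."""
    return sum(wi.conjugate() * ai for wi, ai in zip(w, a))


NUM_MICS = 4                                  # hypothetical array size
a_assumed = steering_vector(NUM_MICS, 0.0)    # look direction the designer assumes
a_true = steering_vector(NUM_MICS, 0.1)       # actual source, slightly off (assumed mismatch)
R = covariance(a_true, power=10.0, noise=0.1)  # data covariance contains the target itself

w_plain = mvdr_weights(R, a_assumed)
w_loaded = mvdr_weights(R, a_assumed, loading=10.0)

gain_plain = abs(response(w_plain, a_true))    # target gain without loading
gain_loaded = abs(response(w_loaded, a_true))  # target gain with diagonal loading
print("target gain, plain MVDR:      ", round(gain_plain, 3))
print("target gain, diagonal loading:", round(gain_loaded, 3))
```

In this toy setup, plain MVDR treats the slightly mismatched target as interference and attenuates it strongly, while diagonal loading keeps the target gain close to one. The loading level here is hand-picked, and the result depends on it and on the signal power, which is the kind of sensitivity the abstract cites as motivation for the norm constraint and multi-rank models.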
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079712822
http://hdl.handle.net/11536/73110
Appears in Collections: Thesis


Files in This Item:

  1. 282201.pdf