3D Audio Analysis and Synthesis
Mingsian R. Bai
|Keywords:||3D audio (surround sound); head-related transfer function (HRTF); source localization; interaural transfer function (IATF); room response; virtual source representation|
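|Abstract:||This dissertation focuses on the analysis and synthesis of three-dimensional (3D) surround audio. 3D audio here comprises reproducing both the sense of sound source localization and the listener's sense of spaciousness. For source localization, two methods are proposed: a 3D array model and an FIR-based interaural transfer function (IATF). Both address the drawbacks of using head-related transfer functions (HRTFs) directly, reducing storage and computation. For room response reproduction, where existing techniques face bottlenecks, an efficient method is proposed that reproduces a more natural sense of presence with less computation. First, a virtual source representation is used to approximate the early reflections of a source in an enclosed sound field. Then, to further emphasize spaciousness, comb-nested filters simulate the late reverberation, with their parameters optimized by a genetic algorithm. All proposed methods are verified through numerical simulation and through objective and subjective experiments.|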
This dissertation focuses on 3D audio analysis and synthesis, covering sound source localization and the reproduction of room effects. For sound source localization, two approaches were developed. First, an external ear model based on a three-dimensional array beamformer is presented to synthesize head-related transfer functions (HRTFs). The array coefficients are calculated by matching the measured HRTFs with a frequency-domain template, and the resulting model-matching problem is solved using a singular value decomposition (SVD) procedure. Second, a perceptual approach for calculating HRTFs is presented. In this method, the ratio between the contralateral and ipsilateral HRTFs, the interaural transfer function (IATF), is represented by a lower-order finite impulse response (FIR) filter. The FIR filter is obtained using the Wiener filter approach. To further improve computational efficiency, the absolute threshold of human hearing is exploited to eliminate redundancy in the HRTFs. For room effects reproduction, an artificial reverberator is proposed to synthesize room responses. The method employs the virtual source representation and comb-nested allpass filters to generate the early reflections and late reverberation, respectively, of room responses. The filtering property of human hearing is also exploited in a non-uniform sampling procedure to further simplify the computation. Optimal parameters of the comb-allpass filter network are obtained using a genetic algorithm (GA). All proposed methods were examined both objectively and subjectively and were found to be effective for 3D audio synthesis.
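
As a rough illustration of the IATF idea described above, the following sketch fits a short FIR filter that maps an ipsilateral impulse response to the contralateral one by deterministic least squares, which is the finite-data counterpart of a Wiener filter. It is not the dissertation's implementation: the function name `fit_interaural_fir`, the filter order, and the synthetic impulse responses are all assumptions made for this example, and no measured HRTF data are used.

```python
import numpy as np


def fit_interaural_fir(h_ipsi, h_contra, order):
    """Least-squares FIR g of length `order` such that h_ipsi * g ~ h_contra.

    Hypothetical helper for illustration; a Wiener-style fit on finite data.
    """
    h_ipsi = np.asarray(h_ipsi, dtype=float)
    h_contra = np.asarray(h_contra, dtype=float)
    n_out = len(h_ipsi) + order - 1

    # Convolution matrix: column k holds h_ipsi delayed by k samples.
    A = np.zeros((n_out, order))
    for k in range(order):
        A[k:k + len(h_ipsi), k] = h_ipsi

    # Target response, zero-padded or truncated to the convolution length.
    d = np.zeros(n_out)
    m = min(n_out, len(h_contra))
    d[:m] = h_contra[:m]

    # Minimize ||A g - d||^2 (the normal equations are the Wiener-Hopf form).
    g, *_ = np.linalg.lstsq(A, d, rcond=None)
    return g


# Synthetic check (assumed data): the contralateral response is the
# ipsilateral one delayed and attenuated by a known short filter.
rng = np.random.default_rng(0)
h_ipsi = rng.standard_normal(32)
g_true = np.array([0.0, 0.0, 0.6, 0.3, -0.1])
h_contra = np.convolve(h_ipsi, g_true)

g_est = fit_interaural_fir(h_ipsi, h_contra, order=len(g_true))
```

With such a filter, only one HRTF per direction plus a short interaural filter needs to be stored, which is the kind of storage and computation saving the abstract refers to.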