Title: Compression Artifacts in Perceptual Audio Coding (知覺式音訊編碼壓縮瑕疵之探討)
Authors: 許瀚文
Hsu, Han-Wen
劉啟民
Liu, Chi-Min
Institute of Computer Science and Engineering
Keywords: MPEG audio compression; coding artifacts; MPEG audio; artifact
Issue Date: 2009
Abstract: Perceptual audio coding achieves high compression ratios by exploiting perceptual irrelevance and data redundancy. Because it relies on advanced and sophisticated signal processing, it produces artifacts that differ markedly from traditional compression distortion. A new audio technology matures only once the artifacts it introduces have been successfully modeled, measured, and controlled. With the new coding modules in state-of-the-art methods such as Advanced Audio Coding (AAC), Spectral Band Replication (SBR), and parametric coding, these artifacts are far more difficult to model, measure, and control than those of earlier coding systems such as pulse code modulation. This dissertation examines MPEG audio coding, including MP3, AAC, SBR, and Parametric Stereo (PS) coding, and explores the compression artifacts introduced by these newer techniques in terms of their principles, sources, perceptual impact, and methods for reducing them. We illustrate the audible artifacts with time-frequency diagrams, identify the types of audio material that readily trigger them, analyze the critical encoder modules that cause them, and verify them with empirical examples. Specifically, we propose a spectral patching (audio patch) method to reduce two zero-quantization artifacts, and a fast odd-radix algorithm for computing the type-IV discrete cosine transform (DCT-IV) in the filterbank, which breaks the tradeoff between parallelism and numerical distortion found in existing methods. We establish compact formulations of Temporal Noise Shaping (TNS) in AAC and consider a known artifact, time-domain aliasing noise. Finally, we explore new kinds of artifacts in SBR and PS coding and demonstrate the predictive bias of the linear prediction used in SBR.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009317826
http://hdl.handle.net/11536/78855
Appears in Collections: Thesis


Files in This Item:

  1. 782601.pdf