Title: Real-Time Implementation of H.263+ Using TI TMS320C62xx
Author: 吳孟隆
Mon-Long Woo
杭學鳴
Dr. Hsueh-Ming Hang
Institute of Electronics
Keywords: H.263+; C62xx; Digital Signal Processing; Real-Time Implementation; Diamond Search; Fixed-Point Fast Discrete Cosine Transform
Date of Publication: 1999
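Abstract: With advances in digital signal processing, real-time video transmission is about to become an essential part of daily life. This thesis implements a baseline H.263+ codec on a commercially available digital signal processor (DSP), with the goal of real-time operation. To reach this goal, we replace several slow functional blocks in the original C program and modify the C program to exploit the features of the DSP and its compiler. We first give a brief introduction to the ITU-T video compression standard H.263+, whose coded bit rate is usually below 64 kbps. We then introduce the DSP we adopt, the Texas Instruments TMS320C62xx, a fixed-point processor with powerful signal processing capability. We take the simulation software tmn 2.0 provided by Telenor Research as the starting template, and replace the corresponding parts of tmn 2.0 with the diamond search from tmn 3.1.1, the software encoder provided by the University of British Columbia, and with a fixed-point Decimation-In-Frequency DCT, which greatly reduces the computation of the whole system. We then modify the program further, taking the characteristics of the TMS320C62xx and its C compiler into account, to obtain more efficient code. In the end, we save 95% and 97% of the computation for intra and inter frames, respectively. For sub-QCIF pictures, our encoder processes 69 intra pictures or 31 inter pictures per second, and our decoder decodes more than 30 pictures per second. With the whole encoding and decoding flow running on a single Texas Instruments processor, the system processes 24 sub-QCIF pictures per second.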
With advances in digital signal processing, real-time video transmission will become an essential element of our daily life. In this thesis, we implement a real-time H.263+ codec on a digital signal processor (DSP). To achieve this goal, we replace a few slow blocks in the original C programs. Furthermore, the C programs are modified to take advantage of the DSP architecture and the features of its C compiler. We first give a brief introduction to the ITU-T video compression standard, H.263+. It produces reasonable-quality videophone pictures at bit rates around 40 kbps. Then, we briefly describe the Texas Instruments digital signal processor, TMS320C62xx, which is used in our implementation. It is a powerful processor with fixed-point arithmetic. We start with the simulation software tmn 2.0 provided by Telenor Research as the initial template and then modify it to increase its speed. We use the diamond search, which is included in tmn 3.1.1 (a software encoder offered by the University of British Columbia), to replace the original full search scheme. We use a fixed-point Decimation-In-Frequency DCT algorithm to replace the floating-point DCT block in tmn 2.0. These two fast algorithms greatly reduce the computational complexity of the entire system. We further refine our code by taking into account the features of the TMS320C62xx and its C compiler to produce a more efficient program. Overall, we save 95% of the computation load for intra frame coding and 97% for inter frame coding. Our encoder can handle 69 intra frames or 31 inter frames per second for sub-QCIF pictures. The entire system can thus process about 24 frames per second using only one TI processor for both encoding and decoding.
Table of Contents:
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
1 Overview
2 A Brief Introduction to H.263+
  2.1 The Basic H.263+ Encoder
    2.1.1 Source Format
    2.1.2 Syntax and Semantics
    2.1.3 INTRA and INTER Coding
    2.1.4 Choice of the Coding Mode
    2.1.5 Quantization
    2.1.6 Motion Vector Prediction
  2.2 H.263+ Decoder
  2.3 Optional Modes of H.263+
3 A Brief Introduction to TI TMS320C62xx
  3.1 The C62xx Core and The Pipeline
  3.2 The Design Flow
4 Fast Search and Fast DCT
  4.1 Fast Diamond Search
  4.2 Decimation-In-Frequency (DIF) DCT
  4.3 Simulation Results
5 Implementation and Comparison
  5.1 Test Environment
    5.1.1 The Blue Waves Systems PCI/C6600
    5.1.2 The C6x C Compiler
    5.1.3 The Code Composer
  5.2 Encoder Performance
    5.2.1 The TMN5 Code
    5.2.2 The Results of Using Fast Algorithms
    5.2.3 Program Refinement
    5.2.4 Overall Performance at The End
  5.3 Decoder Performance
  5.4 System Performance
6 Conclusions and Future Work
Bibliography
Author Biography
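The abstract says a diamond search replaces the full search in motion estimation. As a rough illustration of why that cuts computation, here is a minimal C sketch of a generic diamond search over 16x16 blocks using the sum of absolute differences (SAD). It is not the tmn 3.1.1 code: the large and small diamond patterns, the names `MotionVector`, `sad_block`, `in_window`, and `diamond_search`, and the convention that `(dx, dy)` points from the current block to its match in the reference frame are all assumptions made for this example.

```c
#include <stdlib.h>

#define BLOCK 16 /* macroblock width and height in pixels (assumed) */

typedef struct { int dx, dy; } MotionVector; /* hypothetical type */

/* SAD between the block at (cx, cy) in cur and the block at (rx, ry) in ref. */
static int sad_block(const unsigned char *cur, const unsigned char *ref,
                     int stride, int cx, int cy, int rx, int ry)
{
    int sum = 0;
    for (int y = 0; y < BLOCK; y++)
        for (int x = 0; x < BLOCK; x++)
            sum += abs(cur[(cy + y) * stride + cx + x] -
                       ref[(ry + y) * stride + rx + x]);
    return sum;
}

/* True if the displaced block stays inside the search range and the frame. */
static int in_window(int width, int height, int bx, int by,
                     int dx, int dy, int range)
{
    return abs(dx) <= range && abs(dy) <= range &&
           bx + dx >= 0 && by + dy >= 0 &&
           bx + dx + BLOCK <= width && by + dy + BLOCK <= height;
}

/* Find a displacement (dx, dy) such that the current block at (bx, by)
   resembles the reference block at (bx + dx, by + dy). */
MotionVector diamond_search(const unsigned char *cur, const unsigned char *ref,
                            int width, int height, int bx, int by, int range)
{
    static const int large_pattern[8][2] = {
        {0, -2}, {1, -1}, {2, 0}, {1, 1}, {0, 2}, {-1, 1}, {-2, 0}, {-1, -1}};
    static const int small_pattern[4][2] = {{0, -1}, {1, 0}, {0, 1}, {-1, 0}};
    MotionVector mv = {0, 0};
    int best = sad_block(cur, ref, width, bx, by, bx, by);
    int moved = 1;

    /* Large diamond: recenter on the best point until the center wins.
       The loop ends because best strictly decreases on every move. */
    while (moved) {
        MotionVector center = mv;
        moved = 0;
        for (int i = 0; i < 8; i++) {
            int dx = center.dx + large_pattern[i][0];
            int dy = center.dy + large_pattern[i][1];
            if (!in_window(width, height, bx, by, dx, dy, range))
                continue;
            int s = sad_block(cur, ref, width, bx, by, bx + dx, by + dy);
            if (s < best) { best = s; mv.dx = dx; mv.dy = dy; moved = 1; }
        }
    }

    /* Small diamond: one final refinement around the converged center. */
    MotionVector center = mv;
    for (int i = 0; i < 4; i++) {
        int dx = center.dx + small_pattern[i][0];
        int dy = center.dy + small_pattern[i][1];
        if (!in_window(width, height, bx, by, dx, dy, range))
            continue;
        int s = sad_block(cur, ref, width, bx, by, bx + dx, by + dy);
        if (s < best) { best = s; mv.dx = dx; mv.dy = dy; }
    }
    return mv;
}
```

A full search with a range of 7 evaluates 225 candidate positions per block, while this sketch evaluates at most 8 per large-diamond pass plus 4 at the end. The trade-off is that a pattern search can settle in a local minimum of the SAD surface, so it is not guaranteed to find the same vector as a full search.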
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT880428017
http://hdl.handle.net/11536/65649
Appears in Collections: Thesis