Title: Media Streaming Architecture with Homogeneous Processor Cores
Authors: Chia-Yi Liou (劉嘉儀)
Herming Chiueh (闕河鳴)
Institute of Communications Engineering
Keywords: Media Streaming Architecture; Homogeneous Processor Core; Intellectual Property; ALU cluster IP; Floating Point Operation Unit; Performance Evaluation
Issue Date: 2006
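Abstract: As technology advances, portable embedded systems for media processing are becoming increasingly important in everyday life. However, because of the memory access model of conventional computing and the performance gap between processor cores and memory, media processing cannot be mapped onto and implemented efficiently on conventional processor architectures. The system architecture of the hardware implementation is another major factor that limits performance. The proposed media processing architecture therefore adopts the stream processing model proposed at Stanford University, combined with several hardware system architectures, to overcome the inefficiency of conventional processor architectures and to provide a media computing platform with high parallelism and efficient throughput. In this thesis, an AMBA-compatible media processing unit is designed and taped out to serve as the core computing unit of a media streaming architecture with multiple homogeneous processor cores. A floating point operation unit is also designed and implemented; it gives the AMBA-compatible media processing unit efficient floating point capability so that it can be applied to a wider range of media processing tasks. Performance evaluation and comparison across different computing units and architectures confirm that a small additional hardware cost yields more efficient and more broadly applicable media processing capability. The evaluation also confirms that, as the number of processor cores in the architecture varies, performance improves effectively at a reasonable hardware cost.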
With the evolution of information technology, embedded systems running media applications on portable devices are increasingly important in modern life. However, conventional processor architectures do not handle the processing requirements of media applications well, owing both to the characteristics of those applications and to inherent limitations of the conventional microprocessor architecture: its memory access model and the processor-memory performance gap. Recent research shows that the stream processing model and stream processor architectures are well suited to media applications. However, implementing software for a stream processor is not trivial, since it involves extensive manual optimization of memory exchange and of thread deployment to different processing elements or functional units. In this thesis, a processing element for a reconfigurable homogeneous ALU cluster and its Advanced Microcontroller Bus Architecture (AMBA) platform interface have been designed and implemented. The proposed design integrates a platform-based design methodology with the stream processing model to meet the challenges of media applications. The proposed homogeneous ALU cluster serves as a reconfigurable hardware accelerator for the different specific functions in media applications. The chosen AMBA interface provides an integration platform for an embedded operating system and a programming development environment. The combination of these methodologies provides a turnkey solution for media application development on modern portable devices. The ALU cluster IP with AMBA interface was taped out in TSMC 0.15 µm technology and operates at 100 MHz. The chip area is 3.9 × 3.9 mm² and the gate count is 0.2 million. A 4-layer FRP printed circuit board was designed and fabricated as the daughter card for system integration. The daughter card, which carries the designed chip, is integrated with the ARM Versatile platform board as the system integration and application development environment.
In addition, a floating point operation unit for the ALU cluster IP is proposed and implemented; it will be integrated with the ALU cluster IP in a future revision of the hardware accelerator. The hard macro of the floating point unit operates at 75 MHz; its area and gate count are 0.415 mm² and 0.02 million, respectively. A performance evaluation and comparison of the proposed architectures on a floating point operation benchmark is presented. In the future, media applications can be developed for the proposed reconfigurable homogeneous processing elements using the chips and systems built in this thesis.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009413604
http://hdl.handle.net/11536/80864
Appears in Collections: Thesis


Files in This Item:

  1. 360401.pdf