Title: 針對多媒體應用之可變延遲且具猜測能力布斯乘法器之研究
Study on Variable-Latency Speculating Booth Multiplier for Multimedia Applications
Authors: 吳宗宜
Wu, Tsung-Yi
劉志尉
Liu, Chih-Wei
Institute of Electronics
Keywords: Booth multiplier; Speculation; Variable-latency; Error detection
Issue Date: 2012
Abstract: Multiplication is critical to today's multimedia standards because it accounts for a large proportion of the computation in most multimedia applications. As these applications demand ever more multiplications, multipliers need higher performance to meet the requirements of the standards. Pipelining is a common and effective technique for improving multiplier performance. However, data hazards degrade the performance of a pipelined design because extra stall cycles are needed to resolve them. This study therefore designs a high-speed, energy-efficient 2-stage pipelined speculating multiplier that reduces the stall cycle count.
The proposed multiplier has two successive stages: the speculating stage and the correcting stage. To shorten the critical path, the partial products are vertically partitioned into an (n-z)-bit least significant part (LSP) and an (n+z)-bit most significant part (MSP). During the speculating stage, the LSP and MSP summations are computed in parallel. Instead of waiting for the carry to propagate from the LSP, the carry-in of the (n+z)-bit MSP is generated by a carry estimation function. If the estimate is wrong, detection logic in the multiplier flags the error before the end of the speculating stage, and the result is corrected in the correcting stage. An extra stall cycle is therefore needed only when a wrong speculation and a data dependence occur at the same time, so the proposed multiplier incurs fewer stall cycles than a conventional pipelined design. When the speculation is correct, the data dependence is hidden and the potential speedup of the pipelined datapath is preserved.
In the experiments, the proposed multiplier is evaluated with three real multimedia applications: JPEG compression, human face detection, and an H.264/AVC decoder. Taking both cycle time and cycle count into account, it achieves a speedup of approximately 1.0 to 1.4 times over a conventional 2-stage pipelined multiplier across the three applications. It also saves approximately 7% of the area and 14.1% of the energy dissipation.
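The speculate/detect/correct flow described in the abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the thesis's circuit: it uses plain unsigned multiplication rather than Booth recoding, it stands in for the compressed partial products with two addends `x` and `y` built from the even and odd rows, and `estimate_carry` is a hypothetical estimator (the abstract does not specify the actual carry estimation function). The names `speculative_multiply` and `stall_counts` are likewise illustrative.

```python
def estimate_carry(x_lsp, y_lsp, lsp_bits):
    """Hypothetical estimator: guess a carry if either top LSP bit is set."""
    top_x = (x_lsp >> (lsp_bits - 1)) & 1
    top_y = (y_lsp >> (lsp_bits - 1)) & 1
    return top_x | top_y


def speculative_multiply(a, b, n=16, z=2):
    """Model one unsigned n x n multiply with a speculated LSP-to-MSP carry.

    Returns (product, mispredicted).
    """
    lsp_bits = n - z                      # LSP is (n-z) bits, MSP is (n+z) bits
    assert lsp_bits >= 1
    a &= (1 << n) - 1
    b &= (1 << n) - 1
    lsp_mask = (1 << lsp_bits) - 1

    # Stand-in for the compressed partial products (assumption):
    # x sums the even partial-product rows, y the odd rows, so x + y == a * b.
    even_rows = sum(1 << i for i in range(0, n, 2))
    x = a * (b & even_rows)
    y = a * (b & ~even_rows)

    # Speculating stage: LSP and MSP are summed independently.
    x_lsp, y_lsp = x & lsp_mask, y & lsp_mask
    lsp_sum = x_lsp + y_lsp
    est_carry = estimate_carry(x_lsp, y_lsp, lsp_bits)
    msp_spec = (x >> lsp_bits) + (y >> lsp_bits) + est_carry

    # Error detection: compare the estimate with the real LSP carry-out.
    true_carry = lsp_sum >> lsp_bits
    mispredicted = est_carry != true_carry

    # Correcting stage: adjust the MSP only if the estimate was wrong.
    msp = msp_spec + (true_carry - est_carry)
    product = (msp << lsp_bits) | (lsp_sum & lsp_mask)
    return product, mispredicted


def stall_counts(ops):
    """ops: list of (mispredicted, depends_on_previous) flags per multiply.

    Returns (speculating_stalls, conventional_stalls), assuming the
    conventional 2-stage pipeline stalls once on every data dependence.
    """
    speculating = sum(1 for wrong, dep in ops if wrong and dep)
    conventional = sum(1 for _, dep in ops if dep)
    return speculating, conventional
```

In this model the product is always exact after the correcting stage; only the `mispredicted` flag varies with the operands and with the chosen estimator. Feeding those flags, together with per-operation dependence flags, to `stall_counts` shows why the speculating design can never stall more often than the conventional one under the stated assumption.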
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079911655
http://hdl.handle.net/11536/49181
Appears in Collections: Thesis