Title: Instruction Cache Prefetching with Extended BTB
Authors: Chi, Shyh-An
Chung, Chung-Ping
Institute of Computer Science and Engineering
Keywords: instruction cache prefetching; prediction table based prefetching
Issue Date: 1996
Abstract: As the speed gap between processor and memory grows, the penalty caused by instruction cache misses gets higher. Furthermore, modern microprocessors increase the instruction issue rate by employing techniques such as superscalar processing and superpipelining, making the performance degradation caused by instruction cache misses even more severe. Modern VLSI technology makes it possible to allocate more die area to the on-chip instruction cache. However, if the processor runs at a clock rate of 100 MHz or higher, the on-chip instruction cache is limited to 4 to 16 KB in size and has low associativity, making the cache miss rate a serious problem. In this thesis, we first study several existing solutions for reducing instruction cache misses. We then propose a new approach, called the BIB (Branch Instruction Based) prefetching method, in which prefetching is directed by the prediction on branches. This method stores the prefetching information in an extended BTB (eBTB). When the eBTB recognizes a branch instruction, the prefetching information in the corresponding eBTB entry is used to prefetch; if the eBTB misses, the sequentially next line address is used to prefetch. To evaluate these instruction cache prefetching approaches, we establish a simple but realistic machine model for each. Six SPECint95 benchmarks are used to evaluate each approach in this study. The major performance metric in our study is MCPI (cycles per instruction contributed by memory accesses). The simulation results show that the BIB prefetching method outperforms sequential prefetching by 7% and PBN prefetching by 17% on average.
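The prefetch-address selection described above can be sketched as follows. This is an illustrative model only, not the thesis's actual simulator; the entry layout, table structure, and line size are assumptions for illustration.

```python
# Sketch of BIB prefetch-address selection with an eBTB-like table.
# All names and parameters here (EBTBEntry, LINE_SIZE, the dict-based
# table) are illustrative assumptions, not the thesis's implementation.

LINE_SIZE = 32  # assumed instruction cache line size in bytes


class EBTBEntry:
    """An extended-BTB entry: branch target plus prefetch information."""

    def __init__(self, predicted_target, prefetch_line):
        self.predicted_target = predicted_target
        self.prefetch_line = prefetch_line  # line address to prefetch on a hit


def prefetch_address(fetch_addr, ebtb):
    """Return the line address to prefetch for the current fetch address.

    If the eBTB recognizes a branch at this address (a hit), the prefetch
    information stored in the corresponding entry is used; on an eBTB miss,
    the sequentially next line address is prefetched instead.
    """
    entry = ebtb.get(fetch_addr)
    if entry is not None:
        # eBTB hit: prefetch along the predicted branch path
        return entry.prefetch_line
    # eBTB miss: prefetch the sequentially next cache line
    return (fetch_addr // LINE_SIZE + 1) * LINE_SIZE
```

For example, with an entry recorded for a branch at `0x1000` whose predicted path starts at `0x2000`, `prefetch_address(0x1000, ebtb)` yields `0x2000`, while a fetch at an address absent from the table falls back to the next sequential line.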
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT850392039
http://hdl.handle.net/11536/61789
Appears in Collections: Thesis