Title: Adaptive Ray Tracing Acceleration on Heterogeneous Computing Architecture Based on Branch Divergence Profiles
Authors: Tseng, Tai-Hua (曾太華)
Hsu, Wei-Chung (徐慰中)
Institute of Computer Science and Engineering
Keywords: heterogeneous computing; ray tracing; branch divergence
Issue Date: 2013
Abstract: Heterogeneous computing achieves high performance and high power efficiency by running each part of an application on the device that best fits its parallelism and type of computation. For example, massive, regular SIMD operations can be computed more efficiently on a GPU. However, the performance of a heterogeneous program can degrade when the portion assigned to the GPU encounters high branch divergence. Divergence occurs when threads running in lockstep take different control-flow paths: on a SIMD processor such as a GPU, the threads on each path are masked off in turn while the others execute, which causes severe performance loss. Ray tracing is highly parallel but can also exhibit substantial branch divergence. This study introduces a workload-partitioning method that reduces branch divergence in ray tracing and improves its performance. The ray-tracing workload is split into a convergent part and a divergent part. The convergent part is deployed on the GPU, where it runs efficiently, and the divergent part is sent to the CPU, which handles divergent control flow better than the GPU. Exploiting the characteristics of each device in this way makes the heterogeneous architecture adaptive to the workload. In this study, we observed a 20% performance gain in ray tracing with the proposed method.
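The partitioning idea in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the per-ray branch labels, the group size, the `max_paths` threshold, and all function names are assumptions made for illustration. Rays are grouped into lockstep-sized groups, a group counts as convergent when all of its rays follow the same control-flow path, and the efficiency model assumes that every distinct path costs the same and is executed serially with the other lanes masked.

```python
WARP_SIZE = 4  # assumption: kept small for readability; real SIMD groups are wider


def divergence(paths):
    """Return the number of distinct control-flow paths in one lockstep group."""
    return len(set(paths))


def simd_efficiency(paths):
    """Fraction of useful lane work under a simplified masking model.

    Assumes each distinct path takes equal time and the paths run one after
    another, so a group with k distinct paths needs k passes.
    """
    return 1.0 / divergence(paths)


def partition(ray_paths, group_size=WARP_SIZE, max_paths=1):
    """Split ray groups into a convergent (GPU) list and a divergent (CPU) list.

    ray_paths holds one branch label per ray, e.g. from a hypothetical
    profiling pass. Each returned entry is a (start, end) index range.
    """
    gpu, cpu = [], []
    for start in range(0, len(ray_paths), group_size):
        group = ray_paths[start:start + group_size]
        target = gpu if divergence(group) <= max_paths else cpu
        target.append((start, start + len(group)))
    return gpu, cpu


if __name__ == "__main__":
    # Hypothetical profile: which shading branch each of 12 rays takes.
    profile = [
        "miss", "miss", "miss", "miss",           # convergent group
        "diffuse", "glass", "miss", "mirror",     # divergent group
        "diffuse", "diffuse", "diffuse", "diffuse",
    ]
    gpu_groups, cpu_groups = partition(profile)
    print("GPU groups:", gpu_groups)
    print("CPU groups:", cpu_groups)
```

With the sample profile, the first and third groups are assigned to the GPU list and the second group to the CPU list. Raising `max_paths` would tolerate mildly divergent groups on the GPU; where that threshold should sit depends on the relative speed of the two devices, which this sketch does not model.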
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070056131
http://hdl.handle.net/11536/73351
Appears in Collections: Thesis