Accelerated regular grid traversals using extended anisotropic chessboard distance fields on a parallel stream processor

Es, Alphan
İşler, Veysi
Modern graphics processing units (GPUs) are implementations of parallel stream processors. In recent years, there have been a few studies on mapping ray tracing to the GPU. Since graphics processors are not designed to process complex data structures, it is crucial to explore data structures and algorithms for efficient stream processing. In particular, ray traversal is one of the major bottlenecks in ray tracing and direct volume rendering methods. In this work, we focus on efficient regular-grid-based ray traversal on the GPU. A new empty space skipping traversal method is introduced. Our method extends the anisotropic chessboard distance structure and employs a GPU-friendly traversal algorithm with minimal dynamic branching. Additionally, several previous techniques have been redesigned and adapted to the stream processing model. We experimentally show that our traversal method is considerably faster and better suited to parallel stream processing than the other grid-based techniques.
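As a rough illustration of what empty space skipping with a chessboard distance field means, the following Python sketch may help. It is not the method of the paper: it assumes a plain isotropic chessboard (Chebyshev) distance on a 2D grid, runs on the CPU, and does not attempt the extended anisotropic structure or the low-branching GPU formulation described above. All names (`chessboard_distance_field`, `traverse`, the example scene) are hypothetical.

```python
import math

def chessboard_distance_field(occupied):
    """Chessboard distance, in cells, from every cell to the nearest occupied cell."""
    h, w = len(occupied), len(occupied[0])
    far = h + w  # larger than any chessboard distance inside the grid
    dist = [[0 if occupied[y][x] else far for x in range(w)] for y in range(h)]

    def relax(y, x, offsets):
        for oy, ox in offsets:
            ny, nx = y + oy, x + ox
            if 0 <= ny < h and 0 <= nx < w:
                dist[y][x] = min(dist[y][x], dist[ny][nx] + 1)

    # Two raster passes over the 8-neighbourhood with unit weights.
    for y in range(h):
        for x in range(w):
            relax(y, x, ((-1, -1), (-1, 0), (-1, 1), (0, -1)))
    for y in reversed(range(h)):
        for x in reversed(range(w)):
            relax(y, x, ((1, 1), (1, 0), (1, -1), (0, 1)))
    return dist

def traverse(dist, origin, direction, skip=True):
    """Walk a ray through the grid; return (hit cell or None, loop iterations).

    Assumes the origin lies inside the grid and the direction is nonzero.
    With skip=False this degenerates to a cell-by-cell grid walk.
    """
    h, w = len(dist), len(dist[0])
    (ox, oy), (dx, dy) = origin, direction
    cx, cy = math.floor(ox), math.floor(oy)
    iterations = 0
    while 0 <= cx < w and 0 <= cy < h:
        iterations += 1
        d = dist[cy][cx]
        if d == 0:
            return (cx, cy), iterations
        # Every cell within chessboard distance d - 1 of (cx, cy) is empty,
        # so the ray may leave the whole (2r + 1) x (2r + 1) block at once.
        r = d - 1 if skip else 0
        tx = ((cx + r + 1 if dx > 0 else cx - r) - ox) / dx if dx != 0 else math.inf
        ty = ((cy + r + 1 if dy > 0 else cy - r) - oy) / dy if dy != 0 else math.inf
        if tx <= ty:  # the ray leaves the block through an x face (ties step in x)
            ny = math.floor(oy + tx * dy)
            cy = min(max(ny, cy - r), cy + r)  # clamp guards against rounding
            cx += (r + 1) if dx > 0 else -(r + 1)
        else:         # the ray leaves the block through a y face
            nx = math.floor(ox + ty * dx)
            cx = min(max(nx, cx - r), cx + r)
            cy += (r + 1) if dy > 0 else -(r + 1)
    return None, iterations

# Hypothetical scene: a 16 x 16 grid with a solid wall in column x = 12.
grid = [[x == 12 for x in range(16)] for y in range(16)]
field = chessboard_distance_field(grid)
hit_skip, n_skip = traverse(field, (0.5, 0.5), (1.0, 0.37), skip=True)
hit_plain, n_plain = traverse(field, (0.5, 0.5), (1.0, 0.37), skip=False)
print(hit_skip, n_skip, hit_plain, n_plain)
```

Both calls should report the same hit cell, with the skipping variant needing fewer loop iterations because it crosses the empty region in large jumps. A single isotropic distance per cell ignores the ray direction; direction-dependent (anisotropic) distances, as named in the abstract, are one way to allow longer jumps.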


Memory Coalescing Implementation of Metropolis Resampling on Graphics Processing Unit
Dülger, Özcan; Oğuztüzün, Mehmet Halit S.; Demirekler, Mübeccel (Springer Science and Business Media LLC, 2018-03-01)
Owing to the many cores in its architecture, the graphics processing unit (GPU) offers promise for parallel execution of the particle filter. A stage of the particle filter that is particularly challenging to parallelize is resampling. There are parallel resampling algorithms in the literature such as Metropolis resampling, which does not require a collective operation such as a cumulative sum over weights and does not suffer from numerical instability. However, with a large number of particles, Metropolis resampling b...
Acceleration of direct volume rendering with programmable graphics hardware
Yalim Keles, Hacer; Es, Alphan; İşler, Veysi (Springer Science and Business Media LLC, 2007-01-01)
We propose a method to accelerate direct volume rendering using programmable graphics hardware (GPU). In the method, texture slices are grouped together to form a texture slab. Rendering non-empty slabs in front-to-back viewing order generates the resultant image. Considering each pixel of the image as a ray, slab silhouette maps (SSMs) are used to skip empty spaces along the ray direction on a per-pixel basis. Additionally, SSMs contain terminated ray information. The method relies on hardware z-occlusion cul...
Accelerated ray tracing using programmable graphics pipelines
Es, Ş. Alphan; İşler, Veysi; Department of Computer Engineering (2008)
Graphics hardware has evolved from simple feed-forward triangle rasterization devices to flexible, programmable, and powerful parallel processors. This evolution allows researchers to use graphics processing units (GPUs) for both general-purpose computations and advanced graphics rendering. Sophisticated GPUs hold great opportunities for the acceleration of computationally expensive photorealistic rendering methods. Rendering of photorealistic images in real time is a challenge. In this work, we inv...
Data parallelism for ray casting large scenes on a cpu-gpu cluster
Topcu, Tümer; İşler, Veysi; Department of Computer Engineering (2008)
In the last decade, the computational power, memory bandwidth and programmability of graphics processing units (GPUs) have rapidly evolved. Therefore, much research has been performed on using GPUs in advanced graphics rendering. Because of its high degree of parallelism, ray tracing has been one of the first algorithms studied on GPUs. However, the rendering of large scenes with ray tracing can easily exceed the GPU's memory capacity. The algorithm proposed in this work uses a data parallel approac...
Optimization of Advanced Encryption Standard on Graphics Processing Units
Tezcan, Cihangir (2021-01-01)
Graphics processing units (GPUs) are specially designed for parallel applications and perform parallel operations much faster than central processing units (CPUs). In this work, we focus on the performance of the Advanced Encryption Standard (AES) on GPUs. We present optimizations which remove bank conflicts in shared memory accesses and provide 878.6 Gbps throughput for AES-128 encryption on an RTX 2070 Super, which is equivalent to 4.1 Gbps per Watt. Our optimizations provide more than 2.56x speed-up agai...
Citation Formats
A. Es and V. İşler, “Accelerated regular grid traversals using extended anisotropic chessboard distance fields on a parallel stream processor,” Journal of Parallel and Distributed Computing, pp. 1201–1217, 2007.