Non-coalesced Access Patterns of Global Memory Load Transactions in Metropolis Resampling Implemented on Graphics Processing Unit

2018-01-01
Due to having many particles, the particle filter has high computational cost. Owing to many cores in its architecture, graphics processing unit (GPU) offers promising solutions. The resampling stage of the particle filter has long execution time because of interactions among the particles. As Metropolis resampling does not need collective operations on particles, it avoids the numerical instability problem and performs fast. However, reading the weights from the global memory becomes serial as the number of particles increases. This is because of non-coalesced global memory access patterns of the Metropolis resampling. We devised two variations of Metropolis, namely, Metropolis-C1 and Metropolis-C2, in our previous work to ameliorate this problem. In these techniques, we ensure that the threads in a warp access the same segments of the global memory. We gain up to 9.7x speed up with Metropolis-C1 and up to 5.5x speed up with Metropolis-C2. Moreover, the quality (root mean square error) results of Metropolis-C1 and Metropolis-C2 on a highly non-linear equation are similar to those of Metropolis. In this study, we investigate the impact of non-coalesced global memory access patterns on the speed of the Metropolis resampling by using the CUDA profiler. We show that the number of global memory load transactions reduces with Metropolis-C1 and Metropolis-C2 leading to shorter execution time of the resampling stage.
26th IEEE Signal Processing and Communications Applications Conference (SIU)

Suggestions

Implementation of the Sampling Importance Resampling Particle Filter Algorithm in Graphics Processing Unit
Dülger, Özcan; Oğuztüzün, Mehmet Halit S.; Demirekler, Mübeccel (2015-05-19)
When the particle filter has too many particles, the computational cost increases and the sequential algorithms become inefficient in terms of the execution time. Recent developments in the graphics processing unit technology offer promising solutions for the speedup of the particle filter. In this study, Sampling Importance Resampling (SIR) particle filter method is implemented on the graphics processing unit. The speedup results are compared with results of the sequential and parallel implementations of t...
Parallel resampling methods for particle filters on graphics processing unit
Dülger, Özcan; Oğuztüzün, Mehmet Halit S.; Department of Computer Engineering (2017)
This thesis addresses the implementation of the resampling stage of the particle filter on graphics processing unit (GPU). Some of the well-known sequential resampling methods are the Multinomial, Stratified and Systematic resampling. They have dependency in their loop structure which impedes their parallel implementation. Although such impediments were overcome on their GPU implementation, these algorithms suffer from numerical instability due to the accumulation of rounding errors when single precision is...
Entropy Calculation in Particle Filters
Orguner, Umut (2009-04-11)
This paper presents a differential entropy calculation method to be used for particle mixtures in particle filters. First it is shown that the exact differential entropy of particle mixtures is minus infinity and therefore useless in practice. The disadvantage of using discrete entropy formulation instead of differential entropy is also explained. Unlike the kernel-based methods in the literature, a Bayes rule based approximation is then proposed. The performance of the algorithm is illustrated on a basic G...
Non-linear filtering based on observations from Gaussian processes
Gustafsson, Fredrik; Saha, Saikat; Orguner, Umut (2011-03-12)
We consider a class of non-linear filtering problems, where the observation model is given by a Gaussian process rather than the common non-linear function of the state and measurement noise. The new observation model can be considered as a generalization of the standard one with correlated measurement noise in both time and space. We propose a particle filter based approach with a measurement update step that requires a memory of past observations which can be truncated using a moving window to obtain a fi...
Simple Estimation of the Surface Area of Irregular 3D Particles
Erdoğan, Sinan Turhan (2016-08-01)
Shape-related properties of irregular particles are of interest in many fields. The volume and dimensions of rocks, such as coarse and larger fine concrete aggregates, can be physically measured rather easily. However, the surface area is difficult to measure physically, if at all possible. A combination of computed tomography and spherical harmonic analysis can be used to calculate the surface areas of micrometer-sized to centimeter-sized particles. This paper compares the success of several approaches tha...
Citation Formats
Ö. Dülger and M. H. S. Oğuztüzün, “Non-coalesced Access Patterns of Global Memory Load Transactions in Metropolis Resampling Implemented on Graphics Processing Unit,” presented at the 26th IEEE Signal Processing and Communications Applications Conference (SIU), İzmir, Turkey, 2018, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/56214.