Non-coalesced Access Patterns of Global Memory Load Transactions in Metropolis Resampling Implemented on Graphics Processing Unit
Date
2018-01-01
Author
Dülger, Özcan
Oğuztüzün, Mehmet Halit S.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Because it uses many particles, the particle filter has a high computational cost. With its many-core architecture, the graphics processing unit (GPU) offers a promising way to reduce that cost. The resampling stage of the particle filter has a long execution time because of interactions among the particles. Metropolis resampling does not need collective operations over the particles, so it avoids the numerical instability problem and runs fast. However, reads of the weights from global memory become serialized as the number of particles increases, because the global memory access patterns of Metropolis resampling are non-coalesced. In our previous work, we devised two variations of Metropolis resampling, Metropolis-C1 and Metropolis-C2, to ameliorate this problem. In these techniques, we ensure that the threads in a warp access the same segments of global memory. We gain up to 9.7x speedup with Metropolis-C1 and up to 5.5x speedup with Metropolis-C2. Moreover, the quality (root mean square error) results of Metropolis-C1 and Metropolis-C2 on a highly non-linear equation are similar to those of Metropolis. In this study, we use the CUDA profiler to investigate the impact of non-coalesced global memory access patterns on the speed of Metropolis resampling. We show that Metropolis-C1 and Metropolis-C2 reduce the number of global memory load transactions, which leads to a shorter execution time for the resampling stage.
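The effect described in the abstract can be illustrated with a small CPU-side simulation. The sketch below is not the authors' Metropolis-C1 or Metropolis-C2, whose details are not given on this page. It only models one warp of a textbook Metropolis resampler and counts how many distinct memory segments the warp's weight reads touch per iteration, as a rough stand-in for global memory load transactions. The warp size of 32, the 128-byte segment holding 32 single-precision weights, and the names `segments_touched` and `metropolis_warp` are assumptions made for illustration.

```python
import random

WARP_SIZE = 32           # threads per warp (assumed, typical for NVIDIA GPUs)
FLOATS_PER_SEGMENT = 32  # assumed 128-byte segment / 4-byte float weight


def segments_touched(indices):
    """Number of distinct segments a warp reads for the given weight indices."""
    return len({j // FLOATS_PER_SEGMENT for j in indices})


def metropolis_warp(weights, warp_start, iterations, rng, same_segment):
    """Simulate one warp of Metropolis resampling and count segment reads.

    same_segment=False: each thread proposes an index uniformly over all
    particles (scattered, non-coalesced reads).
    same_segment=True: hypothetical restriction where all threads of the warp
    propose indices inside one randomly chosen segment per iteration.
    """
    n = len(weights)
    n_segments = n // FLOATS_PER_SEGMENT
    ancestors = list(range(warp_start, warp_start + WARP_SIZE))
    transactions = 0
    for _ in range(iterations):
        if same_segment:
            base = rng.randrange(n_segments) * FLOATS_PER_SEGMENT
            proposals = [base + rng.randrange(FLOATS_PER_SEGMENT)
                         for _ in range(WARP_SIZE)]
        else:
            proposals = [rng.randrange(n) for _ in range(WARP_SIZE)]
        transactions += segments_touched(proposals)
        for t, j in enumerate(proposals):
            k = ancestors[t]
            # Metropolis acceptance: accept j when u <= w[j] / w[k]
            if rng.random() * weights[k] <= weights[j]:
                ancestors[t] = j
    return ancestors, transactions


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [rng.random() + 0.1 for _ in range(4096)]  # positive weights
    _, scattered = metropolis_warp(weights, 0, 10, rng, same_segment=False)
    _, grouped = metropolis_warp(weights, 0, 10, rng, same_segment=True)
    print("segment reads, scattered proposals:", scattered)
    print("segment reads, same-segment proposals:", grouped)
```

With scattered proposals, a warp can touch up to 32 segments per iteration, whereas the restricted variant touches exactly one. This mirrors, in simplified form, why keeping a warp's accesses within the same segments reduces load transactions.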
Subject Keywords
Particle filter, Graphics processing unit, Metropolis resampling
URI
https://hdl.handle.net/11511/56214
DOI
https://doi.org/10.1109/siu.2018.8404330
Conference Name
26th IEEE Signal Processing and Communications Applications Conference (SIU)
Collections
Department of Computer Engineering, Conference / Seminar
Suggestions
Implementation of the Sampling Importance Resampling Particle Filter Algorithm in Graphics Processing Unit
Dülger, Özcan; Oğuztüzün, Mehmet Halit S.; Demirekler, Mübeccel (2015-05-19)
When the particle filter has too many particles, the computational cost increases and the sequential algorithms become inefficient in terms of the execution time. Recent developments in the graphics processing unit technology offer promising solutions for the speedup of the particle filter. In this study, Sampling Importance Resampling (SIR) particle filter method is implemented on the graphics processing unit. The speedup results are compared with results of the sequential and parallel implementations of t...
Parallel resampling methods for particle filters on graphics processing unit
Dülger, Özcan; Oğuztüzün, Mehmet Halit S.; Department of Computer Engineering (2017)
This thesis addresses the implementation of the resampling stage of the particle filter on graphics processing unit (GPU). Some of the well-known sequential resampling methods are the Multinomial, Stratified and Systematic resampling. They have dependency in their loop structure which impedes their parallel implementation. Although such impediments were overcome on their GPU implementation, these algorithms suffer from numerical instability due to the accumulation of rounding errors when single precision is...
Entropy Calculation in Particle Filters
Orguner, Umut (2009-04-11)
This paper presents a differential entropy calculation method to be used for particle mixtures in particle filters. First it is shown that the exact differential entropy of particle mixtures is minus infinity and therefore useless in practice. The disadvantage of using discrete entropy formulation instead of differential entropy is also explained. Unlike the kernel-based methods in the literature, a Bayes rule based approximation is then proposed. The performance of the algorithm is illustrated on a basic G...
Non-linear filtering based on observations from Gaussian processes
Gustafsson, Fredrik; Saha, Saikat; Orguner, Umut (2011-03-12)
We consider a class of non-linear filtering problems, where the observation model is given by a Gaussian process rather than the common non-linear function of the state and measurement noise. The new observation model can be considered as a generalization of the standard one with correlated measurement noise in both time and space. We propose a particle filter based approach with a measurement update step that requires a memory of past observations which can be truncated using a moving window to obtain a fi...
Simple Estimation of the Surface Area of Irregular 3D Particles
Erdoğan, Sinan Turhan (2016-08-01)
Shape-related properties of irregular particles are of interest in many fields. The volume and dimensions of rocks, such as coarse and larger fine concrete aggregates, can be physically measured rather easily. However, the surface area is difficult to measure physically, if at all possible. A combination of computed tomography and spherical harmonic analysis can be used to calculate the surface areas of micrometer-sized to centimeter-sized particles. This paper compares the success of several approaches tha...
Citation Formats
IEEE
Ö. Dülger and M. H. S. Oğuztüzün, “Non-coalesced Access Patterns of Global Memory Load Transactions in Metropolis Resampling Implemented on Graphics Processing Unit,” presented at the 26th IEEE Signal Processing and Communications Applications Conference (SIU), İzmir, Turkey, 2018. [Online]. Available: https://hdl.handle.net/11511/56214.