Efficient utilization of streaming multiprocessors for the implementation of particle filter on graphics processing unit
Date
2026-02-01
Author
Dülger, Özcan
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The particle filter is a sequential Monte Carlo estimation method used in tracking applications in which the system or measurement model is highly nonlinear. Estimation quality improves as the number of particles increases, but so does the computational cost. Graphics processing units (GPUs) offer a promising solution for the particle filter by providing many cores. To implement the particle filter on the GPU, we use CUDA as the parallel computing platform. The GPU architecture must be carefully considered when determining the parameters of CUDA kernels; in particular, configuring the block size appropriately is essential for the efficient utilization of streaming multiprocessors (SMXs). In this study, we investigate the impact of block size on SMX efficiency, particularly on GPUs where the number of SMXs is not a power of two. We propose three scenarios based on different block size configurations and discuss their characteristics and resulting speedups in detail. We conduct experiments on two GPU boards, the NVIDIA Tesla K20 and the NVIDIA Tesla K40. In addition, we demonstrate a multi-GPU approach for the particle filter using these boards and discuss the associated challenges and resulting speedups.
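To make the block-size concern concrete, the following is a minimal back-of-the-envelope sketch, not the method or the three scenarios from the article. It assumes one thread per particle, blocks of equal duration, and a simplified "wave" model in which each SMX runs a fixed number of blocks at a time. The function name `smx_utilization` and the parameter `blocks_per_smx` are hypothetical, and the SMX counts (13 for the Tesla K20, 15 for the Tesla K40) are taken from NVIDIA's published specifications rather than from the abstract.

```python
import math

# SMX counts from NVIDIA's published specifications (not stated in the abstract).
SMX_COUNT = {"Tesla K20": 13, "Tesla K40": 15}


def smx_utilization(num_particles, block_size, num_smx, blocks_per_smx=1):
    """Return (num_blocks, waves, utilization) under a simplified wave model.

    Assumptions (illustrative only): one thread per particle, every block
    takes the same time, and each SMX runs `blocks_per_smx` blocks at once.
    """
    # Number of thread blocks needed to cover all particles.
    num_blocks = math.ceil(num_particles / block_size)
    # Blocks that the whole GPU can run concurrently in one wave.
    slots = num_smx * blocks_per_smx
    # Number of waves needed to run every block.
    waves = math.ceil(num_blocks / slots)
    # Fraction of block slots doing useful work across all waves.
    utilization = num_blocks / (waves * slots)
    return num_blocks, waves, utilization


if __name__ == "__main__":
    num_particles = 8192  # arbitrary example size
    for gpu, num_smx in SMX_COUNT.items():
        for block_size in (128, 256, 512, 1024):
            blocks, waves, util = smx_utilization(num_particles, block_size, num_smx)
            print(f"{gpu}: block_size={block_size:4d} blocks={blocks:3d} "
                  f"waves={waves} utilization={util:.2f}")
```

Under this model, when the particle count and the block size are both powers of two, the block count is also a power of two and can never be a multiple of 13 or 15, so with one block per SMX the last wave always leaves some SMXs idle. Real scheduling also depends on registers, shared memory, and resident-block limits, which this sketch ignores.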
URI
https://hdl.handle.net/11511/118640
Journal
JOURNAL OF COMPUTATIONAL SCIENCE
DOI
https://doi.org/10.1016/j.jocs.2025.102778
Collections
Department of Computer Engineering, Article
Citation (IEEE)
Ö. Dülger, “Efficient utilization of streaming multiprocessors for the implementation of particle filter on graphics processing unit,” JOURNAL OF COMPUTATIONAL SCIENCE, vol. 94, 2026. [Online]. Available: https://hdl.handle.net/11511/118640.