A High throughput FPGA implementation of markov chain monte carlo method for mixture models

Date

2019

Author

Bozgan, Caner

Markov Chain Monte Carlo (MCMC) is a class of algorithms that generate samples from high-dimensional and multimodal probability distributions. In many statistical and control applications, MCMC algorithms are widely employed thanks to their ability to draw samples from arbitrary distributions regardless of dimension or complexity. However, as the complexity of the Bayesian models and the computational load of the MCMC algorithm increase, performing MCMC inference becomes impractical or too time-consuming for real applications with large-scale data sets. Motivated by this problem, this thesis proposes a low-latency, scalable, and high-throughput hardware architecture for the Parallel Tempering method, an MCMC algorithm for sampling from multimodal distributions. The work demonstrates that implementing the Parallel Tempering method on a Field Programmable Gate Array (FPGA) provides significant speedups over the corresponding CPU and GPU implementations when performing Bayesian inference for a mixture model. The proposed work also adapts the architecture to big-data MCMC problems by eliminating the external-memory-related performance losses that arise in MCMC hardware implementations.
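
For readers unfamiliar with Parallel Tempering, the following is a minimal software sketch of the general technique, not the hardware architecture proposed in the thesis. It assumes a hypothetical one-dimensional bimodal target (an equal-weight mixture of two unit-variance Gaussians), a hand-picked inverse-temperature ladder `betas`, and random-walk Metropolis updates; the names `log_target` and `parallel_tempering` are illustrative and do not come from the thesis.

```python
import math
import random


def log_target(x):
    # Hypothetical bimodal target (an assumption for illustration):
    # equal-weight mixture of N(-4, 1) and N(4, 1), up to an additive constant.
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    # Log-sum-exp keeps the computation numerically stable.
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def parallel_tempering(n_iters, betas, step=1.0, seed=0):
    """Return samples from the chain at betas[0] (expected to be 1.0)."""
    rng = random.Random(seed)
    states = [0.0] * len(betas)
    samples = []
    for _ in range(n_iters):
        # Local move: each chain k targets the tempered density pi(x)^beta_k
        # with a symmetric random-walk proposal (wider for hotter chains).
        for k, beta in enumerate(betas):
            proposal = states[k] + rng.gauss(0.0, step / math.sqrt(beta))
            log_alpha = beta * (log_target(proposal) - log_target(states[k]))
            # 1 - random() lies in (0, 1], so the logarithm is always defined.
            if math.log(1.0 - rng.random()) < log_alpha:
                states[k] = proposal
        # Exchange move: propose swapping the states of one adjacent pair.
        k = rng.randrange(len(betas) - 1)
        log_alpha = (betas[k] - betas[k + 1]) * (
            log_target(states[k + 1]) - log_target(states[k])
        )
        if math.log(1.0 - rng.random()) < log_alpha:
            states[k], states[k + 1] = states[k + 1], states[k]
        samples.append(states[0])
    return samples


if __name__ == "__main__":
    draws = parallel_tempering(20000, [1.0, 0.6, 0.35, 0.2, 0.1], seed=1)
    right = sum(1 for x in draws if x > 0) / len(draws)
    print(f"fraction of samples in the right-hand mode: {right:.2f}")
```

The hotter chains (smaller `beta`) see a flattened version of the target and cross between modes easily; the exchange move lets those crossings propagate down to the `beta = 1` chain, which a single random-walk chain would rarely manage on its own. The per-chain local moves are independent of one another, which is the kind of parallelism a hardware implementation can exploit.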

Subject Keywords

Markov processes, Monte Carlo method, Field programmable gate arrays

URI

http://etd.lib.metu.edu.tr/upload/12623075/index.pdf
https://hdl.handle.net/11511/27990

Collections

Graduate School of Natural and Applied Sciences, Thesis

Suggestions


A Modified Parallel Learning Vector Quantization Algorithm for Real-Time Hardware Applications Alkim, Erdem; AKLEYLEK, SEDAT; KILIÇ, ERDAL (2017-10-01) In this study a modified learning vector quantization (LVQ) algorithm is proposed. For this purpose, relevance LVQ (RLVQ) algorithm is efficiently combined with a reinforcement mechanism. In this mechanism, it is shown that the proposed algorithm is not affected constantly by both relevance-irrelevance input dimensions and the winning of the same neuron. Hardware design of the proposed scheme is also given to illustrate the performance of the algorithm. The proposed algorithm is compared to the corresponding...
A parallel ant colony optimization algorithm based on crossover operation Kalınlı, Adem; Sarıkoç, Fatih (Springer, 2018-11-01) In this work, we introduce a new parallel ant colony optimization algorithm based on an ant metaphor and the crossover operator from genetic algorithms. The performance of the proposed model is evaluated using well-known numerical test problems and then it is applied to train recurrent neural networks to identify linear and nonlinear dynamic plants. The simulation results are compared with results using other algorithms.
A True random generator in FPGA for cryptographic applications Yıldırım, Salih; Bazlamaçcı, Cüneyt Fehmi; Department of Electrical and Electronics Engineering (2012) In this thesis a True Random Number Generator (TRNG) employed for cryptographic applications is investigated, implemented and evaluated. The design of TRNG and its embedded tests are described in VHDL language and then implemented on an FPGA platform. Randomness is extracted from the jitter of ring oscillators that has self-failure detecting and sampling logic. The implementation needs only primitive resources which are common in all kinds of FPGAs. The embedded randomness tests described in Federal Informa...
A Meta-Heuristic Paradigm for solving the Forward Kinematics of 6-6 General Parallel Manipulator Chandra, Rohitash; Frean, Marcus; Rolland, Luc (2009-12-18) The forward kinematics of the general Gough platform, namely the 6-6 parallel manipulator is solved using hybrid meta-heuristic techniques in which the simulated annealing algorithm replaces the mutation operator in a genetic algorithm. The results are compared with the standard simulated annealing and genetic algorithm. It shows that the standard simulated annealing algorithm outperforms standard genetic algorithm in terms of computation time and overall accuracy of the solution on this problem. However, t...
A parallel sparse algorithm targeting arterial fluid mechanics computations Manguoğlu, Murat; Sameh, Ahmed H.; Tezduyar, Tayfun E. (2011-09-01) Iterative solution of large sparse nonsymmetric linear equation systems is one of the numerical challenges in arterial fluid-structure interaction computations. This is because the fluid mechanics parts of the fluid + structure block of the equation system that needs to be solved at every nonlinear iteration of each time step corresponds to incompressible flow, the computational domains include slender parts, and accurate wall shear stress calculations require boundary layer mesh refinement near the arteria...

Citation Formats

C. Bozgan, “A High throughput FPGA implementation of markov chain monte carlo method for mixture models,” M.S. - Master of Science, Middle East Technical University, 2019.