Performance benchmarking of a discontinuous Galerkin-based compressible flow solver on GPU computing platforms using cnsBench
Date
2022-11-22
Author
Karakuş, Ali
Unnikrishnan, Umesh
Rowe, Kris
Patel, Saumil S
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Heterogeneous computing architectures have become an integral feature of modern supercomputers. In this work, we present performance benchmarking results for a discontinuous Galerkin spectral-element solver for the compressible Navier-Stokes equations on different GPU-based computing platforms. The solver uses OCCA, an open-source library that provides a portability layer for offloading targeted kernels to different architectures and vendor platforms, which makes the application itself portable. The solver is profiled, and the most computationally expensive kernels are identified. A mini-app called cnsBench is developed from the full solver to benchmark and investigate the performance characteristics of the core kernels. The kernel performance metrics will be presented and compared across GPU architectures from NVIDIA, Intel, and AMD, and across programming models such as CUDA, OpenCL, and SYCL. The kernel algorithms and memory access patterns are analyzed to provide insight into computational bottlenecks and into approaches for further optimizing these kernels. These efforts will guide future development of compressible flow applications that can leverage the full potential of next-generation exascale supercomputers and beyond.
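The abstract does not say which kernel metrics cnsBench reports. As a rough illustration only, the sketch below computes three quantities commonly used when comparing GPU kernels: achieved memory bandwidth, achieved floating-point throughput, and a simple roofline bound. The `KernelRun` type, the function names, and any numbers passed to them are hypothetical and are not taken from cnsBench or OCCA.

```python
from dataclasses import dataclass


@dataclass
class KernelRun:
    """One timed kernel launch (hypothetical record, not a cnsBench type)."""
    name: str
    bytes_moved: float  # total bytes read and written by the kernel
    flops: float        # total floating-point operations performed
    seconds: float      # measured wall-clock time of the launch


def achieved_bandwidth_gbs(run: KernelRun) -> float:
    """Achieved memory bandwidth in GB/s."""
    return run.bytes_moved / run.seconds / 1e9


def achieved_gflops(run: KernelRun) -> float:
    """Achieved floating-point throughput in GFLOP/s."""
    return run.flops / run.seconds / 1e9


def arithmetic_intensity(run: KernelRun) -> float:
    """Floating-point operations per byte of memory traffic."""
    return run.flops / run.bytes_moved


def roofline_gflops(run: KernelRun, peak_gflops: float, peak_bw_gbs: float) -> float:
    """Upper bound on throughput under the simple roofline model.

    The kernel is limited either by the device's peak compute rate or by
    its peak memory bandwidth multiplied by the arithmetic intensity.
    """
    return min(peak_gflops, arithmetic_intensity(run) * peak_bw_gbs)
```

Comparing `achieved_gflops` with `roofline_gflops` for the same kernel on each device gives one device-independent way to judge how close a kernel is to the hardware limit, and whether that limit is set by memory traffic or by arithmetic.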
URI
https://hdl.handle.net/11511/100995
Conference Name
Bulletin of the American Physical Society
Collections
Department of Mechanical Engineering, Conference / Seminar
Citation (IEEE)
A. Karakuş, U. Unnikrishnan, K. Rowe, and S. S. Patel, “Performance benchmarking of a discontinuous Galerkin-based compressible flow solver on GPU computing platforms using cnsBench,” presented at the Bulletin of the American Physical Society, Indiana, United States, 2022. [Online]. Available: https://hdl.handle.net/11511/100995.