Open problems in CEM: Porting an explicit time-domain volume-integral-equation solver on GPUs with OpenACC

2014-01-01
Ergül, Özgür Salih
Al-Jarro, Ahmed
Clo, Alain
Bagci, Hakan
Graphics processing units (GPUs) are gradually becoming mainstream in high-performance computing, as their ability to speed up a wide range of scientific applications many-fold relative to multi-core CPUs has been clearly demonstrated. This paper describes the implementation and performance tuning involved in porting an explicit marching-on-in-time (MOT)-based time-domain volume-integral-equation (TDVIE) solver to GPUs. A high-level approach based on the OpenACC directive-based parallel programming model is used to address two common challenges in GPU programming: developer productivity and code portability. The MOT-TDVIE solver code, originally developed for CPUs, is annotated with compiler directives to port it to GPUs, much as OpenMP directives target multi-core CPUs. In contrast to CUDA and OpenCL, which require significant modifications to CPU-based codes, this high-level approach requires minimal changes. Two available OpenACC compilers, CAPS and PGI, are used in this work. Our experience shows that each compiler requires different annotations of the code, because the compiler developers interpret the fairly new standard differently. Both versions of the OpenACC-accelerated code achieved significant performance improvements, with up to 30× speedup over the sequential CPU code on recent hardware. Moreover, the GPU-accelerated fully explicit MOT-TDVIE solver achieved energy-consumption gains on the order of 3× over its CPU counterpart.
IEEE Antennas and Propagation Magazine
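
The directive-based porting style described in the abstract can be illustrated with a minimal sketch. This is not the authors' solver code: the function `update_step` and the arrays `weights`, `past`, and `now` are hypothetical stand-ins for one explicit update in which every observation point accumulates contributions from all source points. The only point it is meant to show is that the loop body stays unchanged and the GPU offload is expressed through OpenACC pragmas. As the abstract notes, different compilers may need different annotations, so the clauses below are an assumption rather than a recipe.

```c
/* Hypothetical stand-in for one explicit update step (not the MOT-TDVIE
   solver itself): now[i] = sum_j weights[i*n + j] * past[j]. */
void update_step(int n, const double *restrict weights,
                 const double *restrict past, double *restrict now)
{
    /* With an OpenACC compiler, the outer loop is offloaded to the GPU and
       the data clauses describe the host-to-device and device-to-host copies.
       A compiler without OpenACC support ignores the pragmas and runs the
       same loops serially on the CPU. */
    #pragma acc parallel loop copyin(weights[0:n*n], past[0:n]) copyout(now[0:n])
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;

        /* Inner accumulation expressed as a reduction over j. */
        #pragma acc loop reduction(+:sum)
        for (int j = 0; j < n; ++j) {
            sum += weights[i * n + j] * past[j];
        }
        now[i] = sum;
    }
}
```

Because the pragmas are ignored by compilers without OpenACC support, the same source file can serve as both the sequential CPU reference and the GPU version, which is the portability argument the abstract makes.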

Citation Formats
Ö. S. Ergül, A. Al-Jarro, A. Clo, and H. Bagci, “Open problems in CEM: Porting an explicit time-domain volume-integral-equation solver on GPUs with OpenACC,” IEEE Antennas and Propagation Magazine, pp. 265–277, 2014, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/37685.