Optimization of advanced encyription standard (AES) on CUDA

2019
Çelik, Burak
This thesis presents several optimization techniques for AES implementations on CUDA. Six different CUDA kernels, each with a different software design, are implemented for AES-128 exhaustive search and compared with each other using Nsight experiment results. These results are used to identify the best CUDA implementation, from which AES-128, AES-192 and AES-256 versions are created for exhaustive search, on-the-fly CTR and file encryption. They are compared with CPU implementations to determine whether the GPU or the CPU is faster for each of these tasks. For this comparison, two types of CPU implementation are created: one using AES-NI, Intel's AES instruction set extension, and one in plain C++. Versions of these implementations running on 1, 2, 4 and 8 threads are compared with CUDA. According to the results, CUDA is 21, 19 and 18 times faster than the best CPU implementations for exhaustive search with AES-128, AES-192 and AES-256, respectively. For the CTR implementations the speedup is 4 times, and CUDA can encrypt 37.52 GB of data per second. For file encryption, CUDA is 22, 19 and 17 times faster than the best CPU implementations and can encrypt 31.24 GB of data per second, not counting I/O operations.
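The workloads named in the abstract parallelize well because each unit of work is independent: in CTR mode every keystream block depends only on the key and its own counter value, and in exhaustive search every candidate key can be tested in isolation. The sketch below illustrates this in plain Python; it is not the thesis's CUDA code and makes no attempt at speed. All function names (`keystream_block`, `ctr_xor`, `search_chunk`) and the 16-bit unknown key tail are assumptions chosen for illustration.

```python
# Illustrative sketch only: plain-Python AES-128, not the thesis's CUDA kernels.

def xtime(a):
    """Multiply by 2 in GF(2^8) modulo the AES polynomial 0x11B."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a

def gmul(a, b):
    """Multiply two bytes in GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a)
        b >>= 1
    return r

def build_sbox():
    """Derive the S-box: field inverse followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for i in range(1, 5):
            s ^= ((inv << i) | (inv >> (8 - i))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()

def expand_key(key):
    """AES-128 key schedule: returns 11 round keys of 16 bytes each."""
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        t = list(w[i - 1])
        if i % 4 == 0:
            t = [SBOX[b] for b in t[1:] + t[:1]]  # RotWord, then SubWord
            t[0] ^= rcon
            rcon = xtime(rcon)
        w.append([a ^ b for a, b in zip(w[i - 4], t)])
    return [sum(w[4 * r:4 * r + 4], []) for r in range(11)]

def mix_columns(s):
    """MixColumns on a flat state stored column by column (index 4*col + row)."""
    out = []
    for c in range(4):
        a = s[4 * c:4 * c + 4]
        for r in range(4):
            out.append(gmul(a[r], 2) ^ gmul(a[(r + 1) % 4], 3)
                       ^ a[(r + 2) % 4] ^ a[(r + 3) % 4])
    return out

def encrypt_block(block, rks):
    """Encrypt one 16-byte block with pre-expanded AES-128 round keys."""
    s = [b ^ k for b, k in zip(block, rks[0])]
    for rnd in range(1, 11):
        s = [SBOX[b] for b in s]                                            # SubBytes
        s = [s[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]  # ShiftRows
        if rnd != 10:
            s = mix_columns(s)
        s = [b ^ k for b, k in zip(s, rks[rnd])]                            # AddRoundKey
    return bytes(s)

def keystream_block(rks, iv, i):
    """CTR keystream block i: depends only on the key and counter iv + i."""
    ctr = (int.from_bytes(iv, "big") + i) % (1 << 128)
    return encrypt_block(ctr.to_bytes(16, "big"), rks)

def ctr_xor(key, iv, data):
    """CTR encryption/decryption; each loop iteration could be one GPU thread."""
    rks = expand_key(key)
    out = bytearray()
    for off in range(0, len(data), 16):
        ks = keystream_block(rks, iv, off // 16)
        out += bytes(a ^ b for a, b in zip(data[off:off + 16], ks))
    return bytes(out)

def search_chunk(prefix, plaintext, target, start, stop):
    """Test keys prefix + 2-byte tail for tails in [start, stop): one worker's share."""
    for tail in range(start, stop):
        key = prefix + tail.to_bytes(2, "big")
        if encrypt_block(plaintext, expand_key(key)) == target:
            return key
    return None
```

Calling `ctr_xor(key, iv, ciphertext)` with the same key and initial counter recovers the plaintext, since CTR decryption is the same XOR. In `search_chunk`, disjoint `[start, stop)` ranges can be handed to separate workers with no communication between them, which is the property a GPU exhaustive search relies on.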

Citation Formats
B. Çelik, “Optimization of advanced encyription standard (AES) on CUDA,” Thesis (M.S.) -- Graduate School of Informatics. Cyber Security., Middle East Technical University, 2019.