Optimization of XNOR Convolution for Binary Convolutional Neural Networks on GPU

Date

2020-10-09

Author

Kaya, Mete Can
İnci, Alperen
Temizel, Alptekin

Binary convolutional networks have a lower computational load and a smaller memory footprint than their full-precision counterparts, which makes them a feasible alternative for deploying computer vision applications on limited-capacity embedded devices. Once trained in less resource-constrained computational environments, they can be deployed for real-time inference on such devices. In this study, we propose an implementation of binary convolutional network inference on GPU, focusing on the optimization of XNOR convolution. Experimental results show that using a GPU can provide a speed-up of up to 42.61× with a kernel size of 3×3. The implementation is publicly available at https://github.com/metcan/Binary-Convolutional-Neural-Network-Inference-on-GPU.
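
For readers unfamiliar with the operation, XNOR convolution is commonly understood as replacing the multiply-accumulate over +1/-1 values with a bitwise XNOR followed by a population count. The following is a minimal Python sketch of that general idea for a single 3×3 patch. It is not the authors' GPU implementation; the helper names `pack_signs` and `xnor_dot` and the bit encoding (bit 1 for +1, bit 0 for -1) are assumptions made for illustration.

```python
def pack_signs(values):
    """Pack +1/-1 values into an int; bit i is 1 when values[i] is +1 (assumed encoding)."""
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors, computed from their packed bits."""
    mask = (1 << n) - 1                  # keep only the n valid bit positions
    agree = ~(a_bits ^ b_bits) & mask    # XNOR: 1 where the two signs match
    matches = bin(agree).count("1")      # population count of matching positions
    # Each match contributes +1 and each mismatch -1: matches - (n - matches).
    return 2 * matches - n


# Hypothetical flattened 3x3 binarized input patch and kernel.
patch = [1, -1, 1, 1, -1, -1, 1, -1, 1]
kernel = [1, 1, -1, 1, -1, 1, 1, -1, -1]

result = xnor_dot(pack_signs(patch), pack_signs(kernel), len(patch))
reference = sum(p * k for p, k in zip(patch, kernel))  # ordinary multiply-accumulate
print(result, reference)
```

Both values agree, which is the property a bit-packed kernel relies on: one XNOR and one popcount stand in for nine multiplications and additions. On a GPU the same pattern would typically operate on 32- or 64-bit words using a hardware popcount instruction.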

Subject Keywords

Computer Vision and Pattern Recognition (cs.CV), Distributed, Parallel, and Cluster Computing (cs.DC)

URI

https://hdl.handle.net/11511/73720
https://arxiv.org/pdf/2007.14178.pdf

Conference Name

6. Ulusal Yüksek Başarımlı Hesaplama Konferansı (BAŞARIM 2020)

Collections

Graduate School of Informatics, Conference / Seminar

Citation Formats

M. C. Kaya, A. İnci, and A. Temizel, “Optimization of XNOR Convolution for Binary Convolutional Neural Networks on GPU,” presented at the 6. Ulusal Yüksek Başarımlı Hesaplama Konferansı (BAŞARIM 2020), Ankara, Türkiye, 2020, Accessed: 00, 2021. [Online]. Available: https://hdl.handle.net/11511/73720.