Scalable high-performance architecture for convolutional ternary neural networks on FPGA

2017-09-06
Prost-Boucle, Adrien
Bourge, Alban
Petrot, Frederic
Alemdar, Hande
Caldwell, Nicholas
Leroy, Vincent
Thanks to their excellent performance on typical artificial intelligence problems, deep neural networks have drawn a lot of interest lately. However, this comes at the cost of large computational needs and high power consumption. Achieving high precision at acceptable hardware cost on these difficult problems is a challenge. To address it, we advocate the use of ternary neural networks (TNN) that, when properly trained, can reach results close to those of state-of-the-art networks that use floating-point arithmetic. We present a highly versatile, FPGA-friendly architecture for TNN in which both the number of bits of the input data and the level of parallelism can be varied at synthesis time, allowing throughput to be traded against hardware resources and power consumption. To demonstrate the efficiency of our proposal, we implement high-complexity convolutional neural networks on the Xilinx Virtex-7 VC709 FPGA board. While reaching better accuracy than comparable designs, we can target either high throughput or low power. We measure a throughput of up to 27,000 fps at approximately 7 W, or up to 8.36 TMAC/s at approximately 13 W.
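To illustrate why ternary weights are attractive for hardware, the following is a minimal Python sketch, not the architecture described in the abstract. It assumes weights restricted to -1, 0, and +1 and integer inputs; the function names `ternarize` and `ternary_dot` and the symmetric threshold rule are hypothetical choices made for this example. With such weights, each multiply-accumulate reduces to an addition, a subtraction, or a skip.

```python
def ternarize(weights, threshold):
    """Map real-valued weights to {-1, 0, +1}.

    The symmetric threshold rule is an assumption made for this sketch,
    not the training procedure used by the authors.
    """
    return [1 if w > threshold else -1 if w < -threshold else 0
            for w in weights]


def ternary_dot(inputs, ternary_weights):
    """Dot product with ternary weights, using no multiplications."""
    acc = 0
    for x, w in zip(inputs, ternary_weights):
        if w == 1:
            acc += x      # weight +1: add the input
        elif w == -1:
            acc -= x      # weight -1: subtract the input
        # weight 0: the input contributes nothing
    return acc


weights = ternarize([0.7, -0.05, -0.9, 0.2], threshold=0.1)
result = ternary_dot([3, 5, 2, 4], weights)
```

In this sketch, the width of the integer inputs and the number of such accumulations evaluated side by side stand in loosely for the two parameters the abstract says can be varied at synthesis time.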

Suggestions

Generation and modification of 3D models with deep neural networks
Öngün, Cihan; Temizel, Alptekin; Department of Information Systems (2021-9)
Artificial intelligence (AI) and particularly deep neural networks (DNN) have become very hot topics in recent years and they have been shown to be successful in problems such as detection, recognition and segmentation. More recently, DNNs have started to become popular in data generation problems with the invention of Generative Adversarial Networks (GAN). Using GANs, various types of data such as audio, image or 3D models can be generated. In this thesis, we aim to propose a system that creates artificial...
Boosting performance of HLS optimization for SoC based hardware accelerators.
Kocaay, Aziz Berkin; Bazlamaçcı, Cüneyt F.; Department of Electrical and Electronics Engineering (2020)
Modern large-scale computing algorithms require a huge amount of computational power. In adapting to increasing computation demands, FPGA-based SoC platforms provide an alternative to traditional CPU or GPU units, which suffer from thermal problems, power issues, etc. However, the design flow for FPGA-based development may be hard and time-consuming for an average software engineer who has limited knowledge about hardware design. A new approach in FPGA-based system development without the need for a hardware engi...
Geospatial object recognition using deep networks for satellite images
Barut, Onur; Alatan, Abdullah Aydın; Department of Electrical and Electronics Engineering (2018)
The deep learning paradigm has been drawing significant interest during the last decade due to recent developments in novel machine learning algorithms and improvements in computational hardware. Satellite image analysis is also an important scientific area with many objectives, such as disaster and crisis management, forest cover, road mapping, city planning, and even military purposes. Spatial correlations of land cover or geospatial objects between different images lead to wide utilization of convolutional...
Geospatial Object Detection Using Deep Networks
Barut, Onur; Alatan, Abdullah Aydın (2019-01-01)
In the last decade, deep learning has been drawing huge interest due to developments in computational hardware and novel machine learning techniques. This progress also significantly affects satellite image analysis for various objectives, such as disaster and crisis management, forest cover, road mapping, city planning and even military purposes. For all these applications, detection of geospatial objects has crucial importance and some recent object detection techniques are still unexplored to b...
Position estimation for timing belt drives of precision machinery using structured neural networks
Kılıç, Ergin; Doğruer, Can Ulaş; Dölen, Melik; Koku, Ahmet Buğra (2012-05-01)
This paper focuses on a viable position estimation scheme for timing-belt drives using artificial neural networks. In this study, the position of a carriage (load) is calculated via a structured neural network topology accepting input from a position sensor on the actuator side of the timing belt. The paper presents a detailed discussion on the source of transmission errors. The characteristics of the error in different operation regimes are exploited to construct different network topologies. That is, a re...
Citation Formats
A. Prost-Boucle, A. Bourge, F. Petrot, H. Alemdar, N. Caldwell, and V. Leroy, “Scalable high-performance architecture for convolutional ternary neural networks on FPGA,” 2017, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/41133.