Ternary Neural Networks for Resource-Efficient AI Applications

2017-05-19
Alemdar, Hande
Prost-Boucle, Adrien
Petrot, Frederic
The computation and storage requirements of Deep Neural Networks (DNNs) are usually high. This limits their deployability on ubiquitous computing devices such as smartphones, wearables, and autonomous drones. In this paper, we propose ternary neural networks (TNNs) to make deep learning more resource-efficient. We train these TNNs using a teacher-student approach based on a novel, layer-wise greedy methodology. Thanks to our two-stage training procedure, the teacher network is still able to use state-of-the-art methods such as dropout and batch normalization to increase accuracy and reduce training time. Using only ternary weights and activations, the student ternary network learns to mimic the behavior of its teacher network without using any multiplication. Unlike their {-1,1} binary counterparts, ternary neural networks inherently prune the smaller weights by setting them to zero during training. This makes them sparser and thus more energy-efficient. We design a purpose-built hardware architecture for TNNs and implement it on FPGA and ASIC. We evaluate TNNs on several benchmark datasets and demonstrate up to 3.1x better energy efficiency with respect to the state of the art while also improving accuracy.
International Joint Conference on Neural Networks (IJCNN)
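The two ideas highlighted in the abstract can be illustrated with a minimal sketch: ternarizing weights to {-1, 0, +1} prunes small values to zero, and a dot product with ternary weights reduces to additions and subtractions only, with no multiplications. The threshold value and function names below are hypothetical illustrations, not the paper's actual training procedure.

```python
import numpy as np

def ternarize(w, threshold=0.05):
    """Map real-valued weights to {-1, 0, +1}; weights whose magnitude
    falls below the (hypothetical) threshold are pruned to zero."""
    t = np.zeros_like(w, dtype=np.int8)
    t[w > threshold] = 1
    t[w < -threshold] = -1
    return t

def ternary_dot(x, w_t):
    """Multiplication-free dot product: with ternary weights, each term
    is the input added (+1), subtracted (-1), or skipped (0)."""
    return x[w_t == 1].sum() - x[w_t == -1].sum()

w = np.array([0.8, -0.02, -0.6, 0.01, 0.3])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
wt = ternarize(w)          # -> [1, 0, -1, 0, 1]; two weights pruned
print(ternary_dot(x, wt))  # 1.0 - 3.0 + 5.0 = 3.0
```

Note how the zero weights simply drop out of the sum, which is the source of the sparsity and energy savings the abstract describes.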

Suggestions

Improved Knowledge Distillation with Dynamic Network Pruning
Şener, Eren; Akbaş, Emre (2022-09-30)
Deploying convolutional neural networks to mobile or embedded devices is often prohibited by limited memory and computational resources. This is particularly problematic for the most successful networks, which tend to be very large and require long inference times. Many alternative approaches have been developed for compressing neural networks based on pruning, regularization, quantization or distillation. In this paper, we propose the “Knowledge Distillation with Dynamic Pruning” (KDDP), which trains a dyn...
A new approach to mathematical water quality modeling in reservoirs: Neural networks
Karul, C; Soyupak, S; Germen, E (1998-01-01)
Neural Networks are becoming more and more valuable tools for system modeling and function approximation as the computing power of microcomputers increases. Modeling of complex ecological systems such as reservoir limnology is very difficult, since the ecological interactions within a reservoir are difficult to define mathematically and are usually system specific. To illustrate the potential use of Neural Networks in ecological modeling, software was developed to train the data from Keban Dam Reservoir by back...
Effect of quantization on the performance of deep networks
Kütükcü, Başar; Bozdağı Akar, Gözde; Department of Electrical and Electronics Engineering (2020)
Deep neural networks have performed well on many engineering problems in recent years. However, the power- and memory-hungry nature of deep learning algorithms prevents mobile devices from benefiting from the success of deep neural networks. The increasing number of mobile devices creates a push to make deep network deployment possible for resource-constrained devices. Quantization is a solution to this problem. In this thesis, different quantization techniques and their effects on deep networks are examined. The tech...
Improved knowledge distillation with dynamic network pruning
Şener, Eren; Akbaş, Emre; Department of Computer Engineering (2019)
Deploying convolutional neural networks to mobile or embedded devices is often prohibited by limited memory and computational resources. This is particularly problematic for the most successful networks, which tend to be very large and require long inference times. In the past, many alternative approaches have been developed for compressing neural networks based on pruning, regularization, quantization or distillation. In this thesis, we propose the Knowledge Distillation with Dynamic Pruning (KDDP), which ...
Case studies on the use of neural networks in eutrophication modeling
Karul, C; Soyupak, S; Cilesiz, AF; Akbay, N; Germen, E (2000-10-30)
Artificial neural networks are becoming increasingly common in the development of prediction models for complex systems, as the theory behind them develops and the processing power of computers increases. A three-layer Levenberg-Marquardt feedforward learning algorithm was used to model the eutrophication process in three water bodies of Turkey (Keban Dam Reservoir, Mogan and Eymir Lakes). Despite the very complex and peculiar nature of Keban Dam, a relatively good correlation (correlation coefficient...
Citation Formats
H. Alemdar, A. Prost-Boucle, and F. Petrot, “Ternary Neural Networks for Resource-Efficient AI Applications,” presented at the International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, 2017, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/40269.