Improved knowledge distillation with dynamic network pruning

Date

2019

Author

Şener, Eren

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Deploying convolutional neural networks on mobile or embedded devices is often hindered by limited memory and computational resources. This is particularly problematic for the most successful networks, which tend to be very large and have long inference times. Many approaches have been developed for compressing neural networks, based on pruning, regularization, quantization, or distillation. In this thesis, we propose Knowledge Distillation with Dynamic Pruning (KDDP), which trains a dynamically pruned, compact student network under the guidance of a large teacher network. In KDDP, we train the student network with supervision from the teacher network while applying L1 regularization to the neuron activations in a fully-connected layer. Subsequently, we prune the inactive neurons. Our method automatically determines the final size of the student model. We evaluate the compression rate and accuracy of the resulting networks on image classification datasets and compare them with results obtained by Knowledge Distillation (KD). Compared to KD, our method yields higher accuracy and more compact models.
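The training signal and pruning step described in the abstract can be illustrated with a small, dependency-free sketch. It is not the thesis implementation. It assumes the commonly used temperature-softened distillation loss, and the function names, the `temperature`, `alpha`, and `strength` parameters, and the mean-absolute-activation pruning threshold are all hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, with a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Per-example KD loss (assumed form, not taken from the thesis).

    alpha weights the cross-entropy with the true label; (1 - alpha) weights
    the KL divergence between temperature-softened teacher and student outputs.
    """
    hard = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student))
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft


def l1_activation_penalty(activations, strength=1e-3):
    """L1 penalty on one example's activations in the fully-connected layer."""
    return strength * sum(abs(a) for a in activations)


def active_neuron_indices(activation_rows, threshold=1e-3):
    """Indices of neurons whose mean |activation| over the rows exceeds threshold.

    The threshold rule is a hypothetical criterion for "inactive" neurons.
    """
    count = len(activation_rows)
    width = len(activation_rows[0])
    means = [sum(abs(row[j]) for row in activation_rows) / count
             for j in range(width)]
    return [j for j, mean in enumerate(means) if mean > threshold]


def prune_layer(weights_in, weights_out, keep):
    """Drop pruned neurons from both sides of the fully-connected layer.

    weights_in has one row per neuron; weights_out has one column per neuron.
    """
    pruned_in = [weights_in[j] for j in keep]
    pruned_out = [[row[j] for j in keep] for row in weights_out]
    return pruned_in, pruned_out
```

In this sketch the per-example training objective would be `distillation_loss(...) + l1_activation_penalty(...)`. After training, `active_neuron_indices` selects the neurons to keep, so the width of the pruned layer follows from the data rather than being fixed in advance.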

Subject Keywords

Neural networks (Computer science), knowledge distillation, model compression, deep learning.

URI

http://etd.lib.metu.edu.tr/upload/12624131/index.pdf
https://hdl.handle.net/11511/44620

Collections

Graduate School of Natural and Applied Sciences, Thesis

Suggestions

Improved Knowledge Distillation with Dynamic Network Pruning Şener, Eren; Akbaş, Emre (2022-9-30) Deploying convolutional neural networks to mobile or embedded devices is often prohibited by limited memory and computational resources. This is particularly problematic for the most successful networks, which tend to be very large and require long inference times. Many alternative approaches have been developed for compressing neural networks based on pruning, regularization, quantization or distillation. In this paper, we propose the “Knowledge Distillation with Dynamic Pruning” (KDDP), which trains a dyn...
Adaptive mean-shift for automated multi object tracking Beyan, C.; Temizel, Alptekin (2012-01-01) Mean-shift tracking plays an important role in computer vision applications because of its robustness, ease of implementation and computational efficiency. In this study, a fully automatic multiple-object tracker based on mean-shift algorithm is presented. Foreground is extracted using a mixture of Gaussian followed by shadow and noise removal to initialise the object trackers and also used as a kernel mask to make the system more efficient by decreasing the search area and the number of iterations to conve...
Effect of quantization on the performance of deep networks Kütükcü, Başar; Bozdağı Akar, Gözde.; Department of Electrical and Electronics Engineering (2020) Deep neural networks performed greatly for many engineering problems in recent years. However, power and memory hungry nature of deep learning algorithm prevents mobile devices to benefit from the success of deep neural networks. The increasing number of mobile devices creates a push to make deep network deployment possible for resource-constrained devices. Quantization is a solution for this problem. In this thesis, different quantization techniques and their effects on deep networks are examined. The tech...
Automated learning rate search using batch-level cross-validation KABAKÇI, Duygu; Akbaş, Emre (2021-04-01) Deep learning researchers and practitioners have accumulated a significant amount of experience on training a wide variety of architectures on various datasets. However, given a network architecture and a dataset, obtaining the best model (i.e. the model giving the smallest test set error) while keeping the training time complexity low is still a challenging task. Hyper-parameters of deep neural networks, especially the learning rate and its (decay) schedule, highly affect the network's final performance. Th...
A new approach to mathematical water quality modeling in reservoirs: Neural networks Karul, C; Soyupak, S; Germen, E (1998-01-01) Neural Networks are becoming more and more valuable tools for system modeling and function approximation as computing power of microcomputers increase. Modeling of complex ecological systems such as reservoir limnology is very difficult since the ecological interactions within a reservoir are difficult to define mathematically and are usually system specific. To illustrate the potential use of Neural Networks in ecological modeling, a software was developed to train the data from Keban Dam Reservoir by back...

Citation Formats

E. Şener, “Improved knowledge distillation with dynamic network pruning,” M.S. thesis, Graduate School of Natural and Applied Sciences, Computer Engineering, Middle East Technical University, 2019.