The reusability prior in deep learning models

2023-05-22
Polat, Aydın Göze
Various design choices can affect the performance of deep learning (DL) models. For instance, repetition within a model via cross-layer parameter sharing, the use of convolutional layers, and reliance on skip connections all affect the reusability of components in DL models and thus their parameter efficiency. In this work, three approaches are proposed to investigate how such design choices for repetition affect model performance. First, a new library, Revolver, is proposed to analyze reusable modules or model components while training a population of DL models. Reusing modules across models enabled training an entire population of models on a single GPU and collecting statistics about top-scoring shared modules. Second, the reusability prior is proposed: model components are forced to function in diverse contexts not only because of the training data, augmentation, and regularization choices, but also because of the model design itself. Based on this prior, a counting-based graph analysis approach is proposed that quantifies the number of contexts for each learnable parameter. In the experiments, this approach correctly predicted the ranking of several analyzed models in terms of top-1 accuracy without relying on any training. Third, a generalized framework inspired by statistical mechanics is proposed, in which the context-based counting approach describes models at absolute temperature T = -1. The generalized framework goes beyond the counting approach by encoding constraints and assumptions as energies at the parameter level. Overall, these approaches may open up avenues for research on model analysis and comparison, or lead to practical applications in neural architecture search.
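To illustrate the flavor of the counting-based analysis described in the abstract, the toy sketch below counts, for each parameter identifier, how many distinct layer positions (contexts) it appears in. All names here (count_contexts, the toy models) are hypothetical illustrations for this record only; they are not the thesis's Revolver API, and the actual method operates on full model graphs rather than flat layer lists.

```python
# Minimal sketch: count the number of contexts (distinct layer positions)
# in which each learnable parameter is used. Hypothetical example code,
# not taken from the thesis or the Revolver library.
from collections import defaultdict

def count_contexts(layers):
    """layers: ordered list of layers, each a list of parameter ids.
    Returns a dict mapping each parameter id to the number of layers
    (contexts) in which it appears."""
    contexts = defaultdict(int)
    for layer in layers:
        for param_id in set(layer):  # count each layer at most once per parameter
            contexts[param_id] += 1
    return dict(contexts)

# Toy comparison: with cross-layer parameter sharing the same parameter ids
# recur across layers, so each parameter accumulates more contexts than in
# an unshared model of the same depth.
shared_model   = [["w1", "w2"], ["w1", "w2"], ["w1", "w2"]]   # shared weights
unshared_model = [["w1", "w2"], ["w3", "w4"], ["w5", "w6"]]   # distinct weights per layer

print(count_contexts(shared_model))    # {'w1': 3, 'w2': 3}
print(count_contexts(unshared_model))  # every parameter appears in exactly 1 context
```

In this simplified view, a higher context count per parameter corresponds to a component being forced to function in more situations, which is the intuition behind the reusability prior; the thesis quantifies this over model graphs rather than a flat list of layers.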
Citation Formats
A. G. Polat, “The reusability prior in deep learning models,” Ph.D. - Doctoral Program, Middle East Technical University, 2023.