ENHANCING DNN TEST DATA SELECTION THROUGH UNCERTAINTY-BASED AND DATA DISTRIBUTION-AWARE APPROACHES

Date

2024-7

Author

Demir, Demet

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

In this thesis, we introduce a testing framework designed to identify fault-revealing data in Deep Neural Network (DNN) models and to determine the causes of the resulting failures. Given the data-driven nature of DNNs, the effectiveness of testing depends on the adequacy of labeled test data. We perform test data selection with the goal of identifying and prioritizing test data that will cause failures in the DNN. To achieve this, we leverage the model's degree of uncertainty about its inputs. We first employ state-of-the-art uncertainty estimation methods and metrics, and then propose new ones. Lastly, we develop a novel approach using a meta-model that integrates multiple uncertainty metrics, overcoming the limitations of individual metrics and enhancing effectiveness in various scenarios. The test data distribution significantly impacts DNN performance and is critical in assessing test results. Therefore, we generate test datasets with a distribution-aware perspective. We propose to focus first on in-distribution data, for which the DNN model is expected to make accurate predictions, and then to include out-of-distribution (OOD) data. Furthermore, we investigate post-hoc explainability methods to identify the causes of incorrect predictions. Visual explanation techniques provide insights into the reasons for incorrect decision-making by DNNs; however, they require detailed manual assessment. We evaluate the proposed methodologies using image classification DNNs and datasets. The results show that uncertainty-based test selection effectively identifies fault-revealing inputs. Specifically, test data prioritization using the meta-model approach outperforms state-of-the-art methods. Consequently, we conclude that using prioritized data in tests significantly increases the detection rate of DNN model failures.
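As a rough illustration of the uncertainty-based prioritization idea described in the abstract, the following minimal sketch ranks test inputs by how uncertain a classifier is about them. It is not the thesis's implementation: the function names (predictive_entropy, margin, prioritize), the choice of entropy and top-two margin as metrics, and the sample softmax outputs are all assumptions made for illustration, and the meta-model that combines several metrics is not reproduced here.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of one softmax output vector.
    Higher entropy means the model is less certain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def margin(probs):
    """Gap between the two largest class probabilities.
    A smaller gap means the model is less certain."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second

def prioritize(prob_rows, score=predictive_entropy, descending=True):
    """Return test-input indices ordered from most to least uncertain.
    Use descending=False for scores where low values signal uncertainty."""
    scores = [score(row) for row in prob_rows]
    return sorted(range(len(prob_rows)), key=lambda i: scores[i], reverse=descending)

# Hypothetical softmax outputs for three test inputs (illustrative values only).
softmax_outputs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # highly uncertain prediction
    [0.70, 0.20, 0.10],  # moderately uncertain prediction
]

by_entropy = prioritize(softmax_outputs)
by_margin = prioritize(softmax_outputs, score=margin, descending=False)
print(by_entropy, by_margin)
```

Under this sketch, the inputs at the front of the returned ordering would be labeled and tested first, on the assumption that high uncertainty correlates with misclassification. Individual metrics can disagree on less clear-cut data, which is the kind of limitation the abstract says the meta-model approach is meant to address.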

Subject Keywords

Deep Neural Network Testing, Test Data Selection and Prioritization, Data Distribution, Deep Learning Explainability, Deep Learning Uncertainty

URI

https://hdl.handle.net/11511/110146

Collections

Graduate School of Informatics, Thesis

Citation Formats

D. Demir, “ENHANCING DNN TEST DATA SELECTION THROUGH UNCERTAINTY-BASED AND DATA DISTRIBUTION-AWARE APPROACHES,” Ph.D. - Doctoral Program, Middle East Technical University, 2024.