ENHANCING DNN TEST DATA SELECTION THROUGH UNCERTAINTY-BASED AND DATA DISTRIBUTION-AWARE APPROACHES

2024-7
Demir, Demet
In this thesis, we introduce a testing framework designed to identify fault-revealing data in Deep Neural Network (DNN) models and to determine the causes of these failures. Given the data-driven nature of DNNs, the effectiveness of testing depends on the adequacy of labeled test data. We perform test data selection with the goal of identifying and prioritizing test data that will cause failures in the DNN. To achieve this, we leveraged the model's degree of uncertainty about its inputs. We first employed state-of-the-art uncertainty estimation methods and metrics, then proposed new ones. Lastly, we developed a novel approach using a meta-model that integrates multiple uncertainty metrics, overcoming the limitations of individual metrics and enhancing effectiveness in various scenarios. The test data distribution significantly impacts DNN performance and is critical in assessing test results. Therefore, we generated test datasets with a distribution-aware perspective. We propose to focus first on in-distribution data, for which the DNN model is expected to make accurate predictions, and then to include out-of-distribution (OOD) data. Furthermore, we investigated post-hoc explainability methods to identify the causes of incorrect predictions. Visualization-based explanation techniques provide insights into the reasons for incorrect decision-making by DNNs; however, they require detailed manual assessment. We evaluated the proposed methodologies using image classification DNNs and datasets. The results show that uncertainty-based test selection effectively identifies fault-revealing inputs. Specifically, test data prioritization using the meta-model approach outperforms state-of-the-art methods. Consequently, we conclude that using prioritized data in tests significantly increases the detection rate of DNN model failures.
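As a rough illustration of the general idea of uncertainty-based test prioritization, the sketch below ranks test inputs by the Shannon entropy of their predicted class probabilities, so that the inputs the model is least certain about are examined first. This is a minimal sketch under stated assumptions, not the method developed in the thesis: the abstract does not name the metrics used, so entropy is chosen here only as a common example, and the function names and the probability values are hypothetical.

```python
import numpy as np


def predictive_entropy(probs):
    """Shannon entropy of each row of class probabilities (higher = more uncertain)."""
    # Clip to avoid log(0); this is an illustrative safeguard, not a tuned value.
    probs = np.clip(probs, 1e-12, 1.0)
    return -np.sum(probs * np.log(probs), axis=1)


def prioritize(probs):
    """Return test-input indices ordered from most to least uncertain."""
    scores = predictive_entropy(probs)
    return np.argsort(-scores, kind="stable")


# Hypothetical softmax outputs for four test inputs over three classes.
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
])

order = prioritize(probs)
print(order)
```

In this toy example the nearly uniform prediction is ranked first and the confident prediction last. A meta-model of the kind the abstract describes would combine several such scores rather than rely on a single one.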
Citation Formats
D. Demir, “ENHANCING DNN TEST DATA SELECTION THROUGH UNCERTAINTY-BASED AND DATA DISTRIBUTION-AWARE APPROACHES,” Ph.D. - Doctoral Program, Middle East Technical University, 2024.