ANALYSIS OF TECHNICAL DEBT IN ML-BASED SOFTWARE DEVELOPMENT PROJECTS

2024-9-06
Dayan Akman, Pelin
Rapid development of Machine Learning (ML) algorithms and tools, together with easier access to frameworks and infrastructure, has greatly fueled the development of ML-based software solutions for real-world problems. As in traditional software development projects, ML-based projects have to deal with the significant consequences of quick but sub-optimal solutions or shortcuts taken during development. The effects of these intentional or unintentional poor decisions are known as technical debt (TD). Because ML projects differ structurally from traditional software projects, the TD phenomenon needs to be revisited. In this thesis, TD is defined specifically in the context of ML-based projects, and distinct categories of TD relevant to these projects are identified. The assessment of TD was examined based on data collected through interviews with 18 industry professionals in the fields of Data Science and ML. These interviews were analyzed using thematic analysis to identify the root causes, impacts, band-aid solutions, and mitigation strategies related to TD. The findings of the study were reviewed by academic experts in multiple iterations. In addition to TD categories specific to ML projects, such as data, model, infrastructure, and deployment, the study identified TD categories shared with traditional software projects, such as code, system design, and team, resource, and knowledge management. This research provides a detailed understanding of the TD phenomenon in ML projects and offers practical recommendations for its management. The study contributes to the field by highlighting the unique nature of TD in the ML context and by proposing a TD-oriented structure for its assessment.
Citation Formats
P. Dayan Akman, “ANALYSIS OF TECHNICAL DEBT IN ML-BASED SOFTWARE DEVELOPMENT PROJECTS,” M.S. - Master of Science, Middle East Technical University, 2024.