AN INVESTIGATION OF ISSUE LABELING IN OPEN SOURCE SOFTWARE PROJECTS USING LARGE LANGUAGE MODELS

Date

2024-09-06

Author

Deniz, İrem Selin

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

In the evolving landscape of open source software projects, effective issue management remains pivotal to sustaining project success. Issue reports provide valuable information, as they are created to report bugs, request new features, or ask questions about a software product. The high number of issue reports, which vary widely in quality, requires accurate issue classification mechanisms to prioritize work and manage resources effectively. Properly assigned issue labels are crucial both for effective project management and for the reliability of research on improving issue management, as such research often treats the assigned issue labels as ground truth. This study aims to assess the reliability of the assigned issue labels in open source software development projects in order to improve issue management processes. The research involves collecting two datasets of issue reports from open source software development projects hosted on GitHub. Experiments were conducted with state-of-the-art large language models for issue label classification. Furthermore, a qualitative analysis was performed to evaluate the relevance of the assigned issue labels with respect to the content of the issue reports. The empirical study revealed a significant mismatch between the assigned issue labels and the actual content of the issue reports. The study also demonstrated the effectiveness of state-of-the-art large language models in classifying issue labels, while highlighting concerns about the reliability of issue labels in open source software development projects.

Subject Keywords

issue management, issue classification, issue label, LLM, open source software

URI

https://hdl.handle.net/11511/111272

Collections

Graduate School of Informatics, Thesis

Citation Formats

İ. S. Deniz, “AN INVESTIGATION OF ISSUE LABELING IN OPEN SOURCE SOFTWARE PROJECTS USING LARGE LANGUAGE MODELS,” M.S. - Master of Science, Middle East Technical University, 2024.