Automated priority detection in software bugs: A comprehensive study on transformer-based encoders with contrastive learning, large language models and vector databases for enhanced efficiency
Date
2024-1-25
Author
Yılmaz, Eyüp Halit
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Software development involves many challenges that demand human effort and time. Over time, many tools and techniques have been developed to address these challenges and to automate parts of software development and maintenance. Software bug reports are textual descriptions, often accompanied by code snippets and error logs, written by users or developers to document operational failures of programs. These reports are later examined by the assigned developer, who fixes the bug. Automating parts of the bug-fixing pipeline can help determine the most suitable developer for a given bug report, predict the bug fix time, estimate a priority level, and so on. This thesis focuses on automated priority detection for software bug reports using state-of-the-art classification techniques. Widely successful transformer-based encoder classifiers are adapted to the software domain by fine-tuning on open-source datasets. Large Language Models (LLMs), on the other hand, are recently popularized transformer decoder networks trained specifically for text generation, and they can be configured for priority class prediction. To shape LLM output accurately into the desired format, Retrieval Augmented Generation (RAG) is used to condition the network on the downstream task and domain. Vector databases index the textual content of bug reports and retrieve related instances by cosine similarity during inference.
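The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the sample reports, the P1 to P5 priority labels, the word-count `embed` function (a stand-in for a learned encoder), and the prompt wording are all assumptions made for illustration, and an in-memory list takes the place of a vector database.

```python
import math
import re
from collections import Counter

# Hypothetical labelled bug reports standing in for an indexed collection.
# The P1..P5 labels are illustrative, not taken from the thesis.
INDEXED_REPORTS = [
    ("Application crashes on startup with null pointer exception", "P1"),
    ("Crash when opening settings dialog after update", "P1"),
    ("Typo in the about page footer text", "P5"),
    ("Button label misaligned on the preferences page", "P4"),
]


def embed(text):
    """Toy embedding: lowercase word counts (assumed stand-in for an encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, k=2):
    """Return the k indexed reports most similar to the query, best first."""
    query_vec = embed(query)
    scored = [
        (cosine_similarity(query_vec, embed(text)), text, label)
        for text, label in INDEXED_REPORTS
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def build_prompt(query, k=2):
    """Assemble a few-shot prompt from retrieved reports (wording is assumed)."""
    parts = ["Classify the priority of the final bug report."]
    for _, text, label in retrieve(query, k):
        parts.append(f"Report: {text}\nPriority: {label}")
    parts.append(f"Report: {query}\nPriority:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("Editor crashes with null pointer exception on startup"))
```

The retrieved examples show the model both the domain and the expected answer format, which is the role the abstract assigns to RAG. A real system would replace `embed` with a trained encoder and the list scan with a vector database query.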
Subject Keywords
Priority detection, Contrastive learning, LLMs, RAG, Vector databases
URI
https://hdl.handle.net/11511/108485
Collections
Graduate School of Natural and Applied Sciences, Thesis
Citation Formats
IEEE
E. H. Yılmaz, “Automated priority detection in software bugs: A comprehensive study on transformer-based encoders with contrastive learning, large language models and vector databases for enhanced efficiency,” M.S. - Master of Science, Middle East Technical University, 2024.