EXPERIMENTING DENSE PASSAGE RETRIEVAL FOR TURKISH LEGAL TEXTS

2025-8-28
Tiryaki Onay, Yağmur
Professionals working in the legal domain routinely engage with long and complex legal passages. The volume and intricacy of these passages call for an efficient, high-performance legal information retrieval system, and a growing number of studies apply Natural Language Processing (NLP) techniques for this purpose. In this thesis, we introduce a dense retrieval system that combines the ColBERT architecture with a multilingual BERT backbone, fine-tuned on a comprehensive legal dataset so that the model captures nuanced semantic relationships between long queries and passages. Experiments were carried out to examine the capability of different dense retrieval models, with the traditional sparse retriever BM25 serving as the baseline. The results show that ColBERTv2 excels at ranking relevant documents for queries in the Turkish legal domain; the system can be used for various legal tasks and for further research on Turkish passage retrieval.
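As a rough illustration of the late-interaction scoring that ColBERT-style retrievers are generally described as using, the sketch below scores a passage by summing, over query tokens, the maximum cosine similarity to any passage token. It is not the thesis's implementation: the function names (`normalize`, `maxsim_score`, `rank_passages`) are hypothetical, and the two-dimensional toy vectors stand in for the token embeddings that a fine-tuned multilingual BERT encoder would produce.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_embs, passage_embs):
    """Late-interaction score: for each query token, take its best-matching
    passage token (max cosine similarity), then sum over query tokens."""
    q = [normalize(v) for v in query_embs]
    p = [normalize(v) for v in passage_embs]
    score = 0.0
    for qv in q:
        score += max(sum(a * b for a, b in zip(qv, pv)) for pv in p)
    return score


def rank_passages(query_embs, passages):
    """Return (passage_id, score) pairs sorted from most to least relevant."""
    scored = [(pid, maxsim_score(query_embs, embs)) for pid, embs in passages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (assumed for illustration only, not real embeddings).
query = [[1.0, 0.0], [0.0, 1.0]]
passages = {
    "p1": [[1.0, 0.0], [0.0, 2.0]],  # one close match per query token
    "p2": [[1.0, 1.0]],              # a single token halfway between both
}
ranking = rank_passages(query, passages)
```

Here `p1` ranks above `p2` because each query token finds an exact directional match in `p1`, whereas the single token in `p2` only partially matches both. A sparse baseline such as BM25 would instead score passages by weighted term overlap, without comparing embeddings.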
Citation Formats
Y. Tiryaki Onay, “EXPERIMENTING DENSE PASSAGE RETRIEVAL FOR TURKISH LEGAL TEXTS,” M.S. - Master of Science, Middle East Technical University, 2025.