Named entity recognition and explainability analysis on Turkish sports news texts

2023-12
Kılıç, Yüksel Pelin
Named Entity Recognition (NER) is a significant challenge in Natural Language Processing (NLP) and Information Extraction. It involves identifying entities such as Person, Location, and Organization in text. While NER is well researched for English and Chinese, Turkish NER lags behind, especially in domain-specific areas such as sports. The convergence of sports and technology has transformed the sports industry, affecting performance enhancement, fan engagement, and management. Textual data still holds untapped potential for qualitative insights into the dynamics between athletes, teams, and supporters. Two gaps stand out: the application of deep learning techniques to Turkish NER needs further exploration, particularly in comparison with traditional methods, and there is little research on the interpretability and explainability of transformer-based models in this context. This study introduces domain-specific Turkish NER data sets, mainly from the sports domain, and uses them to evaluate the effectiveness of transformer-based models for Turkish NER. It compares these models and analyzes how different annotation formats affect the results. The effect of named entity distribution on model performance is investigated through cross-validation. The study also applies interpretability methods to uncover the rationale and mechanisms behind model predictions, a relatively under-explored area in Turkish NER. The research contributes to NLP and Information Extraction and has implications for sports research and management, providing new insights into the interaction between sports and technology.
Citation Formats
Y. P. Kılıç, “Named entity recognition and explainability analysis on Turkish sports news texts,” M.S. thesis, Middle East Technical University, 2023.