Show/Hide Menu
Hide/Show Apps
Logout
Türkçe
Türkçe
Search
Search
Login
Login
OpenMETU
OpenMETU
About
About
Open Science Policy
Open Science Policy
Open Access Guideline
Open Access Guideline
Postgraduate Thesis Guideline
Postgraduate Thesis Guideline
Communities & Collections
Communities & Collections
Help
Help
Frequently Asked Questions
Frequently Asked Questions
Guides
Guides
Thesis submission
Thesis submission
MS without thesis term project submission
MS without thesis term project submission
Publication submission with DOI
Publication submission with DOI
Publication submission
Publication submission
Supporting Information
Supporting Information
General Information
General Information
Copyright, Embargo and License
Copyright, Embargo and License
Contact us
Contact us
Named entity recognition in Turkish: A comparative study with detailed error analysis
Date
2022-11-01
Author
Ozcelik, Oguzhan
Toraman, Çağrı
Metadata
Show full item record
This work is licensed under a
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
.
Item Usage Stats
17
views
0
downloads
Cite This
Named entity recognition aims to detect pre-determined entity types in unstructured text. There is a limited number of studies on this task for low-resource languages such as Turkish. We provide a comprehensive study for Turkish named entity recognition by comparing the performances of existing state-of-the-art models on the datasets with varying domains to understand their generalization capability and further analyze why such models fail or succeed in this task. Our experimental results, supported by statistical tests, show that the highest weighted F1 scores are obtained by Transformer-based language models, varying from 80.8% in tweets to 96.1% in news articles. We find that Transformer-based language models are more robust to entity types with a small sample size and longer named entities compared to traditional models, yet all models have poor performance for longer named entities in social media. Moreover, when we shuffle 80% of words in a sentence to imitate flexible word order in Turkish, we observe more performance deterioration, 12% in well-written texts, compared to 7% in noisy text.
URI
https://hdl.handle.net/11511/109663
Journal
INFORMATION PROCESSING & MANAGEMENT
DOI
https://doi.org/10.1016/j.ipm.2022.103065
Collections
Department of Computer Engineering, Article
Citation Formats
IEEE
ACM
APA
CHICAGO
MLA
BibTeX
O. Ozcelik and Ç. Toraman, “Named entity recognition in Turkish: A comparative study with detailed error analysis,”
INFORMATION PROCESSING & MANAGEMENT
, vol. 59, no. 6, pp. 0–0, 2022, Accessed: 00, 2024. [Online]. Available: https://hdl.handle.net/11511/109663.