Natural Language Processing for the Turkish Academic Texts in the Engineering Field: Key-Term Extraction, Similarity Detection, Subject/Topic Assignment
Date
2023-01-01
Author
Kat, Bora
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Information retrieved from texts plays a crucial role in many applications. Although there have been significant efforts in natural language processing for various types of Turkish texts, none of them deals with academic texts. This study aims primarily to extract precise key terms from Turkish academic texts in the field of engineering, and it develops algorithms for similarity detection and automatic classification based on these key terms. In the first step, a library and customized templates that transform n-grams into structured forms are created, taking into account the features of engineering terminology and the grammar of the Turkish language. Then, a customized similarity detection algorithm is developed. Finally, a Naïve Bayes classifier is used to assign documents to the appropriate engineering sub-fields. Project proposals submitted to the Academic Research Funding Program Directorate (ARDEB) of The Scientific and Technological Research Council of Turkey (TÜBİTAK) are analyzed as a case study. The results indicate that the proposed similarity algorithm correctly detects almost all of the re-submitted proposals, while the classifier's accuracy is 83.3% for the first prediction and reaches 96.4% within the first three predictions, on a sample of 1255 proposals.
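The pipeline described in the abstract (n-gram key terms, a similarity check, and a Naïve Bayes classifier evaluated on its first and first three predictions) can be illustrated with a minimal Python sketch. This is not the study's implementation: the abstract does not specify the customized templates or the similarity algorithm, so plain word n-grams and Jaccard similarity are used here as stand-ins, and the class `TinyNaiveBayes`, the helper names, and the English toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n_max=2):
    """Return all contiguous n-grams (as strings) for n = 1..n_max."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def jaccard(terms_a, terms_b):
    """Jaccard similarity of two key-term collections (a stand-in measure,
    not the customized algorithm from the study)."""
    a, b = set(terms_a), set(terms_b)
    return len(a & b) / len(a | b) if a | b else 0.0


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.term_counts = defaultdict(Counter)  # label -> term frequencies
        self.class_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for terms, label in zip(docs, labels):
            self.term_counts[label].update(terms)
            self.vocab.update(terms)
        return self

    def top_k(self, terms, k=3):
        """Return the k most probable labels, best first."""
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for term in terms:
                score += math.log((counts[term] + 1) / denom)  # log likelihood
            scores[label] = score
        return sorted(scores, key=scores.get, reverse=True)[:k]


def top_k_accuracy(model, docs, labels, k):
    """Fraction of documents whose true label is among the top-k predictions."""
    hits = sum(label in model.top_k(terms, k)
               for terms, label in zip(docs, labels))
    return hits / len(docs)


# Toy data (invented for illustration only).
train_docs = [["heat", "transfer"], ["neural", "network"], ["concrete", "beam"]]
train_labels = ["mechanical", "computer", "civil"]
model = TinyNaiveBayes().fit(train_docs, train_labels)

print(ngrams(["heat", "transfer", "model"]))
print(jaccard(["heat", "transfer"], ["heat", "exchanger"]))
print(model.top_k(["neural", "network"], k=3))
print(top_k_accuracy(model, train_docs, train_labels, k=1))
```

Reporting accuracy for both `k=1` and `k=3` mirrors the two figures quoted in the abstract: a prediction counts as correct at `k=3` if the true sub-field appears anywhere among the three most probable labels.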
Subject Keywords
Conceptual similarity, Feature extraction, Key term extraction, Natural language processing (NLP), Naïve Bayes classifier, subject/topic assignment, Supervised machine learning, TÜBİTAK
URI
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85173561991&origin=inward
https://hdl.handle.net/11511/105782
DOI
https://doi.org/10.1007/978-3-031-34107-6_33
Conference Name
19th IFIP WG 12.5 International Conference on Artificial Intelligence Applications and Innovations, AIAI 2023
Collections
Graduate School of Natural and Applied Sciences, Conference / Seminar
Citation Formats
IEEE
B. Kat, “Natural Language Processing for the Turkish Academic Texts in the Engineering Field: Key-Term Extraction, Similarity Detection, Subject/Topic Assignment,” Leon, Spain, 2023, vol. 676 IFIP, Accessed: 00, 2023. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85173561991&origin=inward.