Person name recognition in Turkish financial texts by using local grammar approach

Date

2007

Author

Bayraktar, Özkan

Named entity recognition (NER) is the task of identifying the named entities (NEs) in a text and classifying them into semantic categories such as person, organization, and place names, and time, date, monetary, and percent expressions. The local grammar (LG) approach has recently been shown to be superior to other NER techniques, such as the probabilistic, symbolic, and hybrid approaches, in its ability to work with untagged corpora. Unlike most other NER systems, the LG approach does not require dictionaries or gazetteers, which are lists of proper nouns (PNs) used in NER applications. As a consequence, it can recognize NEs in previously unseen texts at minimal cost. Most NER systems are costly because their rules are compiled manually, especially in large tagged corpora. They also require semantic and syntactic analyses to be applied before the pattern generation process, a step that the LG approach avoids. In this thesis, we tried to acquire LGs for person names from a large untagged Turkish financial news corpus, using an approach that H. N. Traboulsi recently applied successfully to a Reuters English financial news corpus. We explored its applicability to Turkish by using frequency, collocation, and concordance analyses. In addition, we constructed a list of Turkish reporting verbs, which is an important part of this study because there has been no major study of reporting verbs in Turkish.
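As a rough illustration of how concordance and frequency analyses around reporting verbs can surface person-name candidates in untagged text, here is a minimal Python sketch. It is not the method or the patterns used in the thesis. The cue list `REPORTING_VERBS`, the sample sentences, the invented names in them, and the heuristic of taking a sentence-initial run of capitalized tokens are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical cue words: a few Turkish reporting verbs chosen for illustration,
# not the list constructed in the thesis.
REPORTING_VERBS = {"dedi", "söyledi", "belirtti", "açıkladı"}

# Made-up sample sentences with invented person names.
SAMPLE = [
    "Ahmet Yılmaz, faizlerin düşeceğini söyledi.",
    "Ayşe Demir, bankanın kâr açıkladığını belirtti.",
    "Piyasalar bugün yükseldi.",
]

TOKEN = re.compile(r"\w+")  # Unicode-aware for str patterns in Python 3


def concordance(sentences, cues, window=3):
    """Return (left context, cue, right context) for every cue occurrence."""
    lines = []
    for sentence in sentences:
        tokens = TOKEN.findall(sentence)
        for i, tok in enumerate(tokens):
            if tok.lower() in cues:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                lines.append((left, tok, right))
    return lines


def candidate_names(sentences, cues):
    """Count sentence-initial runs of two or more capitalized tokens
    in sentences that contain at least one cue word."""
    counts = Counter()
    for sentence in sentences:
        tokens = TOKEN.findall(sentence)
        if not any(tok.lower() in cues for tok in tokens):
            continue
        run = []
        for tok in tokens:
            if tok[0].isupper():
                run.append(tok)
            else:
                break
        # A single capitalized token is usually just the sentence start.
        if len(run) >= 2:
            counts[" ".join(run)] += 1
    return counts


if __name__ == "__main__":
    for left, cue, right in concordance(SAMPLE, REPORTING_VERBS):
        print(" ".join(left), f"[{cue}]", " ".join(right))
    print(candidate_names(SAMPLE, REPORTING_VERBS).most_common())
```

The heuristic is deliberately crude: it would also pick up capitalized organization names that open a sentence, so any real pattern set would need further cues to separate person names from other proper nouns.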

Subject Keywords

Named Entity Recognition.

URI

http://etd.lib.metu.edu.tr/upload/12608862/index.pdf
https://hdl.handle.net/11511/17119

Collections

Graduate School of Informatics, Thesis

Suggestions

Named Entity Recognition with Conditional Random Fields on Turkish News Dataset: Revisiting the Features Çekinel, Recep Fırat; Karagöz, Pınar (2019-04-24) Named entity recognition is a natural language processing problem that aims to mark entity names, such as person, place, organization, date, time, money and percentage, from different types of text. Various applications such as location estimation, event time estimation, determination of important people in the text can be possible with the solutions to this problem. The number of named entity recognition studies on Turkish texts is quite limited compared to those on English. In this study, the use of the t...
Financial named entity recognition for Turkish news texts Dinç, Duygu; Doğru, Ali Hikmet; Karagöz, Pınar; Department of Computer Engineering (2022-7-26) Named Entity Recognition (NER) is an information extraction problem in which the objective is to correctly detect and label the named entities (NEs) in a given text according to predetermined categories. An NE may be a noun or a group of nouns that corresponds to the name of a specific object, location, or a concept in the case of domain-specific applications. In the literature, person, organization, and location names and date, time, money, and percentage expressions are among the most studied generic NEs. Besides, there are ...
Named Entity Recognition in Turkish with Bayesian Learning and Hybrid Approaches RehaYavuz, Sermet; Kucuk, Dilek; Yazıcı, Adnan (2013-10-29) Named entity recognition is one of the significant textual information extraction tasks. In this paper, we present two approaches for named entity recognition on Turkish texts. The first is a Bayesian learning approach which is trained on a considerably limited training set. The second approach comprises two hybrid systems based on joint utilization of this Bayesian learning approach and a previously proposed rule-based named entity recognizer. All of the proposed three approaches achieve promising performa...
A hybrid named entity recognizer for Turkish Kucuk, Dilek; Yazıcı, Adnan (2012-02-15) Named entity recognition is an important subfield of the broader research area of information extraction from textual data. Yet, named entity recognition research conducted on Turkish texts is still rare as compared to related research carried out on other languages such as English, Spanish, Chinese, and Japanese. In this study, we present a hybrid named entity recognizer for Turkish, which is based on a manually engineered rule based recognizer that we have proposed. Since rule based systems for specific d...
Named entity recognition experiments on Turkish texts Küçük, Dilek; Yazıcı, Adnan (2009-10-28) Named entity recognition (NER) is one of the main information extraction tasks and research on NER from Turkish texts is known to be rare. In this study, we present a rule-based NER system for Turkish which employs a set of lexical resources and pattern bases for the extraction of named entities including the names of people, locations, organizations together with time/date and money/percentage expressions. The domain of the system is news texts and it does not utilize important clues of capitalization and ...

Citation Formats

Ö. Bayraktar, “Person name recognition in Turkish financial texts by using local grammar approach,” M.S. - Master of Science, Middle East Technical University, 2007.