An Assessment of Explicit Inter- and Intra-sentential Discourse Connectives in Turkish Discourse Bank

2018-05-07
The paper offers a quantitative and qualitative analysis of explicit inter- and intra-sentential discourse connectives in Turkish Discourse Bank, or TDB version 1.1, a multi-genre resource of written Turkish manually annotated at the discourse level following the goals and principles of Penn Discourse TreeBank. TDB 1.1 is a 40K-word corpus involving all major discourse relation types (explicit discourse relations at intra- and inter-sentential positions, implicit discourse relations, alternative lexicalizations and entity relations) along with their senses and the text spans they relate. The paper focuses on the addition of a new set of explicit intra-sentential connectives to TDB 1.1, namely converbs (a subset of subordinators), which are suffixal connectives mostly corresponding to subordinating conjunctions in European languages. An evaluation of the converb sense annotations is provided. Then, with corpus statistics, explicit intra- and inter-sentential connectives are compared in terms of their frequency of occurrence and with respect to the senses they convey. The results suggest that the subordinators tend to select certain senses not selected by explicit inter-sentential discourse connectives in the data. Overall, our findings offer a promising direction for future NLP tasks in Turkish.

Suggestions

Assessment of the Turkish discourse bank and a cascaded model to automatically identify discursive phrasal expressions in Turkish
Sevdik Çallı, Ayışığı Başak; Zeyrek Bozşahin, Deniz; Department of Cognitive Sciences (2015)
This thesis presents a methodology for an overall assessment of the Turkish Discourse Bank (TDB), a linguistic resource where discourse relations overtly expressed by discourse connectives have been identified and annotated with the two arguments they relate. We provide a quantitative and qualitative assessment of the TDB in order to establish the reliability of this discourse resource for Turkish and suggest that our methodology can be utilized for reliability evaluations of other annotated corpora. Our qu...
Pair Annotation as a Novel Annotation Procedure: The Case of Turkish Discourse Bank
Demirşahin, Işın; Zeyrek Bozşahin, Deniz (2017-6-17)
In this chapter, we provide an overview of Turkish Discourse Bank, a resource of ∼400,000 words built on a sub-corpus of the 2-million-word METU Turkish Corpus annotated following the principles of Penn Discourse Tree Bank. We first present the annotation framework we adopted, explaining how it differs from the annotation of the original language, English. Then we focus on a novel annotation procedure that we have devised and named pair annotation after pair programming. We discuss the advantages it has of...
Pair Annotation as a Novel Annotation Procedure: The Case of Turkish Discourse Bank
Demirşahin, Işın; Zeyrek Bozşahin, Deniz (Springer, 2017-01-01)
In this chapter, we provide an overview of Turkish Discourse Bank, a resource of ∼400,000 words built on a sub-corpus of the 2-million-word METU Turkish Corpus annotated following the principles of Penn Discourse Tree Bank. We first present the annotation framework we adopted, explaining how it differs from the annotation of the original language, English. Then we focus on a novel annotation procedure that we have devised and named pair annotation after pair programming. We discuss the advantages it has ...
The annotation scheme of the Turkish Discourse Bank and an evaluation of inconsistent annotations
Zeyrek Bozşahin, Deniz; Sevdik-Çallı, Ayışığı; Ögel-Balaban, Hale; Yalçınkaya, İhsan; Turan, Ümit Deniz (2010-7-15-16)
In this paper, we report on the annotation procedures we developed for annotating the Turkish Discourse Bank (TDB), an effort that extends the Penn Discourse Tree Bank (PDTB) annotation style by using it for annotating Turkish discourse. After a brief introduction to the TDB, we describe the annotation cycle and the annotation scheme we developed, defining which parts of the scheme are an extension of the PDTB and which parts are different. We provide inter-coder reliability calculations on the first and se...
The Discourse structure of Turkish
Demirşahin, Işın; Zeyrek Bozşahin, Deniz; Department of Cognitive Sciences (2015)
This thesis investigates the structure of immediate discourse in Turkish. The first and foremost question is how discourse is built. Are there components of discourse that constitute a predicate-argument structure, or is discourse realized by underlying non-structural ties that are merely made explicit by these components? If there is structure in discourse, what is the nature of this structure, and what is its complexity? For this purpose, we analyze the relations annotated in the Turkish Discourse Bank,...
Citation Formats
D. Zeyrek Bozşahin, “An Assessment of Explicit Inter- and Intra-sentential Discourse Connectives in Turkish Discourse Bank,” Miyazaki, Japan, 2018, p. 4023. Accessed: 2021. [Online]. Available: https://hdl.handle.net/11511/79039