Linking discourse-level information: A study on discourse relation alignment within multiple texts and languages

2024-9
Özer, Sibel
This thesis examines the complex nature of cross-linguistic discourse structures and the expression of discourse relations within multilingual contexts, focusing specifically on the TED-MDB corpus. By aligning discourse relations in parallel corpora, the study explores variations in how discourse is realized, semantic shifts, and patterns of inter-sentential encoding across different languages. The analysis emphasizes differences in expression, implicitation and explicitation of discourse connectives, and the distribution of discourse senses, highlighting the nuances in discourse translation. In addition, the study develops methods for bilingual lexicon induction from naturally occurring data, creating valuable resources on multiple languages for discourse and pragmatic studies and the enhancement of natural language processing (NLP) systems. Future research directions include investigating alternative discourse annotation schemes, exploring domain-specific impacts on discourse translation, examining syntactic interactions, and expanding the analysis to other language pairs while aligning data to Linked Language Open Data (LLOD) standards. This research significantly contributes to the understanding of linguistic differences in conveying discourse relations and semantic adaptations in the translation of discourse relations across diverse languages.
Citation Formats
S. Özer, “Linking discourse-level information: A study on discourse relation alignment within multiple texts and languages,” Ph.D. - Doctoral Program, Middle East Technical University, 2024.