A modular framework for PDTB-style multilingual discourse parsing
Download
Mustafa_Erolcan_Er_Tez.pdf
Mustafa Erolcan Er_Tez Teslim Belgeleri.pdf
Date
2025-12-23
Author
Er, Mustafa Erolcan
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Discourse parsing is one of the most challenging tasks in Natural Language Processing (NLP) because of its inherent complexity. Advances in large language models, however, have begun to reshape discourse parsing, as they have across all areas of NLP. In this thesis, we introduce a multilingual discourse parsing framework for datasets based on the Penn Discourse TreeBank (PDTB). PDTB-based discourse parsing models ideally involve three modules: discourse connective (DC) detection, argument span labeling, and discourse relation recognition (DRR). We first perform DC detection and argument span labeling for the Explicit and Alternative Lexicalization (AltLex) relation types by fine-tuning BERT. We then perform DRR for the Explicit, Implicit, and AltLex relation types using various in-context learning strategies. Finally, we define two interconnected modules: one connects the DC detection module to the argument span labeling module, and the other connects the DC detection module to the DRR module. These modules can be regarded as a first step toward end-to-end discourse parsing. We test the pipeline on seven datasets across three languages (English, Portuguese, and Turkish); it achieves competitive results at each stage, on a par with state-of-the-art discourse parsing models. Beyond the modular pipeline, we present two further contributions: a lightweight DC detection model and an improvement on the implicit DRR task that leverages machine translation techniques.
Subject Keywords
Chain-of-Thought Reasoning, Discourse Parsing, In-Context Learning, Large Language Models, PDTB
URI
https://hdl.handle.net/11511/118348
Collections
Graduate School of Informatics, Thesis
Citation Formats
M. E. Er, “A modular framework for PDTB-style multilingual discourse parsing,” Ph.D. - Doctoral Program, Middle East Technical University, 2025.