TED Multilingual Discourse Bank (TED-MDB): a parallel corpus annotated in the PDTB style
Date
2020-06-01
Author
Zeyrek Bozşahin, Deniz
Mendes, Amália
Grishina, Yulia
Kurfalı, Murathan
Gibbon, Samuel
Ogrodniczuk, Maciej
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
TED-Multilingual Discourse Bank (TED-MDB) is a multilingual resource in which TED talks are annotated at the discourse level in six languages (English, Polish, German, Russian, European Portuguese, and Turkish), following the aims and principles of the Penn Discourse Treebank (PDTB). We explain the corpus design criteria, which reflect three main considerations: the linguistic characteristics of the languages involved; the interactive nature of TED talks, which led us to annotate Hypophora; and the decision to avoid projection. We report our annotation consistency and post-annotation alignment experiments, and provide a cross-lingual comparison based on corpus statistics.
Subject Keywords
Discourse, Discourse relations, Corpus creation, Annotation, Multilingual corpus
URI
https://hdl.handle.net/11511/31690
Journal
Language Resources and Evaluation
DOI
https://doi.org/10.1007/s10579-019-09445-9
Collections
Graduate School of Informatics, Article