Designing and debiasing binary classifiers for irony and satire detection

2024-09-05
Öztürk, Aslı Umay
In the age of social media, automatically detecting ironic and satirical text is a challenging task that is important for fighting misinformation online. Although compelling datasets and studies exist for other languages, the literature lacks large datasets and comprehensive studies for Turkish. This work aims to fill that gap by first curating two datasets for irony and satire detection, and then using them to explore binary classification pipelines for both tasks with traditional supervised learning methods such as SVM (Support Vector Machine) and large language models (LLMs) such as BERT (Bidirectional Encoder Representations from Transformers). Furthermore, this work discusses possible bias in the curated datasets through stylistic analysis, and possible inherited bias in the trained models by applying model explainability methods and comparing the results with human annotations. Finally, it proposes a pipeline for debiasing and improving model generalisability through synthetic data generation with LLMs.
Citation Formats
A. U. Öztürk, “Designing and debiasing binary classifiers for irony and satire detection,” M.S. thesis, Middle East Technical University, 2024.