Turkish clickbait detection in social media via machine learning algorithms

2021-08-26
Genç, Şura
The clickbait strategy, used mostly in headlines and teaser messages, aims to attract people’s attention and make them click on a link by using intriguing expressions with various text-related features. Clickbait, which has become very common in recent years, especially on social media, is a major problem for the flow of information. Since the information promised in a clickbait headline is generally not included in the main text, clickbait headlines disappoint readers and are problematic for the ethics of journalism. In this thesis, we constructed a Turkish dataset, ClickbaitTR, with 48,060 samples, including headlines of Turkish news sources extracted from Twitter, and made it publicly available. Various machine learning algorithms such as Artificial Neural Network (ANN), Logistic Regression (LR), Random Forest (RF), Long Short-Term Memory Network (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), and Ensemble Classifier (EC) were applied to the dataset to detect clickbait headlines. The results show that the BiLSTM performs best, detecting clickbait headlines with 97% accuracy, followed by the LSTM, the ANN, and the Ensemble Classifier with 93% accuracy. In addition to successful clickbait detection performance, this thesis presents linguistic and psychological analyses of clickbait sentences, with a focus on psychological mechanisms such as curiosity and interest. This thesis contributes to clickbait detection studies with the largest clickbait dataset and the best clickbait detection performance in Turkish.
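As a rough illustration of the kind of headline classification the abstract describes, the sketch below trains a logistic regression on bag-of-words counts using only the Python standard library. It is not the thesis's implementation: the toy English headlines, the whitespace tokenizer, the learning rate, and all function names are assumptions made for this example, and the thesis's actual features and hyperparameters are not given here.

```python
import math
from collections import Counter

# Hypothetical toy headlines (1 = clickbait, 0 = not clickbait).
# These are illustrative only and are NOT taken from ClickbaitTR.
TRAIN = [
    ("you will not believe what happened next", 1),
    ("this simple trick will shock you", 1),
    ("what she did next will amaze you", 1),
    ("central bank raises interest rates", 0),
    ("parliament approves new budget law", 0),
    ("city council opens new metro line", 0),
]

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def build_vocab(samples):
    # Map every training word to a feature index.
    words = sorted({w for text, _ in samples for w in tokenize(text)})
    return {w: i for i, w in enumerate(words)}

def featurize(text, vocab):
    # Sparse bag-of-words vector: {feature index: count}; unknown words are dropped.
    counts = Counter(tokenize(text))
    return {vocab[w]: c for w, c in counts.items() if w in vocab}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weights, bias):
    # Probability that a featurized headline is clickbait.
    z = bias + sum(weights[i] * v for i, v in x.items())
    return sigmoid(z)

def train(samples, vocab, epochs=200, lr=0.5):
    # Plain stochastic gradient descent on the logistic loss.
    weights = [0.0] * len(vocab)
    bias = 0.0
    data = [(featurize(t, vocab), y) for t, y in samples]
    for _ in range(epochs):
        for x, y in data:
            error = predict_proba(x, weights, bias) - y
            for i, v in x.items():
                weights[i] -= lr * error * v
            bias -= lr * error
    return weights, bias

def accuracy(samples, vocab, weights, bias):
    # Fraction of samples whose thresholded prediction matches the label.
    correct = sum(
        int(predict_proba(featurize(t, vocab), weights, bias) >= 0.5) == y
        for t, y in samples
    )
    return correct / len(samples)

vocab = build_vocab(TRAIN)
weights, bias = train(TRAIN, vocab)
train_accuracy = accuracy(TRAIN, vocab, weights, bias)
```

Accuracy here is measured on the same six toy headlines used for training, so it says nothing about the 97% and 93% figures reported above, which come from the thesis's own experiments on ClickbaitTR.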

Suggestions

Detecting "Clickbait" News on Social Media Using Machine Learning Algorithms
Genc, Sura; Sürer, Elif (2019-01-01)
Clickbait, which has become very common on social media in recent years, is a technique that uses exaggerated and unrealistic headlines to manipulate people and draw them to a website. Since the content mentioned in the title is not presented in the main text, or the content of the text is low quality, the links people click often disappoint them. In this study, we attempt to detect clickbait in Turkish news using Twitter posts. For this purpose, headlines of news were collected from Twitter accounts o...
Turkey's Integration to Research Networks and Research Networks’ Effects on Scientific Studies: The Case of METU
Aydınoğlu, Arsev Umur (2021-10-08)
The literature on research and development (R&D), economic development, and policy design shows that understanding the dynamics of research processes, that is, the “science of science,” is crucial for shaping effective policies. As Wagner notes, "modern science is intensely social," and collaboration is a good way to secure the necessary resources, including physical capital, knowledge, and skills. In parallel, the growing role of research networks and the globalization of science, the science of sci...
Word Embedding Based Event Detection on Social Media
Ertugrul, Ali Mert; Velioglu, Burak; Karagöz, Pınar (2017-06-23)
Event detection from social media messages is conventionally based on clustering the message contents. The most basic approach represents messages as term vectors constructed through traditional natural language processing (NLP) methods and then assigns weights to terms, generally based on frequency. In this study, we use a neural feature extraction approach and explore the performance of event detection with word embeddings. Using a corpus of tweets, message terms...
Semantic Expansion of Hashtags for Enhanced Event Detection in Twitter
Özdikiş, Özer; Karagöz, Pınar; Oğuztüzün, Mehmet Halit Seyfullah (2012-09-09)
In this work, we present an event detection method for Twitter based on clustering of hashtags and introduce an enhancement technique that uses the semantic similarities between the hashtags. To this end, we devised two methods for tweet vector generation and evaluated their effect on clustering and event detection performance in comparison to word-based vector generation methods. By analyzing the contexts of hashtags and their co-occurrence statistics with other words, we identify their paradigmatic relation...
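Awareness analysis of dynamic user groups in design education levels 1 and 2 and the process of developing a new user guide [Tasarım öğrenimi seviye 1 ve 2’de yer alan dinamik kullanıcı grupların bilinç analizi ve yeni kullanım kılavuzu geliştirilmesi süreci]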
Gürsu, Hakan (Başkent Üniversitesi-Güzel Sanatlar Tasarım ve Mimarlık Fakültesi; 2018-12-19)
Against the dynamic demands of the digitalizing and constantly changing information age, access to traditional basic design concepts and codes, which do not change at the same pace, is ironically becoming harder. It is increasingly observed that problems persist in process development and process perception in foundation-level design education. On the other hand, the problems that new user groups, who have acquired the expectation/habit of reaching information quickly, face with objective information sources and with the process of accessing that information; primarily ...
Citation Formats
Ş. Genç, “Turkish clickbait detection in social media via machine learning algorithms,” M.S. - Master of Science, Middle East Technical University, 2021.