Clustering based personality prediction on Turkish Tweets

2019
Tutaysalgır, Esen
In this thesis, we present a framework for predicting the personality traits of users from their tweets written in Turkish. The prediction model is constructed with a clustering-based approach. We show how to extract linguistic features from tweet data and how to adapt TF-IDF weighting and word embeddings to Turkish tweets. Because the model is based on linguistic features, it is language-specific: it uses features that are applicable to the Turkish language and that reflect the writing style of Turkish Twitter users. Our approach uses the anonymous Big Five (BIG5) questionnaire scores of volunteer participants as ground truth to build a personality model from Twitter posts. Experimental results show that the constructed model can predict the personality traits of Turkish Twitter users with relatively small errors.
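The abstract does not specify the clustering algorithm, the distance measure, or the exact feature set, so the following is only a minimal sketch of the general idea: weight each user's tokens with TF-IDF, cluster the users, and predict a new user's trait score as the mean questionnaire score of the nearest cluster. The k-means variant, the cosine similarity, the smoothed IDF formula, the toy tweets, and the 1-5 score scale are all assumptions made for illustration, and word embeddings are left out.

```python
import math
from collections import Counter

def vectorize(tokens, idf):
    """L2-normalised TF-IDF vector (as a dict) for one user's tokens."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}

def fit_tfidf(docs):
    """Compute a smoothed IDF table (an assumed variant) and the doc vectors."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / c) + 1.0 for t, c in df.items()}
    return idf, [vectorize(doc, idf) for doc in docs]

def cosine(a, b):
    """Dot product of two sparse vectors; equals cosine when both are unit length."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def centroid(vectors):
    """Unit-length mean direction of a group of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {t: w / norm for t, w in total.items()}

def nearest(vec, centroids):
    """Index of the centroid most similar to vec."""
    return max(range(len(centroids)), key=lambda i: cosine(vec, centroids[i]))

def kmeans(vectors, k, iterations=10):
    """Cosine k-means seeded with the first k vectors (deterministic, for the demo)."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = centroid(members)
    return centroids, labels

def predict(tokens, idf, centroids, labels, scores):
    """Predict a trait score as the mean score of the nearest cluster's members."""
    c = nearest(vectorize(tokens, idf), centroids)
    member_scores = [s for s, lab in zip(scores, labels) if lab == c]
    if not member_scores:  # empty cluster: fall back to the global mean
        member_scores = scores
    return sum(member_scores) / len(member_scores)

# Hypothetical, already-tokenised tweets per user and made-up extraversion scores.
users = [
    ["bugün", "parti", "arkadaş", "eğlence"],
    ["kitap", "ev", "sessiz", "okumak"],
    ["parti", "konser", "arkadaş"],
    ["kitap", "sessiz", "çay"],
]
extraversion = [4.5, 2.0, 4.0, 2.5]

idf, vectors = fit_tfidf(users)
centroids, labels = kmeans(vectors, k=2)
predicted = predict(["konser", "parti"], idf, centroids, labels, extraversion)
print(labels, predicted)
```

In practice the prediction error would be measured against held-out questionnaire scores, for example with mean absolute error per trait; the sketch above only shows the cluster-then-average step.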

Suggestions

Clustering based personality prediction on Turkish tweets
Tutaysalgir, Esen; Karagöz, Pınar; Toroslu, İsmail Hakkı (2019-08-30)
In this paper, we present a framework for predicting personality traits by analyzing tweets written in Turkish. The prediction model is constructed with a clustering-based approach. Since the model is based on linguistic features, it is language-specific. The prediction model uses features applicable to the Turkish language and related to the writing style of Turkish Twitter users. Our approach uses anonymous BIG5 questionnaire scores of volunteer participants as the ground truth in order to generate personalit...
Twitter Sentiment Analysis Experiments Using Word Embeddings on Datasets of Various Scales
Arslan, Yusuf; Kucuk, Dilek; Birtürk, Ayşe Nur (2018-06-15)
Sentiment analysis is a popular research topic in social media analysis and natural language processing. In this paper, we present the details and evaluation results of our Twitter sentiment analysis experiments, which are based on word embedding vectors such as word2vec and doc2vec and use an ANN classifier. In these experiments, we utilized two publicly available sentiment analysis datasets and four smaller datasets derived from these datasets, in addition to a publicly available trained vector model over ...
Transfer Learning Using Twitter Data for Improving Sentiment Classification of Turkish Political News
Kaya, Mesut; Fidan, Guven; Toroslu, İsmail Hakkı (2013-10-29)
In this paper, we aim to determine the overall sentiment of Turkish political columns. That is, our goal is to determine whether a whole document expresses a positive or negative opinion, regardless of its subject. To enhance classification performance, transfer learning is applied from unlabeled Twitter data to labeled political columns. A variation of self-taught learning has been proposed and implemented for the classification. Different machine learning techniques, including...
Sentiment Analysis of Turkish Political News
Kaya, Mesut; Fidan, Guven; Toroslu, İsmail Hakkı (2012-12-07)
In this paper, sentiment classification techniques are applied to the domain of political news, using columns from different Turkish news sites. We compare four supervised machine learning algorithms (Naive Bayes, Maximum Entropy, SVM, and the character-based N-Gram Language Model) for sentiment classification of Turkish political columns. We also discuss in detail the problem of sentiment classification in the political news domain. We observe from empirical findings that the Maximum Entropy and N-Gr...
Real-Time Lexicon-Based Sentiment Analysis Experiments On Twitter With A Mild (More Information, Less Data)
Arslan, Yusuf; Birtürk, Ayşe Nur; Djumabaev, Bekjan; Kucuk, Dilek (2017-12-14)
Sentiment analysis of Twitter data is a well-studied area; however, there is a need to explore the effectiveness of real-time approaches on small datasets that include only popular and targeted tweets. In this paper, we employ several sentiment analysis techniques that use dynamic dictionaries and models, and we perform experiments on limited but relevant datasets to understand the popularity of some terms and the opinions of users about them. The results of our experiments are promising.
Citation Formats
E. Tutaysalgır, “Clustering based personality prediction on Turkish Tweets,” M.S. thesis, Graduate School of Natural and Applied Sciences, Computer Engineering, Middle East Technical University, 2019.