Clustering based personality prediction on Turkish Tweets

Tutaysalgır, Esen
In this thesis, we present a framework for predicting the personality traits of Twitter users from their tweets written in Turkish. The prediction model is constructed with a clustering-based approach. We show how to extract linguistic features from tweet data and how to adapt TF-IDF weighting and word embeddings to Turkish tweets. Because the model is based on linguistic features, it is language specific: it uses features that are applicable to the Turkish language and that reflect the writing style of Turkish Twitter users. Our approach uses the anonymous BIG5 questionnaire scores of volunteer participants as ground truth to build a personality model from Twitter posts. Experimental results show that the constructed model can predict the personality traits of Turkish Twitter users with relatively small errors.
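The general idea of a clustering-based predictor over TF-IDF features can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not state the clustering algorithm, the feature set, or any parameters, so the k-means variant, the smoothed IDF formula, the whitespace tokenizer, and all function names below are assumptions made for illustration. The tweets and trait scores are toy values, not data from the study.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase + whitespace split (assumption). str.lower() is not
    # Turkish-aware (e.g. "I" becomes "i", not "ı"), so a real pipeline
    # would need Turkish-specific normalization.
    return text.lower().split()


def fit_idf(token_lists):
    # Smoothed inverse document frequency over the training documents.
    n = len(token_lists)
    df = Counter(term for toks in token_lists for term in set(toks))
    vocab = sorted(df)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    return vocab, idf


def vectorize(tokens, vocab, idf):
    # Relative term frequency times IDF; out-of-vocabulary terms are ignored.
    tf = Counter(tokens)
    total = len(tokens)
    return [(tf[t] / total) * idf[t] if total else 0.0 for t in vocab]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centroids):
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))


def kmeans(vectors, k, iters=10):
    # Deterministic farthest-first initialization, then standard k-means updates.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [nearest(v, centroids) for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def predict_trait(text, vocab, idf, centroids, labels, scores):
    # Predict a trait score as the mean questionnaire score of the nearest cluster.
    j = nearest(vectorize(tokenize(text), vocab, idf), centroids)
    member_scores = [s for s, lab in zip(scores, labels) if lab == j]
    return sum(member_scores) / len(member_scores)


# Toy example: one concatenated "tweet document" per user and one
# hypothetical trait score per user (illustrative values only).
user_docs = [
    "maç gol futbol gol",
    "futbol maç gol",
    "kitap roman şiir",
    "şiir kitap kitap",
]
trait_scores = [4.0, 3.0, 2.0, 1.0]

token_lists = [tokenize(d) for d in user_docs]
vocab, idf = fit_idf(token_lists)
vectors = [vectorize(toks, vocab, idf) for toks in token_lists]
centroids, labels = kmeans(vectors, k=2)
predicted = predict_trait("gol maç", vocab, idf, centroids, labels, trait_scores)
print(labels, predicted)
```

In this sketch a new user is assigned to the closest cluster and inherits the average score of that cluster's members. A fuller system would predict each of the five traits separately and would combine TF-IDF with the other linguistic features and word embeddings mentioned in the abstract.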