Effect of Using Regression in Sentiment Analysis

Onal, Itir
Ertuğrul, Ali Mert
This study analyzes the effect of using regression for sentiment classification of Twitter data; in other words, it examines whether the strength of sentiment better discriminates the classes. Since our dataset includes class confidence scores rather than discrete class labels, regression analysis was applied to each class separately. Each tweet was then assigned the class with the highest estimated confidence score. The feature set includes unigrams, POS tags, emoticons, sentiments of words, and POS tags of sentiments. The experimental results indicate that classification on discrete class labels performs much better than regression on continuous confidence scores.
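
The per-class regression followed by a maximum-score assignment can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the regression model, so ordinary least-squares linear regression (fitted here by gradient descent) is an assumption, as are the class names, the two binary toy features, and the confidence values, which are invented purely for illustration.

```python
# Sketch: one regressor per class on confidence scores, then argmax.
# Assumptions (not from the study): linear least squares as the regressor,
# the class names, the toy features, and the toy confidence scores.

def fit_linear(X, y, lr=0.5, epochs=1000):
    """Fit y ~ w.x + b by batch gradient descent on mean squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(model, x):
    """Estimated confidence score of one class for feature vector x."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def fit_per_class(X, confidences):
    """Train a separate regressor for each class's confidence scores."""
    return {cls: fit_linear(X, scores) for cls, scores in confidences.items()}

def classify(models, x):
    """Assign the class whose estimated confidence score is highest."""
    return max(models, key=lambda cls: predict(models[cls], x))

# Hypothetical features: [contains_positive_word, contains_negative_word]
X = [[1, 0], [0, 1], [0, 0], [1, 0], [0, 1], [0, 0]]
confidences = {
    "positive": [0.9, 0.1, 0.2, 0.8, 0.0, 0.3],
    "negative": [0.1, 0.8, 0.2, 0.0, 0.9, 0.1],
    "neutral":  [0.0, 0.1, 0.6, 0.2, 0.1, 0.6],
}

models = fit_per_class(X, confidences)
for tweet_features in ([1, 0], [0, 1], [0, 0]):
    print(tweet_features, classify(models, tweet_features))
```

The discrete-label baseline that the study compares against would instead train a classifier directly on each tweet's label, discarding the score magnitudes that the regressors above try to reproduce.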