Spam detection by using network and text embedding approaches

2019
Yılmaz, Cennet Merve
Authenticity and reliability of the information spread over cyberspace are becoming increasingly important, especially in e-commerce, because potential customers check online reviews and customer feedback before making a purchasing decision. Although this information is easily accessible through related websites, the lack of any verification of the reviews' authenticity raises concerns about their reliability. Moreover, fraudulent users disseminate disinformation to deceive people into acting against their own interest. Detecting fake and unreliable reviews is therefore a crucial problem. In this study, we analyze and compare three spam review detection approaches: DocRep, which uses review text only; NodeRep, which uses network information only; and SPR2EP, the framework proposed in this study, which combines knowledge extracted from the textual content of the reviews with information obtained by exploiting the underlying reviewer-product network structure. An important contribution of this study is that the proposed framework, SPR2EP, benefits from both review text and network information. In the SPR2EP approach, feature vectors are first learned for each review, reviewer and product using state-of-the-art algorithms for learning document and node embeddings; these vectors are then fed into a classifier to identify opinion spam. This minimizes the feature engineering effort. The effectiveness of our framework, and of approaches that utilize network embeddings, over existing techniques in detecting spam reviews is demonstrated on three different data sets containing online reviews.
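The pipeline described in the abstract (one vector per review, reviewer and product, combined and passed to a classifier) can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not name the embedding algorithms or the classifier, so the vectors below are hand-written placeholders standing in for learned embeddings, the combination by concatenation is an assumption, and the plain logistic regression is only one possible classifier. All names (`review_vecs`, `build_features`, `train_logistic`, and so on) are hypothetical.

```python
import math

# Placeholder embeddings (assumption): in practice these would be learned by
# document- and node-embedding algorithms, which the abstract does not name.
review_vecs = {"r1": [0.9, 0.1], "r2": [0.8, 0.2], "r3": [0.1, 0.9], "r4": [0.2, 0.7]}
reviewer_vecs = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
product_vecs = {"p1": [0.5, 0.5], "p2": [0.4, 0.6]}

# Toy labelled data: (review id, reviewer id, product id, label), 1 = spam.
reviews = [("r1", "u1", "p1", 1), ("r2", "u1", "p2", 1),
           ("r3", "u2", "p1", 0), ("r4", "u2", "p2", 0)]


def build_features(review_id, reviewer_id, product_id):
    """Concatenate the review, reviewer and product vectors (assumed combination)."""
    return review_vecs[review_id] + reviewer_vecs[reviewer_id] + product_vecs[product_id]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit a logistic regression with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - target  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Return 1 (spam) if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


X = [build_features(r, u, p) for r, u, p, _ in reviews]
y = [label for _, _, _, label in reviews]
w, b = train_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
print(predictions)
```

Dropping the reviewer and product vectors from `build_features` would correspond to a text-only setup, and dropping the review vector to a network-only setup, which mirrors the three-way comparison described in the abstract only at the level of which inputs are used.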

Citation Formats
C. M. Yılmaz, “Spam detection by using network and text embedding approaches,” Thesis (M.S.) -- Graduate School of Natural and Applied Sciences. Industrial Engineering., Middle East Technical University, 2019.