A new approach for reactive web usage data processing

© 2006 IEEE. Web usage mining applies data mining techniques to discover valuable information from the navigation behavior of World Wide Web (WWW) users. The required information is captured by web servers and stored in web usage logs. The first phase of web usage mining is data processing: relevant information is first filtered from the logs, and sessions are then reconstructed using heuristics that select and group requests belonging to the same user session. Techniques that process requests after the web server has handled them are called "reactive", whereas "proactive" techniques perform the same (pre)processing while the user is interactively browsing the web site. Reactive session reconstruction uses "time" and "navigation" oriented heuristics. We propose combining these heuristics with "site topology" information to increase the accuracy of the reconstructed sessions. In this work, we have implemented an agent simulator that models the behavior of web users and generates both the user navigation and the log data kept by the web server. In this way, the actual user sessions are known, and we can accurately evaluate and compare the performance of alternative session reconstruction heuristics, which use only the web server log data. To the best of our knowledge, this paper is the first work to use such an agent simulator and is therefore able to evaluate different session reconstruction heuristics accurately. Using the agent simulator, we attempt to show that our new heuristic discovers more accurate sessions than previous heuristics.
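To make the idea of combining a time-oriented heuristic with site topology concrete, here is a minimal Python sketch. It is not the heuristic proposed in the paper, whose details the abstract does not give. The function name `reconstruct_sessions`, the `links` adjacency map, and the 600-second `max_gap` threshold are all assumptions chosen for illustration. The sketch starts a new session whenever the gap between consecutive requests exceeds the threshold, or whenever the requested page is not linked from any page already in the current session.

```python
from typing import Dict, List, Optional, Set, Tuple

# One logged request: (timestamp in seconds, page identifier).
Request = Tuple[float, str]


def reconstruct_sessions(
    requests: List[Request],
    links: Dict[str, Set[str]],
    max_gap: float = 600.0,  # assumed page-stay threshold, not from the paper
) -> List[List[str]]:
    """Split one user's requests into sessions (illustrative sketch only)."""
    sessions: List[List[str]] = []
    current: List[str] = []
    last_time: Optional[float] = None

    for ts, page in sorted(requests):
        # Time-oriented check: the request follows the previous one closely.
        within_gap = last_time is not None and ts - last_time <= max_gap
        # Topology check: some page already in the session links to this page.
        reachable = any(page in links.get(prev, set()) for prev in current)

        if current and within_gap and reachable:
            current.append(page)
        else:
            if current:
                sessions.append(current)
            current = [page]
        last_time = ts

    if current:
        sessions.append(current)
    return sessions


# Hypothetical site topology and log entries for a single user.
site_links = {"A": {"B"}, "B": {"C"}}
log = [(0, "A"), (30, "B"), (60, "C"), (5000, "A"), (5010, "D")]
sessions = reconstruct_sessions(log, site_links)
```

With this hypothetical input, the first three requests form one session, the long pause starts a second session at A, and D starts a third session because no page in the current session links to it. A simulator that records the true sessions, as described above, would let such output be compared against ground truth.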


Discovering more accurate frequent web usage patterns
Bayır, Murat Ali; Toroslu, İsmail Hakkı; Coşar, Ahmet; Fidan, Güven (2008-09-01)
Web usage mining is a type of web mining, which exploits data mining techniques to discover valuable information from navigation behavior of World Wide Web users. As in classical data mining, data preparation and pattern discovery are the main issues in web usage mining. The first phase of web usage mining is the data processing phase, which includes the session reconstruction operation from server logs. Session reconstruction success directly affects the quality of the frequent patterns discovered in the n...
Optimization of an online course with web usage mining
Akman, LE; Akkan, B; Baykal, Nazife (2004-02-18)
The huge amount of information on the World Wide Web constitutes an ideal environment for applying data mining techniques. Web mining is the mining of web data. There are different applications of web mining: web content mining, web structure mining and web usage mining. In our study we analyzed an online course using web usage mining techniques in order to optimize the navigation paths, the time spent on each page and the number of visits throughout the semester of the course. Moreove...
An Approach for automated verification of web applications using model checking and replaying the scenarios of counterexamples
Paçin, Yudum; Betin Can, Aysu; Department of Information Systems (2015)
The increasing use of web applications in various domains has raised the importance of methodologies for the verification of web applications. We propose a framework for the verification of web applications with respect to access control, link consistency and reachability properties using model checking. In this approach, users define the properties with the explanatory guidance of the user interface. The execution traces that lead to a property violation are translated to a script that automates the replaying of th...
Identifying the effectiveness of a web search engine with Turkish domain dependent impacts and global scale information retrieval improvements
Fidan, Güven; Demirörs, Onur; Yöndem, Meltem Turhan; Department of Information Systems (2012)
This study investigates the effectiveness of a Web search engine with newly added or improved features in Web search engine architecture. These features can be categorized into three groups: The impact of link quality and usage information on page importance calculation; the use of Turkish stemmer for indexing and query substitution; and, the use of thumbnails for Web search engine result visualization. As Web search engines have become the primary means for finding and accessing information on the Internet...
Advanced methods for result and score caching in web search engines
Yafay, Erman; Altıngövde, İsmail Sengör; Department of Computer Engineering (2019)
Search engines employ caching techniques in main memory to improve system efficiency and scalability. In this thesis, we focus on improving the cache performance for web search engines where our contributions can be separated into two main parts. Firstly, we investigate the impact of the sample size for frequency statistics for most popular cache eviction strategies in the literature, and show that cache performance improves with larger samples, i.e., by storing the frequencies of all (or, most of) the quer...
