Discovering more accurate frequent web usage patterns

2008-09-01
Bayır, Murat Ali
Toroslu, İsmail Hakkı
Coşar, Ahmet
Fidan, Güven
Web usage mining is a type of web mining that applies data mining techniques to discover valuable information from the navigation behavior of World Wide Web users. As in classical data mining, data preparation and pattern discovery are the main issues in web usage mining. The first phase of web usage mining is the data processing phase, which includes reconstructing sessions from server logs. The success of session reconstruction directly affects the quality of the frequent patterns discovered in the next phase. In reactive web usage mining techniques, the source data consists of web server logs and the topology of the web pages served by the web server domain. Other kinds of information collected during a user's interactive browsing of the web site, such as cookies or web logs containing similar information, are not used. The next phase of web usage mining is discovering frequent user navigation patterns. In this phase, pattern discovery methods are applied to the reconstructed sessions obtained in the first phase. In this paper, we propose a frequent web usage pattern discovery method that can be applied after the session reconstruction phase. To compare the accuracy of the session reconstruction phase and the pattern discovery phase, we used an agent simulator, which models the behavior of web users and generates both web user navigation and the log data kept by the web server.
ARXIV
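
The two phases named in the abstract can be illustrated with a minimal sketch. It is not the method proposed in the paper: it assumes a simple time-gap heuristic for session reconstruction (a common choice, here with a hypothetical 30-minute threshold), ignores site topology, and counts contiguous page pairs as a stand-in for pattern discovery. The log records, function names, and thresholds are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical log records: (user_id, timestamp_in_seconds, page)
LOG = [
    ("u1", 0, "A"), ("u1", 60, "B"), ("u1", 120, "C"),
    ("u1", 5000, "A"), ("u1", 5060, "B"),
    ("u2", 10, "A"), ("u2", 70, "B"), ("u2", 130, "D"),
]

def reconstruct_sessions(log, max_gap=1800):
    """Phase 1 (assumed heuristic): start a new session whenever the gap
    between two consecutive requests of a user exceeds max_gap seconds."""
    by_user = defaultdict(list)
    for user, ts, page in sorted(log, key=lambda r: (r[0], r[1])):
        by_user[user].append((ts, page))
    sessions = []
    for requests in by_user.values():
        current = [requests[0][1]]
        for (prev_ts, _), (ts, page) in zip(requests, requests[1:]):
            if ts - prev_ts > max_gap:
                sessions.append(current)
                current = []
            current.append(page)
        sessions.append(current)
    return sessions

def frequent_paths(sessions, length=2, min_support=2):
    """Phase 2 (simplified): keep contiguous page sequences of the given
    length that occur in at least min_support sessions."""
    support = Counter()
    for session in sessions:
        # A set ensures each session contributes at most once per path.
        seen = {tuple(session[i:i + length])
                for i in range(len(session) - length + 1)}
        support.update(seen)
    return {path: n for path, n in support.items() if n >= min_support}

sessions = reconstruct_sessions(LOG)
patterns = frequent_paths(sessions)
print(sessions)
print(patterns)
```

Because the second phase only sees what the first phase produces, a different `max_gap` would split or merge sessions and change the supports counted here, which is the dependency the abstract points to.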

Suggestions

A new approach for reactive web usage data processing
Bayir, Murat Ali; Toroslu, İsmail Hakkı; Coşar, Ahmet (2006-01-01)
© 2006 IEEE. Web usage mining exploits data mining techniques to discover valuable information from navigation behavior of World Wide Web (WWW) users. The required information is captured by web servers and stored in web usage data logs. The first phase of web usage mining is the data processing phase. In the data processing phase, first, relevant information is filtered from the logs. After that, sessions are reconstructed by using heuristics that select and group requests belonging to the same user session...
Optimization of an online course with web usage mining
Akman, LE; Akkan, B; Baykal, Nazife (2004-02-18)
The huge amount of information existing in the World Wide Web constitutes an ideal environment to implement data mining techniques. Web mining is the mining of web data. There are different applications of web mining: web content mining, web structure mining and web usage mining. In our study we analyzed an online course by web usage mining techniques in order to optimize the navigation paths, the duration of the time spent on each page and the number of visits throughout the semester of the course. Moreove...
Improving pattern quality in web usage mining by using semantic information
Karagöz, Pınar (Springer Science and Business Media LLC, 2012-03-01)
Frequent Web navigation patterns generated by using Web usage mining techniques provide valuable information for several applications such as Web site restructuring and recommendation. In conventional Web usage mining, semantic information of the Web page content does not take part in the pattern generation process. In this work, we investigate the effect of semantic information on the patterns generated for Web usage mining in the form of frequent sequences. To this aim, we developed a technique and a fram...
A New WAP-tree based sequential pattern mining algorithm for faster pattern extraction
Önal, Kezban Dilek; Şenkul, Pınar; Department of Computer Engineering (2012)
Sequential pattern mining constitutes a basis for solution of problems in various domains like bio-informatics and web usage mining. Research on this field continues seeking faster algorithms. WAP-Tree based algorithms that emerged from web usage mining literature have shown a remarkable performance on single-item sequence databases. In this study, we investigated application of WAP-Tree based mining to multi-item sequential pattern mining and we designed an extension of WAP-Tree data structure for multi-it...
Improving Efficiency of Sequence Mining by Combining First Occurrence Forest (FOF) Strategy and Sibling Principle
Onal, Kezban Dilek; Karagöz, Pınar (2014-06-04)
Sequential pattern mining is one of the basic problems in data mining and it has many applications in web mining. The WAP-Tree (Web Access Pattern Tree) data structure provides a compact representation of single-item sequence databases. WAP-Tree based algorithms have shown notable execution time and memory consumption performance on mining single-item sequence databases. We propose a new algorithm FOF-SP, a WAP-Tree based algorithm which combines an early pruning strategy called "Sibling Principle" from th...
Citation Formats
M. A. Bayır, İ. H. Toroslu, A. Coşar, and G. Fidan, “Discovering more accurate frequent web usage patterns,” ARXIV, pp. 0–0, 2008, Accessed: 00, 2021. [Online]. Available: https://hdl.handle.net/11511/76430.