Hide/Show Apps

WaPUPS: Web access pattern extraction under user-defined pattern scoring

Alkan, Oznur Kirmemis
Karagöz, Pınar
Extracting patterns from web usage data helps to facilitate better web personalization and web structure readjustment. The classical frequency-based sequence mining techniques consider only the binary occurrences of web pages in sessions that result in the extraction of many patterns that are not informative for users. To handle this problem, utility-based mining technique has emerged, which assigns non-binary values, called utilities, to web pages and calculates pattern utilities accordingly. However, the utility of a pattern cannot always be determined from distinct web page utilities. For instance, the number of distinct users that traverse an extracted pattern or some demographic data about those users may affect the value of the extracted patterns. However, such information cannot be calculated directly from web page utilities. In this paper, we present a new approach based on a user-defined scoring mechanism so as to extract patterns from web log data. The proposed approach can limit the size of the search space; therefore it has the ability to extract patterns even for large and sparse datasets. The framework is hybrid in the sense that it combines clustering with a heuristic-based pattern extraction algorithm. Substantial experiments on real datasets show that the proposed solution effectively discovers patterns under user-defined evaluation.