A New WAP-tree based sequential pattern mining algorithm for faster pattern extraction

2012
Önal, Kezban Dilek
Sequential pattern mining underlies solutions to problems in domains such as bioinformatics and web usage mining, and research in this field continues to seek faster algorithms. WAP-Tree based algorithms, which emerged from the web usage mining literature, have shown remarkable performance on single-item sequence databases. In this study, we investigated the application of WAP-Tree based mining to multi-item sequential pattern mining and designed an extension of the WAP-Tree data structure for multi-item sequence databases, the MULTI-WAP-Tree. In addition, we propose a new mining strategy on the WAP-Tree that combines a hybrid traversal of the search space of possible sequences with a new early pruning idea on the Pattern Tree called the Sibling Principle. Two algorithms, FOF-PT and MULTI-FOF-PT, were developed by applying this strategy to the WAP-Tree and the MULTI-WAP-Tree, respectively. Experiments showed that FOF-PT outperforms both the other WAP-Tree based algorithms and PrefixSpan in terms of execution time. Moreover, the experimental results revealed that MULTI-FOF-PT finds patterns faster than PrefixSpan on dense multi-item sequence databases with small alphabets.
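
For readers unfamiliar with the underlying data structure, the following is a minimal sketch of how a basic single-item WAP-Tree can be built, assuming the usual two-pass construction from the web usage mining literature: count event supports, then insert each sequence, restricted to its frequent events, into a prefix tree with per-node counts. It does not reproduce the MULTI-WAP-Tree, FOF-PT, or the Sibling Principle described above. The names `Node` and `build_wap_tree` and the sample sequences are illustrative assumptions, and the header links used during mining are omitted.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """One tree node: an event label, a count, and child nodes keyed by event."""
    event: str
    count: int = 0
    children: dict = field(default_factory=dict)


def build_wap_tree(sequences, min_support):
    """Build a basic WAP-Tree (hypothetical helper, not the thesis code).

    sequences: iterable of single-item event sequences (strings or lists).
    min_support: minimum number of sequences an event must occur in.
    """
    # Pass 1: support of an event = number of sequences that contain it.
    support = Counter()
    for seq in sequences:
        support.update(set(seq))
    frequent = {e for e, c in support.items() if c >= min_support}

    # Pass 2: insert each sequence, skipping infrequent events,
    # and increment the count of every node along its path.
    root = Node(event="<root>")
    for seq in sequences:
        node = root
        for event in seq:
            if event not in frequent:
                continue
            child = node.children.get(event)
            if child is None:
                child = Node(event=event)
                node.children[event] = child
            child.count += 1
            node = child
    return root, frequent


# Illustrative access sequences (assumed data), minimum support of 3 sequences.
sequences = ["abdac", "eaebcac", "babfaec", "afbacfc"]
root, frequent = build_wap_tree(sequences, min_support=3)
```

Sequences that share a prefix of frequent events share a path in the tree, which is what lets WAP-Tree based miners avoid rescanning the original database.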
Citation Formats
K. D. Önal, “A New WAP-tree based sequential pattern mining algorithm for faster pattern extraction,” M.S. - Master of Science, Middle East Technical University, 2012.