Fast, efficient and dynamically optimized data and hardware architectures for string matching

Zengin, Salih
Many fields of computing such as network intrusion detection employ string matching modules (SMM) that search for a given set of strings in their input. An SMM is expected to produce correct outcomes while scanning the input data at high rates. Furthermore, the string sets that are searched for are usually large and their sizes increase steadily. In this thesis, motivated by the requirement of designing fast, accurate and efficient SMMs; we propose a number of SMM architectures that employ Bloom Filters to compactly represent the large amounts of data for the string sets. The proposed architectures address the well-known slowdown problem of the Bloom Filters because of the verifications of the positive matches. To this end, the first contribution of the thesis is Double Bloom Filter SMM (DBFSMM) which employs a second Bloom Filter which acts as a verification engine. We present an analysis, evaluation and implementation of the DBF-SMM.We further verify the required functionality of the DBF-SMM by modeling and testing the architecture in SystemC environment. Our analytical and implementation results demonstrate that DBF-SMM is superior to the existing Bloom Filter based SMM designs in terms of sustainability of the response time with high string storage efficiency and hardware scalability. DBF-SMM is designed for fixed size strings. The second contribution of the thesis is a finite automaton-based design that stores variable size strings as state transitions between characters. To this end, we first identify the classes of state transitions. We then modify the implementation of the well-known Aho-Corasick algorithm to effectively store and query the appropriate transition classes in a hardware architecture that features Bloom Filters and CAM-RAM look-up tables. The Bloom Filter in this architecture is realized as a DBF-SMM. The proposed SMM achieves a memory efficiency that is superior to all previous SMM designs together with fast and scalable hardware design.