NFA based regular expressıon matchıng on FPGA

Sert, Kamil
String matching is about finding all occurrences of a string within a given text, which is a classical but still a popular problem. String matching algorithms have important roles in various real world areas, such as web and security applications. In this work, we are interested in solving regular expression and hence string matching problem targeting especially the network intrusion detection systems (NIDS) field. An NIDS engine inspects both the header and the content of the packet and hence performs a type of deep packet inspection also. This inspection requires effective string matching techniques because each network packet should be checked and compared against maybe hundreds of possible malicious attacks at line speed. In this thesis, a detailed literature analysis is presented first that explains and classifies regular expression matching studies. Among these, studies exist that presents nondeterministic finite automata (NFA) based architectures and their novel mappings onto FPGA. In our study we select one such study [1] and further modify vi and enhance the NFA architecture already proposed. The reference study uses the modified-McNaughton-Yamada-algorithm and maps the resulting NFA into structural HDL for FPGA implementation. Our modification proposes to use a 2-character based matching structure that yields better memory utilization. With our approach, the circuit for the NFA representation needs less number of states (hence flip-flops) and LUTs to perform the 2-character regular expression matching process. Within the scope of this thesis, an extensive evaluation study is performed using the well-known Snort IDS ruleset and the worst case evaluation is done using some intuitively and synthetically created regular expressions targeting Xilinx 7-series FPGAs. Evaluation results are obtained using various performance metrics.


NFA Based Regular Expression Matching on FPGA
Sert, Kamil; Bazlamacci, Cuneyt F. (2021-01-01)
String matching is about finding all occurrences of a string within a given text. String matching algorithms have important roles in various real world areas such as web and security applications. In this work, we are interested in solving regular expression matching hence a more general form of string matching problem targeting especially the field of network intrusion detection systems (NIDS). In our work, we enhance a non-deterministic finite automata (NFA) based method on FPGA considerably. We propose t...
Assignment problem and its variations
Gülek, Mehmet; Toroslu, İsmail Hakkı; Department of Computer Engineering (2007)
We investigate the assignment problem, which is the problem of matching two sets with each other, optimizing a given function on the possible matchings. Among different definitions, a graph theoretical definition of the linear sum assignment problem is as follows: Given a weighted complete bipartite graph, find a maximum (or minimum) one-to-one matching between the two equal-size sets of the graph, where the score of a matching is the total weight of the matched edges. We investigate extensions and variatio...
An asynchronous system design and implementation of an FPGA
Ayyıldız, Nizam; Güran, Hasan; Department of Electrical and Electronics Engineering (2006)
Field Programmable Gate Arrays (FPGAs) are widely used in prototyping digital circuits. However commercial FPGAs are not very suitable for asynchronous design. Both the architecture of the FPGAs and the synthesis tools are mostly tailored to synchronous design. Therefore potential advantages of the asynchronous circuits could not be observed when they are implemented on commercial FPGAs. This is shown by designing an asynchronous arithmetic logic unit (ALU), implemented in the style of micropipelines, on th...
Improving search result clustering by integrating semantic information from Wikipedia
Çallı, Çağatay; Üçoluk, Göktürk; Şehitoğlu, Onur Tolga; Department of Computer Engineering (2010)
Suffix Tree Clustering (STC) is a search result clustering (SRC) algorithm focused on generating overlapping clusters with meaningful labels in linear time. It showed the feasibility of SRC but in time, subsequent studies introduced description-first algorithms that generate better labels and achieve higher precision. Still, STC remained as the fastest SRC algorithm and there appeared studies concerned with different problems of STC. In this thesis, semantic relations between cluster labels and documents ar...
Improvement of corpus-based semantic word similarity using vector space model
Esin, Yunus Emre; Alpaslan, Ferda Nur; Department of Computer Engineering (2009)
This study presents a new approach for finding semantically similar words from corpora using window based context methods. Previous studies mainly concentrate on either finding new combination of distance-weight measurement methods or proposing new context methods. The main di fference of this new approach is that this study reprocesses the outputs of the existing methods to update the representation of related word vectors used for measuring semantic distance between words, to improve the results further. ...
Citation Formats
K. Sert, “NFA based regular expressıon matchıng on FPGA,” M.S. - Master of Science, Middle East Technical University, 2018.