An efficient and novel detection technique for next generation web-based exploitation kits

Süren, Emre
The prevalence and continuously evolving technical sophistication of Exploit Kits (EKs) are among the most challenging shifts in the modern cybercrime landscape. Over the last few years, malware infections via drive-by download attacks have been orchestrated through EK infrastructures. An EK serves various types of malicious content via several threat vectors for a variety of criminal purposes, which are mostly monetary-centric. In this dissertation, an in-depth discussion of the EK philosophy and internals is provided. A content analysis is introduced for the EK families, in which special context-aware properties are identified. A key observation is that while webpage contents differ drastically between distinct intrusions executed through the same EK, the patterns in the URL addresses stay similar. This is because URLs auto-generated by EK platforms follow specific templates. This dissertation proposes a new lightweight technique that quickly categorizes unknown EK families with high accuracy by leveraging machine learning algorithms with novel URL features. Rather than analyzing each URL individually, the proposed overall URL patterns approach examines all URLs associated with an EK infection. The method has been evaluated on a popular, publicly available dataset that contains 240 different real-world infection cases involving over 2250 URLs; the incidents occurred throughout 2016 and are linked to the 4 major EK flavors. In the experiments, the system achieves up to 93.7% clustering accuracy and up to 100% classification accuracy with the estimators tested.
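The idea of examining all URLs of an infection together, rather than one URL at a time, can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the feature names (`path_length`, `path_depth`, `query_params`, `digit_ratio`), the use of a simple mean as the aggregate, and the example URLs are all assumptions chosen for illustration.

```python
from statistics import mean
from urllib.parse import urlsplit, parse_qsl


def url_features(url: str) -> dict[str, float]:
    """Per-URL lexical features (illustrative; not the dissertation's feature set)."""
    parts = urlsplit(url)
    path_segments = [s for s in parts.path.split("/") if s]
    query_pairs = parse_qsl(parts.query, keep_blank_values=True)
    return {
        "path_length": float(len(parts.path)),
        "path_depth": float(len(path_segments)),
        "query_params": float(len(query_pairs)),
        "digit_ratio": sum(c.isdigit() for c in url) / max(len(url), 1),
    }


def infection_features(urls: list[str]) -> dict[str, float]:
    """Aggregate per-URL features over all URLs of one infection case."""
    if not urls:
        raise ValueError("an infection case needs at least one URL")
    rows = [url_features(u) for u in urls]
    # One vector per infection: the mean of each per-URL feature.
    vector = {f"mean_{k}": mean(r[k] for r in rows) for k in rows[0]}
    vector["url_count"] = float(len(urls))
    return vector


# Hypothetical URLs, for illustration only.
case = [
    "http://example.test/index.php?id=12&q=abc",
    "http://example.test/a/b/c.swf",
]
print(infection_features(case))
```

Each infection case then yields a single fixed-length vector, which could be passed to a clustering or classification estimator of the reader's choice.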