Application of subspace clustering to scalable malware clustering

2019
Işıktaş, Fatih
In recent years, the massive proliferation of malware variants has made sophisticated clustering techniques a necessity in malware analysis. Choosing an appropriate clustering approach is especially important when clustering information must be mined rapidly and accurately from a large malware set with a high number of attributes. In this study, we propose a clustering method based on subspace clustering and graph matching techniques that offers enhanced clustering ability and scalable runtime performance for the analysis of large malware sets. Unlike traditional signature-based clustering techniques, our method aims to obtain more accurate malware clusters by comparing the internal structures of malware binaries. We also integrated a subspace clustering technique to scale and speed up the clustering process. To verify our method, we developed a system prototype that performs these clustering processes. The prototype provides a graphical user interface that allows users to navigate malware binaries and the generated clusters for detailed analysis. We performed clustering experiments on real malware sets using the prototype. The experimental results showed that a clustering method based on comparing the internal structure of malware binaries produces clustering outputs with 98% accuracy. The results also demonstrated that our method significantly improves the runtime performance of the clustering process without degrading clustering accuracy.
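The abstract does not specify the algorithms used, so the following is only a minimal sketch of the general two-stage idea it describes: narrow the candidates cheaply using a subset of attributes, then compare internal structure only within each candidate group. Everything here is an assumption made for illustration: the sample names, the `packer` and `sections` attributes, exact-match bucketing as a simplistic stand-in for subspace clustering, and Jaccard similarity over call-graph edge sets as a crude stand-in for graph matching. It is not the thesis's implementation.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Similarity of two edge sets (assumed stand-in for graph matching)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(samples, subspace, threshold=0.5):
    """samples maps a name to (features_dict, edge_set).

    Returns (clusters, comparisons), where comparisons counts the
    pairwise structural comparisons that were actually performed.
    """
    # Stage 1: coarse grouping on a small subset of attributes.
    # Hypothetical simplification: samples must match exactly on the subspace.
    buckets = defaultdict(list)
    for name, (features, _) in samples.items():
        key = tuple(features.get(attr) for attr in subspace)
        buckets[key].append(name)

    # Stage 2: structural comparison restricted to each bucket,
    # merging similar samples with a small union-find.
    parent = {name: name for name in samples}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    comparisons = 0
    for members in buckets.values():
        for a, b in combinations(members, 2):
            comparisons += 1
            if jaccard(samples[a][1], samples[b][1]) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for name in samples:
        groups[find(name)].append(name)
    return sorted(sorted(g) for g in groups.values()), comparisons


# Hypothetical toy data: attribute dicts plus call-graph edges per sample.
samples = {
    "a1": ({"packer": "upx", "sections": 3},
           {("main", "decrypt"), ("decrypt", "connect")}),
    "a2": ({"packer": "upx", "sections": 3},
           {("main", "decrypt"), ("decrypt", "connect"), ("main", "sleep")}),
    "b1": ({"packer": "none", "sections": 5},
           {("main", "scan"), ("scan", "encrypt")}),
    "b2": ({"packer": "none", "sections": 5},
           {("main", "scan"), ("scan", "encrypt")}),
    "c1": ({"packer": "upx", "sections": 3},
           {("start", "keylog")}),
}

clusters, comparisons = cluster(samples, subspace=("packer", "sections"))
print(clusters)
print(comparisons)
```

With five samples, comparing every pair would take 10 structural comparisons; the bucketing step in this toy run reduces that to 4. That reduction is the kind of runtime saving the abstract attributes to the subspace stage, although the real method and its accuracy figures depend on details not given here.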
Citation Formats
F. Işıktaş, “Application of subspace clustering to scalable malware clustering,” M.S. - Master of Science, 2019.