Malicious code detection in android: the role of sequence characteristics and disassembling methods

2022-11-01
Gürkan Balıkçıoğlu, Pınar
Şırlancı, Melih
ACAR KÜÇÜK, ÖZGE
Ulukapi, Bulut
Turkmen, Ramazan K.
Acartürk, Cengiz
The acceptance and widespread use of the Android operating system has drawn the attention of both legitimate developers and malware authors, resulting in a large number of benign and malicious applications on various online markets. Because signature-based methods fall short of detecting malicious software effectively given the vast number of applications, machine learning techniques have become widespread in this field. In this context, reporting the accuracy values obtained from contingency tables has become a popular and efficient practice in malware detection studies, enabling researchers to evaluate their methodologies comparatively. In this study, we investigate and emphasize the factors that may affect the accuracy values of the models researchers build, particularly the disassembly method and the characteristics of the input data. First, we developed a model that approaches the malware detection problem from a Natural Language Processing (NLP) perspective using Long Short-Term Memory (LSTM). Then, we experimented with different base units (instruction, basic block, method, and class) and with representations of source code obtained from three commonly used disassembling tools (JEB, IDA, and Apktool), and examined the results. Our findings show that the disassembly method and the input representation affect the model results. More specifically, the datasets collected with Apktool achieved better results than those collected with the other two disassemblers.
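The idea of "base units" in the abstract can be illustrated with a minimal sketch. It assumes smali-style output of the kind Apktool produces and groups opcodes into instruction-, method-, and class-level sequences that a sequence model such as an LSTM could consume. The function name `split_smali_units`, the sample listing, and the choice to keep only opcodes are assumptions made for illustration; the abstract does not describe the authors' actual preprocessing, and basic-block splitting is omitted here.

```python
from typing import Dict, List

# Hypothetical smali listing for a single class (illustrative only).
SAMPLE = """\
.class public Lcom/example/Demo;
.super Ljava/lang/Object;

.method public constructor <init>()V
    .locals 0
    invoke-direct {p0}, Ljava/lang/Object;-><init>()V
    return-void
.end method

.method public static add(II)I
    .locals 1
    add-int v0, p0, p1
    return v0
.end method
"""


def split_smali_units(smali_text: str) -> Dict[str, List[List[str]]]:
    """Group the opcodes of one smali class into three levels of sequences."""
    methods: List[List[str]] = []
    current = None  # opcodes of the method being read; None outside a method
    for raw in smali_text.splitlines():
        line = raw.strip()
        if line.startswith(".method"):
            current = []
        elif line.startswith(".end method"):
            if current is not None:
                methods.append(current)
            current = None
        elif current is not None and line and line[0] not in ".:#":
            # Skip directives (.), labels (:), and comments (#);
            # keep only the opcode and drop the operands (an assumption).
            current.append(line.split()[0])
    all_opcodes = [op for method in methods for op in method]
    return {
        "instruction": [[op] for op in all_opcodes],  # one unit per opcode
        "method": methods,                            # one unit per method
        "class": [all_opcodes],                       # one unit per class
    }


units = split_smali_units(SAMPLE)
print(units["method"])
```

Each level yields sequences of a different length for the same application, which is one way the choice of base unit could change what a sequence model sees.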
INTERNATIONAL JOURNAL OF INFORMATION SECURITY

Suggestions

Static Malware Detection Using Stacked BiLSTM and GPT-2
Demirci, Deniz; Sahin, Nazenin; Sirlancis, Melih; Acartürk, Cengiz (2022-01-01)
In recent years, cyber threats and malicious software attacks have escalated on various platforms. Therefore, it has become essential to develop automated machine learning methods for defending against malware. In the present study, we propose stacked bidirectional long short-term memory (Stacked BiLSTM) and generative pre-trained transformer based (GPT-2) deep learning language models for detecting malicious code. We developed language models using assembly instructions extracted from .text sections o...
Comparison of classification algorithms for mobile malware detection: market metadata as input source
Baltacı, Nuray; Baykal, Nazife; Acartürk, Cengiz; Department of Information Systems (2014)
The prevalence of mobile devices has been catching the attention of malware authors especially for Android OS supported devices due to its user-centric security policy and open application development strategy for its official application market. In this study, an automated feature-based static analysis method was applied to detect malicious mobile applications on Android devices. The main purpose of the study is to investigate the contribution of other application market metadata to the detection of malici...
Malware Detection Using Transformers-based Model GPT-2
Şahin, Nazenin; Acartürk, Cengiz; Department of Cybersecurity (2021-11-17)
The variety of malicious content, besides its complexity, has significantly impacted end-users of the Information and Communication Technologies (ICT). To mitigate the effect of malicious content, automated machine learning techniques have been developed to proactively defend the user systems against malware. Transformers, a category of attention-based deep learning techniques, have recently been shown to be effective in solving various malware problems by mainly employing Natural Language Processing (NLP) ...
Static Malware Detection Using Stacked Bi-Directional LSTM
Demirci, Deniz; Acartürk, Cengiz; Department of Cybersecurity (2021-8-19)
The recent proliferation in the use of the Internet and personal computers has made it easier for cybercriminals to expose Internet users to widespread and damaging threats. In order to protect the end users against such threats, a security system must be proactive. It needs to detect malicious files or executables before reaching the end-user. To create an efficient and low-cost malware detection mechanism, in the present study, we propose stacked bidirectional long short-term memory (Stacked BiLSTM) based de...
Application of subspace clustering to scalable malware clustering
Işıktaş, Fatih; Betin Can, Aysu; Department of Information Systems (2019)
In recent years, the massive proliferation of malware variants has made it necessary to employ sophisticated clustering techniques in malware analysis. Choosing an appropriate clustering approach is very important, especially for rapidly and accurately mining clustering information from a large malware set with a high number of attributes. In this study, we propose a clustering method that is based on subspace clustering and graph matching techniques and presents an enhanced clustering ability and scalable runtime...
Citation Formats
P. Gürkan Balıkçıoğlu, M. Şırlancı, Ö. ACAR KÜÇÜK, B. Ulukapi, R. K. Turkmen, and C. Acartürk, “Malicious code detection in android: the role of sequence characteristics and disassembling methods,” INTERNATIONAL JOURNAL OF INFORMATION SECURITY, pp. 0–0, 2022, Accessed: 00, 2023. [Online]. Available: https://hdl.handle.net/11511/101818.