Ransomware Detection Using Portable Executable Imports





Ransomware Detection, Portable Executable, DLL, Machine Learning, Feature Selection


Ransomware attacks have surged in recent years, wreaking havoc on both organizations and individuals. Driven by the lure of profit, particularly given the widespread use of cryptocurrencies, attackers continuously develop new evasion and obfuscation techniques to avoid detection. Because ransomware relies on seemingly benign functions such as encryption and file locking, and because it keeps evolving, it poses a formidable challenge for traditional signature-based detection. Consequently, there is a growing need to identify previously unseen ransomware strains, which calls for artificial intelligence (AI) that can discern the characteristics and objectives of ransomware. The adoption of AI hinges on the prior selection of distinguishing features. Because the intent of ransomware differs fundamentally from that of benign software, the structure of its Portable Executable (PE) files differs as well. This study posits that the imports used by PE files can serve as a discriminating factor between ransomware and benign files, and it explores machine learning models that detect ransomware by analysing the PE imports structure. To this end, the study trains seven machine learning classifiers: Random Forest, Logistic Regression, Naïve Bayes, Support Vector Machine, K-Nearest Neighbors, Gradient Boost, and Decision Tree. The models are trained on a dataset of carefully selected features derived from PE imports, then benchmarked and ranked on several evaluation metrics, including latency, accuracy, and confidence level. To be effective for ransomware detection, a model should deliver accurate, high-confidence predictions in near real time; in other words, it should exhibit low latency, high accuracy, and a high AUC.
Among the models, Logistic Regression emerges as the top performer, identifying ransomware programs with 98.5% accuracy and a confidence level of 98.6% at a latency of 0.998 milliseconds. This study affirms the efficacy of employing PE imports for ransomware detection.
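
The approach summarised above can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not specify the feature encoding or tooling, so this example assumes a binary "import present or absent" encoding, uses a small hand-written Bernoulli Naïve Bayes (chosen only because it fits in the standard library), and runs on invented toy samples. The names `SAMPLES`, `build_vocabulary`, `encode`, `train_bernoulli_nb`, `predict`, and `evaluate` are all hypothetical.

```python
import math
import time

# Hypothetical toy samples: (set of imported function names, label).
# Label 1 = ransomware, 0 = benign. Invented for illustration only;
# these are NOT taken from the study's dataset.
SAMPLES = [
    ({"CryptEncrypt", "CryptGenKey", "FindFirstFileW", "DeleteFileW"}, 1),
    ({"CryptEncrypt", "FindNextFileW", "MoveFileExW", "CryptAcquireContextW"}, 1),
    ({"CryptGenKey", "FindFirstFileW", "FindNextFileW", "WriteFile"}, 1),
    ({"CreateWindowExW", "GetMessageW", "DispatchMessageW", "WriteFile"}, 0),
    ({"CreateWindowExW", "LoadIconW", "GetMessageW", "ReadFile"}, 0),
    ({"RegOpenKeyExW", "ReadFile", "DispatchMessageW", "LoadIconW"}, 0),
]


def build_vocabulary(samples):
    """Collect every distinct import name seen in the samples, sorted."""
    return sorted(set().union(*(imports for imports, _ in samples)))


def encode(imports, vocabulary):
    """Binary feature vector: 1 if the file imports the function, else 0."""
    return [1 if name in imports else 0 for name in vocabulary]


def train_bernoulli_nb(X, y):
    """Fit a Bernoulli Naive Bayes model with Laplace (add-one) smoothing."""
    model = {}
    for label in set(y):
        rows = [x for x, target in zip(X, y) if target == label]
        log_prior = math.log(len(rows) / len(X))
        # Smoothed probability that each import is present in this class.
        probs = [(sum(col) + 1) / (len(rows) + 2) for col in zip(*rows)]
        model[label] = (log_prior, probs)
    return model


def predict(model, x):
    """Return the label with the highest log-posterior score."""
    def score(label):
        log_prior, probs = model[label]
        return log_prior + sum(
            math.log(p if bit else 1 - p) for bit, p in zip(x, probs)
        )
    return max(model, key=score)


def evaluate(model, X, y):
    """Return (accuracy, mean prediction latency in milliseconds per sample)."""
    start = time.perf_counter()
    predictions = [predict(model, x) for x in X]
    elapsed_ms = (time.perf_counter() - start) * 1000
    accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
    return accuracy, elapsed_ms / len(X)


vocabulary = build_vocabulary(SAMPLES)
X = [encode(imports, vocabulary) for imports, _ in SAMPLES]
y = [label for _, label in SAMPLES]
model = train_bernoulli_nb(X, y)

# Evaluated on the training samples only, so the accuracy here is not meaningful.
accuracy, latency_ms = evaluate(model, X, y)
print(f"accuracy={accuracy:.2f}, mean latency={latency_ms:.4f} ms/sample")
```

A realistic benchmark would evaluate on held-out files, report AUC alongside accuracy and latency, and repeat the same harness for each of the seven classifiers; those steps are omitted here to keep the sketch short.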

Author Biographies

Tanatswa Ruramai Dendere, University of Pretoria

Tanatswa Ruramai Dendere is an enthusiastic and passionate young professional who is excited to share insights from her academic journey and early career experiences. She holds a BSc in Computer Science from the University of Pretoria, where she is currently pursuing her BSc Hons in Computer Science degree.

Avinash Singh, University of Pretoria

Avinash Singh is an emerging researcher who obtained his BSc Hons and MSc in Computer Science, both with distinction, from the University of Pretoria. He is currently a lecturer in the Department of Computer Science and is pursuing a PhD. He has published at international conferences and in journals. He is also the head of the Intelligent Cyber Forensic Lab (ICFL) at the University of Pretoria and a member of the DigiForS research group.