Malware Detection and Computer System Security

Related Publications

HASP 2024
Towards Effective Machine Learning Models for Ransomware Detection via Low-Level Hardware Information (acm)
Chutitep Woralert, Chen Liu, and Zander Blasingame
In Proceedings of the 13th International Workshop on Hardware and Architectural Support for Security and Privacy (HASP 2024), Austin, Texas, USA, November 2, 2024
Abstract: In recent years, ransomware attacks have grown dramatically. Continually emerging new variants make these threats increasingly difficult to track and mitigate with traditional detection methods. As the ransomware landscape evolves, there is a growing need for more advanced detection techniques. Neural networks have gained popularity as a way to enhance detection accuracy by leveraging low-level hardware information, such as hardware events, as features for identifying ransomware attacks. In this paper, we investigated several state-of-the-art supervised learning models, including XGBoost, LightGBM, MLP, and CNN, which are specifically designed to handle time series data or image-based data, for ransomware detection. We compared their detection accuracy, computational efficiency, and resource requirements for classification. Our findings indicate that LightGBM in particular offers a strong balance of high detection accuracy, fast processing speed, and low memory usage, making it highly effective for ransomware detection tasks.

AsianHOST 2023
A Comparison of One-class and Two-class Models for Ransomware Detection via Low-level Hardware Information (ieee)
Chutitep Woralert, Chen Liu and Zander Blasingame
Asian Hardware Oriented Security and Trust Symposium (AsianHOST), Tianjin, China, December 13 - 15, 2023
Abstract: Recent years have witnessed a dramatic growth in ransomware attacks. Even though many tools have been developed to help combat these attacks, new varieties of ransomware keep emerging and hence are difficult to keep track of with the traditional signature detection method. On the other hand, neural networks have been a popular technique that can be used to help enhance ransomware detection accuracy. The Long Short-Term Memory (LSTM) network, in particular, is able to learn the temporal aspect of time-series data, which makes it suitable for online behavioral analysis. In this paper, we compared the anomaly detection models trained with the LSTM semi-supervised learning method against the LSTM model trained with the supervised learning method for ransomware detection, utilizing low-level hardware information. Overall, we are able to detect ransomware attacks with a detection accuracy of 98.60% for the supervised learning two-class model and 89.65% for the semi-supervised one-class model. Both models achieve a very high detection rate across multiple ransomware families, with recall rates of 99.70% and 93.00% for the two-class and one-class models, respectively. The supervised learning model demonstrates exceptional capability in detecting unseen ransomware attacks, showing that live analysis of system behavior can overcome the limitation of static signature detection. The model is able to retain a recall rate of 99.52% on average when facing ransomware varieties it has not seen during training. We hope the proposed methods shed light on our fight against ransomware.

TOCS
HARD-Lite: A Lightweight Hardware Anomaly Realtime Detection Framework Targeting Ransomware (ieee)
Chutitep Woralert, Chen Liu, and Zander Blasingame
IEEE Transactions on Circuits and Systems I: Regular Papers, August 2023
DOI: 10.1109/TCSI.2023.3299532
(Extended version of AsianHOST 2022 paper)

AsianHOST 2022
HARD-Lite: A Lightweight Hardware Anomaly Realtime Detection Framework Targeting Ransomware (ieee)
Chutitep Woralert, Chen Liu and Zander Blasingame
The 2022 Asian Hardware Oriented Security and Trust Symposium (AsianHOST 2022), December 14-16, 2022, Singapore (Best Paper Nominee)
Abstract: Recent years have witnessed a surge in ransomware attacks. In particular, new variants of ransomware have continued to emerge, employing more advanced techniques to distribute the payload while avoiding detection. This renders the traditional static ransomware detection mechanism ineffective. In this paper, we present our Hardware Anomaly Realtime Detection - Lightweight (HARD-Lite) framework, which employs a semi-supervised machine learning method to detect ransomware using low-level hardware information. By using an LSTM network with a weighted majority voting ensemble and exponential moving average, we are able to take into consideration the temporal aspect of hardware-level information formed as time series in order to detect deviation in system behavior, thereby increasing the detection accuracy whilst reducing the number of false positives. Testing against various ransomware across multiple families, HARD-Lite has demonstrated remarkable effectiveness, detecting all cases tested successfully. What's more, with a hierarchical design that distributes the classifier from the user machine under monitoring to a server machine, HARD-Lite offers good scalability as well.
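
Illustrative sketch (not the authors' implementation): the abstract above combines an exponential moving average with a weighted majority vote. One way these two pieces could fit together is shown below in Python. The function names, the smoothing factor alpha, the per-model thresholds, and the vote weights are all assumptions made for illustration; the anomaly scores stand in for whatever per-time-step error each model in the ensemble produces.

```python
def ema(values, alpha):
    """Exponential moving average; alpha in (0, 1] weights the newest value."""
    smoothed = []
    prev = None
    for v in values:
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


def weighted_majority_vote(votes, weights):
    """Return 1 (anomalous) if the weight behind the 1-votes exceeds half the total."""
    total = sum(weights)
    anomalous = sum(w for v, w in zip(votes, weights) if v == 1)
    return 1 if anomalous > total / 2 else 0


def detect(scores_per_model, thresholds, weights, alpha=0.3):
    """Smooth each model's score series, threshold it, then vote per time step.

    scores_per_model: one list of anomaly scores per model, all the same length
    thresholds:       hypothetical per-model cut-offs on the smoothed score
    weights:          hypothetical per-model vote weights
    """
    smoothed = [ema(series, alpha) for series in scores_per_model]
    decisions = []
    for t in range(len(smoothed[0])):
        votes = [1 if s[t] > th else 0 for s, th in zip(smoothed, thresholds)]
        decisions.append(weighted_majority_vote(votes, weights))
    return decisions
```

Smoothing before thresholding means a single noisy spike is damped rather than raising an alarm on its own, which is one plausible reading of how the moving average helps reduce false positives.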

CF 2022
Where’s Waldo? Identifying Anomalous Behavior of Data-Only Attacks Using Hardware Features (acm)
Gildo Torres and Chen Liu
The 19th ACM International Conference on Computing Frontiers (CF 2022), May 17-19, 2022, Turin, Piedmont, Italy
Abstract: In recent years, multiple techniques have been proposed to defend computing systems against control-oriented attacks that hijack the control-flow of the victim program. Data-only attacks, on the other hand, are a less common and more subtle type of exploit that is more difficult to detect using traditional mitigation techniques that target control-oriented attacks. In this paper, we introduce a novel methodology for the detection of data-only attacks through modeling the execution behavior of an application using low-level hardware information collected as a data series during execution. One unique aspect of the proposed methodology is that it uses a compilation-flag-based approach to collect hardware counts, eliminating the need for manual code instrumentation. Another unique aspect is the introduction of a data compression algorithm as the classifier. Using several representative real-world data-only exploits, our experiments show that data-only attacks can be detected with high accuracy using the proposed methodology. We also performed analysis on how to select the most relevant hardware events for the detection of the studied data-only attack, as well as a quantitative study of hardware events' sensitivity to interference.
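
Illustrative sketch (not the authors' implementation): the abstract above does not name the compression algorithm it uses as a classifier. To show the general idea of compression-based classification, the Python sketch below substitutes a well-known stand-in, the normalized compression distance (NCD) computed with zlib. The helper names, the comma-separated serialization of event counts, and the threshold are all assumptions.

```python
import zlib


def serialize(counts):
    """Turn a series of hardware event counts into bytes (assumed encoding)."""
    return ",".join(str(c) for c in counts).encode("ascii")


def csize(data):
    """Compressed size in bytes at zlib's highest compression level."""
    return len(zlib.compress(data, 9))


def ncd(x, y):
    """Normalized compression distance: small when x and y share structure."""
    cx, cy, cxy = csize(x), csize(y), csize(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def is_anomalous(sample, baseline, threshold):
    """Flag a sample whose distance to the known-good baseline exceeds threshold."""
    return ncd(sample, baseline) > threshold
```

The intuition is that a trace resembling the baseline compresses well when concatenated with it, while a trace with different structure does not. In practice the threshold would have to be calibrated on known-good runs.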

ICANN 2021
Feature Creation Towards the Detection of Non-control-Flow Hijacking Attacks (acm)
Zander Blasingame, Chen Liu and Xin Yao
The 30th International Conference on Artificial Neural Networks (ICANN 2021), September 14-17, 2021
Abstract: With malware attacks on the rise, approaches using low-level hardware information to detect these attacks have been gaining popularity recently. This is achieved by using hardware event counts as features to describe the behavior of the software program. Then a classifier, such as support vector machine (SVM) or neural network, can be used to detect the anomalous behavior caused by malware attacks. The collected datasets to describe the program behavior, however, are normally imbalanced, as it is much easier to gather regular program behavior than abnormal behavior, which can lead to high false negative rates (FNR). In an effort to provide a remedy to this situation, we propose the usage of Genetic Programming (GP) to create new features to augment the original features in conjunction with the classifier. One key component that affects the classifier performance is the construction of the Hellinger distance as the fitness function. As a result, we perform design space exploration in estimating the Hellinger distance. The performance of different approaches is evaluated using seven real-world attacks that target three vulnerabilities in the OpenSSL library and two vulnerabilities in modern web servers. Our experimental results show that, by using the new features evolved with GP, we are able to reduce the FNR and improve the performance characteristics of the classifier.
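
Illustrative sketch (not the authors' implementation): the Hellinger distance between two discrete distributions P and Q is sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2 / 2), ranging from 0 (identical) to 1 (no overlap). The abstract above mentions exploring different ways to estimate it; the Python sketch below assumes one simple estimator, equal-width histograms, and scores a candidate feature by how well it separates normal from attack samples. The function names and the bin count are hypothetical.

```python
import math


def histogram(values, bins, lo, hi):
    """Equal-width histogram over [lo, hi], normalized to sum to 1."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, bins - 1))  # clamp so v == hi lands in the last bin
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2)


def fitness(feature_normal, feature_attack, bins=10):
    """Score a feature by the distance between its per-class value distributions."""
    lo = min(min(feature_normal), min(feature_attack))
    hi = max(max(feature_normal), max(feature_attack))
    if hi == lo:  # constant feature: the classes are indistinguishable
        return 0.0
    p = histogram(feature_normal, bins, lo, hi)
    q = histogram(feature_attack, bins, lo, hi)
    return hellinger(p, q)
```

Because each histogram is normalized per class, the score does not depend on how many samples each class contributes, which is why this distance is a natural fit for imbalanced datasets.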

IISWC 2020
High Frequency Performance Monitoring via Architectural Event Measurement (ieee)
Chutitep Woralert, James Bruska, Chen Liu, and Lok Yan
2020 IEEE International Symposium on Workload Characterization (IISWC 2020), October 27-29, 2020
Abstract: Obtaining detailed software execution information via hardware performance counters is a powerful analysis technique. The performance counters provide an effective method to monitor program behaviors; hence performance bottlenecks due to hardware architecture or software design and implementation can be identified, isolated and improved on. The granularity and overhead of the monitoring mechanism, however, are paramount to proper analysis. Many prior designs have been able to provide performance counter monitoring with inherited drawbacks such as intrusive code changes, a slow timer system, or the need for a kernel patch. In this paper, we present K-LEB (Kernel - Lineage of Event Behavior), a new monitoring mechanism that can produce precise, non-intrusive, low overhead, periodic performance counter data using a kernel-module-based design. Our proposed approach has been evaluated on three different case studies to demonstrate its effectiveness, correctness and efficiency. By moving the responsibility of timing to kernel space, K-LEB can gather periodic data at a 100μs rate, which is 100 times faster than other comparable performance counter monitoring approaches. At the same time, it reduces the monitoring overhead by at least 58.8%, and the difference between the recorded performance counter readings and those of other tools is less than 0.3%.

HASP 2019
Detecting Non-Control-Flow Hijacking Attacks Using Contextual Execution Information (acm)
Gildo Torres, Zhiliu Yang, Zander Blasingame, James Bruska and Chen Liu
Hardware and Architectural Support for Security and Privacy (HASP) 2019, in conjunction with The 46th International Symposium on Computer Architecture (ISCA 2019), June 23, 2019, Phoenix, Arizona, USA
Abstract: In recent years, we have seen a rise of non-control-flow hijacking attacks, which manipulate key data elements to corrupt the integrity of a victim application while upholding a valid control-flow during its execution. Consequently, they are more difficult to detect, and hence to prevent, with traditional mitigation techniques that target control-oriented attacks. In this work, we propose a methodology for the detection of non-control-flow hijacking attacks via employing low-level hardware information formatted as time series. Using architectural and micro-architectural hardware event counts, we model the regular execution behavior of the application(s) of interest, in an effort to detect abnormal execution behavior taking place in the vicinity of the vulnerability. We employed three distinct anomaly detection models: a traditional support vector machine (SVM), an echo state network (ESN), and a heavily modified k-nearest neighbors (KNN) model. We evaluated the proposed methodology using seven real-world non-control-flow hijacking exploits that target two vulnerabilities in modern web servers and three vulnerabilities in the OpenSSL library. Because our proposed detection methodology employs the contextual information across the temporal domain, we are able to achieve an average classification accuracy of 99.36%, with a false positive rate (FPR) of 0.79% and a false negative rate (FNR) of 0.53%.
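
Illustrative sketch: for readers unfamiliar with the metrics quoted above, accuracy, FPR, and FNR all follow from the four confusion-matrix counts. The Python helper below shows the standard definitions; the function name is hypothetical and no counts from the paper are used.

```python
def detection_rates(tp, fp, tn, fn):
    """Return (accuracy, FPR, FNR) from confusion-matrix counts.

    tp: attacks flagged as attacks        fn: attacks missed
    tn: normal runs passed as normal      fp: normal runs flagged as attacks
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    fpr = fp / (fp + tn)  # share of normal runs that raise a false alarm
    fnr = fn / (fn + tp)  # share of attacks that go undetected
    return accuracy, fpr, fnr
```

FPR and FNR are computed over different denominators (normal runs and attack runs, respectively), so they need not sum to the overall error rate unless the two classes are the same size.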

RESEC 2018
Detecting Data Exploits Using Low-level Hardware Information: A Short Time Series Approach (acm)
Chen Liu, Zhiliu Yang, Zander Blasingame, Gildo Torres, and James Bruska
Proceedings of the First Workshop on Radical and Experiential Security (RESEC 2018), June 2018
Abstract: In recent years, the scale, frequency, and complexity of cyber-attacks have been continuously on the rise. As a result, they have significantly impacted our daily lives and society as a whole. Never before have we had such an urgent need to defend against cyber-attacks. Previous studies suggest that it is possible to detect rootkits and control-flow attacks with high accuracy using information collected from the hardware level. For data-only exploits, however, where the control-flow of the victim application is strictly conserved while its behavior may only be slightly modified, high accuracy detection is much more difficult to achieve. In this study, we propose the use of low-level hardware information collected as a short time series for the detection of data-only malware attacks. We employed several representative classification algorithms, e.g., linear regression (LR), autoencoder (AE), stacked denoising autoencoder (SDA), and echo state network (ESN). We build one-class classifiers that either use individual samples collected via monitoring hardware-level events or use multiple samples of hardware events collected at different times during execution, but all with only the knowledge from regular behavior. Using several real-life attacks as case studies, we examined their detection accuracy when confronted with malicious behavior. Our experimental results show that our SDA- and ESN-based approaches can achieve an average detection accuracy of 97.75% and 98.36% for the exploits studied, respectively. Our study suggests that when the hardware events are monitored at different time spots during the execution of the vulnerable application, our SDA- and ESN-based approaches have the potential to boost the detection accuracy for data exploits.

SPIE-DSS 2017
Verification of OpenSSL version via hardware performance counters (spie)
James Bruska, Zander Blasingame, and Chen Liu
Disruptive Technologies in Sensors and Sensor Systems, SPIE DSS 2017, Anaheim, California, USA, April 2017
DOI: 10.1117/12.2263029
Abstract: Many forms of malware and security breaches exist today. One type of breach downgrades a cryptographic program by employing a man-in-the-middle attack. In this work, we explore the utilization of hardware events in conjunction with machine learning algorithms to detect which version of OpenSSL is being run during the encryption process. This allows for the immediate detection of any unknown downgrade attacks in real time. Our experimental results indicated this detection method is both feasible and practical. When trained with normal TLS and SSL data, our classifier was able to detect which protocol was being used with 99.995% accuracy. After the scope of the hardware event recording was enlarged, the accuracy diminished greatly, to 53.244%. Upon removal of TLS 1.1 from the data set, the accuracy returned to 99.905%.

HASP 2016
Can Data-Only Exploits be Detected at Runtime using Hardware Events?: A Case Study of the Heartbleed Vulnerability (acm) (pdf)
Gildo Torres and Chen Liu
Hardware and Architectural Support for Security and Privacy (HASP 2016), in conjunction with The 43rd International Symposium on Computer Architecture (ISCA 2016), Seoul, South Korea, June 18, 2016
Abstract: In this study, we investigate the feasibility of using an anomaly-based detection scheme that utilizes information collected from hardware performance counters at runtime to detect data-oriented attacks in user space libraries. Using the Heartbleed vulnerability as a test case, we studied twelve different hardware events and used a Support Vector Machine (SVM) model to classify between regular and abnormal behaviors. Our results demonstrated a detection accuracy of over 92% for the two-class SVM model and over 70% for the one-class SVM model. We also studied the limitations of using certain types of hardware events and discussed possible implications of their use in detection schemes. Overall, the experiments conducted suggest that data-oriented attacks can be more difficult to detect than control-data exploits, as certain events are susceptible to interference and hence less reliable.