I am a fourth-year Ph.D. student in the Department of Computer Science, supervised by Professor V.S. Subrahmanian, the Dartmouth College Distinguished Professor in Cybersecurity, Technology, and Society. I work at the intersection of cybersecurity and machine learning, with an emphasis on robust and automated Android malware detection systems.
As one of the most widely used mobile operating systems today, Android is consistently the largest target of malware. Nokia's 2017 Threat Intelligence Report states that a staggering 68.5% of all malware targets the Android platform. The sheer volume and variety of Android malware often make it difficult for security analysts to detect, classify, and analyze it. Research into Android malware is therefore crucial for developing more effective security measures. To address this, my research group, the Dartmouth Security and AI Lab (DSAIL), has led a number of research projects on Android security that have improved the classification and detection of Android malware.
DBank is a framework developed by DSAIL that detects and characterizes Android Banking Trojans (ABTs). DBank introduces the Triadic Suspicion Graph (TSG), whose nodes are goodware, banking trojans, and Android API packages. Suspicion scores (SUS) and suspicion ranks (SR) are calculated for the API packages, taking into account how often goodware and banking trojans invoke each API. In evaluation, DBank achieved a predictive accuracy (AUC) of 99.9% and a false-positive rate of 0.3%.
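To make the idea of scoring API packages concrete, here is a minimal sketch in Python. It is not DBank's actual SUS or SR definition, which is not given here. It assumes, purely for illustration, that a package's score is the fraction of trojans invoking it minus the fraction of goodware invoking it, and that ranks simply order packages by that score. The function names and the sample apps are hypothetical.

```python
def suspicion_scores(goodware_calls, trojan_calls):
    """Score each API package (illustrative formula, not DBank's).

    Both arguments map an app name to the set of API packages it invokes.
    Assumed score: (share of trojans using the package)
                 - (share of goodware using the package).
    """
    packages = set().union(*goodware_calls.values(), *trojan_calls.values())
    scores = {}
    for pkg in packages:
        trojan_rate = sum(pkg in apis for apis in trojan_calls.values()) / len(trojan_calls)
        good_rate = sum(pkg in apis for apis in goodware_calls.values()) / len(goodware_calls)
        scores[pkg] = trojan_rate - good_rate
    return scores


def suspicion_ranks(scores):
    """Rank packages from most to least suspicious (rank 1 = most suspicious).

    Ties are broken alphabetically so the result is deterministic.
    """
    ordered = sorted(scores, key=lambda pkg: (-scores[pkg], pkg))
    return {pkg: rank for rank, pkg in enumerate(ordered, start=1)}


# Hypothetical toy data: two goodware apps and two banking trojans.
goodware = {
    "notes": {"android.app", "android.widget"},
    "weather": {"android.app", "android.net"},
}
trojans = {
    "trojan_a": {"android.app", "android.telephony"},
    "trojan_b": {"android.telephony", "android.net"},
}

scores = suspicion_scores(goodware, trojans)
ranks = suspicion_ranks(scores)
```

In this toy data, the package used only by the trojans receives the highest score and the top rank, which is the intuition the suspicion scores are meant to capture.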
DBank was also tested against several adversarial scenarios, such as how well an attacker could infer the defender model's predictions when the defender's and the adversary's training data overlap. Testing revealed that, even with an 80% overlap of data with the adversary, the adversary's error rate is multiple times higher for ABT detection with DBank. DBank also maintained a high AUC when trained on only 20% or 30% of the training set, which allows the defender to maintain a moving defense surface. It further proved robust against fake API calls inserted to avoid suspicion. Additionally, DBank identified two previously unknown ABTs, which Google has confirmed. Overall, these results indicate that DBank is a reliable and accurate system for detecting ABTs.
FARM, a Feature transformation-based Android Rooting Malware detector, is a DSAIL project that uses machine learning to detect Android rooting malware. Because rooting malware is much more difficult to remove than other malware, research into prevention is essential. Like earlier DSAIL projects, FARM extracts basic static and dynamic features from Android APKs. Using these features, FARM creates a "basic feature vector" for each APK in a dataset of rooting malware samples and goodware. FARM then transforms these basic feature vectors into a new feature space using three new classes of irreversible feature transformations: landmark-based, feature value clustering-based, and correlation graph-based.
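As a rough illustration of what a landmark-based transformation could look like, the Python sketch below maps each basic feature vector to its Euclidean distances from a small set of landmark vectors. This is an assumed, generic reading of "landmark-based", not FARM's actual procedure. The function names, the random choice of landmarks, and the toy vectors are all hypothetical.

```python
import math
import random


def pick_landmarks(vectors, k, seed=0):
    """Choose k landmark vectors at random from the dataset (assumed strategy)."""
    rng = random.Random(seed)
    return rng.sample(vectors, k)


def landmark_transform(vector, landmarks):
    """Represent a vector by its Euclidean distance to each landmark.

    With fewer landmarks than original features, many different inputs map
    to the same output, so the original vector generally cannot be recovered.
    """
    return [math.dist(vector, landmark) for landmark in landmarks]


# Hypothetical basic feature vectors (e.g. counts of API calls or permissions).
basic_vectors = [
    [1, 0, 3, 0],
    [0, 1, 0, 2],
    [1, 1, 3, 2],
]

# Fixed landmarks are used here so the example is easy to follow.
landmarks = [[0, 0, 0, 0], [1, 1, 1, 1]]
transformed = [landmark_transform(v, landmarks) for v in basic_vectors]
```

Each four-dimensional vector becomes a two-dimensional one here. A classifier would then be trained on the transformed vectors instead of the basic ones.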
In experimental evaluation, FARM outperformed baselines in distinguishing rooting malware from goodware and from other malware. FARM also proved robust against adversarial attacks, such as fake API call, fake permission, and reduced API feature attacks. In addition, FARM labeled two malware samples on VirusTotal as rooting malware before any of the 61 anti-virus engines on the site did. The samples were reported to Google's Android Security Team, which confirmed the findings. Given these results, FARM is a remarkably accurate rooting malware detection system.
With generous support from the Alumni Research Award, I was able to improve both the depth and the quality of my research. During the summer of 2019, I worked at Deutsche Telekom Innovation Laboratories in Israel. Under the supervision of the internationally renowned cybersecurity expert Professor Yuval Elovici, I proposed mechanisms that enhance the robustness of anti-virus engines based on deep learning (Generative Adversarial Networks) and successfully reduce the impact of large-scale adversarial attacks.
After completing my Ph.D. in the coming year, I will continue my research on innovative techniques for quickly identifying different types of general malware and for rapidly deploying countermeasures on prevalent anti-virus engines.