Project Abstract (First Revision)
Spyware is a form of malware whose primary purpose is to monitor the victim’s online and offline activity or to steal the victim’s information. Most commonly, a user’s system is compromised by spyware through misleading links or advertisements on a webpage. Spyware comes in many forms, including adware, keyloggers and scareware. Removal can be difficult if the spyware blocks access to legitimate antivirus solutions on the victim’s system. With recent advances in the automotive industry, consumers can also be exposed to spyware through their automobiles. The goal of our research is to create a spyware detection and anti-spyware solution that serves three purposes: to identify and classify unknown files on a spectrum of spyware severity; to introduce a self-adapting structure that detects new or modified spyware traces; and to increase the accuracy of detection results. The identification, classification, and adaptation phases can be accomplished using machine learning and data mining concepts and algorithms. Data will be classified through two different methods of machine learning: supervised and unsupervised. In the unsupervised case, we will apply the self-organizing map (SOM) algorithm. In the supervised case, we will use the C5.0 decision tree algorithm. In addition, we will explore the possibility of a solution that uses a hybrid of both learning methods. The end goal is to implement our anti-spyware solution on automotive-specific systems.
The Problem
Current anti-spyware systems, which classify an input file as spyware or not spyware, suffer from the following drawbacks:
-Need to update the data describing system behavior in order to detect new, unknown, or modified spyware
-High false positive and false negative rates
-Need to adapt the structure of older anti-spyware programs
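To make the false positive and false negative rates above concrete, the following minimal sketch shows one common way to compute them for a binary spyware classifier. The function name `error_rates`, the label encoding (1 = spyware, 0 = not spyware), and the sample labels are assumptions made for illustration; they are not results from this project.

```python
def error_rates(y_true, y_pred):
    """Return (false_positive_rate, false_negative_rate) for binary labels.

    Assumed encoding: 1 = spyware, 0 = not spyware.
    """
    pairs = list(zip(y_true, y_pred))
    # Benign files wrongly flagged as spyware.
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    # Spyware files that were missed.
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    # Guard against division by zero when a class is absent.
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr


# Hypothetical ground truth and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
fpr, fnr = error_rates(y_true, y_pred)
print(f"false positive rate: {fpr:.2f}, false negative rate: {fnr:.2f}")
```

Reporting both rates separately matters here because the two errors have different costs: a false positive blocks a legitimate file, while a false negative lets spyware through.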
Intended Contributions
Application on automobile systems
Obtain files and samples through two methods:
-Online resources (VX Heaven and KernelMode)
-Honeypot trap
Review design pattern and adaptation
Apply Data Mining feature extraction on samples
Apply a hybridized Machine Learning approach
-Supervised Learning algorithm to classify a sample as Spyware or Not Spyware
-Unsupervised Learning algorithm (Self-Organizing Map) to classify a sample on a severity spectrum
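As a rough illustration of the unsupervised half of this approach, the sketch below trains a small one-dimensional self-organizing map in plain Python and maps each sample to its best-matching node. Everything specific here is an assumption for illustration: the function names, the two-value feature vectors (imagined as normalized outputs of the feature extraction step), the number of nodes, and the training schedule. The supervised C5.0 classifier is not shown. Note that a node index is only a position on the map; reading it as a severity level would additionally require orienting the map with labeled samples.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to sample x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )


def train_som(samples, n_nodes=5, epochs=200, lr0=0.5, seed=0):
    """Train a 1-D SOM and return its list of node weight vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Random initial weights in [0, 1); features are assumed normalized.
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    sigma0 = n_nodes / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood
        for x in rng.sample(samples, len(samples)):  # shuffled pass
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighborhood: nodes near the BMU move more.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Hypothetical normalized feature vectors, for illustration only.
samples = [
    [0.05, 0.10], [0.10, 0.05], [0.08, 0.12],  # imagined low-risk files
    [0.90, 0.95], [0.95, 0.85], [0.88, 0.92],  # imagined high-risk files
]
som = train_som(samples, n_nodes=5)
positions = [best_matching_unit(som, x) for x in samples]
print("map position per sample:", positions)
```

In a hybrid design, the labels produced by the supervised classifier could be used to decide which end of the map corresponds to higher severity; that step is left out of this sketch.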