An Efficient Approach for Detecting Malware Using API Call Mining
Advanced malware remains a major challenge in securing enterprise networks. Because most commercial tools rely on static analysis for malware detection and prevention, they cannot detect unknown malware or sophisticated variants such as polymorphic and metamorphic malware. Dynamic approaches that consider the real-time as well as run-time behaviour of the malware are therefore essential. We devised two dynamic analysis methodologies that use Application Programming Interface (API) call features to detect malware: (i) Application Programming Interface Call Frequency Mining (API-CFM) and (ii) Application Programming Interface Call Transition Matrix Mining (API-CTMM). In our techniques, the API usage of a set of advanced malware and benign programs is learned and characterised using supervised classification algorithms: random forest, AdaBoost, support vector machine, and naive Bayes. We used 94 API calls for the malware detection methodology. For the API call transition based mining technique, we use a feature vector of dimension 94 × 94. We also apply Principal Component Analysis (PCA) to the selected features to reduce the time complexity of malware detection. Our test results show that the API-CFM technique gives an accuracy of 76.19% and that the API-CTMM technique gives an improved accuracy of 95.23%.
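The two feature representations named above can be illustrated with a minimal sketch. It is not the authors' implementation: the five-call vocabulary, the function names, and the sample trace are hypothetical stand-ins for the 94 monitored API calls, and the use of raw counts (rather than normalised frequencies or transition probabilities) is an assumption.

```python
from collections import Counter

# Hypothetical vocabulary: the paper monitors 94 API calls, which are not
# listed in this section. Five Windows API names stand in for them here.
API_VOCAB = ["CreateFileW", "ReadFile", "WriteFile", "RegSetValueExW", "CloseHandle"]
INDEX = {name: i for i, name in enumerate(API_VOCAB)}


def frequency_vector(trace):
    """Frequency-style feature: how often each monitored call occurs."""
    counts = Counter(call for call in trace if call in INDEX)
    return [counts[name] for name in API_VOCAB]


def transition_matrix(trace):
    """Transition-style feature: counts of consecutive (call_i -> call_j) pairs.

    Assumption: calls outside the vocabulary are dropped before pairing,
    so the remaining calls are treated as adjacent.
    """
    n = len(API_VOCAB)
    matrix = [[0] * n for _ in range(n)]
    known = [INDEX[call] for call in trace if call in INDEX]
    for src, dst in zip(known, known[1:]):
        matrix[src][dst] += 1
    return matrix


def flatten(matrix):
    """Row-major flattening of the n x n matrix into one feature vector."""
    return [value for row in matrix for value in row]


# Hypothetical API call trace recorded from one program run.
trace = ["CreateFileW", "ReadFile", "ReadFile", "WriteFile", "CloseHandle"]
freq_features = frequency_vector(trace)
trans_features = flatten(transition_matrix(trace))
```

With the full set of 94 calls, the flattened transition matrix has 94 × 94 = 8836 entries per sample, which suggests why a dimensionality reduction step such as PCA is applied before classification.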