A scalable and accurate feature representation method for identifying malicious mobile applications

B Sun, T Ban, SC Chang, YS Sun, T Takahashi… - Proceedings of the 34th …, 2019 - dl.acm.org
Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, 2019 - dl.acm.org
With the dramatic growth in smartphone usage, the number of new malicious mobile applications has increased rapidly. Identifying malicious applications in large-scale datasets is intensive and time-consuming. Multiple previous studies have focused on automating malicious application detection using machine learning or deep learning. However, a scalable and accurate solution for large-scale datasets is still lacking. In this study, we therefore propose a novel approach that improves the accuracy of discovering malicious applications and decreases the computation time of the analysis. We implemented the proposed approach by combining data collection, static feature extraction, and machine learning algorithms. Using a large dataset collected from a mobile application store that included 49,045 benign samples and 12,685 malicious samples, we demonstrate that the F-measure of malicious application detection with our approach ranges from 0.968 to 0.995, with a false positive rate of 0.48% to 3.3%. We find that a multi-layer perceptron classifier performs best among the evaluated algorithms. Moreover, the running time of the analysis can be reduced to less than 18 min. Finally, we compare our method with two types of previous approaches and report better performance in terms of scalability and accuracy.
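The two metrics quoted in the abstract, F-measure and false positive rate, follow from the counts in a binary confusion matrix. The sketch below shows the standard definitions only, treating "malicious" as the positive class. The function names and the example counts are hypothetical, chosen for illustration. They are not taken from the paper, and the code does not reproduce the authors' evaluation.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """F-measure (F1): harmonic mean of precision and recall.

    tp: malicious samples correctly flagged
    fp: benign samples wrongly flagged as malicious
    fn: malicious samples that were missed
    """
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0  # no positives predicted or present; define the score as 0
    return 2 * tp / denominator


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of benign samples wrongly flagged as malicious."""
    benign_total = fp + tn
    if benign_total == 0:
        return 0.0  # no benign samples; define the rate as 0
    return fp / benign_total


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not results from the paper).
    tp, fp, fn, tn = 90, 5, 10, 895
    print(f"F-measure: {f_measure(tp, fp, fn):.3f}")
    print(f"False positive rate: {false_positive_rate(fp, tn):.2%}")
```

Reporting both metrics matters for an imbalanced dataset such as the one described, where benign samples outnumber malicious ones: a classifier can reach a high F-measure on the malicious class while still flagging a noticeable share of benign applications.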