TY - GEN
T1 - A Graph-Representation-Learning Framework for Supporting Android Malware Identification and Polymorphic Evolution
AU - Cuzzocrea, Alfredo
AU - Quebrado, Miguel
AU - Hafsaoui, Abderraouf
AU - Serra, Edoardo
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Detecting malware is an interesting research area; however, the polymorphic nature of malware makes it difficult to identify, particularly when using hash-based detection methods. Unlike image-based strategies, this research uses a graph-based technique that extracts control flow graphs from Android APK binaries. To handle the generated graphs, we employ an approach that combines a novel graph representation learning method called Inferential SIR-GN, which retains graph structural similarities, with XGBoost, a typical machine learning model. The approach is then applied to MALNET, a publicly accessible cybersecurity database that contains image-based and graph-based representations of 1,262,024 Android APK binary files spanning 47 types and 696 families. The experimental results indicate that our graph-based strategy outperforms the image-based approach in terms of detection accuracy.
AB - Detecting malware is an interesting research area; however, the polymorphic nature of malware makes it difficult to identify, particularly when using hash-based detection methods. Unlike image-based strategies, this research uses a graph-based technique that extracts control flow graphs from Android APK binaries. To handle the generated graphs, we employ an approach that combines a novel graph representation learning method called Inferential SIR-GN, which retains graph structural similarities, with XGBoost, a typical machine learning model. The approach is then applied to MALNET, a publicly accessible cybersecurity database that contains image-based and graph-based representations of 1,262,024 Android APK binary files spanning 47 types and 696 families. The experimental results indicate that our graph-based strategy outperforms the image-based approach in terms of detection accuracy.
KW - Malware Polymorphism
KW - Structural Graph Representation Learning
UR - http://www.scopus.com/inward/record.url?scp=85168766143&partnerID=8YFLogxK
U2 - 10.1109/SDS57534.2023.00012
DO - 10.1109/SDS57534.2023.00012
M3 - Conference contribution
AN - SCOPUS:85168766143
T3 - Proceedings - 2023 10th IEEE Swiss Conference on Data Science, SDS 2023
SP - 34
EP - 41
BT - Proceedings - 2023 10th IEEE Swiss Conference on Data Science, SDS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th IEEE Swiss Conference on Data Science, SDS 2023
Y2 - 22 June 2023 through 23 June 2023
ER -