Abstract
Malware is polymorphic, which makes it difficult to identify, particularly with hash-based detection, and keeps malware detection an active research topic. In contrast to image-based methods, this study takes a graph-based approach in which control flow graphs are extracted from Android APK binaries. To process the resulting graphs, we combine Inferential SIR-GN for Graph representation, a novel graph representation learning method that preserves structural similarities between graphs, with XGBoost, a widely used machine learning model. We evaluate the method on MALNET, an open cybersecurity database containing 1,262,024 Android APK binary files spanning 47 types and 696 families. The experimental results show that our graph-based technique surpasses the image-based method in detection accuracy.
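The contrast between hash-based and structure-based detection can be illustrated with a minimal sketch. It is not the Inferential SIR-GN method from the paper: the `degree_signature` helper, the byte strings, and the toy control flow graphs are all hypothetical, and a sorted degree sequence is used only as a crude stand-in for a structural graph representation.

```python
import hashlib
from collections import Counter


def file_hash(payload: bytes) -> str:
    """SHA-256 digest of the raw bytes, as used in hash-based detection."""
    return hashlib.sha256(payload).hexdigest()


def degree_signature(edges):
    """Sorted (in_degree, out_degree) pairs over all nodes.

    Hypothetical stand-in for a structural representation: it ignores
    node names and depends only on the shape of the graph.
    """
    in_deg, out_deg = Counter(), Counter()
    nodes = set()
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
        nodes.update((src, dst))
    return tuple(sorted((in_deg[n], out_deg[n]) for n in nodes))


# Toy data (assumed): two variants of the same logic. The second has junk
# bytes appended and its basic blocks renamed, as a polymorphic engine might do.
variant_a = b"\x01\x02\x03payload"
variant_b = b"\x01\x02\x03payload\x00\x00"

cfg_a = [("entry", "check"), ("check", "decrypt"),
         ("check", "exit"), ("decrypt", "exit")]
cfg_b = [("b0", "b1"), ("b1", "b2"), ("b1", "b3"), ("b2", "b3")]

# The byte-level hashes diverge, while the structural signature is unchanged.
hashes_differ = file_hash(variant_a) != file_hash(variant_b)
structure_matches = degree_signature(cfg_a) == degree_signature(cfg_b)
```

In a pipeline like the one the abstract describes, a learned structural embedding would take the place of `degree_signature`, and the resulting vectors would be passed to a classifier such as XGBoost.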
| Original language | English |
|---|---|
| Pages (from-to) | 279-290 |
| Number of pages | 12 |
| Journal | CEUR Workshop Proceedings |
| Volume | 3478 |
| State | Published - 2023 |
| Event | 31st Symposium of Advanced Database Systems, SEBD 2023 - Galzignano Terme, Italy; Duration: 2 Jul 2023 → 5 Jul 2023 |
Keywords
- Malware Polymorphism
- Structural Graph Representation Learning