Lightweight Behavior-Based Malware Detection

Gianini, G
Last author
2024

Abstract

Modern malware detection tools rely on special permissions to collect data that can reveal the presence of suspicious software on a machine. The data typically collected for this task include system calls, network traffic content, file system changes, and API calls. However, giving an externally developed program access to these data effectively grants the company that created it complete control over the host machine, which is undesirable for many reasons. In this work, we propose an alternative approach that relies on easily accessible data, namely information about system performance (CPU, RAM, disk, and network usage), which can be collected without high-level permissions. To investigate the effectiveness of this approach, we ran a number of malware programs in a suitably devised sandbox and collected these data as a multivalued time series. Then, because deep learning models need large training sets, we augmented the dataset using a deep learning generative model (a Generative Adversarial Network, GAN). Finally, we trained an LSTM (Long Short-Term Memory) network to capture the behavioral patterns of malware. Our investigation found that this approach, based on easy-to-collect information, is very effective (we achieved 0.99 accuracy), even though the data used to train the detector differ substantially from the data that conventional detectors collect for this purpose. The real and synthetic datasets, as well as the corresponding source code, are publicly available.
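As an illustration of the data representation the abstract describes, the following is a minimal sketch of how a multivalued time series of performance samples could be cut into fixed-length windows of the kind a sequence model such as an LSTM consumes. It is not the authors' preprocessing: the feature order, the function name `make_windows`, and the `window` and `stride` parameters are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Hypothetical feature order; the abstract names only the four metric families.
FEATURES = ("cpu", "ram", "disk", "network")

# One sample = one reading of each metric at a single time step.
Sample = Tuple[float, float, float, float]


def make_windows(
    series: Sequence[Sample], window: int, stride: int = 1
) -> List[List[Sample]]:
    """Slice one multivalued time series into fixed-length windows.

    Each returned window holds `window` consecutive samples; consecutive
    windows start `stride` samples apart. A trailing remainder shorter
    than `window` is dropped.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    return [
        list(series[start:start + window])
        for start in range(0, len(series) - window + 1, stride)
    ]


# Toy trace of 10 time steps (values are made up for illustration).
trace = [(float(t), 0.5, 0.1, 0.0) for t in range(10)]
windows = make_windows(trace, window=4, stride=2)
```

Each window has shape (time steps, features), so a labeled collection of windows (benign or malware) forms the three-dimensional input that recurrent networks typically expect.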
paper
Malware detection; behavior analysis; LSTM; GAN
English
15th International Conference, MEDES 2023 - May 5–7, 2023
2023
Chbeir, R; Benslimane, D; Zervakis, M; Manolopoulos, Y; Nguyen, NT; Tekli, J
Management of Digital EcoSystems 15th International Conference, MEDES 2023, Heraklion, Crete, Greece, May 5–7, 2023, Revised Selected Papers
9783031516429
2 Feb 2024
2024
237
250
https://link.springer.com/chapter/10.1007/978-3-031-51643-6_17
partially_open
Anisetti, M., Ardagna, C., Bena, N., Giandomenico, V., Gianini, G. (2024). Lightweight Behavior-Based Malware Detection. In Management of Digital EcoSystems 15th International Conference, MEDES 2023, Heraklion, Crete, Greece, May 5–7, 2023, Revised Selected Papers (pp.237-250). Springer [10.1007/978-3-031-51643-6_17].
Files in this item:
Anisetti-2024-MEDES-Lightweight Behavior-Based-VoR.pdf

Archive managers only

Attachment type: Publisher’s Version (Version of Record, VoR)
License: All rights reserved
Size: 1.34 MB
Format: Adobe PDF

Anisetti-2024-MEDES-Lightweight-preprint.pdf

Open access

Attachment type: Submitted Version (Pre-print)
License: Other
Size: 3.17 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/460859