Bicocca Open Archive

Modern malware detection tools rely on special permissions to collect data that can reveal the presence of suspicious software on a machine. The data typically collected for this task include system calls, network traffic content, file system changes, and API calls. However, giving an externally developed program access to these data means granting the company that created it complete control over the host machine, which is undesirable for many reasons. In this work, we propose an alternative approach that relies on easily accessible data, namely information about system performance (CPU, RAM, disk, and network usage), which can be collected without high-level permissions. To investigate the effectiveness of this approach, we ran a number of malware programs in a suitably devised sandbox and collected these data in the form of a multivalued time series. Then, to address the fact that deep learning models need large training sets, we augmented the dataset using a deep generative model (a Generative Adversarial Network). Finally, we trained an LSTM (Long Short-Term Memory) network to capture the malware behavioral patterns. Our investigation found that this approach, based on easy-to-collect information, is very effective (we achieved 0.99 accuracy), even though the data used to train the detector differ substantially from those specifically targeted for this purpose. The real and synthetic datasets, as well as the corresponding source code, are publicly available.
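As an illustration of how a multivalued performance time series might be prepared for a sequence model such as an LSTM, the following minimal sketch slices a trace into fixed-length windows. It is not the authors' code: the feature order, the window length, the stride, and the function name `make_windows` are all assumptions made for this example, and the toy trace contains made-up numbers rather than measured data.

```python
from typing import List, Sequence

# Hypothetical column layout: the abstract names these four resource
# categories but does not specify a concrete feature order or units.
FEATURES = ("cpu_percent", "ram_percent", "disk_bytes", "net_bytes")


def make_windows(
    series: Sequence[Sequence[float]], window: int, stride: int = 1
) -> List[List[List[float]]]:
    """Slice a multivalued time series into fixed-length windows.

    Each window has shape (window, n_features), which is the usual
    (timesteps, features) layout expected by recurrent models.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    windows = []
    # The last valid start index is len(series) - window.
    for start in range(0, len(series) - window + 1, stride):
        chunk = series[start:start + window]
        windows.append([list(sample) for sample in chunk])
    return windows


# Toy trace with six samples (illustrative values only).
trace = [
    (5.0, 40.0, 0.0, 120.0),
    (7.0, 41.0, 512.0, 90.0),
    (65.0, 55.0, 4096.0, 3000.0),
    (80.0, 60.0, 8192.0, 4500.0),
    (78.0, 61.0, 8192.0, 4700.0),
    (10.0, 45.0, 0.0, 100.0),
]

windows = make_windows(trace, window=3, stride=2)
print(len(windows), len(windows[0]), len(windows[0][0]))
```

Overlapping windows (a stride smaller than the window length) yield more training examples from the same trace, at the cost of correlated samples. Whether the paper uses overlapping windows is not stated in the abstract.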

Anisetti, M., Ardagna, C., Bena, N., Giandomenico, V., Gianini, G. (2024). Lightweight Behavior-Based Malware Detection. In Management of Digital EcoSystems: 15th International Conference, MEDES 2023, Heraklion, Crete, Greece, May 5–7, 2023, Revised Selected Papers (pp. 237-250). Springer [10.1007/978-3-031-51643-6_17].

Lightweight Behavior-Based Malware Detection

Anisetti, M; Ardagna, CA; Bena, N; Giandomenico, V; Gianini, G

2024

Contribution type: paper
Keywords: behavior analysis; GAN; LSTM; Malware detection
Content language: English
Conference name: 15th International Conference, MEDES 2023 - May 5–7, 2023
Conference year: 2023
Volume editors: Chbeir, R; Benslimane, D; Zervakis, M; Manolopoulos, Y; Nguyen, NT; Tekli, J
Proceedings title: Management of Digital EcoSystems: 15th International Conference, MEDES 2023, Heraklion, Crete, Greece, May 5–7, 2023, Revised Selected Papers
Proceedings volume ISBN: 9783031516429
Series: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE
Ahead-of-print or first online publication date: 2 Feb 2024
Publication date: 2024
Volume number: 2022 CCIS
First page: 237
Last page: 250
DOI: https://dx.doi.org/10.1007/978-3-031-51643-6_17
Alternative URL: https://link.springer.com/chapter/10.1007/978-3-031-51643-6_17
Full text: partially_open
Appears in types: 02 - Conference contribution

Files in this item:

File	Access	Attachment type	License	Size	Format
Anisetti-2024-MEDES-Lightweight Behavior-Based-VoR.pdf	Archive managers only	Publisher's Version (Version of Record, VoR)	All rights reserved	1.34 MB	Adobe PDF
Anisetti-2024-MEDES-Lightweight-preprint.pdf	Open access	Submitted Version (Pre-print)	Other	3.17 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/460859
