Bias on Demand: A Modelling Framework That Generates Synthetic Data With Bias

Baumann, J.; Castelnovo, A.; Crupi, R.; Inverardi, N.; Regoli, D.
2023

Abstract

Nowadays, Machine Learning (ML) systems are widely used in various businesses and are increasingly being adopted to make decisions that can significantly impact people's lives. However, these decision-making systems rely on data-driven learning, which poses a risk of propagating the bias embedded in the data. Despite various attempts by the algorithmic fairness community to outline different types of bias in data and algorithms, there is still a limited understanding of how these biases relate to the fairness of ML-based decision-making systems. In addition, efforts to mitigate bias and unfairness are often agnostic to the specific type(s) of bias present in the data. This paper explores the nature of fundamental types of bias, discussing their relationship to moral and technical frameworks. To prevent harmful consequences, it is essential to comprehend how and where bias is introduced throughout the entire modelling pipeline and possibly how to mitigate it. Our primary contribution is a framework for generating synthetic datasets with different forms of biases. We use our proposed synthetic data generator to perform experiments on different scenarios to showcase the interconnection between biases and their effect on performance and fairness evaluations. Furthermore, we provide initial insights into mitigating specific types of bias through post-processing techniques. The implementation of the synthetic data generator and experiments can be found at https://github.com/rcrupiISP/BiasOnDemand.
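As an illustration of the kind of pipeline the abstract describes, the following is a minimal, hypothetical sketch, not the authors' BiasOnDemand implementation (see the linked repository for the actual tool). It injects two of the bias types the paper discusses into a synthetic dataset, a historically biased label and a measurement-biased proxy feature, then applies a simple post-processing mitigation via group-specific decision thresholds. The variable names and bias-strength parameters (l_hist, l_meas) are illustrative assumptions.

# Hypothetical sketch of bias injection in synthetic data plus a
# post-processing mitigation step. Not the BiasOnDemand API; the bias
# parameters and generation process below are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 10_000

# Protected attribute A (0 = privileged, 1 = disadvantaged) and latent merit R.
A = rng.integers(0, 2, size=n)
R = rng.normal(0, 1, size=n)

# Historical bias: the recorded outcome Y is shifted downwards for group A=1.
l_hist = 1.0  # assumed bias strength
Y = (R - l_hist * A + rng.normal(0, 0.5, size=n) > 0).astype(int)

# Measurement bias: the observed feature Q is a noisier proxy of R for A=1.
l_meas = 1.5  # assumed bias strength
Q = R + rng.normal(0, 0.3 + l_meas * A, size=n)

X = np.column_stack([Q, A])
X_tr, X_te, y_tr, y_te, A_tr, A_te = train_test_split(
    X, Y, A, test_size=0.3, random_state=0)

clf = LogisticRegression().fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

# Post-processing mitigation: choose a group-specific threshold for A=1 so
# that the positive rates of the two groups approximately match.
def positive_rate(s, thr):
    return (s >= thr).mean()

thr0 = 0.5
target = positive_rate(scores[A_te == 0], thr0)
cands = np.linspace(0, 1, 101)
thr1 = min(cands, key=lambda t: abs(positive_rate(scores[A_te == 1], t) - target))

y_hat = np.where(A_te == 0, scores >= thr0, scores >= thr1).astype(int)
print("positive rate A=0:", y_hat[A_te == 0].mean())
print("positive rate A=1:", y_hat[A_te == 1].mean())

Without the group-specific threshold, the historically biased labels and the noisier proxy for group A=1 depress that group's positive rate; the threshold adjustment equalises positive rates but, as the paper's experiments explore, which mitigation is appropriate depends on which bias is actually present.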
Type: paper
Keywords: bias; fairness; moral worldviews; synthetic data
Language: English
Conference: 6th ACM Conference on Fairness, Accountability, and Transparency (FAccT 2023), 12-15 June 2023
Published in: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency
ISBN: 9798400701924
Year: 2023
Pages: 1002-1013
Access: open access
Baumann, J., Castelnovo, A., Crupi, R., Inverardi, N., Regoli, D. (2023). Bias on Demand: A Modelling Framework That Generates Synthetic Data With Bias. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (pp.1002-1013). Association for Computing Machinery (ACM) [10.1145/3593013.3594058].
Files in this record:

File: Baumann-2023-FAccT-VoR.pdf
Access: open access
Description: Conference paper - Research article
Attachment type: Publisher's Version (Version of Record, VoR)
License: Creative Commons
Size: 727.26 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/446479
Citations
  • Scopus: 10
  • Web of Science (ISI): 6