Discrimination discovery from data is an important data mining task whose goal is to identify patterns of illegal and unethical discriminatory activities against legally protected groups, e.g., ethnic minorities. Any legally valid proof of discrimination requires evidence of causality, yet state-of-the-art methods are essentially correlation based, even though, as is well known, correlation does not imply causation. In this paper, we take a principled causal approach to discrimination detection following Suppes’ probabilistic causation theory. In particular, we define a method to extract, from a dataset of historical decision records, the causal structures existing among the attributes in the data. The result is a type of constrained Bayesian network, which we dub the Suppes-Bayes causal network (SBCN). Next, we develop a toolkit of methods based on random walks on top of the SBCN, addressing different anti-discrimination legal concepts, such as direct and indirect discrimination, group and individual discrimination, genuine requirement, and favoritism. Our experiments on real-world datasets confirm the inferential power of our approach in all of these tasks.
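As a rough illustration of the kind of test that Suppes' probabilistic causation rests on, the sketch below checks only the probability-raising condition, P(effect | cause) > P(effect | not cause), on binary records. It is not the authors' method: temporal priority, the construction of the SBCN, and the random-walk measures are not shown. The function name `raises_probability` and the toy records are hypothetical, introduced here purely for illustration.

```python
from typing import Sequence


def raises_probability(cause: Sequence[bool], effect: Sequence[bool]) -> bool:
    """Return True if the sample estimate of P(effect | cause) exceeds
    the sample estimate of P(effect | not cause)."""
    if len(cause) != len(effect):
        raise ValueError("cause and effect must have the same length")

    # Split the effect observations by whether the candidate cause is present.
    with_cause = [e for c, e in zip(cause, effect) if c]
    without_cause = [e for c, e in zip(cause, effect) if not c]

    # If either group is empty, the comparison is undefined; report False.
    if not with_cause or not without_cause:
        return False

    p_given_cause = sum(with_cause) / len(with_cause)
    p_given_not_cause = sum(without_cause) / len(without_cause)
    return p_given_cause > p_given_not_cause


# Made-up toy records (assumption): group membership and a negative decision.
in_group = [True, True, True, True, False, False, False, False]
denied = [True, True, True, False, True, False, False, False]

print(raises_probability(in_group, denied))
```

On its own, a probability-raising check of this kind is still a statement about association in the sample, which is why Suppes' theory also requires the cause to precede the effect in time.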

Bonchi, F., Hajian, S., Mishra, B., Ramazzotti, D. (2017). Exposing the probabilistic causal structure of discrimination. International Journal of Data Science and Analytics, 3(1) [10.1007/s41060-016-0040-z].

Exposing the probabilistic causal structure of discrimination

Ramazzotti, D
2017

Journal article - Scientific article
Keywords: Algorithmic discrimination; constrained Bayesian network; discrimination discovery; probabilistic causation; random walks
Language: English
Year: 2017
Volume: 3
Issue: 1
Use this identifier to cite or link to this document: https://hdl.handle.net/10281/285172