The main goal of this PhD project was to leverage the large body of genetic knowledge generated by high-throughput DNA sequencing to better characterize the genetic architecture of immune diseases, with a main focus on Primary Biliary Cholangitis (PBC). Our work employed established statistical methods, such as meta-analysis, as well as novel data mining tools, such as packages dedicated to the study of chromosome X (chrX) and Machine Learning (ML) software. In addition, we applied established computational methods and ML tools to shed light on the evolutionary history of variants associated with immune-mediated traits. More specifically, the meta-analysis of previous Genome-Wide Association Studies (GWAS) in PBC identified additional risk loci and, by means of functional annotation of credible causal variants and multi-omic analysis, produced a list of candidate genes together with several drugs that are potentially suitable for repurposing in PBC. The extensive analysis of chrX identified several suggestive new loci associated with PBC. The major finding of the X-Wide Association Study (XWAS) approach was a locus associated at genome-wide significance that contains several genes and a superenhancer possibly involved in their co-regulation, as well as in the regulation of FOXP3 (which is located in the same topologically associating domain, TAD). Based on the new set of variants identified in the international meta-analysis, we generated a novel Polygenic Risk Score (PRS) and incorporated it into a new integrative risk model. This model, which includes 22 non-HLA variants, one HLA variant and sex, is accurate (Area Under the Curve (AUC) 0.83 and 0.81 in the two cohorts under study) and well calibrated. The PRS pinpointed a subgroup of subjects at strikingly higher risk of developing the disease (odds ratio (OR) ~ 14), who should be the target of tailored follow-up strategies.
Prospective studies are needed to evaluate the potential role of the PRS in first-degree relatives of patients with PBC. We also presented the first example of a successful, proof-of-concept analysis of GWAS data with ML to study the genetic liability to PBC. ML is computationally feasible and generates accurate information that can be leveraged for disease prediction (AUC 0.73). Our ML-based model predicts genetic susceptibility to PBC through a methodologically innovative and explainable method. The innovation lies in the explainability of the rules for disease prediction, which predict genetic liability at the individual level; in addition, the rules consider groups of variants rather than single variants alone, paving the way for integrating gene-gene interactions into genetic predictive models. Finally, to study the evolutionary determinants of variants related to immune-mediated complex traits, we leveraged information about genetic variants of archaic hominins and nucleotide-resolution genomic estimates of deleteriousness to estimate the mutational load in Neanderthals along the genome. We showed that allele frequency in high-coverage Neanderthal genomes is informative of the fate of alleles in the human population. While the common pattern is that Neanderthal fragments carrying more deleterious variants have been purged, several fragments involved in immunity are observed today at high frequencies in human populations. After confirming the enrichment of Neanderthal ancestry in immune-related traits and genes, we also observed that regions of the human genome that carry putatively deleterious variants and are involved in immunity do not show traces of purifying selection but rather the opposite, reinforcing the notion that immune-related genes are under balancing selection.
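As an illustration only, the following minimal Python sketch shows how a polygenic risk score, an integrative logistic risk model and an AUC are commonly computed. It is not the model developed in the thesis: the function names, the per-variant weights and the regression coefficients are all hypothetical placeholders chosen for the example.

```python
from math import exp


def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2 per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("one weight per variant is required")
    return sum(d * w for d, w in zip(dosages, weights))


def integrative_risk(prs, hla_carrier, female, coef):
    """Logistic model combining the PRS, one HLA variant and sex."""
    z = (coef["intercept"]
         + coef["prs"] * prs
         + coef["hla"] * hla_carrier
         + coef["female"] * female)
    return 1.0 / (1.0 + exp(-z))


def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control.

    Ties count as one half (the rank-based definition of the AUC).
    """
    wins = sum((c > k) + 0.5 * (c == k)
               for c in case_scores for k in control_scores)
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical per-variant weights (e.g. log odds ratios) and coefficients.
weights = [0.20, 0.10, 0.30]
coef = {"intercept": -2.0, "prs": 1.5, "hla": 0.8, "female": 1.0}

# Toy genotypes: risk-allele counts at three variants per subject.
case_genotypes = [[2, 1, 2], [1, 2, 1]]
control_genotypes = [[0, 1, 0], [1, 0, 1]]

case_risk = [integrative_risk(polygenic_risk_score(g, weights), 1, 1, coef)
             for g in case_genotypes]
control_risk = [integrative_risk(polygenic_risk_score(g, weights), 0, 1, coef)
                for g in control_genotypes]
print(auc(case_risk, control_risk))
```

In this sketch the PRS enters the logistic model as a single covariate next to the HLA variant and sex, mirroring the structure described in the abstract; real weights would come from the meta-analysis effect sizes and real coefficients from model fitting.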

The main aim of this doctoral thesis was to use the extensive genetic data made available by DNA sequencing to improve the characterization of the genetic architecture of autoimmune diseases, with a specific focus on Primary Biliary Cholangitis. Our work used "classical" statistical methods, such as meta-analysis, together with new data analysis methods, such as packages dedicated to the study of the X chromosome and Machine Learning software. In addition, we applied techniques from computational evolutionary biology to study the evolutionary history of variants associated with immunity.

(2022). DISSECTING THE GENETIC ARCHITECTURE OF AUTOIMMUNITY: A SPOTLIGHT ON PRIMARY BILIARY CHOLANGITIS. (Doctoral thesis, Università degli Studi di Milano-Bicocca, 2022).

DISSECTING THE GENETIC ARCHITECTURE OF AUTOIMMUNITY: A SPOTLIGHT ON PRIMARY BILIARY CHOLANGITIS

GERUSSI, ALESSIO
2022

INVERNIZZI, PIETRO
ASSELTA, ROSANNA
genetics; immunology; autoimmunity; X chromosome; liver
MED/12 - GASTROENTEROLOGY
English
3 May 2022
MEDICINA TRASLAZIONALE E MOLECOLARE - DIMET
34
2020/2021
open
Files in this item:
phd_unimib_849287.pdf (open access)
Description: Thesis by Gerussi Alessio - 849287
Attachment type: Doctoral thesis
Size: 4.26 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/382077