The growing number of single-cell transcriptomic and single-cell RNA sequencing studies is enabling a deeper understanding of the molecular processes underlying the normal development of an organism as well as the onset of pathologies. These studies continuously refine the functional roles of known cell populations and characterize putatively novel cell populations as soon as they are detected. To isolate cell populations for further tailored analysis, succinct marker panels (composed of a few cell surface proteins and clusters of differentiation molecules) must be identified. Identifying these marker panels is a challenging computational problem: its intrinsic combinatorial nature makes it NP-hard. Genetic Algorithms (GAs) have been successfully used in Bioinformatics and other biomedical applications to tackle combinatorial problems. We present here a GA-based approach to the identification of succinct marker panels. Since the performance of a GA is strictly related to the representation of the candidate solutions, we propose and compare three alternative representations, each of which implicitly introduces different constraints on the search space. For each representation, we fine-tune the parameter settings to calibrate the GA, and we show that different representations yield different performance: the most relaxed representations, in which the GA can also evolve the number of genes in the panel, turn out to be the most effective, especially in the case of 0-knowledge problems. Our results also show that the marker panels identified by GAs can outperform manually curated solutions.
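
The abstract does not spell out the three representations, so the following is only a minimal illustrative sketch, not the authors' method. It contrasts a fixed-size encoding (a list of exactly k gene indices) with a more relaxed binary-mask encoding in which the panel size itself is evolved. The fitness function, the synthetic data, and every name and parameter value (`fitness`, `toy_data`, `evolve_mask`, `size_penalty`, and so on) are assumptions made up for this example.

```python
import random

def random_fixed_panel(rng, n_genes, k):
    """Fixed-size representation (assumed): exactly k distinct gene indices."""
    return sorted(rng.sample(range(n_genes), k))

def panel_from_mask(mask):
    """Binary-mask representation (assumed): bit g set means gene g is in the panel."""
    return [g for g, bit in enumerate(mask) if bit]

def fitness(panel, expr, labels, size_penalty=0.01):
    """Hypothetical score: accuracy of the rule 'every panel gene is expressed'
    at recognising target cells, minus a small penalty per gene in the panel."""
    if not panel:
        return 0.0
    hits = sum(all(cell[g] > 0 for g in panel) == bool(lab)
               for cell, lab in zip(expr, labels))
    return hits / len(expr) - size_penalty * len(panel)

def toy_data(n_cells=60, n_genes=8, markers=(1, 4), seed=1):
    """Synthetic binarised expression: target cells express every marker,
    other cells lack exactly one of them; the remaining genes are noise."""
    rng = random.Random(seed)
    expr, labels = [], []
    for i in range(n_cells):
        cell = [rng.randint(0, 1) for _ in range(n_genes)]
        for g in markers:
            cell[g] = 1
        is_target = i % 2
        if not is_target:
            cell[markers[(i // 2) % len(markers)]] = 0
        expr.append(cell)
        labels.append(is_target)
    return expr, labels

def evolve_mask(expr, labels, n_genes, pop_size=30, generations=40,
                p_mut=0.05, seed=0):
    """Tiny GA over binary masks, so the number of genes in the panel evolves too."""
    rng = random.Random(seed)

    def score(mask):
        return fitness(panel_from_mask(mask), expr, labels)

    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]              # truncation selection with elitism
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_genes)       # single-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            children.append(child)
        pop = elite + children
    best = max(pop, key=score)
    return panel_from_mask(best), score(best)

expr, labels = toy_data()
panel, score = evolve_mask(expr, labels, n_genes=8)
print(panel, round(score, 3))
```

With the mask encoding, crossover and mutation can add or drop genes freely, whereas operators for the fixed-size encoding would have to preserve exactly k distinct indices. That difference is one way in which a representation can implicitly constrain the search space.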

Tangherloni, A., Riva, S., Spolaor, S., Besozzi, D., Nobile, M., Cazzaniga, P. (2021). The Impact of Representation on the Optimization of Marker Panels for Single-cell RNA Data. In 2021 IEEE Congress on Evolutionary Computation, CEC 2021 - Proceedings (pp.1423-1430). 345 E 47TH ST, NEW YORK, NY 10017 USA : Institute of Electrical and Electronics Engineers Inc. [10.1109/CEC45853.2021.9504808].

The Impact of Representation on the Optimization of Marker Panels for Single-cell RNA Data

Tangherloni, Andrea; Spolaor, Simone; Besozzi, Daniela; Nobile, Marco S.; Cazzaniga, Paolo
2021

Type: paper
Keywords: Bioinformatics; Candidate solution representation; Computational biology; Genetic algorithms; Single-cell transcriptomic data
Language: English
Conference: 2021 IEEE Congress on Evolutionary Computation, CEC 2021, 28 June 2021 through 1 July 2021
Proceedings: 2021 IEEE Congress on Evolutionary Computation, CEC 2021 - Proceedings
ISBN: 978-172818392-3
Year: 2021
Pages: 1423-1430
none

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/327259
Citations
  • Scopus 7
  • Web of Science (ISI) 4