Indexing huge collections of strings, such as those produced by the widespread sequencing technologies, heavily relies on multi-string generalizations of the Burrows-Wheeler Transform (BWT) and the Longest Common Prefix (LCP) array, since solving efficiently both problems are essential ingredients of several algorithms on a collection of strings. In this paper we explore lightweight and parallel computational strategies for building the BWT and LCP array. We design a novel algorithm based on a divide and conquer approach that leads to a simultaneous and parallel computation of multi-string BWT and LCP array.
Bonizzoni, P., Della Vedova, G., Nicosia, S., Pirola, Y., Previtali, M., Rizzi, R. (2018). Divide and conquer computation of the multi-string BWT and LCP array. In Sailing Routes in the World of Computation (pp.107-117). Springer Verlag [10.1007/978-3-319-94418-0_11].
Divide and conquer computation of the multi-string BWT and LCP array
Bonizzoni, P;Della Vedova, G;Pirola, Y;Previtali, M;Rizzi, R
2018
Abstract
Indexing huge collections of strings, such as those produced by the widespread sequencing technologies, heavily relies on multi-string generalizations of the Burrows-Wheeler Transform (BWT) and the Longest Common Prefix (LCP) array, since solving efficiently both problems are essential ingredients of several algorithms on a collection of strings. In this paper we explore lightweight and parallel computational strategies for building the BWT and LCP array. We design a novel algorithm based on a divide and conquer approach that leads to a simultaneous and parallel computation of multi-string BWT and LCP array.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.