An elastic-degenerate (ED) string is a sequence of n sets of strings of total length N, which was recently proposed to model a set of similar sequences. The ED string matching (EDSM) problem is to find all occurrences of a pattern of length m in an ED text. The EDSM problem has recently received some attention in the combinatorial pattern matching community, and an O(nm1.5 √log m + N)-time algorithm is known [Aoyama et al., CPM 2018]. The standard assumption in the prior work on this question is that N is substantially larger than both n and m, and thus we would like to have a linear dependency on the former. Under this assumption, the natural open problem is whether we can decrease the 1.5 exponent in the time complexity, similarly as in the related (but, to the best of our knowledge, not equivalent) word break problem [Backurs and Indyk, FOCS 2016]. Our starting point is a conditional lower bound for the EDSM problem. We use the popular combinatorial Boolean matrix multiplication (BMM) conjecture stating that there is no truly subcubic combinatorial algorithm for BMM [Abboud and Williams, FOCS 2014]. By designing an appropriate reduction we show that a combinatorial algorithm solving the EDSM problem in O(nm1.5−∊ + N) time, for any ∊ > 0, refutes this conjecture. Of course, the notion of combinatorial algorithms is not clearly defined, so our reduction should be understood as an indication that decreasing the exponent requires fast matrix multiplication. Two standard tools used in algorithms on strings are string periodicity and fast Fourier transform. Our main technical contribution is that we successfully combine these tools with fast matrix multiplication to design a non-combinatorial O(nm1.381 + N)-time algorithm for EDSM. To the best of our knowledge, we are the first to do so.

Bernardini, G., Gawrychowski, P., Pisanti, N., Pissis, S., Rosone, G. (2019). Even faster elastic-degenerate string matching via fast matrix multiplication. In The ICALP 2019 Proceedings (pp.1-5). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing [10.4230/LIPIcs.ICALP.2019.21].

Even faster elastic-degenerate string matching via fast matrix multiplication

Bernardini, G;
2019

Abstract

An elastic-degenerate (ED) string is a sequence of n sets of strings of total length N, which was recently proposed to model a set of similar sequences. The ED string matching (EDSM) problem is to find all occurrences of a pattern of length m in an ED text. The EDSM problem has recently received some attention in the combinatorial pattern matching community, and an O(nm1.5 √log m + N)-time algorithm is known [Aoyama et al., CPM 2018]. The standard assumption in the prior work on this question is that N is substantially larger than both n and m, and thus we would like to have a linear dependency on the former. Under this assumption, the natural open problem is whether we can decrease the 1.5 exponent in the time complexity, similarly as in the related (but, to the best of our knowledge, not equivalent) word break problem [Backurs and Indyk, FOCS 2016]. Our starting point is a conditional lower bound for the EDSM problem. We use the popular combinatorial Boolean matrix multiplication (BMM) conjecture stating that there is no truly subcubic combinatorial algorithm for BMM [Abboud and Williams, FOCS 2014]. By designing an appropriate reduction we show that a combinatorial algorithm solving the EDSM problem in O(nm1.5−∊ + N) time, for any ∊ > 0, refutes this conjecture. Of course, the notion of combinatorial algorithms is not clearly defined, so our reduction should be understood as an indication that decreasing the exponent requires fast matrix multiplication. Two standard tools used in algorithms on strings are string periodicity and fast Fourier transform. Our main technical contribution is that we successfully combine these tools with fast matrix multiplication to design a non-combinatorial O(nm1.381 + N)-time algorithm for EDSM. To the best of our knowledge, we are the first to do so.
paper
Elastic-degenerate string; Fast Fourier transform; Matrix multiplication; Pattern matching; String algorithms;
string algorithms; pattern matching; elastic-degenerate string; matrix multiplication; fast Fourier transform
English
46th International Colloquium on Automata, Languages, and Programming, ICALP 2019
2019
Chatzigiannakis, I; Baier, C; Leonardi, S; Flocchini, P
The ICALP 2019 Proceedings
9783959771092
2019
132
1
5
21
none
Bernardini, G., Gawrychowski, P., Pisanti, N., Pissis, S., Rosone, G. (2019). Even faster elastic-degenerate string matching via fast matrix multiplication. In The ICALP 2019 Proceedings (pp.1-5). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing [10.4230/LIPIcs.ICALP.2019.21].
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10281/235518
Citazioni
  • Scopus 27
  • ???jsp.display-item.citation.isi??? ND
Social impact