Inspired by a recent approach for genome reconstruction from incomplete data, we consider a variant of the longest common subsequence problem for the comparison of two sequences, one of which is incomplete, i.e. it has some missing elements. The new combinatorial problem, called Longest Filled Common Subsequence, given two sequences A and B, and a multiset M of symbols missing in B, asks for a sequence B∗ obtained by inserting the symbols of M into B so that B∗ induces a common subsequence with A of maximum length. First, we investigate the computational and approximation complexity of the problem and we show that it is NP-hard and APX-hard when A contains at most two occurrences of each symbol. Then, we give a 3/5-approximation algorithm for the problem. Finally, we present a fixed-parameter algorithm, when the problem is parameterized by the number of symbols inserted in B that "match" symbols of A.

Castelli, M., Dondi, R., Mauri, G., Zoppis, I. (2017). The longest filled common subsequence problem. In Proceedings CPM 2017 – 28th Annual Symposium on Combinatorial Pattern Matching (pp.1-14). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing [10.4230/LIPIcs.CPM.2017.14].

The longest filled common subsequence problem

MAURI, GIANCARLO
Penultimo
;
ZOPPIS, ITALO FRANCESCO
Ultimo
2017

Abstract

Inspired by a recent approach for genome reconstruction from incomplete data, we consider a variant of the longest common subsequence problem for the comparison of two sequences, one of which is incomplete, i.e. it has some missing elements. The new combinatorial problem, called Longest Filled Common Subsequence, given two sequences A and B, and a multiset M of symbols missing in B, asks for a sequence B∗ obtained by inserting the symbols of M into B so that B∗ induces a common subsequence with A of maximum length. First, we investigate the computational and approximation complexity of the problem and we show that it is NP-hard and APX-hard when A contains at most two occurrences of each symbol. Then, we give a 3/5-approximation algorithm for the problem. Finally, we present a fixed-parameter algorithm, when the problem is parameterized by the number of symbols inserted in B that "match" symbols of A.
slide + paper
Approximation algorithms; Computational complexity; Fixed-parameter algorithms; Longest common subsequence
English
28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017
2017
Kärkkäinen, J; Radoszewski, J; Rytter, W
Proceedings CPM 2017 – 28th Annual Symposium on Combinatorial Pattern Matching
9783959770392
2017
78
1
14
14
http://drops.dagstuhl.de/opus/institut_lipics.php?fakultaet=04
open
Castelli, M., Dondi, R., Mauri, G., Zoppis, I. (2017). The longest filled common subsequence problem. In Proceedings CPM 2017 – 28th Annual Symposium on Combinatorial Pattern Matching (pp.1-14). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing [10.4230/LIPIcs.CPM.2017.14].
File in questo prodotto:
File Dimensione Formato  
C173-CPM 17.pdf

accesso aperto

Dimensione 564.31 kB
Formato Adobe PDF
564.31 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10281/168555
Citazioni
  • Scopus 7
  • ???jsp.display-item.citation.isi??? ND
Social impact