Bicocca Open Archive

Castelli, M., Dondi, R., Mauri, G., Zoppis, I. (2019). Comparing incomplete sequences via longest common subsequence. THEORETICAL COMPUTER SCIENCE, 796, 272-285 [10.1016/j.tcs.2019.09.022].

Comparing incomplete sequences via longest common subsequence

Castelli M.; Dondi R.; Mauri G.; Zoppis I.

2019

Abstract

Inspired by scaffold filling, a recent approach to genome reconstruction from incomplete data, we consider a variant of the well-known longest common subsequence problem for comparing two sequences. The new problem, called Longest Filled Common Subsequence (LFCS), compares a complete sequence with an incomplete one, i.e., one with some missing elements. Given a complete sequence A, an incomplete sequence B, and a multiset M of symbols missing in B, LFCS asks for a sequence B⁎ obtained by inserting the symbols of M into B so that B⁎ induces a common subsequence with A of maximum length. We investigate the computational and approximation complexity of the problem: we show that it is NP-hard and APX-hard when A contains at most two occurrences of each symbol, and we give a polynomial-time algorithm when the input sequences are over a constant-size alphabet. We also give a [Formula presented] approximation algorithm for LFCS. Finally, we present a fixed-parameter algorithm for the problem, parameterized by the number of symbols inserted in B that “match” symbols of A.
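
To make the problem statement concrete, the following is a minimal brute-force sketch in Python, written from the definition in the abstract only. It is not one of the paper's algorithms: it enumerates every way of inserting the symbols of M into B and keeps a filling whose classical LCS with A is longest, so it is exponential and only usable on tiny instances. The function names (`lcs_length`, `fillings`, `lfcs_bruteforce`) and the choice to represent sequences as Python strings are assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence of a and b (classic DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def fillings(b, m):
    """All distinct strings obtained by inserting every symbol of m into b."""
    results = {b}
    for sym in m:
        # Insert the next missing symbol at every possible position.
        results = {s[:i] + sym + s[i:] for s in results for i in range(len(s) + 1)}
    return results


def lfcs_bruteforce(a, b, m):
    """Return (filled sequence, LCS length with a); exponential, illustration only."""
    best = max(fillings(b, m), key=lambda s: lcs_length(a, s))
    return best, lcs_length(a, best)


if __name__ == "__main__":
    # Hypothetical toy instance: A = "acbd", B = "ab", M = {c, d}.
    print(lfcs_bruteforce("acbd", "ab", ["c", "d"]))
```

On the toy instance, the unfilled B shares a common subsequence of length 2 with A, while a suitable filling reaches length 4. Inserting all of M is harmless in this sketch because adding symbols to a sequence never shortens its LCS with A.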

Subtype	Journal article - Scientific article
Keywords	Approximation algorithms; Computational complexity; Fixed-parameter algorithms; Longest common subsequence; String algorithms
Content language	English
Publication date	2019
Journal	THEORETICAL COMPUTER SCIENCE
Volume	796
First page	272
Last page	285
Article DOI	https://dx.doi.org/10.1016/j.tcs.2019.09.022
Alternative URL	http://www.journals.elsevier.com/theoretical-computer-science/
Full text	Restricted access
Appears in types	01 - Journal article

Files in this item:

File	Access	Attachment type	Size	Format
Comparing incomplete sequences via longest common subsequence.pdf	Archive managers only	Publisher’s Version (Version of Record, VoR)	619.8 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10281/253649
