We present a detailed description of an algorithm tailored to detect external plagiarism in the PAN-09 competition. The algorithm is divided into three steps: (1) a first reduction of the size of the problem through the selection of ten suspicious plagiarists, using an n-gram distance on properly recoded texts; (2) a search for matches after T9-like recoding; (3) a “joining algorithm” that merges selected matches and is able to detect obfuscated plagiarism. The results are briefly discussed.
Keywords: n-grams, plagiarism, coding, string matching
Basile, C., Benedetto, D., Caglioti, E., Cristadoro, G., & Degli Esposti, M. (2009). A plagiarism detection procedure in three steps: selection, matches and “squares”. In Proceedings of SEPLN 2009 - 3rd Workshop on Uncovering Plagiarism, Authorship and Social Software Misuse, PAN 2009 and 1st International Competition on Plagiarism Detection; San Sebastian (Donostia); Spain; 10 September 2009 (pp. 19-23).
Type: | paper | |
Nature of the publication: | Scientific | |
Co-author affiliated with a foreign institution: | No | |
Title: | A plagiarism detection procedure in three steps: selection, matches and “squares” | |
Authors: | Basile, C; Benedetto, D; Caglioti, E; Cristadoro, G; Degli Esposti, M | |
Publication date: | 2009 | |
Language: | English | |
Conference name: | 3rd PAN Workshop. Uncovering Plagiarism, Authorship And Social Software Misuse with 25th Annual Conference of the Spanish Society for Natural Language Processing, SEPLN 2009 | |
Series: | CEUR WORKSHOP PROCEEDINGS | |
Appears in types: | 02 - Conference contribution |
Files in this item:
File | Description | Type | License | |
---|---|---|---|---|
AAAplagioPAN_pubblicato.pdf | post-print | N/A | Administrator |