Similarity based on data compression
The result's identifiers
Result code in IS VaVaI
RIV/61989100:27240/13:86088863 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F13%3A86088863)
Alternative codes found
RIV/61989100:27740/13:86088863
Result on the web
http://dx.doi.org/10.1007/978-3-642-45111-9_24
DOI - Digital Object Identifier
10.1007/978-3-642-45111-9_24
Alternative languages
Result language
English
Original language name
Similarity based on data compression
Original language description
Similarity detection is one of the most important areas in document processing. Its applications range from spam detection, through identification of plagiarism on the web and in bachelor's or master's theses, to identification of copied scientific papers. This paper presents an improvement of a plagiarism detection algorithm based on the Lempel and Ziv dictionary-based compression algorithm, achieved by removing stop words, and tests the algorithm on a real dataset. A visualization of the relationships between plagiarized documents is also presented. The results confirm the algorithm's ability to detect plagiarized parts of text and show the improvement achieved when the suggested modifications are applied.
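The idea in the description can be illustrated with a minimal sketch. It is not the paper's implementation: the record does not give the exact similarity measure, the stop-word list, or the tokenization, so the `STOP_WORDS` set, the helper names, and the overlap formula below are all assumptions chosen for illustration. The sketch builds an LZ78-style dictionary of word phrases for each document and scores two documents by how many phrases their dictionaries share.

```python
# Hypothetical stop-word list; the record does not say which list the paper uses.
STOP_WORDS = frozenset({"a", "an", "the", "of", "in", "and", "is", "to", "on"})


def tokenize(text, remove_stop_words=True):
    """Lowercase, split on whitespace, strip punctuation, optionally drop stop words."""
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    return [w for w in words if w and not (remove_stop_words and w in STOP_WORDS)]


def lz78_phrases(words):
    """Build an LZ78-style dictionary of word phrases (stored as tuples of words).

    The current phrase is extended word by word until it is not yet in the
    dictionary; it is then added and a new phrase is started. A trailing
    phrase that is already in the dictionary is simply dropped.
    """
    phrases = set()
    current = ()
    for w in words:
        current += (w,)
        if current not in phrases:
            phrases.add(current)
            current = ()
    return phrases


def similarity(text_a, text_b, remove_stop_words=True):
    """Assumed measure: shared phrases divided by the size of the smaller dictionary."""
    pa = lz78_phrases(tokenize(text_a, remove_stop_words))
    pb = lz78_phrases(tokenize(text_b, remove_stop_words))
    if not pa or not pb:
        return 0.0
    return len(pa & pb) / min(len(pa), len(pb))
```

Calling `similarity(doc_a, doc_b)` with and without `remove_stop_words` gives a rough feel for how stop-word removal changes the phrase dictionaries, which is the kind of effect the description refers to.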
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
Result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
S - Specific research at universities
Others
Publication year
2013
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Lecture Notes in Computer Science. Volume 8266
ISBN
978-3-642-45110-2
ISSN
0302-9743
e-ISSN
—
Number of pages
12
Pages from-to
267-278
Publisher name
Springer Verlag
Place of publication
London
Event location
Mexico City
Event date
Nov 24, 2013
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—