Utilizing Text Similarity Measurement for Data Compression to Detect Plagiarism in Czech
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F15%3A86092341" target="_blank" >RIV/61989100:27240/15:86092341 - isvavai.cz</a>
Alternative codes found
RIV/61989100:27740/15:86092341
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-319-13572-4_13" target="_blank" >http://dx.doi.org/10.1007/978-3-319-13572-4_13</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-13572-4_13" target="_blank" >10.1007/978-3-319-13572-4_13</a>
Alternative languages
Result language
English
Original language name
Utilizing Text Similarity Measurement for Data Compression to Detect Plagiarism in Czech
Original language description
This paper attempts to apply data compression based similarity method for plagiarism detection. The method has been used earlier for plagiarism detection for Arabic and English languages. In this paper we utilize this method for Czech language text from a local multi-domain Czech corpus with 50 original documents with non-plagiarized parts, and 100 suspicious documents. The documents were generated so that every document could have from 1 to 5 paragraphs. The suspicion rate in the documents was randomly chosen from 0.2 to 0.8. The findings of the study show that the similarity measurement based on Lempel-Ziv comparison algorithms is efficient for the plagiarized part of the Czech text documents with a success rate of 82.60%. Future studies may enhance the efficiency of the algorithms by including combined and more sophisticated methods.
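The description above does not give the paper's exact similarity formula, so the following is only a minimal sketch of the general idea of compression-based similarity. It assumes the normalized compression distance (NCD) as a stand-in measure and uses Python's `zlib` (DEFLATE, which builds on LZ77) as the compressor. The function names `compressed_size` and `ncd` and the sample sentences are hypothetical and do not come from the paper.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Return the length in bytes of the zlib-compressed input."""
    return len(zlib.compress(data, 9))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two texts.

    Values near 0 suggest the texts share a lot of content;
    values near 1 suggest they share little.
    This is an assumed stand-in, not the paper's own measure.
    """
    x = a.encode("utf-8")
    y = b.encode("utf-8")
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Hypothetical sample texts, for illustration only.
    original = "Komprese dat umožňuje měřit podobnost dvou textů bez jazykových znalostí."
    suspicious = original + " Tato věta je přidána navíc."
    unrelated = "Zítra odpoledne má v Ostravě pršet a teplota klesne pod deset stupňů."

    print("original vs suspicious:", ncd(original, suspicious))
    print("original vs unrelated: ", ncd(original, unrelated))
```

A suspicious paragraph that reuses text from an original should compress well when concatenated with it, giving a lower distance than an unrelated paragraph. Any decision threshold would have to be tuned on a corpus and is not specified here.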
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
—
Continuities
S - Specific university research
Others
Publication year
2015
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Advances in Intelligent Systems and Computing. Volume 334
ISBN
978-3-319-13571-7
ISSN
2194-5357
e-ISSN
—
Number of pages
10
Pages from-to
163-182
Publisher name
Springer
Place of publication
New York
Event location
Addis Ababa
Event date
Nov 17, 2014
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—