Similarity based on data compression

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F13%3A86088863" target="_blank" >RIV/61989100:27240/13:86088863 - isvavai.cz</a>
Alternative codes found
RIV/61989100:27740/13:86088863
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-642-45111-9_24" target="_blank" >http://dx.doi.org/10.1007/978-3-642-45111-9_24</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-642-45111-9_24" target="_blank" >10.1007/978-3-642-45111-9_24</a>

Alternative languages

Result language
English
Title in the original language
Similarity based on data compression
Result description in the original language
Similarity detection is one of the most important areas in document processing. Its applications range from spam detection, through the identification of plagiarism on the web and in bachelor's or master's theses, to the identification of copied scientific papers. This paper presents an improvement to a plagiarism detection algorithm based on the Lempel and Ziv dictionary-based compression algorithm: stop words are removed before processing. The improved algorithm is tested on a real dataset. A visualization of the relationships between plagiarized documents is also presented. The results confirm that the algorithm can detect plagiarized parts of a text and that applying the suggested modification improves the results.
Title in English
Similarity based on data compression
Result description in English
Similarity detection is one of the most important areas in document processing. Its applications range from spam detection, through the identification of plagiarism on the web and in bachelor's or master's theses, to the identification of copied scientific papers. This paper presents an improvement to a plagiarism detection algorithm based on the Lempel and Ziv dictionary-based compression algorithm: stop words are removed before processing. The improved algorithm is tested on a real dataset. A visualization of the relationships between plagiarized documents is also presented. The results confirm that the algorithm can detect plagiarized parts of a text and that applying the suggested modification improves the results.
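
The record gives no implementation details, so the following is only a minimal sketch of how a dictionary-based similarity measure with stop-word removal could look. It assumes an LZ78-style parse over words, a tiny hypothetical stop-word list, and a similarity score defined as the number of shared dictionary phrases divided by the size of the smaller dictionary. The function names and the normalization are illustrative assumptions, not the method published in the paper.

```python
# Hypothetical stop-word list, for illustration only (not taken from the paper).
STOP_WORDS = frozenset({"a", "an", "the", "of", "in", "on", "and", "is", "to"})


def tokenize(text, remove_stop_words=True):
    """Lowercase the text, strip surrounding punctuation, optionally drop stop words."""
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    if remove_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    return words


def lz78_phrases(words):
    """Parse a word sequence LZ78-style and return the set of dictionary phrases.

    Each new phrase is the longest already-known phrase extended by one word.
    """
    phrases = set()
    current = ()
    for word in words:
        current = current + (word,)
        if current not in phrases:
            phrases.add(current)
            current = ()
    return phrases


def similarity(text_a, text_b, remove_stop_words=True):
    """Assumed score: shared phrases divided by the size of the smaller dictionary."""
    phrases_a = lz78_phrases(tokenize(text_a, remove_stop_words))
    phrases_b = lz78_phrases(tokenize(text_b, remove_stop_words))
    if not phrases_a or not phrases_b:
        return 0.0
    return len(phrases_a & phrases_b) / min(len(phrases_a), len(phrases_b))
```

Calling `similarity(doc_a, doc_b)` returns a value between 0.0 and 1.0. Passing `remove_stop_words=False` allows a comparison of the score with and without the stop-word step.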

Classification

Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—

Result continuities

Project
The result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>S - Specific research at universities

Others

Publication year
2013
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Title of the article in the proceedings
Lecture Notes in Computer Science. Volume 8266
ISBN
978-3-642-45110-2
ISSN
0302-9743
e-ISSN
—
Number of pages
12
Pages from-to
267-278
Publisher name
Springer Verlag
Place of publication
London
Event location
Mexico City
Event date
24. 11. 2013
Event type by nationality of participants
WRD - Worldwide event
UT WoS article code
—
