Utilizing Text Similarity Measurement for Data Compression to Detect Plagiarism in Czech

The result's identifiers

  • Result code in IS VaVaI

    RIV/61989100:27240/15:86092341 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F15%3A86092341)

  • Alternative codes found

    RIV/61989100:27740/15:86092341

  • Result on the web

    http://dx.doi.org/10.1007/978-3-319-13572-4_13

  • DOI - Digital Object Identifier

    10.1007/978-3-319-13572-4_13

Alternative languages

  • Result language

    English

  • Original language name

    Utilizing Text Similarity Measurement for Data Compression to Detect Plagiarism in Czech

  • Original language description

    This paper attempts to apply a data-compression-based similarity method to plagiarism detection. The method has been used earlier for plagiarism detection in Arabic and English. In this paper we utilize this method for Czech-language text from a local multi-domain Czech corpus with 50 original documents with non-plagiarized parts and 100 suspicious documents. The documents were generated so that every document could have from 1 to 5 paragraphs. The suspicion rate in the documents was randomly chosen from 0.2 to 0.8. The findings of the study show that the similarity measurement based on Lempel-Ziv comparison algorithms is efficient for the plagiarized part of the Czech text documents, with a success rate of 82.60%. Future studies may enhance the efficiency of the algorithms by including combined and more sophisticated methods.

  • Czech name

  • Czech description
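
The original-language description above refers to a similarity measurement based on Lempel-Ziv comparison. As a rough illustration only, the sketch below parses each text into LZ78-style phrases and scores two texts by how many phrases they share. This is not the paper's implementation: the character-level parsing, the function names `lz78_phrases` and `phrase_similarity`, and the choice of dividing by the smaller phrase set are all assumptions made for this example.

```python
def lz78_phrases(text: str) -> set[str]:
    """Parse text into LZ78-style phrases (hypothetical helper).

    Each new phrase is the shortest prefix of the remaining input
    that has not been seen as a phrase before.
    """
    phrases: set[str] = set()
    current = ""
    for ch in text:
        current += ch
        if current not in phrases:
            phrases.add(current)
            current = ""
    # A trailing partial phrase (already in the set) is simply dropped.
    return phrases


def phrase_similarity(a: str, b: str) -> float:
    """Share of common phrases relative to the smaller phrase set.

    Assumed scoring rule for illustration: 0.0 means no shared
    phrases, 1.0 means the smaller set is fully contained in the other.
    """
    pa, pb = lz78_phrases(a), lz78_phrases(b)
    if not pa or not pb:
        return 0.0
    return len(pa & pb) / min(len(pa), len(pb))


if __name__ == "__main__":
    original = "Podobnost textu lze odhadnout pomocí komprese dat."
    suspicious = "Podobnost textu lze odhadnout pomocí komprese dat. Další věta."
    unrelated = "1234567890 1234567890 1234567890"
    print(phrase_similarity(original, suspicious))
    print(phrase_similarity(original, unrelated))
```

A text that copies a passage verbatim shares most of its early phrases with the source, so its score would be expected to be higher than that of an unrelated text. A real detector would still need a decision threshold and per-paragraph comparison, neither of which is specified here.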

Classification

  • Type

    D - Article in proceedings

  • CEP classification

    IN - Informatics

  • OECD FORD branch

Result continuities

  • Project

  • Continuities

    S - Specific research at universities

Others

  • Publication year

    2015

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    Advances in Intelligent Systems and Computing. Volume 334

  • ISBN

    978-3-319-13571-7

  • ISSN

    2194-5357

  • e-ISSN

  • Number of pages

    10

  • Pages from-to

    163-182

  • Publisher name

    Springer

  • Place of publication

    New York

  • Event location

    Addis Ababa

  • Event date

    Nov 17, 2014

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article