Heuristic and ai approach to optimize plagiarism detection tool using a public search engine

Result identifiers

Result code in IS VaVaI
RIV/62156489:43110/12:43908973 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62156489%3A43110%2F12%3A43908973)
Result on the web
—
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Title in the original language
Heuristic and ai approach to optimize plagiarism detection tool using a public search engine
Result description in the original language
The paper reports experience with methods for efficiently populating a database of possible plagiarism sources. Each document is checked for potential plagiarism using a public search engine. To maximize both the relevance of the results and the speed of examination, the fragments of the source documents used as queries have to be chosen very carefully. We tried a naive approach, heuristics, and neural networks to optimize the number of queries sent to the public search engine. We found that a neural network is of no use without a bigram or trigram frequency dictionary, which indicates that context is important for querying. The most efficient way to speed up the matching is to learn to estimate the plagiarism probability for each part of the document and to use that estimate when building the queries for the search engine.
Title in English
Heuristic and ai approach to optimize plagiarism detection tool using a public search engine
Result description in English
The paper reports experience with methods for efficiently populating a database of possible plagiarism sources. Each document is checked for potential plagiarism using a public search engine. To maximize both the relevance of the results and the speed of examination, the fragments of the source documents used as queries have to be chosen very carefully. We tried a naive approach, heuristics, and neural networks to optimize the number of queries sent to the public search engine. We found that a neural network is of no use without a bigram or trigram frequency dictionary, which indicates that context is important for querying. The most efficient way to speed up the matching is to learn to estimate the plagiarism probability for each part of the document and to use that estimate when building the queries for the search engine.
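The idea of scoring document parts and querying only the most promising ones can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not describe their heuristic or network, so the rarity score below (mean of 1 / (1 + bigram count)) is an assumed stand-in for a learned plagiarism-probability estimate, and the names `build_bigram_counts`, `fragment_score`, `select_queries`, the fixed fragment length, and the toy reference corpus are all hypothetical.

```python
from collections import Counter


def bigrams(words):
    """Return the list of adjacent word pairs in `words`."""
    return list(zip(words, words[1:]))


def build_bigram_counts(corpus_texts):
    """Count bigrams over a reference corpus (a hypothetical frequency dictionary)."""
    counts = Counter()
    for text in corpus_texts:
        counts.update(bigrams(text.lower().split()))
    return counts


def fragment_score(words, bigram_counts):
    """Mean rarity of the fragment's bigrams, in (0, 1]; higher is more distinctive.

    Assumption: rarity stands in for an estimated plagiarism probability.
    An unseen bigram contributes 1.0, a bigram seen n times contributes 1 / (1 + n).
    """
    pairs = bigrams(words)
    if not pairs:
        return 0.0
    return sum(1.0 / (1 + bigram_counts[p]) for p in pairs) / len(pairs)


def select_queries(document, bigram_counts, fragment_len=6, max_queries=2):
    """Split the document into fixed-length fragments and keep the top-scoring ones."""
    words = document.lower().split()
    fragments = [words[i:i + fragment_len] for i in range(0, len(words), fragment_len)]
    # Fragments with fewer than two words contain no bigram and cannot be scored.
    fragments = [f for f in fragments if len(f) >= 2]
    ranked = sorted(fragments, key=lambda f: fragment_score(f, bigram_counts), reverse=True)
    return [" ".join(f) for f in ranked[:max_queries]]


if __name__ == "__main__":
    counts = build_bigram_counts(["the results of the experiment are shown in the table"])
    doc = ("the results of the experiment are "
           "quantum flux capacitors resonate beyond measured thresholds")
    # Only the selected fragments would be sent to the search engine.
    print(select_queries(doc, counts, max_queries=1))
```

Limiting `max_queries` is what reduces the load on the search engine in this sketch: fragments made of common bigrams are skipped, and only the distinctive ones are turned into queries.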

Classification

Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—

Result continuities

Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation

Others

Publication year
2012
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Article name in the proceedings
Proceedings of the IADIS International Conference WWW/Internet 2012 (ICWI 2012)
ISBN
978-989-8533-09-8
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
399-403
Publisher name
IADIS (International Association for Development of the Information Society)
Place of publication
Lisbon
Event location
Madrid
Event date
18 October 2012
Event type by nationality
WRD - Worldwide event
UT WoS article code
—

Similar results (10)

Heterogeneous Queries for Synoptic and Phrasal Search
Using kohonen maps and singular value decomposition for plagiarism detection
Grow up precision recall relationship curve in IR system using GP and fuzzy optimization in optimizing the user query
