Heterogeneous Queries for Synoptic and Phrasal Search
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F14%3A00077319" target="_blank" >RIV/00216224:14330/14:00077319 - isvavai.cz</a>
Result on the web
<a href="http://ceur-ws.org/Vol-1180/" target="_blank" >http://ceur-ws.org/Vol-1180/</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Heterogeneous Queries for Synoptic and Phrasal Search
Original language description
This paper describes our approaches to the Plagiarism Detection - Source Retrieval task of PAN 2014. We combined and improved the methodology used at PAN 2012 and PAN 2013. Our system combines three types of queries: keyword-based queries, paragraph-based queries, and header-based queries. The queries are also distinguished by other properties, such as whether they are phrase queries or positional queries. The queries are submitted to two search engines, ChatNoir and Indri, according to their properties. The query's position is used to control the search; minimizing the total number of executed queries is the system's priority. Downloaded documents are textually compared with the suspicious document, and if a similarity is found, the downloaded document is reported.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/LG13010" target="_blank" >LG13010: Czech Republic representation in the European Research Consortium for Informatics and Mathematics (ERCIM)</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2014
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
CLEF2014 Working Notes
ISBN
—
ISSN
1613-0073
e-ISSN
—
Number of pages
4
Pages from-to
1017-1020
Publisher name
CEUR, Aachen University
Place of publication
Sheffield, UK
Event location
Sheffield, UK
Event date
Jan 1, 2014
Type of event by nationality
CST - National event
UT code for WoS article
—