Finding Definitions in Large Corpora with Sketch Engine
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F16%3A00088334" target="_blank" >RIV/00216224:14330/16:00088334 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
angličtina
Original language name
Finding Definitions in Large Corpora with Sketch Engine
Original language description
The paper describes automatic definition finding implemented within the leading corpus query and management tool, Sketch Engine. The implementation exploits complex pattern-matching queries in the corpus query language (CQL) and the indexing mechanism of word sketches for finding and storing definition candidates throughout the corpus. The approach is evaluated for Czech and English corpora, showing that the results are usable in practice: precision of the tool ranges between 30 and 75 percent (depending on the major corpus text types) and we were able to extract nearly 2 million definition candidates from an English corpus with 1.4 billion words. The feature is embedded into the interface as a concordance filter, so that users can search for definitions of any query to the corpus, including very specific multi-word queries.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
Result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)
Others
Publication year
2016
Confidentiality
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Data specific for result type
Article name in the collection
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
ISBN
9782951740891
ISSN
—
e-ISSN
—
Number of pages
4
Pages from-to
391-394
Publisher name
European Language Resources Association (ELRA)
Place of publication
Portorož, Slovenia
Event location
Portorož, Slovenia
Event date
Jan 1, 2016
Type of event by nationality
WRD - Celosvětová akce
UT code for WoS article
—