Query a corpus in near-natural language: A human-friendly corpus query language not only for linguists
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3AKQF5AUVX" target="_blank" >RIV/00216208:11320/25:KQF5AUVX - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85207419254&doi=10.1075%2fscl.119.10mil&partnerID=40&md5=50fcc8f7f7d797775ddb24465bac4910" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85207419254&doi=10.1075%2fscl.119.10mil&partnerID=40&md5=50fcc8f7f7d797775ddb24465bac4910</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1075/scl.119.10mil" target="_blank" >10.1075/scl.119.10mil</a>
Alternative languages
Result language
English
Original language name
Query a corpus in near-natural language: A human-friendly corpus query language not only for linguists
Original language description
This paper addresses the pressing issue of accessibility of corpora to users who are not able or willing to learn a formal query language. It introduces a working online automatic translator from a near-natural language into the Corpus Query Language (CQL), as used in SketchEngine, Czech National Corpus web applications, and other services. The translator does not require strict syntactical patterns and allows for a certain amount of typing errors, using the redundancy associated with natural language. It allows querying corpora of 35 languages hosted by the Czech National Corpus infrastructure, all of them annotated in the Universal Dependencies formalism. Alternatively, the translated CQL code can be employed in other compatible systems. The paper both presents the theoretical assumptions of our solution and outlines the details of its implementation, including examples of use. © 2024 John Benjamins Publishing Company.
Czech name
—
Czech description
—
Classification
Type
J<sub>SC</sub> - Article in a specialist periodical, which is included in the SCOPUS database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Stud. Corpus Linguist.
ISSN
1388-0373
e-ISSN
—
Volume of the periodical
119
Issue of the periodical within the volume
2024
Country of publishing house
US - UNITED STATES
Number of pages
15
Pages from-to
248-262
UT code for WoS article
—
EID of the result in the Scopus database
2-s2.0-85207419254