Query a corpus in near-natural language: A human-friendly corpus query language not only for linguists
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F24%3A10488519" target="_blank" >RIV/00216208:11210/24:10488519 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1075/scl.119.10mil" target="_blank" >https://doi.org/10.1075/scl.119.10mil</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1075/scl.119.10mil" target="_blank" >10.1075/scl.119.10mil</a>
Alternative languages
Result language
English
Title in original language
Query a corpus in near-natural language: A human-friendly corpus query language not only for linguists
Result description in original language
This paper addresses the pressing issue of accessibility of corpora to users who are not able or willing to learn a formal query language. It introduces a working online automatic translator from a near-natural language into the Corpus Query Language (CQL), as used in SketchEngine, Czech National Corpus web applications, and other services. The translator does not require strict syntactical patterns and allows for a certain amount of typing errors, using the redundancy associated with natural language. It allows querying corpora of 35 languages hosted by the Czech National Corpus infrastructure, all of them annotated in the Universal Dependencies formalism. Alternatively, the translated CQL code can be employed in other compatible systems. The paper both presents the theoretical assumptions of our solution and outlines the details of its implementation, including examples of use.
Title in English
Query a corpus in near-natural language: A human-friendly corpus query language not only for linguists
Result description in English
This paper addresses the pressing issue of accessibility of corpora to users who are not able or willing to learn a formal query language. It introduces a working online automatic translator from a near-natural language into the Corpus Query Language (CQL), as used in SketchEngine, Czech National Corpus web applications, and other services. The translator does not require strict syntactical patterns and allows for a certain amount of typing errors, using the redundancy associated with natural language. It allows querying corpora of 35 languages hosted by the Czech National Corpus infrastructure, all of them annotated in the Universal Dependencies formalism. Alternatively, the translated CQL code can be employed in other compatible systems. The paper both presents the theoretical assumptions of our solution and outlines the details of its implementation, including examples of use.
Classification
Type
C - Chapter in a scholarly book
CEP field
—
OECD FORD field
60203 - Linguistics
Result linkages
Project
—
Linkages
I - Institutional support for the long-term conceptual development of a research organisation
Other
Year of implementation
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the book or proceedings
Studies in Corpus Linguistics
ISBN
978-90-272-1594-9
Number of pages of the result
15
Pages from-to
248-262
Number of pages of the book
266
Publisher name
John Benjamins
Place of publication
Amsterdam
UT WoS code of the chapter
—