Od korpusu jako otevřeného zdroje pro bádání ke komerčním produktům

Result identifiers

Result code in IS VaVaI
RIV/68378092:_____/07:00097207 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68378092%3A_____%2F07%3A00097207)
Result on the web
—
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Title in original language
From the corpus as an open source for investigation to commercial products
Result description in original language
The development of corpora is sketched, from large collections of untagged texts through tagged corpora to machines that operate on tagged corpora and produce data presented as data about language, such as Word Sketches (TM). The article remarks that every corpus is merely a representation of texts and that the quality of that representation must be examined. The unavoidable question in research is how the corpus is built and how, and on what principles, the service software operates. Both when we explore a corpus with distortions, where texts appear in a form in which nobody wrote them (digits and their surroundings tend to be phenomena of this sort), and when we are not allowed to look "under the bonnet" or to change working parameters, we can hardly speak of doing scholarly research.
Title in English
From the corpus as an open source for investigation to commercial products
Result description in English
The development of corpora is sketched, from large collections of untagged texts through tagged corpora to machines that operate on tagged corpora and produce data presented as data about language, such as Word Sketches (TM). The article remarks that every corpus is merely a representation of texts and that the quality of that representation must be examined. The unavoidable question in research is how the corpus is built and how, and on what principles, the service software operates. Both when we explore a corpus with distortions, where texts appear in a form in which nobody wrote them (digits and their surroundings tend to be phenomena of this sort), and when we are not allowed to look "under the bonnet" or to change working parameters, we can hardly speak of doing scholarly research.

Classification

Type
D - Article in proceedings
CEP field
AI - Linguistics
OECD FORD field
—

Result continuities

Project
GA405/03/0377: Možnosti a meze gramatiky češtiny ve světle Českého národního korpusu
Continuities
Z - Research plan (with a link to CEZ)

Others

Publication year
2007
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to result type

Article name in the collection
Gramatika a korpus 2005
ISBN
80-86496-32-5
ISSN
—
e-ISSN
—
Number of pages
7
Pages from-to
243-249
Publisher name
Ústav pro jazyk český AV ČR, v.v.i
Place of publication
Praha
Event location
Praha
Event date
23. 11. 2005
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—
