Compiling and annotating a learner corpus for a morphologically rich language - CzeSL, a corpus of non-native Czech
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F20%3A10419686" target="_blank" >RIV/00216208:11210/20:10419686 - isvavai.cz</a>
Nalezeny alternativní kódy
RIV/00216208:11320/20:10419686
Výsledek na webu
<a href="https://dspace.cuni.cz/handle/20.500.11956/123103" target="_blank" >https://dspace.cuni.cz/handle/20.500.11956/123103</a>
DOI - Digital Object Identifier
—
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Compiling and annotating a learner corpus for a morphologically rich language - CzeSL, a corpus of non-native Czech
Popis výsledku v původním jazyce
Learner corpora, linguistic collections documenting a language as used by learners, provide an important empirical foundation for language acquisition research and teaching practice. This book presents CzeSL, a corpus of non-native Czech, against the background of theoretical and practical issues in the current learner corpus research. Languages with rich morphology and relatively free word order, including Czech, are particularly challenging for the analysis of learner language. The authors address both the complexity of learner error annotation, describing three complementary annotation schemes, and the complexity of description of non-native Czech in terms of standard linguistic categories. The book discusses in detail practical aspects of the corpus creation: the process of collection and annotation itself, the supporting tools, the resulting data, their formats and search platforms. The chapter on use cases exemplifies the usefulness of learner corpora for teaching, language acquisition research, and computational linguistics. Any researcher developing learner corpora will surely appreciate the concluding chapter listing lessons learned and pitfalls to avoid.
Název v anglickém jazyce
Compiling and annotating a learner corpus for a morphologically rich language - CzeSL, a corpus of non-native Czech
Popis výsledku anglicky
Learner corpora, linguistic collections documenting a language as used by learners, provide an important empirical foundation for language acquisition research and teaching practice. This book presents CzeSL, a corpus of non-native Czech, against the background of theoretical and practical issues in the current learner corpus research. Languages with rich morphology and relatively free word order, including Czech, are particularly challenging for the analysis of learner language. The authors address both the complexity of learner error annotation, describing three complementary annotation schemes, and the complexity of description of non-native Czech in terms of standard linguistic categories. The book discusses in detail practical aspects of the corpus creation: the process of collection and annotation itself, the supporting tools, the resulting data, their formats and search platforms. The chapter on use cases exemplifies the usefulness of learner corpora for teaching, language acquisition research, and computational linguistics. Any researcher developing learner corpora will surely appreciate the concluding chapter listing lessons learned and pitfalls to avoid.
Klasifikace
Druh
B - Odborná kniha
CEP obor
—
OECD FORD obor
60203 - Linguistics
Návaznosti výsledku
Projekt
Výsledek vznikl pri realizaci vícero projektů. Více informací v záložce Projekty.
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)
Ostatní
Rok uplatnění
2020
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
ISBN
978-80-246-4759-3
Počet stran knihy
281
Název nakladatele
Karolinum
Místo vydání
Praha
Kód UT WoS knihy
—