Evaluating and automating the annotation of a learner corpus

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F13%3A10194812" target="_blank" >RIV/00216208:11320/13:10194812 - isvavai.cz</a>
Alternative codes found
RIV/00216208:11210/13:10194812 RIV/46747885:24510/13:#0001083
Result on the web
<a href="http://dx.doi.org/10.1007/s10579-013-9226-3" target="_blank" >http://dx.doi.org/10.1007/s10579-013-9226-3</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10579-013-9226-3" target="_blank" >10.1007/s10579-013-9226-3</a>

Alternative languages

Result language
English
Title in the original language
Evaluating and automating the annotation of a learner corpus
Result description in the original language
The paper describes a corpus of texts produced by non-native speakers of Czech. We discuss its annotation scheme, consisting of three interlinked tiers, designed to handle a wide range of error types present in the input. Each tier corrects different types of errors; links between the tiers allow capturing errors in word order and complex discontinuous expressions. Errors are not only corrected, but also classified. The annotation scheme is tested on a data set including approx. 175,000 words with fair inter-annotator agreement results. We also explore the possibility of applying automated linguistic annotation tools (taggers, spell checkers and grammar checkers) to the learner text to support or even substitute manual annotation.
Title in English
Evaluating and automating the annotation of a learner corpus
Result description in English
The paper describes a corpus of texts produced by non-native speakers of Czech. We discuss its annotation scheme, consisting of three interlinked tiers, designed to handle a wide range of error types present in the input. Each tier corrects different types of errors; links between the tiers allow capturing errors in word order and complex discontinuous expressions. Errors are not only corrected, but also classified. The annotation scheme is tested on a data set including approx. 175,000 words with fair inter-annotator agreement results. We also explore the possibility of applying automated linguistic annotation tools (taggers, spell checkers and grammar checkers) to the learner text to support or even substitute manual annotation.

Classification

Type
J<sub>x</sub> - Unclassified - Article in a specialist periodical (Jimp, Jsc and Jost)
CEP field
AI - Linguistics
OECD FORD field
—

Result continuities

Project
<a href="/cs/project/GPP406%2F10%2FP328" target="_blank" >GPP406/10/P328: Morphological analysis and tagging with minimal resources</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)

Other

Publication year
2013
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Periodical name
Language Resources and Evaluation
ISSN
1574-020X
e-ISSN
—
Periodical volume
47
Issue number within the volume
1
Publisher's country
NL - Netherlands
Number of pages
2
Pages from-to
1-2
UT WoS article code
—
EID of the result in the Scopus database
—

Similar results (10)

Building a learner corpus
CzeSL - an error tagged corpus of Czech as a second language
Building a learner corpus
