Automatic Identification of Learners' Language Background based on their Writing in Czech
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F13%3A10194613" target="_blank" >RIV/00216208:11320/13:10194613 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Automatic Identification of Learners' Language Background based on their Writing in Czech
Original language description
The goal of this study is to investigate whether learners' written data in highly inflectional Czech can suggest a consistent set of clues for automatic identification of the learners' L1 background. For our experiments, we use texts written by learners of Czech, which have been automatically and manually annotated for errors. We define two classes of learners: speakers of Indo-European languages and speakers of non-Indo-European languages. We use an SVM classifier to perform the binary classification. We show that non-content based features perform well on highly inflectional data. In particular, features reflecting errors in orthography are the most useful, yielding about 89% precision and the same recall. A detailed discussion of the best performing features is provided.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/GPP406%2F10%2FP328" target="_blank" >GPP406/10/P328: Resource-light Morphological Analysis and Tagging</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2013
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 6th International Joint Conference on Natural Language Processing
ISBN
978-4-9907348-0-0
ISSN
—
e-ISSN
—
Number of pages
9
Pages from-to
1428-1436
Publisher name
Asian Federation of Natural Language Processing
Place of publication
Nagoya, Japan
Event location
Nagoya, Japan
Event date
Oct 14, 2013
Type of event by nationality
CST - National event
UT code for WoS article
—