Data-driven Crosslinguistic Syntactic Transfer in Second Language Learning
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F22%3A2GD7DR33" target="_blank" >RIV/00216208:11320/22:2GD7DR33 - isvavai.cz</a>
Result on the web
<a href="https://escholarship.org/uc/item/86j2x3t2" target="_blank" >https://escholarship.org/uc/item/86j2x3t2</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Data-driven Crosslinguistic Syntactic Transfer in Second Language Learning
Original language description
Second-language (L2) learning is characterized by both positive and negative transfer from the first language (L1). However, psycholinguistic studies focus on a few syntactic phenomena and L1-L2 pairs at a time, resulting in an incomplete picture. We apply machine learning to seven learner corpora in English and Spanish with 39 language pairs, showing that statistical models combined with simple n-grams of part-of-speech tags and syntactic dependency relations achieve good performance in recovering the L1, indicating structural transfer from L1 to L2. Further machine learning using a rich hand-curated linguistic feature set allowed us to identify aspects of L2 linguistic structure particularly influenced by L1 (verbal morphology, average dependency tree parse depth, and headedness of clausal structures) as well as those with minimal influence (distributions of dependency relations, basic word orders, or non-projective dependencies).
Czech name
—
Czech description
—
Classification
Type
O - Miscellaneous
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2022
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations