Data-driven Crosslinguistic Syntactic Transfer in Second Language Learning
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F22%3A2GD7DR33" target="_blank" >RIV/00216208:11320/22:2GD7DR33 - isvavai.cz</a>
Result on the web
<a href="https://escholarship.org/uc/item/86j2x3t2" target="_blank" >https://escholarship.org/uc/item/86j2x3t2</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in the original language
Data-driven Crosslinguistic Syntactic Transfer in Second Language Learning
Result description in the original language
Second-language (L2) learning is characterized by both positive and negative transfer from the first language (L1). However, psycholinguistic studies focus on a few syntactic phenomena and L1-L2 pairs at a time, resulting in an incomplete picture. We apply machine learning to seven learner corpora in English and Spanish with 39 language pairs, showing that statistical models combined with simple $n$-grams of part-of-speech tags and syntactic dependency relations achieve good performance in recovering the L1, indicating structural transfer from L1 to L2. Further machine learning using a rich hand-curated linguistic feature set allowed us to identify aspects of L2 linguistic structure particularly influenced by L1 (verbal morphology, average dependency tree parse depth, and headedness of clausal structures) as well as those with minimal influence (distributions of dependency relations, basic word orders, or non-projective dependencies).
Title in English
Data-driven Crosslinguistic Syntactic Transfer in Second Language Learning
Result description in English
Second-language (L2) learning is characterized by both positive and negative transfer from the first language (L1). However, psycholinguistic studies focus on a few syntactic phenomena and L1-L2 pairs at a time, resulting in an incomplete picture. We apply machine learning to seven learner corpora in English and Spanish with 39 language pairs, showing that statistical models combined with simple $n$-grams of part-of-speech tags and syntactic dependency relations achieve good performance in recovering the L1, indicating structural transfer from L1 to L2. Further machine learning using a rich hand-curated linguistic feature set allowed us to identify aspects of L2 linguistic structure particularly influenced by L1 (verbal morphology, average dependency tree parse depth, and headedness of clausal structures) as well as those with minimal influence (distributions of dependency relations, basic word orders, or non-projective dependencies).
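The description mentions simple $n$-grams of part-of-speech tags as input features for recovering the L1. As a minimal sketch of what such features might look like, the Python below counts tag $n$-grams over the sentences of one learner text. The function names `tag_ngrams` and `document_features`, the `max_n` parameter, and the toy tag sequences are assumptions made for illustration; the record does not specify the authors' feature extraction or which statistical models were used.

```python
from collections import Counter
from typing import Iterable, Sequence


def tag_ngrams(tags: Sequence[str], n: int) -> Counter:
    """Count contiguous n-grams in one sentence's tag sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sentence shorter than n yields an empty range, hence no n-grams.
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


def document_features(sentences: Iterable[Sequence[str]], max_n: int = 3) -> Counter:
    """Sum the counts of all 1..max_n-grams over the sentences of one text."""
    features: Counter = Counter()
    for tags in sentences:
        for n in range(1, max_n + 1):
            features.update(tag_ngrams(tags, n))
    return features


# Toy, invented tag sequences (hypothetical example, not data from the paper).
toy_text = [
    ["PRON", "VERB", "DET", "NOUN"],
    ["DET", "NOUN", "VERB", "ADJ"],
]
features = document_features(toy_text, max_n=2)
```

Counts of this kind, computed per learner text, could then be passed to any standard classifier with the L1 as the label. The same functions would apply unchanged to sequences of dependency relation labels.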
Classification
Type
O - Other results
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
—
Other
Year of implementation
2022
Data confidentiality code
S - Complete and truthful project data are not subject to protection under special legal regulations