Automatic analysis of caregiver input and child production: Insight into corpus-based research on child language development in Korean
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989592%3A15210%2F22%3A73614145" target="_blank" >RIV/61989592:15210/22:73614145 - isvavai.cz</a>
Result on the web
<a href="https://benjamins.com/catalog/kl.20002.shi" target="_blank" >https://benjamins.com/catalog/kl.20002.shi</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1075/kl.20002.shi" target="_blank" >10.1075/kl.20002.shi</a>
Alternative languages
Result language
English
Title in original language
Automatic analysis of caregiver input and child production: Insight into corpus-based research on child language development in Korean
Result description in original language
The present study explores the applicability of Natural Language Processing (NLP) techniques to the investigation of child corpora in Korean. We use caregiver input and child production data from the CHILDES database, currently the largest open-access Korean child corpus, and apply NLP techniques to the data in two ways: automatic Part-of-Speech tagging by adapting a machine learning algorithm, and (semi-)automatic extraction of constructional patterns expressing a transitive event (active transitive and suffixal passive). As the first empirical report on NLP-assisted analysis of Korean child corpora, this study is expected to reveal the advantages and drawbacks of the approach, thereby opening the way to further corpus-mediated research on child language development in Korean. The findings should also inform research practice in developmental studies of Korean based on child corpora, helping to ensure the reproducibility of procedures and results, which has often been lacking in previous corpus-based research on child language development in Korean.
Classification
Type
J<sub>ost</sub> - Other articles in peer-reviewed periodicals
CEP field
—
OECD FORD field
60203 - Linguistics
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organization
Others
Publication year
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
Korean Linguistics
ISSN
0257-3784
e-ISSN
2212-9731
Periodical volume
18
Issue of the periodical within the volume
2
Country of the periodical's publisher
KR - Republic of Korea
Number of pages
34
Pages from-to
125–158
UT WoS code of the article
000871382900002
EID of the result in the Scopus database
—