Analyzing learner language: the case of the Hebrew Learner Essay Corpus
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3AIPSEUEWA" target="_blank" >RIV/00216208:11320/25:IPSEUEWA - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85192898478&doi=10.1007%2fs10579-023-09712-w&partnerID=40&md5=de50d67246e8f7d50c3bb9d02e08e42f" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85192898478&doi=10.1007%2fs10579-023-09712-w&partnerID=40&md5=de50d67246e8f7d50c3bb9d02e08e42f</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10579-023-09712-w" target="_blank" >10.1007/s10579-023-09712-w</a>
Alternative languages
Result language
English
Title in original language
Analyzing learner language: the case of the Hebrew Learner Essay Corpus
Result description in original language
We present the Hebrew Learner Essay Corpus (HELEECS): an annotated corpus of Hebrew language argumentative essays authored by prospective higher-education students. The corpus includes essays by two main populations: (1) essays by native speakers of Hebrew, written as part of the psychometric exam that is used to assess their future success in academic studies; (2) essays by non-native speakers of Hebrew, with three different native languages (Arabic, French, and Russian), that were written as part of a language aptitude test. The corpus is uniformly encoded and stored. The non-native essays were annotated with target hypotheses (i.e., hypothesized intended formulations in standard written Hebrew). The corpus is available for research purposes upon request. We describe the corpus and the error correction and annotation schemes used in its analysis. In addition to introducing this new resource, we discuss the challenges of identifying and analyzing non-native language use. Among these challenges are determining whether the language used in a particular utterance is native-like, and determining the target hypothesis when language use is non-native-like. We propose various ways for dealing with these challenges. © The Author(s) 2024.
Title in English
Analyzing learner language: the case of the Hebrew Learner Essay Corpus
Result description in English
We present the Hebrew Learner Essay Corpus (HELEECS): an annotated corpus of Hebrew language argumentative essays authored by prospective higher-education students. The corpus includes essays by two main populations: (1) essays by native speakers of Hebrew, written as part of the psychometric exam that is used to assess their future success in academic studies; (2) essays by non-native speakers of Hebrew, with three different native languages (Arabic, French, and Russian), that were written as part of a language aptitude test. The corpus is uniformly encoded and stored. The non-native essays were annotated with target hypotheses (i.e., hypothesized intended formulations in standard written Hebrew). The corpus is available for research purposes upon request. We describe the corpus and the error correction and annotation schemes used in its analysis. In addition to introducing this new resource, we discuss the challenges of identifying and analyzing non-native language use. Among these challenges are determining whether the language used in a particular utterance is native-like, and determining the target hypothesis when language use is non-native-like. We propose various ways for dealing with these challenges. © The Author(s) 2024.
Classification
Type
J<sub>SC</sub> - Article in a periodical indexed in the SCOPUS database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
—
Other
Year of implementation
2024
Data confidentiality code
S - Complete and truthful project data are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
Language Resources and Evaluation
ISSN
1574-020X
e-ISSN
—
Periodical volume
2024
Issue number within the volume
2024
Country of the periodical's publisher
US - United States of America
Number of pages
42
Pages from-to
1-42
UT WoS code of the article
—
EID of the result in the Scopus database
2-s2.0-85192898478