Analyzing learner language: the case of the Hebrew Learner Essay Corpus
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3AIPSEUEWA" target="_blank" >RIV/00216208:11320/25:IPSEUEWA - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85192898478&doi=10.1007%2fs10579-023-09712-w&partnerID=40&md5=de50d67246e8f7d50c3bb9d02e08e42f" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85192898478&doi=10.1007%2fs10579-023-09712-w&partnerID=40&md5=de50d67246e8f7d50c3bb9d02e08e42f</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10579-023-09712-w" target="_blank" >10.1007/s10579-023-09712-w</a>
Alternative languages
Result language
English
Original language name
Analyzing learner language: the case of the Hebrew Learner Essay Corpus
Original language description
We present the Hebrew Learner Essay Corpus (HELEECS): an annotated corpus of Hebrew language argumentative essays authored by prospective higher-education students. The corpus includes essays by two main populations: (1) essays by native speakers of Hebrew, written as part of the psychometric exam that is used to assess their future success in academic studies; (2) essays by non-native speakers of Hebrew, with three different native languages (Arabic, French, and Russian), that were written as part of a language aptitude test. The corpus is uniformly encoded and stored. The non-native essays were annotated with target hypotheses (i.e., hypothesized intended formulations in standard written Hebrew). The corpus is available for research purposes upon request. We describe the corpus and the error correction and annotation schemes used in its analysis. In addition to introducing this new resource, we discuss the challenges of identifying and analyzing non-native language use. Among these challenges are determining whether the language used in a particular utterance is native-like, and determining the target hypothesis when language use is non-native-like. We propose various ways for dealing with these challenges. © The Author(s) 2024.
Czech name
—
Czech description
—
Classification
Type
J<sub>SC</sub> - Article in a specialist periodical, which is included in the SCOPUS database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2024
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Language Resources and Evaluation
ISSN
1574-020X
e-ISSN
—
Volume of the periodical
2024
Issue of the periodical within the volume
2024
Country of publishing house
US - UNITED STATES
Number of pages
42
Pages from-to
1-42
UT code for WoS article
—
EID of the result in the Scopus database
2-s2.0-85192898478