Towards a Romanian Phrasal Academic Lexicon
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3ANDEVCKX9" target="_blank" >RIV/00216208:11320/25:NDEVCKX9 - isvavai.cz</a>
Result on the web
<a href="https://aclanthology.org/2024.clib-1.10" target="_blank" >https://aclanthology.org/2024.clib-1.10</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Towards a Romanian Phrasal Academic Lexicon
Original language description
The lack of NLP-based research on academic writing in Romania results in an unbalanced development of automatic support tools for Romanian compared to other languages, such as English. For this study, we use the Romanian subsets of two bilingual academic writing corpora: the ROGER corpus, consisting of university student papers, and the EXPRES corpus, composed of expert research articles. Working with the Romanian Academic Word List (Ro-AWL), we present two phrase extraction phases: (i) use Ro-AWL words as node words to extract collocations according to the thresholds of statistical measures and (ii) classify the extracted phrases into general versus domain-specific multi-word units. We show how manual rhetorical function annotation of the resulting phrases can be combined with automatic function detection. The comparison between academic phrases in ROGER and EXPRES validates the final phrase list. The Romanian Phrasal Academic Lexicon (ROPAL), similar to the Oxford Phrasal Academic Lexicon (OPAL), is a written academic phrase lexicon for the Romanian language, made available for academic use and for further research or applications.
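Phase (i) of the description above can be illustrated with a minimal sketch. It is not the authors' pipeline: the description does not name the statistical measures or their thresholds, so pointwise mutual information (PMI) over adjacent word pairs, the `min_freq` and `min_pmi` values, the function name `extract_collocations`, and the toy token list are all assumptions made for illustration.

```python
import math
from collections import Counter


def extract_collocations(tokens, node_words, min_freq=2, min_pmi=1.0):
    """Return (bigram, frequency, PMI) for adjacent pairs containing a node word.

    Assumption: PMI with hypothetical thresholds stands in for the
    unspecified "statistical measures" mentioned in the description.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    results = []
    for (w1, w2), freq in bigrams.items():
        # Frequency threshold: drop pairs that are too rare to be reliable.
        if freq < min_freq:
            continue
        # Keep only pairs in which at least one word is a node word.
        if w1 not in node_words and w2 not in node_words:
            continue
        p_pair = freq / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        pmi = math.log2(p_pair / (p_w1 * p_w2))
        # Association threshold: keep pairs that co-occur more than chance.
        if pmi >= min_pmi:
            results.append(((w1, w2), freq, pmi))

    # Strongest associations first.
    return sorted(results, key=lambda item: item[2], reverse=True)


# Toy input, not taken from ROGER or EXPRES.
tokens = "în concluzie rezultatele arată că în concluzie metoda este bună".split()
node_words = {"concluzie"}  # stands in for a Ro-AWL entry

for bigram, freq, pmi in extract_collocations(tokens, node_words):
    print(" ".join(bigram), freq, round(pmi, 2))
```

In a real setting the node words would come from Ro-AWL and the candidates would then pass to phase (ii), the general versus domain-specific classification, which this sketch does not cover.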
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the Sixth International Conference on Computational Linguistics in Bulgaria (CLIB 2024)
ISBN
—
ISSN
2367-5578
e-ISSN
—
Number of pages
7
Pages from-to
106-112
Publisher name
Department of Computational Linguistics, Institute for Bulgarian Language, Bulgarian Academy of Sciences
Place of publication
—
Event location
Sofia, Bulgaria
Event date
Jan 1, 2025
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—