Towards a Romanian Phrasal Academic Lexicon

The result's identifiers

  • Result code in IS VaVaI

    RIV/00216208:11320/25:NDEVCKX9 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3ANDEVCKX9)

  • Result on the web

    https://aclanthology.org/2024.clib-1.10

  • DOI - Digital Object Identifier

Alternative languages

  • Result language

    English

  • Original language name

    Towards a Romanian Phrasal Academic Lexicon

  • Original language description

    The lack of NLP-based research studies on academic writing in Romania results in an unbalanced development of automatic support tools for Romanian compared to other languages, such as English. For this study, we use the Romanian subsets of two bilingual academic writing corpora: the ROGER corpus, consisting of university student papers, and the EXPRES corpus, composed of expert research articles. Working with the Romanian Academic Word List (Ro-AWL), we present two phrase extraction phases: (i) using Ro-AWL words as node words to extract collocations according to the thresholds of statistical measures and (ii) classifying the extracted phrases into general versus domain-specific multi-word units. We show how manual rhetorical function annotation of the resulting phrases can be combined with automatic function detection. The comparison between academic phrases in ROGER and EXPRES validates the final phrase list. The Romanian Phrasal Academic Lexicon (ROPAL), similar to the Oxford Phrasal Academic Lexicon (OPAL), is a written academic phrase lexicon for the Romanian language, made available for academic use and for further research or applications.

  • Czech name

  • Czech description
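
Illustration of phase (i) from the original language description: the record does not say which statistical measures or thresholds the authors used, so the following is only a minimal sketch under stated assumptions. It assumes pointwise mutual information (PMI) over adjacent word pairs, plus a minimum frequency cut-off. The function name `extract_collocations`, the parameters `min_freq` and `min_pmi`, and their default values are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def extract_collocations(sentences, node_words, min_freq=2, min_pmi=1.0):
    """Return (bigram, frequency, PMI) for adjacent pairs containing a node word.

    sentences  -- list of tokenised sentences (lists of strings)
    node_words -- set of node words (e.g. entries of an academic word list)
    min_freq, min_pmi -- illustrative thresholds, not the paper's values
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_tokens = sum(unigrams.values())
    n_bigrams = sum(bigrams.values())

    results = []
    for (w1, w2), freq in bigrams.items():
        # Keep only pairs anchored on at least one node word.
        if w1 not in node_words and w2 not in node_words:
            continue
        # Frequency threshold.
        if freq < min_freq:
            continue
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) )
        p_pair = freq / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        pmi = math.log2(p_pair / (p_w1 * p_w2))
        # Association threshold.
        if pmi >= min_pmi:
            results.append(((w1, w2), freq, pmi))

    # Strongest associations first.
    return sorted(results, key=lambda item: item[2], reverse=True)


# Toy corpus, invented for illustration only.
toy_sentences = [
    ["în", "acest", "studiu", "analizăm", "datele"],
    ["în", "acest", "studiu", "prezentăm", "rezultatele"],
    ["rezultatele", "arată", "datele"],
]
toy_candidates = extract_collocations(toy_sentences, {"studiu"})
```

On this toy corpus, only the pair ("acest", "studiu") passes both thresholds. A real pipeline would likely use lemmatised text, wider collocation windows, and several association measures. Phase (ii), the split into general versus domain-specific units, is not covered here.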

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

  • Continuities

Others

  • Publication year

    2024

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    Proceedings of the Sixth International Conference on Computational Linguistics in Bulgaria (CLIB 2024)

  • ISBN

  • ISSN

    2367-5578

  • e-ISSN

  • Number of pages

    7

  • Pages from-to

    106-112

  • Publisher name

    Department of Computational Linguistics, Institute for Bulgarian Language, Bulgarian Academy of Sciences

  • Place of publication

  • Event location

    Sofia, Bulgaria

  • Event date

    Jan 1, 2025

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article