Interactive Analysis and Visualisation of Annotated Collocations in Spanish (AVAnCES)

The result's identifiers

  • Result code in IS VaVaI

    RIV/00216208:11320/22:VURQ3TMA - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F22%3AVURQ3TMA)

  • Result on the web

    https://aclanthology.org/2022.nlp4dh-1.4

  • DOI - Digital Object Identifier

Alternative languages

  • Result language

    English

  • Original language name

    Interactive Analysis and Visualisation of Annotated Collocations in Spanish (AVAnCES)

  • Original language description

    Phraseology studies have been enhanced by Corpus Linguistics, which has become an interdisciplinary field where current technologies play an important role in its development. Computational tools have been implemented in the last decades with positive results in the identification of phrases in different languages. One specific technology that has impacted these studies is social media. As researchers, we have turned our attention to collecting data from these platforms, which comes with great advantages and its own challenges. One of the challenges is the way we design and build corpora relevant to the questions emerging in this type of language expression. This has been approached from different angles, but one that has given invaluable outputs is the building of linguistic corpora with the use of online web applications. In this paper, we take a multidimensional approach to the collection, design, and deployment of a phraseology corpus for Latin American Spanish from Twitter data, extracting features using NLP techniques, and presenting it in an interactive online web application. We expect to contribute to the methodologies used for Corpus Linguistics in the current technological age. Finally, we make this tool publicly available to be used by any researcher interested in the data itself and in the technological tools developed here.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

  • Continuities

Others

  • Publication year

    2022

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    Proceedings of the 2nd International Workshop on Natural Language Processing for Digital Humanities

  • ISBN

    978-1-955917-75-9

  • ISSN

  • e-ISSN

  • Number of pages

    10

  • Pages from-to

    21-30

  • Publisher name

    Association for Computational Linguistics

  • Place of publication

  • Event location

    Taipei, Taiwan

  • Event date

    Jan 1, 2022

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article