Interactive Analysis and Visualisation of Annotated Collocations in Spanish (AVAnCES)
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F22%3AVURQ3TMA" target="_blank" >RIV/00216208:11320/22:VURQ3TMA - isvavai.cz</a>
Result on the web
<a href="https://aclanthology.org/2022.nlp4dh-1.4" target="_blank" >https://aclanthology.org/2022.nlp4dh-1.4</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Interactive Analysis and Visualisation of Annotated Collocations in Spanish (AVAnCES)
Original language description
Phraseology studies have been enhanced by Corpus Linguistics, which has become an interdisciplinary field in whose development current technologies play an important role. Computational tools have been implemented in recent decades with positive results in the identification of phrases in different languages. One specific technology that has impacted these studies is social media. As researchers, we have turned our attention to collecting data from these platforms, which brings great advantages as well as its own challenges. One of the challenges is the way we design and build corpora relevant to the questions emerging in this type of language expression. This has been approached from different angles, but one that has given invaluable outputs is the building of linguistic corpora with the use of online web applications. In this paper, we take a multidimensional approach to the collection, design, and deployment of a phraseology corpus for Latin American Spanish from Twitter data, extracting features using NLP techniques, and presenting it in an interactive online web application. We expect to contribute to the methodologies used for Corpus Linguistics in the current technological age. Finally, we make this tool publicly available to any researcher interested in the data itself or in the technological tools developed here.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2022
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 2nd International Workshop on Natural Language Processing for Digital Humanities
ISBN
978-1-955917-75-9
ISSN
—
e-ISSN
—
Number of pages
10
Pages from-to
21-30
Publisher name
Association for Computational Linguistics
Place of publication
—
Event location
Taipei, Taiwan
Event date
Jan 1, 2022
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—