Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/22:10456915 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F22%3A10456915)
Result on the web
https://aclanthology.org/2022.lrec-1.694/
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation
Original language description
Multi-modal Machine Translation (MMT) enables the use of visual information to enhance the quality of translations. The visual information can serve as a valuable piece of context information to decrease the ambiguity of input sentences. Despite the increasing popularity of such a technique, good and sizeable datasets are scarce, limiting the full extent of their potential. Hausa, a Chadic language, is a member of the Afro-Asiatic language family. It is estimated that about 100 to 150 million people speak the language, with more than 80 million indigenous speakers. This is more than any of the other Chadic languages. Despite a large number of speakers, the Hausa language is considered low-resource in natural language processing (NLP). This is due to the absence of sufficient resources to implement most NLP tasks. While some datasets exist, they are either scarce, machine-generated, or in the religious domain. Therefore, there is a need to create training and evaluation data for implementing machine le
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
Result was created during the realization of more than one project.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2022
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022)
ISBN
979-10-95546-72-6
ISSN
—
e-ISSN
—
Number of pages
9
Pages from-to
6471-6479
Publisher name
European Language Resources Association
Place of publication
Marseille, France
Event location
Marseille, France
Event date
Jun 20, 2022
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—