Quality and Quantity of Machine Translation References for Automatic Metrics
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/24:10492921 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F24%3A10492921)
Result on the web
https://aclanthology.org/2024.humeval-1.1/
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Quality and Quantity of Machine Translation References for Automatic Metrics
Original language description
Automatic machine translation metrics typically rely on human translations to determine the quality of system translations. Common wisdom in the field dictates that the human references should be of very high quality. However, there are no cost-benefit analyses that could be used to guide practitioners who plan to collect references for machine translation evaluation. We find that higher-quality references lead to better metric correlations with humans at the segment level. Having up to 7 references per segment and taking their average (or maximum) helps all metrics. Interestingly, references from vendors of different qualities can be mixed together and improve metric success. Higher-quality references, however, cost more to create, and we frame this as an optimization problem: given a specific budget, what references should be collected to maximize metric success? These findings can be used by evaluators of shared tasks when references need to be created under a certain budget.
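The description above mentions scoring a system translation against several references and taking the average (or maximum). A minimal sketch of that aggregation step follows. It assumes a toy unigram-overlap F1 as a stand-in metric; the function names `unigram_f1` and `multi_reference_score` are hypothetical and do not come from the paper, which evaluates established automatic metrics.

```python
from collections import Counter
from statistics import mean
from typing import Callable, Sequence


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy stand-in metric (an assumption, not a metric from the paper):
    F1 over lowercased whitespace-token overlap."""
    hyp = Counter(hypothesis.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((hyp & ref).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def multi_reference_score(
    hypothesis: str,
    references: Sequence[str],
    metric: Callable[[str, str], float] = unigram_f1,
    mode: str = "avg",
) -> float:
    """Score one segment against every reference, then aggregate.

    mode="avg" takes the mean of the per-reference scores;
    mode="max" takes the best-matching reference.
    """
    if not references:
        raise ValueError("at least one reference is required")
    scores = [metric(hypothesis, ref) for ref in references]
    if mode == "avg":
        return mean(scores)
    if mode == "max":
        return max(scores)
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    hyp = "the cat sat"
    refs = ["the cat sat", "a dog ran"]
    print(multi_reference_score(hyp, refs, mode="avg"))
    print(multi_reference_score(hyp, refs, mode="max"))
```

Any segment-level metric with the same `(hypothesis, reference) -> float` signature could be passed as `metric`; the aggregation logic does not depend on the toy overlap score.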
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
The result was created during the realization of more than one project.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Fourth Workshop on Human Evaluation of NLP Systems (HumEval) @ LREC-COLING 2024
ISBN
978-2-493-81441-8
ISSN
—
e-ISSN
—
Number of pages
11
Pages from-to
1-11
Publisher name
ELRA
Place of publication
Paris, France
Event location
Torino, Italy
Event date
May 21, 2024
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—