Multi-view fusion for universal translation quality estimation
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3APEVAZN2Q" target="_blank" >RIV/00216208:11320/23:PEVAZN2Q - isvavai.cz</a>
Result on the web
<a href="https://www.webofscience.com/wos/woscc/summary/e0b8ef34-8e6b-412a-9b8f-87607433ed44-bb92f483/relevance/1" target="_blank" >https://www.webofscience.com/wos/woscc/summary/e0b8ef34-8e6b-412a-9b8f-87607433ed44-bb92f483/relevance/1</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.inffus.2023.102022" target="_blank" >10.1016/j.inffus.2023.102022</a>
Alternative languages
Result language
English
Title in original language
Multi-view fusion for universal translation quality estimation
Result description in original language
"Machine translation quality estimation (QE) aims to evaluate translation output without a reference. Despite recent progress, state-of-the-art QE models have been shown to be biased. More specifically, they over-rely on spurious statistical features while ignoring bilingual semantic adequacy, leading to performance degradation. Besides, existing approaches require large amounts of annotated data, restricting their application to new domains and languages. In this work, we propose a universal framework for quality estimation based on multi-view fusion. We first introduce noise to the target side of the parallel sentence pair, using either a pre-trained language model or a large language model. After that, with the clean parallel pairs and the noised pairs as different views, the QE model is trained to distinguish the clean pairs from the noised ones. Our method can improve accuracy and generalizability in the supervised scenario, and can perform estimation on its own in the zero-shot scenario. We perform experiments on WMT QE evaluation datasets under different scenarios, verifying the effectiveness of our method. We also conduct an in-depth investigation of the bias of QE models."
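The view construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: random token replacement stands in for the language-model noise, the margin ranking loss is only one possible way to train a model to separate clean from noised pairs, and the names `noise_target`, `build_views`, and `margin_ranking_loss` are hypothetical.

```python
import random


def noise_target(tokens, vocab, rate=0.3, seed=0):
    """Replace roughly `rate` of the target tokens with other vocabulary items.

    Assumption: random replacement is a stand-in for the PLM/LLM noise
    mentioned in the abstract, which is not specified in this record.
    """
    rng = random.Random(seed)
    n_replace = max(1, round(rate * len(tokens)))
    positions = rng.sample(range(len(tokens)), n_replace)
    noised = list(tokens)
    for pos in positions:
        # Never pick the original token, so each chosen position really changes.
        candidates = [w for w in vocab if w != tokens[pos]]
        noised[pos] = rng.choice(candidates)
    return noised


def build_views(source, target, vocab, n_noised=2):
    """Return the clean pair (label 1) followed by noised pairs (label 0)."""
    views = [(source, target, 1)]
    for k in range(n_noised):
        views.append((source, noise_target(target, vocab, seed=k), 0))
    return views


def margin_ranking_loss(clean_score, noised_score, margin=0.1):
    """Hinge loss: zero once the clean pair outscores the noised pair by `margin`.

    Assumption: the record does not state the training objective.
    """
    return max(0.0, margin - (clean_score - noised_score))


source = "das ist ein kleiner Test".split()
target = "this is a small test".split()
vocab = ["this", "is", "a", "small", "test", "big", "cat", "not"]

views = build_views(source, target, vocab)
for src, tgt, label in views:
    print(label, " ".join(tgt))
```

In a real setup the scores passed to `margin_ranking_loss` would come from a QE model applied to each view; here the function is only meant to show the direction of the training signal.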
Classification
Type
J<sub>ost</sub> - Other articles in peer-reviewed periodicals
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2023
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
"INFORMATION FUSION"
ISSN
1566-2535
e-ISSN
—
Volume of the periodical
102
Issue of the periodical within the volume
2024-2
Country of the publisher
US - United States of America
Number of pages
9
Pages from-to
1-9
UT WoS code of the article
001083713100001
EID of the result in the Scopus database
—