Statistical-based system combination approach to gain advantages over different machine translation systems

Identifikátory výsledku

Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F19%3A10242519" target="_blank" >RIV/61989100:27240/19:10242519 - isvavai.cz</a>
Výsledek na webu
<a href="https://www.cell.com/heliyon/fulltext/S2405-8440(19)36164-X#%20" target="_blank" >https://www.cell.com/heliyon/fulltext/S2405-8440(19)36164-X#%20</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.heliyon.2019.e02504" target="_blank" >10.1016/j.heliyon.2019.e02504</a>

Alternativní jazyky

Jazyk výsledku
angličtina
Název v původním jazyce
Statistical-based system combination approach to gain advantages over different machine translation systems
Popis výsledku v původním jazyce
Every machine translation system has some advantages. We propose an improved statistical system combination approach to achieve the advantages of existing machine translation systems. The primary task is to score all the phrases of the outputs of different machine translation systems selected for combination. Three steps are involved in the proposed statistical system combination approach, viz., alignment, decoding, and scoring. Pair alignment is done in the first step to prevent duplication so that only a single phrase is chosen from various phrases containing the same information. Thus the alignment and scoring strategy are implemented in our approach. Hypotheses are built in the second step. In the third step, we calculate the scores for all the hypotheses. The hypothesis with the highest score is chosen as the final translated output. Wrong scoring can mislead to identify the best part from different systems. It may be noted that a particular phrase may appear in various ways in different translations. To resolve the challenges, we incorporate WordNet in the alignment phase and word2vec in the scoring phase along with the existing factors. We find that the system combination model using WordNet and word2vec injection improves the machine translation accuracy. In this work, we have merged three systems viz., Hierarchical machine translation system, Bing Microsoft Translate, and Google Translate. The broad tests of translation on eight language pairs with benchmark datasets demonstrate that the proposed system achieves better quality than the individual systems and the state-of-the-art system combination models. (C) 2019 The Authors
Název v anglickém jazyce
Statistical-based system combination approach to gain advantages over different machine translation systems
Popis výsledku anglicky
Every machine translation system has some advantages. We propose an improved statistical system combination approach to achieve the advantages of existing machine translation systems. The primary task is to score all the phrases of the outputs of different machine translation systems selected for combination. Three steps are involved in the proposed statistical system combination approach, viz., alignment, decoding, and scoring. Pair alignment is done in the first step to prevent duplication so that only a single phrase is chosen from various phrases containing the same information. Thus the alignment and scoring strategy are implemented in our approach. Hypotheses are built in the second step. In the third step, we calculate the scores for all the hypotheses. The hypothesis with the highest score is chosen as the final translated output. Wrong scoring can mislead to identify the best part from different systems. It may be noted that a particular phrase may appear in various ways in different translations. To resolve the challenges, we incorporate WordNet in the alignment phase and word2vec in the scoring phase along with the existing factors. We find that the system combination model using WordNet and word2vec injection improves the machine translation accuracy. In this work, we have merged three systems viz., Hierarchical machine translation system, Bing Microsoft Translate, and Google Translate. The broad tests of translation on eight language pairs with benchmark datasets demonstrate that the proposed system achieves better quality than the individual systems and the state-of-the-art system combination models. (C) 2019 The Authors

Klasifikace

Druh
J<sub>SC</sub> - Článek v periodiku v databázi SCOPUS
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Návaznosti výsledku

Projekt
—
Návaznosti
S - Specificky vyzkum na vysokych skolach

Ostatní

Rok uplatnění
2019
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

Název periodika
Heliyon
ISSN
2405-8440
e-ISSN
—
Svazek periodika
5
Číslo periodika v rámci svazku
9
Stát vydavatele periodika
US - Spojené státy americké
Počet stran výsledku
9
Strana od-do
—
Kód UT WoS článku
—
EID výsledku v databázi Scopus
2-s2.0-85072698660

Podobné výsledky(10)

Fuzzy Influenced Process to Generate Comparable to Parallel Corpora Evaluation of English-Slovak neural and statistical machine translation Using Tectogrammatical Alignment in Phrase-Based Machine Translation

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Statistical-based system combination approach to gain advantages over different machine translation systems

Identifikátory výsledku

Alternativní jazyky

Klasifikace

Návaznosti výsledku

Ostatní

Údaje specifické pro druh výsledku

Podobné výsledky(10)

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Popis výsledku

Identifikátory výsledku

Identifikátory výsledku

Alternativní jazyky

Alternativní jazyky

Klasifikace

Klasifikace

Návaznosti výsledku

Návaznosti výsledku

Ostatní

Ostatní

Údaje specifické pro druh výsledku

Údaje specifické pro druh výsledku

Podobné výsledky(10)