Analysis of Edit Operations for Post-editing Systems
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216275%3A25410%2F21%3A39922775" target="_blank" >RIV/00216275:25410/21:39922775 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/article/10.1007/s44196-021-00048-3" target="_blank" >https://link.springer.com/article/10.1007/s44196-021-00048-3</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s44196-021-00048-3" target="_blank" >10.1007/s44196-021-00048-3</a>
Alternative languages
Result language
English
Title in original language
Analysis of Edit Operations for Post-editing Systems
Result description in original language
Post-editing has become an important part not only of translation research but also of the global translation industry. While computer-aided translation tools, such as translation memories, are considered to be part of a translator's work, machine translation (MT) systems have lately also been accepted by human translators. However, many human translators are still adopting the changes brought by translation technologies to the translation industry. This paper introduces a novel approach for seeking suitable pairs of n-grams when recommending n-grams (corresponding n-grams between MT and post-edited MT) based on the type of text (manual or administrative) and the MT system used for machine translation. A tool that recommends and speeds up the correction of MT was developed to help post-editors with their work. It is based on the analysis of words with the same lemmas and the analysis of n-gram recommendations. These recommendations are extracted from sequence patterns of the mismatched words (MisMatch) between MT output and post-edited MT output. The paper aims to show the use of morphological analysis for recommending post-edit operations. It describes the use of mismatched words in the n-gram recommendations for the post-edited MT output. The contribution consists of the methodology for seeking suitable pairs of words and n-grams and, additionally, of the importance of taking into account metadata (the type of text and/or style and the MT system) when recommending post-edit operations.
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
<a href="/cs/project/GA19-15498S" target="_blank" >GA19-15498S: Modelling emotions in verbal and nonverbal managerial communication to predict corporate financial risks</a><br>
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Publication year
2021
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
International Journal of Computational Intelligence Systems
ISSN
1875-6891
e-ISSN
1875-6883
Volume of the periodical
14
Issue of the periodical within the volume
1
Country of the periodical's publisher
FR - French Republic
Number of pages of the result
12
Pages from-to
197
UT WoS code of the article
000778406100001
EID of the result in the Scopus database
2-s2.0-85119997372