All

What are you looking for?

All
Projects
Results
Organizations

Quick search

  • Projects supported by TA ČR
  • Excellent projects
  • Projects with the highest public support
  • Current projects

Smart search

  • That is how I find a specific +word
  • That is how I leave the -word out of the results
  • “That is how I can find the whole phrase”

The Role of Compounds in Human vs. Machine Translation Quality

The result's identifiers

  • Result code in IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3A10476127" target="_blank" >RIV/00216208:11320/23:10476127 - isvavai.cz</a>

  • Result on the web

  • DOI - Digital Object Identifier

Alternative languages

  • Result language

    angličtina

  • Original language name

    The Role of Compounds in Human vs. Machine Translation Quality

  • Original language description

    We focus on the production of German compounds in English-to-German manual and automatic translation. On the example of WMT21 news translation test set, we observe that even the best MT systems produce much fewer compounds compared to three independent manual translations. Despite this striking difference, we observe that this insufficiency is not apparent in manual evaluation methods that target the overall translation quality (DA and MQM). Simple automatic methods like BLEU somewhat surprisingly provide a better indication of this quality aspect. Our manual analysis of system outputs, including our freshly trained Transformer models, confirms that current deep neural systems operating at the level of subword units are capable of constructing novel words, including novel compounds. This effect however cannot be measured using static dictionaries of compounds such as GermaNet. German compounds thus pose an interesting challenge for future development of MT systems.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

    <a href="/en/project/GX19-26934X" target="_blank" >GX19-26934X: Neural Representations in Multi-modal and Multi-lingual Modeling</a><br>

  • Continuities

    P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)

Others

  • Publication year

    2023

  • Confidentiality

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Data specific for result type

  • Article name in the collection

    Proceedings of Machine Translation Summit XIX vol. 1: Research Track

  • ISBN

    978-4-9913461-0-1

  • ISSN

  • e-ISSN

  • Number of pages

    13

  • Pages from-to

    248-260

  • Publisher name

    Asia-Pacific Association for Machine Translation (AAMT)

  • Place of publication

    Kyoto, Japan

  • Event location

    Macau SAR, China

  • Event date

    Sep 4, 2023

  • Type of event by nationality

    WRD - Celosvětová akce

  • UT code for WoS article