MT Evaluation in the Context of Language Complexity

Result identifiers

  • Result code in IS VaVaI

    RIV/62156489:43110/21:43920964 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62156489%3A43110%2F21%3A43920964)

  • Alternative codes found

    RIV/00216305:26210/22:PU143580

  • Result on the web

    https://doi.org/10.1155/2021/2806108

  • DOI - Digital Object Identifier

    10.1155/2021/2806108 (http://dx.doi.org/10.1155/2021/2806108)

Alternative languages

  • Result language

    English

  • Title in the original language

    MT Evaluation in the Context of Language Complexity

  • Result description in the original language

    The paper investigates the impact of an artificial agent (machine translator) on a human agent (posteditor) using a proposed methodology, which is based on language complexity measures, POS tags, frequent tagsets, association rules, and their summarization. We examine this impact from the point of view of language complexity in terms of word and sentence structure. Using the proposed methodology, we analyzed 24 733 tags of English-to-Slovak translations of technical texts, corresponding to the output of two MT systems (Google Translate and the European Commission's MT tool). We used both manual (adequacy and fluency) and semiautomatic (HTER metric) MT evaluation measures as the criteria for validity. We show that the proposed methodology is valid based on the evaluation of frequent tagsets and rules of MT outputs produced by Google Translate or by the European Commission's MT tool, and both postedited MT (PEMT) outputs using baseline methods. Our results have also shown that the PEMT output produced by Google Translate is characterized by more frequent tagsets such as verbs in the infinitive with modal verbs compared to its MT output, which is characterized by masculine, inanimate nouns in the locative singular. In the MT output produced by the European Commission's MT tool, the most frequent tagset was verbs in the infinitive, compared to its postedited MT output, where verbs in the imperative and the second person plural occurred. These findings are also obtained from the use of the proposed methodology for MT evaluation. The contribution of the proposed methodology is the identification of systematic, not random, errors. Additionally, the study can serve as information for optimizing the translation process using postediting.

Classification

  • Type

    Jimp - Article in a periodical indexed in the Web of Science database

  • CEP field

  • OECD FORD field

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

  • Project

  • Linkages

    I - Institutional support for the long-term conceptual development of a research organization

Other

  • Year of implementation

    2021

  • Data confidentiality code

    S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific to the result type

  • Periodical name

    Complexity

  • ISSN

    1076-2787

  • e-ISSN

  • Periodical volume

    Not specified

  • Issue of the periodical within the volume

    17 December

  • Country of the periodical's publisher

    US - United States of America

  • Number of pages of the result

    15

  • Pages from-to

    2806108

  • UT WoS code of the article

    000783326400002

  • EID of the result in the Scopus database

    2-s2.0-85122366968