Source language classification of indirect translations

Result identifiers

  • Result code in IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F22%3A10456674" target="_blank" >RIV/00216208:11210/22:10456674 - isvavai.cz</a>

  • Result on the web

    <a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=SMT_u-XsKd" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=SMT_u-XsKd</a>

  • DOI - Digital Object Identifier

    <a href="http://dx.doi.org/10.1075/target.00006.iva" target="_blank" >10.1075/target.00006.iva</a>

Alternative languages

  • Result language

    English

  • Title in the original language

    Source language classification of indirect translations

  • Result description in the original language

    One of the major barriers to the systematic study of indirect translation - that is, translations of translations - is the lack of efficient methods to identify these translations. In this article, we use supervised machine learning to examine whether computers can be harnessed to identify indirect translations. Our data consist of a monolingual comparable corpus that includes (1) nontranslated Finnish texts, (2) direct translations from English, French, German, Greek, and Swedish into Finnish, and (3) indirect translations from Greek (the ultimate source language) via English, French, German, and Swedish (mediating languages) into Finnish. We use n-grams of various types and lengths as feature sets and random forests as the statistical classification technique. To maximize the transferability of the method, the feature sets were implemented in accordance with the Universal Dependencies framework. This study confirms that computers can distinguish between translated and nontranslated Finnish, as well as between Finnish translations made from different source languages. Regarding indirect translations, the ultimate source language has a greater impact on the linguistic composition of indirect Finnish translations than their respective mediating languages. Hence, the indirect translations could not be reliably identified. Therefore, our results suggest that the reliable computational identification of indirect translations and their mediating languages requires a way to control for the effect of the ultimate source language.
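The feature-extraction step mentioned in the description (n-grams of various lengths over Universal Dependencies annotations) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example tag sequence, and the choice of relative frequencies as feature values are assumptions made for illustration, and the random forest classification step is not shown.

```python
from collections import Counter
from typing import Dict, Iterable, List, Sequence, Tuple


def ngrams(tags: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a tag sequence (empty if too short)."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def ngram_features(tags: Sequence[str],
                   lengths: Iterable[int] = (1, 2, 3)) -> Dict[str, float]:
    """Map each n-gram to its relative frequency among n-grams of the same length."""
    features: Dict[str, float] = {}
    for n in lengths:
        grams = ngrams(tags, n)
        if not grams:
            continue  # text shorter than n tokens: no n-grams of this length
        for gram, count in Counter(grams).items():
            features["_".join(gram)] = count / len(grams)
    return features


# Hypothetical sentence annotated with Universal Dependencies UPOS tags.
example_tags = ["PRON", "VERB", "DET", "NOUN", "PUNCT"]
features = ngram_features(example_tags, lengths=(1, 2))
```

Feature dictionaries of this kind, computed per text, could then be vectorized and passed to a classifier such as a random forest, with the source language (or the nontranslated class) as the label.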

Classification

  • Type

    J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database

  • CEP field

  • OECD FORD field

    60203 - Linguistics

Result linkages

  • Project

  • Linkages

Other

  • Year of application

    2022

  • Data confidentiality code

    S - Complete and accurate project data are not subject to protection under special legal regulations

Data specific to the result type

  • Periodical name

    Target

  • ISSN

    0924-1884

  • e-ISSN

    1569-9986

  • Periodical volume

    34

  • Issue number within the volume

    3

  • Country of the periodical's publisher

    NL - Netherlands

  • Number of pages

    25

  • Pages from-to

    370-394

  • UT WoS code of the article

    000782566300001

  • EID of the result in the Scopus database