Source language classification of indirect translations
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F22%3A10456674" target="_blank" >RIV/00216208:11210/22:10456674 - isvavai.cz</a>
Result on the web
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=SMT_u-XsKd" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=SMT_u-XsKd</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1075/target.00006.iva" target="_blank" >10.1075/target.00006.iva</a>
Alternative languages
Result language
English
Original language name
Source language classification of indirect translations
Original language description
One of the major barriers to the systematic study of indirect translation - that is, translations of translations - is the lack of efficient methods to identify these translations. In this article, we use supervised machine learning to examine whether computers can be harnessed to identify indirect translations. Our data consist of a monolingual comparable corpus that includes (1) nontranslated Finnish texts, (2) direct translations from English, French, German, Greek, and Swedish into Finnish, and (3) indirect translations from Greek (the ultimate source language) via English, French, German, and Swedish (mediating languages) into Finnish. We use n-grams of various types and lengths as feature sets and random forests as the statistical classification technique. To maximize the transferability of the method, the feature sets were implemented in accordance with the Universal Dependencies framework. This study confirms that computers can distinguish between translated and nontranslated Finnish, as well as between Finnish translations made from different source languages. Regarding indirect translations, the ultimate source language has a greater impact on the linguistic composition of indirect Finnish translations than their respective mediating languages. Hence, the indirect translations could not be reliably identified. Therefore, our results suggest that the reliable computational identification of indirect translations and their mediating languages requires a way to control for the effect of the ultimate source language.
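As a rough illustration of the n-gram feature sets the abstract mentions, the sketch below counts part-of-speech n-grams over sentences tagged with Universal Dependencies UPOS labels and converts the counts to relative frequencies. It is not the authors' implementation: the function names `pos_ngrams` and `relative_frequencies`, the choice of UPOS tags as the n-gram unit, and the toy input are all assumptions made for illustration. The resulting vectors are the kind of input a random forest classifier could be trained on; that step is not shown here.

```python
from collections import Counter
from typing import Iterable, Sequence


def pos_ngrams(tags: Sequence[str], n: int) -> Counter:
    """Count contiguous n-grams over one sentence's UPOS tag sequence (hypothetical helper)."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


def relative_frequencies(sentences: Iterable[Sequence[str]], n: int) -> dict:
    """Pool n-gram counts over all sentences of a text and normalise them (hypothetical helper)."""
    counts = Counter()
    for tags in sentences:
        counts.update(pos_ngrams(tags, n))
    total = sum(counts.values())
    # An empty text yields an empty feature dictionary rather than a division by zero.
    return {gram: c / total for gram, c in counts.items()} if total else {}


# Toy input: two sentences already tagged with UPOS labels (invented data, not from the corpus).
toy_text = [
    ["PRON", "VERB", "DET", "NOUN"],
    ["PRON", "VERB", "ADV"],
]
features = relative_frequencies(toy_text, n=2)
```

Sentence boundaries are respected here, so no n-gram spans two sentences; whether the study made the same choice is not stated in the abstract.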
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
60203 - Linguistics
Result continuities
Project
—
Continuities
—
Others
Publication year
2022
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Target
ISSN
0924-1884
e-ISSN
1569-9986
Volume of the periodical
34
Issue of the periodical within the volume
3
Country of publishing house
NL - THE KINGDOM OF THE NETHERLANDS
Number of pages
25
Pages from-to
370-394
UT code for WoS article
000782566300001
EID of the result in the Scopus database
—