Machine Translation for Historical Research: A Case Study of Aramaic-Ancient Hebrew Translations
Result identifiers
Result code in IS VaVaI
RIV/00216208:11320/25:D9LZB4LB - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3AD9LZB4LB)
Result on the web
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85191791166&doi=10.1145%2f3627168&partnerID=40&md5=5b1fe0c881b8fe08bf97e05c2473b5f7
DOI - Digital Object Identifier
10.1145/3627168 (http://dx.doi.org/10.1145/3627168)
Alternative languages
Result language
English
Title in the original language
Machine Translation for Historical Research: A Case Study of Aramaic-Ancient Hebrew Translations
Result description in the original language
In this article, we investigated machine translation in a cultural heritage domain, specifically the ability to translate Aramaic into another spoken language, for two primary purposes: evaluating the quality of ancient translations and preserving Aramaic (an endangered language). First, we detailed the construction of a publicly available Biblical parallel Aramaic-Hebrew corpus based on two ancient (early 2nd to late 4th century) Hebrew-Aramaic translations: Targum Onkelus and Targum Jonathan. Then, using the statistical machine translation approach, which in our use case significantly outperforms neural machine translation, we validated the expected high quality of the translations. The trained model failed to translate Aramaic texts of other dialects. However, when we trained the same statistical machine translation model on another Aramaic-Hebrew corpus of a different dialect (Zohar, 13th century), it achieved a very high translation score. We also examined an additional important cultural heritage source of Aramaic texts, the Babylonian Talmud (early 3rd to late 5th century). Since we do not have a parallel Aramaic-Hebrew corpus of the Talmud, we used the model trained on the Bible corpus for translation. We analyzed the results and suggested promising directions for future research. Copyright © 2024 held by the owner/author(s). Publication rights licensed to ACM.
Title in English
Machine Translation for Historical Research: A Case Study of Aramaic-Ancient Hebrew Translations
Result description in English
In this article, we investigated machine translation in a cultural heritage domain, specifically the ability to translate Aramaic into another spoken language, for two primary purposes: evaluating the quality of ancient translations and preserving Aramaic (an endangered language). First, we detailed the construction of a publicly available Biblical parallel Aramaic-Hebrew corpus based on two ancient (early 2nd to late 4th century) Hebrew-Aramaic translations: Targum Onkelus and Targum Jonathan. Then, using the statistical machine translation approach, which in our use case significantly outperforms neural machine translation, we validated the expected high quality of the translations. The trained model failed to translate Aramaic texts of other dialects. However, when we trained the same statistical machine translation model on another Aramaic-Hebrew corpus of a different dialect (Zohar, 13th century), it achieved a very high translation score. We also examined an additional important cultural heritage source of Aramaic texts, the Babylonian Talmud (early 3rd to late 5th century). Since we do not have a parallel Aramaic-Hebrew corpus of the Talmud, we used the model trained on the Bible corpus for translation. We analyzed the results and suggested promising directions for future research. Copyright © 2024 held by the owner/author(s). Publication rights licensed to ACM.
Classification
Type
J<sub>SC</sub> - Article in a periodical indexed in the SCOPUS database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
—
Other
Year of implementation
2024
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
Journal on Computing and Cultural Heritage
ISSN
1556-4673
e-ISSN
—
Periodical volume
17
Issue number within the volume
2
Country of the periodical's publisher
US - United States of America
Number of pages of the result
23
Pages from-to
1-23
UT WoS code of the article
—
EID of the result in the Scopus database
2-s2.0-85191791166