Word-Order Issues in English-to-Urdu Statistical Machine Translation

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F11%3A10107812" target="_blank" >RIV/00216208:11320/11:10107812 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.2478/v10108-011-0007-0" target="_blank" >http://dx.doi.org/10.2478/v10108-011-0007-0</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.2478/v10108-011-0007-0" target="_blank" >10.2478/v10108-011-0007-0</a>

Alternative languages

Result language
English
Title in original language
Word-Order Issues in English-to-Urdu Statistical Machine Translation
Result description in original language
We investigate phrase-based statistical machine translation between English and Urdu, two Indo-European languages that differ significantly in their word-order preferences. Reordering of words and phrases is thus a necessary part of the translation process. While local reordering is modeled nicely by phrase-based systems, long-distance reordering is known to be a hard problem. We perform experiments using the Moses SMT system and discuss the reordering models available in Moses. We then present our novel, Urdu-aware, yet generalizable approach based on reordering phrases in the syntactic parse tree of the source English sentence. Our technique significantly improves the quality of English-Urdu translation with Moses, both in terms of BLEU score and of subjective human judgments.
Title in English
Word-Order Issues in English-to-Urdu Statistical Machine Translation
Result description in English
We investigate phrase-based statistical machine translation between English and Urdu, two Indo-European languages that differ significantly in their word-order preferences. Reordering of words and phrases is thus a necessary part of the translation process. While local reordering is modeled nicely by phrase-based systems, long-distance reordering is known to be a hard problem. We perform experiments using the Moses SMT system and discuss the reordering models available in Moses. We then present our novel, Urdu-aware, yet generalizable approach based on reordering phrases in the syntactic parse tree of the source English sentence. Our technique significantly improves the quality of English-Urdu translation with Moses, both in terms of BLEU score and of subjective human judgments.

Classification

Type
Jx - Unclassified - Article in a specialist periodical (Jimp, Jsc and Jost)
CEP field
AI - Linguistics
OECD FORD field
—

Result continuities

Project
<a href="/cs/project/GAP406%2F11%2F1499" target="_blank" >GAP406/11/1499: Čeština ve věku strojového překladu (Czech in the Age of Machine Translation)</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Z - Research plan (with a link to CEZ)
S - Specific research at universities

Others

Year of implementation
2011
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Periodical name
The Prague Bulletin of Mathematical Linguistics
ISSN
0032-6585
e-ISSN
—
Periodical volume
Not stated
Issue number within the volume
95
Country of the periodical's publisher
CZ - Czech Republic
Number of pages of the result
20
Pages from-to
87-106
UT WoS code of the article
—
EID of the result in the Scopus database
—

Similar results (10)

What a Transfer-Based System Brings to the Combination with PBMT
Merged bilingual trees based on Universal Dependencies in Machine Translation
English to Urdu Statistical Machine Translation: Establishing a Baseline
