English to Urdu Statistical Machine Translation: Establishing a Baseline

Result identifiers

Result code in IS VaVaI
RIV/00216208:11320/14:10289379 (isvavai.cz): https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F14%3A10289379
Result on the web
—
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Title in the original language
English to Urdu Statistical Machine Translation: Establishing a Baseline
Result description in the original language
The aim of this paper is to categorize and present the existence of resources for English-to-Urdu machine translation (MT) and to establish an empirical baseline for this task. By doing so, we hope to set up a common ground for MT research with Urdu to allow for a congruent progress in this field. We build baseline phrase-based MT (PBMT) and hierarchical MT systems and report the results on 3 official independent test sets. On all test sets, hierarchical MT significantly outperformed PBMT. The highest single-reference BLEU score is achieved by the hierarchical system and reaches 21.58%, but this figure depends on the randomly selected test set. Our manual evaluation of 175 sentences suggests that in 45% of sentences, the hierarchical MT is ranked better than the PBMT output, compared to 21% of sentences where PBMT wins, the rest being equal.

Classification

Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—

Result continuities

Project
—
Continuities
R - Project of the EC Framework Programme

Others

Publication year
2014
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Article name in the proceedings
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
ISBN
978-1-941643-26-6
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
37-42
Publisher name
Dublin City University and Association for Computational Linguistics
Place of publication
Dublin, Ireland
Event location
Dublin, Ireland
Event date
23 August 2014
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—

Similar results (10)

UNLT: Urdu Natural Language Toolkit
Explainable Quality Estimation: CUNI Eval4NLP Submission
Word-Order Issues in English-to-Urdu Statistical Machine Translation
