Improving Evaluation of English-Czech MT through Paraphrasing
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F14%3A10289360" target="_blank" >RIV/00216208:11320/14:10289360 - isvavai.cz</a>
Result on the web
<a href="http://www.lrec-conf.org/proceedings/lrec2014/pdf/935_Paper.pdf" target="_blank" >http://www.lrec-conf.org/proceedings/lrec2014/pdf/935_Paper.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Improving Evaluation of English-Czech MT through Paraphrasing
Original language description
In this paper, we present a method of improving the accuracy of machine translation evaluation of Czech sentences. Given a reference sentence, our algorithm transforms it by targeted paraphrasing into a new synthetic reference sentence that is closer in wording to the machine translation output while preserving the meaning of the original reference sentence. Grammatical correctness of the new reference sentence is ensured by applying Depfix to the newly created paraphrases. Depfix is a system for post-editing English-to-Czech machine translation outputs; we adjusted it to fix the errors in paraphrased sentences. Because our source of paraphrases is noisy, we experiment with adding word alignment. However, the alignment reduces the number of paraphrases found, and the best results were achieved by a simple greedy method restricted to one-word paraphrases, thanks to their intensive filtering. BLEU scores computed using these new reference sentences show significantly higher correlati
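The greedy one-word substitution mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `greedy_paraphrase`, the dictionary-based paraphrase table, and the English toy tokens are all assumptions made for illustration, and the Depfix correction step and paraphrase filtering are omitted entirely.

```python
from collections import Counter


def greedy_paraphrase(reference, hypothesis, paraphrases):
    """Greedily swap reference words for one-word paraphrases seen in the MT output.

    `paraphrases` is a hypothetical table mapping a word to candidate
    one-word paraphrases; it stands in for a filtered paraphrase resource.
    """
    ref_counts = Counter(reference)
    hyp_counts = Counter(hypothesis)
    # Hypothesis words that no reference word already matches.
    available = hyp_counts - ref_counts
    # Reference words that the hypothesis does not confirm.
    unmatched = ref_counts - hyp_counts

    result = []
    for word in reference:
        if unmatched[word] > 0:
            unmatched[word] -= 1
            # Take the first candidate that still occurs unmatched in the hypothesis.
            candidate = next(
                (p for p in paraphrases.get(word, ()) if available[p] > 0),
                None,
            )
            if candidate is not None:
                available[candidate] -= 1
                result.append(candidate)
                continue
        result.append(word)
    return result


# Toy example (English tokens used only for readability).
reference = ["the", "car", "is", "fast"]
hypothesis = ["the", "automobile", "is", "quick"]
table = {"car": ["vehicle", "automobile"], "fast": ["rapid"]}
new_reference = greedy_paraphrase(reference, hypothesis, table)
```

In this toy run only "car" is replaced, because "automobile" is the only listed paraphrase that also occurs in the hypothesis; "fast" stays unchanged since the table offers no paraphrase found in the MT output.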
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/LM2010013" target="_blank" >LM2010013: LINDAT-CLARIN: Institute for analysis, processing and distribution of linguistic data</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
S - Specific research at universities
Others
Publication year
2014
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014)
ISBN
978-2-9517408-8-4
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
596-601
Publisher name
European Language Resources Association
Place of publication
Reykjavík, Iceland
Event location
Reykjavík, Iceland
Event date
May 26, 2014
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—