Increasing Coverage of Translation Memories with Linguistically Motivated Segment Combination Methods
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F15%3A00081035" target="_blank" >RIV/00216224:14330/15:00081035 - isvavai.cz</a>
Result on the web
<a href="http://rgcl.wlv.ac.uk/events/NLP4TM/3_Paper.pdf" target="_blank" >http://rgcl.wlv.ac.uk/events/NLP4TM/3_Paper.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in original language
Increasing Coverage of Translation Memories with Linguistically Motivated Segment Combination Methods
Result description in original language
Translation memories (TMs) used in computer-aided translation (CAT) systems are the highest-quality source of parallel texts since they consist of segment translation pairs approved by professional human translators. The obvious problem is their size and coverage of new document segments when compared with other parallel data. In this paper, we describe several methods for expanding translation memories using linguistically motivated segment combining approaches concentrated on preserving the high translational quality. The evaluation of the methods was done on a medium-size real-world translation memory and documents provided by a Czech translation company as well as on a large publicly available DGT translation memory published by the European Commission. The benefit of the TM expansion methods was evaluated by the pre-translation analysis of the widely used MemoQ CAT system, and the METEOR metric was used for measuring the quality of fully expanded new translation segments.
Title in English
Increasing Coverage of Translation Memories with Linguistically Motivated Segment Combination Methods
Result description in English
Translation memories (TMs) used in computer-aided translation (CAT) systems are the highest-quality source of parallel texts since they consist of segment translation pairs approved by professional human translators. The obvious problem is their size and coverage of new document segments when compared with other parallel data. In this paper, we describe several methods for expanding translation memories using linguistically motivated segment combining approaches concentrated on preserving the high translational quality. The evaluation of the methods was done on a medium-size real-world translation memory and documents provided by a Czech translation company as well as on a large publicly available DGT translation memory published by the European Commission. The benefit of the TM expansion methods was evaluated by the pre-translation analysis of the widely used MemoQ CAT system, and the METEOR metric was used for measuring the quality of fully expanded new translation segments.
Classification
Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—
Result linkages
Project
The result was created during the realization of more than one project. More information in the Projects tab.
Linkages
P - Research and development project financed from public funds (with a link to CEP)<br>S - Specific research at universities
Other
Publication year
2015
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
Proceedings of The Workshop on Natural Language Processing for Translation Memories (NLP4TM)
ISBN
9789544520328
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
31-35
Publisher name
INCOMA Ltd. Shoumen
Place of publication
Bulgaria
Event location
Hissar, Bulgaria
Event date
1. 1. 2015
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—