Increasing Coverage of Translation Memories with Linguistically Motivated Segment Combination Methods
The result's identifiers
Result code in IS VaVaI
RIV/00216224:14330/15:00081035 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F15%3A00081035)
Result on the web
http://rgcl.wlv.ac.uk/events/NLP4TM/3_Paper.pdf
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Increasing Coverage of Translation Memories with Linguistically Motivated Segment Combination Methods
Original language description
Translation memories (TMs) used in computer-aided translation (CAT) systems are the highest-quality source of parallel texts, since they consist of segment translation pairs approved by professional human translators. Their obvious problem is their size and their coverage of new document segments when compared with other parallel data. In this paper, we describe several methods for expanding translation memories using linguistically motivated segment combination approaches that concentrate on preserving high translation quality. The methods were evaluated on a medium-size real-world translation memory and documents provided by a Czech translation company, as well as on the large, publicly available DGT translation memory published by the European Commission. The benefit of the TM expansion methods was evaluated with the pre-translation analysis of the widely used MemoQ CAT system, and the METEOR metric was used to measure the quality of fully expanded new translation segments.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
Result was created during the realization of more than one project. More information in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
S - Specific research at universities
Others
Publication year
2015
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of The Workshop on Natural Language Processing for Translation Memories (NLP4TM)
ISBN
9789544520328
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
31-35
Publisher name
INCOMA Ltd. Shoumen
Place of publication
Bulgaria
Event location
Hissar, Bulgaria
Event date
Jan 1, 2015
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—