A New Approach for Semi-automatic Building and Extending a Multilingual Terminology Thesaurus
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F19%3A00109355" target="_blank" >RIV/00216224:14330/19:00109355 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1142/S0218213019500088" target="_blank" >http://dx.doi.org/10.1142/S0218213019500088</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1142/S0218213019500088" target="_blank" >10.1142/S0218213019500088</a>
Alternative languages
Result language
English
Title in the original language
A New Approach for Semi-automatic Building and Extending a Multilingual Terminology Thesaurus
Result description in the original language
This paper describes a new system for semi-automatically building, extending and managing a terminological thesaurus: a multilingual terminology dictionary enriched with relationships between the terms themselves. The system can radically improve the workflow of current terminology expert groups, where most editing decisions still come from introspection. It supplements the lexicographic process with natural language processing techniques, which are seamlessly integrated into the thesaurus editing environment. The system's methodology and the resulting thesaurus are closely connected to new domain corpora in the six languages involved. These corpora are used for term usage examples as well as for the automatic extraction of new candidate terms. The terminological thesaurus is now accessible via a web-based application, which a) presents rich, detailed information on each term, b) visualizes term relations, and c) displays real-life usage examples of the term in domain-related documents and in context-based similar terms. Furthermore, the specialized corpora are used to detect candidate translations of terms from the central language (Czech) into the other languages (English, French, German, Russian and Slovak), as well as to detect broader Czech terms, which help to place new terms in the existing thesaurus hierarchy. The project has been realized as a terminological thesaurus of land surveying, but the presented tools and methodology are reusable for other terminology domains.
Title in English
A New Approach for Semi-automatic Building and Extending a Multilingual Terminology Thesaurus
Result description in English
This paper describes a new system for semi-automatically building, extending and managing a terminological thesaurus: a multilingual terminology dictionary enriched with relationships between the terms themselves. The system can radically improve the workflow of current terminology expert groups, where most editing decisions still come from introspection. It supplements the lexicographic process with natural language processing techniques, which are seamlessly integrated into the thesaurus editing environment. The system's methodology and the resulting thesaurus are closely connected to new domain corpora in the six languages involved. These corpora are used for term usage examples as well as for the automatic extraction of new candidate terms. The terminological thesaurus is now accessible via a web-based application, which a) presents rich, detailed information on each term, b) visualizes term relations, and c) displays real-life usage examples of the term in domain-related documents and in context-based similar terms. Furthermore, the specialized corpora are used to detect candidate translations of terms from the central language (Czech) into the other languages (English, French, German, Russian and Slovak), as well as to detect broader Czech terms, which help to place new terms in the existing thesaurus hierarchy. The project has been realized as a terminological thesaurus of land surveying, but the presented tools and methodology are reusable for other terminology domains.
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
S - Specific research at universities
Other
Publication year
2019
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
International Journal on Artificial Intelligence Tools
ISSN
0218-2130
e-ISSN
1793-6349
Volume of the periodical
28
Issue of the periodical within the volume
2
Country of the periodical's publisher
US - United States of America
Number of pages
21
Pages from-to
1-21
UT WoS code of the article
000463577400004
EID of the result in the Scopus database
2-s2.0-85063990309