The problem of linguistic markup conversion: the transformation of the Compreno markup into the UD format
Result identifiers
Result code in IS VaVaI
RIV/00216208:11320/23:562VW4RS - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3A562VW4RS)
Result on the web
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85175564216&doi=10.28995%2f2075-7182-2023-22-191-199&partnerID=40&md5=a4adf1ae2bbef05d9075ea684d9574ac
DOI - Digital Object Identifier
10.28995/2075-7182-2023-22-191-199 (http://dx.doi.org/10.28995/2075-7182-2023-22-191-199)
Alternative languages
Result language
English
Title in original language
The problem of linguistic markup conversion: the transformation of the Compreno markup into the UD format
Result description in original language
"The linguistic markup is an important NLP task. Currently, there are several popular formats of the markup (Universal Dependencies, Prague Dependencies, and so on), which are mostly focused on morphology and syntax. Full semantic markup can be found in the ABBYY Compreno model. However, the structure of the format differs significantly from the models mentioned above. In the given work, we convert the Compreno markup into the UD format, which is rather popular among NLP researchers, and enrich it with the semantical pattern. Compreno and UD present morphology and syntax differently as far as tokenization, POS-tagging, ellipsis, coordination, and some other things are concerned, which makes the conversion of one format into another more complicated. Nevertheless, the conversion allowed us to create the UD-markup containing not only morpho-syntactic information but also the semantic one. © Dialogue 2023. All rights reserved."
Title in English
The problem of linguistic markup conversion: the transformation of the Compreno markup into the UD format
Result description in English
"The linguistic markup is an important NLP task. Currently, there are several popular formats of the markup (Universal Dependencies, Prague Dependencies, and so on), which are mostly focused on morphology and syntax. Full semantic markup can be found in the ABBYY Compreno model. However, the structure of the format differs significantly from the models mentioned above. In the given work, we convert the Compreno markup into the UD format, which is rather popular among NLP researchers, and enrich it with the semantical pattern. Compreno and UD present morphology and syntax differently as far as tokenization, POS-tagging, ellipsis, coordination, and some other things are concerned, which makes the conversion of one format into another more complicated. Nevertheless, the conversion allowed us to create the UD-markup containing not only morpho-syntactic information but also the semantic one. © Dialogue 2023. All rights reserved."
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the collection
"Komp'ut. Lingvist. Intellekt. Tehnol."
ISBN
—
ISSN
2221-7932
e-ISSN
—
Number of pages
15
Pages from-to
200-214
Publisher name
ABBYY PRODUCTION LLC
Place of publication
—
Event location
Dubrovnik
Event date
1. 1. 2023
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—