The InterCorp corpus, release 16ud

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F24%3A10489049" target="_blank" >RIV/00216208:11210/24:10489049 - isvavai.cz</a>
Result on the web
<a href="https://wiki.korpus.cz/doku.php/cnk:intercorp:verze16ud" target="_blank" >https://wiki.korpus.cz/doku.php/cnk:intercorp:verze16ud</a>
DOI - Digital Object Identifier
—

Result language
Czech
Title in the original language
Korpus InterCorp, verze 16ud
Result description in the original language
Nová verze rozsáhlého paralelního korpusu InterCorp obsahujícího původní a překladové texty v 62 jazycích (včetně češtiny). Obsahuje stejné texty jako InterCorp verze 16, obě verze se liší jen v lingvistické anotaci. Po InterCorpu 13ud je to druhá verze InterCorpu s lingvistickou anotací podle standardu Universal Dependencies, jednotnou pro všech 47 anotovaných jazyků. Verze 16ud je také prvním korpusem ČNK, který obsahuje metriky syntaktické komplexity a lexikální diverzity. Anotaci provedl u všech jazyků nástroj UDPipe na základě dat vytvořených v projektu UD.
Title in English
The InterCorp corpus, release 16ud
Result description in English
A new release of the extensive parallel corpus InterCorp, which contains original and translated texts in 62 languages (including Czech). It contains the same texts as InterCorp release 16; the two releases differ only in their linguistic annotation. After InterCorp 13ud, this is the second release of InterCorp annotated according to the Universal Dependencies (UD) standard, with a uniform annotation scheme for all 47 annotated languages. Release 16ud is also the first corpus of the Czech National Corpus (CNC) to include metrics of syntactic complexity and lexical diversity. All languages were annotated with the UDPipe tool, based on data created within the UD project.

Project
<a href="/cs/project/LM2023044" target="_blank" >LM2023044: Český národní korpus (Czech National Corpus)</a>
Funding links
P - Research and development project financed from public funds (with a link to CEP)

Year of implementation
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
