The ParlaMint corpora of parliamentary proceedings

Result identifiers

Result code in IS VaVaI
RIV/00216208:11320/23:10475718 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3A10475718)
Alternative codes found
RIV/00216208:11320/23:NDRH6PUA
Result on the web
https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=v6xM9SPCz2
DOI - Digital Object Identifier
10.1007/s10579-021-09574-0 (http://dx.doi.org/10.1007/s10579-021-09574-0)

Alternative languages

Result language
English
Title in the original language
The ParlaMint corpora of parliamentary proceedings
Result description in the original language
This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project's GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter interface for on-line exploration and analysis.
Title in English
The ParlaMint corpora of parliamentary proceedings
Result description in English
This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project's GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter interface for on-line exploration and analysis.

Classification

Type
Jimp - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

Project
LM2018101: Digitální výzkumná infrastruktura pro jazykové technologie, umění a humanitní vědy (Digital Research Infrastructure for Language Technologies, Arts and Humanities)
Linkages
P - Research and development project financed from public funds (with a link to CEP)

Other

Year of implementation
2023
Data confidentiality code
S - Complete and true project data are not subject to protection under special legal regulations

Data specific to the result type

Periodical name
Language Resources and Evaluation
ISSN
1574-020X
e-ISSN
1574-0218
Periodical volume
57
Issue number within the volume
1
Publisher's country
NL - Netherlands
Number of pages
34
Pages from-to
415-448
UT WoS article code
000749985300001
Result EID in the Scopus database
2-s2.0-85124105199
