Designing a corpus of Czech monologues: ORATOR v2

Identifikátory výsledku

Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F21%3A10434710" target="_blank" >RIV/00216208:11210/21:10434710 - isvavai.cz</a>
Výsledek na webu
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=j7wj1Oz78u" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=j7wj1Oz78u</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.2478/jazcas-2021-0048" target="_blank" >10.2478/jazcas-2021-0048</a>

Alternativní jazyky

Jazyk výsledku
angličtina
Název v původním jazyce
Designing a corpus of Czech monologues: ORATOR v2
Popis výsledku v původním jazyce
ORATOR v2 is a new 1.5M word corpus of Czech monologues, delivered to a live audience in semi-formal to formal settings. It was designed to chart the space of naturally occurring monologues which can be obtained for corpus processing. As such, it aims for diversity but does not attempt any balancing of subcategories, recognizing that some types of data are inherently easier to obtain in high volume than others. The transcription guidelines and annotation tools employed are the same as other recent spoken corpora published by the CNC, which facilitates interesting comparisons between various types of spoken Czech. The present paper sketches out three case studies, comparing ORATOR to the informal conversations of ORTOFON v2 in terms of the frequencies of demonstratives and hesitations, as well as lexical richness.
Název v anglickém jazyce
Designing a corpus of Czech monologues: ORATOR v2
Popis výsledku anglicky
ORATOR v2 is a new 1.5M word corpus of Czech monologues, delivered to a live audience in semi-formal to formal settings. It was designed to chart the space of naturally occurring monologues which can be obtained for corpus processing. As such, it aims for diversity but does not attempt any balancing of subcategories, recognizing that some types of data are inherently easier to obtain in high volume than others. The transcription guidelines and annotation tools employed are the same as other recent spoken corpora published by the CNC, which facilitates interesting comparisons between various types of spoken Czech. The present paper sketches out three case studies, comparing ORATOR to the informal conversations of ORTOFON v2 in terms of the frequencies of demonstratives and hesitations, as well as lexical richness.

Klasifikace

Druh
J<sub>SC</sub> - Článek v periodiku v databázi SCOPUS
CEP obor
—
OECD FORD obor
60203 - Linguistics

Návaznosti výsledku

Projekt
<a href="/cs/project/LM2018137" target="_blank" >LM2018137: Český národní korpus</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)

Ostatní

Rok uplatnění
2021
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

Název periodika
Jazykovedny Casopis
ISSN
0021-5597
e-ISSN
—
Svazek periodika
72
Číslo periodika v rámci svazku
2
Stát vydavatele periodika
SK - Slovenská republika
Počet stran výsledku
11
Strana od-do
520-530
Kód UT WoS článku
—
EID výsledku v databázi Scopus
2-s2.0-85123540419

Podobné výsledky(10)

Korpus monologů: ORATOR ORATOR v2: Korpus monologů Corpus of Czech Students' Spoken English

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Designing a corpus of Czech monologues: ORATOR v2

Identifikátory výsledku

Alternativní jazyky

Klasifikace

Návaznosti výsledku

Ostatní

Údaje specifické pro druh výsledku

Podobné výsledky(10)

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Popis výsledku

Identifikátory výsledku

Identifikátory výsledku

Alternativní jazyky

Alternativní jazyky

Klasifikace

Klasifikace

Návaznosti výsledku

Návaznosti výsledku

Ostatní

Ostatní

Údaje specifické pro druh výsledku

Údaje specifické pro druh výsledku

Podobné výsledky(10)