Contribution Towards a Corpus-Based Phraseology Minimum
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F17%3A10366692" target="_blank" >RIV/00216208:11210/17:10366692 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in the original language
Contribution Towards a Corpus-Based Phraseology Minimum
Result description in the original language
This paper attempts to compile a list of the most commonly used (most typical) Czech idioms using corpus data with annotated collocations. Collocations are annotated in corpora of contemporary written Czech as well as in a corpus of spoken Czech containing transcripts of intimate conversations. Idioms are selected based on their frequency in different text types (newspapers and magazines, non-fiction, fiction, spoken language), and an idiom is included in the resulting list only if it occurs in at least two different text types. Each text type is briefly characterized in terms of the types of idioms typical of it (according to formal criteria). The study confirms a substantial divide between idiom use in written and spoken language. A smaller difference can be observed between fiction on the one hand and non-fiction and newspapers on the other. The main reason is the interactive nature of fiction texts, which leads them to contain idioms with verbal components. These are employed in a fashion similar to spoken language, in interactions among the individual characters. By contrast, non-fiction and journalistic language tend to be more descriptive, with more nominal idioms.
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
60203 - Linguistics
Result continuities
Project
<a href="/cs/project/GA16-07473S" target="_blank" >GA16-07473S: Mezi slovníkem a gramatikou</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
I - Institutional support for the long-term conceptual development of a research organization
Other
Year of implementation
2017
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article title in the proceedings
Computational and Corpus-Based Phraseology
ISBN
978-3-319-69804-5
ISSN
0302-9743
e-ISSN
—
Number of pages
12
Pages from-to
220-231
Publisher name
Springer Berlin Heidelberg
Place of publication
London
Event location
London
Event date
13. 11. 2017
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—