Experiments with the recognition of highly inflected spoken language (Czech) in the large vocabulary task
The result's identifiers
Result code in IS VaVaI
RIV/49777513:23520/01:00065617 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F01%3A00065617)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Experiments with the recognition of highly inflected spoken language (Czech) in the large vocabulary task
Original language description
This paper presents three annotated and phonetically transcribed large speech corpora developed for spoken Czech. All corpora were collected during the last two years at the Department of Cybernetics, University of West Bohemia (UWB) in Pilsen. The first two collections are broadcast news; the third corpus is a high-quality read-speech database. The paper describes the collection conditions, annotation, and phonetic transcription process for each corpus. The basic phonetic and lexical characteristics of all corpora are given and compared. Moreover, the paper deals with problems encountered in large vocabulary continuous speech recognition of highly inflectional languages. The concept of morpheme-based language modeling is introduced, and speech recognition results using word-based and morpheme-based language models are also reported.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
JD - Use of computers, robotics and its application
OECD FORD branch
—
Result continuities
Project
LN00A063: Centre of Computational Linguistics
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Z - Research plan (with a link to CEZ)
Others
Publication year
2001
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Experiments with the recognition of highly inflected spoken language (Czech) in the large vocabulary task
ISBN
9800775463
ISSN
—
e-ISSN
—
Number of pages
1
Pages from-to
—
Publisher name
Not stated
Place of publication
Orlando
Event location
Not stated
Event date
Jan 1, 2001
Type of event by nationality
CST - National event
UT code for WoS article
—