Building LVCSR system for transcription of spontaneously pronounced Russian testimonies in the MALACH project: initial steps and first results
The result's identifiers
Result code in IS VaVaI
RIV/49777513:23520/03:00000156 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F03%3A00000156)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Building LVCSR system for transcription of spontaneously pronounced Russian testimonies in the MALACH project: initial steps and first results
Original language description
The MALACH project uses the world's largest digital archives of video oral histories, collected by the Survivors of the Shoah Visual History Foundation (VHF), and attempts to access such archives by advancing the state of the art in Automated Speech Recognition (ASR) and Information Retrieval (IR). This paper discusses the initial steps and the first results in building a large vocabulary continuous speech recognition (LVCSR) system for transcription of Russian witnesses' testimonies. Russian, as the third language processed in the MALACH project (after English and Czech), brought new problems, especially in the phonetic area. Although most of the Russian testimonies were provided by native Russian survivors, we have encountered many different accents in their speech, caused by the territories where the survivors are living.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
JD - Use of computers, robotics and its application
OECD FORD branch
—
Result continuities
Project
LN00A063: Centre of Computational Linguistics
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Z - Research plan (with a link to CEZ)
Others
Publication year
2003
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Text, Speech and Dialogue
ISBN
3-540-20024-X
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
327-332
Publisher name
Springer
Place of publication
Berlin
Event location
České Budějovice
Event date
Sep 8, 2003
Type of event by nationality
CST - Nationwide event
UT code for WoS article
—