System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F11%3A43898200" target="_blank" >RIV/49777513:23520/11:43898200 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1186/1687-4722-2011-10" target="_blank" >http://dx.doi.org/10.1186/1687-4722-2011-10</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1186/1687-4722-2011-10" target="_blank" >10.1186/1687-4722-2011-10</a>
Alternative languages
Result language
English
Original language name
System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive
Original language description
The main objective of the work presented in this paper was to develop a complete system that would fulfil the original vision of the MALACH project. The project's goals were to employ automatic speech recognition and information retrieval techniques to provide improved access to the large video archive containing recorded testimonies of Holocaust survivors. The system has so far been developed for the Czech part of the archive only. It takes advantage of a state-of-the-art speech recognition system tailored to the challenging properties of the recordings in the archive (elderly speakers, spontaneous speech and emotionally loaded content) and of its close coupling with the actual search engine. The design of the algorithm, which adopts the spoken term detection approach, is focused on the speed of retrieval. The resulting system is able to search through the 1,000 h of video constituting the Czech portion of the archive and find query word occurrences in a matter of seconds.
Czech name
—
Czech description
—
Classification
Type
J<sub>x</sub> - Unclassified - Peer-reviewed scientific article (Jimp, Jsc and Jost)
CEP classification
JD - Use of computers, robotics and its application
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/1QS101470516" target="_blank" >1QS101470516: Automatic keyword spotting in audio data streams</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>S - Specific research at universities
Others
Publication year
2011
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
EURASIP Journal on Audio, Speech and Music Processing
ISSN
1687-4714
e-ISSN
—
Volume of the periodical
2011
Issue of the periodical within the volume
10
Country of publishing house
US - UNITED STATES
Number of pages
19
Pages from-to
1-19
UT code for WoS article
000299122700001
EID of the result in the Scopus database
—