Interactive search for words and phrases in large audio-visual archives
The result's identifiers
Result code in IS VaVaI
RIV/49777513:23520/17:43932995 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F17%3A43932995)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Interactive search for words and phrases in large audio-visual archives
Original language description
This paper describes an automatic system for processing and searching large audio-visual archives, intended especially for use in oral history studies. The system contains an automated processing pipeline for speech recognition and indexing. A carefully designed graphical user interface allows users to search for specific words and phrases, including out-of-vocabulary words, and to directly replay the occurrences, sorted by automatically estimated confidence scores. The first archive processed in this system is the MALACH archive, which contains personal testimonies of Holocaust survivors and witnesses. The searchable portion of the interviews consists of 2,000 hours of English recordings and 1,000 hours of Czech recordings. The paper gives a brief overview of the system architecture, describes the phoneme-based search, and provides basic performance metrics for both the English and Czech data.
Czech name
—
Czech description
—
Classification
Type
O - Miscellaneous
CEP classification
—
OECD FORD branch
20205 - Automation and control systems
Result continuities
Project
TE01020197: Centre for Applied Cybernetics 3
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2017
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations