Using Various Types of Multimedia Resources to Train System for Automatic Transcription of Czech Historical Oral Archives

The result's identifiers

  • Result code in IS VaVaI

    RIV/46747885:24220/13:#0002790 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F13%3A%230002790)

  • Result on the web

    http://dx.doi.org/10.1007/978-3-642-41190-8_25

  • DOI - Digital Object Identifier

    10.1007/978-3-642-41190-8_25

Alternative languages

  • Result language

    English

  • Original language name

    Using Various Types of Multimedia Resources to Train System for Automatic Transcription of Czech Historical Oral Archives

  • Original language description

    Historical spoken documents represent a unique segment of national cultural heritage. In order to disclose the large Czech Radio audio archive to the research community and to the public, we have been developing a system whose aim is to automatically transcribe the archive files, index them and make them searchable. The transcription of contemporary (1 or 2 decades old) documents is based on the lexicon and statistical language model (LM) built from a large amount of recent texts available in electronic form. For the older periods (before 1990), however, digital texts do not exist. Therefore, we needed a) to find resources that represent the language of those times, b) to convert them from their original form to text, c) to utilize this text for creating epoch-specific lexicons and LMs, and eventually, d) to apply them in the developed speech recognition system. In our case, the main resources included: scanned historical newspapers, shorthand notes from the national parliament and subtitles from

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

    JC - Computer hardware and software

  • OECD FORD branch

Result continuities

  • Project

    DF11P01OVV013: Disclosure of the Czech Radio archive for sophisticated search

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Others

  • Publication year

    2013

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    New Trends in Image Analysis and Processing - ICIAP 2013

  • ISBN

    9783642411892

  • ISSN

    0302-9743

  • e-ISSN

  • Number of pages

    10

  • Pages from-to

    228-237

  • Publisher name

    Springer-Verlag Berlin Heidelberg

  • Place of publication

    Germany, Berlin

  • Event location

    Italy, Naples

  • Event date

    Sep 9, 2013

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article