Online LDA-Based Language Model Adaptation

The result's identifiers

  • Result code in IS VaVaI

    RIV/49777513:23520/18:43952476 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F18%3A43952476)

  • Result on the web

    https://link.springer.com/chapter/10.1007%2F978-3-030-00794-2_36

  • DOI - Digital Object Identifier

    10.1007/978-3-030-00794-2_36 (http://dx.doi.org/10.1007/978-3-030-00794-2_36)

Alternative languages

  • Result language

    English

  • Original language name

    Online LDA-Based Language Model Adaptation

  • Original language description

    In this paper, we present our improvements in online topic-based language model adaptation. Our aim is to enhance automatic speech recognition of multi-topic speech that must be recognized in real time (online). Latent Dirichlet Allocation (LDA) is an unsupervised topic model designed to uncover hidden semantic relationships between words and documents in a text corpus and thus reveal latent topics automatically. We use LDA to cluster the text corpus and to predict topics online from partial hypotheses during real-time speech recognition. Based on detected topic changes in the speech, we adapt the language model on the fly. We demonstrate the improvement of our system on the task of online subtitling of TV news, where we achieved an 18% relative reduction in perplexity and a 3.52% relative reduction in WER over a non-adapted system.

  • Czech name

  • Czech description
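
The adaptation loop outlined in the description above (predict a topic from the partial hypothesis, detect a topic change, then adapt the language model) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy topic-word tables stand in for a trained LDA model, LDA inference is replaced by a simple posterior under a uniform topic prior, and adaptation is shown as linear interpolation of unigram probabilities. All names and values (`TOPIC_WORD`, `BACKGROUND`, `FLOOR`, `window`, `lam`) are hypothetical.

```python
import math

# Hypothetical toy topic-word distributions standing in for a trained LDA model.
TOPIC_WORD = {
    "sports": {"goal": 0.4, "match": 0.4, "vote": 0.1, "budget": 0.1},
    "politics": {"goal": 0.1, "match": 0.1, "vote": 0.4, "budget": 0.4},
}
# Hypothetical background (non-adapted) unigram model.
BACKGROUND = {"goal": 0.25, "match": 0.25, "vote": 0.25, "budget": 0.25}
FLOOR = 1e-6  # probability used for words a distribution does not list


def topic_posterior(words):
    """Posterior over topics for a word window, assuming a uniform topic prior."""
    log_scores = {
        topic: sum(math.log(dist.get(w, FLOOR)) for w in words)
        for topic, dist in TOPIC_WORD.items()
    }
    top = max(log_scores.values())  # subtract the max for numerical stability
    unnorm = {t: math.exp(s - top) for t, s in log_scores.items()}
    total = sum(unnorm.values())
    return {t: v / total for t, v in unnorm.items()}


def detect_topic_changes(partial_hypothesis, window=3):
    """Return (word_count, topic) each time the best topic of the trailing window changes."""
    current = None
    changes = []
    for i in range(1, len(partial_hypothesis) + 1):
        recent = partial_hypothesis[max(0, i - window):i]
        post = topic_posterior(recent)
        best = max(post, key=post.get)
        if best != current:
            current = best
            changes.append((i, best))
    return changes


def adapted_prob(word, topic, lam=0.5):
    """Interpolate the topic-specific model with the background model."""
    topic_p = TOPIC_WORD[topic].get(word, FLOOR)
    background_p = BACKGROUND.get(word, FLOOR)
    return lam * topic_p + (1 - lam) * background_p


hypothesis = ["goal", "match", "goal", "vote", "budget", "vote"]
changes = detect_topic_changes(hypothesis)
```

In this toy run the detected topic switches once the trailing window is dominated by words from the other topic, at which point `adapted_prob` would be evaluated with the new topic. A real system would adapt a full n-gram model and use proper LDA inference on the partial hypothesis.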

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    20205 - Automation and control systems

Result continuities

  • Project

    GBP103/12/G084: Center for Large Scale Multi-modal Data Interpretation (/en/project/GBP103%2F12%2FG084)

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Others

  • Publication year

    2018

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    Lecture Notes in Computer Science

  • ISBN

    978-3-030-00793-5

  • ISSN

    0302-9743

  • e-ISSN

    not stated

  • Number of pages

    8

  • Pages from-to

    334-341

  • Publisher name

    Springer

  • Place of publication

    Cham

  • Event location

    Brno, Czech Republic

  • Event date

    Sep 11, 2018

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article