Unsupervised Language Model Adaptation for Speech Recognition with no Extra Resources

The result's identifiers

  • Result code in IS VaVaI

    RIV/00216305:26230/19:PU134188 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F19%3APU134188)

  • Result on the web

    https://www.dega-akustik.de/publikationen/online-proceedings/

  • DOI - Digital Object Identifier

Alternative languages

  • Result language

    English

  • Original language name

    Unsupervised Language Model Adaptation for Speech Recognition with no Extra Resources

  • Original language description

    Classically, automatic speech recognition (ASR) models are decomposed into acoustic models and language models (LMs). LMs usually exploit linguistic structure on a purely textual level and contribute strongly to an ASR system's performance. LMs are estimated on large amounts of textual data covering the target domain. However, most utterances cover more specific topics, which influences, e.g., the vocabulary used. Therefore, it is desirable to have the LM adjusted to an utterance's topic. Previous work achieves this by crawling extra data from the web or by using significant amounts of previous speech data to train topic-specific LMs on. We propose a way of adapting the LM directly using the target utterance to be recognized. The corresponding adaptation needs to be done in an unsupervised or automatically supervised way based on the speech input. To deal with corresponding errors robustly, we employ topic encodings from the recently proposed Subspace Multinomial Model. This model also avoids any need of explicit topic labelling during training or recognition, making the proposed method straightforward to use. We demonstrate the performance of the method on the Librispeech corpus, which consists of read fiction books, and we discuss its behaviour qualitatively.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

    EF16_027/0008371: International mobility of researchers at the Brno University of Technology

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Others

  • Publication year

    2019

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    Proceedings of DAGA 2019

  • ISBN

    978-3-939296-14-0

  • ISSN

  • e-ISSN

  • Number of pages

    4

  • Pages from-to

    954-957

  • Publisher name

    DEGA Head office, Deutsche Gesellschaft für Akustik

  • Place of publication

    Rostock

  • Event location

    Rostock

  • Event date

    Mar 18, 2019

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article