Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units

The result's identifiers

  • Result code in IS VaVaI

    RIV/00216305:26230/24:PU154929 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F24%3APU154929)

  • Result on the web

    https://www.isca-archive.org/interspeech_2024/yusuf24b_interspeech.pdf

  • DOI - Digital Object Identifier

    10.21437/Interspeech.2024-1713 (http://dx.doi.org/10.21437/Interspeech.2024-1713)

Alternative languages

  • Result language

    English

  • Original language name

    Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units

  • Original language description

    End-to-end (E2E) keyword search (KWS) has emerged as an alternative and complementary approach to conventional keyword search, which depends on the output of automatic speech recognition (ASR) systems. While E2E methods greatly simplify the KWS pipeline, they generally have worse performance than their ASR-based counterparts, which can benefit from pretraining with untranscribed data. In this work, we propose a method for pretraining E2E KWS systems with untranscribed data, which involves using acoustic unit discovery (AUD) to obtain discrete units for untranscribed data and then learning to locate sequences of such units in the speech. We conduct experiments across languages and AUD systems: we show that finetuning such a model significantly outperforms a model trained from scratch, and the performance improvements are generally correlated with the quality of the AUD system used for pretraining.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

    VJ01010108: Robust processing of recordings for operations and security

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Others

  • Publication year

    2024

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

  • ISBN

  • ISSN

    1990-9772

  • e-ISSN

  • Number of pages

    5

  • Pages from-to

    5068-5072

  • Publisher name

    International Speech Communication Association

  • Place of publication

    Kos

  • Event location

    Kos

  • Event date

    Sep 1, 2024

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article