Written Term Detection Improves Spoken Term Detection

The result's identifiers

  • Result code in IS VaVaI

    RIV/00216305:26230/24:PU154756 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F24%3APU154756)

  • Result on the web

    https://ieeexplore.ieee.org/document/10571348

  • DOI - Digital Object Identifier

    10.1109/TASLP.2024.3407476 (http://dx.doi.org/10.1109/TASLP.2024.3407476)

Alternative languages

  • Result language

    English

  • Original language name

    Written Term Detection Improves Spoken Term Detection

  • Original language description

    End-to-end (E2E) approaches to keyword search (KWS) are considerably simpler in terms of training and indexing complexity when compared to approaches which use the output of automatic speech recognition (ASR) systems. This simplification however has drawbacks due to the loss of modularity. In particular, where ASR-based KWS systems can benefit from external unpaired text via a language model, current formulations of E2E KWS systems have no such mechanism. Therefore, in this paper, we propose a multitask training objective which allows unpaired text to be integrated into E2E KWS without complicating indexing and search. In addition to training an E2E KWS model to retrieve text queries from spoken documents, we jointly train it to retrieve text queries from masked written documents. We show empirically that this approach can effectively leverage unpaired text for KWS, with significant improvements in search performance across a wide variety of languages. We conduct analysis which indicates that these improvements are achieved because the proposed method improves document representations for words in the unpaired text. Finally, we show that the proposed method can be used for domain adaptation in settings where in-domain paired data is scarce or nonexistent.

  • Czech name

  • Czech description

Classification

  • Type

    Jimp - Article in a specialist periodical, which is included in the Web of Science database

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

    VJ01010108: Robust processing of recordings for operations and security

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Others

  • Publication year

    2024

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Name of the periodical

    IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING

  • ISSN

    2329-9290

  • e-ISSN

    2329-9304

  • Volume of the periodical

    32

  • Issue of the periodical within the volume

    06

  • Country of publishing house

    US - UNITED STATES

  • Number of pages

    11

  • Pages from-to

    3213-3223

  • UT code for WoS article

    001256333200007

  • EID of the result in the Scopus database

    2-s2.0-85198013158