All

What are you looking for?

All
Projects
Results
Organizations

Quick search

  • Projects supported by TA ČR
  • Excellent projects
  • Projects with the highest public support
  • Current projects

Smart search

  • That is how I find a specific +word
  • That is how I leave the -word out of the results
  • “That is how I can find the whole phrase”

Hystoc: Obtaining Word Confidences for Fusion of End-To-End ASR Systems

The result's identifiers

  • Result code in IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F24%3APU152209" target="_blank" >RIV/00216305:26230/24:PU152209 - isvavai.cz</a>

  • Result on the web

    <a href="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10446739" target="_blank" >https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10446739</a>

  • DOI - Digital Object Identifier

    <a href="http://dx.doi.org/10.1109/ICASSP48485.2024.10446739" target="_blank" >10.1109/ICASSP48485.2024.10446739</a>

Alternative languages

  • Result language

    angličtina

  • Original language name

    Hystoc: Obtaining Word Confidences for Fusion of End-To-End ASR Systems

  • Original language description

    End-to-end (e2e) systems have recently gained wide popularity in automatic speech recognition. However, these systems do generally not provide well-calibrated word-level confidences. In this paper, we propose Hystoc, a simple method for obtaining word-level confidences from hypothesis-level scores. Hystoc is an iterative alignment procedure which turns hypotheses from an n-best output of the ASR system into a confusion network. Eventually, word-level confidences are obtained as posterior probabilities in the individual bins of the confusion network. We show that Hystoc provides confidences that correlate well with the accuracy of the ASR hypothesis. Furthermore, we show that utilizing Hystoc in fusion of multiple e2e ASR systems increases the gains from the fusion by up to 1% WER absolute on Spanish RTVE2020 dataset. Finally, we experiment with using Hystoc for direct fusion of n-best outputs from multiple systems, but we only achieve minor gains when fusing very similar systems.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

    Result was created during the realization of more than one project. More information in the Projects tab.

  • Continuities

    P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)

Others

  • Publication year

    2024

  • Confidentiality

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Data specific for result type

  • Article name in the collection

    ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

  • ISBN

    979-8-3503-4485-1

  • ISSN

  • e-ISSN

  • Number of pages

    5

  • Pages from-to

    11276-11280

  • Publisher name

    IEEE Signal Processing Society

  • Place of publication

    Seoul

  • Event location

    Seoul

  • Event date

    Apr 14, 2024

  • Type of event by nationality

    WRD - Celosvětová akce

  • UT code for WoS article