A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives

The result's identifiers

  • Result code in IS VaVaI

    RIV/49777513:23520/24:43973106 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F24%3A43973106)

  • Result on the web

    https://www.isca-archive.org/interspeech_2024/lehecka24_interspeech.pdf

  • DOI - Digital Object Identifier

    10.21437/Interspeech.2024-472 (http://dx.doi.org/10.21437/Interspeech.2024-472)

Alternative languages

  • Result language

    English

  • Original language name

    A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives

  • Original language description

    In this paper, we are comparing monolingual Wav2Vec 2.0 models with various multilingual models to see whether we could improve speech recognition performance on a unique oral history archive containing a lot of mixed-language sentences. Our main goal is to push forward research on this unique dataset, which is an extremely valuable part of our cultural heritage. Our results suggest that monolingual speech recognition models are, in most cases, superior to multilingual models, even when processing the oral history archive full of mixed-language sentences from non-native speakers. We also performed the same experiments on the public CommonVoice dataset to verify our results. We are contributing to the research community by releasing our pre-trained models to the public.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    20205 - Automation and control systems

Result continuities

  • Project

    VJ01010108: Robust processing of recordings for operations and security

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Others

  • Publication year

    2024

  • Confidentiality

    S - The complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    Interspeech 2024

  • ISBN

  • ISSN

    2308-457X

  • e-ISSN

    2958-1796

  • Number of pages

    5

  • Pages from-to

    1285-1289

  • Publisher name

    International Speech Communication Association (ISCA)

  • Place of publication

    New York

  • Event location

    Kos, Greece

  • Event date

    Sep 1, 2024

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article

    001331850101086