Utilizing VOiCES dataset for multichannel speaker verification with beamforming

The result's identifiers

  • Result code in IS VaVaI

    RIV/00216305:26230/20:PU136531 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F20%3APU136531)

  • Result on the web

    https://www.isca-speech.org/archive/Odyssey_2020/abstracts/80.html

  • DOI - Digital Object Identifier

    10.21437/Odyssey.2020-27 (http://dx.doi.org/10.21437/Odyssey.2020-27)

Alternative languages

  • Result language

    English

  • Original language name

    Utilizing VOiCES dataset for multichannel speaker verification with beamforming

  • Original language description

    The VOiCES from a Distance Challenge 2019 aimed at the evaluation of speaker verification (SV) systems using single-channel trials based on the Voices Obscured in Complex Environmental Settings (VOiCES) corpus. Since the corpus comprises recordings of the same utterances captured simultaneously by multiple microphones in the same environments, it is also suitable for multichannel experiments. In this work, we design a multichannel dataset as well as development and evaluation trials for SV inspired by the VOiCES challenge. Alternatives discarding harmful microphones are presented as well. We assess the utilization of the created dataset for x-vector based SV with beamforming as a front end. Standard fixed beamforming and NN-supported beamforming using simulated data and ideal binary masks (IBM) are compared with another variant of NN-supported beamforming that is trained solely on the VOiCES data. The lack of data revealed by experiments with the beamformer trained on VOiCES data was tackled by means of a variant of SpecAugment applied to magnitude spectra. This approach led to as much as 10% relative improvement in EER, pushing results closer to those obtained by a good beamformer based on IBMs.

  • Czech name

  • Czech description
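
The description above mentions a variant of SpecAugment applied to magnitude spectra. The record does not say how that variant works, so the following is only a minimal sketch of generic SpecAugment-style masking, assuming a spectrogram stored as a list of frames, each a list of magnitudes. The function name `mask_magnitude_spectrum` and the parameters `max_freq_width` and `max_time_width` are hypothetical and are not taken from the paper.

```python
import random


def mask_magnitude_spectrum(spec, max_freq_width=2, max_time_width=2, seed=None):
    """Return a copy of `spec` (frames x frequency bins) with one random
    frequency band and one random span of frames set to zero.

    Illustrative assumption only: the masking policy used in the paper
    is not described in this record.
    """
    rng = random.Random(seed)
    n_frames = len(spec)
    n_bins = len(spec[0]) if n_frames else 0
    out = [row[:] for row in spec]  # copy so the input is left untouched
    if n_frames == 0 or n_bins == 0:
        return out

    # Frequency mask: zero `f_width` consecutive bins in every frame.
    f_width = rng.randint(0, min(max_freq_width, n_bins))
    f_start = rng.randint(0, n_bins - f_width)
    for t in range(n_frames):
        for f in range(f_start, f_start + f_width):
            out[t][f] = 0.0

    # Time mask: zero all bins in `t_width` consecutive frames.
    t_width = rng.randint(0, min(max_time_width, n_frames))
    t_start = rng.randint(0, n_frames - t_width)
    for t in range(t_start, t_start + t_width):
        for f in range(n_bins):
            out[t][f] = 0.0

    return out
```

In a training loop, such a function would typically be called with a fresh random state for each utterance, so that the network estimating the beamforming masks sees a differently masked input every epoch.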

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

    Result was created during the realization of more than one project. More information in the Projects tab.

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Others

  • Publication year

    2020

  • Confidentiality

    S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop

  • ISBN

  • ISSN

    2312-2846

  • e-ISSN

  • Number of pages

    7

  • Pages from-to

    187-193

  • Publisher name

    International Speech Communication Association

  • Place of publication

    Tokyo

  • Event location

    Tokyo

  • Event date

    Nov 1, 2020

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article