All

What are you looking for?

All
Projects
Results
Organizations

Quick search

  • Projects supported by TA ČR
  • Excellent projects
  • Projects with the highest public support
  • Current projects

Smart search

  • That is how I find a specific +word
  • That is how I leave the -word out of the results
  • “That is how I can find the whole phrase”

Single Channel Target Speaker Extraction and Recognition with Speaker Beam

The result's identifiers

  • Result code in IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F18%3APU130735" target="_blank" >RIV/00216305:26230/18:PU130735 - isvavai.cz</a>

  • Result on the web

    <a href="http://www.fit.vutbr.cz/research/pubs/all.php?id=11721" target="_blank" >http://www.fit.vutbr.cz/research/pubs/all.php?id=11721</a>

  • DOI - Digital Object Identifier

    <a href="http://dx.doi.org/10.1109/ICASSP.2018.8462661" target="_blank" >10.1109/ICASSP.2018.8462661</a>

Alternative languages

  • Result language

    angličtina

  • Original language name

    Single Channel Target Speaker Extraction and Recognition with Speaker Beam

  • Original language description

    This paper addresses the problem of single channel speech recognition of a target speaker in a mixture of speech signals. We propose to exploit auxiliary speaker information provided by an adaptation utterance from the target speaker to extract and recognize only that speaker. Using such auxiliary information, we can build a speaker extraction neural network (NN) that is independent of the number of sources in the mixture, and that can track speakers across different utterances, which are two challenging issues occurring with conventional approaches for speech recognition of mixtures. We call such an informed speaker extraction scheme "SpeakerBeam". SpeakerBeam exploits a recently developed context adaptive deep NN (CADNN) that allows tracking speech from a target speaker using a speaker adaptation layer, whose parameters are adjusted depending on auxiliary features representing the target speaker characteristics. SpeakerBeam was previously investigated for speaker extraction using a microphone array. In this paper, we demonstrate that it is also efficient for single channel speaker extraction. The speaker adaptation layer can be employed either to build a speaker adaptive acoustic model that recognizes only the target speaker or a maskbased speaker extraction network that extracts the target speech from the speech mixture signal prior to recognition. We also show that the latter speaker extraction network can be optimized jointly with an acoustic model to further improve ASR performance.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

    <a href="/en/project/LQ1602" target="_blank" >LQ1602: IT4Innovations excellence in science</a><br>

  • Continuities

    P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)

Others

  • Publication year

    2018

  • Confidentiality

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Data specific for result type

  • Article name in the collection

    Proceedings of ICASSP 2018

  • ISBN

    978-1-5386-4658-8

  • ISSN

  • e-ISSN

  • Number of pages

    5

  • Pages from-to

    5554-5558

  • Publisher name

    IEEE Signal Processing Society

  • Place of publication

    Calgary

  • Event location

    Calgary

  • Event date

    Apr 15, 2018

  • Type of event by nationality

    WRD - Celosvětová akce

  • UT code for WoS article

    000446384605144