Compact Network for Speakerbeam Target Speaker Extraction
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F19%3APU134186" target="_blank" >RIV/00216305:26230/19:PU134186 - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/document/8683087" target="_blank" >https://ieeexplore.ieee.org/document/8683087</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ICASSP.2019.8683087" target="_blank" >10.1109/ICASSP.2019.8683087</a>
Alternative languages
Result language
English
Original language name
Compact Network for Speakerbeam Target Speaker Extraction
Original language description
Speech separation, which separates a mixture of speech signals into its individual sources, has long been an active research topic and has seen recent progress with the advent of deep learning. A related problem is target speaker extraction, i.e., extracting only the speech of a target speaker from a mixture, given characteristics of his/her voice. We have recently proposed SpeakerBeam, a neural network-based target speaker extraction method. SpeakerBeam uses a speech extraction network that is adapted to the target speaker using auxiliary features derived from an adaptation utterance of that speaker. Initially, we implemented SpeakerBeam with a factorized adaptation layer, which consists of several parallel linear transformations weighted by weights derived from the auxiliary features. The factorized layer is effective for target speech extraction, but it requires a large number of parameters. In this paper, we propose to simply scale the activations of a hidden layer of the speech extraction network with weights derived from the auxiliary features. This simpler approach reduces the number of model parameters by up to 60%, making it much more practical, while maintaining a similar level of performance. We tested our approach on simulated and real noisy and reverberant mixtures, showing the potential of SpeakerBeam for real-life applications. Moreover, we showed that the speech extraction performance of SpeakerBeam compares favorably with that of a state-of-the-art speech separation method with a similar network configuration.
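To make the contrast between the two adaptation schemes concrete, below is a minimal sketch (not the authors' implementation) in PyTorch. The layer dimensions, the number of bases, the softmax combination of the parallel transformations, and all class and variable names are illustrative assumptions, not values taken from the paper; the sketch only shows why element-wise scaling needs far fewer parameters than a factorized layer.

```python
import torch
import torch.nn as nn

class FactorizedAdaptationLayer(nn.Module):
    """Original SpeakerBeam-style adaptation: several parallel linear
    transformations combined with weights derived from the auxiliary
    speaker features; parameter count grows linearly with num_bases.
    (Softmax combination is an assumption made for this sketch.)"""
    def __init__(self, dim: int, aux_dim: int, num_bases: int):
        super().__init__()
        self.bases = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_bases)])
        self.combine = nn.Linear(aux_dim, num_bases)  # mixing weights from aux features

    def forward(self, h: torch.Tensor, aux: torch.Tensor) -> torch.Tensor:
        # h: (batch, frames, dim), aux: (batch, aux_dim)
        alpha = torch.softmax(self.combine(aux), dim=-1)        # (batch, num_bases)
        outs = torch.stack([b(h) for b in self.bases], dim=-1)  # (batch, frames, dim, num_bases)
        return (outs * alpha[:, None, None, :]).sum(dim=-1)    # (batch, frames, dim)

class ScalingAdaptationLayer(nn.Module):
    """Compact variant proposed in the paper: element-wise scaling of one
    hidden layer's activations with weights derived from the auxiliary
    features -- a single small projection instead of num_bases full layers."""
    def __init__(self, dim: int, aux_dim: int):
        super().__init__()
        self.scale = nn.Linear(aux_dim, dim)  # one multiplicative weight per hidden unit

    def forward(self, h: torch.Tensor, aux: torch.Tensor) -> torch.Tensor:
        return h * self.scale(aux)[:, None, :]  # broadcast scales over time frames

# Illustrative sizes only: a 512-unit hidden layer, a 128-dim speaker embedding.
h = torch.randn(4, 100, 512)   # (batch, frames, hidden units)
aux = torch.randn(4, 128)      # auxiliary features from the adaptation utterance
factorized = FactorizedAdaptationLayer(512, 128, num_bases=30)
compact = ScalingAdaptationLayer(512, 128)
print(factorized(h, aux).shape, compact(h, aux).shape)  # both: (4, 100, 512)
```

With these illustrative sizes, the factorized layer holds 30 full 512x512 transformations while the scaling variant holds a single 128x512 projection, which is the kind of reduction in adaptation parameters the abstract describes.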
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/TJ01000208" target="_blank" >TJ01000208: Neural networks for speech signal processing and data mining</a><br>
Continuities
P - Research and development project financed from public sources (with a link to CEP)
Others
Publication year
2019
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of ICASSP
ISBN
978-1-5386-4658-8
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
6965-6969
Publisher name
IEEE Signal Processing Society
Place of publication
Brighton
Event location
Brighton
Event date
May 12, 2019
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000482554007040