Improving Noise Robustness of Automatic Speech Recognition via Parallel Data and Teacher-student Learning
The result's identifiers
Result code in IS VaVaI
RIV/00216305:26230/19:PU134189 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F19%3APU134189)
Result on the web
https://ieeexplore.ieee.org/document/8683422
DOI - Digital Object Identifier
10.1109/ICASSP.2019.8683422 (http://dx.doi.org/10.1109/ICASSP.2019.8683422)
Alternative languages
Result language
English
Original language name
Improving Noise Robustness of Automatic Speech Recognition via Parallel Data and Teacher-student Learning
Original language description
For real-world speech recognition applications, noise robustness is still a challenge. In this work, we adopt the teacher-student (T/S) learning technique using a parallel clean and noisy corpus to improve automatic speech recognition (ASR) performance under multimedia noise. On top of that, we apply a logits selection method that preserves only the k highest values, to prevent wrong emphasis of knowledge from the teacher and to reduce the bandwidth needed for transferring data. We incorporate up to 8000 hours of untranscribed data for training and present results on sequence-trained models in addition to cross-entropy-trained ones. The best sequence-trained student model yields relative word error rate (WER) reductions of approximately 10.1%, 28.7% and 19.6% on our clean, simulated noisy and real test sets, respectively, compared to a sequence-trained teacher.
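The description above can be illustrated with a minimal NumPy sketch of T/S learning with top-k logits selection. It is not the authors' implementation. It assumes that the teacher sees the clean utterance and the student sees the parallel noisy one, that teacher logits outside the k highest are masked out before the softmax (so the kept classes are renormalized), and that the student is trained with a soft-target cross entropy. The function names and toy values are hypothetical.

```python
import numpy as np


def top_k_teacher_posteriors(logits, k):
    """Keep the k largest teacher logits per frame and mask the rest.

    Assumption: masked classes receive zero probability and the kept
    classes are renormalized by the softmax.
    """
    logits = np.asarray(logits, dtype=np.float64)
    # Indices of the k largest entries per frame (order within the top k is arbitrary).
    top_idx = np.argpartition(logits, -k, axis=-1)[..., -k:]
    masked = np.full_like(logits, -np.inf)
    kept = np.take_along_axis(logits, top_idx, axis=-1)
    np.put_along_axis(masked, top_idx, kept, axis=-1)
    # Numerically stable softmax; exp(-inf) evaluates to 0.
    exp = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def ts_loss(teacher_logits_clean, student_logits_noisy, k):
    """Soft-target cross entropy between teacher and student, averaged over frames."""
    targets = top_k_teacher_posteriors(teacher_logits_clean, k)
    log_q = log_softmax(student_logits_noisy)
    return float(-(targets * log_q).sum(axis=-1).mean())


# Toy example: 2 frames, 4 output classes (hypothetical values).
teacher = np.array([[2.0, 1.0, 0.0, -1.0], [0.5, 3.0, 0.1, 2.0]])
student = np.array([[1.5, 1.2, 0.3, -0.5], [0.2, 2.5, 0.4, 1.0]])
loss = ts_loss(teacher, student, k=2)
```

Because only k values per frame are kept, a teacher running separately from the student would need to transfer k (index, value) pairs per frame rather than the full output vector, which is one way to read the bandwidth reduction mentioned in the description.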
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
LQ1602: IT4Innovations excellence in science (/en/project/LQ1602)
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2019
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of ICASSP
ISBN
978-1-5386-4658-8
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
6475-6479
Publisher name
IEEE Signal Processing Society
Place of publication
Brighton
Event location
Brighton
Event date
May 12, 2019
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000482554006141