Various DNN-HMM architectures used in acoustic modeling with single-speaker and single-channel

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F21%3A43962798" target="_blank" >RIV/49777513:23520/21:43962798 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007/978-3-030-89579-2_8" target="_blank" >https://link.springer.com/chapter/10.1007/978-3-030-89579-2_8</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-030-89579-2_8" target="_blank" >10.1007/978-3-030-89579-2_8</a>

Result language
English
Original language name
Various DNN-HMM architectures used in acoustic modeling with single-speaker and single-channel
Original language description
In this paper, we discuss some interesting features of training a special acoustic model for only one speaker with a constant acoustic background (acoustic channel). Currently, the LF-MMI method achieves the best results in many speech recognition tasks. A typical LF-MMI training procedure uses a special 1-state HMM topology that has different pdfs at the self-loop and forward transitions. We discuss replacing this typical LF-MMI HMM with different types of HMM topologies (1-, 2- and 3-state HMM topologies that have outputs associated with states). Next, we discuss the advantages of biphone context modeling over triphone context modeling or the even simpler context-free monophone modeling. We also address the effect of the amount of training data and of the DNN context on WER, all with regard to a special acoustic model for one speaker and an almost constant acoustic channel.
Czech name
—
Czech description
—

Project
<a href="/en/project/EF17_048%2F0007267" target="_blank" >EF17_048/0007267: Research and Development of Intelligent Components of Advanced Technologies for the Pilsen Metropolitan Area (InteCom)</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>I - Institutional support for the long-term conceptual development of a research organisation

Publication year
2021
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations

Article name in the collection
Statistical Language and Speech Processing, 9th International Conference, SLSP 2021, Cardiff, UK, November 23–25, 2021, Proceedings
ISBN
978-3-030-89578-5
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
12
Pages from-to
85-96
Publisher name
Springer
Place of publication
Cham
Event location
Cardiff, United Kingdom
Event date
Nov 23, 2021
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—
