Complexity of the TDNN Acoustic Model with Respect to the HMM Topology

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F20%3A43959365" target="_blank" >RIV/49777513:23520/20:43959365 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007%2F978-3-030-58323-1_50" target="_blank" >https://link.springer.com/chapter/10.1007%2F978-3-030-58323-1_50</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-030-58323-1_50" target="_blank" >10.1007/978-3-030-58323-1_50</a>

Result language
angličtina
Original language name
Complexity of the TDNN Acoustic Model with Respect to the HMM Topology
Original language description
In this paper, we discuss some of the properties of training acoustic models using a lattice-free version of the maximum mutual information criterion (LF-MMI). Currently, the LF-MMI method achieves state-of-the-art results on many speech recognition tasks. Some of the key features of the LF-MMI approach are: training DNN without initialization from a cross-entropy system, the use of a 3-fold reduced frame rate and the use of a simpler HMM topology. The conventional 3-state HMM topology was replaced in a typical LF-MMI training procedure with a special 1-stage HMM topology, that has different pdfs on the self-loop and forward transitions. In this paper, we would like to discuss both the different types of HMM topologies (conventional 1-, 2- and 3-state HMM topology) and the advantages of using biphone context modeling over using the original triphone or a simpler monophone context. We would also like to mention the impact of the subsampling factor to WER.
Czech name
—
Czech description
—

Project
<a href="/en/project/LO1506" target="_blank" >LO1506: Sustainability support of the centre NTIS - New Technologies for the Information Society</a><br>
Continuities
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)<br>I - Institucionalni podpora na dlouhodoby koncepcni rozvoj vyzkumne organizace

Publication year
2020
Confidentiality
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Article name in the collection
Text, Speech, and Dialogue 23rd International Conference, TSD 2020, Brno, Czech Republic, September 8-11, 2020, Proceedings
ISBN
978-3-030-58322-4
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
9
Pages from-to
465-473
Publisher name
Springer
Place of publication
Cham
Event location
Brno, Česká republika
Event date
Sep 8, 2020
Type of event by nationality
WRD - Celosvětová akce
UT code for WoS article
—

Similar results(10)