Complexity of the TDNN Acoustic Model with Respect to the HMM Topology

Identifikátory výsledku

Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F20%3A43959365" target="_blank" >RIV/49777513:23520/20:43959365 - isvavai.cz</a>
Výsledek na webu
<a href="https://link.springer.com/chapter/10.1007%2F978-3-030-58323-1_50" target="_blank" >https://link.springer.com/chapter/10.1007%2F978-3-030-58323-1_50</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-030-58323-1_50" target="_blank" >10.1007/978-3-030-58323-1_50</a>

Alternativní jazyky

Jazyk výsledku
angličtina
Název v původním jazyce
Complexity of the TDNN Acoustic Model with Respect to the HMM Topology
Popis výsledku v původním jazyce
In this paper, we discuss some of the properties of training acoustic models using a lattice-free version of the maximum mutual information criterion (LF-MMI). Currently, the LF-MMI method achieves state-of-the-art results on many speech recognition tasks. Some of the key features of the LF-MMI approach are: training DNN without initialization from a cross-entropy system, the use of a 3-fold reduced frame rate and the use of a simpler HMM topology. The conventional 3-state HMM topology was replaced in a typical LF-MMI training procedure with a special 1-stage HMM topology, that has different pdfs on the self-loop and forward transitions. In this paper, we would like to discuss both the different types of HMM topologies (conventional 1-, 2- and 3-state HMM topology) and the advantages of using biphone context modeling over using the original triphone or a simpler monophone context. We would also like to mention the impact of the subsampling factor to WER.
Název v anglickém jazyce
Complexity of the TDNN Acoustic Model with Respect to the HMM Topology
Popis výsledku anglicky
In this paper, we discuss some of the properties of training acoustic models using a lattice-free version of the maximum mutual information criterion (LF-MMI). Currently, the LF-MMI method achieves state-of-the-art results on many speech recognition tasks. Some of the key features of the LF-MMI approach are: training DNN without initialization from a cross-entropy system, the use of a 3-fold reduced frame rate and the use of a simpler HMM topology. The conventional 3-state HMM topology was replaced in a typical LF-MMI training procedure with a special 1-stage HMM topology, that has different pdfs on the self-loop and forward transitions. In this paper, we would like to discuss both the different types of HMM topologies (conventional 1-, 2- and 3-state HMM topology) and the advantages of using biphone context modeling over using the original triphone or a simpler monophone context. We would also like to mention the impact of the subsampling factor to WER.

Klasifikace

Druh
D - Stať ve sborníku
CEP obor
—
OECD FORD obor
20205 - Automation and control systems

Návaznosti výsledku

Projekt
<a href="/cs/project/LO1506" target="_blank" >LO1506: Podpora udržitelnosti centra NTIS - Nové technologie pro informační společnost</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)<br>I - Institucionalni podpora na dlouhodoby koncepcni rozvoj vyzkumne organizace

Ostatní

Rok uplatnění
2020
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

Název statě ve sborníku
Text, Speech, and Dialogue 23rd International Conference, TSD 2020, Brno, Czech Republic, September 8-11, 2020, Proceedings
ISBN
978-3-030-58322-4
ISSN
0302-9743
e-ISSN
1611-3349
Počet stran výsledku
9
Strana od-do
465-473
Název nakladatele
Springer
Místo vydání
Cham
Místo konání akce
Brno, Česká republika
Datum konání akce
8. 9. 2020
Typ akce podle státní příslušnosti
WRD - Celosvětová akce
Kód UT WoS článku
—

Podobné výsledky(10)

Various DNN-HMM architectures used in acoustic modeling with single-speaker and single-channel GPU-Accelerated Forward-Backward Algorithm with Application to Lattice-Free MMI Fonémový detektor klíčových slov založený na akustice pro neformální konverzační řeč

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Complexity of the TDNN Acoustic Model with Respect to the HMM Topology

Identifikátory výsledku

Alternativní jazyky

Klasifikace

Návaznosti výsledku

Ostatní

Údaje specifické pro druh výsledku

Podobné výsledky(10)

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Popis výsledku

Identifikátory výsledku

Identifikátory výsledku

Alternativní jazyky

Alternativní jazyky

Klasifikace

Klasifikace

Návaznosti výsledku

Návaznosti výsledku

Ostatní

Ostatní

Údaje specifické pro druh výsledku

Údaje specifické pro druh výsledku

Podobné výsledky(10)