Automatic Phonetic Segmentation Using the Kaldi Toolkit

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F17%3A43932638" target="_blank" >RIV/49777513:23520/17:43932638 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007%2F978-3-319-64206-2_16" target="_blank" >https://link.springer.com/chapter/10.1007%2F978-3-319-64206-2_16</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-64206-2_16" target="_blank" >10.1007/978-3-319-64206-2_16</a>

Alternative languages

Result language
English
Title in the original language
Automatic Phonetic Segmentation Using the Kaldi Toolkit
Result description in the original language
In this paper we explore the possibilities of hidden Markov model based automatic phonetic segmentation with the Kaldi toolkit. We compare the Kaldi toolkit and the Hidden Markov Model Toolkit (HTK) in terms of segmentation accuracy. The well-tuned HTK-based phonetic segmentation framework was taken as the baseline and compared to a newly proposed segmentation framework built from the default examples and recipes available in the Kaldi repository. Since the segmentation accuracy of the HTK-based system was significantly higher than that of the Kaldi-based system, the default Kaldi setting was modified with respect to pause model topology, the way of generating phonetic questions for clustering, and the number of Gaussian mixtures used during modeling. The modified Kaldi-based system achieved results comparable to those obtained by HTK—slightly worse for small segmentation errors but better for gross segmentation errors. We also confirmed that, for both toolkits, the standard three-state left-to-right model topology was significantly outperformed by a modified five-state left-to-right topology, especially with respect to small segmentation errors.

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
20205 - Automation and control systems

Result linkages

Project
<a href="/cs/project/TH02010307" target="_blank" >TH02010307: Automatic voice conservation and reconstruction with a focus on patients after total laryngectomy</a>
Linkages
P - Research and development project financed from public funds (with a link to CEP)

Other

Year of implementation
2017
Data confidentiality code
S - Complete and truthful project data are not subject to protection under special legal regulations

Data specific to the result type

Proceedings title
Text, Speech and Dialogue, 20th International Conference, TSD 2017, Prague, Czech Republic, August 27-31, 2017, Proceedings
ISBN
978-3-319-64205-5
ISSN
0302-9743
e-ISSN
—
Number of pages
9
Pages from-to
138-146
Publisher name
Springer
Place of publication
Cham
Event location
Prague, Czech Republic
Event date
August 27, 2017
Event type by nationality of participants
WRD - Worldwide event
UT WoS code of the article
000449869200016

Similar results (10)

A set of programs for working with hidden Markov models (HTK)
HMM-based phonetic segmentation in the Praat environment
HMM-based automatic phonetic segmentation of the speech signal and its implementation in the Praat environment
