Bayesian joint-sequence models for grapheme-to-phoneme conversion
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F17%3APU126426" target="_blank" >RIV/00216305:26230/17:PU126426 - isvavai.cz</a>
Result on the web
<a href="https://www.fit.vut.cz/research/publication/11469/" target="_blank" >https://www.fit.vut.cz/research/publication/11469/</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ICASSP.2017.7952674" target="_blank" >10.1109/ICASSP.2017.7952674</a>
Alternative languages
Result language
English
Title in the original language
Bayesian joint-sequence models for grapheme-to-phoneme conversion
Result description in the original language
We describe a fully Bayesian approach to grapheme-to-phoneme conversion based on the joint-sequence model (JSM). Usually, standard smoothed n-gram language models (LM, e.g. Kneser-Ney) are used with JSMs to model graphone sequences (joint grapheme-phoneme pairs). However, we take a Bayesian approach using a hierarchical Pitman-Yor-Process LM. This provides an elegant alternative to using smoothing techniques to avoid over-training. Neither held-out sets nor complex parameter tuning is necessary, and several convergence problems encountered in the discounted Expectation-Maximization (as used in the smoothed JSMs) are avoided. Every step is modeled by weighted finite state transducers and implemented with standard operations from the OpenFST toolkit. We evaluate our model on a standard data set (CMUdict), where it gives comparable results to the previously reported smoothed JSMs in terms of phoneme-error rate while requiring a much shorter training/testing time. Most importantly, our model can be used in a Bayesian framework and for (partly) unsupervised training.
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
N - Research activity supported from non-public sources
Other
Year of implementation
2017
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article title in the proceedings
Proceedings of ICASSP 2017
ISBN
978-1-5090-4117-6
ISSN
—
e-ISSN
—
Number of pages of the result
5
Pages from-to
2836-2840
Publisher name
IEEE Signal Processing Society
Place of publication
New Orleans
Event location
New Orleans, USA
Event date
5 March 2017
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
000414286203002