Bayesian phonotactic language model for acoustic unit discovery
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F17%3APU126429" target="_blank" >RIV/00216305:26230/17:PU126429 - isvavai.cz</a>
Result on the web
<a href="https://www.fit.vut.cz/research/publication/11472/" target="_blank" >https://www.fit.vut.cz/research/publication/11472/</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ICASSP.2017.7953258" target="_blank" >10.1109/ICASSP.2017.7953258</a>
Alternative languages
Result language
English
Original language name
Bayesian phonotactic language model for acoustic unit discovery
Original language description
Recent work on Acoustic Unit Discovery (AUD) has led to the development of a non-parametric Bayesian phone-loop model where the prior over the probability of the phone-like units is assumed to be sampled from a Dirichlet Process (DP). In this work, we propose to improve this model by incorporating a Hierarchical Pitman-Yor based bigram Language Model on top of the unit transitions. This new model makes use of phonotactic context information but assumes a fixed number of units. To remedy this limitation, we first train a DP phone-loop model to infer the number of units; the bigram phone-loop model is then initialized from the DP phone-loop and trained until its parameters converge. Results show an absolute improvement of 1-2% on the Normalized Mutual Information (NMI) metric. Furthermore, we show that, combined with Multilingual Bottleneck (MBN) features, the model yields an NMI equal to or higher than that of an English phone recogniser trained on TIMIT.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/LQ1602" target="_blank" >LQ1602: IT4Innovations excellence in science</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2017
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of ICASSP 2017
ISBN
978-1-5090-4117-6
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
5750-5754
Publisher name
IEEE Signal Processing Society
Place of publication
New Orleans
Event location
New Orleans, USA
Event date
Mar 5, 2017
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000414286205182