Using Auto-Encoder BiLSTM Neural Network for Czech Grapheme-to-Phoneme Conversion
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F19%3A43955897" target="_blank" >RIV/49777513:23520/19:43955897 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007%2F978-3-030-27947-9_8" target="_blank" >https://link.springer.com/chapter/10.1007%2F978-3-030-27947-9_8</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-030-27947-9_8" target="_blank" >10.1007/978-3-030-27947-9_8</a>
Alternative languages
Result language
English
Original language name
Using Auto-Encoder BiLSTM Neural Network for Czech Grapheme-to-Phoneme Conversion
Original language description
A crucial part of almost all current text-to-speech (TTS) systems is grapheme-to-phoneme (G2P) conversion, i.e. the transcription of any input grapheme sequence into the correct sequence of phonemes in the given language. Unfortunately, preparing transcription rules and pronunciation dictionaries for new languages in TTS systems is not an easy process. For that reason, in the presented paper, we focus on the creation of an automatic G2P model based on neural networks (NN). In contrast to the majority of related work in the G2P field, which uses only separate words as input, we take a whole phrase as the input to our proposed NN model. In our opinion, this approach should lead to more precise phonetic transcription because the pronunciation of a word can depend on the surrounding words. The results of the trained G2P model are presented for the Czech language, where cross-word-boundary phenomena occur quite often, and they are compared to the rule-based approach.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
20205 - Automation and control systems
Result continuities
Project
<a href="/en/project/GA19-19324S" target="_blank" >GA19-19324S: Fully Trainable Deep Neural Network Based Czech Text-to-Speech Synthesis</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2019
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Text, Speech, and Dialogue: 22nd International Conference, TSD 2019, Ljubljana, Slovenia, September 11-13, 2019, Proceedings
ISBN
978-3-030-27946-2
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
12
Pages from-to
91-102
Publisher name
Springer
Place of publication
Cham
Event location
Ljubljana, Slovenia
Event date
Sep 11, 2019
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—