T5G2P: Using Text-to-Text Transfer Transformer for Grapheme-to-Phoneme Conversion
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F21%3A43962415" target="_blank" >RIV/49777513:23520/21:43962415 - isvavai.cz</a>
Result on the web
<a href="https://www.isca-speech.org/archive/interspeech_2021/rezackova21_interspeech.html" target="_blank" >https://www.isca-speech.org/archive/interspeech_2021/rezackova21_interspeech.html</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.21437/Interspeech.2021-546" target="_blank" >10.21437/Interspeech.2021-546</a>
Alternative languages
Result language
English
Original language name
T5G2P: Using Text-to-Text Transfer Transformer for Grapheme-to-Phoneme Conversion
Original language description
Despite the increasing popularity of end-to-end text-to-speech (TTS) systems, a correct grapheme-to-phoneme (G2P) module remains a crucial part of systems that rely on phonetic input. In this paper, we therefore introduce T5G2P, a Text-to-Text Transfer Transformer (T5) neural network model that converts an input text sentence into a phoneme sequence with high accuracy. The trained T5 model is evaluated on English and Czech, since the G2P task has different specific properties in each language, including homograph disambiguation, cross-word assimilation and irregular pronunciation of loanwords. The paper also analyzes the homograph issue in English and offers an alternative approach to Czech phonetic transcription based on the detection of pronunciation exceptions.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
20205 - Automation and control systems
Result continuities
Project
<a href="/en/project/GA19-19324S" target="_blank" >GA19-19324S: Fully Trainable Deep Neural Network Based Czech Text-to-Speech Synthesis</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech
ISBN
978-1-71383-690-2
ISSN
2308-457X
e-ISSN
—
Number of pages
5
Pages from-to
3291-3295
Publisher name
International Speech Communication Association
Place of publication
Red Hook, NY
Event location
Brno, Czech Republic
Event date
Aug 30, 2021
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—