On Comparison of Phonetic Representations for Czech Neural Speech Synthesis

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F22%3A43965699" target="_blank" >RIV/49777513:23520/22:43965699 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007/978-3-031-16270-1_34" target="_blank" >https://link.springer.com/chapter/10.1007/978-3-031-16270-1_34</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-031-16270-1_34" target="_blank" >10.1007/978-3-031-16270-1_34</a>

Alternative languages

Result language
English
Title in the original language
On Comparison of Phonetic Representations for Czech Neural Speech Synthesis
Result description in the original language
In this paper, we investigate two research questions related to the phonetic representation of input text in Czech neural speech synthesis: 1) whether we can afford to reduce the phonetic alphabet, and 2) whether we can remove pauses from phonetic transcription and let the speech synthesis model predict the pause positions itself. In our experiments, three different modern speech synthesis models (FastSpeech 2 + Multi-band MelGAN, Glow-TTS + UnivNet, and VITS) were employed. We have found that the reduced phonetic alphabet outperforms the traditionally used full phonetic alphabet. On the other hand, removing pauses does not help. The presence of pauses (predicted by an external pause prediction tool) in phonetic transcription leads to a slightly better quality of synthetic speech.
Title in English
On Comparison of Phonetic Representations for Czech Neural Speech Synthesis
Result description in English
In this paper, we investigate two research questions related to the phonetic representation of input text in Czech neural speech synthesis: 1) whether we can afford to reduce the phonetic alphabet, and 2) whether we can remove pauses from phonetic transcription and let the speech synthesis model predict the pause positions itself. In our experiments, three different modern speech synthesis models (FastSpeech 2 + Multi-band MelGAN, Glow-TTS + UnivNet, and VITS) were employed. We have found that the reduced phonetic alphabet outperforms the traditionally used full phonetic alphabet. On the other hand, removing pauses does not help. The presence of pauses (predicted by an external pause prediction tool) in phonetic transcription leads to a slightly better quality of synthetic speech.

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
20205 - Automation and control systems

Result continuities

Project
<a href="/cs/project/TL05000546" target="_blank" >TL05000546: Využití multimediálního výkladového slovníku pro moderní výuku češtiny</a> (Use of a Multimedia Explanatory Dictionary for Modern Teaching of Czech)
Continuities
P - Research and development project financed from public funds (with a link to CEP)

Others

Year of implementation
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Article name in the proceedings
Text, Speech, and Dialogue: 25th International Conference, TSD 2022, Brno, Czech Republic, September 6–9, 2022, Proceedings
ISBN
978-3-031-16269-5
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
13
Pages from-to
410-422
Publisher name
Springer International Publishing
Place of publication
Cham
Event location
Brno, Czech Republic
Event date
September 6, 2022
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—

Similar results (10)

Speakers Talking Foreign Languages in a Multi-lingual TTS System
Modelování rázu v syntéze české řeči z textu
Kroslingvální neurální TTS model angličtiny a češtiny pro syntézu řeči v definovaném stylu
