Google’s Next-Generation Real-Time Unit-Selection Synthesizer using Sequence-To-Sequence LSTM-based Autoencoders
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F17%3A43932652" target="_blank" >RIV/49777513:23520/17:43932652 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.21437/Interspeech.2017-1107" target="_blank" >http://dx.doi.org/10.21437/Interspeech.2017-1107</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.21437/Interspeech.2017-1107" target="_blank" >10.21437/Interspeech.2017-1107</a>
Alternative languages
Result language
English
Original language name
Google’s Next-Generation Real-Time Unit-Selection Synthesizer using Sequence-To-Sequence LSTM-based Autoencoders
Original language description
A neural network model that significantly improves unit-selection-based Text-To-Speech synthesis is presented. The model employs a sequence-to-sequence LSTM-based autoencoder that compresses the acoustic and linguistic features of each unit to a fixed-size vector referred to as an embedding. Unit-selection is facilitated by formulating the target cost as an L2 distance in the embedding space. In open-domain speech synthesis the method achieves a 0.2 improvement in the MOS, while for limited-domain it reaches the cap of 4.5 MOS. Furthermore, the new TTS system halves the gap between the previous unit-selection system and WaveNet in terms of quality while retaining low computational cost and latency.
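The description states that the target cost is an L2 distance in the embedding space. The following is a minimal sketch of that single idea, not the authors' implementation: the function names, the unit identifiers, and the three-dimensional toy embeddings are all hypothetical, and the sketch ignores everything else a unit-selection system needs (for example, producing the embeddings in the first place).

```python
import math
from typing import Dict, List, Sequence


def target_cost(target: Sequence[float], candidate: Sequence[float]) -> float:
    """L2 (Euclidean) distance between two fixed-size embeddings."""
    if len(target) != len(candidate):
        raise ValueError("embeddings must have the same dimensionality")
    return math.sqrt(sum((t - c) ** 2 for t, c in zip(target, candidate)))


def rank_units(target: Sequence[float],
               inventory: Dict[str, Sequence[float]]) -> List[str]:
    """Return unit ids sorted by ascending target cost (closest first)."""
    return sorted(inventory, key=lambda uid: target_cost(target, inventory[uid]))


# Toy inventory: hypothetical unit ids mapped to made-up embeddings.
inventory = {
    "unit_a": [0.0, 0.0, 0.0],
    "unit_b": [1.0, 2.0, 2.0],
    "unit_c": [0.5, 0.0, 0.0],
}

# Made-up embedding for the unit we want to synthesize.
target = [0.4, 0.0, 0.0]

ranking = rank_units(target, inventory)
print(ranking)
```

In this toy setup the candidate whose embedding lies closest to the target is ranked first; a real system would compute such costs over a large unit database.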
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
20205 - Automation and control systems
Result continuities
Project
<a href="/en/project/LO1506" target="_blank" >LO1506: Sustainability support of the centre NTIS - New Technologies for the Information Society</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2017
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 18th Annual Conference of the International Speech Communication Association (Interspeech 2017)
ISBN
978-1-5108-4876-4
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
1143-1147
Publisher name
Curran Associates, Inc.
Place of publication
Red Hook, NY
Event location
Stockholm, Sweden
Event date
Aug 20, 2017
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000457505000239