Reducing Footprint of Unit Selection TTS System by Excluding Utterances from Source Speech Corpus
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F09%3A00504529" target="_blank" >RIV/49777513:23520/09:00504529 - isvavai.cz</a>
Výsledek na webu
—
DOI - Digital Object Identifier
—
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Reducing Footprint of Unit Selection TTS System by Excluding Utterances from Source Speech Corpus
Popis výsledku v původním jazyce
Current unit selection speech synthesis systems are capable of producing speech of a high quality at the expense of enormous computational and storage requirements. In this paper, the analysis of an existing large speech corpus employed for unit-selection-based synthesis of Czech speech is performed. Subsequently, a procedure for the exclusion of some amount of utterances from the source speech corpus is proposed. The procedure is based on the statistics of the utilisation of all utterances during text-to-speech synthesis of a large portion of texts. The exclusion of whole utterances was preferred over the exclusion of the particular instances of speech units in order to preserve the main feature of unit selection framework - to select as longest sequence of contiguous speech units as possible. After the exclusion, the footprint of the system was reduced approximately by 42 %. The resulting synthetic speech was then judged by means of 5-scale CCR listening tests and evaluated in averag
Název v anglickém jazyce
Reducing Footprint of Unit Selection TTS System by Excluding Utterances from Source Speech Corpus
Popis výsledku anglicky
Current unit selection speech synthesis systems are capable of producing speech of a high quality at the expense of enormous computational and storage requirements. In this paper, the analysis of an existing large speech corpus employed for unit-selection-based synthesis of Czech speech is performed. Subsequently, a procedure for the exclusion of some amount of utterances from the source speech corpus is proposed. The procedure is based on the statistics of the utilisation of all utterances during text-to-speech synthesis of a large portion of texts. The exclusion of whole utterances was preferred over the exclusion of the particular instances of speech units in order to preserve the main feature of unit selection framework - to select as longest sequence of contiguous speech units as possible. After the exclusion, the footprint of the system was reduced approximately by 42 %. The resulting synthetic speech was then judged by means of 5-scale CCR listening tests and evaluated in averag
Klasifikace
Druh
D - Stať ve sborníku
CEP obor
JD - Využití počítačů, robotika a její aplikace
OECD FORD obor
—
Návaznosti výsledku
Projekt
Výsledek vznikl pri realizaci vícero projektů. Více informací v záložce Projekty.
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)
Ostatní
Rok uplatnění
2009
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název statě ve sborníku
Speech Processing
ISBN
978-80-86269-18-4
ISSN
—
e-ISSN
—
Počet stran výsledku
8
Strana od-do
—
Název nakladatele
Institute of Photonics and Electronics AS CR
Místo vydání
Prague
Místo konání akce
Prague
Datum konání akce
1. 1. 2009
Typ akce podle státní příslušnosti
EUR - Evropská akce
Kód UT WoS článku
—