LSTM-Based Speech Segmentation Trained on Different Foreign Languages
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F20%3A43959258" target="_blank" >RIV/49777513:23520/20:43959258 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007%2F978-3-030-58323-1_49" target="_blank" >https://link.springer.com/chapter/10.1007%2F978-3-030-58323-1_49</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-030-58323-1_49" target="_blank" >10.1007/978-3-030-58323-1_49</a>
Alternative languages
Result language
English
Original language name
LSTM-Based Speech Segmentation Trained on Different Foreign Languages
Original language description
This paper describes experiments on speech segmentation using bidirectional LSTM neural networks. The networks were trained on various languages (English, German, Russian and Czech), and segmentation experiments were performed on four Czech professional voices. To make it possible to use various combinations of foreign languages, we defined a reduced phonetic alphabet based on IPA notation. It consists of 26 phones, each of which occurs in all of the languages. To increase segmentation accuracy, we applied an iterative procedure based on detecting improperly segmented data and retraining the network. Experiments confirmed the convergence of the procedure. The results were compared with a reference HMM-based segmentation with additional manual corrections.
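The iterative procedure mentioned in the description can be illustrated with a minimal sketch. The record does not say how improperly segmented data are detected or what is done with them before retraining, so everything here is an assumption: the function name `iterative_retraining`, the callables `train` and `is_badly_segmented`, the choice to exclude flagged utterances from the next training round, and the stopping rule (no utterance flagged) are all hypothetical and not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

U = TypeVar("U")  # one training utterance (hypothetical representation)
M = TypeVar("M")  # a trained segmentation model (hypothetical representation)


def iterative_retraining(
    utterances: Sequence[U],
    train: Callable[[List[U]], M],
    is_badly_segmented: Callable[[M, U], bool],
    max_iters: int = 10,
) -> Tuple[M, List[U]]:
    """Alternate between training and filtering until nothing is flagged.

    Assumption: flagged utterances are dropped from the training set.
    The source only states that improperly segmented data are detected
    and the network is retrained.
    """
    kept = list(utterances)
    model = train(kept)
    for _ in range(max_iters):
        # Keep only utterances the current model does not flag.
        good = [u for u in kept if not is_badly_segmented(model, u)]
        # Stop when the set is stable, or when filtering would empty it.
        if len(good) == len(kept) or not good:
            break
        kept = good
        model = train(kept)
    return model, kept
```

In this sketch, convergence simply means that a round flags no further utterances; `max_iters` is a safety bound for detectors that never settle.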
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
20205 - Automation and control systems
Result continuities
Project
<a href="/en/project/GA19-19324S" target="_blank" >GA19-19324S: Fully Trainable Deep Neural Network Based Czech Text-to-Speech Synthesis</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2020
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Text, Speech, and Dialogue: 23rd International Conference, TSD 2020, Brno, Czech Republic, September 8-11, 2020, Proceedings
ISBN
978-3-030-58322-4
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
9
Pages from-to
456-464
Publisher name
Springer Nature Switzerland AG
Place of publication
Cham
Event location
Brno, Czech Republic
Event date
Sep 8, 2020
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—