Speech Technology for Unwritten Languages
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F20%3APU140040" target="_blank" >RIV/00216305:26230/20:PU140040 - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/document/8998182" target="_blank" >https://ieeexplore.ieee.org/document/8998182</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/TASLP.2020.2973896" target="_blank" >10.1109/TASLP.2020.2973896</a>
Alternative languages
Result language
English
Title in original language
Speech Technology for Unwritten Languages
Result description in original language
Speech technology plays an important role in our everyday life. Among others, speech is used for human-computer interaction, for instance for information retrieval and on-line shopping. In the case of an unwritten language, however, speech technology is unfortunately difficult to create, because it cannot be created by the standard combination of pre-trained speech-to-text and text-to-speech subsystems. The research presented in this article takes the first steps towards speech technology for unwritten languages. Specifically, the aim of this work was 1) to learn speech-to-meaning representations without using text as an intermediate representation, and 2) to test the sufficiency of the learned representations to regenerate speech or translated text, or to retrieve images that depict the meaning of an utterance in an unwritten language. The results suggest that building systems that go directly from speech-to-meaning and from meaning-to-speech, bypassing the need for text, is possible.
Title in English
Speech Technology for Unwritten Languages
Result description in English
Speech technology plays an important role in our everyday life. Among others, speech is used for human-computer interaction, for instance for information retrieval and on-line shopping. In the case of an unwritten language, however, speech technology is unfortunately difficult to create, because it cannot be created by the standard combination of pre-trained speech-to-text and text-to-speech subsystems. The research presented in this article takes the first steps towards speech technology for unwritten languages. Specifically, the aim of this work was 1) to learn speech-to-meaning representations without using text as an intermediate representation, and 2) to test the sufficiency of the learned representations to regenerate speech or translated text, or to retrieve images that depict the meaning of an utterance in an unwritten language. The results suggest that building systems that go directly from speech-to-meaning and from meaning-to-speech, bypassing the need for text, is possible.
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
S - Specific university research
Others
Publication year
2020
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING
ISSN
2329-9290
e-ISSN
2329-9304
Volume of the periodical
2020
Issue of the periodical within the volume
28
Country of the periodical's publisher
US - United States of America
Number of pages of the result
12
Pages from-to
964-975
UT WoS code of the article
000522357500002
EID of the result in the Scopus database
2-s2.0-85079642575