Increasing the Accuracy of the ASR System by Prolonging Voiceless Phonemes in the Speech of Patients Using the Electrolarynx

Result identifiers

  • Result code in IS VaVaI

    RIV/49777513:23520/20:43959812 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F20%3A43959812)

  • Result on the web

    https://link.springer.com/chapter/10.1007/978-3-030-60276-5_54

  • DOI - Digital Object Identifier

    10.1007/978-3-030-60276-5_54 (http://dx.doi.org/10.1007/978-3-030-60276-5_54)

Alternative languages

  • Result language

    English

  • Title in the original language

    Increasing the Accuracy of the ASR System by Prolonging Voiceless Phonemes in the Speech of Patients Using the Electrolarynx

  • Result description in the original language

    Patients who have undergone total laryngectomy and use an electrolarynx for voice production suffer from poor intelligibility. In many cases, this leads to a fear of speaking to strangers, even over the phone. Automatic Speech Recognition (ASR) systems could help patients overcome this problem in many ways. Unfortunately, even state-of-the-art ASR systems cannot provide results comparable to those for conventional speakers. The problem is mainly caused by the similarity between voiced and unvoiced phoneme pairs. In many cases, a language model can help to solve the issue, but only if the word context is sufficiently long. Therefore, adjustment of the acoustic data and/or the acoustic model is necessary to increase recognition accuracy. In this paper, we propose voiceless phoneme elongation to improve recognition accuracy and enrich the ASR system with a model that takes this elongation into account. The idea of elongation is verified on a set of ASR experiments with artificially elongated voiceless phonemes. To enrich the ASR system, a DNN model for rescoring lattices based on phoneme duration is proposed. The new system is compared with a standard ASR system. It is also verified that an ASR system created using elongated synthetic data can successfully recognize actual elongated data pronounced by a real speaker.

Classification

  • Type

    D - Article in proceedings

  • CEP field

  • OECD FORD field

    20205 - Automation and control systems

Result continuities

  • Project

    TN01000024: Národní centrum kompetence - Kybernetika a umělá inteligence (National Competence Center - Cybernetics and Artificial Intelligence)

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Other

  • Year of implementation

    2020

  • Data confidentiality code

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

  • Proceedings title

    22nd International Conference, SPECOM 2020, St. Petersburg, Russia, October 7–9, 2020, Proceedings

  • ISBN

    978-3-030-60275-8

  • ISSN

    0302-9743

  • e-ISSN

    1611-3349

  • Number of pages

    10

  • Pages from-to

    562-571

  • Publisher name

    Springer

  • Place of publication

    Cham

  • Event location

    St. Petersburg; Russian Federation

  • Event date

    October 7, 2020

  • Event type by nationality

    WRD - Worldwide event

  • UT WoS code of the article