Applying articulatory features within speech recognition

Result identifiers

  • Result code in IS VaVaI

    RIV/68407700:21230/19:00341897 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F19%3A00341897)

  • Result on the web

  • DOI - Digital Object Identifier

Alternative languages

  • Result language

    English

  • Original language name

    Applying articulatory features within speech recognition

  • Original language description

    This thesis investigates Articulatory Features (AF) of speech, with a special focus on improving Czech spontaneous speech recognition. Because spontaneous speech is characterized by frequent coarticulation, assimilation, and reduction of phones, and because AF carry information about speech production mechanisms, they may offer a way to improve the results of such systems. The potential contribution of an AF-based TANDEM ASR architecture to the recognition and phonetic segmentation of spontaneous speech is described. Multi-valued AF classes for Czech and four East-European languages were defined and unified. Subsequent work focused on estimating AF with artificial neural networks. The suitability of standard and advanced acoustic speech features was analyzed, mainly with respect to the temporal context at the input of the ANN/DNN. The behaviour of AF estimation in mismatched or adverse noisy acoustic conditions was also studied, and DCT-TRAP features proved to be the most robust choice for this task. AF were applied within ASR in the form of an AF-based TANDEM system, whose performance was analyzed for English phone recognition and Czech ASR tasks. A positive impact of this system was observed for standard monophone and triphone systems based on MFCC features. Combining GMM-HMM/DNN-HMM systems with the AF-based TANDEM system at the level of lattices with decoded hypotheses significantly improved the baseline results. Finally, the phonetic segmentation task was analyzed using various types of acoustic model architectures, with attention to proper selection of pronunciation variants. This was done for two tasks: read English and casual Czech. A two-stage forced alignment combining DNN-HMM with an optimized monophone-based system was proposed, and phone boundary determination improved for both tasks.

  • Czech name

  • Czech description

Classification

  • Type

    O - Miscellaneous

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

  • Continuities

    S - Specific university research

Others

  • Publication year

    2019

  • Confidentiality

    S - Complete and truthful data on the project are not subject to protection under special legal regulations