Digits to Words Converter for Slavic Languages in Systems of Automatic Speech Recognition

The result's identifiers

  • Result code in IS VaVaI

    RIV/46747885:24220/17:00004821 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F17%3A00004821)

  • Result on the web

    http://dx.doi.org/10.1007/978-3-319-66429-3_30

  • DOI - Digital Object Identifier

    10.1007/978-3-319-66429-3_30

Alternative languages

  • Result language

    English

  • Original language name

    Digits to Words Converter for Slavic Languages in Systems of Automatic Speech Recognition

  • Original language description

    In this paper, a system for digits-to-words conversion for almost all Slavic languages is proposed. The system was developed to improve the text corpora that we use for building a lexicon or for training language models and acoustic models in the task of Large Vocabulary Continuous Speech Recognition (LVCSR). Strings of digits, other special characters (%, €, $, …), and abbreviations of physical units (km, m, cm, kg, l, °C, etc.) occur very often in our text corpora, in about 5% of cases. Such strings of digits or special characters are usually omitted when a lexicon is built or a language model is trained. In non-inflected languages (e.g. English), digits-to-words conversion is solved by a relatively simple conversion or lookup table. The problem is more complex in inflected Slavic languages: a string of digits can be converted into several different word combinations, depending on the context, and the resulting words are inflected by gender or case. The main goal of this research was to find the rules (patterns) for converting strings of digits into words for Slavic languages. The second goal was to unify these patterns across Slavic languages and to integrate them into a universal system for digits-to-words conversion.

  • Czech name

  • Czech description
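
As an illustration of the English-language description above, the following is a minimal sketch of the lookup-table approach it mentions for non-inflected languages, followed by a tiny example of why one table is not enough for an inflected language. It is not the system from the paper: the names `number_to_words`, `expand_digits`, and `CZECH_TWO` are hypothetical, the range is limited to 0..999, and the Czech table covers a single numeral only.

```python
import re

# Lookup tables for English; an assumption made for this sketch, not taken from the paper.
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Convert an integer in 0..999 to English words using the tables above."""
    if not 0 <= n < 1000:
        raise ValueError("this sketch only supports 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def expand_digits(text: str) -> str:
    """Replace standalone digit strings of up to three digits with words."""
    return re.sub(r"\b\d{1,3}\b",
                  lambda m: number_to_words(int(m.group())), text)


# In an inflected language the same digit has several surface forms.
# Czech "2" as an example, keyed by (gender, case); illustrative only.
CZECH_TWO = {
    ("masculine", "nominative"): "dva",
    ("feminine", "nominative"): "dvě",
    ("neuter", "nominative"): "dvě",
    ("masculine", "genitive"): "dvou",
}

if __name__ == "__main__":
    print(expand_digits("about 5 km in 30 min"))
    print(CZECH_TWO[("feminine", "nominative")])
```

For English, the digit string alone determines the output. For Czech, the converter also needs the gender and case implied by the surrounding words, which is the context dependence the description refers to.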

Classification

  • Type

    D - Article in proceedings

  • CEP classification

  • OECD FORD branch

    20204 - Robotics and automatic control

Result continuities

  • Project

    TA04010199: MULTILINMEDIA - Multilingual Multimedia Monitoring and Analyzing Platform (/en/project/TA04010199)

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Others

  • Publication year

    2017

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

  • ISBN

    9783319664286

  • ISSN

    0302-9743

  • e-ISSN

  • Number of pages

    10

  • Pages from-to

    312-321

  • Publisher name

    Springer Verlag

  • Place of publication

    Germany

  • Event location

    Hatfield; United Kingdom

  • Event date

    Jan 1, 2017

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article