Lexicon-based vs. Lexicon-free ASR for Norwegian Parliament Speech Transcription

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F46747885%3A24220%2F22%3A00009900" target="_blank" >RIV/46747885:24220/22:00009900 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007/978-3-031-16270-1_33" target="_blank" >https://link.springer.com/chapter/10.1007/978-3-031-16270-1_33</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-031-16270-1_33" target="_blank" >10.1007/978-3-031-16270-1_33</a>

Alternative languages

Result language
English
Title in the original language
Lexicon-based vs. Lexicon-free ASR for Norwegian Parliament Speech Transcription
Result description in the original language
Norwegian is a challenging language for automatic speech recognition (ASR) research because it has two written standards (Bokmål and Nynorsk) and a large number of distinct dialects, none of which has the status of an official spoken norm. A traditional lexicon-based approach to ASR leads to a huge lexicon (because of the two standards and also due to compound words) with many spelling and pronunciation variants, and consequently to a large (and sparse) language model (LM). We have built a system with a 601k-word lexicon and an acoustic model (AM) based on several types of neural networks, and we compare its performance with a lexicon-free end-to-end system developed in the ESPnet framework. For evaluation we use a publicly available dataset of Norwegian parliament speeches that offers 100 h for training and 12 h for testing. In spite of this rather limited training resource, the lexicon-free approach yields significantly better results (13.0% word error rate, WER) than the best system with the lexicon, LM and neural network AM (which achieved 22.5% WER).
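The description above compares the two systems by word error rate. As a rough illustration only, WER is commonly computed as the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below assumes plain whitespace tokenization and uses hypothetical function names and toy token strings; it is not the scoring tool used for the reported results, whose text normalization is not described in this record.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are split on whitespace with no further normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Toy tokens (hypothetical): one substitution and one deletion over four words.
    print(word_error_rate("a b c d", "a x c"))
    # The two WER figures quoted in the description above.
    print(relative_wer_reduction(22.5, 13.0))
```

With the two figures quoted in the description, the second helper expresses the gap between the systems as a relative reduction rather than an absolute difference in percentage points.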

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
<a href="/cs/project/TO01000027" target="_blank" >TO01000027: NORDTRANS - Technology for automatic speech transcription in selected Nordic languages</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)

Others

Publication year
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Article name in the collection
Lecture Notes in Computer Science
ISBN
978-303116269-5
ISSN
0302-9743
e-ISSN
—
Number of pages
9
Pages from-to
401-409
Publisher name
SPRINGER-VERLAG BERLIN
Place of publication
—
Event location
Brno
Event date
1. 1. 2022
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000866222300033

Similar results (10)

Supervised Morphological Segmentation Using Rich Annotated Lexicon
Developing State-of-the-Art End-to-End ASR for Norwegian
Deep Neural Networks Based Automatic Speech Recognition for Four Ethiopian Languages
