Towards Automatic Methods to Detect Errors in Transcriptions of Speech Recordings
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F19%3APU134190" target="_blank" >RIV/00216305:26230/19:PU134190 - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/document/8683722" target="_blank" >https://ieeexplore.ieee.org/document/8683722</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ICASSP.2019.8683722" target="_blank" >10.1109/ICASSP.2019.8683722</a>
Alternative languages
Result language
English
Original language name
Towards Automatic Methods to Detect Errors in Transcriptions of Speech Recordings
Original language description
This work explores methods for detecting errors in transcriptions of speech recordings. We artificially corrupt well-transcribed TIMIT phonemic transcriptions and WSJ word transcriptions with three types of errors: substitution, insertion and deletion. First, we use a Bayesian model selection method: the log-likelihoods from the alignment and from a phone recognizer are compared, and a final score is computed to make the decision. For this method, we consider two models, a Bayesian Hidden Markov Model (HMM) and a Variational Auto-Encoder (VAE) combined with an HMM. Alternatively, we build a biased ASR system with language models trained on the individual transcriptions; the detection decision is based on the Levenshtein distance (LD) between the transcription and the oracle path from the decoded lattice. On corrupted TIMIT transcriptions, the best result (using either model selection with the VAE model or the biased ASR) achieves a 7% equal error rate on the Detection Error Tradeoff (DET) curve. On corrupted WSJ transcriptions, the best result (using the biased ASR) achieves a 3% equal error rate.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/LQ1602" target="_blank" >LQ1602: IT4Innovations excellence in science</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>S - Specific university research
Others
Publication year
2019
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of ICASSP
ISBN
978-1-5386-4658-8
ISSN
—
e-ISSN
—
Number of pages
5
Pages from-to
3747-3751
Publisher name
IEEE Signal Processing Society
Place of publication
Brighton
Event location
Brighton
Event date
May 12, 2019
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000482554003194