OCR Error Correction for Vietnamese OCR Text with Different Edit Distances

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F22%3A10252180" target="_blank" >RIV/61989100:27240/22:10252180 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007/978-3-031-14627-5_13" target="_blank" >https://link.springer.com/chapter/10.1007/978-3-031-14627-5_13</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-031-14627-5_13" target="_blank" >10.1007/978-3-031-14627-5_13</a>

Alternative languages

Result language
English
Title in the original language
OCR Error Correction for Vietnamese OCR Text with Different Edit Distances
Result description in the original language
Candidate word generation by character edit operations is an important method that has been employed in many OCR error correction approaches. In this paper, we study how character edit distances impact the performance of OCR error correction. We propose an algorithm for generating correction candidates with different edit distances. Correction candidates for both non-word and real-word errors are considered. The candidates are scored and ranked based on linguistic features and edit probability. The experiments are conducted on the VNOnDB database used in the Vietnamese online handwritten text recognition competition (VOHTR 2018). We evaluate the error correction performance at different edit distances in terms of two error metrics, character error rate (CER) and word error rate (WER). The results show that edit distances of 1 and 2 yield better correction results than higher edit distances. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
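The description above names two building blocks: generating candidates within a bounded character edit distance, and measuring CER and WER. The following is a minimal Python sketch of both, not the paper's algorithm. It assumes plain Levenshtein operations (insertion, deletion, substitution), a toy alphabet, and a lexicon given as a set of words. The names `ALPHABET`, `edits1`, `candidates`, `levenshtein`, and `error_rate` are hypothetical, and the scoring by linguistic features and edit probability is not covered.

```python
from typing import Sequence, Set

# Assumption: a toy alphabet. A real Vietnamese system would also need every
# tone-marked vowel (for example "á", "ề", "ữ"), which is omitted here.
ALPHABET = "abcdefghijklmnopqrstuvwxyz" + "ăâđêôơư"


def edits1(word: str, alphabet: str = ALPHABET) -> Set[str]:
    """Return all strings one insertion, deletion, or substitution away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {left + right[1:] for left, right in splits if right}
    substitutes = {
        left + ch + right[1:] for left, right in splits if right for ch in alphabet
    }
    inserts = {left + ch + right for left, right in splits for ch in alphabet}
    return (deletes | substitutes | inserts) - {word}


def candidates(word: str, lexicon: Set[str], max_distance: int) -> Set[str]:
    """Return lexicon words within max_distance edits of word, excluding word."""
    seen = {word}
    frontier = {word}
    for _ in range(max_distance):
        # Expand one more edit, keeping only strings not reached earlier.
        frontier = {e for w in frontier for e in edits1(w)} - seen
        seen |= frontier
    return (seen & lexicon) - {word}


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance between two sequences (characters or word tokens)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def error_rate(reference: Sequence, hypothesis: Sequence) -> float:
    """Edit distance normalised by reference length.

    Pass strings for CER and lists of word tokens for WER.
    """
    if len(reference) == 0:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

Calling `candidates(word, lexicon, k)` for increasing `k` shows how quickly the candidate set grows. For WER, split both texts on whitespace before calling `error_rate`.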

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
10200 - Computer and information sciences

Result continuities

Project
—
Continuities
S - Specific research at universities

Others

Publication year
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Article name in the collection
Lecture Notes in Networks and Systems. Volume 527
ISBN
978-3-031-14626-8
ISSN
2367-3370
e-ISSN
2367-3389
Number of pages
10
Pages from-to
130-139
Publisher name
Springer
Place of publication
Cham
Event location
Sanda
Event date
7 September 2022
Event type by nationality
WRD - Worldwide event
UT WoS article code
000870692600013

Similar results (10)

An Efficient Unsupervised Approach for OCR Error Correction of Vietnamese OCR Text
OCR error correction using correction patterns and self-organizing migrating algorithm
An In-depth Analysis of OCR Errors for Unconstrained Vietnamese Handwriting
