An Efficient Unsupervised Approach for OCR Error Correction of Vietnamese OCR Text

The result's identifiers

  • Result code in IS VaVaI

    RIV/61989100:27240/23:10252607 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F23%3A10252607)

  • Result on the web

    https://ieeexplore.ieee.org/document/10144767

  • DOI - Digital Object Identifier

    10.1109/ACCESS.2023.3283340 (http://dx.doi.org/10.1109/ACCESS.2023.3283340)

Alternative languages

  • Result language

    English

  • Original language name

    An Efficient Unsupervised Approach for OCR Error Correction of Vietnamese OCR Text

  • Original language description

    Different types of OCR errors often occur in OCR texts due to the low quality of scanned document images or limitations in OCR software. In this paper, we propose a novel unsupervised approach for OCR error correction. Correction candidates for OCR errors are generated and explored in their neighborhoods using correction character edits controlled by an adapted hill-climbing algorithm. Correction characters are extracted from only original ground truth texts, which do not depend on OCR texts in training data. A weighted objective function used to score and rank correction candidates is heuristically tested to find optimal weight combinations. The proposed model is evaluated on an OCR text dataset originating from the Vietnamese handwritten database in the ICFHR 2018 Vietnamese online handwritten text recognition competition. The proposed model is also verified concerning its stability and complexity. The experimental results show that our model achieves competitive performance compared to the other models in the ICFHR 2018 competition.

  • Czech name

  • Czech description
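
The original language description above mentions correction candidates explored through character edits, controlled by a hill-climbing algorithm and ranked by a weighted objective function. The following is only a minimal sketch of that general idea, not the authors' implementation: the function names, the two score components (a word-frequency term from ground-truth text and a string-similarity term), the weights, and the toy lexicon are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy data standing in for statistics taken from ground-truth text.
TOY_FREQ = {"hello": 5, "help": 3, "world": 2}
TOY_ALPHABET = sorted(set("".join(TOY_FREQ)))


def neighbors(word, alphabet):
    """Return all strings one character edit away from `word`."""
    out = set()
    for i in range(len(word) + 1):
        for c in alphabet:
            out.add(word[:i] + c + word[i:])  # insertion
        if i < len(word):
            out.add(word[:i] + word[i + 1:])  # deletion
            for c in alphabet:
                out.add(word[:i] + c + word[i + 1:])  # substitution
    out.discard(word)
    return out


def score(candidate, ocr_token, freq, w_lex=0.7, w_sim=0.3):
    """Weighted objective (assumed form): lexicon frequency plus similarity."""
    lex = freq.get(candidate, 0) / sum(freq.values())
    sim = SequenceMatcher(None, candidate, ocr_token).ratio()
    return w_lex * lex + w_sim * sim


def hill_climb(ocr_token, freq, alphabet, max_steps=5):
    """Greedy hill climbing: move to the best neighbor while the score improves."""
    current = ocr_token
    current_score = score(current, ocr_token, freq)
    for _ in range(max_steps):
        # The (score, string) key makes tie-breaking deterministic.
        best = max(
            neighbors(current, alphabet),
            key=lambda c: (score(c, ocr_token, freq), c),
        )
        best_score = score(best, ocr_token, freq)
        if best_score <= current_score:
            break  # local optimum reached
        current, current_score = best, best_score
    return current


if __name__ == "__main__":
    print(hill_climb("hcllo", TOY_FREQ, TOY_ALPHABET))
```

In this sketch the weights `w_lex` and `w_sim` play the role of the weight combination that the description says is tuned heuristically; different values would change which candidate wins.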

Classification

  • Type

    Jimp - Article in a specialist periodical, which is included in the Web of Science database

  • CEP classification

  • OECD FORD branch

    10200 - Computer and information sciences

Result continuities

  • Project

    EF17_049/0008425: A Research Platform focused on Industry 4.0 and Robotics in Ostrava Agglomeration

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)
    S - Specific research at universities

Others

  • Publication year

    2023

  • Confidentiality

    S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Name of the periodical

    IEEE Access

  • ISSN

    2169-3536

  • e-ISSN

  • Volume of the periodical

    11

  • Issue of the periodical within the volume

    06 June 2023

  • Country of publishing house

    US - UNITED STATES

  • Number of pages

    16

  • Pages from-to

    58406-58421

  • UT code for WoS article

    001012334700001

  • EID of the result in the Scopus database