An OCR-based application using Tesseract engine to extract text information from ultrasound B-MODE images

Result identifiers

Result code in IS VaVaI
RIV/47813059:19240/24:A0001431 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F47813059%3A19240%2F24%3AA0001431)
Result on the web
https://ceur-ws.org/Vol-3792/paper12.pdf
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Title in the original language
An OCR-based application using Tesseract engine to extract text information from ultrasound B-MODE images
Result description in the original language
This paper introduces an OCR-based application for extracting text data from ultrasound B-MODE images. Besides the image itself, these images contain important text information about the examination, the image data, etc. Extracting these data is helpful in clinical practice. The application has a simple, user-friendly interface. The core text-extraction algorithm is built on the Tesseract engine and written in C#; the front end is a Windows Forms application. Although the software is fast, simple and user-friendly, recognition can in some cases produce errors, such as an incorrectly recognized patient name or missing or mistaken characters. That is, some text in the image may be missed or misinterpreted compared with the original. The software could therefore be a helpful tool, but its recognition accuracy should be improved. Once it is, the application could be used for other types of medical images, e.g. CT/CTA, PET, SPECT and many more. The accuracy currently achieved is about 90% on average. The authors discuss ideas for increasing recognition accuracy and front-end features that could be improved for more comfortable use. Extracted text data can be saved as a CSV file for further processing.
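The description reports an accuracy of about 90% and CSV export, but it does not say how accuracy is measured. As an illustration only, the sketch below assumes a character-level accuracy derived from Levenshtein edit distance, and it is written in Python rather than the application's C#. The helper names and the field name `device_label` are hypothetical and not taken from the paper.

```python
import csv
import io


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(recognized: str, reference: str) -> float:
    """Assumed metric: 1 - edit_distance / len(reference), clamped at 0."""
    if not reference:
        return 1.0 if not recognized else 0.0
    return max(0.0, 1.0 - levenshtein(recognized, reference) / len(reference))


def to_csv(rows) -> str:
    """Serialize (field, recognized, reference) triples with their accuracy as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["field", "recognized", "reference", "accuracy"])
    for field, recognized, reference in rows:
        acc = char_accuracy(recognized, reference)
        writer.writerow([field, recognized, reference, f"{acc:.2f}"])
    return buf.getvalue()


if __name__ == "__main__":
    # One mistaken character out of ten in a hypothetical text field.
    print(to_csv([("device_label", "ultrasounb", "ultrasound")]))
```

Under this assumed metric, one wrong character in a ten-character field gives an accuracy of 0.90. A word-level or field-level metric would give different figures for the same OCR output.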

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation

Other

Publication year
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Article name in the proceedings
Proceedings of the 24th Conference Information Technologies – Applications and Theory (ITAT 2024)
ISBN
—
ISSN
1613-0073
e-ISSN
—
Number of pages
6
Pages from-to
105-110
Publisher name
CEUR-WS
Place of publication
Not stated
Event location
Drienica (Slovakia)
Event date
20 September 2024
Type of event by nationality
EUR - European event
UT code for WoS article
—
