A New Approach to Automatically Find and Fix Erroneous Labels in Dependency Parsing Treebanks

Result identifiers

  • Result code in IS VaVaI

    RIV/00216208:11320/21:10441550 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F21%3A10441550)

  • Result on the web

    https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=fkq3xX7PWH

  • DOI - Digital Object Identifier

    10.34028/iajit/18/3/12 (http://dx.doi.org/10.34028/iajit/18/3/12)

Alternative languages

  • Result language

    English

  • Title in original language

    A New Approach to Automatically Find and Fix Erroneous Labels in Dependency Parsing Treebanks

  • Result description in original language

    Dependency parsing (DP) identifies the sub-term/upper-term relations between the words that make up each sentence in a text. DP produces meaningful information for high-level applications, so correct labeling of the text corpus used in DP studies is very important: studies performed on a wrongly labeled corpus will yield erroneous results. Whether a corpus is labeled manually by human annotators or automatically, faulty cases will occur. Because of human factors or the annotations used for labeling, treebanks will contain faulty labels. Detecting and correcting possibly faulty labels is therefore very important for increasing the accuracy of subsequent studies, but manual correction requires great effort and time. The purpose of this study is to create a model that automatically finds possibly faulty labels and suggests new labels for them. The proposed model aims to detect and correct possibly faulty labels in a text corpus and to increase consistency among corpora of the same language. The model's label suggestions are also a great convenience for the language expert correcting the corpus. Another advantage of the model is its language-independent structure. In experimental studies for Turkish, the model succeeded in finding and correcting potentially faulty labels, and an increase in accuracy was also observed in studies carried out for languages other than Turkish. The accuracy of the system's results was analyzed with the help of 10 language experts.

Classification

  • Type

    Jimp - Article in a periodical indexed in the Web of Science database

  • CEP field

  • OECD FORD field

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

  • Project

  • Linkages

Other

  • Year of publication

    2021

  • Data confidentiality code

    S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific to the result type

  • Periodical name

    International Arab Journal of Information Technology

  • ISSN

    1683-3198

  • e-ISSN

  • Periodical volume

    18

  • Issue number within the volume

    3

  • Country of the periodical's publisher

    JO - Hashemite Kingdom of Jordan

  • Number of pages of the result

    9

  • Pages from-to

    356-364

  • UT WoS code of the article

    000667208600012

  • EID of the result in the Scopus database

    2-s2.0-85106439495