Vše

Co hledáte?

Vše
Projekty
Výsledky výzkumu
Subjekty

Rychlé hledání

  • Projekty podpořené TA ČR
  • Významné projekty
  • Projekty s nejvyšší státní podporou
  • Aktuálně běžící projekty

Chytré vyhledávání

  • Takto najdu konkrétní +slovo
  • Takto z výsledků -slovo zcela vynechám
  • “Takto můžu najít celou frázi”

The New DBpedia Release Cycle: Increasing Agility and Efficiency in Knowledge Extraction Workflows

Identifikátory výsledku

  • Kód výsledku v IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21240%2F20%3A00344079" target="_blank" >RIV/68407700:21240/20:00344079 - isvavai.cz</a>

  • Výsledek na webu

    <a href="https://doi.org/10.1007/978-3-030-59833-4_1" target="_blank" >https://doi.org/10.1007/978-3-030-59833-4_1</a>

  • DOI - Digital Object Identifier

    <a href="http://dx.doi.org/10.1007/978-3-030-59833-4_1" target="_blank" >10.1007/978-3-030-59833-4_1</a>

Alternativní jazyky

  • Jazyk výsledku

    angličtina

  • Název v původním jazyce

    The New DBpedia Release Cycle: Increasing Agility and Efficiency in Knowledge Extraction Workflows

  • Popis výsledku v původním jazyce

    Since its inception in 2007, DBpedia has been constantly releasing open data in RDF, extracted from various Wikimedia projects using a complex software system called the DBpedia Information Extraction Framework (DIEF). For the past 12 years, the software received a plethora of extensions by the community, which positively affected the size and data quality. Due to the increase in size and complexity, the release process was facing huge delays (from 12 to 17 months cycle), thus impacting the agility of the development. In this paper, we describe the new DBpedia release cycle including our innovative release workflow, which allows development teams (in particular those who publish large, open data) to implement agile, cost-efficient processes and scale up productivity. The DBpedia release workflow has been re-engineered, its new primary focus is on productivity and agility, to address the challenges of size and complexity. At the same time, quality is assured by implementing a comprehensive testing methodology. We run an experimental evaluation and argue that the implemented measures increase agility and allow for cost-effective quality-control and debugging and thus achieve a higher level of maintainability. As a result, DBpedia now publishes regular (i.e. monthly) releases with over 21 billion triples with minimal publishing effort.

  • Název v anglickém jazyce

    The New DBpedia Release Cycle: Increasing Agility and Efficiency in Knowledge Extraction Workflows

  • Popis výsledku anglicky

    Since its inception in 2007, DBpedia has been constantly releasing open data in RDF, extracted from various Wikimedia projects using a complex software system called the DBpedia Information Extraction Framework (DIEF). For the past 12 years, the software received a plethora of extensions by the community, which positively affected the size and data quality. Due to the increase in size and complexity, the release process was facing huge delays (from 12 to 17 months cycle), thus impacting the agility of the development. In this paper, we describe the new DBpedia release cycle including our innovative release workflow, which allows development teams (in particular those who publish large, open data) to implement agile, cost-efficient processes and scale up productivity. The DBpedia release workflow has been re-engineered, its new primary focus is on productivity and agility, to address the challenges of size and complexity. At the same time, quality is assured by implementing a comprehensive testing methodology. We run an experimental evaluation and argue that the implemented measures increase agility and allow for cost-effective quality-control and debugging and thus achieve a higher level of maintainability. As a result, DBpedia now publishes regular (i.e. monthly) releases with over 21 billion triples with minimal publishing effort.

Klasifikace

  • Druh

    D - Stať ve sborníku

  • CEP obor

  • OECD FORD obor

    10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Návaznosti výsledku

  • Projekt

  • Návaznosti

    V - Vyzkumna aktivita podporovana z jinych verejnych zdroju

Ostatní

  • Rok uplatnění

    2020

  • Kód důvěrnosti údajů

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

  • Název statě ve sborníku

    Semantic Systems. In the Era of Knowledge Graphs

  • ISBN

    978-3-030-59832-7

  • ISSN

    0302-9743

  • e-ISSN

  • Počet stran výsledku

    18

  • Strana od-do

    1-18

  • Název nakladatele

    Springer International Publishing

  • Místo vydání

    Cham

  • Místo konání akce

    Amsterdam

  • Datum konání akce

    9. 9. 2020

  • Typ akce podle státní příslušnosti

    WRD - Celosvětová akce

  • Kód UT WoS článku