Towards a Systematic Approach to Sync Factual Data across Wikipedia, Wikidata and External Data Sources
The result's identifiers
Result code in IS VaVaI
RIV/68407700:21240/21:00356836 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21240%2F21%3A00356836)
Result on the web
http://ceur-ws.org/Vol-2836/qurator2021_paper_18.pdf
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Towards a Systematic Approach to Sync Factual Data across Wikipedia, Wikidata and External Data Sources
Original language description
This paper addresses one of the largest and most complex data curation workflows in existence: Wikipedia and Wikidata, with a high number of users and curators adding factual information from external sources via a non-systematic Wiki workflow to Wikipedia’s infoboxes and Wikidata items. We present high-level analyses of the current state, the challenges and limitations in this workflow and supplement it with a quantitative and semantic analysis of the resulting data spaces by deploying DBpedia’s integration and extraction capabilities. Based on an analysis of millions of references from Wikipedia infoboxes in different languages, we can find the most important sources which can be used to enrich other knowledge bases with information of better quality. An initial tool is presented, the GlobalFactSync browser, as a prototype to discuss further measures to develop a more systematic approach for data curation in the WikiVerse.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the Conference on Digital Curation Technologies (Qurator 2021)
ISBN
—
ISSN
1613-0073
e-ISSN
1613-0073
Number of pages
15
Pages from-to
—
Publisher name
CEUR Workshop Proceedings
Place of publication
Aachen
Event location
Berlin
Event date
Feb 8, 2021
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—