Explainable Similarity of Datasets using Knowledge Graph
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/19:10398464 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F19%3A10398464)
Result on the web
https://doi.org/10.1007/978-3-030-32047-8_10
DOI - Digital Object Identifier
10.1007/978-3-030-32047-8_10
Alternative languages
Result language
English
Original language name
Explainable Similarity of Datasets using Knowledge Graph
Original language description
There is a large quantity of datasets available as Open Data on the Web. However, it is challenging for users to find datasets relevant to their needs, even though the datasets are registered in catalogs such as the European Data Portal. This is because the available metadata such as keywords or textual description is not descriptive enough. At the same time, datasets exist in various types of contexts not expressed in the metadata. These may include information about the dataset publisher, the legislation related to dataset publication, language and cultural specifics, etc. In this paper we introduce a similarity model for matching datasets. The model assumes an ontology/knowledge graph, such as Wikidata.org, that serves as a graph-based context to which individual datasets are mapped based on their metadata. A similarity of the datasets is then computed as an aggregation over paths among nodes in the graph. The proposed similarity aims at addressing the problem of explainability of similarity, i.e., providing the user a structured explanation of the match which, in a broader sense, is nowadays a hot topic in the field of artificial intelligence.
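The idea in the description above (datasets mapped to knowledge-graph nodes, similarity aggregated over paths, and the paths themselves serving as the explanation) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy graph `GRAPH` standing in for Wikidata, the use of shortest paths only, the per-path weight of 1 / (number of nodes on the path), and the mean as the aggregation.

```python
from collections import deque
from itertools import product

# Hypothetical toy knowledge graph (undirected adjacency sets),
# standing in for a real graph such as Wikidata.
GRAPH = {
    "Prague": {"Czech Republic", "public transport"},
    "Brno": {"Czech Republic"},
    "Czech Republic": {"Prague", "Brno"},
    "public transport": {"Prague", "tram"},
    "tram": {"public transport"},
    "weather": set(),
}


def shortest_path(graph, start, goal):
    """Breadth-first search; return one shortest path as a node list, or None."""
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in sorted(graph.get(node, ())):
            if nxt in parents:
                continue
            parents[nxt] = node
            if nxt == goal:
                # Walk the parent links back to the start node.
                path = [nxt]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None


def explain_similarity(graph, nodes_a, nodes_b):
    """Return (score, paths) for two datasets given as sets of graph nodes.

    Assumed aggregation: every node pair contributes 1 / len(path) for its
    shortest path (0 if the nodes are disconnected); the score is the mean.
    The returned paths are the structured explanation of the match.
    """
    scores, paths = [], []
    for a, b in product(sorted(nodes_a), sorted(nodes_b)):
        path = shortest_path(graph, a, b)
        if path is None:
            scores.append(0.0)
        else:
            scores.append(1.0 / len(path))
            paths.append(path)
    score = sum(scores) / len(scores) if scores else 0.0
    return score, paths
```

Calling `explain_similarity(GRAPH, {"tram", "Prague"}, {"Brno"})` returns a score together with the connecting paths, for example the path from "Prague" through "Czech Republic" to "Brno", which a user interface could present as the reason for the match. A real system would need a mapping step from dataset metadata to graph nodes, which this sketch takes as given.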
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
GA19-01641S: Contextual Similarity Search in Open Data
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2019
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Lecture Notes in Computer Science
ISBN
978-3-030-32046-1
ISSN
0302-9743
e-ISSN
—
Number of pages
8
Pages from-to
103-110
Publisher name
Springer International Publishing
Place of publication
Cham
Event location
Newark NJ, USA
Event date
Oct 2, 2019
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—