Math-aware Similarity of Papers in Digital Mathematics Libraries
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F14%3A00077987" target="_blank" >RIV/00216224:14330/14:00077987 - isvavai.cz</a>
Result on the web
<a href="http://dmv.ptm.org.pl/abstracts/19-r/19Sojka.pdf" target="_blank" >http://dmv.ptm.org.pl/abstracts/19-r/19Sojka.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in original language
Math-aware Similarity of Papers in Digital Mathematics Libraries
Result description in original language
Exploratory, semantic similarity search is becoming widespread in digital libraries, and mathematical ones are no exception. For working mathematicians using digital mathematics libraries (DMLs) such as the Czech Digital Mathematics Library (DML-CZ) or the European Digital Mathematics Library (EuDML), we have designed and implemented a math-aware similarity computation framework based on leading-edge topic modelling techniques implemented in the Gensim software package. Approaches from studies on the classification of math papers done for DML-CZ have been tested and deployed in EuDML, where, for a given paper, the ten most semantically similar papers are computed and shown. In the latest experiments, we evaluate several possible representations of mathematical formulae for retrieving semantically similar papers. The quality of similarity is measured by comparison with the similarity matrix induced from the Mathematics Subject Classification codes assigned to every paper.
Title in English
Math-aware Similarity of Papers in Digital Mathematics Libraries
Result description in English
Exploratory, semantic similarity search is becoming widespread in digital libraries, and mathematical ones are no exception. For working mathematicians using digital mathematics libraries (DMLs) such as the Czech Digital Mathematics Library (DML-CZ) or the European Digital Mathematics Library (EuDML), we have designed and implemented a math-aware similarity computation framework based on leading-edge topic modelling techniques implemented in the Gensim software package. Approaches from studies on the classification of math papers done for DML-CZ have been tested and deployed in EuDML, where, for a given paper, the ten most semantically similar papers are computed and shown. In the latest experiments, we evaluate several possible representations of mathematical formulae for retrieving semantically similar papers. The quality of similarity is measured by comparison with the similarity matrix induced from the Mathematics Subject Classification codes assigned to every paper.
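The pipeline described above (represent each paper, list its most similar papers, then check the result against MSC-induced similarity) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy papers, the `math:` token prefix standing in for a formula representation, the TF-IDF cosine similarity used in place of the Gensim topic models, and the three-level MSC scoring rule. None of it is taken from the actual DML-CZ or EuDML implementation.

```python
import math
from collections import Counter

# Hypothetical toy data: paper id -> (tokens, MSC codes).
# Tokens prefixed "math:" stand in for some representation of a formula.
PAPERS = {
    "p1": (["banach", "space", "norm", "math:norm_x"], {"46B20"}),
    "p2": (["hilbert", "space", "norm", "math:norm_x"], {"46C05"}),
    "p3": (["graph", "coloring", "chromatic", "math:chi_G"], {"05C15"}),
    "p4": (["graph", "planar", "coloring", "math:chi_G"], {"05C15", "05C10"}),
}


def tfidf_vectors(docs):
    """Map each paper id to a sparse TF-IDF vector (dict: token -> weight)."""
    n = len(docs)
    # Document frequency: in how many papers each token occurs.
    df = Counter(tok for tokens in docs.values() for tok in set(tokens))
    return {
        pid: {tok: cnt * math.log(n / df[tok])
              for tok, cnt in Counter(tokens).items()}
        for pid, tokens in docs.items()
    }


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(pid, vecs, k=10):
    """Return up to k (paper id, score) pairs, most similar first."""
    scores = [(other, cosine(vecs[pid], vec))
              for other, vec in vecs.items() if other != pid]
    return sorted(scores, key=lambda pair: -pair[1])[:k]


def msc_similarity(codes_a, codes_b):
    """Assumed scoring: 1.0 for a shared full code, 0.5 for a shared
    two-digit top-level area, 0.0 otherwise."""
    if codes_a & codes_b:
        return 1.0
    if {c[:2] for c in codes_a} & {c[:2] for c in codes_b}:
        return 0.5
    return 0.0


def mean_msc_agreement(vecs, msc, k=10):
    """Average MSC similarity between each paper and its top-k neighbours."""
    values = [msc_similarity(msc[pid], msc[other])
              for pid in vecs
              for other, _ in most_similar(pid, vecs, k)]
    return sum(values) / len(values)


docs = {pid: tokens for pid, (tokens, _) in PAPERS.items()}
msc = {pid: codes for pid, (_, codes) in PAPERS.items()}
vecs = tfidf_vectors(docs)

print(most_similar("p1", vecs, k=1))
print(mean_msc_agreement(vecs, msc, k=1))
```

Swapping the `math:` tokens for a different formula encoding and re-running `mean_msc_agreement` gives one simple way to compare representations, in the spirit of the experiments the description mentions.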
Classification
Type
O - Other results
CEP field
IN - Informatics
OECD FORD field
—
Result linkages
Project
<a href="/cs/project/LG13010" target="_blank" >LG13010: Representation of the Czech Republic in the European Research Consortium for Informatics and Mathematics</a><br>
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2014
Data confidentiality code
S - Complete and true project data are not subject to protection under special legal regulations