Selecting text entries using a few positive samples and similarity ranking
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62156489%3A43110%2F11%3A00173470" target="_blank" >RIV/62156489:43110/11:00173470 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in the original language
Selecting text entries using a few positive samples and similarity ranking
Result description in the original language
This research was inspired by procedures that are used by human bibliographic searchers: Given some textual and only 'positive' (relevant, interesting) examples coming just from one category, find promptly and simply in an available collection of various unlabeled documents the most similar ones that belong to a relevant topic defined by an applicant. The problem of the categorization of unlabeled relevant and irrelevant textual documents is here solved by using a small subset of relevant available patterns labeled manually in advance. Unlabeled text items are compared with such labeled patterns. The unlabeled samples are then ranked according to their degree of similarity with the patterns. At the top of the rank, there are the most similar (relevant) items. Entries receding from the rank top represent gradually less and less similar entries. The authors emphasize that this simple method, aimed at processing large volumes of text entries, provides initial filtering results from the accur
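The ranking idea in the description above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the record does not say which text representation or similarity measure the article uses, so the bag-of-words term counts, the cosine similarity, the choice of scoring each entry by its best match among the positive samples, and the function names vectorize, cosine and rank_by_similarity are all assumptions made here.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (an assumed representation, not from the article)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_similarity(positives, unlabeled):
    """Return (score, text) pairs, most similar to the positive samples first.

    Each unlabeled text is scored by its highest similarity to any of the
    manually labeled positive samples (an assumed aggregation rule).
    """
    positive_vectors = [vectorize(p) for p in positives]
    scored = []
    for text in unlabeled:
        vec = vectorize(text)
        score = max((cosine(vec, p) for p in positive_vectors), default=0.0)
        scored.append((score, text))
    # Python's sort is stable, so equally scored entries keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Toy data, invented purely for illustration.
positives = ["wheat crop yield in dry soil", "soil moisture and crop growth"]
unlabeled = [
    "football match results",
    "crop yield depends on soil moisture",
    "stock market prices fall",
]
for score, text in rank_by_similarity(positives, unlabeled):
    print(f"{score:.3f}  {text}")
```

Entries near the top of the printed list are the candidates a searcher would inspect first; a cut-off on rank or score would turn the ranking into an initial filter.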
Title in English
Selecting text entries using a few positive samples and similarity ranking
Result description in English
This research was inspired by procedures that are used by human bibliographic searchers: Given some textual and only 'positive' (relevant, interesting) examples coming just from one category, find promptly and simply in an available collection of various unlabeled documents the most similar ones that belong to a relevant topic defined by an applicant. The problem of the categorization of unlabeled relevant and irrelevant textual documents is here solved by using a small subset of relevant available patterns labeled manually in advance. Unlabeled text items are compared with such labeled patterns. The unlabeled samples are then ranked according to their degree of similarity with the patterns. At the top of the rank, there are the most similar (relevant) items. Entries receding from the rank top represent gradually less and less similar entries. The authors emphasize that this simple method, aimed at processing large volumes of text entries, provides initial filtering results from the accur
Classification
Type
J<sub>x</sub> - Unclassified - Article in a specialist periodical (Jimp, Jsc and Jost)
CEP field
IN - Informatics
OECD FORD field
—
Result continuities
Project
—
Continuities
Z - Research plan (with a link to CEZ)
Others
Publication year
2011
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis
ISSN
1211-8516
e-ISSN
—
Volume of the periodical
LIX
Issue of the periodical within the volume
4
Country of the periodical's publisher
CZ - Czech Republic
Number of pages
10
Pages from-to
399-408
UT WoS code of the article
—
EID of the result in the Scopus database
—