Selecting text entries using a few positive samples and similarity ranking
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62156489%3A43110%2F11%3A00173470" target="_blank" >RIV/62156489:43110/11:00173470 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
angličtina
Original language name
Selecting text entries using a few positive samples and similarity ranking
Original language description
This research was inspired by procedures that are used by human bibliographic searchers: Given some textual and only 'positive' (relevant, interesting) examples coming just from one category, find promptly and simply in an available collection of variousunlabeled documents the most similar ones that belong to a relevant topic defined by an applicant. The problem of the categorization of unlabeled relevant and irrelevant textual documents is here solved by using a small subset of relevant available patterns labeled manually in advance. Unlabeled text items are compared with such labeled patterns. The unlabeled samples are then ranked according their degree of similarity with the patterns. At the top of the rank, there are the most similar (relevant) items. Entries receding from the rank top represent gradually less and less similar entries. The authors emphasize that this simple method, aimed at processing large volumes of text entries, provides initial filtering results from the accur
Czech name
—
Czech description
—
Classification
Type
J<sub>x</sub> - Unclassified - Peer-reviewed scientific article (Jimp, Jsc and Jost)
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
—
Continuities
Z - Vyzkumny zamer (s odkazem do CEZ)
Others
Publication year
2011
Confidentiality
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Data specific for result type
Name of the periodical
Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis
ISSN
1211-8516
e-ISSN
—
Volume of the periodical
LIX
Issue of the periodical within the volume
4
Country of publishing house
CZ - CZECH REPUBLIC
Number of pages
10
Pages from-to
399-408
UT code for WoS article
—
EID of the result in the Scopus database
—