Approaches to Samples Selection for Machine Learning Based Classification of Textual Data

Result identifiers

Result code in IS VaVaI
RIV/62156489:43110/13:00208671 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62156489%3A43110%2F13%3A00208671)
Result on the web
—
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Title in the original language
Approaches to Samples Selection for Machine Learning Based Classification of Textual Data
Result description in the original language
The paper focuses on the process of selecting representative sample documents written in a natural language that can be used as the basis for automatic selection or classification of textual documents. A method of selecting the examples from a larger set of candidate examples, called automatic biased sample selection, is compared to random and manual selection. The methods are evaluated by experiments carried out with real-world data consisting of customer reviews, with different document representations and similarity measures. The presented approach, which provided satisfactory results, faces problems related to processing user-created content and high computational complexity, and can be used as an alternative to manual selection and evaluation of textual samples.
Title in English
Approaches to Samples Selection for Machine Learning Based Classification of Textual Data
Result description in English
The paper focuses on the process of selecting representative sample documents written in a natural language that can be used as the basis for automatic selection or classification of textual documents. A method of selecting the examples from a larger set of candidate examples, called automatic biased sample selection, is compared to random and manual selection. The methods are evaluated by experiments carried out with real-world data consisting of customer reviews, with different document representations and similarity measures. The presented approach, which provided satisfactory results, faces problems related to processing user-created content and high computational complexity, and can be used as an alternative to manual selection and evaluation of textual samples.

Classification

Type
Jx - Uncategorized - Article in a specialist periodical (Jimp, Jsc and Jost)
CEP field
IN - Informatics
OECD FORD field
—

Result continuities

Project
—
Continuities
Z - Research plan (with a link to CEZ)

Others

Year of implementation
2013
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Name of the periodical
Computing and Informatics
ISSN
1335-9150
e-ISSN
—
Volume of the periodical
32
Issue of the periodical within the volume
5
Country of the periodical's publisher
SK - Slovak Republic
Number of pages of the result
19
Pages from-to
949-967
UT WoS code of the article
327410900003
EID of the result in the Scopus database
—

Similar results (10)

Semantics-Based Document Categorization Employing Semi-Supervised Learning
Methods for Detoxification of Texts for the Russian Language
Task Evaluation based on Quizzes Variant, Created Questions, and Generators
