Grouping of Customer Opinions Written in Natural Language Using Unsupervised Machine Learning
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62156489%3A43110%2F12%3A00198889" target="_blank" >RIV/62156489:43110/12:00198889 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in the original language
Grouping of Customer Opinions Written in Natural Language Using Unsupervised Machine Learning
Result description in the original language
Automatic categorization is one of the most topical current tasks in textual document processing. Clustering, the most common form of unsupervised learning, enables automatic grouping of unlabeled documents into subsets called clusters. In this paper, the authors examine the results of clustering very large real-world electronic data collections containing customer reviews written freely in natural-language English. The reviews are automatically clustered into two groups that should contain either positive or negative reviews. The paper focuses on analyzing why certain reviews are wrongly assigned to a group containing mostly reviews of a different class. The assignment of a review to a certain cluster is based on its properties, i.e., on the words that appear in the review. Thus, words appearing in incorrectly categorized reviews were analyzed. It was found that words that are important from the correct classificatio
Classification
Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—
Result continuities
Project
—
Continuities
Z - Research plan (with a link to CEZ)
Other
Publication year
2012
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
Proceedings of the 14th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing SYNASC 2012
ISBN
978-0-7695-4934-7
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
265-270
Publisher name
IEEE
Place of publication
—
Event location
Timisoara
Event date
1. 1. 2012
Type of event by nationality
WRD - Worldwide event
UT WoS code of the article
—