Text Mining-based Formation of Dictionaries Expressing Opinions in Natural Languages
Result identifiers
Result code in IS VaVaI
RIV/62156489:43110/11:00215946 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62156489%3A43110%2F11%3A00215946)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in original language
Text Mining-based Formation of Dictionaries Expressing Opinions in Natural Languages
Result description in original language
Automatic formation of dictionaries containing words significant for expressing different customers' opinions written in natural languages is demonstrated. The research used very large real-world data concerning the hotel accommodation booking via the Internet. The hotel companies could be interested in characteristic words expressing positive and negative opinions because it could help improve the offered service. The suggested method uses unstructured plain text reviews of many customers from different countries. The data is transformed into vectors using the bag-of-words procedure with the word representation by their frequencies in the reviews. Significant words are selected as relevant attributes for the classification to given categories using trained decision trees. Each tree branch leading to a leaf represents a subset of significant words for a category. The individual word importance is weighted by the word frequency in all the branches combined with their occurrence in branc
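The pipeline described above (bag-of-words frequencies, a decision tree that picks significant words, and word counts collected along tree branches) can be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: the toy reviews, the function names, and the hand-rolled presence/absence tree are all assumptions, and the weighting is reduced to a plain count of how many branches test each word, because the description of the actual weighting is cut off in this record.

```python
import math
from collections import Counter

def bag_of_words(reviews):
    """Turn each review into a word-frequency Counter (bag-of-words)."""
    return [Counter(text.lower().split()) for text in reviews]

def entropy(labels):
    """Shannon entropy of a list of category labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_word(bags, labels, vocab):
    """Pick the word whose presence/absence split has the highest information gain."""
    base, best, best_gain = entropy(labels), None, 0.0
    for word in sorted(vocab):  # sorted so that ties are broken alphabetically
        present = [y for b, y in zip(bags, labels) if b[word] > 0]
        absent = [y for b, y in zip(bags, labels) if b[word] == 0]
        if not present or not absent:
            continue
        remainder = (len(present) * entropy(present)
                     + len(absent) * entropy(absent)) / len(labels)
        gain = base - remainder
        if gain > best_gain + 1e-12:
            best, best_gain = word, gain
    return best

def build_tree(bags, labels, vocab):
    """Return a leaf label, or a node (word, subtree_if_present, subtree_if_absent)."""
    if len(set(labels)) == 1:
        return labels[0]
    word = best_word(bags, labels, vocab)
    if word is None:  # no useful split left: fall back to the majority label
        return Counter(labels).most_common(1)[0][0]
    rest = vocab - {word}
    yes = [(b, y) for b, y in zip(bags, labels) if b[word] > 0]
    no = [(b, y) for b, y in zip(bags, labels) if b[word] == 0]
    return (word,
            build_tree([b for b, _ in yes], [y for _, y in yes], rest),
            build_tree([b for b, _ in no], [y for _, y in no], rest))

def branch_words(tree, path=()):
    """Yield (leaf_label, words_tested_on_branch) for every root-to-leaf branch."""
    if not isinstance(tree, tuple):
        yield tree, path
        return
    word, yes, no = tree
    yield from branch_words(yes, path + (word,))
    yield from branch_words(no, path + (word,))

def build_dictionary(tree):
    """Per category, count in how many branches each word is tested (assumed weighting)."""
    counts = {}
    for label, words in branch_words(tree):
        counts.setdefault(label, Counter()).update(words)
    return counts

# Hypothetical toy reviews, invented for illustration only.
reviews = [
    "friendly staff clean room",
    "friendly staff great location",
    "clean room great breakfast",
    "dirty room rude staff",
    "dirty bathroom noisy street",
    "rude staff noisy room",
]
labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

bags = bag_of_words(reviews)
vocab = set().union(*bags)
tree = build_tree(bags, labels, vocab)
lexicon = build_dictionary(tree)
print(tree)
print(lexicon)
```

In this sketch a word is counted for a category whenever it is tested on a branch ending in that category, whether the branch follows its presence or its absence, so the counts say which words the tree relies on rather than which words literally occur in reviews of that category.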
Classification
Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—
Result linkages
Project
—
Linkages
Z - Research plan (with a link to CEZ)
Other
Publication year
2011
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article title in the proceedings
Mendel 2011: 17th International Conference on Soft Computing
ISBN
978-80-214-4302-0
ISSN
—
e-ISSN
—
Number of pages
8
Pages from-to
374-381
Publisher name
Brno University of Technology
Place of publication
Brno
Event location
Brno
Event date
1. 1. 2011
Type of event by nationality
WRD - Worldwide event
UT WoS code of the article
302647900059