Mining Significant Words from Customer Opinions Written in Different Natural Languages
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62156489%3A43110%2F11%3A00215851" target="_blank" >RIV/62156489:43110/11:00215851 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-642-23538-2_27" target="_blank" >http://dx.doi.org/10.1007/978-3-642-23538-2_27</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-642-23538-2_27" target="_blank" >10.1007/978-3-642-23538-2_27</a>
Alternative languages
Result language
English
Original language name
Mining Significant Words from Customer Opinions Written in Different Natural Languages
Original language description
Opinions expressed in text documents freely written in various natural languages represent a valuable source of knowledge hidden in large datasets. The presented research describes a text-mining method for discovering words that are significant for expressing different opinions (positive and negative). The method applies simple but unified data pre-processing for all languages, producing a bag-of-words in which words are represented by their frequencies in the data. The frequencies are then used by an algorithm that generates decision trees. The decision nodes of the tree contain the words that are significant for expressing the opinions. The positions of these words in the tree represent their degree of significance, with the most significant word in the root node. As a result, a list of relevant words can be used to create a dictionary containing only relevant information. The described method was tested on very large sets of customer reviews concerning on-line hotel room booking.
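The pipeline outlined in the description (bag-of-words frequencies, a decision tree, words ranked by their position in the tree) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not name the tree algorithm, so the sketch assumes a simple entropy-based tree that splits on whether a word occurs at all (frequency > 0) rather than on frequency thresholds. The function names, the `max_depth` parameter, and the four toy reviews are hypothetical and serve only as illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, keep runs of letters, and count word frequencies."""
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def best_word(docs, labels):
    """Return (gain, word) for the 'frequency > 0' split with the highest gain."""
    base = entropy(labels)
    best = (0.0, None)
    vocabulary = sorted({w for d in docs for w in d})
    for word in vocabulary:
        present = [l for d, l in zip(docs, labels) if d[word] > 0]
        absent = [l for d, l in zip(docs, labels) if d[word] == 0]
        if not present or not absent:
            continue  # the word does not separate the documents at all
        remainder = (len(present) * entropy(present)
                     + len(absent) * entropy(absent)) / len(labels)
        gain = base - remainder
        if gain > best[0]:
            best = (gain, word)
    return best


def significant_words(docs, labels, depth=0, max_depth=3):
    """Grow the tree recursively; return sorted (depth, word) pairs.

    Depth 0 is the root, i.e. the word assumed to be most significant.
    """
    if depth >= max_depth or len(set(labels)) < 2:
        return []  # depth limit reached, or the subset is already pure
    _, word = best_word(docs, labels)
    if word is None:
        return []
    found = [(depth, word)]
    for keep in (True, False):
        subset = [(d, l) for d, l in zip(docs, labels)
                  if (d[word] > 0) == keep]
        found += significant_words([d for d, _ in subset],
                                   [l for _, l in subset],
                                   depth + 1, max_depth)
    return sorted(found)


# Hypothetical toy reviews, invented for illustration only.
reviews = [
    ("clean room and friendly staff", "positive"),
    ("friendly staff, great location", "positive"),
    ("dirty room and rude staff", "negative"),
    ("rude reception, noisy room", "negative"),
]
docs = [bag_of_words(text) for text, _ in reviews]
labels = [label for _, label in reviews]
ranked = significant_words(docs, labels)
for depth, word in ranked:
    print(depth, word)
```

The letter-only tokenizer is one way to keep the pre-processing identical across languages, in the spirit of the unified pre-processing the description mentions. A production setup would more likely rely on an established decision-tree learner that can split on the frequency values themselves.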
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
—
Continuities
Z - Research plan (with a link to CEZ)
Others
Publication year
2011
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Text, Speech and Dialogue
ISBN
978-3-642-23537-5
ISSN
—
e-ISSN
—
Number of pages
8
Pages from-to
211-218
Publisher name
Springer
Place of publication
Heidelberg Dordrecht London New York
Event location
Pilsen
Event date
Sep 1, 2011
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—