Selecting Characteristic Patterns of Text Contributions to Social Networks Using Instance-Based Learning Algorithm IBL-2
The result's identifiers
Result code in IS VaVaI
RIV/62156489:43110/17:43911358 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62156489%3A43110%2F17%3A43911358)
Alternative codes found
RIV/00216224:14560/17:00108749
Result on the web
https://ece.pefka.mendelu.cz/sites/default/files/imce/ECE2017_fin.pdf
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Selecting Characteristic Patterns of Text Contributions to Social Networks Using Instance-Based Learning Algorithm IBL-2
Original language description
The presented research focuses on selecting typical patterns of textual entries written in a natural language (English) in the social network booking.com, which publishes the sentiment of customers who have used an accommodation service. This work examines whether such patterns can be found via text mining based on a machine-learning tool known as Instance-Based Learning (IBL). To reduce the high computational demands of the basic algorithm IBL-1 (k-nearest neighbors), IBL-2 does not store sample candidates whose function is already carried out successfully by the stored samples. The textual data are represented as bag-of-words with sparse vectors. Because the computational complexity increases non-linearly with both the number of samples and the size of their vocabulary, the candidate samples are first stripped of common insignificant terms, and the vector sparsity is then strongly reduced by removing words whose frequency is low relative to the number of samples. IBL-2 then declines to store samples that duplicate the functionality of those already stored. As a result, the database contains only (or mainly) significant samples that represent characteristic patterns, which may be used for classification or for other kinds of subsequent social network analysis.
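The pipeline outlined in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the `min_doc_freq` threshold, the use of cosine similarity with a single nearest neighbor, and the four toy reviews are all assumptions made for illustration. The sketch removes common terms, prunes words that occur in too few samples, and then applies the IBL-2 storage rule, keeping a sample only when the already stored samples would misclassify it.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the record does not say which terms were removed.
STOP_WORDS = {"the", "a", "an", "and", "was", "is", "very"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop common insignificant terms."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def build_vectors(texts, min_doc_freq=2):
    """Bag-of-words vectors, keeping only words found in >= min_doc_freq samples."""
    token_lists = [tokenize(t) for t in texts]
    doc_freq = Counter(w for tokens in token_lists for w in set(tokens))
    vocab = {w for w, df in doc_freq.items() if df >= min_doc_freq}
    return [Counter(w for w in tokens if w in vocab) for tokens in token_lists]


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v.get(word, 0) for word, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ibl2(vectors, labels):
    """IBL-2 storage rule: keep a sample only if its nearest stored
    sample has a different label (i.e. it would be misclassified)."""
    stored = []
    for vec, label in zip(vectors, labels):
        if stored:
            _, nearest_label = max(stored, key=lambda s: cosine(vec, s[0]))
            if nearest_label == label:
                continue  # already covered by a stored sample
        stored.append((vec, label))
    return stored


# Toy reviews invented for this sketch (not data from the paper).
texts = [
    "Great clean room and friendly staff",
    "Dirty room and rude staff",
    "Clean room, friendly staff, great breakfast",
    "Rude staff and dirty bathroom",
]
labels = ["pos", "neg", "pos", "neg"]

stored = ibl2(build_vectors(texts), labels)
print(len(stored), "of", len(texts), "samples kept")
```

On this toy input the third and fourth reviews are correctly classified by the first two, so they are not stored. Because IBL-2 processes samples sequentially, the set of retained samples depends on the order of presentation.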
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
GA16-26353S: Sentiment and its impact on stock markets
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2017
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Enterprise and Competitive Environment: Conference Proceedings
ISBN
978-80-7509-499-5
ISSN
—
e-ISSN
not specified
Number of pages
10
Pages from-to
971-980
Publisher name
Mendelova univerzita v Brně
Place of publication
Brno
Event location
Brno
Event date
Mar 9, 2017
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000427306200100