Review Spam Detection Using Word Embeddings and Deep Neural Networks
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216275%3A25410%2F19%3A39914919" target="_blank" >RIV/00216275:25410/19:39914919 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/chapter/10.1007/978-3-030-19823-7_28" target="_blank" >https://link.springer.com/chapter/10.1007/978-3-030-19823-7_28</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-030-19823-7_28" target="_blank" >10.1007/978-3-030-19823-7_28</a>
Alternative languages
Result language
English
Original language name
Review Spam Detection Using Word Embeddings and Deep Neural Networks
Original language description
Review spam (fake review) detection is increasingly important taking into consideration the rapid growth of internet purchases. Therefore, sophisticated spam filters must be designed to tackle the problem. Traditional machine learning algorithms use review content and other features to detect review spam. However, as demonstrated in related studies, the linguistic context of words may be of particular importance for text categorization. In order to enhance the performance of review spam detection, we propose a novel content-based approach that considers both bag-of-words and word context. More precisely, our approach utilizes n-grams and the skip-gram word embedding method to build a vector model. As a result, high-dimensional feature representation is generated. To handle the representation and classify the review spam accurately, a deep feed-forward neural network is used in the second step. To verify our approach, we use two hotel review datasets, including positive and negative reviews. We show that the proposed detection system outperforms other popular algorithms for review spam detection in terms of accuracy and area under ROC. Importantly, the system provides balanced performance on both classes, legitimate and spam, irrespective of review polarity.
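The two-step idea in the description (an n-gram representation combined with word-context embeddings, then a feed-forward network) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the helper names, the toy reviews, averaging word vectors into one review vector, the layer sizes, and the activations. The embeddings are random placeholders standing in for trained skip-gram vectors, and the network weights are untrained, so the score only shows the data flow.

```python
import math
import random
from collections import Counter

def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_vocab(docs, max_n=2):
    """Map every unigram..max_n-gram seen in docs to a column index."""
    vocab = {}
    for doc in docs:
        toks = doc.lower().split()
        for n in range(1, max_n + 1):
            for g in ngrams(toks, n):
                vocab.setdefault(g, len(vocab))
    return vocab

def ngram_vector(doc, vocab, max_n=2):
    """Bag-of-n-grams count vector for one review."""
    toks = doc.lower().split()
    counts = Counter(g for n in range(1, max_n + 1) for g in ngrams(toks, n))
    vec = [0.0] * len(vocab)
    for g, c in counts.items():
        if g in vocab:
            vec[vocab[g]] = float(c)
    return vec

def mean_embedding(doc, embeddings, dim):
    """Average the word vectors of known tokens (an assumed aggregation)."""
    toks = [t for t in doc.lower().split() if t in embeddings]
    if not toks:
        return [0.0] * dim
    return [sum(embeddings[t][k] for t in toks) / len(toks) for k in range(dim)]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def forward(x, layers):
    """Forward pass: ReLU on hidden layers, sigmoid on the output layer."""
    h = x
    for i, (W, b) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + bias for row, bias in zip(W, b)]
        if i < len(layers) - 1:
            h = [max(0.0, v) for v in z]
        else:
            h = [1.0 / (1.0 + math.exp(-v)) for v in z]
    return h

# Toy data (hypothetical reviews, not from the hotel datasets).
docs = ["the room was clean and quiet", "best hotel ever best stay ever"]
DIM = 5
rng = random.Random(0)

vocab = build_vocab(docs)
# Placeholder vectors; a real pipeline would train skip-gram embeddings.
embeddings = {t: [rng.uniform(-1, 1) for _ in range(DIM)]
              for d in docs for t in d.lower().split()}

# Step 1: concatenate n-gram counts with the averaged embedding.
x = ngram_vector(docs[1], vocab) + mean_embedding(docs[1], embeddings, DIM)

# Step 2: feed the high-dimensional vector through an (untrained) network.
layers = [init_layer(len(x), 4, rng), init_layer(4, 1, rng)]
score = forward(x, layers)[0]
print(f"feature length = {len(x)}, untrained spam score = {score:.3f}")
```

Concatenating the sparse n-gram counts with a dense embedding vector is one simple way to obtain the kind of high-dimensional representation the description mentions; how the authors actually combine the two is not stated in this record.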
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
S - Specific university research<br>I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2019
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
IFIP Advances in Information and Communication Technology. Vol. 559
ISBN
978-3-030-19822-0
ISSN
1868-4238
e-ISSN
—
Number of pages
11
Pages from-to
340-350
Publisher name
Springer
Place of publication
Berlin
Event location
Hersonissos
Event date
May 24, 2019
Type of event by nationality
EUR - European event
UT code for WoS article
—