Spam detection on social networks using cost-sensitive feature selection and ensemble-based regularized deep neural networks
Result identifiers
Result code in IS VaVaI
RIV/00216275:25410/20:39916136 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216275%3A25410%2F20%3A39916136)
Result on the web
https://link.springer.com/article/10.1007/s00521-019-04331-5
DOI - Digital Object Identifier
10.1007/s00521-019-04331-5 (http://dx.doi.org/10.1007/s00521-019-04331-5)
Alternative languages
Result language
English
Title in original language
Spam detection on social networks using cost-sensitive feature selection and ensemble-based regularized deep neural networks
Result description in original language
Spam detection on social networks is increasingly important owing to the rapid growth of the social network user base. Sophisticated spam filters must be developed to deal with this complex problem. Traditional machine learning approaches such as neural networks, support vector machines and Naive Bayes classifiers are not effective enough to process and utilize the complex features present in high-dimensional data on social network spam. Moreover, the traditional objective criteria of social network spam filters cannot cope with different costs assigned to type I and type II errors. To overcome these problems, here we propose a novel cost-sensitive approach to social network spam filtering. The proposed approach is composed of two stages. In the first stage, multi-objective evolutionary feature selection is used to minimize both the misclassification cost of the proposed model and the number of attributes necessary for spam filtering. In the second stage, the approach uses cost-sensitive ensemble learning techniques with regularized deep neural networks as base learners. We demonstrate that this approach is effective for social network spam filtering on two benchmark datasets. We also show that the proposed approach outperforms other popular algorithms used in social network spam filtering, such as random forest, Naive Bayes or support vector machines.
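The two objectives named in the description (misclassification cost and number of attributes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the cost values, and the mapping of type I errors to false positives (a legitimate message flagged as spam) are assumptions made only for illustration.

```python
from typing import Sequence, Tuple


def misclassification_cost(y_true: Sequence[int], y_pred: Sequence[int],
                           cost_fp: float = 5.0, cost_fn: float = 1.0) -> float:
    """Average cost per message; label 1 = spam, 0 = legitimate.

    cost_fp (type I, assumed) and cost_fn (type II, assumed) are
    illustrative values, not taken from the paper.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    # Count legitimate messages flagged as spam and spam messages missed.
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return (cost_fp * fp + cost_fn * fn) / len(y_true)


def objectives(feature_mask: Sequence[int], y_true: Sequence[int],
               y_pred: Sequence[int]) -> Tuple[float, int]:
    """Both objectives to minimize: (misclassification cost, number of selected features)."""
    return misclassification_cost(y_true, y_pred), sum(feature_mask)


def dominates(a: Tuple[float, int], b: Tuple[float, int]) -> bool:
    """Pareto dominance: a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
```

A multi-objective evolutionary search would keep the feature subsets whose objective pairs are not dominated by any other candidate. Here `y_pred` stands for the predictions of whatever classifier is trained on the features selected by `feature_mask`.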
Classification
Type
Jimp - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
GA16-19590S: Topic and sentiment analysis of multiple textual sources for corporate financial decision-making
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2020
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
Neural Computing and Applications
ISSN
0941-0643
e-ISSN
—
Periodical volume
32
Issue number within the volume
9
Publisher's country
US - United States of America
Number of pages
19
Pages from-to
4239-4257
UT WoS article code
000527419900009
Result EID in the Scopus database
2-s2.0-85068790680