Text-based feature selection using binary particle swarm optimization for sentiment analysis
Result identifiers
Result code in IS VaVaI
RIV/70883521:28140/22:63554658 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F70883521%3A28140%2F22%3A63554658)
Result on the web
https://ieeexplore.ieee.org/document/9872823
DOI - Digital Object Identifier
10.1109/ICECET55527.2022.9872823 (http://dx.doi.org/10.1109/ICECET55527.2022.9872823)
Alternative languages
Result language
English
Title in original language
Text-based feature selection using binary particle swarm optimization for sentiment analysis
Result description in original language
The upsurge in social media data driven by the proliferation of Web 2.0 applications has spurred a growing number of scholarly studies in the sentiment analysis domain. Sentiment analysis (SA), usually treated as a text classification task in natural language processing (NLP), classifies the views, attitudes, and feelings that people express about a particular organization or entity. The unstructured textual data can be pre-processed and represented as feature vectors, which then serve as input to a machine learning algorithm for sentiment classification. In this process, feature selection, which is a binary problem, becomes an essential component of the SA task. We present a metaheuristic approach that selects an optimal feature subset with the binary particle swarm optimization (BPSO) algorithm, with a view to improving sentiment classification accuracy on the Sentiment Labelled Sentences benchmark dataset. k-nearest neighbours (k-NN), naïve Bayes (NB), and support vector machine (SVM) classifiers were employed as baseline classifiers. Before sentiment classification, BPSO is used to select the optimal subset of text features from the data. We then train SVM, NB, and k-NN on the Sentiment Labelled Sentences benchmark dataset using the selected feature subset. The experimental results show that the proposed approach to text feature selection and sentiment classification compares favourably with the baseline classifiers.
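The selection step described in the abstract can be illustrated with a minimal sketch. The record does not state the paper's swarm size, coefficients, transfer function, or fitness function, so everything below is an assumption: a sigmoid transfer function (the common textbook form of BPSO), validation accuracy of a 1-nearest-neighbour classifier as the fitness, and a small synthetic count matrix standing in for the Sentiment Labelled Sentences data. The names `knn_accuracy`, `bpso_select`, and `make_toy_data` are hypothetical helpers, not the authors' code.

```python
import numpy as np


def knn_accuracy(X_train, y_train, X_test, y_test):
    """Accuracy of a 1-nearest-neighbour classifier (squared Euclidean distance)."""
    # Pairwise squared distances between every test row and every training row.
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    pred = y_train[d.argmin(axis=1)]
    return float((pred == y_test).mean())


def bpso_select(X_train, y_train, X_val, y_val, n_particles=10, n_iter=20,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO over feature masks; returns (boolean mask, best fitness).

    Assumed design: sigmoid transfer function, 1-NN validation accuracy as
    fitness. The hyperparameter defaults are illustrative, not from the paper.
    """
    rng = np.random.default_rng(seed)
    n_features = X_train.shape[1]

    def fitness(mask):
        if not mask.any():          # an empty feature subset is worthless
            return 0.0
        return knn_accuracy(X_train[:, mask], y_train, X_val[:, mask], y_val)

    # Positions are 0.0/1.0 vectors: 1.0 means "feature selected".
    pos = (rng.random((n_particles, n_features)) < 0.5).astype(float)
    vel = rng.uniform(-1.0, 1.0, (n_particles, n_features))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p > 0.5) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iter):
        r1 = rng.random(vel.shape)
        r2 = rng.random(vel.shape)
        # Standard PSO velocity update toward personal and global bests.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -6.0, 6.0)
        # Sigmoid maps each velocity to the probability that the bit is 1.
        prob = 1.0 / (1.0 + np.exp(-vel))
        pos = (rng.random(vel.shape) < prob).astype(float)

        fit = np.array([fitness(p > 0.5) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    return gbest > 0.5, float(gbest_fit)


def make_toy_data(n_samples=80, n_features=20, n_informative=4, seed=0):
    """Synthetic stand-in for a bag-of-words matrix (not the real dataset)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, n_samples)
    X = rng.poisson(1.0, (n_samples, n_features)).astype(float)
    # Only the first n_informative columns carry label information.
    X[:, :n_informative] += 3.0 * y[:, None]
    return X, y


X, y = make_toy_data()
X_train, y_train, X_val, y_val = X[:60], y[:60], X[60:], y[60:]
mask, acc = bpso_select(X_train, y_train, X_val, y_val)
print("selected feature indices:", np.flatnonzero(mask))
print("validation accuracy with selected subset:", acc)
```

In a fuller pipeline the selected mask would be applied to the vectorized text before training SVM, NB, and k-NN, and the final accuracy would be measured on a held-out test split that played no part in the selection.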
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
S - Specific research at universities
Other
Year of implementation
2022
Data confidentiality code
S - Complete and truthful data about the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in proceedings
International Conference on Electrical, Computer, and Energy Technologies, ICECET 2022
ISBN
978-1-66547-087-2
ISSN
—
e-ISSN
—
Number of pages
4
Pages from-to
1-4
Publisher name
Institute of Electrical and Electronics Engineers Inc.
Place of publication
Piscataway, New Jersey
Event location
Praha
Event date
20 July 2022
Event type by nationality
WRD - Worldwide event
UT WoS article code
—