Shrinkage Linear with Quadratic Gaussian Discriminant Analysis for Big Data Classification
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62690094%3A18470%2F22%3A50019404" target="_blank" >RIV/62690094:18470/22:50019404 - isvavai.cz</a>
Result on the web
<a href="https://www.techscience.com/iasc/v34n3/47913" target="_blank" >https://www.techscience.com/iasc/v34n3/47913</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.32604/iasc.2022.024539" target="_blank" >10.32604/iasc.2022.024539</a>
Alternative languages
Result language
English
Title in original language
Shrinkage Linear with Quadratic Gaussian Discriminant Analysis for Big Data Classification
Result description in original language
The generation of massive data is increasing in big data industries due to the evolution of modern technologies. Big data industries draw on data sources such as sensors, the Internet of Things, and digital and social media. In particular, these big data systems consist of data extraction, preprocessing, integration, analysis, and visualization mechanisms. The data encountered from these sources are redundant, incomplete, and conflicting. Moreover, in real-time applications, interpreting all the data from different sources is a tedious process. In this paper, the gathered data are preprocessed to handle issues such as redundancy, incompleteness, and conflict. For that purpose, a generalized dimensionality reduction technique called Shrinkage Linear Discriminant Analysis (SLDA) is proposed. As a result, SLDA improves the performance of the classifier with generalization. Even though dimensionality reduction systems improve the performance of the classifier, irrelevant features further degrade the performance of the system. Hence, the relevant and most important features are selected using a Pearson correlation-based feature selection technique, which selects a subset of correlated features to improve the performance of the classification system. The selected features are classified using the proposed Quadratic-Gaussian Discriminant Analysis (QGDA) classifier. The proposed techniques are tested on the localization and cover datasets from the University of California Irvine (UCI) machine learning repository. In addition, the proposed techniques are evaluated on these datasets with the evaluation metrics and compared to other similar methods, which demonstrates the efficiency of the proposed classification system. It has achieved better performance: the acquired accuracy is over 91% for all the experiments on these datasets. Based on the results evaluated in terms of training percentage and mapper, it is meaningful to conclude that the proposed method could be used for big data classification.
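The Pearson correlation-based feature selection step named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "correlated features" means features ranked by the absolute Pearson correlation between each feature column and a numeric class label, and the function names `pearson` and `select_features`, the parameter `k`, and the toy data are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_features(rows, labels, k):
    """Return the sorted indices of the k features most correlated with the labels.

    Assumption (not from the source): relevance is scored as |r| between
    each feature column and the numeric class label.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in scores[:k])


# Hypothetical toy data: feature 0 tracks the label, feature 1 is constant,
# feature 2 is only weakly related to the label.
rows = [
    [0.1, 5.0, 3.0],
    [0.2, 5.0, 1.0],
    [0.9, 5.0, 2.0],
    [1.1, 5.0, 2.5],
]
labels = [0, 0, 1, 1]
selected = select_features(rows, labels, k=2)
```

In this toy case the constant column scores zero and is dropped, so only the two informative columns would be passed on to a classifier such as the QGDA stage the description mentions.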
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Intelligent Automation & Soft Computing: An International Journal
ISSN
1079-8587
e-ISSN
2326-005X
Volume of the periodical
34
Issue of the periodical within the volume
3
Country of the periodical's publisher
US - United States of America
Number of pages
16
Pages from-to
1803-1818
UT WoS code of the article
000809701500005
EID of the result in the Scopus database
2-s2.0-85131252313