Sparse least-squares Universum twin bounded support vector machine with adaptive Lp-norms and feature selection
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F24%3A10488658" target="_blank" >RIV/00216208:11320/24:10488658 - isvavai.cz</a>
Alternative codes found
RIV/44555601:13440/24:43898374
Result on the web
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=D_LdFwE3ym" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=D_LdFwE3ym</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.eswa.2024.123378" target="_blank" >10.1016/j.eswa.2024.123378</a>
Alternative languages
Result language
English
Title in original language
Sparse least-squares Universum twin bounded support vector machine with adaptive Lp-norms and feature selection
Result description in original language
Classification problems in data analysis often involve a large number of features. Not all of them are relevant to the task at hand, and including irrelevant features can degrade learning performance. Selecting the most relevant features is therefore critical, especially for high-dimensional data sets. Feature selection addresses this issue by representing the original data with a subset of relevant features that carry useful information. In this research, we propose a p-norm least-squares Universum twin bounded support vector machine (LSp-UTBSVM) that performs classification and feature selection simultaneously. The proposed method outperforms the traditional least-squares Universum twin bounded support vector machine: it achieves good classification accuracy in a reasonable amount of time while also providing a sparse solution. The model is an adaptive learning procedure with a p-norm (0 < p < 1), where the parameter p can be selected automatically from the data set. The algorithm used to find an approximate solution of this model involves solving systems of linear equations. Furthermore, we obtain new bounds for the absolute values of the non-zero components of a local optimal solution. These bounds allow us to remove the zero components from an arbitrary numerical solution. By tuning the parameter p, LSp-UTBSVM improves classification accuracy and selects the relevant features. Numerical experiments on handwritten digit recognition, University of California Irvine (UCI) benchmark, Normally Distributed Clusters (NDC), and high-dimensional data sets confirm the superiority of the proposed method over several popular methods in both classification accuracy and the selection of relevant features.
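The abstract mentions two mechanical ingredients: an approximate solution obtained by solving systems of linear equations, and the removal of components that are effectively zero. The sketch below illustrates that general pattern on a much simpler problem, a p-norm-regularised least-squares regression solved by iteratively reweighted least squares (IRLS). It is not the LSp-UTBSVM algorithm from the paper. The function names, the smoothing constant `eps`, the ridge starting point, and the fixed cutoff `tol` (standing in for the paper's theoretical bounds) are all assumptions made for illustration.

```python
import numpy as np


def lp_irls(X, y, lam=1.0, p=0.5, eps=1e-8, n_iter=50):
    """Approximately minimise ||Xw - y||^2 + lam * sum_j |w_j|^p for 0 < p < 1.

    Generic iteratively reweighted least squares, NOT the paper's solver.
    |w_j|^p is smoothed as (w_j^2 + eps)^(p/2) so the weights stay finite.
    """
    gram = X.T @ X
    rhs = X.T @ y
    # Ridge solution as a starting point (an arbitrary but common choice).
    w = np.linalg.solve(gram + lam * np.eye(X.shape[1]), rhs)
    for _ in range(n_iter):
        # With the weights frozen at the current w, stationarity of the
        # smoothed objective is the linear system
        #   (X^T X + (lam * p / 2) * D) w = X^T y,
        # where D = diag((w_j^2 + eps)^(p/2 - 1)).
        d = (w ** 2 + eps) ** (p / 2 - 1)
        w = np.linalg.solve(gram + 0.5 * lam * p * np.diag(d), rhs)
    return w


def select_features(w, tol=1e-3):
    """Indices of coefficients treated as non-zero (tol is a hypothetical cutoff)."""
    return np.flatnonzero(np.abs(w) > tol)


if __name__ == "__main__":
    # Synthetic data: only the first three of ten features carry signal.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 10))
    w_true = np.zeros(10)
    w_true[:3] = [2.0, -3.0, 1.5]
    y = X @ w_true + 0.01 * rng.standard_normal(200)
    w = lp_irls(X, y, lam=1.0, p=0.5)
    print("selected features:", select_features(w))
```

Each pass solves one linear system, and small coefficients receive ever larger penalty weights, which drives them toward zero. This is why a threshold step can then discard them.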
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
50201 - Economic Theory
Result continuities
Project
<a href="/cs/project/GA22-11117S" target="_blank" >GA22-11117S: Global sensitivity analysis and stability in optimization problems</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Expert Systems with Applications
ISSN
0957-4174
e-ISSN
1873-6793
Volume of the periodical
248
Issue of the periodical within the volume
Not specified
Country of the periodical's publisher
US - United States of America
Number of pages
23
Pages from-to
123378
UT WoS code of the article
001179240700001
EID of the result in the Scopus database
2-s2.0-85183984633