OCFSP: self-supervised one-class classification approach using feature-slide prediction subtask for feature data
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F62690094%3A18470%2F22%3A50020122" target="_blank" >RIV/62690094:18470/22:50020122 - isvavai.cz</a>
Result on the web
<a href="https://link.springer.com/article/10.1007/s00500-022-07414-z" target="_blank" >https://link.springer.com/article/10.1007/s00500-022-07414-z</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s00500-022-07414-z" target="_blank" >10.1007/s00500-022-07414-z</a>
Alternative languages
Result language
English
Title in original language
OCFSP: self-supervised one-class classification approach using feature-slide prediction subtask for feature data
Result description in original language
One-class classification (OCC) is a machine learning problem in which the training data contain only one class. Recently, self-supervised OCC algorithms have attracted increasing attention. These algorithms train a model on pretext tasks and use the model's error for OCC. However, these tasks are specialized for images, and applying them to feature data is neither practical nor appropriate. The motivation of this study is to apply self-supervised OCC to feature data. To this end, the paper proposes an OCC approach that uses a feature-slide prediction (FSP) subtask for feature data (OCFSP). The main novelty is the FSP subtask, which is the first classification subtask for feature data. Specifically, the proposed method creates a self-labeled dataset by generating additional feature vectors through feature slides of the original vectors and labeling each generated vector with its number of slides. This dataset is used to train a multi-class classifier to predict the number of feature slides. Because the classifier learns from data of only one class, FSP accuracy is higher for the seen class than for unseen classes. Accordingly, OCC can be performed using FSP accuracy. The proposed method is evaluated on the imbalanced-learn, covtype, and kddcup datasets. OCFSP shows fair accuracy when little training data is available. In addition, unlike with image data, the classification subtask for feature data has a relatively fast testing speed. Therefore, the bottleneck of the self-supervised approach is considered to be memory size, which is the main difference between image and feature data. Source code is available at https://github.com/ToshiHayashi/OCFSP
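The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the repository linked above) and rests on several assumptions: a "feature slide" is modeled as a circular shift of the feature vector, a small nearest-centroid classifier stands in for the multi-class classifier, the data are synthetic, and all names (`make_fsp_dataset`, `NearestCentroid`, `fsp_score`) are hypothetical.

```python
import numpy as np


def make_fsp_dataset(X, n_slides):
    """Build a self-labeled dataset: each row is slid by 0..n_slides-1 positions.

    Assumption: a "feature slide" is modeled as a circular shift (np.roll).
    """
    X = np.asarray(X, dtype=float)
    slid = [np.roll(X, shift=k, axis=1) for k in range(n_slides)]
    # Label every generated vector with the number of slides applied to it.
    labels = np.repeat(np.arange(n_slides), len(X))
    return np.vstack(slid), labels


class NearestCentroid:
    """Tiny stand-in multi-class classifier (an assumption, not the paper's model)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Squared Euclidean distance from every row to every class centroid.
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]


def fsp_score(model, X, n_slides):
    """Per-sample FSP accuracy: share of slid copies whose slide count is predicted correctly."""
    X_slid, y = make_fsp_dataset(X, n_slides)
    correct = (model.predict(X_slid) == y).reshape(n_slides, len(X))
    return correct.mean(axis=0)


rng = np.random.default_rng(0)
n_features = 4

# Synthetic "seen" class: each feature has a distinct mean.
seen_mean = np.array([0.0, 3.0, 6.0, 9.0])
X_train = seen_mean + 0.1 * rng.standard_normal((200, n_features))
X_seen = seen_mean + 0.1 * rng.standard_normal((50, n_features))

# Synthetic "unseen" class: same features in reversed order.
X_unseen = seen_mean[::-1] + 0.1 * rng.standard_normal((50, n_features))

# Train the slide-count classifier on the seen class only.
X_fsp, y_fsp = make_fsp_dataset(X_train, n_slides=n_features)
model = NearestCentroid().fit(X_fsp, y_fsp)

print("mean FSP accuracy, seen class:  ", fsp_score(model, X_seen, n_features).mean())
print("mean FSP accuracy, unseen class:", fsp_score(model, X_unseen, n_features).mean())
```

In this sketch a test sample would be accepted as belonging to the seen class when its FSP accuracy exceeds a chosen threshold. The threshold and the number of slides are free parameters here, and the abstract does not specify them.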
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organisation
Others
Year of implementation
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Soft Computing
ISSN
1432-7643
e-ISSN
1433-7479
Volume of the periodical
26
Issue of the periodical within the volume
19
Country of the periodical's publisher
US - United States of America
Number of pages of the result
23
Pages from-to
10127-10149
UT WoS code of the article
000838499600003
EID of the result in the Scopus database
2-s2.0-85136922796