Do We Need to Observe Features to Perform Feature Selection?
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21240%2F18%3A00323850" target="_blank" >RIV/68407700:21240/18:00323850 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in original language
Do We Need to Observe Features to Perform Feature Selection?
Result description in original language
Many feature selection methods have been developed, but at their core they all work the same way: you pass a set of features to the algorithm and get back a reduced set. But can we perform non-trivial feature selection without first observing the features? This question matters because if we could predict feature importance before observing the features, we would reduce the computational requirements of every stage of the machine learning process, beginning with feature engineering. In this article, we argue that it is possible to predict feature importance before observing the feature vector. The trick is to use meta-features about the features to perform the selection. We evaluate the concept on 15 relational databases. On average, generating the top decile of all features was enough to reach the same model accuracy as generating all features and passing them to the model.
Title in English
Do We Need to Observe Features to Perform Feature Selection?
Result description in English
Many feature selection methods have been developed, but at their core they all work the same way: you pass a set of features to the algorithm and get back a reduced set. But can we perform non-trivial feature selection without first observing the features? This question matters because if we could predict feature importance before observing the features, we would reduce the computational requirements of every stage of the machine learning process, beginning with feature engineering. In this article, we argue that it is possible to predict feature importance before observing the feature vector. The trick is to use meta-features about the features to perform the selection. We evaluate the concept on 15 relational databases. On average, generating the top decile of all features was enough to reach the same model accuracy as generating all features and passing them to the model.
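The idea in the description above can be illustrated with a minimal sketch. This is not the authors' method: the record does not say which meta-features or which prediction model the paper uses, so the meta-feature names (`aggregate`, `column_type`), the averaging scorer, and all function names below are assumptions chosen for illustration. The sketch learns a score per meta-feature value from features that were already generated and evaluated elsewhere, then ranks new candidate features by their descriptions alone and keeps the top fraction, without computing any feature values.

```python
from collections import defaultdict
from math import ceil


def fit_meta_scorer(history):
    """Learn the mean observed importance for each meta-feature value.

    history: list of (meta, importance) pairs for features that were already
    generated and scored. meta is a dict of hypothetical meta-features.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for meta, importance in history:
        for key_value in meta.items():
            sums[key_value] += importance
            counts[key_value] += 1
    return {kv: sums[kv] / counts[kv] for kv in sums}


def predict_importance(scorer, meta, default=0.0):
    """Average the learned scores of a candidate's meta-feature values."""
    scores = [scorer.get(kv, default) for kv in meta.items()]
    return sum(scores) / len(scores) if scores else default


def select_top_fraction(scorer, candidates, fraction=0.1):
    """Return the names of the top `fraction` of candidates.

    candidates: dict mapping feature name -> meta dict. Only the feature
    descriptions are inspected; no feature values are computed.
    """
    ranked = sorted(
        candidates,
        key=lambda name: predict_importance(scorer, candidates[name]),
        reverse=True,
    )
    keep = max(1, ceil(fraction * len(candidates)))
    return ranked[:keep]


# Illustrative data only (made up for this sketch, not taken from the paper).
history = [
    ({"aggregate": "avg", "column_type": "numeric"}, 0.8),
    ({"aggregate": "count", "column_type": "id"}, 0.1),
    ({"aggregate": "max", "column_type": "numeric"}, 0.6),
    ({"aggregate": "count", "column_type": "numeric"}, 0.3),
]
candidates = {
    "avg_amount": {"aggregate": "avg", "column_type": "numeric"},
    "count_id": {"aggregate": "count", "column_type": "id"},
    "max_amount": {"aggregate": "max", "column_type": "numeric"},
}

scorer = fit_meta_scorer(history)
selected = select_top_fraction(scorer, candidates, fraction=0.1)
print(selected)
```

Only the features named in `selected` would then need to be generated and passed to the model. A real system would likely replace the averaging scorer with a trained regression model over richer meta-features.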
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/cs/project/GA18-18080S" target="_blank" >GA18-18080S: Fusion-Based Knowledge Discovery in Human Activity Data</a><br>
Continuities
S - Specific university research
Others
Publication year
2018
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
Proceedings of the 18th Conference Information Technologies - Applications and Theory (ITAT 2018)
ISBN
9781727267198
ISSN
—
e-ISSN
1613-0073
Number of pages
8
Pages from-to
44-51
Publisher name
CEUR Workshop Proceedings
Place of publication
Aachen
Event location
Krompachy
Event date
21. 9. 2018
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—