Genetic algorithm for entropy-based feature subset selection
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F16%3A86099084" target="_blank" >RIV/61989100:27240/16:86099084 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1109/CEC.2016.7744360" target="_blank" >http://dx.doi.org/10.1109/CEC.2016.7744360</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/CEC.2016.7744360" target="_blank" >10.1109/CEC.2016.7744360</a>
Alternative languages
Result language
English
Title in the original language
Genetic algorithm for entropy-based feature subset selection
Result description in the original language
Today's data-driven society generates very large volumes of high-dimensional data. Processing such data efficiently with established methods is increasingly challenging, and novel advanced approaches are needed. Feature selection is a traditional data pre-processing strategy that can be used to reduce the volume and complexity of data. It selects a subset of data features so that the data volume is reduced while its information content is maintained. Evolutionary feature selection methods have already shown a good ability to identify feature subsets in very high-dimensional data sets according to selected criteria. Their efficiency depends on, among other things, the feature subset representation and the definition of the objective function. This work employs a recent genetic algorithm for fixed-length subset selection to find feature subsets on the basis of their entropy, estimated by a fast data compression method. The reasonableness of this new fitness criterion and the usefulness of the selected feature subsets for practical data mining are evaluated using well-known data sets and several widely used classification algorithms. © 2016 IEEE.
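As an illustration only, the idea of scoring a fixed-length feature subset by a compression-based entropy estimate can be sketched as follows. This is a minimal sketch under stated assumptions, not the method from the paper: the use of `zlib`, the bits-per-value measure, the toy data, and the helper names `compressed_bits_per_value` and `random_subset` are all hypothetical choices made for this example. The record does not say which compressor or genetic operators the authors used.

```python
import random
import zlib


def compressed_bits_per_value(rows, subset):
    """Rough entropy estimate: compressed size in bits per selected value.

    Assumes every feature value is an integer in the range 0..255.
    """
    # Serialize only the selected features, one column after another.
    payload = bytes(row[j] for j in subset for row in rows)
    compressed = zlib.compress(payload, 1)  # level 1 is zlib's fastest setting
    return 8 * len(compressed) / len(payload)


def random_subset(n_features, k, rng):
    """Draw a fixed-length subset of k distinct feature indices."""
    return tuple(sorted(rng.sample(range(n_features), k)))


rng = random.Random(0)
# Toy data (an assumption): feature 0 is constant, features 1-3 are random bytes.
rows = [[7, rng.randrange(256), rng.randrange(256), rng.randrange(256)]
        for _ in range(500)]

low = compressed_bits_per_value(rows, (0,))   # highly compressible column
high = compressed_bits_per_value(rows, (1,))  # nearly incompressible column
candidate = random_subset(4, 2, rng)          # one fixed-length individual
score = compressed_bits_per_value(rows, candidate)
```

A genetic algorithm could use a value like `score` as the fitness of each candidate subset. Whether and how the estimate is combined with other criteria is left open here, since the abstract does not specify it.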
Classification
Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—
Result linkages
Project
<a href="/cs/project/GJ16-25694Y" target="_blank" >GJ16-25694Y: Mnohoparadigmatické algoritmy dolování z dat založené na vyhledávání, fuzzy technologiích a bio-inspirovaných výpočtech (Multi-paradigm data mining algorithms based on search, fuzzy technologies and bio-inspired computing)</a>
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2016
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
2016 IEEE Congress on Evolutionary Computation, CEC 2016
ISBN
978-1-5090-0622-9
ISSN
—
e-ISSN
—
Number of pages
8
Pages from-to
4486-4493
Publisher name
Institute of Electrical and Electronics Engineers
Place of publication
New York
Event location
Vancouver
Event date
24 July 2016
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
000390749104088