Classification with Costly Features in Hierarchical Deep Sets
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F24%3A00375997" target="_blank" >RIV/68407700:21230/24:00375997 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1007/s10994-024-06565-4" target="_blank" >https://doi.org/10.1007/s10994-024-06565-4</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10994-024-06565-4" target="_blank" >10.1007/s10994-024-06565-4</a>
Alternative languages
Result language
English
Original language name
Classification with Costly Features in Hierarchical Deep Sets
Original language description
Classification with costly features (CwCF) is a classification problem that includes the cost of features in the optimization criteria. Individually for each sample, its features are sequentially acquired to maximize accuracy while minimizing the acquired features' cost. However, existing approaches can only process data that can be expressed as vectors of fixed length. In real life, the data often possesses rich and complex structure, which can be more precisely described with formats such as XML or JSON. The data is hierarchical and often contains nested lists of objects. In this work, we extend an existing deep reinforcement learning-based algorithm with hierarchical deep sets and hierarchical softmax, so that it can directly process this data. The extended method has greater control over which features it can acquire and, in experiments with seven datasets, we show that this leads to superior performance. To showcase the real usage of the new method, we apply it to a real-life problem of classifying malicious web domains, using an online service.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
The result was created during the realization of more than one project. More information is available in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Machine Learning
ISSN
0885-6125
e-ISSN
1573-0565
Volume of the periodical
113
Issue of the periodical within the volume
7
Country of publishing house
US - UNITED STATES
Number of pages
36
Pages from-to
4487-4522
UT code for WoS article
001229224200001
EID of the result in the Scopus database
2-s2.0-85193792574