Benchmarking state-of-the-art symbolic regression algorithms
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F21%3A00340875" target="_blank" >RIV/68407700:21230/21:00340875 - isvavai.cz</a>
Nalezeny alternativní kódy
RIV/68407700:21730/21:00340875
Výsledek na webu
<a href="https://doi.org/10.1007/s10710-020-09387-0" target="_blank" >https://doi.org/10.1007/s10710-020-09387-0</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10710-020-09387-0" target="_blank" >10.1007/s10710-020-09387-0</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Benchmarking state-of-the-art symbolic regression algorithms
Popis výsledku v původním jazyce
Symbolic regression (SR) is a powerful method for building predictive models from data without assuming any model structure. Traditionally, genetic programming (GP) was used as the SR engine. However, for these purely evolutionary methods it was quite hard to even accommodate the function to the range of the data and the training was consequently inefficient and slow. Recently, several SR algorithms emerged which employ multiple linear regression. This allows the algorithms to create models with relatively small error right from the beginning of the search. Such algorithms are claimed to be by orders of magnitude faster than SR algorithms based on classic GP. However, a systematic comparison of these algorithms on a common set of problems is still missing and there is no basis on which to decide which algorithm to use. In this paper we conceptually and experimentally compare several representatives of such algorithms: GPTIPS, FFX, and EFS. We also include GSGP-Red, which is an enhanced version of geometric semantic genetic programming, an important algorithm in the field of SR. They are applied as off-the-shelf, ready-to-use techniques, mostly using their default settings. The methods are compared on several synthetic SR benchmark problems as well as real-world ones ranging from civil engineering to aerodynamics and acoustics. Their performance is also related to the performance of three conventional machine learning algorithms: multiple regression, random forests and support vector regression. The results suggest that across all the problems, the algorithms have comparable performance. We provide basic recommendations to the user regarding the choice of the algorithm.
Název v anglickém jazyce
Benchmarking state-of-the-art symbolic regression algorithms
Popis výsledku anglicky
Symbolic regression (SR) is a powerful method for building predictive models from data without assuming any model structure. Traditionally, genetic programming (GP) was used as the SR engine. However, for these purely evolutionary methods it was quite hard to even accommodate the function to the range of the data and the training was consequently inefficient and slow. Recently, several SR algorithms emerged which employ multiple linear regression. This allows the algorithms to create models with relatively small error right from the beginning of the search. Such algorithms are claimed to be by orders of magnitude faster than SR algorithms based on classic GP. However, a systematic comparison of these algorithms on a common set of problems is still missing and there is no basis on which to decide which algorithm to use. In this paper we conceptually and experimentally compare several representatives of such algorithms: GPTIPS, FFX, and EFS. We also include GSGP-Red, which is an enhanced version of geometric semantic genetic programming, an important algorithm in the field of SR. They are applied as off-the-shelf, ready-to-use techniques, mostly using their default settings. The methods are compared on several synthetic SR benchmark problems as well as real-world ones ranging from civil engineering to aerodynamics and acoustics. Their performance is also related to the performance of three conventional machine learning algorithms: multiple regression, random forests and support vector regression. The results suggest that across all the problems, the algorithms have comparable performance. We provide basic recommendations to the user regarding the choice of the algorithm.
Klasifikace
Druh
J<sub>imp</sub> - Článek v periodiku v databázi Web of Science
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Návaznosti výsledku
Projekt
<a href="/cs/project/GA15-22731S" target="_blank" >GA15-22731S: Symbolická regrese pro posilované učení ve spojitých prostorech</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)
Ostatní
Rok uplatnění
2021
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název periodika
Genetic Programming and Evolvable Machines
ISSN
1389-2576
e-ISSN
1573-7632
Svazek periodika
22
Číslo periodika v rámci svazku
1
Stát vydavatele periodika
US - Spojené státy americké
Počet stran výsledku
29
Strana od-do
5-33
Kód UT WoS článku
000521687000001
EID výsledku v databázi Scopus
2-s2.0-85083357133