Benchmarking state-of-the-art symbolic regression algorithms
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F21%3A00340875" target="_blank" >RIV/68407700:21230/21:00340875 - isvavai.cz</a>
Alternative codes found
RIV/68407700:21730/21:00340875
Result on the web
<a href="https://doi.org/10.1007/s10710-020-09387-0" target="_blank" >https://doi.org/10.1007/s10710-020-09387-0</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10710-020-09387-0" target="_blank" >10.1007/s10710-020-09387-0</a>
Alternative languages
Result language
English
Original language name
Benchmarking state-of-the-art symbolic regression algorithms
Original language description
Symbolic regression (SR) is a powerful method for building predictive models from data without assuming any model structure. Traditionally, genetic programming (GP) was used as the SR engine. However, for these purely evolutionary methods it was hard even to fit the function to the range of the data, and training was consequently inefficient and slow. Recently, several SR algorithms have emerged that employ multiple linear regression. This allows the algorithms to create models with relatively small error right from the beginning of the search. Such algorithms are claimed to be orders of magnitude faster than SR algorithms based on classic GP. However, a systematic comparison of these algorithms on a common set of problems is still missing, and there is no basis on which to decide which algorithm to use. In this paper, we conceptually and experimentally compare several representatives of such algorithms: GPTIPS, FFX, and EFS. We also include GSGP-Red, which is an enhanced version of geometric semantic genetic programming, an important algorithm in the field of SR. They are applied as off-the-shelf, ready-to-use techniques, mostly using their default settings. The methods are compared on several synthetic SR benchmark problems as well as real-world ones ranging from civil engineering to aerodynamics and acoustics. Their performance is also related to the performance of three conventional machine learning algorithms: multiple regression, random forests, and support vector regression. The results suggest that across all the problems, the algorithms have comparable performance. We provide basic recommendations to the user regarding the choice of the algorithm.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/GA15-22731S" target="_blank" >GA15-22731S: Symbolic Regression for Reinforcement Learning in Continuous Spaces</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Genetic Programming and Evolvable Machines
ISSN
1389-2576
e-ISSN
1573-7632
Volume of the periodical
22
Issue of the periodical within the volume
1
Country of publishing house
US - UNITED STATES
Number of pages
29
Pages from-to
5-33
UT code for WoS article
000521687000001
EID of the result in the Scopus database
2-s2.0-85083357133