Selecting Informative Data Samples for Model Learning Through Symbolic Regression
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F21%3A00346328" target="_blank" >RIV/68407700:21230/21:00346328 - isvavai.cz</a>
Nalezeny alternativní kódy
RIV/68407700:21730/21:00346328
Výsledek na webu
<a href="https://doi.org/10.1109/ACCESS.2021.3052130" target="_blank" >https://doi.org/10.1109/ACCESS.2021.3052130</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ACCESS.2021.3052130" target="_blank" >10.1109/ACCESS.2021.3052130</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Selecting Informative Data Samples for Model Learning Through Symbolic Regression
Popis výsledku v původním jazyce
Continual model learning for nonlinear dynamic systems, such as autonomous robots, presents several challenges. First, it tends to be computationally expensive as the amount of data collected by the robot quickly grows in time. Second, the model accuracy is impaired when data from repetitive motions prevail in the training set and outweigh scarcer samples that also capture interesting properties of the system. It is not known in advance which samples will be useful for model learning. Therefore, effective methods need to be employed to select informative training samples from the continuous data stream collected by the robot. Existing literature does not give any guidelines as to which of the available sample-selection methods are suitable for such a task. In this paper, we compare five sample-selection methods, including a novel method using the model prediction error. We integrate these methods into a model learning framework based on symbolic regression, which allows for learning accurate models in the form of analytic equations. Unlike the currently popular data-hungry deep learning methods, symbolic regression is able to build models even from very small training data sets. We demonstrate the approach on two real robots: the TurtleBot mobile robot and the Parrot Bebop drone. The results show that an accurate model can be constructed even from training sets as small as 24 samples. Informed sample-selection techniques based on prediction error and model variance clearly outperform uninformed methods, such as sequential or random selection.
Název v anglickém jazyce
Selecting Informative Data Samples for Model Learning Through Symbolic Regression
Popis výsledku anglicky
Continual model learning for nonlinear dynamic systems, such as autonomous robots, presents several challenges. First, it tends to be computationally expensive as the amount of data collected by the robot quickly grows in time. Second, the model accuracy is impaired when data from repetitive motions prevail in the training set and outweigh scarcer samples that also capture interesting properties of the system. It is not known in advance which samples will be useful for model learning. Therefore, effective methods need to be employed to select informative training samples from the continuous data stream collected by the robot. Existing literature does not give any guidelines as to which of the available sample-selection methods are suitable for such a task. In this paper, we compare five sample-selection methods, including a novel method using the model prediction error. We integrate these methods into a model learning framework based on symbolic regression, which allows for learning accurate models in the form of analytic equations. Unlike the currently popular data-hungry deep learning methods, symbolic regression is able to build models even from very small training data sets. We demonstrate the approach on two real robots: the TurtleBot mobile robot and the Parrot Bebop drone. The results show that an accurate model can be constructed even from training sets as small as 24 samples. Informed sample-selection techniques based on prediction error and model variance clearly outperform uninformed methods, such as sequential or random selection.
Klasifikace
Druh
J<sub>imp</sub> - Článek v periodiku v databázi Web of Science
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Návaznosti výsledku
Projekt
<a href="/cs/project/EF15_003%2F0000470" target="_blank" >EF15_003/0000470: Robotika pro Průmysl 4.0</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)<br>S - Specificky vyzkum na vysokych skolach
Ostatní
Rok uplatnění
2021
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název periodika
IEEE Access
ISSN
2169-3536
e-ISSN
2169-3536
Svazek periodika
9
Číslo periodika v rámci svazku
January
Stát vydavatele periodika
US - Spojené státy americké
Počet stran výsledku
11
Strana od-do
14148-14158
Kód UT WoS článku
000613205000001
EID výsledku v databázi Scopus
2-s2.0-85099724321