Modular Reinforcement Learning In Long-Horizon Manipulation Tasks
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21730%2F24%3A00377216" target="_blank" >RIV/68407700:21730/24:00377216 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1007/978-3-031-72359-9_22" target="_blank" >https://doi.org/10.1007/978-3-031-72359-9_22</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-031-72359-9_22" target="_blank" >10.1007/978-3-031-72359-9_22</a>
Alternative languages
Result language
English
Title in the original language
Modular Reinforcement Learning In Long-Horizon Manipulation Tasks
Result description in the original language
Recently, a number of reinforcement learning (RL) algorithms have been proposed in the area of robotic manipulation. As most of the current robotic benchmarks focus on simple, non-diverse tasks such as the translation of objects within the scene, various single-policy algorithms are able to solve them with a high success rate. However, when a sequence of diverse subgoals is required (translation, rotation, 6DOF manipulation, trajectory following), single-policy networks are shown to fail. In this work, we propose two modular multi-policy algorithms (MultiPPO2 and MultiACKTR) that improve performance on diverse long-horizon tasks by adopting a separate policy for each skill, each following its own subgoal. We tested our algorithms in a virtual robotic simulator on both single-step and multi-step tasks requiring non-diverse (translation) skills as well as diverse (translation, rotation and path following) skills. Both algorithms (MultiPPO2 and MultiACKTR) achieved performance similar to that of single-policy algorithms in the single-ste
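The description above mentions adopting a separate policy for each skill, with each policy following its own subgoal. The following is a minimal sketch of that general idea in Python, not the authors' MultiPPO2 or MultiACKTR implementation. All names (`Subgoal`, `ModularAgent`, `towards_target`, `run_episode`) are hypothetical, the "policies" are hand-written stand-ins for trained networks, and the shared three-value state is a toy assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Policy = Callable[[List[float]], List[float]]


@dataclass
class Subgoal:
    skill: str                # hypothetical skill label, e.g. "translation"
    target: Sequence[float]   # toy target in the same space as the state


class ModularAgent:
    """Holds one policy per skill and dispatches on the active subgoal."""

    def __init__(self, policies: Dict[str, Policy]):
        self.policies = policies

    def act(self, state: Sequence[float], subgoal: Subgoal) -> List[float]:
        # Select the policy that belongs to the skill of the current subgoal.
        policy = self.policies[subgoal.skill]
        # The policy input is the state concatenated with the subgoal target.
        return policy(list(state) + list(subgoal.target))


def towards_target(scale: float) -> Policy:
    """Toy stand-in for a trained policy: step from the state to the target."""
    def policy(inp: List[float]) -> List[float]:
        half = len(inp) // 2
        state, target = inp[:half], inp[half:]
        return [scale * (t - s) for s, t in zip(state, target)]
    return policy


def run_episode(agent: ModularAgent, subgoals: List[Subgoal],
                state: Sequence[float], tol: float = 1e-2,
                max_steps: int = 200) -> int:
    """Execute the subgoals in order and return how many were reached."""
    state = list(state)
    reached = 0
    for subgoal in subgoals:
        for _ in range(max_steps):
            error = max(abs(t - s) for s, t in zip(state, subgoal.target))
            if error < tol:
                reached += 1
                break
            action = agent.act(state, subgoal)
            state = [s + a for s, a in zip(state, action)]
    return reached


agent = ModularAgent({
    "translation": towards_target(0.5),
    "rotation": towards_target(0.25),
})
plan = [Subgoal("translation", [1.0, 0.0, 0.0]),
        Subgoal("rotation", [0.0, 1.0, 0.0])]
print(run_episode(agent, plan, [0.0, 0.0, 0.0]))
```

In this sketch the switch between policies is driven only by the ordered list of subgoals; how the actual algorithms decide when to switch, and how the per-skill policies are trained, is not stated in the record.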
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
The result was produced within multiple projects.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Proceedings title
Artificial Neural Networks and Machine Learning – ICANN 2024: 33rd International Conference on Artificial Neural Networks, Lugano, Switzerland, September 17–20, 2024, Proceedings, Part IX
ISBN
978-3-031-72356-8
ISSN
0302-9743
e-ISSN
1611-3349
Number of pages
14
Pages from-to
299-312
Publisher name
Springer, Cham
Place of publication
—
Event location
Lugano-Viganello
Event date
17 September 2024
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
001331898500022