Decentralized Reinforcement Learning of Robot Behaviors

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21730%2F18%3A00316453" target="_blank" >RIV/68407700:21730/18:00316453 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1016/j.artint.2017.12.001" target="_blank" >https://doi.org/10.1016/j.artint.2017.12.001</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.artint.2017.12.001" target="_blank" >10.1016/j.artint.2017.12.001</a>

Alternative languages

Result language
English
Title in original language
Decentralized Reinforcement Learning of Robot Behaviors
Result description in original language
A multi-agent methodology is proposed for Decentralized Reinforcement Learning (DRL) of individual behaviors in problems that involve multi-dimensional action spaces. With this methodology, sub-tasks are learned in parallel by individual agents working toward a common goal. In addition to the methodology, three specific multi-agent DRL approaches are considered: DRL-Independent, DRL-Cooperative-Adaptive (DRL-CA), and DRL-Lenient. These approaches are validated and analyzed in an extensive empirical study using four problems: 3D Mountain Car, SCARA Real-Time Trajectory Generation, Ball-Dribbling in humanoid soccer robotics, and Ball-Pushing using differential drive robots. The experimental validation provides evidence that DRL implementations achieve better performance and faster learning times than their centralized counterparts, while using fewer computational resources. The DRL-Lenient and DRL-CA algorithms achieve the best final performance on all four tested problems, outperforming their DRL-Independent counterparts. Furthermore, the benefits of DRL-Lenient and DRL-CA become more noticeable as problem complexity increases and the centralized scheme becomes intractable given the available computational resources and training time.
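
The idea of splitting a multi-dimensional action space across agents can be illustrated with a minimal sketch. It is not the authors' implementation: the 5x5 grid task, the `AxisAgent` class, the reward values, and all hyperparameters are assumptions made up for illustration. Each agent is a plain tabular Q-learner that controls one action dimension and receives the same shared reward, which is roughly the spirit of an independent-learner scheme; the lenient and cooperative-adaptive variants are not shown.

```python
import random
from itertools import product

ACTIONS = (-1, 0, 1)          # per-dimension moves (hypothetical toy task)
SIZE, GOAL = 5, (4, 4)        # 5x5 grid, goal in the far corner


def step(state, joint_action):
    """Apply one move per axis, clip to the grid, return (next_state, reward, done)."""
    nxt = tuple(min(SIZE - 1, max(0, s + a)) for s, a in zip(state, joint_action))
    done = nxt == GOAL
    return nxt, (0.0 if done else -1.0), done


class AxisAgent:
    """Tabular Q-learner that controls a single action dimension (assumed design)."""

    def __init__(self, rng, alpha=0.5, gamma=0.95, epsilon=0.1):
        # Every agent observes the full state but stores values for its own axis only.
        self.q = {s: [0.0] * len(ACTIONS) for s in product(range(SIZE), repeat=2)}
        self.rng, self.alpha, self.gamma, self.epsilon = rng, alpha, gamma, epsilon

    def greedy(self, state):
        values = self.q[state]
        return values.index(max(values))

    def act(self, state):
        # Epsilon-greedy choice over this agent's own action dimension.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ACTIONS))
        return self.greedy(state)

    def update(self, state, action, reward, nxt, done):
        # Standard one-step Q-learning update on the agent's own table.
        target = reward if done else reward + self.gamma * max(self.q[nxt])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def train(episodes=500, max_steps=50, seed=0):
    rng = random.Random(seed)
    agents = [AxisAgent(rng), AxisAgent(rng)]   # one agent per action dimension
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(max_steps):
            choices = [agent.act(state) for agent in agents]
            nxt, reward, done = step(state, [ACTIONS[c] for c in choices])
            for agent, choice in zip(agents, choices):
                agent.update(state, choice, reward, nxt, done)  # shared reward
            state = nxt
            if done:
                break
    return agents


def table_entries(n_dims=2):
    """Q-table entries: one joint table versus one small table per action dimension."""
    n_states = SIZE ** 2
    return n_states * len(ACTIONS) ** n_dims, n_states * len(ACTIONS) * n_dims
```

In this toy setting, `table_entries()` compares the storage of a single joint-action table with the decentralized tables: the joint table grows exponentially with the number of action dimensions, while the per-agent tables grow linearly. That gap is one plausible reading of why a centralized scheme can become intractable as problem complexity increases.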

Classification

Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
20205 - Automation and control systems

Result linkages

Project
<a href="/cs/project/EF15_003%2F0000470" target="_blank" >EF15_003/0000470: Robotika pro Průmysl 4.0</a>
Linkages
P - Research and development project financed from public funds (with a link to CEP)

Other

Year of implementation
2018
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Periodical name
Artificial Intelligence
ISSN
0004-3702
e-ISSN
1872-7921
Periodical volume
256
Issue of the periodical within the volume
March
Country of the periodical's publisher
GB - United Kingdom of Great Britain and Northern Ireland
Number of pages
30
Pages from-to
130-159
UT WoS code of the article
000424958700005
EID of the result in the Scopus database
2-s2.0-85038868982

Similar results (10)

Federated Reinforcement Learning for Collective Navigation of Robotic Swarms
Comparison of Task-Allocation Algorithms in Frontier-Based Multi-robot Exploration
Supervised Learning in Multi-Agent Environments Using Inverse Point of View
