Reward Redistribution for Reinforcement Learning of Dynamic Nonprehensile Manipulation
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F21%3A00351068" target="_blank" >RIV/68407700:21230/21:00351068 - isvavai.cz</a>
Alternative codes found
RIV/68407700:21460/21:00351068 RIV/68407700:21730/21:00351068
Result on the web
<a href="https://doi.org/10.1109/ICCAR52225.2021.9463495" target="_blank" >https://doi.org/10.1109/ICCAR52225.2021.9463495</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ICCAR52225.2021.9463495" target="_blank" >10.1109/ICCAR52225.2021.9463495</a>
Alternative languages
Result language
English
Original language name
Reward Redistribution for Reinforcement Learning of Dynamic Nonprehensile Manipulation
Original language description
Recent reinforcement learning (RL) systems can solve a wide variety of manipulation tasks even in real-world robotic implementations. However, in some nonprehensile manipulation tasks (e.g. poking, throwing), the classical reward system fails as the robot has to manipulate objects whose motion trajectory is partly uncontrollable. Such tasks require a specific type of reward that would reflect this temporal misalignment. We propose a novel method, based on a delayed reward redistribution, that allows a robot to fulfil goals in an only partially controllable environment. The reward system in our architecture combines information from other sensors together with inputs from an unsupervised vision module based on a variational autoencoder (VAE). This delayed reward system then controls the training of the motor module based on a Soft Actor-Critic (SAC) neural network. We compare results for a delayed and nondelayed version of our system in a simulated environment and show that the delayed reward greatly outperforms the nondelayed version.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
20204 - Robotics and automatic control
Result continuities
Project
—
Continuities
S - Specific university research
Others
Publication year
2021
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
2021 7th International Conference on Control, Automation and Robotics (ICCAR)
ISBN
978-1-6654-4986-1
ISSN
—
e-ISSN
2251-2454
Number of pages
6
Pages from-to
326-331
Publisher name
IEEE (Institute of Electrical and Electronics Engineers)
Place of publication
—
Event location
Singapore (virtual)
Event date
Apr 23, 2021
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—