
Safe Autonomous Reinforcement Learning -- PhD Thesis Proposal

The result's identifiers

  • Result code in IS VaVaI

    RIV/68407700:21230/16:00301602 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F16%3A00301602)

  • Result on the web

    http://cmp.felk.cvut.cz/pub/cmp/articles/pecka/Pecka-TR-2016-03.pdf

  • DOI - Digital Object Identifier

Alternative languages

  • Result language

    English

  • Original language name

    Safe Autonomous Reinforcement Learning -- PhD Thesis Proposal

  • Original language description

    In the proposed thesis, we focus on equipping existing Reinforcement Learning algorithms with various kinds of safety constraints imposed on the exploration scheme. Common Reinforcement Learning algorithms are (sometimes implicitly) assumed to work in an ergodic, or even "restartable", environment. However, these conditions are not achievable in field robotics, where an expensive robot cannot simply be replaced by a new functioning unit after it performs a "deadly" action. Even so, Reinforcement Learning offers many advantages over supervised learning that are useful in the robotics domain: it may reduce the amount of annotated training data needed to learn a task, or, for example, eliminate the need to acquire a model of the whole system. Thus, there is a need for methods that allow Reinforcement Learning to be used safely in non-ergodic and dangerous environments. Defining and recognizing safe and unsafe states/actions is a difficult task in itself. Even when a safety classifier is available, the safety measures still have to be incorporated into the Reinforcement Learning process so that the efficiency and convergence of the algorithm are not lost. The proposed thesis deals both with the creation of the safety classifier and with using Reinforcement Learning and safety measures together. The available safe exploration methods range from simple algorithms for simple environments to sophisticated methods based on previous experience, state prediction, or machine learning. Unfortunately, the methods suitable for our field-robotics case usually require a precise model of the system, which is very difficult (or even impossible) to obtain from sensory input in an unknown environment. In our previous work, we proposed a machine learning approach to the safety classifier that utilizes a cautious simulator. For the connection of Reinforcement Learning and safety, we further examine a modified Gradient Policy Search algorithm. ...

  • Czech name

  • Czech description
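The core idea in the description above, gating exploration with a safety classifier so the agent never executes a "deadly" action, can be sketched as a simple wrapper. This is an illustrative sketch only, not the thesis's actual algorithm; the functions `propose_action`, `is_safe`, and `safe_fallback` are hypothetical placeholders for a policy's exploratory proposal, a learned safety classifier, and a known-safe recovery action.

```python
def safe_explore_step(state, propose_action, is_safe, safe_fallback):
    """One exploration step gated by a safety classifier.

    propose_action(state)  -> candidate exploratory action
    is_safe(state, action) -> bool, verdict of the safety classifier
    safe_fallback(state)   -> an action assumed safe in `state`
    """
    action = propose_action(state)
    if is_safe(state, action):
        return action
    # Reject the unsafe exploratory action and substitute a safe one,
    # trading some exploration efficiency for not damaging the robot.
    return safe_fallback(state)
```

A real method along these lines must also keep the learning algorithm's convergence properties intact when actions are overridden, which is part of what the proposal investigates.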

Classification

  • Type

    Vsouhrn - Summary research report

  • CEP classification

    JD - Use of computers, robotics and its application

  • OECD FORD branch

Result continuities

  • Project

    GA14-13876S: Perception methods for long-term autonomy of mobile robots

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Others

  • Publication year

    2016

  • Confidentiality

    S - Complete and accurate data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Number of pages

    55

  • Place of publication

    Praha

  • Publisher/client name

    Center for Machine Perception, K13133 FEE Czech Technical University

  • Version