
Safe Autonomous Reinforcement Learning -- PhD Thesis Proposal

Result identifiers

  • Result code in IS VaVaI

    RIV/68407700:21230/16:00301602 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F16%3A00301602)

  • Result on the web

    http://cmp.felk.cvut.cz/pub/cmp/articles/pecka/Pecka-TR-2016-03.pdf

  • DOI - Digital Object Identifier

Alternative languages

  • Result language

    English

  • Title in original language

    Safe Autonomous Reinforcement Learning -- PhD Thesis Proposal

  • Description in original language

    In the proposed thesis, we focus on equipping existing Reinforcement Learning algorithms with various kinds of safety constraints imposed on the exploration scheme. Common Reinforcement Learning algorithms are (sometimes implicitly) assumed to work in an ergodic, or even "restartable", environment. However, these conditions are not achievable in field robotics, where an expensive robot cannot simply be replaced by a new functioning unit after it performs a "deadly" action. Even so, Reinforcement Learning offers many advantages over supervised learning that are useful in the robotics domain: it may reduce the amount of annotated training data needed to learn a task, or, for example, eliminate the need to acquire a model of the whole system. We therefore note the need for methods that allow Reinforcement Learning to be used safely in non-ergodic and dangerous environments. Defining and recognizing safe and unsafe states/actions is a difficult task in itself. Even when a safety classifier is available, the safety measures must still be incorporated into the Reinforcement Learning process so that the efficiency and convergence of the algorithm are not lost. The proposed thesis deals both with the creation of a safety classifier and with the combined use of Reinforcement Learning and safety measures. The available safe exploration methods range from simple algorithms for simple environments to sophisticated methods based on previous experience, state prediction, or machine learning. Unfortunately, the methods suitable for our field robotics case usually require a precise model of the system, which is very difficult (or even impossible) to obtain from sensory input in an unknown environment. In our previous work, we proposed a machine learning approach to the safety classifier utilizing a cautious simulator. For the connection of Reinforcement Learning and safety, we further examine a modified Gradient Policy Search algorithm. ...

  • Title in English

    Safe Autonomous Reinforcement Learning -- PhD Thesis Proposal

  • Description in English

    In the proposed thesis, we focus on equipping existing Reinforcement Learning algorithms with various kinds of safety constraints imposed on the exploration scheme. Common Reinforcement Learning algorithms are (sometimes implicitly) assumed to work in an ergodic, or even "restartable", environment. However, these conditions are not achievable in field robotics, where an expensive robot cannot simply be replaced by a new functioning unit after it performs a "deadly" action. Even so, Reinforcement Learning offers many advantages over supervised learning that are useful in the robotics domain: it may reduce the amount of annotated training data needed to learn a task, or, for example, eliminate the need to acquire a model of the whole system. We therefore note the need for methods that allow Reinforcement Learning to be used safely in non-ergodic and dangerous environments. Defining and recognizing safe and unsafe states/actions is a difficult task in itself. Even when a safety classifier is available, the safety measures must still be incorporated into the Reinforcement Learning process so that the efficiency and convergence of the algorithm are not lost. The proposed thesis deals both with the creation of a safety classifier and with the combined use of Reinforcement Learning and safety measures. The available safe exploration methods range from simple algorithms for simple environments to sophisticated methods based on previous experience, state prediction, or machine learning. Unfortunately, the methods suitable for our field robotics case usually require a precise model of the system, which is very difficult (or even impossible) to obtain from sensory input in an unknown environment. In our previous work, we proposed a machine learning approach to the safety classifier utilizing a cautious simulator. For the connection of Reinforcement Learning and safety, we further examine a modified Gradient Policy Search algorithm. ...
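    The core idea of the abstract, gating exploratory actions through a safety classifier, can be sketched minimally as follows. This is an illustrative toy, not the thesis's actual method; the `is_safe` predicate, the tilt-based state, and the fallback action are invented for the example.

    ```python
    import random

    def safe_explore(state, candidate_actions, is_safe, fallback_action, rng=random):
        """Pick a random exploratory action, but only from those the safety
        classifier predicts to be safe; fall back to a known-safe action
        when no candidate passes the check."""
        safe_actions = [a for a in candidate_actions if is_safe(state, a)]
        if not safe_actions:
            return fallback_action
        return rng.choice(safe_actions)

    # Toy safety classifier: actions are velocity commands, and any command
    # that would push the robot's tilt past a threshold is deemed unsafe.
    def is_safe(state, action):
        return abs(state["tilt"] + action) < 0.5

    state = {"tilt": 0.3}
    action = safe_explore(state, [-0.4, 0.0, 0.4], is_safe, fallback_action=0.0)
    ```

    In a real system the classifier itself would be learned (e.g. from a cautious simulator, as the abstract mentions), and the exploration step would sit inside a policy-search loop rather than picking uniformly at random.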

Classification

  • Type

    Vsouhrn - Summary research report

  • CEP field

    JD - Use of computers, robotics and its applications

  • OECD FORD field

Result links

  • Project

    GA14-13876S: Machine perception for long-term autonomy of mobile robots

  • Links

    P - Research and development project financed from public sources (with a link to CEP)

Other

  • Year of publication

    2016

  • Data confidentiality code

    S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific to the result type

  • Number of pages

    55

  • Place of publication

    Prague

  • Name of publisher or commissioning body

    Center for Machine Perception, K13133 FEE Czech Technical University

  • Version