A theoretical demonstration for reinforcement learning of PI control dynamics for optimal speed control of DC motors by using Twin Delay Deep Deterministic Policy Gradient Algorithm

Result identifiers

  • Result code in IS VaVaI

    RIV/00216305:26220/23:PU145934 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26220%2F23%3APU145934)

  • Result on the web

    https://www.sciencedirect.com/science/article/pii/S0957417422022102

  • DOI - Digital Object Identifier

    10.1016/j.eswa.2022.119192 (http://dx.doi.org/10.1016/j.eswa.2022.119192)

Alternative languages

  • Result language

    English

  • Title in original language

    A theoretical demonstration for reinforcement learning of PI control dynamics for optimal speed control of DC motors by using Twin Delay Deep Deterministic Policy Gradient Algorithm

  • Description in original language

    To benefit from the advantages of Reinforcement Learning (RL) in industrial control applications, RL methods can be used for optimal tuning of classical controllers based on simulation scenarios of operating conditions. In this study, the Twin Delay Deep Deterministic (TD3) policy gradient method, an effective actor-critic RL strategy, is implemented to learn optimal Proportional Integral (PI) controller dynamics from a Direct Current (DC) motor speed control simulation environment. For this purpose, the PI controller dynamics are introduced to the actor network by using the PI-based observer states from the control simulation environment. A suitable Simulink simulation environment is adapted to perform the training process of the TD3 algorithm. The actor network learns the optimal PI controller dynamics by using a reward mechanism that implements the minimization of the optimal control objective function. A setpoint filter is used to describe the desired setpoint response, and step disturbance signals with random amplitude are incorporated in the simulation environment to improve disturbance rejection skills through experience-based learning in the designed control simulation environment. When the training task is completed, the optimal PI controller coefficients are obtained from the weight coefficients of the actor network. The performances of the optimal PI dynamics learned by using the TD3 algorithm and the Deep Deterministic Policy Gradient algorithm are compared. Moreover, the control performance improvement of this RL-based PI controller tuning method (RL-PI) is demonstrated relative to the performances of both integer-order and fractional-order PI controllers tuned by using several popular metaheuristic optimization algorithms: the Genetic Algorithm, Particle Swarm Optimization, Grey Wolf Optimization, and Differential Evolution.

  • Title in English

    A theoretical demonstration for reinforcement learning of PI control dynamics for optimal speed control of DC motors by using Twin Delay Deep Deterministic Policy Gradient Algorithm

  • Description in English

    To benefit from the advantages of Reinforcement Learning (RL) in industrial control applications, RL methods can be used for optimal tuning of classical controllers based on simulation scenarios of operating conditions. In this study, the Twin Delay Deep Deterministic (TD3) policy gradient method, an effective actor-critic RL strategy, is implemented to learn optimal Proportional Integral (PI) controller dynamics from a Direct Current (DC) motor speed control simulation environment. For this purpose, the PI controller dynamics are introduced to the actor network by using the PI-based observer states from the control simulation environment. A suitable Simulink simulation environment is adapted to perform the training process of the TD3 algorithm. The actor network learns the optimal PI controller dynamics by using a reward mechanism that implements the minimization of the optimal control objective function. A setpoint filter is used to describe the desired setpoint response, and step disturbance signals with random amplitude are incorporated in the simulation environment to improve disturbance rejection skills through experience-based learning in the designed control simulation environment. When the training task is completed, the optimal PI controller coefficients are obtained from the weight coefficients of the actor network. The performances of the optimal PI dynamics learned by using the TD3 algorithm and the Deep Deterministic Policy Gradient algorithm are compared. Moreover, the control performance improvement of this RL-based PI controller tuning method (RL-PI) is demonstrated relative to the performances of both integer-order and fractional-order PI controllers tuned by using several popular metaheuristic optimization algorithms: the Genetic Algorithm, Particle Swarm Optimization, Grey Wolf Optimization, and Differential Evolution.
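
The abstract above describes how the actor network, because it is driven by the PI-based observer states, ends up encoding the PI gains directly in its weights, with a reward that minimizes a quadratic control objective. The following Python sketch illustrates that idea on a first-order DC-motor speed model; it is not the paper's code, and the motor constants, reward weighting, and the crude random search standing in for TD3 training are all illustrative assumptions.

    import numpy as np

    # Illustrative sketch only: a linear "actor" over the PI observer states
    # s = [e, integral of e] is itself a PI controller, so its weight vector
    # w = [Kp, Ki] can be read off as the tuned PI coefficients after training.
    # Motor constants, reward weights, and the search loop are assumptions.

    dt, tau, K = 0.01, 0.5, 2.0   # step size [s], motor time constant, motor gain

    def simulate_episode(w, setpoint=1.0, steps=500):
        """Run the PI 'actor' w = [Kp, Ki] on a first-order DC-motor speed
        model and return the accumulated reward (negative quadratic cost)."""
        omega = integ = reward = 0.0
        for _ in range(steps):
            e = setpoint - omega                  # speed tracking error
            integ += e * dt                       # PI observer state: integral of e
            u = float(w @ np.array([e, integ]))   # linear actor = PI control law
            omega += dt * (-omega + K * u) / tau  # first-order motor dynamics
            reward -= (e**2 + 0.01 * u**2) * dt   # minimize the control objective
        return reward

    # Stand-in for TD3 training: crude random search over the actor weights.
    # Real TD3 uses twin critics, delayed policy updates, and target smoothing.
    rng = np.random.default_rng(0)
    w = np.array([0.5, 0.1])                      # initial [Kp, Ki] guess
    best = simulate_episode(w)
    for _ in range(200):
        cand = w + rng.normal(scale=0.05, size=2)
        r = simulate_episode(cand)
        if r > best:
            w, best = cand, r

    kp, ki = w   # tuned PI coefficients, read directly from the actor weights
    print(f"learned Kp={kp:.3f}, Ki={ki:.3f}, reward={best:.4f}")

The key point the sketch mirrors is the last step: because the actor is linear in the PI observer states, no separate gain-extraction step is needed; the trained weights are the controller.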

Classification

  • Type

    Jimp - Article in a periodical indexed in the Web of Science database

  • CEP field

  • OECD FORD field

    20204 - Robotics and automatic control

Result continuities

  • Project

  • Continuities

    S - Specific university research

Other

  • Year of implementation

    2023

  • Data confidentiality code

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

  • Name of the periodical

    Expert Systems with Applications

  • ISSN

    0957-4174

  • e-ISSN

    1873-6793

  • Volume of the periodical

    213, Part C

  • Issue of the periodical within the volume

    March 2023

  • Country of the publisher

    US - United States of America

  • Number of pages of the result

    16

  • Pages from-to

    1-16

  • UT WoS code of the article

    000890664400010

  • EID of the result in the Scopus database

    2-s2.0-85141914275