Solving Partially Observable Stochastic Shortest-Path Games

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F21%3A00353184" target="_blank" >RIV/68407700:21230/21:00353184 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.24963/ijcai.2021/575" target="_blank" >https://doi.org/10.24963/ijcai.2021/575</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.24963/ijcai.2021/575" target="_blank" >10.24963/ijcai.2021/575</a>

Alternative languages

Result language
English
Title in the original language
Solving Partially Observable Stochastic Shortest-Path Games
Result description in the original language
We study the two-player zero-sum extension of the partially observable stochastic shortest-path problem where one agent has only partial information about the environment. We formulate this problem as a partially observable stochastic game (POSG): given a set of target states and negative rewards for each transition, the player with imperfect information maximizes the expected undiscounted total reward until a target state is reached. The second player with the perfect information aims for the opposite. We base our formalism on POSGs with one-sided observability (OS-POSGs) and give the following contributions: (1) we introduce a novel heuristic search value iteration algorithm that iteratively solves depth-limited variants of the game, (2) we derive the bound on the depth guaranteeing an arbitrary precision, (3) we propose a novel upper-bound estimation that allows early terminations, and (4) we experimentally evaluate the algorithm on a pursuit-evasion game.
Title in English
Solving Partially Observable Stochastic Shortest-Path Games
Result description in English
We study the two-player zero-sum extension of the partially observable stochastic shortest-path problem where one agent has only partial information about the environment. We formulate this problem as a partially observable stochastic game (POSG): given a set of target states and negative rewards for each transition, the player with imperfect information maximizes the expected undiscounted total reward until a target state is reached. The second player with the perfect information aims for the opposite. We base our formalism on POSGs with one-sided observability (OS-POSGs) and give the following contributions: (1) we introduce a novel heuristic search value iteration algorithm that iteratively solves depth-limited variants of the game, (2) we derive the bound on the depth guaranteeing an arbitrary precision, (3) we propose a novel upper-bound estimation that allows early terminations, and (4) we experimentally evaluate the algorithm on a pursuit-evasion game.

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

Project
The result was created during the implementation of multiple projects. More information is available in the Projects tab.
Linkages
P - Research and development project financed from public funds (with a link to CEP)

Other

Publication year
2021
Data confidentiality code
S - Complete and true data about the project are not subject to protection under special legal regulations

Data specific to the result type

Proceedings title
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
ISBN
978-0-9992411-9-6
ISSN
—
e-ISSN
—
Number of pages
8
Pages from-to
4182-4189
Publisher name
International Joint Conferences on Artificial Intelligence Organization
Place of publication
—
Event location
Montreal
Event date
19 August 2021
Event type by participants' nationality
WRD - Worldwide event
UT WoS code of the article
—

Similar results (10)

Heuristic Search Value Iteration for One-Sided Partially Observable Stochastic Games
Optimizing Honeypot Strategies Against Dynamic Lateral Movement Using Partially Observable Stochastic Games
Solving Partially Observable Stochastic Games with Public Observations