Goal-HSVI: Heuristic Search Value Iteration for Goal POMDPs
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F18%3A00322882" target="_blank" >RIV/68407700:21230/18:00322882 - isvavai.cz</a>
Result on the web
<a href="https://www.ijcai.org/proceedings/2018/662" target="_blank" >https://www.ijcai.org/proceedings/2018/662</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.24963/ijcai.2018/662" target="_blank" >10.24963/ijcai.2018/662</a>
Alternative languages
Result language
English
Original language name
Goal-HSVI: Heuristic Search Value Iteration for Goal POMDPs
Original language description
Partially observable Markov decision processes (POMDPs) are the standard models for planning under uncertainty with both finite and infinite horizons. Besides the well-known discounted-sum objective, the indefinite-horizon objective (also known as Goal-POMDPs) is another classical objective for POMDPs. In this case, given a set of target states and a positive cost for each transition, the optimization objective is to minimize the expected total cost until a target state is reached. In the literature, RTDP-Bel and heuristic search value iteration (HSVI) have been used for solving Goal-POMDPs. Neither of these algorithms has theoretical convergence guarantees, and HSVI may even fail to terminate its trials. We make the following contributions: (1) We discuss the challenges introduced in Goal-POMDPs and illustrate how they prevent the original HSVI from converging. (2) We present a novel algorithm inspired by HSVI, termed Goal-HSVI, and show that our algorithm has convergence guarantees. (3) We show that Goal-HSVI outperforms RTDP-Bel on a set of well-known examples.
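The goal objective in the description (minimize the expected total cost until a target state is reached) can be illustrated with a minimal sketch. This is not Goal-HSVI and not taken from the paper: it assumes a fully observable toy problem, so it runs plain value iteration over states rather than beliefs, and the states, actions, costs, probabilities, and the function name `expected_cost_to_goal` are all hypothetical.

```python
# Toy goal problem (hypothetical numbers, fully observable for simplicity).
# States 0 and 1 are non-goal states; state 2 is the goal (target) state.
GOAL = 2

# TRANSITIONS[state][action] = (positive cost, [(probability, next_state), ...])
TRANSITIONS = {
    0: {"go": (1.0, [(0.5, 0), (0.5, 1)]), "jump": (3.0, [(1.0, GOAL)])},
    1: {"go": (1.0, [(0.5, 1), (0.5, GOAL)]), "jump": (3.0, [(1.0, GOAL)])},
}


def expected_cost_to_goal(transitions, goal, tol=1e-9, max_iters=10_000):
    """Value iteration for the minimal expected total cost until the goal."""
    # The goal state costs nothing further; other states start at zero.
    values = {state: 0.0 for state in transitions}
    values[goal] = 0.0
    for _ in range(max_iters):
        delta = 0.0
        for state, actions in transitions.items():
            # Bellman backup: cheapest action by cost plus expected cost-to-go.
            best = min(
                cost + sum(prob * values[nxt] for prob, nxt in outcomes)
                for cost, outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    return values


if __name__ == "__main__":
    print(expected_cost_to_goal(TRANSITIONS, GOAL))
```

In this toy problem, taking "go" from state 1 costs 2 in expectation, while from state 0 the direct "jump" (cost 3) is cheaper than repeatedly taking "go" (cost 4). In a Goal-POMDP the same kind of backup is applied to beliefs over states instead of the states themselves.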
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
V - Research activity supported from other public sources
Others
Publication year
2018
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the International Joint Conferences on Artificial Intelligence
ISBN
978-0-9992411-2-7
ISSN
—
e-ISSN
1045-0823
Number of pages
7
Pages from-to
4764-4770
Publisher name
International Joint Conferences on Artificial Intelligence Organization
Place of publication
—
Event location
Stockholm
Event date
Jul 13, 2018
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—