An Oracle-Guided Approach to Constrained Policy Synthesis Under Uncertainty
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F25%3APU155516" target="_blank" >RIV/00216305:26230/25:PU155516 - isvavai.cz</a>
Result on the web
<a href="https://www.jair.org/index.php/jair/article/view/16593" target="_blank" >https://www.jair.org/index.php/jair/article/view/16593</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1613/jair.1.16593" target="_blank" >10.1613/jair.1.16593</a>
Alternative languages
Result language
English
Original language name
An Oracle-Guided Approach to Constrained Policy Synthesis Under Uncertainty
Original language description
Dealing with aleatoric uncertainty is key in many domains involving sequential decision making, e.g., planning in AI, network protocols, and symbolic program synthesis. This paper presents a general-purpose model-based framework to obtain policies operating in uncertain environments in a fully automated manner. The new concept of coloured Markov Decision Processes (MDPs) enables a succinct representation of a wide range of synthesis problems. A coloured MDP describes a collection of possible policy configurations with their structural dependencies. The framework covers the synthesis of (a) programmatic policies from probabilistic program sketches and (b) finite-state controllers representing policies for partially observable MDPs (POMDPs), including decentralised POMDPs as well as constrained POMDPs. We show that all these synthesis problems can be cast as exploring memoryless policies in the corresponding coloured MDP. This exploration uses a symbiosis of two orthogonal techniques: abstraction refinement (using a novel refinement method) and counter-example generalisation. Our approach outperforms dedicated synthesis techniques on some problems and significantly improves on an earlier version of this framework.
Czech name
—
Czech description
—
Classification
Type
J<sub>ost</sub> - Miscellaneous article in a specialist periodical
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
S - Specific research at universities
Others
Publication year
2025
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH
ISSN
1076-9757
e-ISSN
1943-5037
Volume of the periodical
2025
Issue of the periodical within the volume
82
Country of publishing house
US - UNITED STATES
Number of pages
37
Pages from-to
433-469
UT code for WoS article
—
EID of the result in the Scopus database
—