An Oracle-Guided Approach to Constrained Policy Synthesis Under Uncertainty
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F25%3APU155516" target="_blank" >RIV/00216305:26230/25:PU155516 - isvavai.cz</a>
Result on the web
<a href="https://www.jair.org/index.php/jair/article/view/16593" target="_blank" >https://www.jair.org/index.php/jair/article/view/16593</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1613/jair.1.16593" target="_blank" >10.1613/jair.1.16593</a>
Alternative languages
Result language
English
Title in the original language
An Oracle-Guided Approach to Constrained Policy Synthesis Under Uncertainty
Result description in the original language
Dealing with aleatoric uncertainty is key in many domains involving sequential decision making, e.g., planning in AI, network protocols, and symbolic program synthesis. This paper presents a general-purpose model-based framework to obtain policies operating in uncertain environments in a fully automated manner. The new concept of coloured Markov Decision Processes (MDPs) enables a succinct representation of a wide range of synthesis problems. A coloured MDP describes a collection of possible policy configurations with their structural dependencies. The framework covers the synthesis of (a) programmatic policies from probabilistic program sketches and (b) finite-state controllers representing policies for partially observable MDPs (POMDPs), including decentralised POMDPs as well as constrained POMDPs. We show that all these synthesis problems can be cast as exploring memoryless policies in the corresponding coloured MDP. This exploration uses a symbiosis of two orthogonal techniques: abstraction refinement (using a novel refinement method) and counter-example generalisation. Our approach outperforms dedicated synthesis techniques on some problems and significantly improves an earlier version of this framework.
Title in English
An Oracle-Guided Approach to Constrained Policy Synthesis Under Uncertainty
Result description in English
Dealing with aleatoric uncertainty is key in many domains involving sequential decision making, e.g., planning in AI, network protocols, and symbolic program synthesis. This paper presents a general-purpose model-based framework to obtain policies operating in uncertain environments in a fully automated manner. The new concept of coloured Markov Decision Processes (MDPs) enables a succinct representation of a wide range of synthesis problems. A coloured MDP describes a collection of possible policy configurations with their structural dependencies. The framework covers the synthesis of (a) programmatic policies from probabilistic program sketches and (b) finite-state controllers representing policies for partially observable MDPs (POMDPs), including decentralised POMDPs as well as constrained POMDPs. We show that all these synthesis problems can be cast as exploring memoryless policies in the corresponding coloured MDP. This exploration uses a symbiosis of two orthogonal techniques: abstraction refinement (using a novel refinement method) and counter-example generalisation. Our approach outperforms dedicated synthesis techniques on some problems and significantly improves an earlier version of this framework.
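The abstract describes an oracle-guided exploration of a family of memoryless policies: an abstraction yields a bound on a whole subfamily at once, subfamilies that cannot meet the specification are pruned, and the rest are refined by splitting on an undecided choice. The toy Python below is a schematic sketch of that loop only, not the paper's algorithm or tooling; the MDP, the threshold, and all names are invented for illustration.

```python
# Toy MDP: states 0..3, state 2 = goal, state 3 = trap (both absorbing).
# P[s][a] maps a state/action pair to a list of (next_state, probability).
# The "holes" are the per-state action choices that define a policy family.
P = {
    0: {"a": [(1, 1.0)], "b": [(2, 0.4), (3, 0.6)]},
    1: {"a": [(2, 0.9), (3, 0.1)], "b": [(3, 1.0)]},
}
GOAL, TRAP = 2, 3
HOLES = [0, 1]  # one action choice per non-absorbing state

def value(restriction):
    """Max probability of reaching GOAL from state 0 when each state s may
    use any action in restriction[s] (value iteration on the abstraction).
    For a subfamily this is an upper bound; for a single policy it is exact."""
    v = {0: 0.0, 1: 0.0, GOAL: 1.0, TRAP: 0.0}
    for _ in range(200):
        for s in HOLES:
            v[s] = max(sum(p * v[t] for t, p in P[s][a]) for a in restriction[s])
    return v[0]

def synthesize(family, threshold):
    """Oracle-guided search: prune any subfamily whose abstraction bound is
    below the threshold, otherwise refine by splitting an undecided hole."""
    if value(family) < threshold:  # whole subfamily discarded at once
        return None
    s = next((h for h in HOLES if len(family[h]) > 1), None)
    if s is None:                  # single policy left; the bound is exact
        return family
    for act in family[s]:          # refinement: split on hole s
        found = synthesize({**family, s: [act]}, threshold)
        if found:
            return found
    return None

full_family = {0: ["a", "b"], 1: ["a", "b"]}
policy = synthesize(full_family, threshold=0.8)
print(policy)  # {0: ['a'], 1: ['a']} reaches the goal with probability 0.9
```

The counter-example generalisation from the paper is only mirrored loosely here: discarding a subfamily from one abstraction bound stands in for pruning many policies from a single refuted candidate.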
Classification
Type
J<sub>ost</sub> - Other articles in peer-reviewed periodicals
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
S - Specific research at universities
Others
Year of implementation
2025
Data confidentiality code
S - Complete and accurate project data are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH
ISSN
1076-9757
e-ISSN
1943-5037
Periodical volume
82
Periodical issue within the volume
—
Country of the periodical publisher
US - United States of America
Number of pages
37
Pages from-to
433-469
Article UT WoS code
—
Result EID in the Scopus database
—