Risk-Sensitive Optimality in Markov Games

Result description

—

Keywords

two-person Markov games; communicating Markov chains; risk-sensitive optimality; dynamic programming

Result identifiers

Result code in IS VaVaI
RIV/67985556:_____/17:00480036 - isvavai.cz
Result on the web
—
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Title in the original language
Risk-Sensitive Optimality in Markov Games
Result description in the original language
The article is devoted to risk-sensitive optimality in Markov games. Attention is focused on two-player Markov games, evolving on communicating Markov chains, in which the players have opposing aims. Under risk-sensitive optimality criteria, the total reward generated by the game is evaluated by an exponential utility function with a given risk-sensitivity coefficient. In particular, the first player (resp. the second player) tries to maximize (resp. minimize) the long-run risk-sensitive average reward. If the second player is a dummy, the problem reduces to finding an optimal policy of a Markov decision chain under risk-sensitive optimality. For a risk-sensitivity coefficient equal to zero, we arrive at the traditional optimality criteria. In this article, connections between risk-sensitive and risk-neutral models of Markov decision chains and Markov games are studied using discrepancy functions. Explicit formulae for bounds on the risk-sensitive long-run average reward are reported. A policy iteration algorithm for finding suboptimal policies of both players is suggested. The obtained results are illustrated with a numerical example.
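
As an illustration of the exponential-utility evaluation mentioned in the description, the minimal Python sketch below computes the certainty equivalent of a random reward, (1/gamma) * ln E[exp(gamma * reward)], and shows that it approaches the plain expectation as the risk-sensitivity coefficient gamma tends to zero. The function name, the two-outcome reward, and the sign convention (negative gamma read as risk aversion) are assumptions made for illustration and are not taken from the article.

```python
import math


def certainty_equivalent(outcomes, probs, gamma):
    """Certainty equivalent of a random reward under exponential utility.

    For gamma != 0 this is (1/gamma) * ln E[exp(gamma * reward)].
    For gamma == 0 it falls back to the plain expectation (risk-neutral case).
    """
    if abs(gamma) < 1e-12:
        return sum(p * x for x, p in zip(outcomes, probs))
    moment = sum(p * math.exp(gamma * x) for x, p in zip(outcomes, probs))
    return math.log(moment) / gamma


# Hypothetical reward: 0 or 10 with equal probability (illustrative data only).
outcomes = [0.0, 10.0]
probs = [0.5, 0.5]

risk_neutral = certainty_equivalent(outcomes, probs, 0.0)
risk_averse = certainty_equivalent(outcomes, probs, -0.5)
risk_seeking = certainty_equivalent(outcomes, probs, 0.5)
near_zero = certainty_equivalent(outcomes, probs, 1e-6)

print(risk_averse, risk_neutral, risk_seeking, near_zero)
```

Under this convention the risk-averse value lies below the expectation and the risk-seeking value above it, while a coefficient close to zero gives a value close to the risk-neutral one.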

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
50202 - Applied Economics, Econometrics

Result continuities

Project
GA13-14445S: New Trends in Stochastic Economic Models under Uncertainty
Continuities
P - Research and development project financed from public funds (with a link to CEP)

Others

Publication year
2017
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Title of the article in the proceedings
Proceedings of the 35th International Conference Mathematical Methods in Economics (MME 2017)
ISBN
978-80-7435-678-0
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
684-689
Publisher name
University of Hradec Králové
Place of publication
Hradec Králové
Event location
Hradec Králové
Event date
13 September 2017
Type of event by nationality
EUR - European event
UT WoS code of the article
—
