Second Order Optimality in Markov Decision Chains
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F67985556%3A_____%2F17%3A00485146" target="_blank" >RIV/67985556:_____/17:00485146 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.14736/kyb-2017-6-1086" target="_blank" >http://dx.doi.org/10.14736/kyb-2017-6-1086</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.14736/kyb-2017-6-1086" target="_blank" >10.14736/kyb-2017-6-1086</a>
Alternative languages
Result language
English
Original language name
Second Order Optimality in Markov Decision Chains
Original language description
The article is devoted to Markov reward chains in a discrete-time setting with finite state spaces. Unfortunately, the usual optimization criteria examined in the literature on Markov decision chains, such as total discounted reward, total reward up to reaching some specific state (the so-called first passage models), or mean (average) reward optimality, may be quite insufficient to characterize the problem from the point of view of a decision maker. It therefore seems preferable, if not necessary, to select more sophisticated criteria that also reflect the variability-risk features of the problem. Perhaps the best known approaches stem from the classical work of Markowitz on mean-variance selection rules, i.e. we optimize the weighted sum of the average or total reward and its variance. The article presents explicit formulae for calculating the variances for transient and discounted models (where the value of the discount factor depends on the current state and the action taken) over finite and infinite time horizons. The same result is presented for long-run average nondiscounted models, where finding stationary policies that minimize the average variance in the class of policies with a given long-run average reward is discussed.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10103 - Statistics and probability
Result continuities
Project
<a href="/en/project/GA15-10331S" target="_blank" >GA15-10331S: Dynamic modeling of mortgage portfolio risk</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2017
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Kybernetika
ISSN
0023-5954
e-ISSN
—
Volume of the periodical
53
Issue of the periodical within the volume
6
Country of publishing house
CZ - CZECH REPUBLIC
Number of pages
14
Pages from-to
1086-1099
UT code for WoS article
000424732300008
EID of the result in the Scopus database
2-s2.0-85040739483