Second Order Optimality in Markov Decision Chains

The result's identifiers

  • Result code in IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F67985556%3A_____%2F17%3A00485146" target="_blank" >RIV/67985556:_____/17:00485146 - isvavai.cz</a>

  • Result on the web

    <a href="http://dx.doi.org/10.14736/kyb-2017-6-1086" target="_blank" >http://dx.doi.org/10.14736/kyb-2017-6-1086</a>

  • DOI - Digital Object Identifier

    <a href="http://dx.doi.org/10.14736/kyb-2017-6-1086" target="_blank" >10.14736/kyb-2017-6-1086</a>

Alternative languages

  • Result language

    English

  • Original language name

    Second Order Optimality in Markov Decision Chains

  • Original language description

    The article is devoted to Markov reward chains in a discrete-time setting with finite state spaces. Unfortunately, the usual optimization criteria examined in the literature on Markov decision chains, such as total discounted reward, total reward up to reaching some specific state (the so-called first passage models), or mean (average) reward optimality, may be quite insufficient to characterize the problem from the decision maker's point of view. To this end it may be preferable, if not necessary, to select more sophisticated criteria that also reflect the variability and risk features of the problem. Perhaps the best known approaches stem from the classical work of Markowitz on mean-variance selection rules, i.e. optimizing a weighted sum of the average or total reward and its variance. The article presents explicit formulae for calculating the variances in transient and discounted models (where the value of the discount factor depends on the current state and the action taken) for finite and infinite time horizons. The same result is presented for long-run average nondiscounted models, where finding stationary policies that minimize the average variance within the class of policies with a given long-run average reward is discussed. (A minimal numerical sketch of the mean-variance computation follows after this list.)

  • Czech name

  • Czech description
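
The description mentions explicit formulae for the variance of the reward in discounted models. As a rough illustration of the kind of quantity involved, the following Python sketch computes the mean and variance of the total discounted reward for a fixed policy via the classical fixed-point (Sobel-style) equations, assuming a constant discount factor and state-dependent rewards; the article itself treats the more general case where the discount factor depends on the current state and action. The function name and the example chain below are illustrative, not taken from the paper.

```python
import numpy as np

def discounted_mean_and_variance(P, r, beta):
    """P: (n, n) row-stochastic transition matrix of the chain under a fixed policy
       r: (n,) one-stage rewards, depending on the current state only
       beta: scalar discount factor in (0, 1)
       Returns (v, V): expected total discounted reward and its variance,
       one entry per starting state."""
    n = len(r)
    I = np.eye(n)
    # Mean: conditioning on the first transition gives v = r + beta * P v,
    # hence v = (I - beta P)^{-1} r.
    v = np.linalg.solve(I - beta * P, r)
    # Variance: the same conditioning yields (elementwise squares)
    #   V = beta^2 * (P V + P v^2 - (P v)^2),
    # so V = (I - beta^2 P)^{-1} [beta^2 (P v^2 - (P v)^2)].
    Pv = P @ v
    b = beta**2 * (P @ v**2 - Pv**2)
    V = np.linalg.solve(I - beta**2 * P, b)
    return v, V

if __name__ == "__main__":
    # Hypothetical two-state chain used only to exercise the formulae.
    P = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
    r = np.array([1.0, 5.0])
    v, V = discounted_mean_and_variance(P, r, beta=0.95)
    print("mean:", v)      # expected total discounted reward per start state
    print("variance:", V)  # enters a Markowitz-style mean-variance criterion
```

A mean-variance selection rule of the kind the abstract describes would then score each policy by a weighted combination such as v - kappa * V for some risk-aversion weight kappa, and optimize over policies.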

Classification

  • Type

    Jimp - Article in a specialist periodical, which is included in the Web of Science database

  • CEP classification

  • OECD FORD branch

    10103 - Statistics and probability

Result continuities

  • Project

    <a href="/en/project/GA15-10331S" target="_blank" >GA15-10331S: Dynamic modeling of mortgage porkredtfolio risk</a><br>

  • Continuities

    P - Research and development project financed from public sources (with a link to CEP)

Others

  • Publication year

    2017

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Name of the periodical

    Kybernetika

  • ISSN

    0023-5954

  • e-ISSN

  • Volume of the periodical

    53

  • Issue of the periodical within the volume

    6

  • Country of publishing house

    CZ - CZECH REPUBLIC

  • Number of pages

    14

  • Pages from-to

    1086-1099

  • UT code for WoS article

    000424732300008

  • EID of the result in the Scopus database

    2-s2.0-85040739483