Implementable policies: Discounted cost case

Serin Y. Y., KULKARNI V.

2nd International Workshop on the Numerical Solution of Markov Chains, North-Carolina, United States Of America, 18 January 1995, pp.283-306 identifier

  • Publication Type: Conference Paper / Full Text
  • City: North-Carolina
  • Country: United States Of America
  • Page Numbers: pp.283-306
  • Middle East Technical University Affiliated: Yes


We consider a Markov decision process (MDP) with finite state space S and finite action set A. The state space is partitioned into K sets S-1, S-2, ..., S-K. A stationary randomized policy is described by the parameters {alpha(ia), i is an element of S, a is an element of A}, where alpha(ia) = the parobability that action a is taken when the system is in state i. A policy is called implementable if alpha(ia) = alpha(ja) for all a is an element of A whenever states i and j belong to a common subset S-r for some r = 1, 2,..., K. In this paper we discuss the importance of implementable policies and present an algorithm to find implementable policies that (locally) minimize the expected total discounted cost over infinite horizon.