Implementable policies: Discounted cost case

We consider a Markov decision process (MDP) with finite state space S and finite action set A. The state space is partitioned into K sets S-1, S-2, ..., S-K. A stationary randomized policy is described by the parameters {alpha(ia), i is an element of S, a is an element of A}, where alpha(ia) = the parobability that action a is taken when the system is in state i. A policy is called implementable if alpha(ia) = alpha(ja) for all a is an element of A whenever states i and j belong to a common subset S-r for some r = 1, 2,..., K. In this paper we discuss the importance of implementable policies and present an algorithm to find implementable policies that (locally) minimize the expected total discounted cost over infinite horizon.


Joint densities of hitting times for finite state Markov processes
Bielecki, Tomasz R.; Jeanblanc, Monique; Sezer, Ali Devin (2018-01-01)
For a finite state Markov process X and a finite collection {Gamma<INF>k</INF>, k is an element of K} of subsets of its state space, let tau<INF>k</INF> be the first time the process visits the set Gamma<INF>k</INF>. In general, X may enter some of the Gamma<INF>k</INF> at the same time and therefore the vector tau := (tau<INF>k</INF>, k is an element of K) may put nonzero mass over lower dimensional regions of R<INF>+</INF> <SUP>vertical bar K vertical bar</SUP>;these regions are of the form R<INF>s</INF> ...
Concrete description of CD0(K)-spaces as C(X)-spaces and its applications
Ercan, Z (American Mathematical Society (AMS), 2004-01-01)
We prove that for a compact Hausdorff space K without isolated points, CD0(K) and C(K x {0, 1}) are isometrically Riesz isomorphic spaces under a certain topology on K x {0, 1}. Moreover, K is a closed subspace of K x {0, 1}. This provides concrete examples of compact Hausdorff spaces X such that the Dedekind completion of C(X) is B(S) (= the set of all bounded real-valued functions on S) since the Dedekind completion of CD0(K) is B(K) (CD0(K, E) and CDw (K, E) spaces as Banach lattices).
Pruning algorithms for partially observable markov decision processes
Özgen, Selim; Demirekler, Mübeccel; Department of Electrical and Electronics Engineering (2017)
It is possible to represent the value function in partially observable Markov decision processes as a piecewise linear function if the state, action, and observation space is discrete. Exact value iteration algorithm searches for this value function by creating an exponential number of linear functions at each step, many of which can be pruned without changing the value of the value function. The pruning procedure is made possible by the use of linear programming. This study first gives a geometric framewor...
Affine Equivalency and Nonlinearity Preserving Bijective Mappings over F-2
Sertkaya, Isa; Doğanaksoy, Ali; Uzunkol, Osmanbey; Kiraz, Mehmet Sabir (2014-09-28)
We first give a proof of an isomorphism between the group of affine equivalent maps and the automorphism group of Sylvester Hadamard matrices. Secondly, we prove the existence of new nonlinearity preserving bijective mappings without explicit construction. Continuing the study of the group of nonlinearity preserving bijective mappings acting on n-variable Boolean functions, we further give the exact number of those mappings for n <= 6. Moreover, we observe that it is more beneficial to study the automorphis...
Unified treatment of spacelike and timelike SO(3,1) Yang-Mills fields
Dundarer, AR (Springer Science and Business Media LLC, 2001-07-01)
SO(3, 1) valued Yang-mills fields stemming from spacelike and timelike vectors that were studied separately in earlier works are unified by introducing a parameter lambda that takes values in the interval [-1, 1].
Citation Formats
Y. Y. Serin, “Implementable policies: Discounted cost case,” 1995, Accessed: 00, 2020. [Online]. Available: