Implementable policies: Discounted cost case

Date

1995-01-18

Author

Serin, Yaşar Yasemin

Metadata

Show full item record

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Item Usage Stats

181
views

0
downloads

We consider a Markov decision process (MDP) with finite state space S and finite action set A. The state space is partitioned into K sets S-1, S-2, ..., S-K. A stationary randomized policy is described by the parameters {alpha(ia), i is an element of S, a is an element of A}, where alpha(ia) = the parobability that action a is taken when the system is in state i. A policy is called implementable if alpha(ia) = alpha(ja) for all a is an element of A whenever states i and j belong to a common subset S-r for some r = 1, 2,..., K. In this paper we discuss the importance of implementable policies and present an algorithm to find implementable policies that (locally) minimize the expected total discounted cost over infinite horizon.

Subject Keywords

Optimal policy, Directional derivative, Markov decision process, Descent direction, Retrial queue

URI

https://hdl.handle.net/11511/52735

Collections

Department of Industrial Engineering, Conference / Seminar

Suggestions

OpenMETU
Core

Joint densities of hitting times for finite state Markov processes Bielecki, Tomasz R.; Jeanblanc, Monique; Sezer, Ali Devin (2018-01-01) For a finite state Markov process X and a finite collection {Gamma<INF>k</INF>, k is an element of K} of subsets of its state space, let tau<INF>k</INF> be the first time the process visits the set Gamma<INF>k</INF>. In general, X may enter some of the Gamma<INF>k</INF> at the same time and therefore the vector tau := (tau<INF>k</INF>, k is an element of K) may put nonzero mass over lower dimensional regions of R<INF>+</INF> <SUP>vertical bar K vertical bar</SUP>;these regions are of the form R<INF>s</INF> ...
Concrete description of CD0(K)-spaces as C(X)-spaces and its applications Ercan, Z (American Mathematical Society (AMS), 2004-01-01) We prove that for a compact Hausdorff space K without isolated points, CD0(K) and C(K x {0, 1}) are isometrically Riesz isomorphic spaces under a certain topology on K x {0, 1}. Moreover, K is a closed subspace of K x {0, 1}. This provides concrete examples of compact Hausdorff spaces X such that the Dedekind completion of C(X) is B(S) (= the set of all bounded real-valued functions on S) since the Dedekind completion of CD0(K) is B(K) (CD0(K, E) and CDw (K, E) spaces as Banach lattices).
Pruning algorithms for partially observable markov decision processes Özgen, Selim; Demirekler, Mübeccel; Department of Electrical and Electronics Engineering (2017) It is possible to represent the value function in partially observable Markov decision processes as a piecewise linear function if the state, action, and observation space is discrete. Exact value iteration algorithm searches for this value function by creating an exponential number of linear functions at each step, many of which can be pruned without changing the value of the value function. The pruning procedure is made possible by the use of linear programming. This study first gives a geometric framewor...
Affine Equivalency and Nonlinearity Preserving Bijective Mappings over F-2 Sertkaya, Isa; Doğanaksoy, Ali; Uzunkol, Osmanbey; Kiraz, Mehmet Sabir (2014-09-28) We first give a proof of an isomorphism between the group of affine equivalent maps and the automorphism group of Sylvester Hadamard matrices. Secondly, we prove the existence of new nonlinearity preserving bijective mappings without explicit construction. Continuing the study of the group of nonlinearity preserving bijective mappings acting on n-variable Boolean functions, we further give the exact number of those mappings for n <= 6. Moreover, we observe that it is more beneficial to study the automorphis...
On the number of topologies on a finite set Kızmaz, Muhammet Yasir (2019-01-01) We denote the number of distinct topologies which can be defined on a set X with n elements by T(n). Similarly, T-0(n) denotes the number of distinct T-0 topologies on the set X. In the present paper, we prove that for any prime p, T(p(k)) k+ 1 (mod p), and that for each natural number n there exists a unique k such that T(p + n) k (mod p). We calculate k for n = 0, 1, 2, 3, 4. We give an alternative proof for a result of Z. I. Borevich to the effect that T-0(p + n) T-0(n + 1) (mod p).

Citation Formats

Y. Y. Serin, “Implementable policies: Discounted cost case,” 1995, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/52735.