Markov decision processes under observability constraints

Date

2005-06-01

Author

Serin, Yaşar Yasemin

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

We develop an algorithm to compute optimal policies for Markov decision processes subject to constraints that result from observability restrictions on the process. We assume that the state of the Markov process is unobservable, but that an observable process related to the state is available. We therefore seek a decision rule that depends only on this observable process. The objective is to minimize the expected average cost over an infinite horizon. We also analyze whether performing more detailed observations yields improved policies.
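
To make the setting in the abstract concrete, the following is a minimal Python sketch, not the algorithm of the article. It assumes a hypothetical toy model (the transition matrices `P`, the costs `C`, and the state-to-group maps are invented for illustration), takes the observable process to be the group a state belongs to, restricts attention to deterministic stationary rules, and finds the best one by brute-force enumeration of the long-run average cost.

```python
from itertools import product

# Hypothetical toy model (not from the article): 3 hidden states, 2 actions.
# P[a][s][t] = probability of moving from state s to state t under action a.
P = [
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]],
    [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]],
]
# C[a][s] = one-step cost of taking action a in state s.
C = [[1.0, 3.0, 4.0], [2.0, 2.0, 1.5]]


def stationary(T, iters=2000):
    """Approximate the stationary distribution of T by power iteration.

    Assumes the chain has a single recurrent class and is aperiodic,
    which holds for every rule in this toy model.
    """
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[s] * T[s][t] for s in range(n)) for t in range(n)]
    return pi


def average_cost(rule, group):
    """Long-run average cost when every state in group g uses action rule[g]."""
    n = len(group)
    actions = [rule[group[s]] for s in range(n)]
    T = [P[actions[s]][s] for s in range(n)]
    pi = stationary(T)
    return sum(pi[s] * C[actions[s]][s] for s in range(n))


def best_rule(group):
    """Enumerate all deterministic group-based rules and keep the cheapest."""
    n_groups = max(group) + 1
    rules = product(range(len(P)), repeat=n_groups)
    best = min(rules, key=lambda r: average_cost(r, group))
    return best, average_cost(best, group)


if __name__ == "__main__":
    coarse = [0, 0, 1]  # states 0 and 1 are indistinguishable
    fine = [0, 1, 2]    # every state is observed separately
    print("coarse observations:", best_rule(coarse))
    print("fine observations:  ", best_rule(fine))
```

Because every rule based on the coarse grouping is also available under the finer grouping, the best cost with finer observations can never be worse in this sketch, which mirrors the abstract's question of whether more detailed observations improve the policy. Enumeration is only practical for very small models.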

Subject Keywords

Management Science and Operations Research, Software, General Mathematics

URI

https://hdl.handle.net/11511/48248

Journal

MATHEMATICAL METHODS OF OPERATIONS RESEARCH

DOI

https://doi.org/10.1007/s001860400402

Collections

Department of Industrial Engineering, Article

Suggestions


Markov decision processes with restricted observations: Finite horizon case Serin, Yaşar Yasemin; Avşar, Zeynep Müge (Wiley, 1997-08-01) In this article we consider a Markov decision process subject to the constraints that result from some observability restrictions. We assume that the state of the Markov process under consideration is unobservable. The states are grouped so that the group that a state belongs to is observable. So, we want to find an optimal decision rule depending on the observable groups instead of the states. This means that the same decision applies to all the states in the same group. We prove that a deterministic optim...
Optimal lot-sizing/vehicle-dispatching policies under stochastic lead times and stepwise fixed costs Alp, O; Erkip, NK; Gullu, R (Institute for Operations Research and the Management Sciences (INFORMS), 2003-01-01) We characterize optimal policies of a dynamic lot-sizing/vehicle-dispatching problem under dynamic deterministic demands and stochastic lead times. An essential feature of the problem is the structure of the ordering cost, where a fixed cost is incurred every time a batch is initiated (or a vehicle is hired) regardless of the portion of the batch (or vehicle) utilized. Moreover, for every unit of demand not satisfied on time, holding and backorder costs are incurred. Under mild assumptions we show that the ...
Neural network calibrated stochastic processes: forecasting financial assets Giebel, Stefan; Rainer, Martin (Springer Science and Business Media LLC, 2013-03-01) If a given dynamical process contains an inherently unpredictable component, it may be modeled as a stochastic process. Typical examples from financial markets are the dynamics of prices (e.g. prices of stocks or commodities) or fundamental rates (exchange rates etc.). The unknown future value of the corresponding stochastic process is usually estimated as the expected value under a suitable measure, which may be determined from distribution of past (historical) values. The predictive power of this estimati...
Stochastic differential games for optimal investment problems in a Markov regime-switching jump-diffusion market Savku, E.; Weber, Gerhard Wilhelm (Springer Science and Business Media LLC, 2020-08-01) We apply the dynamic programming principle to discuss two optimal investment problems by using zero-sum and nonzero-sum stochastic game approaches in a continuous-time Markov regime-switching environment within the framework of behavioral finance. We represent different states of an economy and, consequently, investors' floating levels of psychological reactions by a D-state Markov chain. The first application is a zero-sum game between an investor and the market, and the second one formulates a nonzero-sum sto...
Multi-objective integer programming: A general approach for generating all non-dominated solutions Oezlen, Melih; Azizoğlu, Meral (Elsevier BV, 2009-11-16) In this paper we develop a general approach to generate all non-dominated solutions of the multi-objective integer programming (MOIP) problem. Our approach, which is based on the identification of objective efficiency ranges, is an improvement over the classical epsilon-constraint method. Objective efficiency ranges are identified by solving simpler MOIP problems with fewer objectives. We first provide the classical epsilon-constraint method on the bi-objective integer programming problem for the sake of comple...

Citation Formats

Y. Y. Serin, “Markov decision processes under observability constraints,” Mathematical Methods of Operations Research, pp. 311–328, 2005, Accessed: 2020. [Online]. Available: https://hdl.handle.net/11511/48248.