Optimal Policy Synthesis from A Sequence of Goal Sets with An Application to Electric Distribution System Restoration

2021-01-01
Motivated by the post-disaster distribution system restoration problem, we study the problem of synthesizing an optimal policy for a Markov Decision Process (MDP) from a sequence of goal sets. For each goal set, the aim is both to maximize the probability of reaching the set and to minimize the expected time to reach it. The order of the goal sets represents their priority: the policy must be optimal with respect to the first goal set, optimal with respect to the second goal set among the policies that are optimal with respect to the first, and so on. To synthesize such a policy, we iteratively filter the applicable actions according to the goal sets. We illustrate the developed method on a sample distribution system. Copyright (C) 2021 The Authors.
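
The iterative action filtering described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the toy MDP, and the tolerance are all hypothetical. Every state is assumed to have at least one action. Expected time is measured, as a convention of this sketch only, as the number of steps until the goal set or a state from which the goal set is unreachable is entered.

```python
TOL = 1e-9  # tolerance for treating two values as tied (assumption)

def expect(dist, value):
    """Expected value of `value` under the successor distribution `dist`."""
    return sum(p * value[t] for t, p in dist.items())

def iterate(P, fixed, step, sweeps=10_000):
    """Value iteration from zero; states in `fixed` keep their given value."""
    value = {s: fixed.get(s, 0.0) for s in P}
    for _ in range(sweeps):
        new = {s: fixed[s] if s in fixed else step(s, value) for s in P}
        delta = max(abs(new[s] - value[s]) for s in P)
        value = new
        if delta < 1e-12:
            break
    return value

def filter_for_goal(P, allowed, goal):
    """Keep only actions that are optimal for one goal set."""
    # Stage 1: maximal probability of reaching the goal set.
    prob = iterate(P, {s: 1.0 for s in goal},
                   lambda s, v: max(expect(P[s][a], v) for a in allowed[s]))
    keep = {s: [a for a in allowed[s]
                if s in goal or expect(P[s][a], prob) >= prob[s] - TOL]
            for s in P}
    # Stage 2: minimal expected number of steps among the kept actions.
    # The clock stops in goal states and in states with reach probability 0
    # (a convention of this sketch, so that all values stay finite).
    done = {s: 0.0 for s in P if s in goal or prob[s] <= TOL}
    time = iterate(P, done,
                   lambda s, v: 1 + min(expect(P[s][a], v) for a in keep[s]))
    return {s: [a for a in keep[s]
                if s in done or 1 + expect(P[s][a], time) <= time[s] + TOL]
            for s in P}

def synthesize(P, goal_sets):
    """Filter actions goal set by goal set, in priority order."""
    allowed = {s: list(P[s]) for s in P}
    for goal in goal_sets:
        allowed = filter_for_goal(P, allowed, goal)
    return allowed

# Hypothetical toy MDP: P[state][action] = {successor: probability}.
P = {
    "s0": {
        "to_x":   {"x": 1.0},
        "to_y":   {"y": 1.0},
        "risky":  {"x": 0.5, "s0": 0.5},    # same probability, but slower
        "bad":    {"x": 0.5, "dead": 0.5},  # lower probability
        "direct": {"z": 1.0},               # best for goal 2, misses goal 1
    },
    "x":    {"on": {"z": 0.9, "dead": 0.1}},
    "y":    {"on": {"z": 0.5, "dead": 0.5}},
    "z":    {"stay": {"z": 1.0}},
    "dead": {"stay": {"dead": 1.0}},
}

policy = synthesize(P, [{"x", "y"}, {"z"}])
print(policy["s0"])  # ['to_x']
```

In this toy model the first goal set leaves a tie between `to_x` and `to_y`, and the second goal set breaks the tie. The action `direct` would be best for the second goal set alone, but it is removed because it is not optimal for the higher-priority first goal set.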

Suggestions

A Stochastic Maximum Principle for a Markov Regime-Switching Jump-Diffusion Model with Delay and an Application to Finance
Savku, Emel; Weber, Gerhard Wilhelm (2018-11-01)
We study a stochastic optimal control problem for a delayed Markov regime-switching jump-diffusion model. We establish necessary and sufficient maximum principles under full and partial information for such a system. We prove the existence-uniqueness theorem for the adjoint equations, which are represented by an anticipated backward stochastic differential equation with jumps and regimes. We illustrate our results with an optimal consumption problem from a cash flow with delay and regimes.
Improving reinforcement learning by using sequence trees
Girgin, Sertan; Polat, Faruk; Alhajj, Reda (Springer Science and Business Media LLC, 2010-12-01)
This paper proposes a novel approach to discover options in the form of stochastic conditionally terminating sequences; it shows how such sequences can be integrated into the reinforcement learning framework to improve the learning performance. The method utilizes stored histories of possible optimal policies and constructs a specialized tree structure during the learning process. The constructed tree facilitates the process of identifying frequently used action sequences together with states that are visit...
Asymptotic behavior of Markov semigroups on preduals of von Neumann algebras
Emel'yanov, EY; Wolff, MPH (Elsevier BV, 2006-02-15)
We develop a new approach for investigating the asymptotic behavior of Markov semigroups on preduals of von Neumann algebras. Using our technique, we establish several results about mean ergodicity, statistical stability, and constrictiveness of Markov semigroups. (c) 2005 Elsevier Inc. All rights reserved.
Advances and applications of stochastic Ito-Taylor approximation and change of time method in the financial sector
Öz, Hacer; Weber, Gerhard Wilhelm; Department of Financial Mathematics (2013)
In this thesis, we discuss two different approaches for the solution of stochastic differential equations (SDEs): the Ito-Taylor method (IT-M) and the change of time method (CT-M). The first approach is an approximation in the space domain and the second is a probabilistic transformation in the time domain. Both approaches may be seen as replacing SDEs with more “practical” representations and solutions. IT-M has mostly been studied for one-dimensional SDEs. The main aim of this work is to extend the theory of one-dimension...
Hierarchical and decentralized multitasking control of discrete event systems
Schmidt, Klaus Verner; Cury, José E. R. (2007-12-01)
In this paper, a hierarchical and decentralized approach for composite discrete-event systems (DES) that have to fulfill multiple tasks is elaborated. Colored marking generators that can distinguish classes of tasks are used as the system model, and a colored abstraction procedure as well as sufficient conditions for nonblocking and hierarchically consistent control are developed. It is shown that the computational complexity for supervisor computation is reduced. A flexible manufacturing system example dem...
Citation Formats
I. Isik, O. Y. Arpalı, and E. Aydın Göl, “Optimal Policy Synthesis from A Sequence of Goal Sets with An Application to Electric Distribution System Restoration,” Brussels, Belgium, 2021, vol. 54, Accessed: 00, 2021. [Online]. Available: https://hdl.handle.net/11511/95008.