Optimal Policy Synthesis from A Sequence of Goal Sets with An Application to Electric Distribution System Restoration

2021-01-01
Motivated by the post-disaster distribution system restoration problem, we study the problem of synthesizing an optimal policy for a Markov Decision Process (MDP) from a sequence of goal sets. For each goal set, the aim is both to maximize the probability of reaching the goal set and to minimize the expected time to reach it. The order of the goal sets represents their priority: the generated policy is optimal with respect to the first goal set, optimal with respect to the second goal set among the policies that are optimal with respect to the first, and so on. To synthesize such a policy, we iteratively filter the applicable actions according to the goal sets. We illustrate the developed method on a sample distribution system. Copyright (C) 2021 The Authors.
7th IFAC Conference on Analysis and Design of Hybrid Systems (ADHS)
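
The iterative action filtering described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, only a minimal illustration under stated assumptions: the MDP is a tiny hand-made example, every step takes one time unit, value iteration is run for a fixed number of sweeps, and the expected-time criterion is applied only at states from which the goal set is reached with probability 1 (how the paper treats the remaining states is not stated in the abstract). The names `q_value`, `filter_actions`, and `synthesize`, and all states and actions, are hypothetical.

```python
# Toy MDP (made up for illustration): state -> action -> {next_state: probability}
mdp = {
    "s0": {"x": {"a": 1.0}, "y": {"b": 1.0}, "z": {"c": 1.0}},
    "a": {"go": {"c": 1.0}},
    "b": {"go": {"c": 0.5, "d": 0.5}},
    "c": {"stay": {"c": 1.0}},
    "d": {"stay": {"d": 1.0}},
}
# Goal sets in priority order: first reach {a, b}, then reach {c}.
goals = [{"a", "b"}, {"c"}]


def q_value(mdp, values, s, a):
    """Expected value of `values` at the successor of state s under action a."""
    return sum(pr * values[t] for t, pr in mdp[s][a].items())


def filter_actions(mdp, allowed, goal, sweeps=1000, tol=1e-9):
    """Keep only the allowed actions that are optimal for one goal set."""
    # Pass 1: maximal reachability probability by value iteration from 0.
    p = {s: 1.0 if s in goal else 0.0 for s in mdp}
    for _ in range(sweeps):
        for s in mdp:
            if s not in goal:
                p[s] = max(q_value(mdp, p, s, a) for a in allowed[s])
    keep = {}
    for s, acts in allowed.items():
        if s in goal:
            keep[s] = set(acts)  # goal states impose no restriction
        else:
            keep[s] = {a for a in acts if q_value(mdp, p, s, a) >= p[s] - tol}

    # Pass 2 (assumption): minimize expected time only where the goal is
    # reached with probability 1; other states keep a time value of 0.
    sure = {s for s in mdp if s not in goal and p[s] >= 1.0 - tol}
    t = {s: 0.0 for s in mdp}
    for _ in range(sweeps):
        for s in sure:
            t[s] = 1.0 + min(q_value(mdp, t, s, a) for a in keep[s])
    for s in sure:
        keep[s] = {a for a in keep[s] if 1.0 + q_value(mdp, t, s, a) <= t[s] + tol}
    return keep


def synthesize(mdp, goals):
    """Filter the applicable actions goal set by goal set, in priority order."""
    allowed = {s: set(acts) for s, acts in mdp.items()}
    for goal in goals:
        allowed = filter_actions(mdp, allowed, goal)
    return allowed


result = synthesize(mdp, goals)
print(result["s0"])
```

In this toy example, the first goal set removes action `z` at `s0` (it never reaches `a` or `b`), and the second goal set then breaks the tie between `x` and `y` in favor of `x`, which reaches `c` with probability 1. A production implementation would need a convergence check instead of a fixed sweep count and a more careful treatment of states where the goal is reached with probability below 1.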

Suggestions

A Stochastic Maximum Principle for a Markov Regime-Switching Jump-Diffusion Model with Delay and an Application to Finance
Savku, Emel; Weber, Gerhard Wilhelm (2018-11-01)
We study a stochastic optimal control problem for a delayed Markov regime-switching jump-diffusion model. We establish necessary and sufficient maximum principles under full and partial information for such a system. We prove the existence-uniqueness theorem for the adjoint equations, which are represented by an anticipated backward stochastic differential equation with jumps and regimes. We illustrate our results with an optimal consumption problem from a cash flow with delay and regimes.
Optimal Limit Order Book Trading Strategies with Stochastic Volatility in the Underlying Asset
Aydoğan, Burcu; Uğur, Ömür; Aksoy, Ümit (2022-1-01)
In quantitative finance, there have been numerous new aspects and developments related to stochastic control and optimization problems, which handle the controlled variables that steer the behavior of a dynamical system to achieve certain objectives. In this paper, we address optimal trading strategies via price impact models using the Heston stochastic volatility framework, including jump processes either in the price or in the volatility of the price dynamics, with the aim of maximizing expected return of t...
Improving reinforcement learning by using sequence trees
Girgin, Sertan; Polat, Faruk; Alhajj, Reda (Springer Science and Business Media LLC, 2010-12-01)
This paper proposes a novel approach to discover options in the form of stochastic conditionally terminating sequences; it shows how such sequences can be integrated into the reinforcement learning framework to improve the learning performance. The method utilizes stored histories of possible optimal policies and constructs a specialized tree structure during the learning process. The constructed tree facilitates the process of identifying frequently used action sequences together with states that are visit...
Expectation propagation for state estimation with discrete-valued hidden random variables
Sarıtaş, Elif; Orguner, Umut; Department of Electrical and Electronics Engineering (2023-2-21)
In this thesis, the expectation propagation (EP) approach of Minka is considered for estimation problems in dynamical systems with discrete hidden random variables, where optimal posteriors are usually intractable. The concept of context adjustment is introduced to avoid or alleviate, in a systematic way, the indefinite covariance problems encountered in standard EP implementations. Additionally, the moment projection (M-projection) problem involving pseudo-Gaussian likelihoods as factors is solved to be used in t...
Advances and applications of stochastic Ito-Taylor approximation and change of time method in the financial sector
Öz, Hacer; Weber, Gerhard Wilhelm; Department of Financial Mathematics (2013)
In this thesis, we discuss two different approaches for the solution of stochastic differential equations (SDEs): the Ito-Taylor method (IT-M) and the change of time method (CT-M). The first approach is an approximation in the space domain and the second is a probabilistic transformation in the time domain. Both approaches may be considered to substitute SDEs for more “practical” representations and solutions. IT-M has mostly been studied for one-dimensional SDEs. The main aim of this work is to extend the theory of one-dimension...
Citation Formats
I. Isik, O. Y. Arpalı, and E. Aydın Göl, “Optimal Policy Synthesis from A Sequence of Goal Sets with An Application to Electric Distribution System Restoration,” Brussels, Belgium, 2021, vol. 54, Accessed: 00, 2021. [Online]. Available: https://hdl.handle.net/11511/95008.