A context aware model for autonomous agent stochastic planning
Date
2019-02-01
Author
Ekmekci, Ömer
Polat, Faruk
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Markov Decision Processes (MDPs) cannot make effective use of domain information because of their representational limitations. The lack of elements that make a model aware of context leads to unstructured representations of a problem, such as raw probability matrices or lists. This makes these tools significantly less efficient at determining a useful policy as the state space of a task grows, which is the case for more realistic problems with localized dependencies between states and actions. In this paper, we present a new state machine based on MDP, called Context-Aware Markov Decision Process (CA-MDP), for representing Markovian sequential decision-making problems in a more structured manner. CA-MDP changes and augments the facilities of MDP by integrating causal relationships between actions and states, thereby enabling a structural, and where possible compact, representation of tasks. To demonstrate the expressive power of CA-MDP, we give theoretical bounds on the complexity of conversion between MDP and CA-MDP. Next, to generate an optimal policy from a CA-MDP encoding by exploiting these newly defined facilities, we devise a new solver algorithm based on value iteration (VI), called Context-Aware Value Iteration (CA-VI). Although regular dynamic programming (DP) based algorithms are successful at determining optimal policies, they do not scale well with respect to the state-action space, which makes both the MDP encoding and the related solver mechanism practically unusable for real-life problems. Our solver algorithm overcomes this scalability problem by integrating the structural information provided in CA-MDP. First, we give a theoretical analysis of CA-VI by examining the expected number of Bellman updates performed on arbitrary tasks. Finally, we present experiments on numerous problems, with remarks and discussion on certain aspects of CA-VI and CA-MDP, to justify our theoretical analyses empirically and to assess the real performance of CA-VI with the CA-MDP formulation. The assessment analyses execution time and checks how close it gets to the practical minimum runtime bound, relative to the performance of VI with an MDP encoding of the same task.
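For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of standard value iteration on a flat MDP encoding. It is not CA-VI and does not use any CA-MDP structure, since the abstract does not give those details. The toy three-state chain, the dictionary encoding, and all names (transitions, q_value, value_iteration) are assumptions made for illustration. The sketch also counts Bellman updates, the quantity the abstract says is analysed.

```python
from typing import Dict, List, Tuple

# Hypothetical flat encoding (an assumption, not the paper's format):
# transitions[state][action] = [(probability, next_state, reward), ...]
Outcome = Tuple[float, int, float]
Transitions = Dict[int, Dict[str, List[Outcome]]]


def q_value(outcomes: List[Outcome], values: Dict[int, float], gamma: float) -> float:
    """Expected return of one action under the current value estimates."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9, tol: float = 1e-8):
    """Plain in-place value iteration; returns values, greedy policy, update count."""
    values = {s: 0.0 for s in transitions}
    updates = 0  # number of Bellman updates performed
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state: nothing to update
                continue
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
            updates += 1
        if delta < tol:  # stop once a full sweep changes nothing noticeable
            break
    # Greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in transitions.items()
        if actions
    }
    return values, policy, updates


# Toy chain 0 -> 1 -> 2, where state 2 is terminal and entering it pays 1.
transitions: Transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "right": [(1.0, 1, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "right": [(1.0, 2, 1.0)]},
    2: {},
}

values, policy, updates = value_iteration(transitions)
print(values, policy, updates)
```

Every sweep in this sketch touches every non-terminal state regardless of how the states depend on each other. That per-sweep cost over the whole state-action space is the scalability issue the abstract attributes to regular DP-based solvers.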
Subject Keywords
Control and Systems Engineering, Software, General Mathematics, Computer Science Applications
URI
https://hdl.handle.net/11511/39811
Journal
ROBOTICS AND AUTONOMOUS SYSTEMS
DOI
https://doi.org/10.1016/j.robot.2018.11.013
Collections
Department of Computer Engineering, Article