Compact Frequency Memory for Reinforcement Learning with Hidden States

2019-10-28
Polat, Faruk
Cilden, Erkin
Memory-based reinforcement learning approaches keep track of the agent’s past experiences in environments with hidden states. This may require extensive memory use, which limits the practicality of these methods for real-life problems. The motivation behind this study is the observation that less frequent transitions provide more reliable information about the current state of the agent in ambiguous environments. In this work, a selective memory approach based on transition frequencies is proposed to avoid keeping transitions that are unrelated to the agent’s current state. Experiments show that using a compact and selective memory may improve and speed up the learning process.
PRIMA: International Conference on Principles and Practice of Multi-Agent Systems
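The abstract's idea of a frequency-based selective memory can be illustrated with a minimal sketch. This is not the authors' algorithm, which this record does not describe in detail. It assumes a hypothetical `FrequencySelectiveMemory` class that counts how often each (observation, action, next observation) transition has been seen and, when the memory is full, discards the stored transition with the highest count, on the assumption that frequent transitions are the least informative about the hidden state. The class name, the eviction rule, and the example observations are all illustrative assumptions.

```python
from collections import Counter


class FrequencySelectiveMemory:
    """Hypothetical bounded memory that prefers rarely seen transitions."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counts = Counter()  # how often each transition has been observed
        self.items = []          # stored transitions, oldest first

    def add(self, obs, action, next_obs):
        transition = (obs, action, next_obs)
        self.counts[transition] += 1
        self.items.append(transition)
        if len(self.items) > self.capacity:
            # Assumed eviction rule: drop the stored transition with the
            # highest observed frequency; ties go to the oldest entry.
            victim = max(
                range(len(self.items)),
                key=lambda i: self.counts[self.items[i]],
            )
            del self.items[victim]


# Usage: a frequent self-loop competes with one rare transition.
memory = FrequencySelectiveMemory(capacity=3)
for _ in range(3):
    memory.add("corridor", "forward", "corridor")
memory.add("corridor", "forward", "junction")
print(memory.items)
```

In this sketch the rare transition into "junction" survives while one copy of the frequent self-loop is evicted, so the memory stays within its capacity without losing the less common experience.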

Suggestions

A case-based reasoning model as an organizational learning tool
Ozorhon, B.; Dikmen Toker, İrem; Birgönül, Mustafa Talat (2005-06-03)
Organizational learning (OL) is a set of activities to obtain organizational memory (OM) by acquiring, sharing, interpreting, integrating and institutionalizing knowledge. Although the OL process of construction firms has been discussed for several times, utilization of the output of these activities has not been investigated in depth. All companies can learn but the advantage of learning is revealed when companies enhance their decision-making abilities through their OM. The major objective of this...
Joint and interactive effects of trust and (inter) dependence on relational behaviors in long-term channel dyads
Yilmaz, C; Sezen, B; Özdemir, Özlem (Elsevier BV, 2005-04-01)
The authors investigate the effects of trust on the relational behaviors of firms in long-term channel dyads across different interdependence structures. Based on the long-term nature of the empirical setting, trust is posited to exert a positive effect on the emergence of relational behaviors in all interdependence conditions. This positive effect of trust is hypothesized to be stronger in highly and symmetrically interdependent channel dyads than in low-interdependence-type symmetric dyads. In addition, f...
Closed-form sample probing for training generative models in zero-shot learning
Çetin, Samet; Cinbiş, Ramazan Gökberk; Department of Computer Engineering (2022-2-10)
Generative modeling based approaches have led to significant advances in generalized zero-shot learning over the past few years. These approaches typically aim to learn a conditional generator that synthesizes training samples of classes conditioned on class embeddings, such as attribute based class definitions. The final zero-shot learning model can then be obtained by training a supervised classification model over the real and/or synthesized training samples of seen and unseen classes, combined. Therefor...
Automatic identification of transitional bottlenecks in reinforcement learning under partial observability
Aydın, Hüseyin; Polat, Faruk; Department of Computer Engineering (2017)
Instance-based methods are proven tools to solve reinforcement learning problems with hidden states. Nearest Sequence Memory (NSM) is a widely known instance-based approach mainly based on k-Nearest Neighbor algorithm. NSM keeps track of raw history of action-observation-reward instances within a fixed length (or ideally unlimited) memory. It calculates the neighborhood for the current state through a recursive comparison of the matching action-observation-reward tuples with the previous ones. The ones with...
Using Transitional Bottlenecks to Improve Learning in Nearest Sequence Memory Algorithm
Aydın, Hüseyin; Polat, Faruk (2017-11-08)
Instance-based methods are proven tools to solve reinforcement learning problems with hidden states. Nearest Sequence Memory (NSM) is a widely known instance-based approach mainly based on k-Nearest Neighbor algorithm. It keeps the history of the agent in terms of action-observation-reward tuples and uses it to vote for the best upcoming action. In this work, an improving heuristic is proposed for the NSM algorithm which provides the agent an additional prior information, namely transitional bottlenecks, on...
Citation Formats
F. Polat and E. Cilden, “Compact Frequency Memory for Reinforcement Learning with Hidden States,” presented at the PRIMA: International Conference on Principles and Practice of Multi-Agent Systems, Turin, Italy, 2019, Accessed: 2020. [Online]. Available: https://hdl.handle.net/11511/57829.