Reinforcement learning with internal expectation for the random neural network
Date
2000-10-01
Author
Halıcı, Uğur
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The reinforcement learning scheme proposed in Halici (1997) (Halici, U., 1997. Journal of Biosystems 40 (1/2), 83-91) for the random neural network (Gelenbe, E., 1989b. Neural Computation 1 (4), 502-510) is based on reward and performs well in stationary environments. However, when the environment is not stationary, it gets stuck on the previously learned action, and extinction is not possible. In this paper, the reinforcement learning scheme is extended by introducing a weight update rule that takes into account the internal expectation of reinforcement. With the proposed scheme, the system behaves as in learning with reward when the reward for the learned action is not below the internal expectation; otherwise, it behaves as in learning with punishment so that other possibilities can be explored. This scheme makes extinction possible while still converging well to the most rewarding action.
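The switch between reward and punishment described in the abstract can be illustrated with a toy sketch. This is not the paper's random neural network or its weight update rule: the per-action preference values, the running-average expectation, the learning rate, and the function name `simulate` are all assumptions chosen only to show the idea on a two-action environment whose rewards change halfway through.

```python
def simulate(rewards_by_phase, steps_per_phase, use_expectation,
             lr=0.1, decay=0.9):
    """Toy agent; returns the list of actions chosen at each step.

    All quantities here are hypothetical stand-ins, not the paper's model:
    - prefs: one preference per action (stand-in for network weights)
    - expectation: running average of received reinforcement
    """
    n_actions = len(rewards_by_phase[0])
    prefs = [0.5] * n_actions
    expectation = 0.0
    history = []
    for rewards in rewards_by_phase:          # each phase has fixed rewards
        for _ in range(steps_per_phase):
            # Greedy choice; ties go to the lowest action index.
            action = max(range(n_actions), key=lambda a: prefs[a])
            r = rewards[action]
            if not use_expectation or r >= expectation:
                # Reward mode: move the preference toward 1, scaled by r.
                prefs[action] += lr * r * (1.0 - prefs[action])
            else:
                # Punishment mode: reward fell below the internal
                # expectation, so weaken the chosen action.
                prefs[action] -= lr * prefs[action]
            # Update the internal expectation as a running average.
            expectation = decay * expectation + (1.0 - decay) * r
            history.append(action)
    return history


# Action 0 is best at first, then its reward drops to zero.
phases = [[1.0, 0.5], [0.0, 0.5]]
reward_only = simulate(phases, 50, use_expectation=False)
with_expectation = simulate(phases, 50, use_expectation=True)
print("reward only, last action:", reward_only[-1])
print("with expectation, last action:", with_expectation[-1])
```

In this toy setup the reward-only variant keeps choosing action 0 after the change, because a zero reward never alters its preference, while the variant with an internal expectation punishes action 0 until action 1 is tried and then settles on it.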
Subject Keywords
Management Science and Operations Research, Modelling and Simulation, Information Systems and Management
URI
https://hdl.handle.net/11511/42604
Journal
European Journal Of Operational Research
DOI
https://doi.org/10.1016/s0377-2217(99)00479-8
Collections
Department of Electrical and Electronics Engineering, Article
Suggestions
Reinforcement learning with internal expectation in the random neural networks for cascaded decisions
Halıcı, Uğur (Elsevier BV, 2001-10-16)
The reinforcement learning scheme proposed in Halici (J. Biosystems 40 (1997) 83) for the random neural network (RNN) (Neural Computation 1 (1989) 502) is based on reward and performs well for stationary environments. However, when the environment is not stationary it suffers from getting stuck to the previously learned action and extinction is not possible. To overcome the problem, the reinforcement scheme is extended in Halici (Eur. J. Oper. Res., 126(2000) 288) by introducing a new weight update rule (E-...
Multi-objective integer programming: A general approach for generating all non-dominated solutions
Oezlen, Melih; Azizoğlu, Meral (Elsevier BV, 2009-11-16)
In this paper we develop a general approach to generate all non-dominated solutions of the multi-objective integer programming (MOIP) Problem. Our approach, which is based on the identification of objective efficiency ranges, is an improvement over classical epsilon-constraint method. Objective efficiency ranges are identified by solving simpler MOIP problems with fewer objectives. We first provide the classical epsilon-constraint method on the bi-objective integer programming problem for the sake of comple...
A NEW HEURISTIC APPROACH FOR THE MULTIITEM DYNAMIC LOT-SIZING PROBLEM
KIRCA, O; KOKTEN, M (Elsevier BV, 1994-06-09)
In this paper a framework for a new heuristic approach for solving the single level multi-item capacitated dynamic lot sizing problem is presented. The approach uses an iterative item-by-item strategy for generating solutions to the problem. In each iteration a set of items are scheduled over the planning horizon and the procedure terminates when all items are scheduled. An algorithm that implements this approach is developed in which in each iteration a single item is selected and scheduled over the planni...
Approximate queueing models for capacitated multi-stage inventory systems under base-stock control
Avşar, Zeynep Müge (Elsevier BV, 2014-07-01)
A queueing analysis is presented for base-stock controlled multi-stage production-inventory systems with capacity constraints. The exact queueing model is approximated by replacing some state-dependent conditional probabilities (that are used to express the transition rates) by constants. Two recursive algorithms (each with several variants) are developed for analysis of the steady-state performance. It is analytically shown that one of these algorithms is equivalent to the existing approximations given in ...
THE SCHEDULING OF ACTIVITIES TO MAXIMIZE THE NET PRESENT VALUE OF PROJECTS - COMMENT
SEPIL, C (Elsevier BV, 1994-02-24)
In a recent paper, Elmaghraby and Herroelen have presented an algorithm to maximize the present value of a project. Here, with the help of an example, it is shown that the algorithm may not find the optimal solution.
Citation Formats
U. Halıcı, “Reinforcement learning with internal expectation for the random neural network,” European Journal Of Operational Research, pp. 288–307, 2000, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/42604.