Reinforcement learning with internal expectation for the random neural network

The reinforcement learning scheme proposed in Halici (1997) (Halici, U., 1997. Journal of Biosystems 40 (1/2), 83-91) for the random neural network (Gelenbe, E., 1989b. Neural Computation 1 (4), 502-510) is based on reward and performs well in stationary environments. However, when the environment is not stationary, the scheme gets stuck on the previously learned action, and extinction is not possible. In this paper, the reinforcement learning scheme is extended with a weight update rule that takes into account the internal expectation of reinforcement. With the proposed scheme, the system behaves as in learning with reward when the reward for the learned action is not below the internal expectation; otherwise, it behaves as in learning with punishment, so that other possibilities can be explored. This scheme makes extinction possible while still converging well to the most rewarding action.
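
The switching behaviour described above can be illustrated with a minimal sketch. It is not the paper's update rule for the random neural network: the abstract does not give the formulas, so the function name `update`, the parameters `rate` and `smoothing`, the use of one scalar weight per action, and the exponential moving average used as the internal expectation are all assumptions made for illustration only.

```python
def update(weights, expectation, action, reward, rate=0.1, smoothing=0.2):
    """One illustrative learning step (hypothetical, not the paper's rule).

    weights     -- list with one non-negative weight per action (assumption)
    expectation -- current internal expectation of reinforcement
    action      -- index of the action that was just taken
    reward      -- reinforcement received for that action
    """
    new_weights = list(weights)
    if reward >= expectation:
        # Reward mode: the reward is not below the internal expectation,
        # so the chosen action is strengthened.
        new_weights[action] += rate * reward
    else:
        # Punishment mode: the reward fell short of the expectation,
        # so the chosen action is weakened and others can be explored.
        shortfall = expectation - reward
        new_weights[action] = max(0.0, new_weights[action] - rate * shortfall)
    # Assumed form of the internal expectation: an exponential moving
    # average of the reinforcement received so far.
    new_expectation = (1.0 - smoothing) * expectation + smoothing * reward
    return new_weights, new_expectation


# Usage: the same action is first rewarded, then stops paying off.
w, e = [1.0, 1.0], 0.5
w, e = update(w, e, action=0, reward=1.0)  # reward mode
w, e = update(w, e, action=0, reward=0.0)  # punishment mode
```

In this sketch, an action that used to be rewarding loses weight once its reward drops below the running expectation, which is the mechanism that allows extinction in a non-stationary environment.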