To build artificial agents that can understand their environment, the agents must be able to learn the semantics of their own internal states. Since neurons embedded in the brain have no direct access to the outside world, motor interaction with the environment is necessary. Sensory invariance driven action (SIDA) has been proposed to learn the meaning of internal states through reinforcement learning, as shown in Figure 1. However, SIDA uses fixed Gabor filters to interact with the environment. The receptive fields (RFs) in SIDA could also be learned from the input statistics at the same time, rather than learning only the state-action mapping. We propose using adaptive filters based on independent component analysis (ICA) within the SIDA framework for receptive field learning.
Figure 1. (Left) Visual agent model from  . A simple sensorimotor agent is illustrated. The agent has a limited field of view onto which part of the input from the environment (I) is projected. The projection is then compared to the action. We propose updating the filter bank (f) with the perceptual input. (Right) Visual environment: an image preprocessed with a Difference of Gaussians (DoG) filter.
We tested two agents, one with 8 and one with 4 sensory primitives (filters). The corresponding numbers of motor primitives (actions) were 16 and 8, with two directions for each filter. The goal can be formulated as making the reward table, Q, more diagonal, where the rows correspond to the sensory state (roughly, the orientation) and the columns to the motor output (the gaze direction). We compared the degree of diagonalization of Q, simply based on .
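To make the notion of a "more diagonal" reward table concrete, the sketch below scores a table by the fraction of its total absolute reward that falls on the entries an ideal agent would use. This is an illustrative stand-in, not the measure used in this work. The function names (ideal_columns, diagonality, ideal_table, uniform_table) and the column layout, in which columns i and i + n hold the two opposite gaze directions for filter i, are assumptions introduced only for this example.

```python
# Hypothetical diagonality score for a reward table Q.
# NOT the measure used in the text; layout and names are assumptions.


def ideal_columns(row, n_filters):
    """Columns assumed to match sensory state `row`: the two opposite
    gaze directions along that filter's orientation."""
    return (row, row + n_filters)


def diagonality(q):
    """Fraction of the total absolute reward on the assumed ideal entries.

    `q` is a list of n_filters rows, each with 2 * n_filters columns.
    Returns a value in [0, 1]; 1.0 means all reward is on ideal entries.
    """
    n_filters = len(q)
    if any(len(row) != 2 * n_filters for row in q):
        raise ValueError("expected n_filters rows and 2 * n_filters columns")
    total = sum(abs(v) for row in q for v in row)
    if total == 0:
        return 0.0
    on_ideal = sum(
        abs(q[i][j])
        for i in range(n_filters)
        for j in ideal_columns(i, n_filters)
    )
    return on_ideal / total


def ideal_table(n_filters):
    """Table with reward 1.0 only on the assumed ideal entries."""
    return [
        [1.0 if j in ideal_columns(i, n_filters) else 0.0
         for j in range(2 * n_filters)]
        for i in range(n_filters)
    ]


def uniform_table(n_filters):
    """Table with the same reward everywhere (no learned structure)."""
    return [[1.0] * (2 * n_filters) for _ in range(n_filters)]


if __name__ == "__main__":
    print(diagonality(ideal_table(4)))
    print(diagonality(uniform_table(8)))
```

Under these assumptions an ideal table scores 1.0, while a uniform table with 8 filters scores 2/16 = 0.125, since only two of the sixteen actions in each row count as ideal.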
Figure 2 shows the two types of filters and their reward tables for the two numbers of primitives. The adaptive filters are more edge-like than the Gabor filters, resembling the edges in the input image. The diagonalization values with Gabor filters are 0.6379 and 0.7808, while those with the adaptive filters are 0.7091 and 0.9907. This suggests that our proposed method understands the environment better than the original SIDA.
Figure 2. Filters and the reward tables with 8 and 4 primitives. (Left) with 8 primitives, (Right) with 4 primitives. (Top) Gabor filters and the corresponding reward tables, (Bottom) adaptive filters and their reward tables.
We developed a method that updates the receptive fields and the reward table simultaneously. It yields a reward table closer to the ideal, especially when the number of filters is small. The updated receptive fields are more edge-like, which is better suited to following the actions, and the resulting reward table resembles an identity matrix. The proposed approach gives rise not only to sensory representations but also to meaningful motor plans.