1. XOR Experiments
Network Creation Program: autoaXORtest.py NetworkProgramInstructions
Graph Plotting: gplotXOR.py
1.1. The Effect of Prediction Layers: Hidden Anticipation and Error Anticipation
The effects of Hidden Anticipation (HA) and Error Anticipation (EA) are explained in the earlier paper "The Multiple Roles of Anticipation in Developmental Robotics".
On the same XOR task with 75% noise described in that paper, the neural network architecture was varied as listed below. For each variation, the effects on three measures were recorded: the TSS error on the predictable portion of the dataset, the hidden layer representations, and the delta values.
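The TSS measure on the predictable portion of the dataset can be sketched as follows. This assumes TSS is the plain sum of squared differences between targets and outputs (some definitions include a factor of 1/2), and that a boolean mask marks which patterns are predictable. The function name, the mask, and the sample values are hypothetical and are not taken from autoaXORtest.py.

```python
import numpy as np

def tss_error(targets, outputs, mask=None):
    """Total sum-squared error, optionally restricted to rows where mask is True."""
    targets = np.asarray(targets, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    squared = (targets - outputs) ** 2
    if mask is not None:
        # Keep only the patterns flagged as predictable.
        squared = squared[np.asarray(mask, dtype=bool)]
    return float(squared.sum())

# Hypothetical data: four predictable XOR patterns followed by two noise patterns.
targets = np.array([[0.0], [1.0], [1.0], [0.0], [1.0], [0.0]])
outputs = np.array([[0.1], [0.8], [0.9], [0.2], [0.5], [0.5]])
predictable = np.array([True, True, True, True, False, False])

tss_predictable = tss_error(targets, outputs, predictable)
tss_all = tss_error(targets, outputs)
print(tss_predictable, tss_all)
```

Reporting the masked value separately keeps the noise patterns, which the network cannot learn, from dominating the error curve.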
1.2. Variations on the HA/EA Prediction and their Effects on Network Learning
- Momentum Values: Non-zero or high momentum values either interfered with Hidden Anticipation or made networks without Hidden Anticipation learn such tasks much faster.
- Using Input (AA) Prediction: see InputVersusHiddenPrediction.
- Hidden Layer Size: AA prediction performed poorly in networks with small hidden layers. With a larger hidden layer, however, AA outperformed HA in speed of learning the predictable data.
- Error Anticipation: http://developmentalrobotics.org/errorAnticipationXOR
- EA performance at separating predictable and unpredictable units in the hidden layer representation is clearly affected by hidden layer size under HA prediction. It would be good to test this in comparison with AA.
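Two of the variations above can be sketched in a few lines. The first function is the classical momentum update, in which each weight change adds a fraction of the previous change. The second shows one reading of the HA/AA distinction, assumed here rather than confirmed by the source: the prediction layer is trained to reproduce the hidden activations under HA and the input pattern under AA. All names, learning rates, and values are hypothetical.

```python
import numpy as np

def momentum_step(weights, grad, prev_delta, lr=0.1, momentum=0.9):
    """One gradient-descent update with classical momentum.

    delta(t) = -lr * grad + momentum * delta(t-1)
    """
    delta = -lr * grad + momentum * prev_delta
    return weights + delta, delta

def prediction_target(inputs, hidden, mode):
    """Choose what the prediction layer is trained to reproduce (assumed reading)."""
    if mode == "HA":
        return hidden.copy()   # predict the network's own hidden activations
    if mode == "AA":
        return inputs.copy()   # predict the input pattern
    raise ValueError(f"unknown mode: {mode!r}")

# Two updates with a hypothetical constant gradient.
w = np.zeros(2)
delta = np.zeros(2)
grad = np.array([1.0, -2.0])
for _ in range(2):
    w, delta = momentum_step(w, grad, delta, lr=0.1, momentum=0.9)
print(w, delta)

# Target sizes differ: AA matches the input size, HA matches the hidden layer size.
inputs = np.array([0.0, 1.0])
hidden = np.array([0.2, 0.7, 0.9])
print(prediction_target(inputs, hidden, "AA").shape,
      prediction_target(inputs, hidden, "HA").shape)
```

With momentum set to 0 the second update equals the first, which gives a simple baseline when checking how momentum interacts with the prediction layers.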
1.2.0.1. Points of Interest
- Prediction Deltas -- Spikes in the delta values for the prediction layers appear at various points in the first 200 epochs of the task.
- Hidden Deltas -- These matter most when considering the effect the prediction layer has on the input-to-hidden weights, which are responsible for the change in learnability when prediction is on.
- Hidden Representations as viewable via PCA -- As tested with 10 hidden units, Error Anticipation had a mild clumping effect on the representations of unpredictable patterns in weight space, but the effect was less reliable with 8 hidden units.
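A minimal sketch of how hidden representations might be projected with PCA and how clumping could be quantified. It assumes the hidden activations are collected into a patterns-by-units matrix. The random data, the mask of unpredictable patterns, and the spread measure (mean distance to the group centroid) are hypothetical and are not taken from gplotXOR.py.

```python
import numpy as np

def pca_project(activations, n_components=2):
    """Project rows onto the top principal components using an SVD of the centered data."""
    X = np.asarray(activations, dtype=float)
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    projected = Xc @ Vt[:n_components].T
    explained = (S ** 2) / np.sum(S ** 2)   # fraction of variance per component
    return projected, explained[:n_components]

def cluster_spread(points, mask):
    """Mean Euclidean distance of the masked points from their own centroid."""
    group = np.asarray(points, dtype=float)[np.asarray(mask, dtype=bool)]
    centroid = group.mean(axis=0)
    return float(np.linalg.norm(group - centroid, axis=1).mean())

# Hypothetical data: 20 patterns, 10 hidden units, first 5 patterns unpredictable.
rng = np.random.default_rng(0)
hidden = rng.random((20, 10))
unpredictable = np.arange(20) < 5

points, ratio = pca_project(hidden)
print(points.shape, ratio)
print(cluster_spread(points, unpredictable))
```

A smaller spread for the unpredictable group in one condition than in another would be one way to put a number on the clumping effect, for example when comparing runs with 10 and 8 hidden units.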
1.3. Future Research // Understanding the Predicting Layers
By analyzing data such as delta values from the hidden and prediction layers and PCA data, and comparing them to the TSS error, we may better understand how and why adding a self-prediction task (Hidden or Input) speeds up the learning of predictable data in the presence of noise.
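One possible starting point for such a comparison is sketched here: flag the epochs where a delta trace spikes, then look at how the TSS error changed at those epochs. The spike criterion (mean plus k standard deviations), the function names, and the synthetic traces are all assumptions made for illustration, not results from the experiments.

```python
import numpy as np

def find_spikes(trace, k=3.0):
    """Indices of epochs where the trace exceeds mean + k * std (assumed criterion)."""
    trace = np.asarray(trace, dtype=float)
    threshold = trace.mean() + k * trace.std()
    return np.flatnonzero(trace > threshold)

def tss_change_at(tss, epochs):
    """Change in TSS relative to the previous epoch, for each listed epoch after epoch 0."""
    tss = np.asarray(tss, dtype=float)
    epochs = np.asarray(epochs)
    epochs = epochs[epochs > 0]
    return tss[epochs] - tss[epochs - 1]

# Synthetic traces over 200 epochs: two delta spikes, one coinciding with a TSS drop.
deltas = np.full(200, 0.01)
deltas[[40, 120]] = 0.5
tss = np.linspace(2.0, 0.5, 200)
tss[40:] -= 0.3

spikes = find_spikes(deltas)
changes = tss_change_at(tss, spikes)
print(spikes, changes)
```

Running the same comparison separately for hidden-layer and prediction-layer deltas would show which spikes, if any, line up with changes in the TSS error.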
