Knowledge seeker's blog: Artificial neural networks Day 04

The validation and test curves are very similar, which suggests that training ran without problems. If the test curve had risen significantly before the validation curve did, some overfitting might have occurred.

A regression plot shows the relationship between the outputs of the network and the targets. If the training were perfect, the network outputs and the targets would be exactly equal, but the relationship is rarely perfect in practice.
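To make the regression plot concrete, here is a minimal sketch of the numbers behind it: a least-squares line fitted through (target, output) pairs plus the correlation coefficient R. The function name `regression_fit` and the sample data are made up for illustration; they are not part of the toolbox.

```python
import math

def regression_fit(targets, outputs):
    """Fit outputs ~ slope * targets + intercept and return (slope, intercept, R)."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_y = sum(outputs) / n
    # Sums of squares and cross-products around the means.
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_yy = sum((y - mean_y) ** 2 for y in outputs)
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(targets, outputs))
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    r = s_ty / math.sqrt(s_tt * s_yy)
    return slope, intercept, r

targets = [0.0, 1.0, 2.0, 3.0, 4.0]
outputs = [0.1, 0.9, 2.2, 2.8, 4.1]  # hypothetical network outputs
slope, intercept, r = regression_fit(targets, outputs)
print(f"slope={slope:.3f} intercept={intercept:.3f} R={r:.4f}")
```

A perfect fit would give a slope of 1, an intercept of 0 and R equal to 1; the further the values drift from these, the worse the outputs track the targets.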

When using a gradient descent algorithm, you typically use a smaller learning rate for batch mode training than for incremental training, because all the individual gradients are summed before the step change to the weights is determined.

When training multilayer networks, the general practice is to first divide the data into three subsets. The first subset is the training set, which is used for computing the gradient and updating the network weights and biases.

The second subset is the validation set. The error on the validation set is monitored during the training process. The validation error normally decreases during the initial phase of training, as does the training set error. However, when the network begins to overfit the data, the error on the validation set typically begins to rise. The network weights and biases are saved at the minimum of the validation set error.
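The early-stopping idea described above can be sketched in a few lines. This is only an illustration under my own assumptions, not the toolbox's code: `early_stopping` is a hypothetical helper that takes a list of per-iteration validation errors, and `max_fail` is the number of successive non-improving iterations tolerated before stopping.

```python
def early_stopping(val_errors, max_fail=6):
    """Return (best_iteration, stop_iteration) for a sequence of validation errors."""
    best_error = float("inf")
    best_iter = None
    fails = 0
    for it, err in enumerate(val_errors):
        if err < best_error:
            # New minimum: this is where the weights and biases would be saved.
            best_error, best_iter, fails = err, it, 0
        else:
            fails += 1
            if fails >= max_fail:
                return best_iter, it
    # The sequence ended before the failure limit was reached.
    return best_iter, len(val_errors) - 1

# Made-up validation errors: they fall, reach a minimum, then rise (overfitting).
val_errors = [0.9, 0.6, 0.4, 0.3, 0.35, 0.33, 0.36, 0.4, 0.45, 0.5, 0.6]
best_iter, stop_iter = early_stopping(val_errors, max_fail=6)
print(best_iter, stop_iter)
```

With this made-up error sequence the minimum is at iteration 3 and training stops at iteration 9, after six successive iterations without improvement. The weights kept are the ones from iteration 3.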

The test set error is not used during training, but it is used to compare different models. It is also useful to plot the test set error during the training process. If the error on the test set reaches a minimum at a significantly different iteration number than the validation set error, this might indicate a poor division of the data set.
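A random three-way division of the data could look like the sketch below. The 70/15/15 ratios, the fixed seed and the name `split_indices` are assumptions chosen for the example; use whatever ratios suit your data.

```python
import random

def split_indices(n_samples, train_ratio=0.70, val_ratio=0.15, seed=0):
    """Shuffle sample indices and cut them into training, validation and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_train = round(train_ratio * n_samples)
    n_val = round(val_ratio * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))
```

Shuffling before cutting matters: if the samples are ordered in some way, a plain front-to-back cut can give the validation and test sets different characteristics, which is one way to end up with the poor division mentioned above.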

In incremental mode, the gradient is computed and the weights are updated after each input is applied to the network. In batch mode, all the inputs in the training set are applied to the network before the weights are updated.
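The difference between the two modes, and the learning-rate remark made earlier in this post, can be seen on a toy problem. The sketch below assumes a one-weight linear model y = w * x with squared error; the function names and data are invented for the example.

```python
def incremental_epoch(w, inputs, targets, lr):
    """One pass over the data, updating the weight after every single input."""
    for x, t in zip(inputs, targets):
        grad = (w * x - t) * x  # gradient of 0.5 * (w*x - t)**2 with respect to w
        w -= lr * grad
    return w

def batch_epoch(w, inputs, targets, lr):
    """One pass over the data, summing all gradients before a single update."""
    total_grad = sum((w * x - t) * x for x, t in zip(inputs, targets))
    return w - lr * total_grad

inputs = [1.0, 2.0, 3.0, 4.0]
targets = [2.0, 4.0, 6.0, 8.0]  # generated by t = 2 * x, so the ideal weight is 2

w_inc = w_bat = 0.0
for _ in range(50):
    w_inc = incremental_epoch(w_inc, inputs, targets, lr=0.1)
    w_bat = batch_epoch(w_bat, inputs, targets, lr=0.1 / len(inputs))

# Same learning rate as the incremental run, but in batch mode: the summed
# gradient makes each step too large and the weight diverges.
w_diverged = 0.0
for _ in range(10):
    w_diverged = batch_epoch(w_diverged, inputs, targets, lr=0.1)

print(w_inc, w_bat, w_diverged)
```

On this data both the incremental run and the batch run with the reduced rate settle at a weight of about 2, while the batch run that reuses the incremental learning rate moves further from 2 with every epoch.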

For most problems, when using the Neural Network Toolbox™ software, batch training is significantly faster and produces smaller errors than incremental training.

During training, the progress is constantly updated in the training window. Of most interest are the performance, the magnitude of the gradient of performance, and the number of validation checks. The last two are used to terminate the training. The gradient becomes very small as the training approaches a minimum of the performance. If the magnitude of the gradient is less than 1e-5, the training stops. This limit can be adjusted by setting the parameter net.trainParam.min_grad. The number of validation checks is the number of successive iterations in which the validation performance fails to decrease. If this number reaches 6 (the default value), the training stops.

Parameter    Stopping Criteria
min_grad     Minimum Gradient Magnitude
max_fail     Maximum Number of Validation Increases
time         Maximum Training Time
goal         Minimum Performance Value
epochs       Maximum Number of Training Epochs (Iterations)
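How the five criteria in the table might be checked after each epoch is sketched below. This is not the toolbox's implementation: `stop_reason` is a hypothetical helper, and only the `min_grad` and `max_fail` values come from the text above; the `time`, `goal` and `epochs` values are placeholders I picked for the example.

```python
def stop_reason(epoch, elapsed, perf, grad_norm, val_fails, params):
    """Return the name of the first stopping criterion that is met, or None."""
    if perf <= params["goal"]:
        return "goal"
    if epoch >= params["epochs"]:
        return "epochs"
    if elapsed >= params["time"]:
        return "time"
    if grad_norm < params["min_grad"]:
        return "min_grad"
    if val_fails >= params["max_fail"]:
        return "max_fail"
    return None

params = {
    "min_grad": 1e-5,      # from the text above
    "max_fail": 6,         # from the text above
    "time": float("inf"),  # placeholder: no time limit
    "goal": 0.0,           # placeholder
    "epochs": 1000,        # placeholder
}

# Hypothetical state after epoch 10: the gradient has become very small.
print(stop_reason(epoch=10, elapsed=1.0, perf=0.02, grad_norm=3e-6,
                  val_fails=2, params=params))
```

In a training loop this check would run once per epoch, and the loop would exit as soon as it returns something other than None.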

References

MATLAB 2011b Help Documentation

Monday, March 26, 2012
