NN Series

[NN Series 5/n] Regularisation: reducing the complexity of a model without compromising accuracy

Regularisation is known to reduce overfitting when training a neural network. As with many of these techniques there is a rich background and plenty of options available, so asking why and how it works turns up a lot of information. Digging through it, for me at least, it wasn't clear why or how regularisation reduced overfitting until I reframed what it was doing. In short, regularisation changes the sensitivity of the model to the training data.
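
To make that concrete, here is a minimal sketch of L2 regularisation in a plain gradient-descent update; the penalty term shrinks the weights, which damps how strongly any one training point can pull the model. The names (`update_weights`, `lam`) and the toy data are illustrative, not from the post.

```python
import numpy as np

# Sketch: one gradient-descent step on squared error with an L2 penalty.
# A larger `lam` shrinks the weights harder, lowering the model's
# sensitivity to the training data (at some cost in fit).
def update_weights(w, X, y, eta=0.001, lam=1.0):
    errors = y - X.dot(w)        # residuals of a linear model
    grad = -X.T.dot(errors)      # gradient of the squared-error cost
    grad += lam * w              # L2 term pulls weights toward zero
    return w - eta * grad

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                                        # toy features
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)  # toy targets
w = np.zeros(3)
for _ in range(200):
    w = update_weights(w, X, y)
```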

Continue reading →

[NN Series 4/n] Feature Normalisation

This is an interesting one as I'd thought it was quite academic, with limited utility. Then I saw these graphs of error per epoch. The first shows the error per epoch of training a model on the data as is: it takes around 180-200 epochs to train with a learning rate (eta) of 0.0002 or lower. Now compare it to the second, where the training takes around 15 epochs with a learning rate of 0.…
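
For reference, a minimal sketch of the standardisation typically used here, assuming the zero-mean, unit-variance scaling usually paired with gradient descent (the data below is made up):

```python
import numpy as np

# Scale each feature column to zero mean and unit standard deviation,
# so one learning rate suits every weight.
def standardise(X):
    return (X - X.mean(axis=0)) / X.std(axis=0)

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])   # features on very different scales
X_std = standardise(X)
print(X_std.mean(axis=0))      # ~[0, 0]
print(X_std.std(axis=0))       # [1, 1]
```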

Continue reading →

[NN Series 3/n] Calculating the error before quantisation: Gradient Descent

Next I’m looking at the Adaline in Python code. This post is a mixture of what I’ve learnt in my degree, Sebastian Raschka’s book/code, and the 1960 paper that delivered the Adaline neuron. On the difference between the Perceptron and the Adaline: in the first post we looked at the Perceptron as a flow of inputs (x), multiplied by weights (w), then summed in the aggregation function and finally quantised in the threshold function.
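
As a rough sketch of that distinction (not the post's code): the Adaline computes its error on the continuous net input, before the threshold is applied, which is what makes gradient descent possible. Variable names here are illustrative.

```python
import numpy as np

def train_adaline(X, y, eta=0.01, epochs=50):
    """Batch gradient descent on the sum-of-squared-errors cost."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        net_input = X.dot(w) + b      # aggregation function: w.x + b
        errors = y - net_input        # error taken BEFORE quantisation
        w += eta * X.T.dot(errors)    # gradient-descent weight update
        b += eta * errors.sum()
    return w, b

def predict(X, w, b):
    # Threshold (quantiser) applied only when classifying.
    return np.where(X.dot(w) + b >= 0.0, 1, -1)
```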

Continue reading →

[NN Series 2/n] Circuits that can be trained to match patterns: The Adaline

This post discusses the development and significance of the Adaline artificial neuron, highlighting its introduction of a continuous, differentiable activation function and cost minimisation, which have important implications for modern machine learning.

Continue reading →

[NN Series 1/n] From Neurons to Neural Networks: The Perceptron

This post looks at the Perceptron, from Frank Rosenblatt’s original paper to a practical implementation classifying Iris flowers. The Perceptron is the original artificial neuron and provided a way to train a model to classify linearly separable data sets. The Perceptron itself had a short life, with the Adaline arriving three years later. However, its name lives on in neural networks as the Multilayer Perceptron (MLP), which shows the importance of the discovery.
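
A minimal sketch of the Perceptron rule described above, with toy data standing in for the Iris set (names are mine, not the post's):

```python
import numpy as np

def train_perceptron(X, y, eta=0.1, epochs=10):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            # Aggregation then threshold: quantised prediction in {-1, 1}.
            pred = 1 if xi.dot(w) + b >= 0.0 else -1
            update = eta * (target - pred)   # zero when already correct
            w += update * xi
            b += update
    return w, b

# Linearly separable toy data: class follows the sign of the first feature.
X = np.array([[2.0, 1.0], [1.5, -0.5], [-1.0, 0.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = train_perceptron(X, y)
```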

Continue reading →