Perceptron
The perceptron takes a weighted sum of its inputs and passes it through a sign function to generate an output.
Given an input vector $\mathbf{x} = (x_{1}, \dots, x_{d})$, the perceptron outputs
$$\hat{y} = \operatorname{sign}(\mathbf{w}^{\top}\mathbf{x} + b)$$
where $\mathbf{w} = (w_{1}, \dots, w_{d})$ is the weight vector and $b$ is the bias.
Learning Algorithm
- Initialise weights $\mathbf{w} \leftarrow \mathbf{0}$
- Loop (until convergence/max steps):
    - For each instance $(\mathbf{x}_{i}, y_{i})$, classify $\hat{y}_{i} = \operatorname{sign}(\mathbf{w}^{\top}\mathbf{x}_{i})$
    - Select a misclassified instance $(\mathbf{x}_{i}, y_{i})$, i.e. one where $\hat{y}_{i} \neq y_{i}$
    - Update weights $\mathbf{w} \leftarrow \mathbf{w} + \eta\, y_{i}\mathbf{x}_{i}$, where $\eta$ is the learning rate
If the data is not linearly separable, the algorithm will not converge.
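As a concrete illustration, here is a minimal sketch of the algorithm in Python with NumPy. It assumes labels in $\{-1, +1\}$ and folds the bias into the weight vector via a constant feature; both choices are mine, not fixed by the notes.

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, max_steps=1000):
    """Perceptron learning on labels y in {-1, +1}."""
    X = np.hstack([X, np.ones((X.shape[0], 1))])  # fold bias into the weights
    w = np.zeros(X.shape[1])                      # initialise weights to zero
    for _ in range(max_steps):
        preds = np.where(X @ w >= 0, 1, -1)       # classify every instance
        wrong = np.flatnonzero(preds != y)        # find misclassified instances
        if wrong.size == 0:                       # converged: all correct
            return w
        i = wrong[0]                              # select a misclassified one
        w += lr * y[i] * X[i]                     # update: w <- w + eta * y * x
    return w

# OR function with labels in {-1, +1}: only (0, 0) is negative
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([-1, 1, 1, 1])
w = train_perceptron(X, y)
print(np.where(np.hstack([X, np.ones((4, 1))]) @ w >= 0, 1, -1))  # [-1 1 1 1]
```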
Why does the learning algorithm work?
Consider when there are misclassifications:
Case 1: Positive predicted as negative
The current calculation obtained is $\mathbf{w}^{\top}\mathbf{x} < 0$.
What we require is $\mathbf{w}^{\top}\mathbf{x} \geq 0$.
Thus, to “fix” our model, we have to increase $\mathbf{w}^{\top}\mathbf{x}$. The update $\mathbf{w} \leftarrow \mathbf{w} + \eta\mathbf{x}$ (here $y = +1$) does exactly this, since $(\mathbf{w} + \eta\mathbf{x})^{\top}\mathbf{x} = \mathbf{w}^{\top}\mathbf{x} + \eta\lVert\mathbf{x}\rVert^{2} > \mathbf{w}^{\top}\mathbf{x}$.
Case 2: Negative predicted as positive
The current calculation obtained is $\mathbf{w}^{\top}\mathbf{x} \geq 0$.
What we require is $\mathbf{w}^{\top}\mathbf{x} < 0$.
Thus, to “fix” our model, we have to reduce $\mathbf{w}^{\top}\mathbf{x}$. The update $\mathbf{w} \leftarrow \mathbf{w} - \eta\mathbf{x}$ (here $y = -1$) reduces it by $\eta\lVert\mathbf{x}\rVert^{2}$.
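A quick numeric check of Case 1, with values chosen arbitrarily for illustration:

```python
import numpy as np

w = np.array([0.5, -1.0])   # current weights
x = np.array([1.0, 1.0])    # positive instance, but w.x = -0.5 < 0
eta = 0.5

print(w @ x)                # -0.5: misclassified as negative
w = w + eta * x             # Case 1 update (y = +1)
print(w @ x)                # 0.5: w.x increased by eta * ||x||^2 = 1.0
```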
Neuron
A generalised version of the perceptron - the building block of neural networks.
Sign function
$$\operatorname{sign}(z) = \begin{cases} +1 & z \geq 0 \\ -1 & z < 0 \end{cases}$$
This function is seen in the perceptron model.
Sigmoid function
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
This function squashes any real value into $(0, 1)$. It is used to convert a linear regression model into a logistic regression model, turning regression outputs into class probabilities.
tanh
$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$$
ReLU
$$\operatorname{ReLU}(z) = \max(0, z)$$
Leaky ReLU
$$\operatorname{LeakyReLU}(z) = \max(\alpha z, z) \quad \text{for a small } \alpha \text{ (e.g. } 0.01\text{)}$$
Maxout
$$\operatorname{maxout}(\mathbf{x}) = \max_{j}\left(\mathbf{w}_{j}^{\top}\mathbf{x} + b_{j}\right) \quad \text{(the max over several linear functions)}$$
ELU
$$\operatorname{ELU}(z) = \begin{cases} z & z \geq 0 \\ \alpha(e^{z} - 1) & z < 0 \end{cases}$$
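A minimal NumPy sketch of these activations; the $\alpha$ defaults are common choices, not values fixed by the notes:

```python
import numpy as np

def sign(z):        return np.where(z >= 0, 1.0, -1.0)
def sigmoid(z):     return 1.0 / (1.0 + np.exp(-z))
def tanh(z):        return np.tanh(z)
def relu(z):        return np.maximum(0.0, z)
def leaky_relu(z, alpha=0.01): return np.maximum(alpha * z, z)
def elu(z, alpha=1.0):         return np.where(z >= 0, z, alpha * (np.exp(z) - 1))

z = np.linspace(-2, 2, 5)
for f in (sign, sigmoid, tanh, relu, leaky_relu, elu):
    print(f.__name__, f(z).round(2))
```

Maxout is omitted here since it is parameterised by several weight vectors rather than being a fixed elementwise function.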
Neural network
Single-Layer
We can use a single layer of neurons to simulate simple boolean functions.
For example, given an OR function, we have the following truth table:
| x1 | x2 | OR |
| --- | --- | --- |
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |
We can derive the relevant weights by considering the model:
$$\hat{y} = \begin{cases} 1 & w_{1}x_{1} + w_{2}x_{2} + b \geq 0 \\ 0 & \text{otherwise} \end{cases}$$
Thus, we can get the following inequalities from the inputs:
- $(0, 0) \to 0$: $b < 0$
- $(0, 1) \to 1$: $w_{2} + b \geq 0$
- $(1, 0) \to 1$: $w_{1} + b \geq 0$
- $(1, 1) \to 1$: $w_{1} + w_{2} + b \geq 0$
We can then derive a set of weights that passes these criteria, e.g. $w_{1} = w_{2} = 1$ and $b = -0.5$.
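A quick check of those weights in Python; the values $w_{1} = w_{2} = 1$, $b = -0.5$ are just one solution to the inequalities:

```python
def or_neuron(x1, x2, w1=1.0, w2=1.0, b=-0.5):
    # Output 1 when the weighted sum crosses the threshold, else 0
    return int(w1 * x1 + w2 * x2 + b >= 0)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, or_neuron(x1, x2))  # matches the OR truth table
```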
Multi-Layer
However, some boolean functions are not linearly separable, like XNOR.
We can then model these functions by using multiple layers of neurons - for example:
$$\text{XNOR}(x_{1}, x_{2}) = \text{OR}(\text{NOR}(x_{1}, x_{2}), \text{AND}(x_{1}, x_{2}))$$
A hidden layer computes NOR and AND, and an output neuron ORs the two results together.
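A sketch of that two-layer construction, reusing threshold neurons; the weight choices are illustrative, found by the same inequality reasoning as for OR:

```python
def neuron(inputs, weights, b):
    # Threshold unit: 1 if the weighted sum crosses 0, else 0
    return int(sum(w * x for w, x in zip(weights, inputs)) + b >= 0)

def xnor(x1, x2):
    h_nor = neuron([x1, x2], [-1, -1], 0.5)      # NOR: fires only on (0, 0)
    h_and = neuron([x1, x2], [1, 1], -1.5)       # AND: fires only on (1, 1)
    return neuron([h_nor, h_and], [1, 1], -0.5)  # OR of the hidden units

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xnor(x1, x2))  # 1, 0, 0, 1
```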
Neural network vs Logistic/linear regression model
Logistic/linear regression relies on manual feature engineering to capture complex patterns, while a multi-layer neural network learns its own feature representations through its hidden layers and non-linear activations.
For example, the XNOR model can have hidden layers to simulate the NOR and AND layers, while feature engineering would be needed to capture this pattern in a logistic/linear regression model (e.g. a new feature $x_{1}x_{2}$).
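To make the contrast concrete, here is a hand-weighted linear threshold over engineered features that captures XNOR; the weights are illustrative, standing in for what logistic regression would learn:

```python
def xnor_linear(x1, x2):
    # The engineered feature x1*x2 makes XNOR linearly separable:
    # score >= 0 exactly when x1 == x2
    score = -1.0 * x1 - 1.0 * x2 + 2.0 * (x1 * x2) + 0.5
    return int(score >= 0)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xnor_linear(x1, x2))  # 1, 0, 0, 1
```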
Forward Propagation
Forward propagation
Process in a neural network where the input data is passed through the network’s layers to generate an output.
Forward propagation is used to make predictions.
Matrix multiplication can be used to get the outputs here. For example, for the XNOR model above (with no other layers):
$$\mathbf{h} = g\left(W^{(1)}\mathbf{x} + \mathbf{b}^{(1)}\right), \qquad \hat{y} = g\left(W^{(2)}\mathbf{h} + b^{(2)}\right)$$
where $g$ is the activation function, $W^{(1)}$ holds the NOR and AND weights row by row, and $W^{(2)}$ holds the OR weights.
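A vectorised sketch of that forward pass; the layer-matrix names are mine, and a hard threshold stands in for the activation:

```python
import numpy as np

def step(z):
    # Hard-threshold activation, standing in for sign/sigmoid
    return (z >= 0).astype(float)

W1 = np.array([[-1.0, -1.0],   # NOR weights
               [ 1.0,  1.0]])  # AND weights
b1 = np.array([0.5, -1.5])
W2 = np.array([[1.0, 1.0]])    # OR weights over the hidden units
b2 = np.array([-0.5])

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
H = step(X @ W1.T + b1)        # hidden layer: all instances at once
Y = step(H @ W2.T + b2)        # output layer
print(Y.ravel())               # [1. 0. 0. 1.]: the XNOR truth table
```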
Multi-class classification
Given a vector of outputs $\hat{\mathbf{y}} = (\hat{y}_{1}, \dots, \hat{y}_{K})$, where $K$ is the number of classes and each output neuron scores one class, we predict the class with the highest output: $\arg\max_{i} \hat{y}_{i}$.
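An illustrative sketch; the notes cut off here, so the one-output-neuron-per-class setup is an assumption, and softmax is shown as one common way to turn scores into probabilities:

```python
import numpy as np

scores = np.array([0.1, 2.3, 0.7])             # one output neuron per class
probs = np.exp(scores) / np.exp(scores).sum()  # softmax: scores -> probabilities
print(int(np.argmax(scores)))                  # 1: predict the highest-scoring class
```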