It’s quite limited and can’t solve anything that isn’t linearly separable, such as the XOR problem.
Example
Let’s solve a classic problem: determining whether a point lies below or above a line.
The inputs of our perceptron will be the x and y coordinates of the point.
The output of our perceptron should be a single number: let’s say −1 means the point is below the line and 1 means it’s above.
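To make this concrete, here is a minimal sketch of how such a training set could be generated. The line $y = 2x + 1$, the coordinate range, and all names below are illustrative assumptions, not from the original:

```python
import random

def line(x):
    # A hypothetical target line, y = 2x + 1; any line would work.
    return 2 * x + 1

def label(x, y):
    # 1 if the point lies above the line, -1 if it lies below.
    return 1 if y > line(x) else -1

# A small, randomly generated training set of ((x, y), label) pairs.
training_data = []
for _ in range(100):
    x, y = random.uniform(-10, 10), random.uniform(-10, 10)
    training_data.append(((x, y), label(x, y)))
```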
Prediction
The feed-forward (in other words, the “prediction”) process involves several steps:
Provide some input, x and y
Compute the weighted sum of the inputs and their respective weights, $w_0 x + w_1 y$
Add the bias, $w_0 x + w_1 y + b$
Plug the result into an activation function, $\mathrm{sgn}(w_0 x + w_1 y + b)$
$$\mathrm{sgn}(x) = \begin{cases} -1 & \text{if } x < 0, \\ 0 & \text{if } x = 0, \\ 1 & \text{if } x > 0. \end{cases}$$
The output will determine whether the point lies below or above the line.
The above steps can be summarized as $y = f\left(\sum_{i=0}^{n} w_i x_i + b\right)$.
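Here is a minimal sketch of this feed-forward step in Python; the structure (a two-element weight list and a separate bias) is an assumption for illustration:

```python
import random

def sgn(value):
    # The sign activation function defined above.
    if value < 0:
        return -1
    if value == 0:
        return 0
    return 1

def predict(weights, bias, x, y):
    # Weighted sum of the inputs plus the bias, passed through the activation.
    return sgn(weights[0] * x + weights[1] * y + bias)

# The weights and bias start out completely random.
weights = [random.uniform(-1, 1), random.uniform(-1, 1)]
bias = random.uniform(-1, 1)
```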
At first, our perceptron will perform very poorly. This is because the weights are initially completely random.
Learning
We can use a simplified version of Supervised Learning to optimize our model.
We can compute the error for our model: $\text{error} = y - \hat{y}$. We can then update our weights: $w_i \leftarrow w_i + \alpha \cdot \text{error} \cdot x_i$, where $\alpha$ is our Learning Rate and $x_i$ is the corresponding input. Finally, we update our bias: $b \leftarrow b + \alpha \cdot \text{error}$.
We repeat this process a bunch of times until our model converges.
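A sketch of this training loop, reusing the hypothetical predict function and training_data from the earlier snippets; the learning rate and epoch count below are arbitrary choices:

```python
ALPHA = 0.01   # learning rate
EPOCHS = 1000  # how many passes over the training set

for _ in range(EPOCHS):
    for (x, y), target in training_data:
        error = target - predict(weights, bias, x, y)
        # Nudge each weight in proportion to its input and the error.
        weights[0] += ALPHA * error * x
        weights[1] += ALPHA * error * y
        # The bias update is the same, with an implicit input of 1.
        bias += ALPHA * error
```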
An example implementation can be found in my GitHub repo.