λ
home posts study uses projects

Perceptron

Jul 18, 2024

machine-learning

The perceptron is a basic mathematical model for solving Binary Classification problems. It consists of 2 components:

  1. Weights & Biases
  • You can think of these as knobs that you tweak to get the desired result
  • They are just numbers
  • The difference between both arises in the learning process
  1. An Activation Function

It’s quite limited and can’t solve anything that isn’t linearly separable, for example the XOR problem.

Example

Let’s solve a classic problem, determining whether a point lies below or above a line 99_pct_train.png|600

  • The input of our perceptron will be the X and Y coordinates of the point.
  • The output of our perceptron should be a number, let’s say that 1-1 means that it’s below the line and 11, above the line.

Prediction

The feed forward (or in other words, the “prediction”) process involves several steps

  • Provide some sort of input, xx and yy
  • Compute the weighted sum of the respective weights, w0x+w1yw_0x + w_1y
  • Add the bias, w0x+w1y+bw_0x + w_1y + b
  • Plug the result into an activation function sgn(w0x+w1y+b)\operatorname{sgn}(w_0x + w_1y + b)
sgn(x)={1if x<0,0if x=0,1if x>0.\operatorname{sgn}(x) = \begin{cases} -1 & \text{if } x < 0, \\ 0 & \text{if } x = 0, \\ 1 & \text{if } x > 0. \end{cases}
  • The output will determine whether the point lies below or above the line
  • The above steps can be summarized as y=f(i=0nwixi+b)y = f(\sum_{i=0}^{n}w_ix_i + b)
  • At first, our perceptron will perform very poorly. This is because usually the weights are completely random.

Learning

We can use a simplified version of Supervised Learning to optimize our model.

We can compute the error for our model - error=yy^error = y - \hat{y}. We can then update our weights - wiwi+αerrorxiw_i \leftarrow w_i + \alpha \cdot error \cdot x_i where α\alpha is our Learning Rate and xx is our input. We can then also update our bias - bαberrorb \leftarrow \alpha \cdot b \cdot error.

We repeat this process a bunch of times, until our model converges.

An example implementation can be found on my GitHub repo

References

LλBS

All systems up and running

Commit 84c1d46, deployed on Jan 22, 2025.