Intuition behind Principal Component Analysis - step 1

Ali Mehmood
3 min read · Feb 2, 2021

So let me start with my history with Principal Component Analysis, a.k.a. P.C.A. While learning dimensionality reduction for machine learning, even after reading tutorials and articles I could not get my head around it. You have probably seen this expression before.

The one thing I want to put more weight on here is the intuition behind P.C.A.: the first-hand idea of the technique popping into your head. 🙌

But for that, you have to be keen on understanding every step. We will learn in a bite-sized fashion, so that it is easy to grasp. 🙂

Here’s how we will proceed:

  1. Dimensionality reduction and P.C.A.
  2. Review of statistics; important visualizations
  3. P.C.A. explained

Read it slowly and you will digest it easily. So let's begin…

Dimensionality reduction and P.C.A.

Here is a small problem: These are your friends, and you have to take a picture of them. From which camera position would take their picture (1, 2 or 3)?

It seems like taking a picture of your friends from Point-2 would be best, but suppose you decide to take pictures from both Point-1 and Point-2.
Assume your friends are the dots here on an X-Y plane and the two black lines are the positions you chose. The points on your camera sensor would then look like the following projections, wouldn't they?

Fig. A

Take the two projections/photos, compare them side by side, and tell me: which one is better?

Fig. B

It seems like the one on the right is better, doesn't it? Its points are more spread out and easier to tell apart. It is the best line along which you could capture all your friends in the photo. What P.C.A. does is find the ideal line onto which to project the data.

We have taken two projections here; you can take as many as you want and compare them.
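If you like to see ideas in code, here is a minimal NumPy sketch of exactly this comparison. The point coordinates and the two candidate lines are made up for illustration:

```python
import numpy as np

# Hypothetical "friends" as 2-D points — made-up coordinates for illustration
points = np.array([[1.0, 1.2], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2], [5.0, 4.8]])
points = points - points.mean(axis=0)  # center the data, as P.C.A. does

# Two candidate lines through the origin, each given as a unit direction vector
candidates = {
    "horizontal line": np.array([1.0, 0.0]),
    "diagonal line":   np.array([1.0, 1.0]) / np.sqrt(2.0),
}

for name, direction in candidates.items():
    # Projecting a point onto the line is a dot product with the unit vector
    projections = points @ direction
    # The more spread out (higher variance) the projections, the better the line
    print(name, "-> variance of projections:", round(projections.var(), 3))
```

Running this, the diagonal line wins by a large margin, which matches what our eyes told us from the photos.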

From here on, our data-set has moved from being 2-D to being 1-D (on a single line), and it looks like this.

Fig. C

We have singled out a line based on how spread out the data is on that line. This is called dimensionality reduction, i.e. we project our data onto a lower dimension. P.C.A. is one of the techniques for dimensionality reduction.
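Instead of trying candidate lines by hand, scikit-learn's PCA class searches for the best one for you. A minimal sketch, reusing the same made-up points as above and assuming scikit-learn is installed:

```python
import numpy as np
from sklearn.decomposition import PCA

# Same made-up 2-D points as before
points = np.array([[1.0, 1.2], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2], [5.0, 4.8]])

# Ask for a single principal component: the one best line to project onto
pca = PCA(n_components=1)
reduced = pca.fit_transform(points)  # shape (5, 2) -> shape (5, 1)

print("direction of the best line:", pca.components_[0])
print("our new 1-D data-set:", reduced.ravel())
print("share of the spread we kept:", pca.explained_variance_ratio_[0])
```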

Let's take another numerical example

So, let's say you have a housing-prices data-set, with its features in the columns. Now think: which columns are more relevant and which are not? How can we reduce the number of features without losing much information?

We can combine the columns of the data-set like this.

all of its features on the left, reduced features on the right

Finally, we have 2 dimensions. A dimension also goes by other names, such as feature or random variable.
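Here is what that combining step can look like in code. The feature names and values below are invented for illustration, not a real housing data-set:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Invented columns: [size_m2, rooms, age_years, distance_to_city_km]
X = np.array([
    [120.0, 4, 10, 2.5],
    [ 80.0, 3, 25, 5.0],
    [150.0, 5,  5, 1.0],
    [ 60.0, 2, 40, 8.0],
    [100.0, 3, 15, 3.5],
])

# The columns live on very different scales, so standardize them first
X_scaled = StandardScaler().fit_transform(X)

# Combine the 4 original columns into 2 new ones (the principal components)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)                              # (5, 2): 4 features -> 2
print("information kept:", pca.explained_variance_ratio_.sum())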

Guys, you are in for 3 rides for one ticket. So let’s head to the second ride.
