
PCA

Molly

Ton up Member
Hi all,

I've just been reading about PCA. We choose the components to be uncorrelated linear combinations of the variables that maximise the variance. I'm not sure of the significance of maximising the variance here - how does that ensure we haven't sacrificed too much information?

Thanks
 
This is actually a very interesting point. Maximising the variance here is exactly the same thing as minimising the projection error, and it boils down to Pythagoras's theorem.

Try to follow the below by drawing it out on a piece of paper:

Each data point 'x' is some distance 'c' from the centre of all your data, 'M'. If we project our point x onto a principal component, we will end up at a point 'p' on that principal component. This will be some distance 'a' from M, and it will be distance 'b' from the original point x. If you draw these out you'll see that M, x and p form a right-angled triangle - the angle at p is a right angle, because p is the orthogonal projection of x onto the component. Therefore c^2 = a^2 + b^2.

Distance 'c' is fixed by the data - it can't change if we change the principal component. But if we change the direction of our principal component then we change the position of p, so the values of a and b will change. The value of a^2 (totalled across the data points) is exactly the variance of the projected points, and b is the projection error, since it's the distance from x to p - i.e. how close the projected point is to the original data point. If we want to change our principal component's direction to increase a, then we have to decrease b at the same time, because c^2 = a^2 + b^2 and c cannot change. Hence maximising a is the same as minimising b, i.e. maximising the variance of the projected points is the same as minimising their projection error.
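
If it helps to see this numerically, here is a minimal sketch in Python (using numpy and made-up random data, so the numbers themselves are purely illustrative): for every direction you try, the totals of a^2 and b^2 change, but their sum always stays fixed at c^2, so pushing the projected variance up necessarily pushes the projection error down.

```python
# Minimal numerical sketch of the Pythagoras argument: for any unit
# direction w, each point's squared distance from the mean (c^2) splits
# into squared projection length (a^2) plus squared projection error (b^2),
# so the totals of a^2 and b^2 always sum to the same fixed quantity.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 1.0]])  # correlated data
Xc = X - X.mean(axis=0)  # centre the data at M

def split(w):
    """Return (total a^2, total b^2) for unit direction w."""
    w = w / np.linalg.norm(w)
    a = Xc @ w                       # signed projection lengths along w
    residual = Xc - np.outer(a, w)   # vector from projected point p back to x
    return np.sum(a**2), np.sum(residual**2)

c2 = np.sum(Xc**2)  # total squared distance from the centre: fixed by the data
for angle in np.linspace(0, np.pi, 5):
    a2, b2 = split(np.array([np.cos(angle), np.sin(angle)]))
    print(f"a^2 = {a2:8.1f}   b^2 = {b2:8.1f}   a^2 + b^2 = {a2 + b2:8.1f}   (c^2 = {c2:.1f})")

# The direction giving the largest a^2 (and hence the smallest b^2) is the
# first principal component, i.e. the top eigenvector of the covariance matrix.
```

Running this, a^2 + b^2 prints the same total on every line, while a^2 and b^2 trade off against each other as the direction rotates - exactly the maximise-variance / minimise-error equivalence described above.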

If you struggled to follow the above, it's probably because of the lack of pictures. The top answer on this Stack Exchange post is the best explanation of PCA I have ever read: https://stats.stackexchange.com/que...l-component-analysis-eigenvectors-eigenvalues and I would thoroughly recommend it. It covers the above and more, including an intuitive understanding of what PCA is trying to achieve in general.
 
Thank you so so much for this, that's very interesting and helpful!! Thank you! :)
 