Computerphile

2,620,000 subscribers

180,692 views

Data Analysis 6: Principal Component Analysis (PCA) - Computerphile

Video Overview & Insights

PCA - Principal Component Analysis - finally explained in an accessible way, thanks to Dr Mike Pound. This is part 6 of the Data Analysis Learning Playlist: https://www.youtube.com/playlist?list=PLzH6n4zXuckpfMu_4Ff8E7Z1behQks5ba

“

Check out the full Data Analysis Learning Playlist: https://www.youtube.com/playlist?list=PLzH6n4zXuckpfMu_4Ff8E7Z1behQks5ba

— @Computerphile

This Learning Playlist was designed by Dr Mercedes Torres-Torres & Dr Michael Pound of the University of Nottingham Computer Science Department. Find out more about Computer Science at Nottingham here: https://bit.ly/2IqwtNg

This series was made possible by sponsorship from by Google.

“

I just watched a statquest video on this, then I watched this video. I fully understand PCA and got all questions right on my assignment!

— @AryanFocus-i4q

The music dataset can be found here: https://github.com/mdeff/fma

https://www.facebook.com/computerphile

“

Very nice explanation!!!!! ❤❤❤❤ Greets from Germany

— @drachenschlachter6946

https://twitter.com/computer_phile

This video was filmed and edited by Sean Riley.

“

the only way you save money is by not spending it

— @Jsh-zw9hq

Computer Science at the University of Nottingham: https://bit.ly/nottscomputer

Computerphile is a sister project to Brady Haran's Numberphile. More at http://www.bradyharan.com

“

It’s not necessary to scale your data if you use scale=TRUE in the prcomp function.

— @redmotherfive
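
A minimal Python sketch of what standardizing a column means, assuming the scale=TRUE option mentioned in the comment amounts to centering each column and dividing by its sample standard deviation. The `standardize` helper and the toy columns are hypothetical and made up for illustration; they do not come from the video or the music dataset.

```python
import statistics

def standardize(column):
    """Center a column on zero and scale it to unit sample standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)  # sample standard deviation (n - 1 denominator)
    return [(x - mean) / sd for x in column]

# Hypothetical toy data: two features recorded in very different units.
duration_ms = [180000.0, 240000.0, 200000.0, 300000.0]
tempo_ratio = [0.9, 1.2, 1.0, 1.5]

z_duration = standardize(duration_ms)
z_tempo = standardize(tempo_ratio)

# Before scaling, the millisecond column has a vastly larger variance,
# so it would dominate any direction chosen to maximize variance.
print(statistics.variance(duration_ms), statistics.variance(tempo_ratio))

# After scaling, both columns have variance 1 (up to rounding).
print(statistics.variance(z_duration), statistics.variance(z_tempo))
```

Without this step, a feature's units alone can decide which direction PCA picks as the first component.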

More User Perspectives

it’s a dimension reduction technique (not a data reduction technique, that’s a misnomer)

@redmotherfive

Why are the PCs always orthogonal to each other?

@EW-mb1ih
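
One way to see the orthogonality, sketched in Python for the 2D case: the principal components are eigenvectors of the covariance matrix, that matrix is symmetric, and eigenvectors of a symmetric matrix with distinct eigenvalues are perpendicular. The helper names and the toy data below are hypothetical, and the closed-form eigenvector formula assumes the covariance between x and y is nonzero.

```python
import math

def covariance_2d(xs, ys):
    """Return (var_x, cov_xy, var_y) using the sample (n - 1) denominator."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return var_x, cov_xy, var_y

def principal_axes_2d(var_x, cov_xy, var_y):
    """Eigenvalues and unit eigenvectors of [[var_x, cov_xy], [cov_xy, var_y]].

    Assumes cov_xy != 0. Results are ordered from largest to smallest eigenvalue.
    """
    half_trace = (var_x + var_y) / 2
    radius = math.hypot((var_x - var_y) / 2, cov_xy)
    axes = []
    for eigenvalue in (half_trace + radius, half_trace - radius):
        vx, vy = cov_xy, eigenvalue - var_x  # solves (C - eigenvalue * I) v = 0
        norm = math.hypot(vx, vy)
        axes.append((eigenvalue, (vx / norm, vy / norm)))
    return axes

# Hypothetical toy data with a clear upward trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

(lam1, pc1), (lam2, pc2) = principal_axes_2d(*covariance_2d(xs, ys))
dot = pc1[0] * pc2[0] + pc1[1] * pc2[1]
print(pc1, pc2, dot)  # the dot product is zero up to rounding
```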

I understand why it’s necessary to scale our data (divide by the std). But why is it necessary to center our dataset?

@EW-mb1ih

Can someone explain to me how the PC1 and linear regression line differ? I understand they serve different purposes, but if both get calculated by minimizing the squared sum of the datapoints to a fitting line it technically always is the same line right?

@Carlos848
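
A small Python sketch of why the two lines generally differ: ordinary least-squares regression of y on x minimizes squared vertical distances to the line, while PC1 minimizes squared perpendicular distances. The toy data and variable names are hypothetical, and the PC1 slope formula assumes the covariance is nonzero.

```python
import math

# Hypothetical toy data: positively correlated, but noisy.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Regression of y on x: minimizes squared vertical distances.
ols_slope = cov_xy / var_x

# PC1: direction of the largest eigenvalue of the covariance matrix,
# which minimizes squared perpendicular distances to the line.
largest_eigenvalue = (var_x + var_y) / 2 + math.hypot((var_x - var_y) / 2, cov_xy)
pc1_slope = (largest_eigenvalue - var_x) / cov_xy  # assumes cov_xy != 0

print(ols_slope, pc1_slope)  # 0.8 1.0
```

Both lines pass through the mean of the data, but the slopes only coincide when the points lie exactly on a line.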

awesome Explanation thank you

@kilogrammhunger961

How do you project data from an n-dimensional space to an m-dimensional space, where n > m?

@rishidixit7939
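
A hedged sketch of one common way to do this with PCA, using NumPy: center the data, take the eigenvectors of the covariance matrix, keep the m with the largest eigenvalues, and multiply. The `pca_project` function and the random data are hypothetical illustrations, not the code used in the video.

```python
import numpy as np

def pca_project(data, m):
    """Project rows of `data` (samples x features) onto the top m principal components.

    Returns the projected data (samples x m) and the variance along each kept component.
    """
    centered = data - data.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)  # eigenvalues ascending
    order = np.argsort(eigenvalues)[::-1][:m]  # indices of the m largest
    components = eigenvectors[:, order]  # features x m, orthonormal columns
    return centered @ components, eigenvalues[order]

rng = np.random.default_rng(0)
# Hypothetical data: 200 samples, 5 features with very different spreads.
data = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])

projected, variances = pca_project(data, m=2)
print(projected.shape)  # (200, 2)
```

Each row of `projected` holds the coordinates of one sample along the kept components, so the 5 original features are reduced to 2 weighted sums.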

How do you get the column names of the 133 features that make up PC1, so they can be submitted as a data frame for k-means?

@breadandcheese1880

Thank you for this brilliant video. In less than half an hour I developed intuition that would have taken me a month to get from a book.

@HitAndMissLab

Why even have lectures? This fella explained why we "maximize the variance" so clearly in the first 5 minutes.. Lecturers should just make us watch this video in class... great stuff!

@kanewilliams1653

Cool 😎

@framm703

Thank you for explaining this! Very good quality of the video

@ディオゲネス-313

Excellent video

@djstr0b3

A very nice explanation! Thanks!

@kirar2004

Don't watch the video if you know nothing about PCA; come back after you know what it is from StatQuest or other channels.

@donfeto7636

Genius explanation

@tapanbasak1453

The R part starts at 9:45.

@Hamromerochannel

I gotta say I enjoyed this video so much and kinda started to understand what PCA is and what it is used for. Totally a new and different angle to look at this concept. Thank you again Dr. Mike.

@harpercfc_

Good stuff. Is the "weighted sum" the Frobenius norm or related? I'm following a book and I'm trying to compare how it teaches this with how it is explained in other forms of media like YouTube videos.

@Rockyzach88

Thank you for this video.

@melikaelwadany4524

But how do we make use of principal components afterwards, given that we can’t interpret the components since they no longer represent the original variables? Without interpretability, can PCs still be useful? What can PCs still tell us?

@user-wr4yl7tx3w

Dr. Mike, you are a genius.

@asgharbeigi9718

What a simple way to explain PCA! Thank you so much for the video.

@gzuzchuy505

i fall in love:D

@leksa8845

Thanks Dr. Mike, really helpful!

@reacher3232

I'm late to the party but this playlist is gold. Thanks guys :)

@demonblood8841

5:25 Why have you not constructed a center of the data? Project the points onto both the X and Y axes, calculate both averages, and then draw perpendiculars; the point where they intersect is the center of the dataset.

@m22d52

Great video. I also enjoyed the throwback stripey dot-matrix printer paper :)

@annprong5052

"A new principal component is gonna come out orthogonal to the ones before, until you run out of dimensions and you can't do it anymore."
- poetry

@nomen385

Ridiculously understandable explanation! Thank you very much!

@paull923

Best explanation I've come upon as of yet. Thanks!

@soupsOff

Upon first hearing the phrase "principal component analysis", I thought it sounded very analogous to finding principal stress axes in a body under load. As Dr. Pound gave a more detailed explanation later, I realized that is exactly what it is - just expanded to take place in n-dimensional space instead of 3D space. May be a helpful way to visualize for any mechanical engineers out there.

@tlniec

Congratulations again for a great video. Thank you!

@ec92009y

Thank you Dr. Pound, finally someone who can explain PCA in easy words. Really helpful for my thesis, and by a strange accident I ended up writing both of my theses about PCA. The first time, in my Bachelor's, I used it for data reduction; this time I use it to categorize data.

@Zilfalon

I agree with most of what is being taught in this video. Using a new basis to maximize variance or minimize the projection error is why PCA is used. What I can't agree with, however, is the lecturer saying that PCA is used to cluster data. I don't think this is necessarily true. PCA clusters those features which are highly correlated together. It doesn't cluster the data points when they are represented using the new basis vectors. I hope I am not wrong.

@alexandros27.

Very nice explanation. I almost never subscribe but you got me. Thank you.

@VG-bi9sw

Dude, you're better at explaining this than our uni professor :""D
please keep doing what you're doing.
Thank you.

@OmarMohammed-fy2he

"sponsorship from by Google" - was this piece of English generated by Google's AI?

@charlieangkor8649

That maximizes the variance = r²? Because it seems like PC1 was there to minimize the variance between the line and the points, no?

@ControlTheGuh

So the idea behind it is finding the right angle from which to look at all the data, where we can clearly see all the data and the distances between the points. Looks more like a support vector machine (SVM), where we increase dimensionality to fit the line in some other dimension.

@8eck

Thank you for the explanation.

@trafalgarlaw9919

And what is the benefit of doing PCA? Are we training our neural network quicker, or why would I do this? I still have to collect all the variables, so what is the point?

@Centhihi

It is very pleasant to listen to you. Thanks!

@simaykazc1508

5:34 Why, when we rotate the axes, does the data also split into 2 clusters?

@sdeitym

brilliant, thanks!

@proprius

#computers #computerphile #computer #science #University of Nottingham #Computer Science #Data Analysis #Data #Dr Mike Pound #Dr Mercedes Torres Torres #PCA