Steve Brunton

529,000 subscribers

488,008 views

Principal Component Analysis (PCA)

Video Overview & Insights

Principal component analysis (PCA) is a workhorse algorithm in statistics, where dominant correlation patterns are extracted from high-dimensional data.

This was the type of video I was looking for! Raw math steps.

— @ICOXAEDRO

Book PDF: http://databookuw.com/databook.pdf

Book Website: http://databookuw.com

The covariance matrix is usually defined as 1/(n-1) * B'B, just as the sample variance in statistics is scaled by 1/(n-1) rather than 1/n (Bessel's correction).

— @alvinlepik5265
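A minimal sketch of the 1/(n-1) scaling mentioned in the comment above, assuming the samples are stored as rows of the data matrix. The toy data and all variable names are hypothetical and are not taken from the video or the book.

```python
import statistics

# Hypothetical toy data: 4 samples (rows) by 2 features (columns).
X = [
    [2.0, 1.0],
    [4.0, 3.0],
    [6.0, 2.0],
    [8.0, 6.0],
]
n = len(X)
m = len(X[0])

# Column means, then the mean-centered data B = X - mean.
means = [sum(row[j] for row in X) / n for j in range(m)]
B = [[row[j] - means[j] for j in range(m)] for row in X]

# Sample covariance C = B^T B / (n - 1), an m-by-m matrix.
C = [
    [sum(B[i][a] * B[i][b] for i in range(n)) / (n - 1) for b in range(m)]
    for a in range(m)
]

# The diagonal of C should match the usual sample variance of each column.
col_vars = [statistics.variance([row[j] for row in X]) for j in range(m)]
```

With this scaling, the diagonal entries of C agree with `statistics.variance`, which also divides by n-1. Dividing by n instead rescales every eigenvalue by the same factor and leaves the eigenvectors unchanged.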

These lectures follow Chapter 1 from: "Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control" by Brunton and Kutz

Amazon: https://www.amazon.com/Data-Driven-Science-Engineering-Learning-Dynamical/dp/1108422098/

At 6:08: the matrix is the covariance of the columns of B, not the rows. Also, for it to be a covariance matrix you must divide by (n-1).

— @RobertMartin-ky9xp

Brunton Website: eigensteve.com

This video was produced at the University of Washington

You are the best! So helpful, unlimited appreciation.

— @vshahn2768

More User Perspectives

@

Best explanation by a head and pair of arms I've ever seen.

@tegancooper-hughes5174

@

The video must be flipped left-to-right after it is shot.

@mzz834

@

The best video ever made on PCA

@Selim-of8gq

@

Dude! He is writing backwards while explaining everything flawlessly.

@xuyanyue7459

@

Please do not translate the videos into Spanish! Many technical words are translated into Spanish incorrectly.

@latrodectus11

@

Thank you for the great explanation!

@Martin-iw1ll

@

Should the mean not be over each column instead of row?

@eshitab9667

@

Sorry but almost every book and tutorial regurgitates the same stuff about max variance. So let us say if I have just one variable (a male is the variable) and I make hundred measurements of his 100 body features then what is PCA going to give me? Or if I have hundred males (males is still the variable) then if I measure many features of each individual then what is PCA going to give me in terms of variance?

@rjn5123

@

At 5:13, are we supposed to shift each row by the mean? In the video, X_bar should be [x_bars]*[1;1;1;...], right?

@Jmgnlxt

@

As far as I understand it from several other resources (and my own thoughts), T would be the original data points projected onto the principal components, which in turn are the eigenvectors of the covariance matrix (i.e., V). That would also make sense considering that T has the same dimensions as X. Or am I completely confused now?

@Fr1392
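The relationship described in the comment above can be written out as a short derivation. This is a sketch under stated assumptions: B is the mean-centered data matrix with samples as rows, and B = U Σ Vᵀ is its singular value decomposition. The 1/(n-1) scaling is one common convention and is an assumption here.

```latex
% Assumptions: B is mean-centered, n samples as rows, SVD B = U \Sigma V^{T}.
\begin{align}
  C &= \frac{1}{n-1} B^{T} B
     = V \, \frac{\Sigma^{2}}{n-1} \, V^{T}, \\
  T &= B V = U \Sigma V^{T} V = U \Sigma .
\end{align}
% The columns of V are eigenvectors of C; T holds the coordinates of
% each sample (row of B) in that eigenvector basis.
```

Under these assumptions T has one row per sample, and its columns are uncorrelated because Tᵀ T = Σ² is diagonal.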

@

In my class we used BB^T for the covariance matrix. Why is this?

@wakaboomnick
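One likely source of the B^T B versus BB^T difference raised above is the layout of the data matrix. The sketch below assumes n samples and m features, and uses a hypothetical symbol, B-tilde, for the transposed layout; it does not appear in the video.

```latex
% Samples as rows: B is n-by-m (mean-centered).
% Samples as columns: \tilde{B} = B^{T} is m-by-n.
\begin{align}
  C &= \frac{1}{n-1} B^{T} B \in \mathbb{R}^{m \times m}
     && \text{(samples as rows)} \\
  C &= \frac{1}{n-1} \tilde{B} \tilde{B}^{T} \in \mathbb{R}^{m \times m}
     && \text{(samples as columns, } \tilde{B} = B^{T}\text{)}
\end{align}
```

Both expressions give the same m-by-m feature covariance matrix; only the storage convention differs.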

@

Yes the best explanation of PCA on the web, as long as you are not a complete novice to PCA. Don't get me wrong: Prof. Brunton is the teacher every student deserves; it must be said that in order to follow this beautiful course on SVD you must have some basic Linear Algebra 101 at undergraduate level, at least.

@marcoventura9451

@

perfect

@fatme_khanom

@

First of all, thank you for all the amazing lectures you've made available; they've helped me so much in my data science journey. I was reviewing the information you shared at 6:00, and then in the book, where you mention that the row-wise covariance matrix is given by B*B, whereas in your video "Singular Value Decomposition (SVD): Dominant Correlations" you mention that this is the column-wise correlation matrix.

Could you check if I'm missing something? I feel like the latter should be the correct one (which would give us a matrix nxn).

Thank you so much!

@JoãoP.Cardoso-k4h

@

So V comes from C?

@zackkier6257

@

Thank you Dr. Brunton! I just bought your book and am reviewing the PCA chapter. There is a difference in your definition of principal components between this video and your textbook. Can you please clarify?

In the textbook (2nd edition) in Section 1.5, after Eq 1.40, you state that "the columns of the eigenvector matrix V are the principal components". However, in this video, you define principal components as the mean-centered data matrix multiplied by your eigenvector matrix V, which in this video are defined as "loadings" that describe how much of each of the principal components each row in X has.

Which definition is more accurate? Or are they both accurate? Please clarify if possible. Thank you so much!!

@Chloe-ty9mn

@

Amazing explanation, went through a lot of videos but this one is the best

@piyushduggal5370

@

Amazing lecture! But in previous videos you also said that the rows represent experiments so that was a little strange

@MrWater2

@

PCA clearly explained!!!

@LifeKiT-i

@

Why eigen?

@harsharangapatil2423

@

This is so technically correct, and simultaneously so obtuse, that my intuition fuse has melted. Please consider redoing this as 3D pseudo visualizations of data subsets.

@mickwilson99

@

Dear Steve, why are there no Turkish subtitles on this video? I couldn't understand it.

@nurtenbakc2562

@

Fundamental !!!

@alexander8877

@

IS HE WRITING IN MIRROR IMAGE? HE'S BEHIND THE GLASS RIGHT? SO WHAT LOOKS LIKE PCA TO US, IS HIM ACTUALLY WRITING PCA FROM THE BACK??

@learnenglisheasy-lee

@

PCA is best used on a well researched and confirmed theory otherwise the numbers are not interpretable

@garrythorp8770

@

Very good explanation for each symptom and its treatment

@VinodSharma-lj6yy

@

Thank you

@Kevin.Kawchak

@

Covariance should be C = (1/n) * BB'

@sgtWoods-rv2ow

@

How to film such kind of tutorial videos?

@baseladams280

@

Best math content is always the serious and straightforward ones.. Fuck the jokers, you are the king dude

@hsenol1

@

In some implementations, mean centering is followed by division by the standard deviation (z-scores). Does this make a difference? I believe dividing by the standard deviation is important to keep the features on the same scale (unit variance).

@shashankgupta3549
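A minimal sketch of the difference between mean centering alone and z-scoring, as asked in the comment above. The toy data (one column in metres, one in millimetres) and all variable names are hypothetical.

```python
import statistics

# Hypothetical toy data: column 0 in metres, column 1 in millimetres.
X = [
    [1.0, 1200.0],
    [2.0, 1900.0],
    [3.0, 3100.0],
    [4.0, 3800.0],
]
n, m = len(X), len(X[0])
cols = [[row[j] for row in X] for j in range(m)]

mu = [statistics.mean(c) for c in cols]
sd = [statistics.stdev(c) for c in cols]  # sample std, 1/(n-1) scaling

# Mean centering does not change variances, so they still reflect the units.
centered_var = [statistics.variance(c) for c in cols]

# Z-scores: subtract the mean, then divide by the standard deviation.
Z = [[(row[j] - mu[j]) / sd[j] for j in range(m)] for row in X]
z_var = [statistics.variance([row[j] for row in Z]) for j in range(m)]
```

Without the division, the millimetre column has a far larger variance and would dominate the leading principal component purely because of its units. After z-scoring, every column has unit variance, so PCA effectively works on the correlation matrix rather than the covariance matrix.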

@

very good, thanks a lot 😅

@manfredbogner9799

@

Principal Component Analysis (PCA) is a technique in statistics that simplifies complex data by identifying and emphasizing the most important patterns or features. It does this by transforming the original variables into a new set of uncorrelated variables called principal components, allowing for a more efficient representation of the data.

@usmanmuhammad3439

@

How does prof write like that??

@notchicken

@

This guy is super good at writing backwards

@AndyShick1

@

Should #3 be the covariance matrix of the columns rather than the rows?
It seems to me that this leads to V rows = B columns.

@DataTranslator

@

2023-11-17

@erosss1x

@

I hate it when people write on a board, because it is barely readable and the writing wastes time. But thanks anyway.

@kngfant563

@

Note at 7:50, regarding CV = VD: D here is a diagonal matrix with the eigenvalues on its diagonal.

@matthijsg5983
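The relation CV = VD from the note above can be checked numerically on a small case. This is a sketch using the closed-form eigendecomposition of a symmetric 2x2 matrix; the matrix entries and the helper names `unit_eigvec` and `matmul` are hypothetical.

```python
import math

# Hypothetical 2x2 symmetric covariance matrix C = [[a, b], [b, d]].
a, b, d = 2.0, 1.0, 2.0
C = [[a, b], [b, d]]

# Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
half_trace = (a + d) / 2.0
radius = math.hypot((a - d) / 2.0, b)
eigvals = [half_trace + radius, half_trace - radius]


def unit_eigvec(lam):
    # For b != 0, the vector (b, lam - a) solves (C - lam * I) v = 0.
    norm = math.hypot(b, lam - a)
    return [b / norm, (lam - a) / norm]


vecs = [unit_eigvec(lam) for lam in eigvals]

# V holds the eigenvectors as columns; D is diagonal with the eigenvalues.
V = [[vecs[0][0], vecs[1][0]], [vecs[0][1], vecs[1][1]]]
D = [[eigvals[0], 0.0], [0.0, eigvals[1]]]


def matmul(P, Q):
    # Plain 2x2 matrix product.
    return [
        [sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


CV = matmul(C, V)
VD = matmul(V, D)
```

Each column of CV is the corresponding column of V scaled by its eigenvalue, which is exactly what multiplying V by the diagonal matrix D on the right does.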

#Principal component analysis #Singular value decomposition #machine learning #Linear regression #Linear algebra #data science #PCA #SVD