Principal Component Analysis (PCA)
Video Overview & Insights
Principal component analysis (PCA) is a workhorse algorithm in statistics, where dominant correlation patterns are extracted from high-dimensional data.
This was the type of video I was looking for! Raw math steps.
Book PDF: http://databookuw.com/databook.pdf
Book Website: http://databookuw.com
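The covariance matrix is usually defined as 1/(n-1) * B'B, where B is the mean-centered data and n is its number of rows (measurements). This mirrors the sample variance in probability, which is also scaled by 1/(n-1) rather than 1/n so that the estimate is unbiased (Bessel's correction).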
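A minimal NumPy sketch of that formula, assuming the layout used in the video (each row of X is one measurement, each column one feature). The toy data and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical toy data: n = 6 measurements (rows), m = 3 features (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))

n = X.shape[0]
x_bar = X.mean(axis=0)       # mean of each column (feature)
B = X - x_bar                # mean-centered data, same shape as X
C = (B.T @ B) / (n - 1)      # m x m sample covariance of the columns

# np.cov with rowvar=False treats columns as variables and
# divides by (n - 1) by default, so it should agree with C.
C_ref = np.cov(X, rowvar=False)
print(np.allclose(C, C_ref))
```

Dividing by n instead of (n - 1) rescales every eigenvalue by the same factor, so the principal directions themselves do not change.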
These lectures follow Chapter 1 from: "Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control" by Brunton and Kutz
Amazon: https://www.amazon.com/Data-Driven-Science-Engineering-Learning-Dynamical/dp/1108422098/
At 6:08: the matrix is the covariance of the columns of B, not the rows. Also, for it to be a covariance matrix you must divide by (n-1).
Brunton Website: eigensteve.com
This video was produced at the University of Washington
You are the best! So helpful, unlimited appreciation
More User Perspectives
Best explanation by a head and pair of arms I've ever seen
@tegancooper-hughes5174
The video must have been left-right inverted after it was shot.
@mzz834
The best video ever made on PCA
@Selim-of8gq
Dude! He is writing backwards while explaining everything flawlessly
@xuyanyue7459
Please do not translate the videos into Spanish! Many technical words translated into Spanish are wrong.
@latrodectus11
Thank you for the great explanation!
@Martin-iw1ll
Should the mean not be taken over each column instead of each row?
@eshitab9667
Sorry, but almost every book and tutorial regurgitates the same stuff about max variance. So let us say I have just one variable (a male is the variable) and I make a hundred measurements of his 100 body features: what is PCA going to give me? Or if I have a hundred males (males is still the variable) and I measure many features of each individual, what is PCA going to give me in terms of variance?
@rjn5123
At 5:13, are we supposed to shift each row by the mean? In the video, X_bar should be [x_bars]*[1;1;1;...], right?
@Jmgnlxt
As far as I understand it from several other resources (and my own thoughts), T would be the original data points projected onto the principal components, which in turn are the eigenvectors of the covariance matrix (i.e., V). That would also make sense when considering that the dimensionality of T is the same as that of X. Or am I completely confused now?
@Fr1392
In my class, we used BB^T for the covariance matrix. Why is this?
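Which product is the feature covariance depends on how the data are stored: with measurements in rows it is B^T B, and with measurements in columns it is B B^T. The sketch below, assuming the rows-as-measurements layout and made-up toy data, shows that the two products have different sizes but share their nonzero eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))   # hypothetical: 5 measurements (rows), 3 features (columns)
B = X - X.mean(axis=0)        # subtract the column means

col_product = B.T @ B         # 3 x 3: relates features (columns) to each other
row_product = B @ B.T         # 5 x 5: relates measurements (rows) to each other

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order,
# so reverse them to list the largest first.
ev_cols = np.linalg.eigvalsh(col_product)[::-1]
ev_rows = np.linalg.eigvalsh(row_product)[::-1]

# The nonzero eigenvalues agree; the larger matrix only adds (numerically) zero ones.
print(np.allclose(ev_cols, ev_rows[:3]))
print(np.allclose(ev_rows[3:], 0.0, atol=1e-10))
```

So a class that stores each measurement as a column and writes BB^T is describing the same quantity as B^T B here, up to the 1/(n-1) scaling.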
@wakaboomnick
Yes, the best explanation of PCA on the web, as long as you are not a complete novice to PCA. Don't get me wrong: Prof. Brunton is the teacher every student deserves, but it must be said that in order to follow this beautiful course on SVD you need at least some basic undergraduate-level linear algebra.
@marcoventura9451
perfect
@fatme_khanom
First of all, thank you for all the amazing lectures you've made available, they've helped me so much in my data science journey. I was reviewing the information you shared at 6:00, and then in the book, where you mention that the row-wise covariance matrix is given by B*B, whereas in your video "Singular Value Decomposition (SVD): Dominant Correlations" you mention this is the column-wise correlation matrix.
Could you check if I'm missing something? I feel like the latter should be the correct one (which would give us an n x n matrix).
Thank you so much!
@BurneJonesClaire-b1v
So V comes from C?
@zackkier6257
@MackintoshStanley-n4w
Thank you Dr. Brunton! I just bought your book and am reviewing the PCA chapter. There is a difference in your definition of principal components between this video and your textbook. Can you please clarify?
In the textbook (2nd edition) in Section 1.5, after Eq 1.40, you state that "the columns of the eigenvector matrix V are the principal components". However, in this video, you define principal components as the mean-centered data matrix multiplied by your eigenvector matrix V, which in this video are defined as "loadings" that describe how much of each of the principal components each row in X has.
Which definition is more accurate? Or are they both accurate? Please clarify if possible. Thank you so much!!
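Terminology differs between sources, but the two objects in this question are tied together by T = BV. A small sketch, assuming rows of the data are measurements and using made-up toy data, shows how V (directions in feature space) and T (the data expressed in those directions) come out of one SVD.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(8, 3))   # hypothetical: 8 measurements (rows), 3 features (columns)
B = X - X.mean(axis=0)        # mean-centered data

# Economy SVD: B = U @ diag(S) @ Vt
U, S, Vt = np.linalg.svd(B, full_matrices=False)
V = Vt.T                      # columns of V: orthonormal directions in feature space

T = B @ V                     # every row of B expressed in the V coordinates

print(np.allclose(T, U * S))                      # T = B V = U Sigma
print(np.allclose(B.T @ B @ V, V * S**2))         # columns of V are eigenvectors of B^T B
print(T.shape == B.shape)
```

Whichever of the two a given source calls "the principal components", V holds the directions and T holds the coordinates of the data along them.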
Amazing explanation, went through a lot of videos but this one is the best
@piyushduggal5370
Amazing lecture! But in previous videos you also said that the rows represent experiments, so that was a little strange.
@MrWater2
PCA clearly explained!!!
@LifeKiT-i
Why eigen?
@harsharangapatil2423
This is so technically correct, and simultaneously so obtuse, that my intuition fuse has melted. Please consider redoing this as 3D pseudo visualizations of data subsets.
@mickwilson99
Dear Steve, why are there no Turkish subtitles for this video? I couldn't follow it.
@nurtenbakc2562
Fundamental!!!
@alexander8877
IS HE WRITING IN MIRROR IMAGE? HE'S BEHIND THE GLASS, RIGHT? SO WHAT LOOKS LIKE PCA TO US IS HIM ACTUALLY WRITING PCA FROM THE BACK??
@learnenglisheasy-lee
PCA is best used on a well-researched and confirmed theory; otherwise the numbers are not interpretable.
@garrythorp8770
Very good explanation for each symptom and its treatment
@VinodSharma-lj6yy
Thank you
@Kevin.Kawchak
Covariance should be C = 1/n * BB'
@sgtWoods-rv2ow
How do you film this kind of tutorial video?
@baseladams280
Best math content is always the serious and straightforward kind.. Fuck the jokers, you are the king dude
@hsenol1
In some implementations, I find that mean centering is followed by division by the standard deviation (Z-scores). Does this make a difference? I believe dividing by the standard deviation is important to keep the features on the same scale (unit variance).
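It can make a difference when features are on very different scales. The sketch below uses made-up data, with the second feature deliberately inflated by a factor of 1000, to compare the leading direction with and without z-scoring. After z-scoring, the covariance matrix is the correlation matrix of the original data.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical features on very different scales (think metres vs. millimetres).
X = rng.normal(size=(200, 2)) * np.array([1.0, 1000.0])

n = X.shape[0]
B = X - X.mean(axis=0)               # mean-centering only
Z = B / X.std(axis=0, ddof=1)        # z-scores: unit sample variance per column

C_centered = (B.T @ B) / (n - 1)     # covariance matrix of X
C_zscored = (Z.T @ Z) / (n - 1)      # equals the correlation matrix of X

print(np.allclose(C_zscored, np.corrcoef(X, rowvar=False)))

# Leading direction in each case (eigh sorts eigenvalues in ascending order,
# so the last column belongs to the largest eigenvalue).
v_centered = np.linalg.eigh(C_centered)[1][:, -1]
v_zscored = np.linalg.eigh(C_zscored)[1][:, -1]
print(np.abs(v_centered))   # dominated by the large-scale feature
print(np.abs(v_zscored))    # both features weighted equally
```

Whether to standardize is a modelling choice: it is usually sensible when units differ between features, and less so when all features share a unit and their relative variances carry meaning.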
@shashankgupta3549
very good, thanks a lot 😅
@manfredbogner9799
Principal Component Analysis (PCA) is a technique in statistics that simplifies complex data by identifying and emphasizing the most important patterns or features. It does this by transforming the original variables into a new set of uncorrelated variables called principal components, allowing for a more efficient representation of the data.
@usmanmuhammad3439
How does the prof write like that??
@notchicken
This guy is super good at writing backwards
@AndyShick1
Should #3 be the covariance matrix of the columns rather than the rows?
It seems to me that this leads to V rows = B columns.
2023-11-17
@erosss1x
I hate it when people write on a board, because it is barely readable and the writing wastes time for nothing. But thanks anyway.
@kngfant563
Note at 7:50 regarding CV = VD: the D here is a matrix with all the eigenvalues on the diagonal.
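The relation CV = VD can be checked numerically. This is a minimal sketch on made-up data, assuming C is the 1/(n-1) * B'B covariance with measurements stored as rows.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(50, 4))          # hypothetical: 50 measurements, 4 features
B = X - X.mean(axis=0)
C = (B.T @ B) / (X.shape[0] - 1)      # 4 x 4 covariance matrix

# eigh is intended for symmetric matrices; eigenvalues come back in ascending order.
eigvals, V = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]     # reorder so the largest variance comes first
eigvals, V = eigvals[order], V[:, order]

D = np.diag(eigvals)                  # eigenvalues on the diagonal, zeros elsewhere

print(np.allclose(C @ V, V @ D))          # C V = V D
print(np.allclose(V.T @ V, np.eye(4)))    # the eigenvectors are orthonormal
```

Each diagonal entry of D is the variance of the data along the matching column of V.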
@matthijsg5983