I don't understand why people create these webpages that just re-explain stuff that can be read in a book, lecture notes (usually freely available online), or Wikipedia. It just adds more noise to the internet. Is it a kind of marketing thing to show their customers that they know what they are doing?
There's value in explaining things in a different/more understandable way. Wikipedia articles and book chapters on statistics can be hard to understand.
Yes, it's a form of marketing called inbound marketing. You create content that attracts people (blog posts) and then turn them into leads by getting them to put their email in for more info, etc.
PCA is a statistical model -- the simplest factor model there is. It deals with variances and covariances in datasets. It returns a transformed dataset that's linearly related to the original one, with uncorrelated variables ordered by decreasing variance: the first variable has the highest variance, and so on.
SVD is a matrix decomposition. It generalizes the idea of representing a linear transformation (with the same dimensions in domain and codomain) in the basis of its eigenvectors, which gives a diagonal matrix representation and a formula like A = V'DV.
SVD is like this, but for rectangular matrices. So you have two matrices to diagonalize: A = U'DV.
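As a rough illustration of the two factorizations described above, here is a minimal NumPy sketch. The matrix values are made up for the example, and note that NumPy's convention is A = U diag(s) Vt, which differs from the A = U'DV notation above only in which factors are written as transposes.

```python
import numpy as np

# A rectangular (4 x 3) matrix; the values are arbitrary illustration data.
A = np.arange(12, dtype=float).reshape(4, 3)

# SVD: two orthonormal bases, one diagonal matrix of singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_rebuilt = U @ np.diag(s) @ Vt

# Square symmetric case: a single orthonormal basis of eigenvectors suffices.
S = A.T @ A
w, V = np.linalg.eigh(S)
S_rebuilt = V @ np.diag(w) @ V.T

print(np.allclose(A, A_rebuilt), np.allclose(S, S_rebuilt))
```

The symmetric matrix S is used for the eigen-decomposition because an orthonormal eigenvector basis is only guaranteed in that case.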
That SVD even performs PCA, as noted in the algorithms, is a theorem, albeit a simple one usually given as an exercise. But hey, even OLS regression can be programmed with SVD if you want to.
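For anyone who wants to check both claims numerically, here is a small sketch, not a proof. The data, the mixing matrix and the regression coefficients are all invented for the example, and the regression has no intercept term.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: 200 samples, 3 correlated variables (made-up mixing matrix).
mix = np.array([[2.0, 0.5, 0.0],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mix
Xc = X - X.mean(axis=0)          # PCA works on centered data
n = len(Xc)

# Route 1: eigen-decomposition of the sample covariance matrix.
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]            # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: SVD of the centered data matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_vars = s**2 / (n - 1)                    # squared singular values -> variances

variances_match = np.allclose(eigvals, svd_vars)
# Principal directions agree up to the sign of each vector.
directions_match = np.allclose(np.abs(eigvecs.T @ Vt.T), np.eye(3), atol=1e-6)

# OLS via the same SVD: beta = V diag(1/s) U^T y (the pseudoinverse solution).
y = Xc @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)
beta_svd = Vt.T @ ((U.T @ y) / s)
beta_lstsq = np.linalg.lstsq(Xc, y, rcond=None)[0]

print(variances_match, directions_match, np.allclose(beta_svd, beta_lstsq))
```

The sign ambiguity is why the directions are compared through absolute values of dot products rather than directly.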
I actually touch on the relation to whitening toward the bottom of the article. You can whiten your dataset from the left singular matrix U which is directly related to PCs. Thanks for reading!
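A minimal sketch of that whitening step, assuming a centered data matrix with full column rank; the data here is synthetic and the scaling by sqrt(n - 1) is chosen so that the sample covariance comes out as the identity.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic correlated data (made-up mixing matrix), then centered.
X = rng.normal(size=(500, 3)) @ np.array([[3.0, 1.0, 0.0],
                                          [0.0, 1.0, 0.5],
                                          [0.0, 0.0, 0.4]])
Xc = X - X.mean(axis=0)
n = len(Xc)

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Whitened data straight from the left singular vectors.
X_white = np.sqrt(n - 1) * U

# Equivalent view: project onto the principal directions, then divide
# each principal component by its standard deviation.
scores = Xc @ Vt.T                      # principal component scores, equal to U * s
X_white_alt = scores / (s / np.sqrt(n - 1))

print(np.round(np.cov(X_white, rowvar=False), 6))
```

The printed covariance should be the identity matrix up to rounding, which is what "whitened" means here.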
"Because vectors are typically written horizontally, we transpose the vectors to write them vertically".
Is there a typo in this sentence, or is it just too early in the morning for me to read this?
No typo there. When we talk about vectors, we mean "column vectors". As it's easier to read horizontally (and takes less space in a paper), most of the time we write x^T = {a, b, c} rather than writing them in a column shape.
And here is another interesting connection between PCA and ridge regression: https://stats.stackexchange.com/questions/81395/relationship...