> some global technique to solve PCA (SVD), which cannot take advantage of the parallelism of GPU / TPU
My reading of https://developer.download.nvidia.com/video/gputechconf/gtc/... suggests that the limiting factor is just the usual modern SVD algorithm, in particular its QR factorization step, and that with some thought there are ways to do better.