AMS Feature Column: Singular Value Decomposition (ams.org)
57 points by nixme on Aug 1, 2009 | 10 comments



A regularized singular value decomposition is one of the most powerful machine learning algorithms available. Two decent open source implementations are SciPy (http://www.scipy.org/) and SVDLIBC (http://tedlab.mit.edu/~dr/svdlibc/).
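For anyone who wants to try it, here's a minimal sketch of a plain truncated SVD with SciPy (the "regularized" variant used on the Netflix Prize is usually fit by stochastic gradient descent instead; the matrix sizes and rank below are arbitrary):

    # Truncated SVD with scipy.linalg.svd: keep only the top-k triplets.
    import numpy as np
    from scipy.linalg import svd

    A = np.random.rand(100, 50)            # stand-in data matrix
    U, s, Vt = svd(A, full_matrices=False)

    k = 10                                 # arbitrary rank cutoff
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]   # best rank-k approximation of A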

Timely Development also posted the full C++ code they used for the Netflix Prize: http://www.timelydevelopment.com/demos/NetflixPrize.aspx


SVDLIBC's fastest sparse matrix algorithm uses Lanczos, which is still not good enough for large data sets.
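For what it's worth, SciPy exposes a Lanczos-style solver too -- scipy.sparse.linalg.svds (backed by ARPACK) computes only the top-k singular triplets of a sparse matrix. A rough sketch:

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import svds

    # Random sparse matrix standing in for real data.
    A = sp.random(10000, 5000, density=1e-3, format='csr')
    U, s, Vt = svds(A, k=20)        # top 20 singular triplets
    order = np.argsort(s)[::-1]     # svds returns ascending order
    U, s, Vt = U[:, order], s[order], Vt[order, :]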

Of particular interest to web-scale projects would be incremental SVD methods that allow the decomposition to be updated online. http://www.merl.com/publications/TR2006-059/
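Roughly, the idea in that report (Brand's incremental SVD) is to fold each new column into a small core matrix and re-diagonalize it. Here's a sketch of one exact column update, assuming you already hold a thin SVD U, s, Vt of the data so far; a practical online version would truncate back to rank k afterwards:

    import numpy as np

    def svd_append_column(U, s, Vt, c):
        """Update a thin SVD  A ~ U @ diag(s) @ Vt  when a new
        column c is appended to A (one Brand-style update step)."""
        m = U.T @ c                    # projection onto current subspace
        p = c - U @ m                  # residual orthogonal to subspace
        rho = np.linalg.norm(p)
        P = p / rho if rho > 1e-12 else np.zeros_like(p)

        k = s.size
        # Build the small (k+1)x(k+1) core matrix and re-diagonalize it.
        K = np.zeros((k + 1, k + 1))
        K[:k, :k] = np.diag(s)
        K[:k, -1] = m
        K[-1, -1] = rho
        Uk, sk, Vtk = np.linalg.svd(K)

        # Rotate the enlarged bases by the small SVD's factors.
        U_new = np.hstack([U, P[:, None]]) @ Uk
        Vt_ext = np.zeros((k + 1, Vt.shape[1] + 1))
        Vt_ext[:k, :-1] = Vt
        Vt_ext[-1, -1] = 1.0
        return U_new, sk, Vtk @ Vt_ext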

There has also been academic research on using the much more efficient (though nondeterministic) CUR decomposition for network analysis and collaborative filtering.
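Roughly, CUR approximates A as C @ U @ R where C and R are actual columns and rows of A, which preserves sparsity and interpretability. A toy sketch that samples by squared norms -- the published algorithms use leverage scores computed from the SVD, so treat this as illustrative only:

    import numpy as np

    def cur_decompose(A, c, r, rng=np.random.default_rng(0)):
        """Sample c columns and r rows of A with probability
        proportional to squared norms, then set U = C^+ A R^+
        so that A ~ C @ U @ R."""
        col_p = (A ** 2).sum(axis=0); col_p /= col_p.sum()
        row_p = (A ** 2).sum(axis=1); row_p /= row_p.sum()
        cols = rng.choice(A.shape[1], size=c, replace=False, p=col_p)
        rows = rng.choice(A.shape[0], size=r, replace=False, p=row_p)
        C, R = A[:, cols], A[rows, :]
        U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
        return C, U, R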


Related to the Netflix Prize, take a look at the Hodge decomposition: http://arxiv.org/abs/0811.1067.
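The core of that paper's ranking method is easy to sketch: the "gradient" component of the Hodge decomposition is a least-squares fit of per-item scores to pairwise comparisons. A toy version (the hodge_rank name and the data below are made up for illustration):

    import numpy as np

    def hodge_rank(n_items, comparisons):
        """Least-squares global ranking from pairwise comparisons.
        comparisons: list of (i, j, y) meaning item i beats j by y;
        higher returned score = stronger item."""
        G = np.zeros((len(comparisons), n_items))  # gradient (incidence) matrix
        y = np.zeros(len(comparisons))
        for e, (i, j, yij) in enumerate(comparisons):
            G[e, i], G[e, j] = 1.0, -1.0
            y[e] = yij
        # Solve min_s ||G s - y||^2; s is defined up to a constant shift.
        s, *_ = np.linalg.lstsq(G, y, rcond=None)
        return s - s.mean()

    scores = hodge_rank(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)])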


A video lecture on this paper: http://videolectures.net/aml08_lim_ghrl/ (warning: Microsoft streaming technology).


SVD also has an interesting application in image compression. Here's a fairly good example.

http://online.redwoods.cc.ca.us/instruct/darnold/LAPROJ/Fall...
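The recipe in examples like that one is just a rank-k reconstruction. A minimal sketch, with a random array standing in for a grayscale image:

    import numpy as np

    def compress_rank_k(img, k):
        """Approximate a grayscale image (2-D array) by its rank-k SVD.
        Storage drops from m*n values to k*(m + n + 1)."""
        U, s, Vt = np.linalg.svd(img, full_matrices=False)
        return (U[:, :k] * s[:k]) @ Vt[:k, :]

    img = np.random.rand(256, 256)     # stand-in for a grayscale image
    approx = compress_rank_k(img, 20)
    err = np.linalg.norm(img - approx) / np.linalg.norm(img)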


Although SVD can be applied to images, it's not a particularly good compression method. It's fairly expensive to compute, and the Discrete Cosine Transform (JPEG) or the Wavelet Transform (JPEG2000) provides similar quality at a fraction of the cost.

The fundamental problem is that SVD is a linear technique -- it finds the best low-rank (linear) approximation to a matrix. However, real-world images contain non-linear effects that are very expensive to approximate linearly.

In recent years, however, various newer techniques have been developed which use SVD (actually, the closely related Principal Components Analysis (PCA)) as a building block for achieving good compression.

One such approach is "Clustered Blockwise PCA for Representing Visual Data" -- which applies PCA to different regions of an image and then another PCA on all of the resulting blocks. The intuition is that although the whole image is not linear, many regions within it are indeed linear and thus compress well, individually.

http://www1.cs.columbia.edu/CAVE/publications/pdfs/Nishino_P...
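A very rough sketch of the first stage of that idea -- tile the image, vectorize the tiles, and project each tile onto its top principal components. The paper's clustering stage and second PCA are omitted, and this assumes the image dimensions divide evenly by the block size:

    import numpy as np

    def blockwise_pca(img, block=8, k=4):
        """Cut img into block x block tiles, treat each tile as a
        vector, and reconstruct every tile from its top-k PCs."""
        h, w = img.shape
        tiles = (img.reshape(h // block, block, w // block, block)
                    .transpose(0, 2, 1, 3)
                    .reshape(-1, block * block))
        mean = tiles.mean(axis=0)
        # PCA via SVD of the centered tile matrix.
        U, s, Vt = np.linalg.svd(tiles - mean, full_matrices=False)
        coeffs = (tiles - mean) @ Vt[:k].T    # k coefficients per tile
        recon = coeffs @ Vt[:k] + mean        # reconstruct each tile
        return (recon.reshape(h // block, w // block, block, block)
                     .transpose(0, 2, 1, 3)
                     .reshape(h, w))

    out = blockwise_pca(np.random.rand(64, 64))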


If you examine the images compressed using SVD, you can actually see the effects of the linear approximation in the form of light/dark bands. Great link. Thanks.


SVD is the swiss army knife of linear algebra.


It's more than a Swiss Army knife. It's a sledgehammer! It breaks almost any problem you may encounter...


Nice. This helped straighten out my idea that SVD is just eigenvalues for non-square matrices.



