Yeah, the bag of words and K-Rank reduction stuff basically is LSA.

I wrote an LSA implementation a few years ago for a BI product (written up at http://www.innoveerpunt.nl/interactief-innoveren/innoveerpun... sorry that it's in Dutch :) ).

I wonder how well it works; my takeaway was that you need to tweak the internal thresholds and matrix sizes a lot to get the best results, and the right settings depend heavily on the dataset you use (which every LSA paper you'll read also makes very clear).
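
For anyone who hasn't seen it spelled out, here is a minimal sketch of what I mean by bag of words plus rank-k reduction. It assumes NumPy, raw term counts with no weighting or frequency thresholds, and whitespace tokenization. The function names are made up for this example and this is not the code from the product above. The `k` argument is the "matrix size" knob that needs tuning per dataset.

```python
import numpy as np


def term_document_matrix(docs):
    """Build a raw-count bag-of-words matrix: rows are terms, columns are documents."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.lower().split():
            counts[index[word], j] += 1
    return counts, vocab


def lsa_document_vectors(counts, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    # counts = U @ diag(s) @ Vt, with singular values sorted in descending order
    _, s, vt = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))  # cannot keep more dimensions than the SVD produced
    # One row per document: coordinates along the top-k singular directions
    return vt[:k].T * s[:k]


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "a cat and a dog",
        "stock markets fell today",
        "markets and stock prices",
    ]
    counts, vocab = term_document_matrix(docs)
    vectors = lsa_document_vectors(counts, k=2)
    print(cosine(vectors[0], vectors[1]))
    print(cosine(vectors[0], vectors[2]))
```

Keeping all dimensions reproduces the plain bag-of-words similarities exactly. Lowering `k` is what merges related terms, and how far to lower it is the part that depends on your data.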