Hacker News new | past | comments | ask | show | jobs | submit login
Wavelet Trees - an Introduction (alexbowe.com)
47 points by youngerdryas on April 10, 2013 | hide | past | favorite | 7 comments



Sounds interesting, but theoretical. Any practical applications? I'm guessing it has some purpose on dealing with long string sequences, like genome sequencing?


From what I understand, it is useful for storing compressed data and querying it without decompression.

Range queries allow for full-text search, for example.


Something like that + data compression. If you have some data type mapped to a sequence of symbols, WT can speed up search in it. Please also note it is not a complete full text search index (like ones used in genome sequencing). It is a rank/select dictionary.


I believe he's doing full text search on this by running Burrows-Wheeler on the data first. See here for more details:

   http://alexbowe.com/fm-index/
I started reimplementing all of this in Go about a year ago, but moved onto other stuff before I finished.


Sure, that is what I meant. Some additional structures (BWT, ...) are required for a complete FTS. WT allows only 1-symbol searches that might be enough in practical applications (databases?).

I also have an implementation of dynamic LOUDS-based multiary WT (it is much faster on large alphabets), but it haven't been fully finished yet.



I would love to see more algorithms from the signals intelligence field. Any recommendations?




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: