
Yes, you can go down that road [0], and in some applications it's a good approach. However, it adds quite a bit of complexity, and the cost for creating and storing the index is substantial. For us, it doesn't pencil out to a win.

[0] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.362...




I wonder if bitap [0] would be a good fit for the 4K search algorithm. It would let you do linear-time regexp matching for relatively short patterns (up to 32 characters on a 32-bit machine, 64 characters on a 64-bit machine, etc.).

[0] http://en.wikipedia.org/wiki/Bitap_algorithm
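
For anyone who hasn't seen it, here is a rough sketch of the exact-match (shift-and) form of bitap in Python. The function name `bitap_search` is just made up for illustration, and this only handles literal patterns, not regexps or fuzzy matching. Python integers are arbitrary precision, so the word-size limit on pattern length doesn't show up here; in C you would keep the state in a single machine word, which is where that limit comes from.

```python
def bitap_search(text, pattern):
    """Return the index of the first exact match of pattern in text, or -1."""
    m = len(pattern)
    if m == 0:
        return 0

    # One bitmask per character: bit i is set if pattern[i] is that character.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    # Bit i of state is set when pattern[:i + 1] matches the text
    # ending at the current position.
    state = 0
    accept = 1 << (m - 1)
    for pos, ch in enumerate(text):
        # Extend every partial match by one character, start a new one at
        # bit 0, then keep only those consistent with the current character.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            return pos - m + 1
    return -1
```

Each text character costs one shift, one OR, and one AND, so the scan is linear in the text length as long as the pattern fits in a word.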



