Store and query numeric intervals in Lucene (Greplin open source)

jojopotato · on Dec 6, 2010

Please excuse my ignorance, but can someone tell me why this is being voted up so quickly?

Lucene already has range queries. Is this just enumerating all of the values within a range for the fields? It looks like the examples don't index these values either.

Edit: Not that this should take away from adding to Lucene, for that I thank you :)

rwalker · on Dec 6, 2010

As for what it does, it allows you to store an interval in Lucene, so you could store 1792 - 1796 or you could store 140000 - 125235466734. It uses an algorithm similar to the one used to store numbers, so it doesn't have to store millions of terms in the second example.
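To give a rough idea of how an interval can be stored without enumerating every value, here is a minimal sketch in Python. It is not the Greplin code and not Lucene's API; it only illustrates the general prefix-block idea behind Lucene's numeric terms. The names `interval_terms`, `point_terms`, `matches` and the `precision_step` parameter are made up for this illustration.

```python
def interval_terms(lo, hi, precision_step=4):
    """Cover the inclusive interval [lo, hi] with aligned blocks.

    Each block is a (shift, prefix) pair standing for every value v
    with v >> shift == prefix. Block sizes are powers of
    2 ** precision_step, so a huge interval needs only a few terms.
    """
    terms = []
    while lo <= hi:
        shift = 0
        # Grow the block while it stays aligned at lo and fits below hi.
        while True:
            bigger = shift + precision_step
            size = 1 << bigger
            if lo % size == 0 and lo + size - 1 <= hi:
                shift = bigger
            else:
                break
        terms.append((shift, lo >> shift))
        lo += 1 << shift
    return terms


def point_terms(x, precision_step=4, max_bits=64):
    """All (shift, prefix) pairs that a single value x falls under."""
    return {(s, x >> s) for s in range(0, max_bits + 1, precision_step)}


def matches(x, terms, precision_step=4):
    """True if x lies inside the interval that produced `terms`."""
    return not point_terms(x, precision_step).isdisjoint(terms)


small = interval_terms(1792, 1796)
large = interval_terms(140000, 125235466734)
print(len(small), len(large))
print(matches(1794, small), matches(1797, small))
```

The small interval is short enough that it is stored value by value, while the large one collapses into a few hundred block terms at most. A point query then only has to check the handful of prefixes the queried value belongs to.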

(we're pretty surprised by the response too!)

jojopotato · on Dec 6, 2010

Thanks, it's really cool, I just wanted to make sure I wasn't missing something!

physcab · on Dec 6, 2010

One application that this might be useful for is an IP lookup. IP addresses are typically stored in block ranges, and querying them can be tricky, especially if it's an offline operation done with Hadoop. Could also be counter-productive, but an interesting thought.
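For a sense of what the lookup involves, here is a small Python sketch that treats IP blocks as numeric intervals. It uses a plain sorted list with binary search rather than Lucene, and it assumes the blocks do not overlap. The block labels and the `lookup` helper are hypothetical, and the addresses come from the reserved documentation ranges.

```python
import bisect
import ipaddress


def ip_to_int(addr):
    """Convert a dotted IPv4 string to its integer value."""
    return int(ipaddress.IPv4Address(addr))


# Hypothetical, non-overlapping blocks: (first address, last address, label).
BLOCKS = sorted([
    (ip_to_int("192.0.2.0"), ip_to_int("192.0.2.255"), "block-a"),
    (ip_to_int("198.51.100.0"), ip_to_int("198.51.100.127"), "block-b"),
    (ip_to_int("203.0.113.16"), ip_to_int("203.0.113.31"), "block-c"),
])
STARTS = [start for start, _, _ in BLOCKS]


def lookup(addr):
    """Return the label of the block containing addr, or None."""
    x = ip_to_int(addr)
    # Find the last block whose start is <= x, then check its end.
    i = bisect.bisect_right(STARTS, x) - 1
    if i >= 0 and BLOCKS[i][0] <= x <= BLOCKS[i][1]:
        return BLOCKS[i][2]
    return None


print(lookup("192.0.2.77"))
print(lookup("198.51.100.200"))
```

Once the addresses are integers, each block is just an interval like the ones discussed above, so the same data could in principle be stored as interval fields in an index instead of an in-memory list.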