Out of curiosity why weren't products like Druid http://druid.io/ or influxdb ht...

paladin314159 · on Aug 25, 2015

To be totally honest, there are so many technologies out there that claim to solve analytics that it's tough to seriously consider all of them.

That said, we have looked at Druid, which is also a good example of using lambda architecture in practice (http://druid.io/docs/0.8.0/design/design.html -- note the historical vs realtime distinction). They use many of the same design principles as us, and one of our sub-systems is very similar to it. We still believe the pre-aggregation approach is critical for performance in our use case, though. Lastly, when we started building the architecture (mid-2014), Druid was very new, and I'm generally wary of designing everything around a new and potentially unstable piece of software.

fangjin · on Aug 25, 2015

Druid does pre-aggregation (roll-up) of data at ingestion time and is also used at scale (30+ trillion events, ingesting over 1M events/s) by numerous large technology companies: http://druid.io/druid-powered.html
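
For anyone unfamiliar with the term, here is a rough sketch of what roll-up at ingestion time means conceptually. This is not Druid's implementation or API; the `roll_up` function, the event fields (`ts`, `country`, `value`), and the hourly granularity are all made up for illustration. The idea is that raw events sharing a time bucket and the same dimension values collapse into a single stored row of aggregates.

```python
from collections import defaultdict


def roll_up(events, dimensions, granularity_seconds=3600):
    """Collapse raw events into one row per (time bucket, dimension values).

    Hypothetical helper for illustration only; not part of any real library.
    """
    totals = defaultdict(lambda: {"count": 0, "value_sum": 0})
    for event in events:
        # Truncate the timestamp (epoch seconds) to the start of its bucket.
        bucket = event["ts"] - event["ts"] % granularity_seconds
        key = (bucket,) + tuple(event[d] for d in dimensions)
        totals[key]["count"] += 1
        totals[key]["value_sum"] += event["value"]
    return dict(totals)


# Four raw events (made-up data) become three stored rows.
events = [
    {"ts": 3600, "country": "US", "value": 2},
    {"ts": 3700, "country": "US", "value": 3},
    {"ts": 3800, "country": "DE", "value": 5},
    {"ts": 7300, "country": "US", "value": 1},
]
rolled = roll_up(events, ["country"])
```

Queries then read the aggregated rows instead of the raw events, which is where the performance benefit comes from; the trade-off is that detail finer than the chosen granularity and dimensions is no longer available.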

dsp1234 · on Aug 25, 2015

Note that the commenter mentioned mid-2014. That page first appeared on (or about) July 29th, 2014[1], and at that time only contained 4 names:

Metamarkets

Netflix

LiquidM

N3twork

So while today Druid may be in use by "numerous large technology companies", at the time the commenter was researching, it wasn't showcasing as many large companies.

[1] https://web.archive.org/web/20140729014707/http://druid.io/d...

luckydata · on Aug 25, 2015

Hey Fan, I know you feel very strongly about Druid, but at the time it wasn't the way to go. I can see how they might have opted to steer clear.

angryasian · on Aug 25, 2015

Just to note, with Druid you are able to have pre-aggregated tables based on dimensions. Overall, good article and thanks for sharing.