In BI, a similar tradeoff comes up quite frequently: lots of fancy pre-aggregation versus optimizing search across the raw, un- (or lightly) processed base data.
Commonly, the choice of approach is dictated by the number of end users querying the same data. If a relatively specific set of queries is run frequently by many users, the aggregation approach can be made snappier and can save a lot of processing power.
But for power users, it becomes a nuisance real fast to be limited not only by the stored base data, but also by the somewhat arbitrarily chosen aggregate structures. I'm assuming people doing server log analysis fall within "power users" in most shops. At least I'd hope so :)
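To make that concrete, here's a minimal sketch of the two paths, using a made-up list of page-view events (the data model and names are purely illustrative, not any particular BI tool's API):

```python
# A toy illustration of pre-aggregation vs. scanning the raw base data.
from collections import Counter
from datetime import date

raw_events = [
    {"day": date(2024, 1, 1), "page": "/home", "user": "a"},
    {"day": date(2024, 1, 1), "page": "/docs", "user": "b"},
    {"day": date(2024, 1, 2), "page": "/home", "user": "a"},
]

# Pre-aggregation: build the rollup once, then the anticipated question
# ("views per day") is a cheap lookup for every user who asks it.
views_per_day = Counter(e["day"] for e in raw_events)
print(views_per_day[date(2024, 1, 1)])

# Power-user path: any question the rollup didn't anticipate
# (e.g. "which pages did user 'a' hit?") forces a scan of the base data.
pages_for_a = {e["page"] for e in raw_events if e["user"] == "a"}
print(pages_for_a)
```

The rollup only ever answers the questions someone decided to bake into it; anything else means going back to the raw events.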
As an aside, the Go language (which I'm currently flip-flopping between loving and not so much), versus Python et al., seems to be born of somewhat similar thinking.