In our experience, the primary driver of Snowflake costs is not the compute spent on aggregation, but the compute consumed by a high volume of reads and scans.
We recently built a Snowflake-to-ClickHouse pipeline for a customer in which aggregates are built hourly in Snowflake and then pushed into a ClickHouse table that powers their user-facing dashboards.
By offloading dashboard queries to ClickHouse, they slashed their Snowflake bill by ~25%, which was worth millions to them.
(Admittedly, running aggregations elsewhere—for example, in Spark—could further reduce costs, but you would then need Iceberg to make the tables queryable in Snowflake.)
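As a rough illustration, here is a minimal sketch of what that hourly push can look like, using the snowflake-connector-python and clickhouse-connect client libraries. The table and column names (raw_events, dashboard.hourly_rollup, and so on) are placeholders, not the customer's actual schema.

```python
import snowflake.connector
import clickhouse_connect

# Run the hourly aggregation in Snowflake (table/column names are hypothetical).
sf = snowflake.connector.connect(
    account="my_account", user="etl_user", password="...",
    warehouse="ETL_WH", database="ANALYTICS", schema="PUBLIC",
)
cur = sf.cursor()
cur.execute("""
    SELECT date_trunc('hour', event_ts) AS hour,
           customer_id,
           count(*)     AS events,
           sum(revenue) AS revenue
    FROM   raw_events
    WHERE  event_ts >= dateadd('hour', -1, current_timestamp())
    GROUP  BY 1, 2
""")
rows = cur.fetchall()

# Push the pre-aggregated rows into ClickHouse, which serves the dashboards.
ch = clickhouse_connect.get_client(host="clickhouse.internal", username="default")
ch.insert(
    "dashboard.hourly_rollup",
    rows,
    column_names=["hour", "customer_id", "events", "revenue"],
)
```

In practice this runs under a scheduler on the hourly cadence, but the shape of the job is exactly this: one aggregation query in Snowflake, one insert into ClickHouse, and the dashboards never touch Snowflake directly.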
I'm in an enterprise environment where a central IT platform team controls what size warehouses we can have in Snowflake, and they are not receptive to arguments for larger warehouses, unfortunately. Our issue is long-running queries, because Snowflake spills to disk during the joins. Honestly, I could join the data faster on my laptop than in the warehouse I'm allowed to use. Anyhow, I do have an old build server that is beefy and has 512 GB of RAM, so I can set up my aggregation and/or OLAP services there, since it's an unencumbered playground.
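For what it's worth, once ClickHouse (or a similar engine) is running on that box, the join the undersized warehouse chokes on can be done entirely in memory. A rough sketch with the clickhouse-connect Python client, assuming hypothetical orders/customers tables already loaded onto the server:

```python
import clickhouse_connect

# ClickHouse running on the old build server; the host name is a placeholder.
client = clickhouse_connect.get_client(host="build-server.internal")

# Give the join a large in-memory budget (~400 GB of the 512 GB) so it never
# has to spill the way the small Snowflake warehouse does.
result = client.query(
    """
    SELECT c.region, sum(o.amount) AS total
    FROM   orders AS o
    INNER JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP  BY c.region
    """,
    settings={"max_memory_usage": 400_000_000_000},
)
print(result.result_rows[:10])
```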