In our experience, the primary driver of Snowflake costs is not the compute spent on aggregation, but the compute consumed by a high volume of reads and scans.
We recently built a Snowflake-to-ClickHouse pipeline for a customer in which aggregates are built hourly in Snowflake and then pushed into a ClickHouse table that powers their user-facing dashboards.
By offloading dashboard queries to ClickHouse, they slashed their Snowflake bill by ~25%, which was worth millions to them.
(Admittedly, running aggregations elsewhere—for example, in Spark—could further reduce costs, but you would then need Iceberg to make the tables queryable in Snowflake.)
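As a rough illustration, here is a minimal sketch of what that hourly push can look like, using the snowflake-connector-python and clickhouse-connect client libraries. The table and column names (raw_events, dashboard.hourly_rollup, and so on) are placeholders, not the customer's actual schema.

```python
import snowflake.connector
import clickhouse_connect

# Run the hourly aggregation in Snowflake (table/column names are hypothetical).
sf = snowflake.connector.connect(
    account="my_account", user="etl_user", password="...",
    warehouse="ETL_WH", database="ANALYTICS", schema="PUBLIC",
)
cur = sf.cursor()
cur.execute("""
    SELECT date_trunc('hour', event_ts) AS hour,
           customer_id,
           count(*)     AS events,
           sum(revenue) AS revenue
    FROM   raw_events
    WHERE  event_ts >= dateadd('hour', -1, current_timestamp())
    GROUP  BY 1, 2
""")
rows = cur.fetchall()

# Push the pre-aggregated rows into ClickHouse, which serves the dashboards.
ch = clickhouse_connect.get_client(host="clickhouse.internal", username="default")
ch.insert(
    "dashboard.hourly_rollup",
    rows,
    column_names=["hour", "customer_id", "events", "revenue"],
)
```

In practice this runs under a scheduler on the hourly cadence, but the shape of the job is exactly this: one aggregation query in Snowflake, one insert into ClickHouse, and the dashboards never touch Snowflake directly.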
I'm in an enterprise environment where a central IT platform team controls what size warehouses we can have in Snowflake, and they are not receptive to arguments for larger warehouses, unfortunately. Our issue is long-running queries, because Snowflake spills to disk during the joins. Honestly, I could join the data faster on my laptop than in the warehouse I'm allowed to use. Anyhow, I do have an old build server that is beefy and has 512 GB of RAM, so I can set up my aggregation and/or OLAP services there, since it's an unencumbered playground.
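For what it's worth, once ClickHouse (or a similar engine) is running on that box, the join the undersized warehouse chokes on can be done entirely in memory. A rough sketch with the clickhouse-connect Python client, assuming hypothetical orders/customers tables already loaded onto the server:

```python
import clickhouse_connect

# ClickHouse running on the old build server; the host name is a placeholder.
client = clickhouse_connect.get_client(host="build-server.internal")

# Give the join a large in-memory budget (~400 GB of the 512 GB) so it never
# has to spill the way the small Snowflake warehouse does.
result = client.query(
    """
    SELECT c.region, sum(o.amount) AS total
    FROM   orders AS o
    INNER JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP  BY c.region
    """,
    settings={"max_memory_usage": 400_000_000_000},
)
print(result.result_rows[:10])
```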