I should have been more explicit: it won't be linear-ish. As you fill up any per...

ShaunK · on Oct 9, 2017

When we looked at BigQuery, it seemed that if you prepay you get essentially the same effect you're describing. You're given a certain number of "units" of compute, and if you exceed the concurrent units available to you, you end up with the same compute resource contention you would get with an improperly scaled Snowflake warehouse or Redshift cluster.

If you're willing to just pay per gigabyte scanned with BigQuery, you can presumably scale near-linearly (although I haven't actually tried it), but you could accomplish the same thing using Snowflake's API to add warehouses as concurrent query load increases. That's what we do (although we just pre-allocate the warehouses and keep them suspended, because you only pay while they're on).
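
To make the pre-allocate-and-suspend idea concrete, here is a rough sketch of the decision logic, not our actual code. The warehouse names, the `plan_warehouses` helper, and the threshold of 8 queries per warehouse are all made up for illustration; the only Snowflake-specific parts are the `ALTER WAREHOUSE ... RESUME IF SUSPENDED` and `ALTER WAREHOUSE ... SUSPEND` statements, which you would send through whatever client you already use.

```python
import math


def plan_warehouses(pool, running, concurrent_queries, queries_per_warehouse=8):
    """Return the SQL statements needed to match warehouses to the current load.

    pool: ordered list of pre-allocated warehouse names (hypothetical names).
    running: set of warehouse names that are currently resumed.
    concurrent_queries: number of queries currently running or queued.
    queries_per_warehouse: assumed load one warehouse handles comfortably.
    """
    # Warehouses the load calls for: at least one, at most the whole pool.
    wanted = min(
        len(pool),
        max(1, math.ceil(concurrent_queries / queries_per_warehouse)),
    )
    target = set(pool[:wanted])

    statements = []
    for name in pool:
        if name in target and name not in running:
            # Needed but suspended: turn it on.
            statements.append(f"ALTER WAREHOUSE {name} RESUME IF SUSPENDED")
        elif name not in target and name in running:
            # Running but no longer needed: stop paying for it.
            statements.append(f"ALTER WAREHOUSE {name} SUSPEND")
    return statements


if __name__ == "__main__":
    pool = ["WH_1", "WH_2", "WH_3"]
    for stmt in plan_warehouses(pool, running={"WH_1"}, concurrent_queries=20):
        print(stmt)
```

The function only plans the statements rather than executing them, so the scaling policy can be checked without touching a real account.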

Redshift does suffer from this problem because the compute is tied to the data, but Redshift Spectrum is attempting to rectify that as well. I don't know anything about its performance though.

georgewfraser · on Oct 9, 2017

We're definitely going to revisit this periodically and I would love to address this in the next iteration. Would you mind creating an issue at https://github.com/fivetran/benchmark describing the concurrency trade-off that we're not accurately capturing?