
This is on an AWS RDS instance (db.m4.large), and everything is in the DBMS cache.



You should revisit your schema. That sounds much too slow.

    q)a:(200000000?200000000),-42 / 200m random ints, with -42 appended at the end
    q)\t a?last a                 / ms to find -42, i.e. a full linear scan
    303
303 ms for a linear scan of 200M ints on an i5 laptop means you should be able to do better than 2 s with a "smart" query.
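
(Back-of-envelope: 200M 8-byte longs is ~1.6 GB, so 303 ms works out to roughly 5 GB/s, in the ballpark of what a single core can stream from RAM.)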


Well, the schema is a bit more complicated than that. I'd say the ~100M rows/s scan rate is sufficient. Here we search for trajectories (road segments + time) passing through certain regions during a repeating time window (say, 2 AM on Sundays), within a 3-month interval.
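
For what it's worth, a query of that general shape might look roughly like the sketch below in PostgreSQL. All names (segment_events, region_id, observed_at) are invented for illustration, not the actual schema:

    -- Hypothetical sketch of the query shape described above:
    -- trajectories through one region during a repeating weekly window
    -- (Sundays around 2 AM) over a 3-month range.
    SELECT road_segment_id, observed_at
    FROM segment_events
    WHERE region_id = 42                         -- spatial filter
      AND observed_at >= '2017-01-01'
      AND observed_at <  '2017-04-01'            -- 3-month range
      AND EXTRACT(DOW  FROM observed_at) = 0     -- Sunday
      AND EXTRACT(HOUR FROM observed_at) = 2;    -- ~2 AM

Note that the repeating-window predicates (the EXTRACTs) can't use a plain btree index on the timestamp, which is part of why queries like this are harder to make fast than a simple range scan.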


It seems slow to me too...

We have tables with >1 billion records in RDS / SQL Server and return results in <1 second...


Although it sounds like the OP might have some tricky queries, just want to echo that relational databases can be (very) fast if you know how to index them: one of our RDS MySQL tables has 3+ billion rows and queries on it average ~50 milliseconds.
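
As a hedged illustration of what "knowing how to index" means here (all names invented, not our actual schema): a composite index matching the query's filter columns lets MySQL answer from a narrow index range scan instead of touching billions of rows:

    -- Hypothetical example; events, user_id, created_at are invented.
    -- The composite index matches the equality + range filters, so the
    -- query reads only the matching index entries.
    CREATE INDEX idx_events_user_time ON events (user_id, created_at);

    SELECT COUNT(*)
    FROM events
    WHERE user_id = 12345
      AND created_at >= '2017-06-01';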


And that's for a sequential scan over all of those tuples?

That'd be very impressive. Does SQL Server support parallel execution? Or how else could it be so fast?


> Does SQL Server support parallel execution?

It has for a long time, as far as I know.


Cool. I'm not familiar with it; that's why I asked :).

Anyway, is the performance you mentioned for a scan over the entire table? E.g. to do an aggregate?


Honestly don't remember, I'm working on new stuff in PostgreSQL now.

I know a count of the whole table takes about a minute, though. But the filter-then-group-by was ~1 s, since the grouping ran over fewer than 50 rows of the filtered result.
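
A sketch of the two query shapes being compared (table and column names invented):

    -- Full-table count: touches every row (or a whole index),
    -- hence roughly a minute on a table this size.
    SELECT COUNT(*) FROM big_table;

    -- Filter first, then group: an index on the filter column shrinks
    -- the work to a small slice, and the GROUP BY then runs over
    -- fewer than 50 rows, hence ~1 s.
    SELECT category, COUNT(*)
    FROM big_table
    WHERE created_at >= '2017-06-01'
    GROUP BY category;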

Without knowing the query the original poster was using, it's speculation why theirs was slow and mine was fast. My argument is just that SQL Server isn't magically slow: MySQL, PostgreSQL, and SQL Server are all super fast, and if you don't massage the database, any of them can be super slow too.


Thanks.

> I know a count of the whole table takes about a minute, though. But the filter-then-group-by was ~1 s, since the grouping ran over fewer than 50 rows of the filtered result.

Yeah, that makes sense.

I'm running some queries in production that take a 150-million-row table, filter it down to ~100-300k rows, and then aggregate/group them. This usually takes <1 s. However, if I tried to do a count(*) on that table, it'd take around a minute as well.

> Without knowing the query the original poster was using, it's speculation why theirs was slow and mine was fast. My argument is just that SQL Server isn't magically slow: MySQL, PostgreSQL, and SQL Server are all super fast, and if you don't massage the database, any of them can be super slow too.

Yeah. The query / query plan would be needed to go more in-depth in these kinds of discussions, as would the ratio of disk vs. memory hits during execution.

Anyway, I just wanted to understand if SQL Server was e.g. an order of magnitude faster than Postgres when scanning a large number of tuples. But I guess the answer to that is: probably not.


PostgreSQL has BRIN indexes as of 9.5, which may make some forms of aggregation faster than SQL Server, I believe. Would need real-world tests to verify that, though.
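
For context: a BRIN index stores only per-block-range summaries (e.g. min/max of the column), so it stays tiny and works best when the column correlates with physical row order, such as an append-only timestamp. A minimal sketch, with invented names:

    -- Hypothetical sketch: BRIN on an append-only timestamp column.
    -- The index keeps min/max per block range, letting range scans
    -- skip whole blocks of the table cheaply.
    CREATE INDEX idx_events_created_brin ON events USING brin (created_at);

    SELECT date_trunc('day', created_at) AS day, COUNT(*)
    FROM events
    WHERE created_at >= '2017-01-01' AND created_at < '2017-04-01'
    GROUP BY 1;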


Just to clarify: you're saying the whole DB fits into 8 GB of RAM?



