
This is on an AWS RDS instance (db.m4.large), and everything is in the DBMS cache.



You should revisit your schema. That sounds much too slow.

    q)a:(200000000?200000000),-42 / 200m random ints, with -42 appended at the end
    q)\t a?last a                 / ms to find -42, i.e. a full linear scan
    303
303 ms for a linear scan of 200M ints on an i5 laptop means you should be able to do better than 2 s with a "smart" query.
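
(Back-of-envelope: 200M 8-byte longs is ~1.6 GB, so 303 ms works out to roughly 5 GB/s, in the ballpark of what a single core can stream from RAM.)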


Well, the schema is a bit more complicated than that. I'd say the ~100M rows/s scan rate is sufficient. Here we search for trajectories (road segments + time) passing through certain regions during a repeating time window (say, 2 AM on Sundays), within a 3-month interval.
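
For what it's worth, a query of that general shape might look roughly like the sketch below in PostgreSQL. All names (segment_events, region_id, observed_at) are invented for illustration, not the actual schema:

    -- Hypothetical sketch of the query shape described above:
    -- trajectories through one region during a repeating weekly window
    -- (Sundays around 2 AM) over a 3-month range.
    SELECT road_segment_id, observed_at
    FROM segment_events
    WHERE region_id = 42                         -- spatial filter
      AND observed_at >= '2017-01-01'
      AND observed_at <  '2017-04-01'            -- 3-month range
      AND EXTRACT(DOW  FROM observed_at) = 0     -- Sunday
      AND EXTRACT(HOUR FROM observed_at) = 2;    -- ~2 AM

Note that the repeating-window predicates (the EXTRACTs) can't use a plain btree index on the timestamp, which is part of why queries like this are harder to make fast than a simple range scan.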


It seems slow to me too...

We have tables with >1 billion records in RDS / SQL Server and return results in <1 second...


Although it sounds like the OP might have some tricky queries, just want to echo that relational databases can be (very) fast if you know how to index them: one of our RDS MySQL tables has 3+ billion rows and queries on it average ~50 milliseconds.
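
As a hedged illustration of what "knowing how to index" means here (all names invented, not our actual schema): a composite index matching the query's filter columns lets MySQL answer from a narrow index range scan instead of touching billions of rows:

    -- Hypothetical example; events, user_id, created_at are invented.
    -- The composite index matches the equality + range filters, so the
    -- query reads only the matching index entries.
    CREATE INDEX idx_events_user_time ON events (user_id, created_at);

    SELECT COUNT(*)
    FROM events
    WHERE user_id = 12345
      AND created_at >= '2017-06-01';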


And that's for a sequential scan over all of those tuples?

That'd be very impressive. Does SQL Server support parallel execution? Or how else could it be so fast?


> Does SQL Server support parallel execution?

It has for a long time, as far as I know.


Cool. I'm not familiar with it; that's why I asked :).

Anyway, is the performance you mentioned for a scan over the entire table? E.g. to do an aggregate?


Honestly don't remember, I'm working on new stuff in PostgreSQL now.

I know a count of the whole table takes about a minute, though. But the filter-then-group-by was ~1 s, since the grouping ran over fewer than 50 rows of the filtered result.
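
A sketch of the two query shapes being compared (table and column names invented):

    -- Full-table count: touches every row (or a whole index),
    -- hence roughly a minute on a table this size.
    SELECT COUNT(*) FROM big_table;

    -- Filter first, then group: an index on the filter column shrinks
    -- the work to a small slice, and the GROUP BY then runs over
    -- fewer than 50 rows, hence ~1 s.
    SELECT category, COUNT(*)
    FROM big_table
    WHERE created_at >= '2017-06-01'
    GROUP BY category;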

Without knowing the query the original poster was using, it's speculation why theirs was slow and mine was fast. My argument is just that SQL Server isn't magically slow: MySQL, PostgreSQL, and SQL Server are all super fast, and if you don't massage the database, any of them can be super slow too.


Thanks.

> I know a count of the whole table takes about a minute, though. But the filter-then-group-by was ~1 s, since the grouping ran over fewer than 50 rows of the filtered result.

Yeah, that makes sense.

I'm running some queries in production that take a 150-million-row table, filter it down to ~100-300k rows, and then aggregate/group them. This usually takes <1 s. However, if I tried to do a count(*) on that table, it'd take around a minute as well.

> Without knowing the query the original poster was using, it's speculation why theirs was slow and mine was fast. My argument is just that SQL Server isn't magically slow: MySQL, PostgreSQL, and SQL Server are all super fast, and if you don't massage the database, any of them can be super slow too.

Yeah. The query / query plan would be needed to go more in-depth in these kinds of discussions, as would the ratio of disk vs. memory hits during execution.

Anyway, I just wanted to understand if SQL Server was e.g. an order of magnitude faster than Postgres when scanning a large number of tuples. But I guess the answer to that is: probably not.


PostgreSQL has BRIN indexes as of 9.5, which may make some forms of aggregation faster than SQL Server, I believe. Would need real-world tests to verify that, though.
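
For context: a BRIN index stores only per-block-range summaries (e.g. min/max of the column), so it stays tiny and works best when the column correlates with physical row order, such as an append-only timestamp. A minimal sketch, with invented names:

    -- Hypothetical sketch: BRIN on an append-only timestamp column.
    -- The index keeps min/max per block range, letting range scans
    -- skip whole blocks of the table cheaply.
    CREATE INDEX idx_events_created_brin ON events USING brin (created_at);

    SELECT date_trunc('day', created_at) AS day, COUNT(*)
    FROM events
    WHERE created_at >= '2017-01-01' AND created_at < '2017-04-01'
    GROUP BY 1;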


Just to clarify: you're saying the whole DB fits into 8 GB of RAM?



