
> The reason PostgreSQL results were omitted is no matter what kind of optimization we throw at it, the benchmark always took more than 2 hours, regardless of partitioning, whereas MongoDB and Elasticsearch took a couple of minutes.

This seems like something was done incorrectly; the comparison shouldn't be that drastic.

> Just one PostgreSQL 9.6.10 instance (shared_buffers = 128MB)

This looks way too low. The PostgreSQL docs say a good starting point for shared_buffers is 25% of the server's memory. In this case that would be 32GB.

https://www.postgresql.org/docs/9.1/runtime-config-resource....
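For what it's worth, that's a one-line change in postgresql.conf. Something like the following, assuming the 128 GB of RAM implied above (the exact value is illustrative, and shared_buffers only takes effect after a server restart):

    # postgresql.conf -- raise shared_buffers from the 128MB default
    # to roughly 25% of system RAM (here: a 128 GB server)
    shared_buffers = 32GB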




They set out on a 4-year journey to improve their ETL, but didn't take 1 second to change a conservative global config default. I can barely take the rest of the article seriously after a blunder like that.


See my response above; it was indeed a typo on my side. I am sorry to hear that it spoiled the rest of the post for you.


It was indeed an incorrect snippet -- removed it. We do have a group of PostgreSQL experts within the company, and we let them tune the database for the benchmark. Let me remind you, this was not a tune-once, run-once operation. We spent close to a month making sure that we were not missing anything obvious for each storage engine. But as I explained, nothing much worked in the case of PostgreSQL.


Either way, something still seems off. If MongoDB and Elasticsearch were 3-5x more performant, I would still find that surprising. A > 10-20x difference really seems like a configuration or implementation issue.



