Hacker News new | past | comments | ask | show | jobs | submit login

PostgreSQL is 100% pure MVCC, a little bit like the "SNAPSHOT" method described in the article: updates and deletes always create new copy of a row, and every row has a visibility associated with it: the oldest transaction and the newest transaction that can see that row (if the row hasn't been deleted by any currently active transactions, the 'newest transaction' ID is empty).

There are definitely some drawbacks to MVCC. For example, many people are surprised how long "select count(*) from X" takes to run, because it has to do a full table-scan to check which rows the current transaction can actually see. You need to periodically 'vaccuum' tables, which updates a cached list of which table rows are deleted, so they can be reused by future update (modern Postgres will auto-vacuum, but it's another set of knobs for the DBA to tweak). Plus, the storage overhead mentioned by the article (although that really depends on the amount of churn in your particular application, and regular vacuuming keeps it to a minimum).




In fact, the latest stable Postgres supports fast, index-only scans for count queries: https://wiki.postgresql.org/wiki/Index-only_scans.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: