
You always need a transaction, even just for reads. Otherwise you are not reading from a single consistent snapshot of the world.
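
For example, here is a minimal Postgres sketch (the orders table and its columns are made up) of a read-only transaction that pins both queries to a single snapshot:

    BEGIN ISOLATION LEVEL REPEATABLE READ, READ ONLY;
    SELECT count(*)   FROM orders;  -- both queries see the same snapshot,
    SELECT sum(total) FROM orders;  -- even if other sessions commit in between
    COMMIT;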



So you're saying that a SELECT statement that happens strictly after an INSERT statement (on the same connection) can possibly not see the data that was written by the INSERT statement?

I guess that would be possible, but I would be very surprised if that were so... I would also expect writes to rows to be atomic (i.e. you wouldn't see a row with half-written data, or a row that has an `id` but not (yet) a `name`) - again, that kind of behaviour would be possible, but very surprising to the point of the DB being unusable.


This is why almost all ORMs have a "unit-of-work" abstraction that wraps its operations in a single transaction.

Race conditions are always surprising. Fortunately, we have a simple remedy for that: transactions. :)

Suppose you have another connection and it runs DELETE FROM t WHERE id = (SELECT MAX(id) FROM t). If that winds up in the middle, it could screw things up for you. Is it likely to happen? No, but again... the remedy is simple. Also, the scenario you describe seems simple enough, but what happens when you bulk insert a few thousand rows?
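
To sketch that race with the table t (column names are made up): without a transaction, each statement commits on its own and the concurrent DELETE can land in the gap; wrapped in a transaction, the insert stays invisible to other connections until COMMIT and the SELECT runs against one consistent snapshot plus this transaction's own write.

    -- no transaction: two independent implicit transactions, with a gap between them
    INSERT INTO t (name) VALUES ('newest');
    SELECT * FROM t WHERE id = (SELECT max(id) FROM t);  -- may no longer find the new row

    -- wrapped: one unit of work, one snapshot
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    INSERT INTO t (name) VALUES ('newest');
    SELECT * FROM t WHERE id = (SELECT max(id) FROM t);
    COMMIT;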

By the way, with Postgres, there really is no way to not have a transaction. If you don't issue BEGIN, it treats the statement as a small transaction. So you're getting the same overhead on a single statement, if your plan was to somehow avoid the overhead.
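
In other words, these two forms do essentially the same transactional work (same made-up table as above):

    -- implicit: Postgres wraps the bare statement in its own small transaction
    INSERT INTO t (name) VALUES ('x');

    -- explicit: the same thing, spelled out
    BEGIN;
    INSERT INTO t (name) VALUES ('x');
    COMMIT;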

If this makes you question the database's usability, I have some bad news for you: all ACID compliant databases work this way. The "atomic" unit is the transaction, not the connection. The consistency guarantees pertain to the transaction, not the connection. This is not a weird thing about Postgres, this is how they all work (except, of course, that MySQL can't always roll back aspects of a transaction, like DDL).


Not everybody needs strong serializability.

Most people do read (sometimes even in another thread / on another connection) -> transaction -> write. Also, the default transaction isolation in pg would still allow these inconsistencies: phantom reads, non-repeatable reads, and serialization anomalies.
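
For example (hypothetical posts table), at pg's default READ COMMITTED level each statement takes a fresh snapshot, so two identical reads in the same transaction can disagree:

    BEGIN;                        -- default isolation: READ COMMITTED
    SELECT count(*) FROM posts;   -- returns N
    -- another connection commits an INSERT INTO posts here
    SELECT count(*) FROM posts;   -- can return N + 1 in the same transaction
    COMMIT;
    -- at REPEATABLE READ or SERIALIZABLE, both reads would see the same snapshot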


There's a difference between not using transactions because you don't know better and not using transactions for reasons that arise out of a deep understanding of what is going on. 99% of the time, it's the former. The argument that transactions aren't perfect is a shitty reason not to use them, the kind of bad advice that doesn't hurt you today, but may hurt you in the future.


Well, I'm not against using transactions, I use them a lot. But most often I read data on other threads where I don't need a transaction; for validating data or just fetching data there isn't a big need for transactions, especially if your data is allowed to be dirty (not 100% accurate).

Basically, yes, you need transactions to guard against dirty reads, but most often you don't care about dirty data. If you're not dealing with financial data or other data that has to be handled with care (say you have a timeline with posts), you don't care if the last read was dirty, and the user probably doesn't care either; they'll just refresh the page when they want more up-to-date data.



