Thanks for taking the time to elaborate, very interesting. I wonder if the sql d...

saurik · on Aug 29, 2012

I am not certain whether your response comes from a reading of my comment before or after the paragraph I added that started with "for a more complete implementation", but if it was from before I encourage you to read that section: the ability to do the query is pretty much already there due to the xmin/xmax fields that PostgreSQL is already reifying.

saurik · on Aug 29, 2012

(edit: Apparently, nearly an hour ago, jeltz pointed out that PostgreSQL 9.2 actually has implemented nearly this identical functionality through the usage of exported snapshots, so I recommend people go read that comment and the linked documentation. However, my comment is still an example of the functionality working.)

(edit: Ah, but the feature as implemented actually saves a file to disk and thereby has a lot of server-side state: the way I've gone ahead and implemented it does not have this complexity; I simply take a single integer and store nothing on the server.)

http://news.ycombinator.com/item?id=4448472

> I wonder if the sql db vendors or open source projects will take the next step to make querying against a transaction ID possible given the underlying implementation details bring it pretty close.

For the hell of it, I just went ahead and implemented the "backdate a transaction" feature; I didn't solve the vacuum guarantees problem, however: I only made it so that a transaction can be backdated to another point in time.

To demonstrate, I will start with a very similar sequence of events to before. However, I am going to instead use txid_current_snapshot(), which returns the range (and an exception set that will be unused for this example) of transaction identifiers that are valid.

Connection 1:

    demo=# create table q (data int);
    CREATE TABLE
    demo=# begin; select txid_current_snapshot();
    BEGIN
    710:710:
    demo=# insert into q (data) values (0); commit;
    INSERT 0 1
    COMMIT
    demo=# begin; select txid_current_snapshot();
    BEGIN
    711:711:
    demo=# select xmin, xmax, data from q;
    710|0|0

Connection 2:

    demo=# begin; select txid_current_snapshot();
    BEGIN
    711:711:
    demo=# update q set data = 1; commit;
    UPDATE 1
    COMMIT
    demo=# select xmin, xmax, data from q;
    711|0|1

Connection 1:

    demo=# select xmin, xmax, data from q;
    710|711|0
    demo=# begin; select txid_current_snapshot();
    BEGIN
    712:712:
    demo=# select xmin, xmax, data from q;
    711|0|1

So far, this is the same scenario as before: I have two connections that are seeing different visibility to the same data, based on these snapshots. Now, however, I'd like to "go back in time": I want our first connection to be able to use the same basis for its consistency that we were using in the previous transaction.

Connection 1:

    demo=# set snapshot_txid = 711;
    SET
    demo=# select txid_current_snapshot();
    711:711:
    demo=# select xmin, xmax, data from q;
    710|711|0

This new variable, snapshot_txid, is something I created: it gets the current transaction's active snapshot and modifies it to be a range from that transaction id to that same id (I think a better version of this would take the exact same string value that is returned by txid_current_snapshot()).

From that previous basis, the row with the value 0 is visible, not the row with the value 1. I can, of course, go back to the future snapshot if I wish, in order to view the new row. (I am not yet certain what this will do to things writing to the database; this might actually be sufficient, however I feel like I might need to either mess with more things or mark the transaction read-only.)

Connection 1:

    demo=# set snapshot_txid = 712;
    SET
    demo=# select txid_current_snapshot();
    712:712:
    demo=# select xmin, xmax, data from q;
    711|0|1