Reply to Aphyr attack on Redis Sentinel (antirez.com)
171 points by mattyb on May 20, 2013 | 40 comments



It's very refreshing to see here that "attack" is not used in the way that one might expect from just the headline, meaning "a possibly unwarranted criticism that I didn't like or found unfair, or that I am taking personally".


I am endlessly impressed with how antirez responds to any critique of Redis that I've ever seen. He's always taken it as a positive and looked for the truth in the critique, rather than searching for something to be wrong so he can discredit it.

My opinion of him and the Redis project increases further every time.


Really? I hadn't seen the original posts before clicking on this one and I assumed this was some kind of security breach... I hadn't heard of Aphyr before, but just assumed it was some kind of netsec (white or black hat) group. I actually skimmed the OP's first paragraphs several times because I didn't understand what was going on.

That said, I agree that DB reliability should be approached with the same rigor as net security... but I was kind of under the impression that it already was, in that DBs are pretty serious business. Also, "attack" has the connotation of, well, an "attack": here, some of the failures happen during regular business operations, which is a different problem from a system being under "attack".

But at least the OP took the criticism graciously. When I read what the case actually was, I then worried that the OP was having a bunker mentality.


Tangentially related:

In the PostgreSQL evaluation[0], Aphyr noticed that, if the packet acknowledging a commit is dropped, the client is left not knowing whether the transaction actually went through.

Does PostgreSQL keep a record of past transactions and their success or failure? If so, is it possible to query it?

[0] http://aphyr.com/posts/282-call-me-maybe-postgres


Yes, you can recover from lost acknowledgements by asking for the transaction ID from postgres before committing--or by making up your own flake ID and writing it to a table. Given a queue with at-least-once delivery (which includes, say, durable storage on the client), you can check for the presence of that ID at a later time and re-apply the transaction to recover from network errors safely.

The transaction ID does wrap around, so there's a time limit depending on your transaction throughput. You can also ask for certain transactional properties on rows, though this won't allow you to recover in all (most?) cases.
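Not from the thread, but roughly what the second approach (a client-generated "flake"/receipt ID written in the same transaction) can look like; this is a minimal sketch using psycopg2, and the table names, column names, and connection string are invented for illustration:

```python
# Sketch of recovering from a lost commit acknowledgement by writing a
# client-chosen receipt/flake ID in the same transaction as the real write.
# Table names, column names, and the connection string are made up.
import uuid

import psycopg2


def apply_with_receipt(conn, receipt_id, amount):
    """Perform the real write and record the client-chosen receipt atomically."""
    with conn.cursor() as cur:
        cur.execute("INSERT INTO receipts (id) VALUES (%s)", (receipt_id,))
        cur.execute(
            "UPDATE accounts SET balance = balance + %s WHERE name = 'alice'",
            (amount,),
        )
    conn.commit()


def was_applied(conn, receipt_id):
    """After a dropped ack, ask the server whether the receipt (and the write) landed."""
    with conn.cursor() as cur:
        cur.execute("SELECT 1 FROM receipts WHERE id = %s", (receipt_id,))
        return cur.fetchone() is not None


# Generate (and durably store) the receipt ID before attempting the commit, so it
# survives a crash; on a network error, reconnect, check, and retry only if needed.
receipt = str(uuid.uuid4())
conn = psycopg2.connect("dbname=shop")
try:
    apply_with_receipt(conn, receipt, 10)
except psycopg2.OperationalError:
    conn = psycopg2.connect("dbname=shop")  # the old connection is likely unusable
    if not was_applied(conn, receipt):
        apply_with_receipt(conn, receipt, 10)  # safe: the receipt makes the retry idempotent
```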


Database constraints usually catch these problems in the event of re-submission, especially if the client can assign primary keys (e.g., a UUIDv4) a priori, though the same tends to hold in simpler cases too.

All in all, I am not sure anyone should find this surprising: anyone who has ever had the network stall when clicking the 'confirm' button at a web-based store is familiar with the uncertainty about whether the order was submitted or not (typically resolved by browsing the order history, waiting for an email, or never finding out).

I would guess modern e-commerce vendors would send you a UUID or moral equivalent to de-dup cart resubmissions these days...but if not, it'd be interesting to know why not.
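In the same spirit, a rough sketch of the client-assigned-primary-key dedup described above; the schema and names are assumptions for illustration, not anything a real vendor does:

```python
# Sketch of the "client assigns the primary key a priori" idea: re-submitting the
# same order just trips the primary-key constraint. Schema and names are invented.
import uuid

import psycopg2

conn = psycopg2.connect("dbname=shop")
order_id = str(uuid.uuid4())  # chosen client-side before the first submission


def submit_order(conn, order_id, cart_json):
    """Insert the order; a duplicate submission violates the PK and becomes a no-op."""
    try:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO orders (id, cart) VALUES (%s, %s)",
                (order_id, cart_json),
            )
        conn.commit()
    except psycopg2.IntegrityError:
        conn.rollback()  # already there: the earlier submission went through


# A timed-out 'confirm' click can be retried blindly; the constraint de-dups it.
submit_order(conn, order_id, '{"items": [42]}')
submit_order(conn, order_id, '{"items": [42]}')  # duplicate, safely ignored
```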


Correct; if your writes are idempotent, retrying is safe. I cover this in the post as well. My above comment shows that it's possible to recover consistency even for writes which are not idempotent--though depending on the semantics of your retries, there may be some locking required.


Hey! I didn't expect you to chime in right here.

Thanks for the explanation.


Yet more tangentially related: it is an instance of the Two Generals (coordinated attack) problem, which is unsolvable in general: no finite protocol guarantees consistent state in the presence of packet loss.


Yep, FLP applies here--but if a network works long enough to complete a round eventually, e3PC or similar can succeed. Pretty much all real-world networks do that. :)


Redis is one of those things I both love and love to hate.

I've had good results using Redis as a lock server, but I live in (perhaps misplaced) fear of a client hanging or crashing and leaving a lock stranded. Not that this is really Redis's problem.


Hello, you can easily build a lock that auto-releases itself after a timeout using the new (2.6.13) extended SET command (see http://redis.io/commands/set) or simply a Lua script.
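Roughly, that pattern looks like this from a client (a minimal sketch using redis-py; the key name, token handling, and 30-second timeout are illustrative assumptions, not from the post):

```python
# Minimal sketch of an auto-releasing Redis lock built on the extended SET
# (SET key value NX PX ttl) plus a token-checking Lua release. The key name,
# token handling, and 30-second TTL are illustrative.
import uuid

import redis

r = redis.Redis()

# Delete the key only if it still holds our token, so we never release a lock
# that already expired and was re-acquired by another client.
RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
"""
release_script = r.register_script(RELEASE_SCRIPT)


def acquire(lock_name, ttl_ms=30000):
    """Try to take the lock; Redis expires it after ttl_ms if we never release it."""
    token = str(uuid.uuid4())
    if r.set(lock_name, token, nx=True, px=ttl_ms):
        return token
    return None


def release(lock_name, token):
    release_script(keys=[lock_name], args=[token])


token = acquire("lock:example-job")
if token is not None:
    try:
        pass  # critical section guarded by the lock
    finally:
        release("lock:example-job", token)
```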


Since the jobs we're locking can take somewhat inconsistent amounts of time, we're actually using an implementation where a task acquires a lock with a time limit and can keep extending it as long as it still holds it, so locks do potentially auto-release.

Even given this, bad lock timing (not that likely) or a crash (more likely) could let inconsistency in.
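Presumably something along these lines: a sketch of the extend-only-if-still-held step, using the same token-checking style as above (the script and names are mine, not the poster's):

```python
# Sketch of "extend the lock only while we still hold it": the same token check
# as a safe release, but with PEXPIRE instead of DEL. Names and TTL are illustrative.
import redis

r = redis.Redis()

EXTEND_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('pexpire', KEYS[1], ARGV[2])
else
    return 0
end
"""
extend_script = r.register_script(EXTEND_SCRIPT)


def extend_lock(lock_name, token, ttl_ms=30000):
    """Push the lock's expiry out by ttl_ms, but only if it still holds our token."""
    return bool(extend_script(keys=[lock_name], args=[token, ttl_ms]))

# A running task calls this periodically as a heartbeat; if it crashes or stalls,
# the TTL runs out and the lock auto-releases, which is the window described above.
```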

Shrugs

Like I said, my problem is not really Redis's. If I can't trust everything that uses a lock not to crash 99.99% of the time I should really be looking at our jobs and not at Redis.

Even then, though, it's probably more a matter of me not trusting things than it is said things not actually being trustworthy.


We're about to open source a similar deal (redis-based "soft guarantee" mutexes) -- ours is written in Python and mostly used as a way to coordinate (very frequent) parallel task execution a la CountDownLatch, so 100% reliable exclusion in the face of failure isn't critical.

I'd be interested to hear about your implementation if you can share (email is HN username at gmail.com)


I sent you a rather quick email.


I recommend dreadlock:

https://github.com/jamwt/dreadlock

It will release the lock when the client dies (disclaimer: I wrote it).

Or you can go whole hog and use zookeeper + ephemeral nodes. More robust but quite a bit more complex.


A response to this article on Redis: http://aphyr.com/posts/283-call-me-maybe-redis


His continued use of "CP" confused me for a while, so TIL about CAP Theorem

http://en.wikipedia.org/wiki/CAP_theorem


If you have the time, this video by Basho's CTO will give you a much better understanding of the tradeoffs that are involved in distributed system design: http://www.infoq.com/presentations/Concurrency-Scale-Distrib...

A great alternative to thinking about things in terms of CAP that Justin brings up is harvest-yield, where yield is the probability of completing a request and harvest is the fraction of your data that the response actually represents. Here's the paper: http://lab.mscs.mu.edu/Dist2012/lectures/HarvestYield.pdf
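Paraphrasing the paper's two quantities as ratios (my wording, not a quote from it):

```latex
% Harvest and yield as defined in the Fox & Brewer paper linked above (paraphrased).
\[
  \text{yield}   = \frac{\text{requests completed}}{\text{requests made}},
  \qquad
  \text{harvest} = \frac{\text{data reflected in the response}}{\text{complete data}}
\]
```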



hmm, i'm not sure how it could be better worded, but since antirez already links to this, i had thought you were posting a response to antirez's comments


I'm frustrated that when the HN editors deduped the original story, they apparently deleted ALL the instances, leaving only this one. I wanted to read the discussion on the subject of Aphyr's research, not Antirez' response.

It looks bad, HN. We all know that VMWare is litigious (try looking up benchmarks sometime). But to (presumably) cave so quickly and effortlessly suggests... well, I'm not sure.

The other possibility is that Aphyr yanked them himself, probably under duress (or else there'd just be an 'update' at the bottom of the research's page.) Aphyr, is this what happened? I figure you probably can't talk freely if so, but say something.


Hello,

1) I no longer work for VMware, but for Pivotal. Redis is open source and the copyright belongs to the original people who wrote the code: me, Pieter Noordhuis, and other contributors.

2) I posted the link to the original article in the very first lines of my reply. In fact, thanks to my reply, the Aphyr research got more exposure for Redis than for any of the other data stores mentioned. I publicly said thank you to Aphyr on Twitter, and posted his blog post.

So I really don't understand your theories here.


Sorry, to clarify -- I was suggesting that it was possible that VMWare (a sponsor of Redis, correct?) leaned on someone. I didn't mean to besmirch you or redis, antirez, and I enjoyed your response.

It wouldn't be the first time a reputable news site was forced to bury a story by a litigious company. Sponsoring FOSS does not make any organization beyond doubt. Especially if they, say, have a history of suing anyone who benchmarks them.


As Antirez says, VMware were formerly a sponsor of Redis, and he now works for Pivotal (as do I), who are the current sponsor of the project. Either way, I'm highly skeptical that anyone at either company did such a thing.


I have a background in ethics and law so I've seen too much to make apologies for being suspicious. :) In fact, this sort of suspicion is a good reason NOT to establish a track record of litigating away freedom of speech (as VMWare notoriously threatens to do if someone publishes their benchmarks). But again: nothing to do with Redis, if (as you say) VMWare is no longer a sponsor.


https://www.hnsearch.com/search#request/all&q=aphyr.com

HN stories on my original posts are still there, as far as I can tell. They just never hit frontpage.


Aphyr, this is very lame: it's not common to see work like what you did, and none of your stories hit the HN front page? I don't know what to think, but I hope that at least my post will help show your awesome work to more people.


I think Aphyr's series was a little too meaty for the general HN audience (of today).

Talking about things like the FLP impossibility result, the CAP theorem, and specifying protocols with TLA+ may be a bit over the heads of many HN readers; clearly, people would rather read stories about the latest funding round, acquisition, or frontend UI framework than a substantive article on distributed systems.


It's not fair to imply that these things are over the heads of HN readers. There are plenty of smart people who just might not care about distributed systems enough to read through. Does my lack of reading medical journals speak to my ability to read and comprehend them?


Yes, they did -- I saw them do so. There were several, in fact. And then they were gone. You've been robbed?


It is extremely unlikely that any pressure was put on the HN admins by VMWare or anyone else to get stories scrubbed. It's almost as unlikely that VMWare gives a shit about stories about Redis.


It's not unlikely they got a bunch of (unjust) flags.


Based on pressure from VMWare? No, that's extraordinarily unlikely.


RethinkDB people, how does your database compare?



Same limitations as any asynchronously replicated system; if both nodes diverge during a partition, you'll probably have to drop one's writes.

http://aphyr.com/posts/287-asynchronous-replication-with-fai...


Right. By operating at the block level it's a little more portable than most of the solutions discussed, though. Worth people's consideration, IMHO.


I'm inclined to think just the opposite. It's often possible to recover divergent data structures logically. Good luck doing that on an arbitrary block store.


My impression is that most DRBD setups mark the backing volume to record which node last held 'master' (i.e., write capacity), which avoids this issue. However, to achieve this reliably it needs out-of-band STONITH (shoot-the-other-node-in-the-head), e.g., via IPMI.



