I can't say I've ever put something in my cart, not logged in to Amazon for mont...

haberman · on April 4, 2012

Have you ever worked in an environment at the scale of Amazon? Machines go down all the time. I used the "six months" example to illustrate that the shopping cart is persistent, but the machine crash could just as easily happen the moment before you push "checkout." Losing shopping cart data just because one machine crashed is totally unacceptable.

Programming distributed systems that must survive machine failure and network partitions is a completely different ball of wax compared to simple web programming. If you haven't done it before, you would not believe how much more complicated it is.

Here is an extremely simple example. Suppose you're a radio station that's taking phone calls from people and you want to give an award to the 5th caller. Implementing a program that does this for a single machine is easy, and could be accomplished with a program something like this:

  import socketserver  # named SocketServer in Python 2

  class MyHandler(socketserver.BaseRequestHandler):
    def handle(self):
      # Count this connection; TCPServer serves connections one at a time.
      self.server.caller += 1
      if self.server.caller == 5:
        self.request.sendall(b"Congratulations, you are the 5th caller!\n")
      else:
        msg = "Sorry, you're caller #%d\n" % self.server.caller
        self.request.sendall(msg.encode())  # sockets send bytes, not str

  server = socketserver.TCPServer(("localhost", 9999), MyHandler)
  server.caller = 0  # number of callers seen so far
  server.serve_forever()

Using this program, you can "call" the program by doing "telnet localhost 9999" and the program will tell you what caller you are. This took me about 5 minutes to write, and I'd never used this Python API before.

Now imagine that you want to implement this same logic, but using a cluster of machines that could go down at any time. You want the group of machines to form "consensus" about which number each caller is; consensus in this context just means that the group of machines arrives at a single answer, and any machine you ask will give you the same answer.

Finding an algorithm that can do this robustly is so difficult that it was a major breakthrough when one was first described in 1989 (it was not published in a journal until 1998). It's called Paxos and you can read about it here: http://en.wikipedia.org/wiki/Paxos_(computer_science) Even though it has been known for over 20 years, it is still a complex topic whose details very few people understand.
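
To give a rough feel for what such an algorithm involves, here is a minimal single-decree Paxos sketch in Python 3. It is a toy under loud assumptions: everything runs in one process, "messages" are plain method calls, acceptors never lose their state, and the names `Acceptor`, `propose`, and the `up` flag are made up for illustration. A real deployment also has to handle timeouts, retries with fresh ballot numbers, and durable storage.

```python
class Acceptor:
    """One voting machine. Hypothetical toy: state lives only in memory."""

    def __init__(self):
        self.promised = -1    # highest ballot this acceptor has promised
        self.accepted = None  # (ballot, value) it last accepted, or None
        self.up = True        # set to False to simulate a crashed machine

    def prepare(self, ballot):
        # Phase 1: promise to ignore older ballots, report any accepted value.
        if not self.up or ballot <= self.promised:
            return False, None
        self.promised = ballot
        return True, self.accepted

    def accept(self, ballot, value):
        # Phase 2: accept unless a newer ballot has been promised meanwhile.
        if not self.up or ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted = (ballot, value)
        return True


def propose(acceptors, ballot, value):
    """Try to get `value` chosen; return the chosen value, or None on failure."""
    majority = len(acceptors) // 2 + 1

    # Phase 1: collect promises from a majority.
    replies = [a.prepare(ballot) for a in acceptors]
    granted = [accepted for ok, accepted in replies if ok]
    if len(granted) < majority:
        return None

    # If any acceptor already accepted a value, we must adopt the one with
    # the highest ballot instead of our own. This is what keeps the answer stable.
    prior = [acc for acc in granted if acc is not None]
    if prior:
        value = max(prior, key=lambda acc: acc[0])[1]

    # Phase 2: ask the acceptors to accept the (possibly adopted) value.
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= majority else None


# Assumed scenario: five machines must agree on who the 5th caller was.
acceptors = [Acceptor() for _ in range(5)]
first = propose(acceptors, ballot=1, value="alice")
acceptors[0].up = acceptors[1].up = False  # two machines crash
second = propose(acceptors, ballot=2, value="bob")
print(first, second)
```

In this run the second proposer wanted "bob" but still comes back with "alice": a majority had already accepted that value, and the three surviving acceptors report it during phase 1.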

The point of all of this is just to say: you can't compare a single-process in-memory cache to a distributed and fault-tolerant system. They are completely different beasts, and many business problems do indeed need the latter.

justinsb · on April 4, 2012

I know about Paxos. I know when to use it, and when not to use it.

Seeing as you came up with the example: using Paxos for an "Nth caller wins" contest is a really bad idea, because the Nth caller to reach your switchboard likely won't be the one selected. You probably _need_ a single-server system, like the example you wrote (but without the threading errors).

haberman · on April 4, 2012

Then why would you compare Dynamo and memcached? Or suggest that Amazon could use an in-memory cache as the primary store for shopping cart data? It makes no sense.

If you want to be particular about ordering, then you can always use Paxos to elect a master that handles everything serially and fails over.
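
As a hedged sketch of that idea in Python 3: one master numbers the callers serially and copies the count to the other live replicas before answering, and a new master takes over if the old one dies. The `Replica` and `Cluster` classes are hypothetical, everything runs in one process, and `elect` is only a stand-in for a real Paxos-based election (it simply requires a live majority and picks the most up-to-date live replica).

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True   # set to False to simulate a crash
        self.count = 0   # highest caller number this replica knows about


class Cluster:
    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]
        self.master = None

    def _live_majority(self):
        # Return the live replicas, or fail if they are not a majority.
        live = [r for r in self.replicas if r.up]
        if len(live) <= len(self.replicas) // 2:
            raise RuntimeError("no live majority")
        return live

    def elect(self):
        # Stand-in for a Paxos election (assumption, not real Paxos).
        # Every count was written to a majority, so any live majority
        # contains at least one replica holding the latest count.
        live = self._live_majority()
        self.master = max(live, key=lambda r: (r.count, r.name))
        return self.master

    def take_call(self):
        # Fail over if there is no master or the master has crashed.
        if self.master is None or not self.master.up:
            self.elect()
        live = self._live_majority()
        # The master alone assigns numbers, so callers are ordered serially.
        number = self.master.count + 1
        for r in live:
            r.count = number  # replicate before answering the caller
        return number


cluster = Cluster(["a", "b", "c"])
numbers = [cluster.take_call() for _ in range(3)]
cluster.master.up = False  # the master crashes after caller 3
numbers += [cluster.take_call() for _ in range(2)]
print(numbers)
```

Callers 4 and 5 are numbered by the replacement master, so the sequence continues without a gap or a repeat even though the original master is gone.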

justinsb · on April 4, 2012

I see you work at Google. How about advancing the conversation by telling us how Google stores its sessions?

haberman · on April 5, 2012

Let's make a deal. I'll start "advancing the conversation" once you stop misleading people (i.e., admit that your initial sweeping claims were incorrect).

justinsb · on April 5, 2012

Deal. My initial sweeping claims were incorrect.

So: How does Google store its sessions?

haberman · on April 5, 2012

So I'm not an expert in Google's front-ends and I'm not sure that Google stores "sessions" in the way you'd generally think of them in web apps. But usually when we have requirements like what you'd need for sessions (highly available, low-latency, highly scalable) we use Megastore:

http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf

http://www.readwriteweb.com/cloud/2011/02/megastore-googles-...

Megastore is a layer on top of Bigtable that adds indexing, synchronous replication across data centers, and ACID semantics within small partitions called "entity groups."

justinsb · on April 5, 2012

Good deal! Agreed that traditional sessions are best avoided, but it's good to know that Megastore is suitable for session data.