Hacker News new | past | comments | ask | show | jobs | submit login

How to do sharding:

1. Define your keyspace in bits.

2. Shard your keyspace in CIDR notation

3. Resharding always splits a keyspace in half

  eg. 0.0.0.0/0 becomes 0.0.0.0/1 and 128.0.0.0/1
4. Store your shard keyspace in DNS with SRV/TXT records

5. Assign IDs randomly

This gives a couple interesting properties, for replication odds are very high that your corresponding server has a shard of similar size and if not its generally only split over two servers. Servers also only need to speak directly to their one or two replicas in the other data center. The other really nice property if you're using spindle disks is that you can copy the entire HD and just delete the half of the keyspace not used after the split.

It's all written up here if you want to dig into with a full description of all the pieces: http://bit.ly/FredPatent




It takes some nerve to actually be proud of that patent.

Luckily for the rest of us, there is not just one way to do sharding (which in the original version of your post you assertively referred to as "how to do sharding right").

It depends on the individual architecture and the tradeoffs people consider important. That means you don't get to have an undeserved but legal monopoly holding the entire scalable web hostage, just a part of it. Nevertheless: you should be ashamed of yourself.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: