The title is a bit misleading :) This is just a preview of Redis Cluster. Rather than just claiming that I'm working on it, I wanted to show the first stuff we have that users can actually play with (but not use in production environments!).
Hey antirez, both this and 2.4 look really great. Can you speak to some of the internals of redis-trib? Is it using a cyclical hash-on-keys approach? What sort of controls are you planning on giving datastore designers?
I was using redis for consuming a multi-gigabyte financial analysis dataset, but eventually moved on because I didn't like using Hiredis on perl or C, so the clustering is very appealing as a way to keep latency down and get bandwidth higher.
Hey. All the possible keys are simply divided into 4096 slots. You can assign any subset of these slots to any given node, so you can assign things in strange ways, like 50% of all the slots to node 1 and 10% each to nodes 2, 3, 4, 5, and 6.
Redis-trib will simply propose something that makes sense, but you will be able to edit the initial guess of the tool.
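A rough sketch of the slot idea described above, in Python. The thread only states that keys are divided into 4096 slots and that slots can be handed to nodes in any proportion; everything else here is an assumption for illustration: the hash function (CRC16 via the standard library's binascii.crc_hqx), the helper names key_slot, assign_slots and node_for_key, the node names, and the choice of contiguous slot ranges. It is not redis-trib's actual logic.

```python
import binascii

NUM_SLOTS = 4096  # slot count stated in the thread


def key_slot(key):
    """Map a key to a slot.

    Assumption: CRC16 of the key modulo the slot count. The thread does
    not say which hash function is used.
    """
    return binascii.crc_hqx(key.encode("utf-8"), 0) % NUM_SLOTS


def assign_slots(weights):
    """Split slots 0..NUM_SLOTS-1 into contiguous ranges, one per node,
    sized in proportion to each node's weight (hypothetical helper)."""
    total = sum(weights.values())
    nodes = list(weights)
    ranges = {}
    start = 0
    cumulative = 0
    for i, node in enumerate(nodes):
        cumulative += weights[node]
        # The last node takes whatever is left so every slot is covered.
        if i == len(nodes) - 1:
            end = NUM_SLOTS
        else:
            end = round(NUM_SLOTS * cumulative / total)
        ranges[node] = range(start, end)
        start = end
    return ranges


def node_for_key(key, ranges):
    """Find the node that owns the slot of the given key."""
    slot = key_slot(key)
    for node, slots in ranges.items():
        if slot in slots:
            return node
    raise LookupError("slot %d is not assigned to any node" % slot)


# The uneven layout from the comment: node 1 gets 50%, nodes 2-6 get 10% each.
weights = {"node1": 50, "node2": 10, "node3": 10,
           "node4": 10, "node5": 10, "node6": 10}
ranges = assign_slots(weights)

for node, slots in ranges.items():
    print(node, "slots", slots.start, "to", slots.stop - 1, "count", len(slots))

print("user:1000 lives on", node_for_key("user:1000", ranges))
```

Editing the initial proposal then amounts to changing the weights (or the ranges directly) before the layout is applied.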
About sharding before Redis Cluster: you can do it already, but with more problems for sure, as resharding will be much simpler with Redis Cluster (however, you can just avoid resharding for now by using a lot of nodes from the start; not optimal, but it works). Not sure what the problem with hiredis or Perl is, I did not understand this part.
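The "lots of nodes from the start" workaround mentioned above can be sketched on the client side roughly like this. Everything in the sketch is hypothetical: the instance count of 64, the addresses, the use of zlib.crc32 as the hash, and the helper names. The point it tries to show is only that the key-to-instance mapping never changes, so growing the deployment means moving whole instances to new hosts rather than rehashing keys.

```python
import zlib

NUM_INSTANCES = 64  # hypothetical: many small instances created up front

# Hypothetical address table: all instances start on one host,
# each listening on its own port.
instances = {i: ("10.0.0.1", 7000 + i) for i in range(NUM_INSTANCES)}


def instance_for_key(key):
    """Fixed mapping from key to instance id (assumed hash: CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_INSTANCES


def address_for_key(key):
    """Look up where the instance owning this key currently lives."""
    return instances[instance_for_key(key)]


def move_instances(instance_ids, new_host):
    """Record that some instances now run on another host.

    Only the address table changes; no key is ever remapped.
    """
    for i in instance_ids:
        _, port = instances[i]
        instances[i] = (new_host, port)


print("before:", address_for_key("user:1000"))
# Later, when one machine is no longer enough, move half of the instances.
move_instances(range(32, 64), "10.0.0.2")
print("after: ", address_for_key("user:1000"))
```

The cost is the one named in the comment: you run more instances than you need at first, which is not optimal but avoids resharding.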
hiredis on Perl works fine, well, mostly fine. I uncovered at least one small bug, which I'll send on at some point.
The issue is that hiredis didn't have a usable python client, and I hated developing in perl, or C. Without hiredis, redis over python is unusably slow for querying 100k+ keys at once, I believe due to python's networking / socket code.
In the end, I ended up moving to pytables for this sort of stuff; it would be nice to use redis for it, but the data access patterns are tuned a little bit more for the web than for a 'do this math on these 3 million data points' kind of use case.
Anyway, I'm excited about redis cluster, I think it fits a huge need for web-scale application development.
Actually, I don't see any reinvention of Erlang's core features here:
- Erlang: a functional language based on the actor model that makes it easy to create distributed applications, since actors don't share state (plus much more that I won't mention here for the sake of brevity). To share state among various Erlang clusters you have to use something like Mnesia (bundled with the Erlang stdlib) or Riak (a distributed database built using Erlang).
- Redis Cluster: a way to distribute state among multiple Redis (a simple database application) nodes in a manner transparent to the applications using Redis.
Just as execution is more important than the idea in a startup, implementation in the right "scope" and with the right tradeoffs is more important than the idea in a software system. So while Redis Cluster will for sure look a lot like any other distributed hash table, I think it is important that we implement it inside our system, with our system language, and with Redis-specific tradeoffs.
antirez -- looking good. But may I ask why the name 'redis-trib' -- what does the trib mean? Short for contrib? Tribune? All I see is the nsfw definition, but maybe that's just me.