NoSQL East 2009 day 2: Pig/Twitter, Cascading, Neo4j, Redis, Sherpa/Yahoo (uggedal.com)
30 points by uggedal on Oct 31, 2009 | hide | past | favorite | 6 comments



There is a problem with the Redis numbers:

"Can do 19,600 gets and 13,900 sets a second on a MacBook Pro"

On my MacBook (not Pro), Redis performs like this:

    % ./redis-benchmark -q
    SET: 34705.88 requests per second
    GET: 31055.90 requests per second
    INCR: 28739.25 requests per second
    LPUSH: 35013.98 requests per second
    LPOP: 30496.95 requests per second
    ^C
But Redis sucks on Mac OS X compared to how it performs on Linux. The same MacBook running Linux reaches almost 100k queries/sec. An entry-level server running Linux is in the 150k/sec zone.

Sorry, but I spent a lot of time making Redis this fast, so seeing numbers an order of magnitude lower does not make me happy ;)

About replication: Redis supports master-slave replication with a very fast first synchronization. Replication is non-blocking: if you attach N slaves to the master, it continues to reply to clients without trouble while syncing with the slaves.

If the link between master and slave goes down, the two will resynchronize automatically. A replica can also be used to enhance data durability.

Replication can be controlled at runtime. For instance, if you want an instance to become a replica of another, all you need to do is something like this:

    echo -e "slaveof 1.2.3.4 6379\r\n" | nc 1.1.1.1 6379
A final note about the snapshotting persistence mode: Redis edge on git already supports an append-only journal, which makes Redis an option even when the data is very important.


Kevin did those benchmarks (as I watched) on a single MBP using the Ruby client, specifically Ezra Zygmuntowicz's benchmark script. I would definitely consider his numbers circumstantial, but also the bottom end of what's possible, considering that it's a laptop and the performance characteristics of Ruby.


The benchmark is a C program. Do any of the clients come close to matching the benchmark?


Yes, it's just about parallelization.

If you measure the performance, even of a C client, in a busy loop, you are really measuring the round-trip time: it's a request-reply protocol, and most clients block until the reply is ready.

Even using a Ruby / Python / ... client, if you run N such clients in parallel, you'll see that Redis can handle this number of queries per second.
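A toy model (plain Python, with made-up numbers, not real Redis measurements) of why a single blocking client measures round-trip time rather than server capacity:

```python
# Toy model: a request-reply client that blocks on each reply can never
# observe more than 1/RTT requests per second, regardless of how fast
# the server is. All numbers below are illustrative.

def observed_qps(n_clients, rtt_s, server_capacity_qps):
    """Aggregate queries/sec seen by n blocking clients in busy loops."""
    # Each blocking client completes at most 1/rtt_s requests per second;
    # the server caps the aggregate at server_capacity_qps.
    return min(n_clients / rtt_s, server_capacity_qps)

# One client over a 50-microsecond round trip: ~20k requests/sec,
# even if the server itself could do 100k.
print(observed_qps(1, 50e-6, 100_000))    # ~20000
# Ten parallel clients saturate the server instead.
print(observed_qps(10, 50e-6, 100_000))   # ~100000
```

So the single-client number mostly reflects the client and the network stack, not the server.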


I understand that. I was gently hinting that the conf. presenter was probably using a (single) client, given his audience.

As an aside, from the end user's point of view (assuming the typical end user is a web 2.0 app), throughput isn't the only consideration. Even with N clients achieving 100k/s throughput, request latency is likely going to be N × (1/tps). ~0.1 ms is probably the sort of request latency the end user will see, not 0.03 ms (taking your Mac numbers as the baseline). Bump up the number of clients and that latency is going to get higher, even while throughput gets better.
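The arithmetic above can be sketched as a one-liner (illustrative numbers only; this is just Little's law with N requests in flight):

```python
# With N blocking clients sharing one server, each request's latency is
# roughly N / throughput. All numbers are illustrative, not measurements.

def expected_latency_ms(n_clients, throughput_qps):
    """Approximate per-request latency in milliseconds."""
    return n_clients / throughput_qps * 1000.0

# One client on a server doing ~30k ops/sec: ~0.033 ms per request.
print(round(expected_latency_ms(1, 30_000), 3))
# Ten clients pushing the same server to 100k ops/sec: ~0.1 ms each.
print(round(expected_latency_ms(10, 100_000), 3))
```

Throughput improves with more clients, but each individual request waits longer.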

This has nothing to do with Redis (which is great). Just something to keep in mind when looking at this sort of performance measure.


I agree with you that requests/second is not the only or most sensible metric for performance; that's why redis-benchmark reports latency percentiles too. I just suppressed that output in the example above, but it looks like this:

    ====== SET ======
    10008 requests completed in 0.39 seconds
    50 parallel clients
    3 bytes payload
    keep alive: 1

    1.03% <= 0 milliseconds
    38.83% <= 1 milliseconds
    73.12% <= 2 milliseconds
    95.34% <= 3 milliseconds
    97.93% <= 4 milliseconds
    99.50% <= 5 milliseconds
    99.75% <= 7 milliseconds
    99.84% <= 8 milliseconds
    99.93% <= 9 milliseconds
    99.94% <= 10 milliseconds
    100.00% <= 11 milliseconds
    25401.02 requests per second
As you can see, under this load most clients are served within 4 milliseconds or less, including both the transmission of the request and the reception of the full reply.
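For what it's worth, a cumulative table like the one above can be derived from raw per-request latencies with a few lines of code (this is just a sketch on hypothetical data, not redis-benchmark's actual implementation):

```python
# Build a redis-benchmark-style cumulative latency table from raw
# per-request latencies in whole milliseconds (made-up sample data).

def cumulative_table(latencies_ms):
    """Return (percent_served, ms) pairs: % of requests served in <= ms."""
    total = len(latencies_ms)
    table = []
    for ms in range(max(latencies_ms) + 1):
        served = sum(1 for l in latencies_ms if l <= ms)
        table.append((100.0 * served / total, ms))
    return table

samples = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5]   # hypothetical latencies (ms)
for pct, ms in cumulative_table(samples):
    print("%.2f%% <= %d milliseconds" % (pct, ms))
```

Each row answers "what fraction of requests finished within this many milliseconds", which is why the percentages climb monotonically to 100%.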




