That was my thinking as well, though a peek at the actual code suggests a pretty deep expectation that the client is speaking strings. For example, in the code that handles the ZRANGE command[1] I see:
if (c->argc == 5 && !strcasecmp(c->argv[4]->ptr,"withscores"))
I guess this means someone would first have to create an intermediate binary format, then rewrite the command handlers to expect that format, and then build client libraries that can produce it. Perhaps still worth it in the end, but not trivial.
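
To make the size of that change a bit more concrete, here is a rough sketch of the two styles side by side. The first function mirrors the string comparison quoted above, using a plain array of C strings instead of Redis's own argument objects. The second shows what a handler might look like if options arrived as bits in a fixed header. The header layout, the `ZRANGE_FLAG_WITHSCORES` bit, and both function names are made up for illustration; none of them exist in Redis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <strings.h> /* strcasecmp (POSIX) */

/* Hypothetical flag bit for a binary request; not part of Redis. */
#define ZRANGE_FLAG_WITHSCORES 0x01u

/* Text style: the option is the fifth argument and is matched
 * case-insensitively, as in the handler quoted above. */
static int withscores_text(int argc, const char **argv) {
    assert(argv != NULL);
    return argc == 5 && strcasecmp(argv[4], "withscores") == 0;
}

/* Hypothetical fixed-size header a binary protocol might send
 * ahead of the key and range arguments. */
struct bin_request_header {
    uint8_t opcode; /* which command, e.g. some number meaning ZRANGE */
    uint8_t flags;  /* per-command option bits */
};

/* Binary style: the option is a single bit test, no string compare. */
static int withscores_binary(const struct bin_request_header *h) {
    assert(h != NULL);
    return (h->flags & ZRANGE_FLAG_WITHSCORES) != 0;
}
```

The check itself gets simpler, but every handler that currently inspects its arguments as strings would need this kind of rewrite, and every client would need to learn the new encoding.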
I can see cases where a really optimized system could benefit from a binary protocol, but I suspect it'd be a loss for most people.