A quick look at the Redis source code

dang · on Nov 14, 2021

URL seems broken now but article is here:

http://web.archive.org/web/20180303001631/http://www.heychin...

rdtsc · on Oct 15, 2013

Redis is my favorite example of very clean and beautiful C code: just the right amount of comments, good variable names. It is great work. You can tell Salvatore cares and has passion for what he does just by looking at his work.

https://github.com/antirez/redis/tree/unstable/src

free652 · on Oct 15, 2013

Wow, it's awesome. Compared to most of the C code I've seen, it's beautiful. I like the long function names, because I can actually understand what they are doing.

matthiasv · on Oct 15, 2013

Maybe you haven't looked very often, but nice code has been out there for quite some time. Take a look at anything related to the GNOME stack (GLib, GTK+, all the applications) or the Linux kernel, for example. And quite frankly, I stumble upon badly written Python way more often than C.

spullara · on Oct 15, 2013

The code for those might be great, but Redis is beautiful inside and outside. I'm not a fan of the actual output of the GNOME stack. I agree the Linux kernel code is generally great to read and to use.

16s · on Oct 15, 2013

IMO, OpenBSD and tarsnap are other notable samples of clean C code.

maurycy · on Oct 15, 2013

I'd also mention FreeBSD kernel and Niels Provos' code.

avtar · on Oct 15, 2013

"And quite frankly, I stumble upon badly written Python way more often than C."

I'm curious, which public codebases would you consider to be examples of well written Python?

matthiasv · on Oct 15, 2013

The often-cited requests library, Flask and Django. Some counterexamples are high-profile projects such as IPython and Matplotlib.

avtar · on Oct 16, 2013

Thanks. I'll spend some time with Requests over the weekend. It looks like the person who maintains that library is also behind this http://docs.python-guide.org/en/latest/#writing-great-code

Someone · on Oct 15, 2013

What are you comparing that with? I find it fairly normal for C code dating from after, say, 2000.

Also, one thing that made a negative impression on me is https://github.com/antirez/redis/blob/unstable/src/adlist.c:

  /* Free the whole list.
   *
   * This function can't fail. */
  void listRelease(list *list)
  {
      unsigned long len;
      listNode *current, *next;

      current = list->head;
      ...

If you write such a comment, you'd better make sure it is true. And no, an early "list == NULL" check is not the only thing missing.

rdtsc · on Oct 15, 2013

Compared to the average C project out there.

Others mentioned the kernel and GNOME. Redis is nice because it is fairly self-contained. It has a good set of networking code, some command parsing, storage. Kind of a happy medium.

Anyway, I didn't look in depth at understanding the semantics of every module.

Here are some more things, besides the NULL check, that I can see are worrisome in that particular piece of code:

* can len be out of sync with the actual list length?

* can list->free be defined (non-NULL) when the value wasn't allocated with malloc?

* what if an element is inserted twice in the list: will list->free be called on an already freed area?

* on a 64-bit machine unsigned long will be 64 bits and on a 32-bit machine it will be 32 bits; is that a problem?

Which ones do you see?

Now those are issues I see by looking at the module in isolation. But it is not quite isolated. It is part of a larger system. Sometimes there are invariants that are enforced at the input (at the system boundary), so it is not necessary to keep checking for NULL or validating inputs in every single internal function.

That is one good lesson I learned from Erlang. Check your inputs at the boundary, then code for the "happy" path and let errors result in quick and early failure. Maybe prefer a quick segfault to a dangling pointer, or to wondering later how exactly to handle a NULL pointer that is not really expected to be NULL.

etimberg · on Oct 15, 2013

You'd probably want to zero the length in the list structure and remove the dangling pointer to the first element; otherwise you run the risk of a double free.
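
A minimal sketch of that idea, assuming a cut-down singly linked list. The names `simple_list`, `simple_list_empty` and `simple_list_push` are made up for illustration and are not the real adlist.c definitions. After emptying, the header holds no stale pointers and a zero length, so emptying it a second time does nothing:

```c
#include <stdlib.h>

/* Hypothetical, simplified list; not the real adlist.c definitions. */
typedef struct node {
    struct node *next;
    void *value;
} node;

typedef struct simple_list {
    node *head;
    node *tail;
    unsigned long len;
    void (*free_value)(void *value); /* optional per-value destructor */
} simple_list;

/* Free every node, then reset the header so no dangling pointers remain. */
void simple_list_empty(simple_list *l)
{
    if (l == NULL) return;

    node *current = l->head;
    while (current != NULL) {
        node *next = current->next; /* save before freeing the node */
        if (l->free_value) l->free_value(current->value);
        free(current);
        current = next;
    }
    l->head = NULL;
    l->tail = NULL;
    l->len = 0;
}

/* Append a value; returns 0 on success, -1 on allocation failure. */
int simple_list_push(simple_list *l, void *value)
{
    node *n = malloc(sizeof(*n));
    if (n == NULL) return -1;

    n->value = value;
    n->next = NULL;
    if (l->tail) l->tail->next = n;
    else l->head = n;
    l->tail = n;
    l->len++;
    return 0;
}
```

This only protects the header itself. If the caller frees the header too, any pointer to it that is kept around is still dangling, which is the part a comment like "can't fail" cannot promise.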

jeremiep · on Oct 15, 2013

It's actually a quick tutorial describing how to add a new command to Redis, not an actual analysis of the source code as I expected.

rch · on Oct 15, 2013

You might be interested in one of these articles:

http://pauladamsmith.com/articles/redis-under-the-hood.html

http://blog.togo.io/how-to/adding-interval-sets-to-redis

They have both been discussed before though.

HeyChinaski · on Oct 15, 2013

Ah yes, I remember the interval sets article. The first post is a lot more in depth than mine. Both great links, thanks.

HeyChinaski · on Oct 15, 2013

Yeah, it's a pretty shallow introduction. I'd like to write more articles on the Redis code. The sorted set skiplist stuff is really interesting.

hox · on Oct 15, 2013

The entire implementation of sorted sets is really interesting, with a dual implementation of ziplists and skiplists being used depending on the number of elements in the set. I've been meaning to write a bit more about Redis internals lately; maybe I'll start on that in my commute hours.

I've got a couple of general articles on adding a command and adding a datatype to Redis at http://starkiller.net, but I don't get too into existing code. I'd be interested in writing a bit more about the other data structures as well as the multiple strategies used for EXPIRE (which recently changed I believe).

ddorian43 · on Oct 15, 2013

Write: "How to write custom C commands"?

L8D · on Oct 15, 2013

It's more of an analysis of the structure of Redis' code.

GhotiFish · on Oct 15, 2013

I'm certainly a novice in C, but as I was reading, I wondered about this:

  {"get",getCommand,2,"r",0,NULL,1,1,1,0,0},

  "The fourth field, set to "r", is specifying that the 
   command is read only and doesn’t modify any keys’ value 
   or state.  There are a whole bunch of one letter flags 
   that you can specify in this string that are explained 
   in detail in the nearby block comment.  The field 
   following this string  should always be set to zero, and 
   will be computed later.  It’s simply a bitmask 
   representation of the information implied by the string."

Why would you opt for this, when you could specify some constants and bitwise-or them together? Isn't that more common than calculating bitwise flags at run time?

   COMMAND_READONLY | COMMAND_RANDOM | COMMAND_NOSIDEEFFECTS

etc. etc. etc.

I'm sure there's a good reason, but this style seems strange to me.

Maybe Redis makes use of the string later? But I can't help but feel it should build the string based on the flags, rather than build the flags based on the string.
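
For comparison, here is a rough sketch of what "computed later" could look like: a startup pass that walks each flag string and fills in the bitmask field. This is not the Redis code. The `COMMAND_*` constants, the `command` struct and both function names are made up for illustration, and of the letters only "r" for read only comes from the article; "w" and "R" are assumptions.

```c
#include <stddef.h>

/* Hypothetical flag constants, named after the ones suggested above. */
enum {
    COMMAND_WRITE    = 1 << 0,
    COMMAND_READONLY = 1 << 1,
    COMMAND_RANDOM   = 1 << 2
};

/* Hypothetical, cut-down command entry: name, flag string, computed mask. */
typedef struct command {
    const char *name;
    const char *sflags; /* one letter per flag, e.g. "r" */
    int flags;          /* left as 0 in the table, filled in at startup */
} command;

/* Translate one flag string into a bitmask; returns -1 on an unknown letter. */
int parse_flags(const char *sflags)
{
    int flags = 0;
    for (const char *f = sflags; *f != '\0'; f++) {
        switch (*f) {
        case 'w': flags |= COMMAND_WRITE; break;    /* assumed letter */
        case 'r': flags |= COMMAND_READONLY; break; /* from the article */
        case 'R': flags |= COMMAND_RANDOM; break;   /* assumed letter */
        default: return -1;
        }
    }
    return flags;
}

/* Fill in the flags field of every entry once, at startup. */
int populate_flags(command *table, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        int flags = parse_flags(table[i].sflags);
        if (flags < 0) return -1;
        table[i].flags = flags;
    }
    return 0;
}
```

The run-time cost is one short loop per command at startup, so the trade-off is mostly about how the table reads, not about speed.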

HeyChinaski · on Oct 15, 2013

In defence of the technique, the command table is quite succinct and arguably more readable at a glance than if there were a bunch of constants |ed together. I have no idea whether this was the original motivation though.

GhotiFish · on Oct 15, 2013

Yes. It is certainly more pithy.

ACTUALLY! It reminds me of a technique Bisqwit used when he made his emulator. He used strings to define the behavior of certain instructions, the strings were actually interpreted at compile time. Though I think this is a C++ specific trick.

http://www.youtube.com/watch?v=y71lli8MS8s

He brings in the instruction table at 1:30.

jeanjq · on Oct 15, 2013

Fun. This isn't really specific to Redis, but is a good introduction to the sort of thing that C programmers often get up to. You'll see this sort of function table in all sorts of C programs. Take a look at GNU stuff like make and you'll see the same format.

Congratulations on the exploring.
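
For anyone who hasn't met the pattern, a minimal sketch of such a function table, assuming a made-up handler signature and two made-up commands (none of these names come from Redis or make): each row pairs a name with a function pointer, and dispatch is a lookup by name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler signature: takes an argument count, returns a status. */
typedef int (*handler)(int argc);

/* Two made-up handlers: ping accepts anything, echo wants exactly 2 args. */
static int ping_handler(int argc) { (void)argc; return 0; }
static int echo_handler(int argc) { return argc == 2 ? 0 : -1; }

/* One table row: the command name and the function that implements it. */
typedef struct entry {
    const char *name;
    handler fn;
} entry;

static const entry table[] = {
    {"ping", ping_handler},
    {"echo", echo_handler},
};

/* Linear search by name; returns NULL when the command is unknown. */
const entry *lookup(const char *name)
{
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(table[i].name, name) == 0) return &table[i];
    }
    return NULL;
}
```

Adding a command then means writing one function and adding one row, which is why the pattern shows up so often.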

ethanazir · on Oct 15, 2013

I would buy a book written about Redis like this.

qwerta · on Oct 15, 2013

I find it fascinating to compare the H2 and Derby source code. The first was written by a single man, has more features, and is more compact and faster. The second was 'designed by committee' and evolved over a long period of time.

I would also post a link to my project, which is sort of 'Redis in Java', but it would probably be spam.

malkia · on Oct 15, 2013

The Postgres source code has been a pleasure to read, and the commit logs are outstanding.

eliben · on Oct 15, 2013

The Redis source code is very clean and readable. It's a great example of how a non-trivial code base can be written in good C style.

pestrella · on Oct 15, 2013

Excellent write up aimed at curious coders.

jahaja · on Oct 15, 2013

Nice post. I've been working quite a lot with the internals of Redis in the past few months, adding custom commands along with the usual skimming through the builtins. Perhaps I should give people some insight by writing a few blog posts as well. It's a really nice piece of software, written in clean, high-quality C. Not sure about the tests in Tcl though :).

HeyChinaski · on Oct 15, 2013

I was quite surprised to see the Tcl tests. I'm reserving judgement until I've tried writing one though.

jahaja · on Oct 15, 2013

To be fair, the tests themselves are alright, but I'm not too familiar with Tcl and have had problems running them in a CI build, with a lot of redis-server processes being left behind. As the tests are, as far as I've seen, basically integration tests, it would be quite nice to have them in something like Python to make them a bit easier to handle.