Redis is my favorite example of very clean and beautiful C code: just the right amount of comments, good variable names. It is great work. You can tell Salvatore cares and has passion for what he does just by looking at his work.
Wow, it's awesome. Compared to most of the C code I've seen, it's beautiful. I like the long method names, because I can actually understand what they are doing.
Maybe you haven't looked very often, but nice code has been out there for quite some time. Take a look at anything related to the GNOME stack (GLib, GTK+, all the applications) or the Linux kernel, for example. And quite frankly, I stumble upon badly written Python way more often than C.
The code for those might be great, but Redis is beautiful inside and outside. I'm not a fan of the actual output of the GNOME stack. I agree the Linux kernel code is generally great to read and to use.
Compared to the average of other C projects out there.
Others mentioned the kernel and GNOME. Redis is nice because it is fairly self-contained. It has a good set of networking code, some command parsing, storage. Kind of a happy medium.
Anyway, I didn't look in depth at the semantics of every module.
Here are some more things (besides the NULL check) that I can see are worrisome in that particular piece of code:
* can len be out of sync with the actual list length?
* can list->free be defined (non-NULL) when the value wasn't allocated with malloc?
* what if an element is inserted twice in the list: will list->free be called on an already freed area?
* on a 64-bit machine unsigned long will usually be 64 bits and on a 32-bit machine it will be 32 bits; is that a problem?
Which ones do you see?
Now those are issues I see by looking at the module in isolation. But it is not quite isolated. It is part of a larger module. Sometimes there are invariants that are enforced at the input (at the system boundary), so it is not necessary to keep checking for NULL or validating inputs in every single internal function.
That is one good lesson I learned from Erlang. Check your inputs at the boundary, then code for the "happy" path and let errors result in quick and early failure. Maybe prefer a quick segfault over a dangling pointer, or over wondering later how exactly to handle a NULL pointer that is not really expected to be NULL.
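To make the "check at the boundary, then code the happy path" idea concrete, here is a minimal C sketch. The names `handle_request` and `key_hash` are hypothetical and have nothing to do with the Redis source; the point is only that the public entry point validates once, and the internal helper states its invariant with an `assert` instead of re-checking and returning an error.

```c
#include <assert.h>
#include <stddef.h>

/* Internal helper: callers guarantee a non-NULL, non-empty key.
 * The assert documents the invariant and fails fast in debug builds. */
static size_t key_hash(const char *key) {
    assert(key != NULL && key[0] != '\0');
    size_t h = 5381;
    for (const char *p = key; *p != '\0'; p++)
        h = h * 33 + (unsigned char)*p;
    return h;
}

/* Boundary function: the only place where untrusted input is validated.
 * Returns 0 and stores the bucket index in *slot_out, or -1 on bad input. */
int handle_request(const char *key, size_t buckets, size_t *slot_out) {
    if (key == NULL || key[0] == '\0' || buckets == 0 || slot_out == NULL)
        return -1;
    *slot_out = key_hash(key) % buckets;  /* happy path, no further checks */
    return 0;
}
```

With this split, a NULL that somehow reaches `key_hash` is treated as a bug to crash on, not as a case to handle.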
You'd probably want to zero the length in the list structure and remove the dangling pointer to the first element; otherwise you run the risk of a double-free.
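A rough sketch of what that could look like, using a hypothetical singly linked list that only borrows the general shape (a `len` field and an optional `free` callback) from the code under discussion. This is not the Redis definition. It walks by pointer rather than trusting `len`, and resets both `head` and `len` so that emptying the list a second time is harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal list for illustration only. */
typedef struct node {
    struct node *next;
    void *value;
} node;

typedef struct list {
    node *head;
    unsigned long len;
    void (*free)(void *value);  /* optional per-value destructor */
} list;

/* Push a value at the front; returns 0 on success, -1 if malloc fails. */
int list_push(list *l, void *value) {
    assert(l != NULL);
    node *n = malloc(sizeof(*n));
    if (n == NULL) return -1;
    n->value = value;
    n->next = l->head;
    l->head = n;
    l->len++;
    return 0;
}

/* Free every node, then reset the struct so no dangling pointer remains. */
void list_empty(list *l) {
    assert(l != NULL);
    node *cur = l->head;
    while (cur != NULL) {       /* follow pointers, do not rely on len */
        node *next = cur->next;
        if (l->free) l->free(cur->value);
        free(cur);
        cur = next;
    }
    l->head = NULL;
    l->len = 0;
}
```

Note that this does nothing about the "same value inserted twice" question above: if one pointer is pushed twice and `free` is set, the value is still freed twice. That has to be ruled out by the caller or by reference counting.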
The entire implementation of sorted sets is really interesting, with a dual implementation of ziplists and skiplists being used depending on the number of elements in the set. I've been meaning to write a bit more about Redis internals lately; maybe I'll start on that in my commute hours.
I've got a couple of general articles on adding a command and adding a datatype to Redis at http://starkiller.net, but I don't get too into existing code. I'd be interested in writing a bit more about the other data structures as well as the multiple strategies used for EXPIRE (which recently changed I believe).
I'm certainly a novice in C, but as I was reading, I wondered about this:
{"get",getCommand,2,"r",0,NULL,1,1,1,0,0},
"The fourth field, set to "r", is specifying that the
command is read only and doesn’t modify any keys’ value
or state. There are a whole bunch of one letter flags
that you can specify in this string that are explained
in detail in the nearby block comment. The field
following this string should always be set to zero, and
will be computed later. It’s simply a bitmask
representation of the information implied by the string."
Why would you opt for this, when you could specify some constants and bitwise OR them together? Isn't that a more common thing to do than calculating bitwise flags at run time?
I'm sure there's a good reason, but this style seems strange to me.
Maybe Redis makes use of the string later? But I can't help but feel it should build the string based on the flags, rather than build the flags based on the string.
In defence of the technique, the command table is quite succinct and arguably more readable at a glance than if there were a bunch of constants |ed together. I have no idea whether this was the original motivation though.
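For anyone trying to picture the two styles, here is a small sketch of the string-to-bitmask approach. Only the `"r"` = read-only letter comes from the quoted article; the other letters, the constant names, and the `struct command` layout are made up for illustration and are not the Redis definitions. With constants you would write something like `CMD_WRITE | CMD_ADMIN` directly in the table; with strings you write `"wa"` and convert once at startup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; names and letters are illustrative only. */
enum {
    CMD_READONLY = 1 << 0,  /* 'r' */
    CMD_WRITE    = 1 << 1,  /* 'w' (assumed letter) */
    CMD_ADMIN    = 1 << 2   /* 'a' (assumed letter) */
};

struct command {
    const char *name;
    const char *sflags;  /* compact, human-readable form used in the table */
    int flags;           /* left as 0 in the table, computed at startup */
};

/* Convert a flag string to a bitmask; returns -1 on an unknown letter. */
int parse_flags(const char *s) {
    int flags = 0;
    for (; *s != '\0'; s++) {
        switch (*s) {
        case 'r': flags |= CMD_READONLY; break;
        case 'w': flags |= CMD_WRITE;    break;
        case 'a': flags |= CMD_ADMIN;    break;
        default:  return -1;
        }
    }
    return flags;
}

/* Fill in the bitmask for every table entry; returns 0, or -1 on a bad string. */
int populate_flags(struct command *table, size_t n) {
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int f = parse_flags(table[i].sflags);
        if (f < 0) return -1;
        table[i].flags = f;
    }
    return 0;
}
```

The cost is one pass over the table at startup; after that, checks like `cmd->flags & CMD_WRITE` are as cheap as with hand-written constants.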
ACTUALLY! It reminds me of a technique Bisqwit used when he made his emulator. He used strings to define the behavior of certain instructions; the strings were actually interpreted at compile time. Though I think this is a C++ specific trick.
Fun. This isn't really specific to Redis, but is a good introduction to the sort of thing that C programmers often get up to. You'll see this sort of function table in all sorts of C programs. Take a look at GNU stuff like make and you'll see the same format.
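For readers who haven't met the pattern, a function table in C usually looks something like the sketch below: an array of structs pairing a name with a function pointer and some metadata, plus a lookup and a dispatcher. All names here (`command_entry`, `cmd_ping`, `dispatch`) are hypothetical and the handlers are toys; the layout only imitates the general style, not any particular project's table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*command_proc)(int argc, const char **argv);

/* Toy handlers: ping returns 0, echo returns how many arguments follow it. */
static int cmd_ping(int argc, const char **argv) { (void)argc; (void)argv; return 0; }
static int cmd_echo(int argc, const char **argv) { (void)argv; return argc - 1; }

struct command_entry {
    const char *name;
    command_proc proc;
    int arity;  /* exact argument count, including the command name */
};

static const struct command_entry command_table[] = {
    {"ping", cmd_ping, 1},
    {"echo", cmd_echo, 2},
};

/* Linear scan over the table; returns NULL if the name is unknown. */
const struct command_entry *lookup_command(const char *name) {
    size_t n = sizeof(command_table) / sizeof(command_table[0]);
    for (size_t i = 0; i < n; i++)
        if (strcmp(command_table[i].name, name) == 0)
            return &command_table[i];
    return NULL;
}

/* Returns -1 for an unknown command, -2 for a wrong argument count,
 * otherwise whatever the handler returns. */
int dispatch(int argc, const char **argv) {
    assert(argv != NULL);
    if (argc < 1) return -1;
    const struct command_entry *c = lookup_command(argv[0]);
    if (c == NULL) return -1;
    if (c->arity != argc) return -2;
    return c->proc(argc, argv);
}
```

Adding a command is then a one-line change to the table plus the handler itself, which is a large part of why the format is so common.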
I find it fascinating to compare the H2 and Derby source code. The first was written by a single man, has more features, and is more compact and faster. The second was 'designed by committee' and evolved over a long period of time.
I would also post a link to my project, which is sort of 'Redis in Java', but it would probably be spam.
Nice post. I've been working quite a lot with the internals of Redis in the past few months: adding custom commands along with the usual skimming through the builtins. Perhaps I should give people some insight by writing a few blog posts as well. It's a really nice piece of software, written in clean, high-quality C. Not sure about the tests in Tcl though :).
To be fair, the tests in themselves are alright, but I'm not too familiar with Tcl and have had problems running them in a CI build, with a lot of redis-servers being left behind. As the tests are, as far as I've seen, basically integration tests, it would be quite nice to have them in something like Python to make them a bit easier to handle.
http://web.archive.org/web/20180303001631/http://www.heychin...