Many, many years ago, back at Uni, a friend of mine received 100% for his Uni project. I could code many years before I went to Uni, but I was shallow in all other ways - all the ways in which we actually produce quality, reliable, usable software. He showed me some of his project, and I saw his docs. What he'd done was absolutely fantastic - a page per API function, documenting the standard set of information that you need to use an API function. The approach you see in all the major projects today - the MS API docs, for example - and this from a twenty-year-old!
That experience permanently set my expectations for documentation, and it is reflected in the library docs. One page per function, per enum, per struct, etc, etc, etc. It's ALL documented. Painstakingly, exhaustively.
There's absolutely no point in writing an API which people cannot use, for whatever reasons.
When I come to make a release, and I'm done (or rather, I think I'm done) with development, I write the docs.
This always induces a large final round of development work, as I come to realise, from attempting to document it, that some of what I've done is simply too complex, too awkward, or too inherently confusing to use, and has to be rewritten.
I am of the view, from my experiences, that all developers should write their own documentation, because of the improvement in code quality induced by the code changes which come from keeping the docs comprehensible.
I may be wrong, but I think an algorithm could only be considered in terms of locking or lock-free if multiple threads were running through that algorithm, performing work on the same data at the same time (and so potentially running into each other, in ways which would break the work being done).
I've seen server code like that - all the time of course - but the only algorithm I can think of which has that property is a PRNG, and that can normally be implemented on a per-thread basis, each thread holding its own single-threaded PRNG.
This probably reflects my lack of work with algorithms. Could you give an example?
I think you mean Treiber, right? typo with the "n".
In 1986, Treiber, working on an IBM mainframe, published a white paper describing how to implement a lock-free stack (which is also a freelist, of course). It was the first lock-free data structure. I'm not quite sure why it's being called an algorithm!
The scalable version described in that paper is scalable by dint of adding an elimination layer in front of the stack.
I've added such a thing - although to my own design; I've not read their paper, so I don't know how they did theirs - to the freelist in 7.1.0, but it hasn't worked well. It roughly doubled performance on x64 when there are a large number of cores, but reduced performance for lower numbers of cores and on other platforms. When there are a large number of cores, performance is rock bottom, so doubling means going from, say, 1m ops/sec/core to 2m ops/sec/core, when if you run on just one core, you'd be doing say 30m ops/sec on that core.
I'd expect it in principle to permit real scalability - fairly linear performance as the number of cores rises - so I think I've missed something, or done something wrong.
Oh, mea culpa, yes, that's a typo. I was referring to R. K. Treiber and his stack implementation. I think the fact that he referred to it as an "algorithm" is the source of my confusion.
Is it safe to say, then, that distinctions such as "lock free" and "wait free" always refer to data structures and not algorithms?
I was curious - could you elaborate on why you think PRNGs satisfy the definition of a lock-free algorithm?
> Is it safe to say, then, that distinctions such as "lock free" and "wait free" always refer to data structures and not algorithms?
Well, for what my thoughts are worth, I would say no, not inherently and categorically, but in practice, it's hard to imagine algorithms which are multi-threaded in such a way that threads contend on shared data, such that the design of the algorithm could even be lock-free or wait-free.
Parallelized (i.e. multi-threaded) algorithms tend, I think, to have multiple threads all running the same code, but with each thread being given a separate part of the overall data to process - say an image processing algorithm, where each thread receives an equal portion of the image to process.
A PRNG, however, I think can (for one or two types of PRNG only, not in general) be implemented as a lock-free algorithm. A PRNG is not a data structure - rather, it maintains some internal state and on request uses (and updates) that state to generate a new random number. It is an algorithm.
The updating process must be atomic or it is compromised - you can't have one thread halfway through the state updates, and then another thread jumping in at that point and using the half-updated internal state.
Good PRNGs hold a lot of internal state - up to kilobytes - so it's not possible to directly use atomic operations to update that state. However, there is a class of PRNGs called "SplitMix" which have an 8 byte state, where on a request for a new random number, that state has a particular (quite large) constant added to it, and the result is then passed through a hashing function.
With this design, an atomic add can be used for the addition, and once that's done, the requesting thread then runs the hashing function locally (it's single-threaded, and merely takes an 8 byte number for input), and so the design is lock-free - no matter how many threads are using the PRNG, and whether or not any of them have been idled by the OS, all other threads can still get their work done.
This PRNG is a curiosity though - it has almost no genuine use cases. The performance is poor when compared to the single-threaded version of the same PRNG, and a PRNG being what it is, each thread can easily run its own, thread-local PRNG.
I don't actually know of others, but I know there are a range of other PRNGs which run on single-word states, so they might be viable.
Thank you! I see many flaws in the library though - there's so much that I can see to improve or to add. Still, with such things, there's a lot of value in making a release to get new functionality out so it can actually be used, even if there's much else to be improved, or which is not as it could or should be.
The lock-free binary tree on two cores is about 2x faster than the next best method. On sixteen cores, about 20x faster.
Why not use it, if it's packaged up in a library? It's only the difference between typing one API or another, unless there are functionality differences.
A lock-free self-balancing binary tree (e.g. red-black or AVL) would be awesome. But AFAIK that's not possible using commodity hardware. In general, there aren't many places where you can use a regular binary tree with externally-sourced data, because it opens you up to computational complexity attacks. You could use something like SipHash to securely randomize keys, but then it won't be nearly as fast. And outside of batch processing, who uses a binary tree without deletions?
The only lock-free data structure that is practical is a queue. And that's because a lock-free queue is really the only data structure that doesn't require you to re-architect your entire application stack according to the limitations of the lock-free data structure. And if you're going to re-architect your application, better to re-architect it around message passing, which mostly obviates the need for lock-free data structures, with queues, again, being the obvious exception.
But it's comparatively trivial to implement a lock-free queue using intrinsics, assuming it's a hot-spot in your application. Library dependencies are extremely costly in terms of the long-term maintenance of software projects. Supplying a lock-free queue in a single C file (or header) using common compiler intrinsics, OTOH, and which automagically can be dropped into most existing projects, might be much more useful.
Lock-free red-black trees now exist in the literature. They were invented recently - about two or three years ago now.
> And outside of batch processing, who uses a binary tree without deletions?
IME, it's surprising how often even an add-only data structure is absolutely fine. For example, the benchmark application uses one to store the processor/memory topology, and perform operations upon it, such as generating interesting combinations of cores to execute the benchmarks upon.
One other thing which helps a little is that the tree elements have a key and a value, and the value can change at any time. So there's a bit of flexibility there.
However, of course, the real solution is proper support for delete. The library must have this, and it's pressing - but there are just so many other things which also need to be done, and which are being done. For example, it's all very well having a tree, but without a benchmark, I have no idea if I've implemented it correctly.
When the benchmark was added in 7.1.0, I was able to see the implementation on ARM was utterly broken - running at about 10% of the speed of locking trees! I investigated, fixed it, and the tree returned to running at about 3x the speed of the next best solution - and that fix also gave about a 25% improvement on all other platforms.
> The only lock-free data structure that is practical is a queue.
There are quite a few lock-free data structures in the literature. Obviously the stack (it's ancient), then the queue (lots of variants thereof), deques, lists (singly-linked, doubly-linked, ordered, unordered, locality-aware), then hashes (of vast importance, as they scale magnificently) and now trees.
> And that's because a lock-free queue is really the only data structure that doesn't require you to re-architect your entire application stack according to the limitations of the lock-free data structure.
What limitations do you have in mind? As far as I can see, the API for the lock-free queue is identical to the API for the locking queue; you just get more performance.
> But it's comparatively trivial to implement a lock-free queue using intrinsics
Mmm... I'm not sure I agree. The standard lock-free queue is from Michael and Scott. It takes a while to understand it, and you have to know - really know - all about memory ordering behaviour. I've seen more broken implementations of this data structure than working (although they work fine under low load, of course). My original implementation made the same mistakes.
Moreover, today, although I may well be completely wrong, I think I know of two bugs in that queue, both of which are fixed in liblfds. Getting beyond even the white paper implementation to realising there could be issues in the queue is another leap.
> Supplying a lock-free queue in a single C file (or
> header) using common compiler intrinsics, OTOH, and which
> automagically can be dropped into most existing projects,
> might be much more useful.
To the extent this is true, it's true for all libraries, not just lock-free data structure libraries. It's a question of packaging. Do you have multiple C files and multiple header files, and compile them all up and have the user link, or do you bundle them all together into a single file (in this case so you have a single file per data structure, where the abstraction layers are duplicated between them) and ship that?
That single file is going to be some hundreds of kilobytes of code. Maintaining that arrangement would be quite painful (and error-prone) for the library developer. What you'd need in fact is to maintain the code in the form of multiple files, and then have a release mechanism bundle them up.
In both cases, the users take your source file (or files) and add them to their projects. In the former case, the file is added to their build files. In the latter case, building the library is added to their build files, and then linking also.
Sure, but I somehow never encounter a problem where I need a real lock-free data structure other than a ringbuffer or a variation thereof. Interlocked queues almost always give me enough.
No. Property rights are absolutely, profoundly, completely, totally, utterly central, vital, irreplaceable and essential. Licenses are an offshoot of property rights.
However, the arrangement of property rights as they are now being forced upon me - other entities telling me how I must go about managing my rights - this I do object to.
I know what I want to do with the property rights in this case, but I am often told there's all kinds of trouble in achieving this, because of these other entities enforcing their view of property rights upon me and the people who would form a contract with me (by choosing to use the software on the basis I am offering it).
You think that the copyright default of "all works are copyrighted essentially forever, and you cannot use, modify or share copyrighted works of any kind" is sane? Seriously?
If you want to talk about property rights, we can talk about those. But we're talking about copyright, which is a very different concept to property (despite attempts to create misleading terms like "intellectual property" to muddy the waters).
Copyright is supposed to be a method of encouraging creative works. It is not meant to be a method for software developers to maintain control over their users. I never said licenses were not important: YOU SAID THAT. Licenses are incredibly important -- free software couldn't exist without them because of draconian copyright laws funded by special interests that don't give a shit about creative works.
> I never said licenses were not important: YOU SAID THAT.
I thought you were saying I thought licenses were not important. I was trying to make it clear I think they are fundamentally important, by being an aspect of property rights.
Why is the title of the submission "license-less library"? If you agree with me that licenses are very important for free software, why even have that section on the front page?
Because until recently, the library was license-free.
Licenses are important as a concept. Licenses can exist, and when they do, they must be honoured. They are an aspect of the property rights of the owner of the software.
That doesn't mean you have to use one, and the library was not using one, so it was license-free.
There might be some confusion here, in that when I speak of licensing being important, I mean the concept - I do not particularly mean licensing as it exists today in practice.
As was pointed out in an early post in this discussion, the matter is now not so clear cut. I've added per-data-structure information on licensing, and it's not so straightforward a situation now as to be able to say, in a blanket fashion, "license-free". I'm thinking about what to do about that. The library is much closer to license-free than anything else, but it's hard to explain the situation in the one-line description while keeping it short and to the point.
Why don't you just license it under a single free software license and call it a day? Why say "you have an infinite set of dual licenses"? If you really, really, really don't care, just use CC0.
I don't understand your focus on the "concept of licensing". It's such an odd thing to focus on with such vigour. In my mind, a license is a tool that allows me to ensure that users of my software have freedoms that they deserve. Some people use it as a tool for oppressing users. But licenses are a tool, simple as that. They are a consequence of copyright law. Copyright licenses have literally nothing to do with property rights (you can't put a copyright license on a goat or a piece of land).
Encourage the adoption of a line-width that is longer than ancient terminals (at least it isn't 40 cols across).
Also I'm much less concerned with absolute line width in my own code than I am with indentation steps or number of arguments something takes without flowing on to multiple lines.
I think typedefs are almost always harmful. The explanation is longish and not directly related to the library, so I'll leave it at that.
What I will say though is that the struct already has a type, so as far as I can tell, the benefit consists of not having to type the struct keyword?
Also, the typedef in particular you've given would make it impossible to concurrently include and link multiple versions of the library (which is the purpose of having the version in the public entity names).
> Also, the typedef in particular you've given would make it impossible to concurrently include and link multiple versions of the library (which is the purpose of having the version in the public entity names).
Honest question: why do you think that's a useful use case? I've never thought of using two versions of the same library, mainly due to fear of nasal demons.
Also, you could do symbol versioning like glibc does and then just keep the old code in the library for backwards compatibility.
> Honest question: why do you think that's a useful usecase?
I worked for three years on software used by telcos. Carrier grade - they were rigorous about change control and validation, because bugs were totally unacceptable. For them, it would have been exactly what they needed.
Before that, I worked for a while on a huge, unmaintainable code base. Awful code - basically one huge function, with a million lines of code :-) With that code, you should not risk ANY change. For them, it would have been essential.
In general, across all software, I think quality is not high. Developers do not test enough, and are too casual about change. I am taking an approach which I think is necessary, and it certainly deviates from the usual practice.
> Also, you could do symbol versioning like glibc
I'm not sure I properly understand - I need to read up properly on what this offers. However, it's not clear to me how it changes anything from the C code POV - in the C code, the user still needs to use the old versions of the prototypes to call the correct code in the library, so what do they get from symbol versioning? Why not just link to the old binary? And it's important, because the old binary is completely unchanged, and it is only because of that, that revalidation does not have to be performed.
On a more practical level though, the library targets arbitrary compilers on bare metal platforms, and symbol versioning from what I read is rarely offered.
The libraries are not identical. I'm not particularly up to speed with CK - I'm not a user - but there's so much that can be done with any data structure library that there will be things liblfds has done which CK has not, and naturally vice versa, such that for some, one of the libraries will be a better (or possibly the only) choice.
What those things might be I cannot much tell you, where I'm not particularly au fait with CK.
Might want to get acquainted then... if you've got competition for your library and you expect users to pick you, you should know why you're better IMO: http://concurrencykit.org/man.html
It's free and open source but in some regards it's similar to product design/marketing. If you don't pitch it, some reviewer on the internet will make a stackoverflow post about it, and they won't know your strengths as well as you do.
I write the library for the pleasure of it. I'm aiming to create something I think beautiful - for better or for worse, from other people's point of view :-)
I'm not actually looking for it to be used as such. It may well be, if my idea of beautiful is any good, it will come to be used, but that's... it's like a side effect.
When you paint a picture, or sculpt a sculpture, or write a song, you're not competing with other artists to make what other people will judge to be the better picture, sculpture or song.
I've heard of Rust. People seem to like it. However, I am a C programmer and have always been a C programmer, and I write the library for the pleasure of it, so other languages don't really connect.