The problem with AVX2 accelerated code (and much of AVX) is that unless you have a lot of it to run, you end up with a substantial speed hit that comes from the cost of switching to a different power bin (which often takes 1 or 2 ms!) and then running at a lower clock speed.
This often still ends up being an improvement over scalar code (at the cost of higher power usage), but for occasional workloads that don't need to do multiple milliseconds of AVX instructions you tend to have better results from 4-wide vectors, which don't have this cost.
So in a sense, using wider vectors should be done at times when you’re certain to need heavy lifting, much like a GPU, but at a smaller timescale.
I should update my vectorization library so that it doesn’t simply use the widest possible.
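Something like this rough sketch is the shape I have in mind (the kernel names and the AVX2_MIN_ELEMS threshold are made up and would need benchmarking; compile with -mavx2):

    /* Toy kernel: add 1 to every element. The point is only the dispatch:
       stay on 128-bit vectors for small inputs and use 256-bit AVX2 only
       when there's enough work to amortize the frequency transition.
       The threshold is a placeholder. */
    #include <immintrin.h>
    #include <stddef.h>
    #include <stdint.h>

    #define AVX2_MIN_ELEMS 4096  /* tune by measurement */

    static void add1_128(int32_t *x, size_t n) {
        size_t i = 0;
        __m128i one = _mm_set1_epi32(1);
        for (; i + 4 <= n; i += 4) {
            __m128i v = _mm_loadu_si128((__m128i *)(x + i));
            _mm_storeu_si128((__m128i *)(x + i), _mm_add_epi32(v, one));
        }
        for (; i < n; i++) x[i] += 1;
    }

    static void add1_256(int32_t *x, size_t n) {
        size_t i = 0;
        __m256i one = _mm256_set1_epi32(1);
        for (; i + 8 <= n; i += 8) {
            __m256i v = _mm256_loadu_si256((__m256i *)(x + i));
            _mm256_storeu_si256((__m256i *)(x + i), _mm256_add_epi32(v, one));
        }
        for (; i < n; i++) x[i] += 1;
    }

    void add1(int32_t *x, size_t n) {
        if (n >= AVX2_MIN_ELEMS)
            add1_256(x, n);
        else
            add1_128(x, n);
    }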
FWIW Haswell - on which this was benchmarked - does not downclock on AVX2; it just boosts vcore and thermal throttles if the cooling solution can't keep the die below ~100C (which of course, the crappy TIM plus bundled desktop heatsink have essentially zero chance of doing with many extended workloads). To preclude thermal throttling, the auto-downclock-on-AVX2 was added later, I think for Skylake.
You also have penalties from context switches, although you can reduce their impact by performing them in a lazy fashion. AVX512 is even worse, of course.
The library doesn't have a license on the site or in the tarball. From djb's previous writing and software, he probably intends it to be license-free software, which is an uncommon situation worth investigating before use: https://en.wikipedia.org/wiki/License-free_software
(Trying to describe this neutrally because I've seen enough bickering about it over the last ~20 years and don't have strong feelings about it.)
I hear people raise this concern a lot, but I think it is without foundation. If something is in the public domain in the U.S. then anyone can use it for any purpose, including releasing it under whatever license they want. Of course, any constraints imposed by that license will be unenforceable since any user of the software can claim to be using it under the terms of some other license or, of course, as part of the public domain. But if you think you need it licensed, you can have it licensed.
Anybody under U.S. jurisdiction can follow U.S. law.
However, if I and my company and everything else are in Europe, and I use/redistribute the code in Europe, I must follow applicable European law. If that law doesn't accept that form of public domain, the author (rights owner) could sue me, and it would be up to the judge how sensible they are. (This is mostly theoretical - if the author decides to put it in the public domain per U.S. law, they most likely don't want to restrict that to the U.S.)
> If that law doesn't accept that form of public domain, the author (rights owner) could sue me, and it would be up to the judge how sensible they are. (This is mostly theoretical - if the author decides to put it in the public domain per U.S. law, they most likely don't want to restrict that to the U.S.)
Generally speaking, the risk is less the author themselves (though they could always do an about-face for whatever reason, it's hardly unprecedented[0]) and more their eventual heirs, who could always decide to cash in.
That was a completely different situation. In that case, the material was still under copyright in Germany. In this case, the material has been placed in the PD by the original author and so no one in the world can possibly have any legal claim on it.
But if this really concerns you, I would be happy to provide you -- or anyone else -- with a licensed copy of any of DJB's code for a modest processing fee.
> In that case, the material was still under copyright in Germany.
Which is exactly the case of djb's work here.
> In this case, the material has been placed in the PD by the original author and so no one in the world can possibly have any legal claim on it.
Wrong. djb and any possible heir of his does, because you can't place things in the public domain in mainland Europe.
> But if this really concerns you, I would be happy to provide you -- or anyone else -- with a licensed copy of any of DJB's code for a modest processing fee.
Unless djb specifically gave you license to do so, your "licensed copy" is worth exactly as much as the original public domain dedication is. As far as European law is concerned, you have no rights to the work, and thus certainly don't have the right to relicense it.
That is true, but there currently is no precedent for that. Until someone becomes the "sacrificial lamb" by 1. taking the risk and 2. being sued for it, anyone intending to build a business is understandably wary and unlikely to touch public-domain-dedicated assets.
> I hear people raise this concern a lot, but I think it is without foundation.
Well, I've come across companies that have policies against using unlicensed or public domain code.
This sadly makes whether it's a founded concern irrelevant, as the licensing choice suddenly prevents me from using the code in any shape or form if I work there.
> If something is in the public domain in the U.S. then anyone can use it for any purpose, including releasing it under whatever license they want
In the US. Public domain dedications have no legal standing in most of mainland Europe's IP regimes. A public domain dedication is the equivalent of no license at all.
And because a public domain dedication has no legal standing, a third party slapping a license on the code does not make it legally licensed as far as European IP courts are concerned; they don't have that right, and their "license" is worth as much as you deciding to license Windows under GPLv3.
> But if you think you need it licensed, you can have it licensed.
Unless djb offers to provide a fallback license (which incidentally is exactly what CC0 does and why it's legally sensible and valid[0]), then no, you can't "have it licensed".
I am no expert in German copyright law! But your own source says:
"It is worth noting that the result would stay the same if CC0 would not even contain such an explicit fallback rule. According to the prevailing opinion of the legal scholars, public domain licenses (which cannot be interpreted as a waiver of rights under the German copyright, see above) are reinterpreted as unconditional, unlimited, non-exclusive (i.e. public) licenses."
which appears to contradict your statement that public domain dedications are treated as no license at all.
You are over a decade behind the times and not fully aware of said previous writing. See what M. Bernstein said on 2007-11-11 at Sage Days 6. Hint: You should have read that Wikipedia article. (-:
Do you also sort in constant time at fixed array size, like djbsort claims it does?
For the 1024-element array, their cycle-count quartiles differ from the median at the 3-per-mille (0.3%) level, but I don't know if this counts as "constant" for timing-attack purposes.
Hmm, interesting question.
Radix sorts, by their nature, execute the same code regardless of the contents of their elements (I think; I haven't thought about this before).
However, you will see some variation in execution time based on varying memory access patterns.
No, it's not constant time. Depending on implementation details, you end up putting different elements in different buckets in different cache lines and that creates side channels.
Radix sort will perform differently for an input that is 1,1,1,... and 1,2,3,...,n.
I have not read up on how djbsort deals with this issue, but it's the problem it's trying to solve.
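For contrast, the usual building block for a constant-time sorter is a branchless compare-exchange applied in a fixed pattern over the array, so every input touches the same memory in the same order. A minimal sketch of the primitive (illustrative only, not djbsort's actual code):

    #include <stdint.h>

    /* Conditionally swap so that *a <= *b, using arithmetic instead of a
       branch. Assumes the usual arithmetic right shift of negative values. */
    static void ct_minmax(int32_t *a, int32_t *b) {
        int32_t x = *a, y = *b;
        int32_t diff = x ^ y;
        /* mask is all-ones iff x > y */
        int32_t mask = (int32_t)(((int64_t)y - (int64_t)x) >> 63);
        *a = x ^ (diff & mask);
        *b = y ^ (diff & mask);
    }

Run enough of those steps in a fixed, data-independent order (a sorting network) and the whole sort is oblivious to its input, which is exactly the property a data-dependent radix/bucket pass lacks.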
Actually, there is only an advantage to radix after 100,000 elements - djb is much faster for small arrays. Also, while radix is constant-time in CPU terms, you'd need to do some kind of secure shuffle in advance to make it truly constant-time (considering memory-locality timing).
Why might the installation instructions require the creation of a new user specific to the sorting program? Purely for security of the normal user given that the installation is using a wget / shell script process? https://sorting.cr.yp.to/install.html
> djb presents djbsort, a constant-time djbsolution to a djbproblem you probably don't have. To install, just unpack djbsort.djb and run the ./djb script. Make sure you have djbtools installed first
"The page teaches users to use su to lower privileges ..."
In the example, he could have used his own utilities for dropping privileges (setuidgid, envuidgid from daemontools).
If I am not mistaken, busybox includes their own copies of setuidgid and envuidgid, meaning it is found in myriad Linux distributions. I believe OpenBSD has their own program for dropping privileges. Maybe there are others on other OS.
Instead he picked a ubiquitous choice for the example, su.
It is interesting to see someone express disdain for the version.txt idea. I had the opposite reaction. To me, it is beautiful in its simplicity.
As a user I like the idea of accessing a tiny text file, version.txt, similar to robots.txt, etc., that contains only a version number and letting the user insert the number into an otherwise stable URL.
I would actually be pleased to see this become a "standard" way of keeping audiences up to date on what software versions exist.
By simplifying "updates" in this way, any user can visit the version.txt page or write scripts that retrieve version.txt to check for updates, in the same way any user can visit/retrieve robots.txt to check for crawl delay times, etc.
It is not necessary to "copy and paste" from web pages. Save the "installation" page containing the stable URL as text, open it in an editor, insert the desired version number into the stable URL.
Save the file. Repeat when version number changes, appending to the file.
I like to keep a small text file containing URLs to all versions so I can easily retrieve them again at any time.
I wonder if it is an instinctive reaction to the common complaints about djb software (that there are no, or at least no easy, packaging or installation routes). It was previously seen in TweetNaCl [1], which is arguably an irritated response to libsodium [2], which in turn wraps djb's NaCl library [3] in a way djb considers suboptimal [4]. djb is known to be not very interested in modern packaging systems or practices, so it might well be his way of protesting them.
Yes, presumably to run the build as a user which is as unprivileged as possible. Which is a reasonable idea, though it might seem paranoid in today's `curl | sudo sh` world.
I actually really like the authenticity and humility of DJB including that in the instructions. I think it's likely many people trust his code (and he's certainly written a lot of extremely security sensitive stuff), but of course it's a much better practice to not trust him quite so much.
Really? This is software where the author named it after himself, claims that it holds a new speed record with no comparison benchmarks (just a reference to a single number from a paper in 2015), uses the word "easily" FOUR times in the limitations section without any links or explanations, and doesn't reference any other libraries/resources/software/solutions.
Authenticity, sure. Humility - not after his approach to the students issue in recent years, where he was more interested in being correct than in helping people :-(
I was talking to another person in the community, now this was well over a decade ago, maybe two, with the initials "DJB". He said: "I went onto IRC once. I was mistaken for Daniel Bernstein. It was the most awful 15 minutes of my life."
I spoke to another fairly famous person a few years later, let's say author of the authoritative book on one of the alternatives for one of djb's software packages. He said something along the lines of: It's a shame djb gets along so poorly with other people, because he has a lot of good ideas.
Abstract: "We introduce the notion of discrimination as a generalization of both sorting and partitioning, and show that discriminators (discrimination functions) can be defined generically, by structural recursion on representations of ordering and equivalence relations. Discriminators improve the asymptotic performance of generic comparison-based sorting and partitioning, and can be implemented not to expose more information than the underlying ordering, respectively equivalence relation. For a large class of order and equivalence representations, including all standard orders for regular recursive first-order types, the discriminators execute in the worst-case linear time. The generic discriminators can be coded compactly using list comprehensions, with order and equivalence representations specified using Generalized Algebraic Data Types. We give some examples of the uses of discriminators, including the most-significant digit lexicographic sorting, type isomorphism with an associative-commutative operator, and database joins. Source code of discriminators and their applications in Haskell is included. We argue that built-in primitive types, notably pointers (references), should come with efficient discriminators, not just equality tests, since they facilitate the construction of discriminators for abstract types that are both highly efficient and representation-independent."
I've tried so many times to frontpage that, but I've failed. Maybe we should have another go?
Nothing in djbsort's approach is inapplicable to another sorting algorithm, so maybe we can hope for better primitive support for discrimination sort implementations (or at least american flag sort implementations). I seem to recall reading that discrimination sorts are inherently content-independent.
> Other modern Linux/BSD/UNIX systems should work with minor adjustments to the instructions.
I can report that I got it to build and run on a slightly out-of-date FreeBSD by deleting all of the -m32 variants and all of the -march=haswell variants. I haven't looked into whether this is down to the version of GCC that comes in ports versus the version of Clang that comes in base, or something else. No other changes were needed to the build process, though.
Create a user and env to run a one-off build + application. DJB cracks me up. He may have the right thing in mind but this type of prophylactic approach is no longer proof against anything.
The 2.5 cycles/byte compared to the 32 cycles/byte that Intel managed to pull off seems like an improbably large improvement over the current state of the art? Is this real?
I liked this bit, using the fastest compiler for each primitive:
> ./do tries a list of compilers in compilers/c, keeping the fastest working implementation of each primitive. Before running ./do you can edit compilers/c to adjust compiler options or to try additional compilers.
My job involves a lot of packaging/cross-compilation, and djb's libraries always seem consistently hostile to the lowly packaging engineer.
Would it really be all that much work to package in autotools or CMake? Why do I need his special-snowflake build system with its hard-coded assumptions about system paths?
I know that the cult of djb will downvote this into oblivion, but seriously, what is the rationale for a build flow that involves:
1. Downloading a text file.
2. Parsing it to get a URL.
3. Making a new user.
4. Symlinking the user's HOME directory into the build tree.
5. Running an extremely non-standard build system.
6. Hoping you're not trying to cross-compile, because good luck with that.
7. Guessing at where the files came out (hint: it probably won't be in FHS locations).
8. Copying the output yourself once you find it.
Would it really be that much harder to give us a git repo and a ./configure or a CMakeLists.txt?
I've been using DJB's stuff for ~20 years and I don't like his recent build systems either. Without defending it, I'd just like to offer a theory on why he packages things the way he does.
Back in the day, his build processes were atypical but still much more "normal" than now. He also released his software without licenses. During this time of heavy software development, DJB was concerned about people screwing around with the internals of his software, hurting security or reliability, and then blaming the software rather than the modifications. So his build systems were, I think, designed to lead to the result he wanted, where software behaved and was administered in the same way on various platforms.
In the mid-2000s he re-licensed existing software as public domain and began publishing all new code as public domain as well. Around this time, build systems began to get more wonky. Also, his public work that garnered the most attention shifted away from software toward cryptography. He did some attacks on existing crypto and authored Curve25519, Salsa20, etc.
He's also been putting out a tremendous volume of work in multiple categories. I bet he'd rather work on this stuff than on user-friendly build systems.
So given these points, I think the explanation for his unfriendly build systems is
A) a very strong aversion to people modifying his stuff where he gets blamed if modifications do harm;
B) a shift away from software development, where people generally care more about build systems anyway;
C) a huge level of productivity which results in very atypical Pareto principle choices/tradeoffs;
D) his public-domain licensing.
Given these 4 points, I think DJB is unwilling to take time away from crypto and other work and put it into build systems he doesn't enjoy that will take more time upfront and more babysitting down the road. Fewer people will package it, but the software is public domain and competent people can just add their own build system. This squares with his available time and interests.
So, I don't like his build systems either, but I think I understand where they come from.
The Libsodium guys wound up doing exactly what you're suggesting, because of the impossibility of trying to package NaCl as-is.
So they essentially had to re-do/duplicate all of his build work just to make it packageable. And now there are two competing implementations (three if you count tweetnacl). And a bit of a confusing mess in the documentation department.
It seems a little selfish for djb to take the "works on my machine" attitude, because it means that a bunch of other people have to reverse-engineer all that stuff just to make it portable.
But I guess OTOH it's his software, so it's his choice. And maybe he doesn't care whether people choose to use his stuff or not, as long as he's publishing.
Conveniently, almost nobody uses NaCl as-is, due to its more or less never having been patched, and nobody uses TweetNaCl, so there is de facto one implementation.
The underpinnings of that argument are flawed. You're accepting the premise that there were relatively "normal" build processes "back in the day" and "abnormal" ones now. This is not in fact the truth of the matter at all.
M. Bernstein's own build system was redo. But he never actually published a redo tool. Other people did that. There are traces of a precursor to redo in one of his packages. But even they were not the actual build system for it as released to the public. Again, it was other people who took them and fleshed them out into a working build system. The slashpackage system, similarly, only existed in (another) one of his packages. And again, it was other people who extended it to other packages.
The reality is that the build system evident in djbsort is not a sudden inconsistency. The various packages over the years are all inconsistent. One can in fact derive from them a timeline both of development of ways of building packages and evolution of the various "DJB libraries". But they are all snapshots of these processes at different points. They aren't a coherent single system across multiple packages. I, Paul Jarc, and others had to make that part. (-:
So do not deduce that there's been a change. Deduce, rather, that build systems have always been a lower priority and an unfinished loose end. As such, this is nothing to do with the announcement at Sage Days 6 and the like, nor to do with people "screwing around" as you put it. Indeed, the clear motivations of redo expressed in M. Bernstein's own writings on the subject have nothing to do with either copyright or preventing modifications to packages, and everything to do with problems of make and indeed autotools. Observe the motivations of "honest" dependencies from the build instruction files themselves, from the command-line tools and their options, and from things like the absence of header files.
"He's also been putting out a tremendous volume of work in multiple categories. I bet he'd rather work on this stuff than on user-friendly build systems."
I don't know about that part...a halfway normal build process would be considerably less work to set up than this monstrosity.
There's also a world beyond Unix. Prof. Bernstein may not be interested in targeting Windows. But for library developers in general, I'd recommend CMake. Microsoft has adopted CMake as its standard cross-platform C/C++ build system, and has built a cross-platform library package manager [1] on top of it.
As you can see in the code, this is Linux only:
CLOCK_MONOTONIC, HW_CPUSPEED, -soname, -rpath, linux/perf_events.h.
Even on Linux it would fail:
sort.c:386:21: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'djbsort_int32' that is compiled without support for 'sse4.2'
I really disagree with you on autotools; that has been a big problem for me when cross-compiling. Configure often detects features by compiling a binary and executing it, which can be a problem when you're cross-compiling, since the generated target can't be executed on your native cpu. Most software build systems don't even take cross-building into account at all, since they just copied someone else's broken autoconf and automake source that only works for native builds, so you have to work around a lot of issues.
Even "good" software packages written by people who understand autotools like glib have issues. Just look at the official glib cross compiling instructions: https://developer.gnome.org/glib/stable/glib-cross-compiling...
The "correct" way to cross compile it is to figure out all the autoconfigured values yourself, and write them into the configure cache.
Personally, I have had the least trouble with plain Makefile based packages. Yes you often have to reach into the Makefile's guts and modify it to make a working cross build, and you'll have to set the rpath manually, and install the files yourself, but at least it's easier than having to modify an autotools based build.
Autotools "just works", and when it doesn't it's very hard to fix. Makefiles don't "just work", but it's much easier to fix them yourself.
"Configure often detects features by compiling a binary and executing it" — I am not sure, where this impression comes from. Autotools as whole have excellent cross-compilation support. Autoconf has feature-detection routines, checking for presence of headers, exported symbols and pkg-config files. None of those trigger "execute the binary" part when cross-compiling. In addition, custom-written host-side checks can easily be skipped when cross-compiling (either by code of check itself or by user via environent-variable overrides). Do you you know of any build system, that handles this better?
CMake and several other build systems either don't support cross-compilation at all (because their primary audience is Windows) or use pkg-config only. A few others are nightmarish parodies of autotools with much worse support. Most don't have an ounce of autotools' features.
I know many projects that offer horrible autotools "support": for example, glib2's autotools scripts can't be cross-compiled to Android without ample application of hacks. But those issues are caused by incompetence and lack of testing, not some innate fault of Autotools. When such projects migrate to something else, their cross-compilation process becomes WORSE.
The impression comes from having to deal with multiple configure scripts that do that very thing.
The capabilities may be there, but in my experience many if not most projects do not use those capabilities. Most autotools based projects I've had to cross build are made by people without a great understanding of autotools internals and capabilities, so they often end up copy pasting things blindly from other projects until they get their native build working. Advanced build system stuff like cross building is not even considered.
These builds end up trying to execute a binary to do feature tests, or compile a program and use that to generate code, or do any number of things that break when cross building.
Would things be worse without autotools? Maybe so. In my personal experience projects using a simple Makefile were much easier to cross build and package than ones using autotools, but that's probably confounded by projects using a Makefile being simpler overall.
To clarify for anyone who doubts autotools has problems - it is true that the documentation mentions cross compiling options, but it appears that the vast majority of tests in use today do not actually use these values (host/target arch, etc).
Some users recommend compiling with qemu-user to avoid fixing the tests, but besides being very slow, that methodology will still generate garbage. What is the point of running feature detection code on a desktop that will be used to configure code compiled to run on e.g. a small ARM system?
The fix is to explicitly set configuration parameters or, even better, to detect features at runtime. However if you try to contact any given project using autotools to help solve these configuration problems you will likely receive "that's not our problem" and no help from the maintainers.
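As a sketch of what "detect at runtime" can look like for CPU features, using the GCC/Clang x86 builtins (the kernel_* functions are placeholders):

    #include <stdio.h>

    /* Configure-time probes can't know the CPU the binary will actually
       run on (especially when cross-compiling); asking at runtime can. */
    static void kernel_avx2(void)   { puts("using the AVX2 path"); }
    static void kernel_scalar(void) { puts("using the scalar path"); }

    int main(void) {
        __builtin_cpu_init();
        if (__builtin_cpu_supports("avx2"))
            kernel_avx2();
        else
            kernel_scalar();
        return 0;
    }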
Autotools is in practice less work for me than alternatives, at least while using a FOSS operating system. On Windows CMake is very well developed but handling dependencies is always a chore.
1. You're assuming the software is intended to be deployed
2. Writing anything sane with autotools or CMakes is an exercise in frustration that borders on torture. Not to mention how either of these systems chafes on one's sense of aesthetics.
3. I've installed a few of djb's packages over the years. While indeed non-standard, unlike either of the solutions you've mentioned, djb's stuff: a) works, b) is usually very simple to understand and change when needed (unlike the autotools steaming monstrosity, which in 20 years of using open-source software I've never dared to touch).
Why would somebody release software that they didn't want to see used?
To your other point, Autotools is a little crusty for sure. It's Yet Another Language to learn (and so is CMake), but it's really not as bad as people make it out to be. There are many thousands of examples in the wild, good and bad.
Yes, Autoconf tests for a bunch of stupid things that don't matter. But so what? You can add the tests you _do_ care about, and take advantage of the directory control and compiler control aspects.
If you've never actually worked with Autotools (which you just said you haven't), I'd encourage you to actually try it out. It's not as bad as you think.
Not everyone wants to see their software used. Some want to see the ideas in the code used elsewhere. In that case the code is meant for others to study and understand.
You can see this in software papers where the algorithm is described by pseudocode, or only the core part of the algorithm is shown. Presumably the rest of the code is obvious to others in the field.
Releasing non-portable source is a step above that common practice.
I've used Autotools. I've contributed to Autotools. I agree that it's a steaming monstrosity. I never want to look at an m4 macro again.
I have tried working with autotools. Many times. I guess it's down to mindset: some people don't mind working on top of a pile of manure and don't really see what the big deal is.
I guess "touched" is the wrong word. I did enough "touching" of it unfortunately: using it to compile packages, trying to fix it, wasting precious time wading through pages of horrible shell code to understand why it fails when it does.
I even tried a couple of times to use it in my own projects, naively thinking that since it's so widespread there's got to be something to it I'm not seeing.
I gave up on it in disgust every time. The list of reasons is long, but I guess if I have to pick the one thing that really drives me nuts: most packages out there compile way faster than it takes to configure them.
If that specific fact doesn't make you walk away from it immediately, our brains just aren't wired the same.
FWIW, I read through each of your links. They don't address cross-compilation at all, or the needs of software packagers.
I can 100% promise you that somebody packaging this library for any Linux distro (or for a Yocto/Buildroot system) would grind their teeth in frustration at everything in those links.
The solution to having inconsistent packaging paths isn't to introduce _yet another_ packaging path system, but this one specific to djb stuff. It's to use a standard build system with overrideable paths, and not to assume the author knows better than the packager.
I don't care about software packagers at all, except insofar as they can address the needs of users.
> The solution to having inconsistent packaging paths isn't to introduce _yet another_ packaging path system, but this one specific to djb stuff. It's to use a standard build system with overrideable paths, and not to assume the author knows better than the packager.
Having "inconsistent packaging paths" isn't the problem. The problem is I have a program called "foo" and I don't know if it should begin with:
#!/usr/local/bin/python
or:
#!/usr/bin/python
Somebody who doesn't have a real job made the joke that it should "obviously" use "overridable paths" like this:
#!/usr/bin/env python
but that person hasn't used very many unix systems so they're unaware that this doesn't work on some systems. Eventually the "package manager" compromised with:
#!/bin/sh
''':'
if type python2 >/dev/null 2>/dev/null; then
exec python2 "$0" "$@"
else
exec python "$0" "$@"
fi
'''
And users are like "what the fuck!?" All we needed to do was agree that python "lives" in /usr/local/bin/python and python3 (an incompatible alternative to python) lives someplace else, and we would have been fine.
And for what? Why did we bother with this? What did we actually gain for all this extra shit? Some package manager felt like he was doing the job of a tarball?
> They don't address cross-compilation at all, or the needs of software packagers.
Cross-compilation is tricky for programs that need to run programs to figure out how they need to be built; neither cmake nor GNU autoconf does anything to address it.
The best solution package managers seem to suggest is to "not do that", but those programs are slow.
Your example script with python2 is not about pathing, but about a separate problem of multiple binaries with the same name, and it is exactly why you want to let package managers deal with this.
As a developer, you can't know all the different setups your users will have, or for example what the default python will be. As a package manager, you know exactly this for the systems you are packaging for. If needed, they will patch your library to fix the python binary name, or supply the correct build arg if you support it. It's their job to make sure all the users in their little slice of the Linux ecosystem can install and use your software reliably and easily, so working against them is user hostile, and using an example of exactly the problems they solve as evidence for why we shouldn't make their lives easier makes no sense.
> Your example script with python2 is not about pathing, but about a separate problem of multiple binaries with the same name, and it is exactly why you want to let package managers deal with this.
> for example what the default python will be
Except that package maintainers created this problem. It's not a real problem!
Some package managers decided to call it "python" creating the incompatibility, and thus creating the problem for everyone who writes python programs for now until all those systems go away.
As a result, everyone who wants to write python programs has to deal with the fact that python is sometimes called python and sometimes called python2 -- depending on what the package manager did.
> As a developer, you can't know all the different setups your users will have,
> Except that package maintainers created this problem. It's not a real problem!
It is a real problem. People wanted both python interpreters installed at the same time, and a way for software written for each of them to functionally coexist on a system. Package maintainers provided a solution.
> Some package managers decided to call it "python" creating the incompatibility, and thus creating the problem for everyone who writes python programs for now until all those systems go away.
Did they create the problem, or did they mirror the reality they saw, where people that installed python3 symlinked python to python3?
Here's a little tidbit from the last few lines of output of "make install" from Python 3.0:
* Note: not installed as 'python'.
* Use 'make fullinstall' to install as 'python'.
* However, 'make fullinstall' is discouraged,
* as it will clobber your Python 2.x installation.
To me, the implication is that python hasn't been fully installed because it was worried about python 2 and didn't want to inconvenience you, but hey, if you don't have python 2.x to worry about, or have dealt with the problem otherwise (which is something package managers would attempt to do), then you can do a fullinstall.
I think it's pretty obvious from this that the Python developers intended to completely replace python 2.x, and take over the "python" binary namespace as well.
But sure, you can go ahead and blame this on package managers. Why let a little thing like trivially discoverable information that casts doubt on your argument get in the way of a good rant?
> or did they mirror the reality they saw, where people that installed python3 symlinked python to python3?
Evidence please.
I can’t believe any user would do this because it instantly breaks all their scripts.
Every python 3 program I have thinks python 3 is #!/usr/bin/python3
> I think it's pretty obvious from this that the Python developers intended to completely replace python 2.x
Of course they did, but there’s a good reason the directions for installing Python on the python website don’t tell users to do this: As naive and full of hope as the python developers are, they’ve got nothing on the sheer hubris of Linux python packagers who think they’re doing gods work by commenting out random seeds in OpenSSL.
> I can’t believe any user would do this because it instantly breaks all their scripts.
You mean those scripts which they are expected to upgrade to Python 3 using the 2to3 program, which installs with python? The same scripts that Python 3 advocates claim can be fairly easily converted?
> Evidence please
The fact that Python 3 has a documented option to install as /usr/bin/python and it mentions it on every regular install is evidence.
> Every python 3 program I have thinks python 3 is #!/usr/bin/python3
But did they initially? That's the question. We're talking about decisions package managers made years ago, so the status of Python 2 and Python 3 at at that time is what we need to look at.
Also, it's important to note, this isn't the first time this has happened. I remember having lots of problems trying to get Python 2 installed on systems that shipped with Python 1. It's entirely possible that the solution to this problem is from when that happened: rather than come up with a different Python2 -> Python3 solution, they used what was decided back when they had to support both Python1 and Python2, so the solution would be familiar. That's got a fair chance of being likely, since package managers are working on systems on timeframes much longer than the vast majority of system administrators, but still need to support those admins that are managing systems a decade after install.[1]
> As naive and full of hope as the python developers are, they’ve got nothing on the sheer hubris of Linux python packagers who think they’re doing gods work by commenting out random seeds in OpenSSL.
I'm not really interested in enumerating all the logical fallacies you're falling back on here. That, combined with your denigrating characterization of entire groups of people doesn't really lend itself towards my idea of a useful or constructive conversation, so I think I'm done. Feel free to reply, I'll read it, but I won't be continuing this discussion.
> The fact that Python 3 has a documented option to install as /usr/bin/python and it mentions it on every regular install is evidence.
No it's not.
Finding a debian mailing list where someone is complaining about incorporating a python3 script "package" that assumes python is /usr/bin/python would be a start. Finding many people complaining would be evidence.
> I remember having lots of problems trying to get Python 2 installed on systems that shipped with Python 1.
Problems created by package maintainers "shipping" python in the first place.
Let's say I'm writing firmware for a TV. Why wouldn't I want to use the chip's CPU to the fullest extent at runtime? That doesn't mean I want to compile my TV firmware _on_ the TV.
> Why wouldn't I want to use the chip's CPU to the fullest extent at runtime?
Because you'd rather use GNU autoconf?
If you wanted the fastest possible performance, you'd try each algorithm, profile them, then select the one that works best. This build process can do that automatically. GNU autoconf cannot.
You can tell autoconf which to use, but the package maintainer can't be trusted to do this.
Then the user will just complain that it's not that much faster than qsort.
The package maintainer would be to blame for users thinking this was slow software simply because they chose poor defaults, and nobody would ever know...
> If you wanted the fastest possible performance, you'd try each algorithm, profile them, then select the one that works best. This build process can do that automatically. GNU autoconf cannot.
Except doing it at build time is a terrible idea anyway. That is because the set of CPUs it will be used on is actually unknown, unless it's literally not meant for anyone else to use that compiled object. But that's not how people actually develop at all. They distribute the software objects and users link against it on their CPUs, which the original build system cannot possibly have knowledge of.
Requiring AVX2 or whatever by default really has nothing to do with this. High speed software that's actually usable for developers and users selects appropriate algorithms at runtime based on the characteristics of the actual machine they run on (for example, Ryzen vs Skylake, which have different throughput and cycle characteristics.) This is the only meaningful way to do it unless you literally only care about ever deploying to one machine, or you just don't give a shit about usability.
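Concretely, that usually means resolving an implementation once into a function pointer (glibc's ifunc mechanism does roughly the same thing at load time). Rough sketch with placeholder kernels:

    #include <stddef.h>
    #include <stdint.h>

    /* Both kernels are plain insertion sorts here; a real library would
       ship tuned variants per microarchitecture. */
    static void sort_generic(int32_t *x, size_t n) {
        for (size_t i = 1; i < n; i++) {
            int32_t v = x[i];
            size_t j = i;
            while (j > 0 && x[j - 1] > v) { x[j] = x[j - 1]; j--; }
            x[j] = v;
        }
    }
    static void sort_avx2(int32_t *x, size_t n) { sort_generic(x, n); }

    static void (*sort_impl)(int32_t *, size_t);

    void my_sort(int32_t *x, size_t n) {
        if (!sort_impl) {
            /* decide based on the CPU we're actually running on
               (GCC/Clang x86 builtins); lazy and not thread-safe as written */
            __builtin_cpu_init();
            sort_impl = __builtin_cpu_supports("avx2") ? sort_avx2 : sort_generic;
        }
        sort_impl(x, n);
    }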
You could absolutely write an Autoconf script that runs those tests by default, but has overrideable behavior. You can make custom Autoconf macros to detect whatever arbitrary thing you want. At the core, an Autoconf macro is just a chunk of shell script that sets some environment variables with the result.
Reading those complaints cold, I would suspect DJB to be between 16 and 22 years old; old enough to have seen more than his own system, but young enough to be absolutely certain his off-the-cuff solution is better than any of those other idiots can come up with. It helps to be the smartest person in any room, I guess.
"When oaf was integrated into ``the system,'' it was moved to /usr/bin/oaf, and its files moved to /usr/share/oaf, because FHS doesn't let ``the system'' touch /usr/local. What happened in this case is that oaf didn't find files that were put into /usr/local/share/oaf by another package."
Once upon a time, I worked as a sysadmin for a major university computer science department. We supported a half-dozen or so different architectures. We built a long list of software, including all of the X and Gnu programs, so that our users had the same environment essentially everywhere. Oh, and we all built it to live under /lusr, because our convention started before "/usr/local" was a thing. (The weird looks when you started babbling about "slash-loser" were just a fringe benefit.)
imake was a giant pain in the rear. Autotools were also a giant pain in the rear. But everything else, except in very, very simple cases, was much worse. DJB's software would come under the header of "sorry, not supportable without unreasonable effort."
Bear in mind that steps 1, 3, 4, 6, and 7 are not related to slashpackage. nrclark has got step #2 wrong, as all that it yields is the current source package version number, not a URL, and there's no real parsing going on (except for the shell stripping the trailing newline); and xyr step #8 is xyr own invention and not actually in the build instructions as given.
That leaves step #5. What that does is both fairly obvious and documented in the instructions. It tries out a whole bunch of compilers, compiler options, and implementations in turn to see which results in the best code. Ironically, it is probably the sort of thing that autotools might be persuaded to do. But in practice no-one would, or it would be buried under reams of copied-and-pasted cargo cult tests for things that the actual source at hand cannot in fact do without in the first place.
This is of course a long-recognized fly in any let's-just-use-autotools ointment, discussed over many years. (-:
As a user, I'm super happy with `./configure`, that works the same way across all applications using it, and has been for the past 20 years.
You don't want to look at the actual script, but as a user, it is very convenient. I don't have to read installation guides to know how to change compilation options, compilers, cross-compile, installation paths, run the test suite, etc.
(i guess i should say i'm a dev as well as a user. but my exposure to configure is as a user.)
so here's why i'm not a fan of configure: (1) it's slow (2) if something goes wrong, it's a fucking nightmare to try to diagnose -- it's just an endless barrage of crap that doesn't actually have anything to do with the app at hand, and the actual build process is like 2 or 3 layers away from running the configure script. (or for extra fun, maybe you didn't have a configure script, you had the other script that generates the configure script.)
i'd much rather see just a simple makefile with well documented dependencies, e.g., like redis does it.
(slowly this problem is going away as building becomes an integrated part of the language -- e.g., rust, go, ...)
(and to be fair, a large part of it is just an aesthetic objection -- the sheer mass of unnecessary code and checks that configure does & generates is really ugly to me. i want beautiful software!)
Simple Makefiles almost never work for portable software that needs to interface with a system. There is a reason GNU autoconf was created -- it solves a real problem. Different kinds of systems are different in a LOT of ways. Almost all of these ways need to be determined at compile-time, and without testing for those ways there is no good way to do this. The older way (before GNU autoconf) that tools like "xmkmf" used was to have a massive database of features for each system; you could then query that database for the feature/platform combination you were compiling for to get the information -- this did not scale at all, platforms changed, features were added faster than the database could be distributed, and generally it was terrible.
Things not covered by a simple Makefile include almost all tasks related to building software:
1. How do you build a shared object (.so, .sl, .dylib, .dll, .a on AIX) ? Can you even do that (MINIX) ?
2. For the platform being compiled for (i.e., nothing to do with the platform you are compiling on), how large is an "int"? Is there a type that exists for uint_least32_t? If not (because the compiler doesn't support stdint.h), is there an unsigned integer type that is at least 32 bits wide?
3. How do you link against an entire archive, not just the symbols you currently reference ? Is it -Wl,--whole-archive or -Wl,-z,allextract ?
4. For symbol versioning for your library, how do you set the soname ? How do you specify versions of functions ? Can you ? How do you filter out symbols that are not part of your ABI ?
You could start over and mandate a consistent toolset and exclude the majority of platforms (Rust, Go, etc)... or... you could write your own, inferior, version of GNU autoconf, or you could use GNU autoconf...
I don't think most software needs to care about these weird exotic systems anymore (MINIX? seriously?). Maybe if that's really a goal of your software, it's reasonable to use autotools.
For what it's worth, I agree that Autoconf tests for a lot of things that aren't necessarily relevant today. I don't think I need to worry about the existence of <stdint.h> on just about any platform.
But there ARE a lot of very relevant tests mixed in there too. These include things like:
- Width of an int
- Endianness
- Availability of arbitrary functions for linking
- Search through a list of libraries until one is found that provides a requested function, then add that library to LDFLAGS
- Does the C compiler work?
- Do compiled binaries run on the build machine?
And it gives you control over things like:
- Mixing custom CFLAGS with conditional CFLAGS and package-specific CFLAGS.
- Enable/disable specific features at configure time.
- Add/change directories to search for existing headers/libraries needed for compilation
- Add/change directories for installation
Automake gives you:
- Automatic handling of platform-specific library creation differences.
Dynamic libraries in particular can have very different semantics across platforms, even living platforms today.
- Automatic handling of making parallel-safe Makefiles
- Standardized clean/test/install/dist/distcheck targets
- A reasonable unit-test system that's well integrated with the rest of Automake
If your software depends on this, you're doing it wrong.
As you said, depending on the existence of <stdint.h> is just fine, and you can then specify what you need. Even in the incredibly rare case of needing the width of int (serializing inputs and outputs of existing libraries interoperably on multiple machine ABIs) <limits.h> has you covered, albeit in an awkward way, unless you can assume POSIX and thus have WORD_BIT.
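The awkward-but-portable version of that, as a sketch (INT_WIDTH_GUESS is my own name for it):

    #include <limits.h>
    #include <stdio.h>

    /* Derive the width of int from <limits.h>, using the POSIX/XSI
       WORD_BIT shortcut when the headers expose it. */
    #if defined(WORD_BIT)
    #define INT_WIDTH_GUESS WORD_BIT
    #elif UINT_MAX == 0xffffu
    #define INT_WIDTH_GUESS 16
    #elif UINT_MAX == 0xffffffffu
    #define INT_WIDTH_GUESS 32
    #else
    #define INT_WIDTH_GUESS 64   /* assumption for anything wider */
    #endif

    int main(void) {
        printf("int appears to be %d bits wide\n", INT_WIDTH_GUESS);
        return 0;
    }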
> - Endianness
If your software depends on this, you're doing it wrong.
If the wire/file format is little endian, input[0] | input[1] << 8, if stupid-endian, input[0] << 8 | input[1].
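Spelled out as functions (my own sketch), the host's byte order never enters into it, so there is nothing for configure to probe:

    #include <stdint.h>

    /* Decode a 16-bit value from a little-endian wire/file format. */
    static uint16_t read_le16(const unsigned char *in) {
        return (uint16_t)(in[0] | (in[1] << 8));
    }

    /* Same for a big-endian ("stupid-endian") format. */
    static uint16_t read_be16(const unsigned char *in) {
        return (uint16_t)((in[0] << 8) | in[1]);
    }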
> - Search through a list of libraries until one is found that provides a requested function, then add that library to LDFLAGS
This is the exact opposite of what I want! If I'm depending on other libraries, I want that dependency explicitly listed and settable, not automagically found.
> - Does the C compiler work?
How is this reasonable to test? If it doesn't work, it can't actually do anything about it.
> - Do compiled binaries run on the build machine?
Totally ignores cross compilation, or deploys to other places -- e.g. containers.
> And it gives you control over things like:
These are generally useful, but the complexity required for autoconf is a huge cost to pay.
> Automake gives you:
All useful, yes. But these have never seemed to be particularly hard to do manually.
(Semi-)automatic handling of libraries, especially dynamic libraries, makes autotools worth the price alone. Supporting one architecture manually is easy. Doing it for six isn't. Doing it several dozen times, for all six, is spectacularly painful.
Such exotic systems as OS X in a few versions, BSD, and four flavors of Windows (cygwin, mingw, MSVC, msys)?
Add a few flavors of Linux and perhaps even Android (all 3 current targets) on top.
That with cross compilation.
Even Cmake lacks some useful portability tools to handle this...
Though Autotools have major problems too.
I think the most annoying issue is that for larger software with lots of options and dependencies, running configure just takes a long time (especially before SSDs were a thing), and will terminate with exactly one error. Getting past ./configure could take literally all day.
This made me think back then that someone could fix autotools by caching test results for a specific system. Why not me? So I opened its sources... and closed them. No positive outcome could have repaid the time required to even understand what’s going on in those scripts.
Agreed that this is an annoyance. Autoconf-generated configure scripts are definitely slow, mostly because they test a ton of things that probably don't still need to be tested.
The overall system still provides a ton of value and a lot of relevant tests in addition to the not-so-relevant ones. After years of writing increasingly complex Makefiles to test for this-and-that, Autoconf was a breath of fresh air for me when I made the switch a couple of years ago (even with all of its crustiness).
I think that very little software is hurt by supporting multiple platforms -- usually the software benefits from exposure to a different set of assumptions and can be made more robust for when the preferred set of systems changes in the future. Does your software compile for WebAssembly today? If it also compiles for MINIX, it probably already supported WebAssembly as soon as a compiler backend existed, without you having to make any changes.
Minix happens to be one of the most used operating systems because it turns out Intel's Management Engine, the secret computer inside your processor, runs Minix.
With regards to diagnosing configure, “config.log” is actually a very good record of all the work done by configure. If a check or test fails, config.log shows it alongside the exact command line used to compile/link/execute it, allowing you to reproduce a failure (or even just directly see which dependency/local quirk caused it). I’ve found very few reasons to dive into ./configure itself, which as you rightly note is a rat’s nest.
The most important thing is to put all the compiler flags behind standard variables. This is 80% of what autoconf/automake does. The standard variables for C programs are
CPPFLAGS = # e.g. -I and -D preprocess options
CFLAGS = # various ad hoc compiler switches
LDFLAGS = # e.g. -L options
LDLIBS = # e.g. -l options
SOFLAGS = # -shared or -bundle -undefined dynamic_lookup
(NOTE: SOFLAGS is less standard--i.e. not in the GNU style guide or GNU make manual--but it perfectly fits the pattern and is extremely useful to specify independently. You'll often find a variable like this used, if not the same name. IME SOFLAGS is the most common name.)
Library dependencies should follow a similar pattern:
LIBFOO_CPPFLAGS = # e.g. -I/usr/local/foo/include
LIBFOO_CFLAGS =
LIBFOO_LDFLAGS = # e.g. -L/usr/local/foo/lib
LIBFOO_LDLIBS = -lfoo
To make it easier to control feature-specific options without overriding all other defaults, you should do likewise for feature groups, following the same naming pattern.
Because of the ubiquity of GCC and clang, and because of the commonality across most modern Unix platforms (e.g. SOFLAGS will be "-shared" on all BSD- and SysV-derived systems, including Linux, and "-bundle -undefined dynamic_lookup" on macOS), the above will allow people to build your software without modifying the actual Makefile in almost all scenarios. It will certainly make packagers' jobs easier.
For installation you should always use the standard prefixes (prefix, exec_prefix, bindir, libdir, includedir, etc.) and honor DESTDIR.
This may seem like a lot of boilerplate, but the important thing is that it's very simple and time-tested. Autogenerating just this basic boilerplate is overkill and adds unnecessary indirection and complexity. You can get fancier than this scheme, but avoid the temptation as much as you can unless it's an obvious win--one that remedies a regular, concrete issue that is actually resulting in significant wasted effort. For complex builds (e.g. multiple targets where flags might not have the same semantics across target rules), consider splitting the build into multiple Makefiles. You can create a convenience wrapper, but at least the sub Makefiles will be reusable by packagers or contributors without having to modify anything directly. It's not ideal but there is no ideal solution. (I once heavily advocated non-recursive make, but these days I'm much more averse to complexity.)
There's no easy answer for Windows, particularly because NMake uses significantly different syntax for some constructs. For really basic builds the above might actually work as long as you stick to NMake-compatible syntax. With the advent of Windows Subsystem for Linux (WSL) you could consider requiring WSL for the build (so you have access to POSIX-compatible shell and Make) and perhaps providing a wrapper Makefile with more convenient defaults for compiling with Visual Studio (which is very much still a command-line driven compiler). Requiring clang-cl might be even easier.
For simple stuff IME autotools and CMake are just overkill.[1] For complex stuff they often won't be sufficient, anyhow.[2] So the important thing is to layer your build so the maximum amount of build code can be reused by somebody with unforeseen needs; and reused in a way that is obvious and transparent so their tweaks can be simple. Stick to convention as much as possible. Even if you don't use autotools, autotools has created de facto conventions wrt build variables that will be intuitive to all packagers. Please don't invent your own naming scheme! (It's annoying when someone uses 'PREFIX' instead of 'prefix'. There's logic to the autotools scheme. Uppercase variables are for [build] flags used in rule recipes, lowercase for [installation] paths used to define rule targets and dependencies. Exceptions like DESTDIR are well-known and few.)
Always be mindful of the needs of people doing embedded work--cross-compiling, complex build flags, etc. But don't bend over backwards. All you really need to do is remove obstructions and the necessity to patch. They're used to writing build wrappers, but patching is a maintenance nightmare and impedes tracking upstream.
Ideally you would support out-of-tree builds, but in reality few packages support this properly (even autotools or CMake projects), and if you're not regularly doing out-of-tree builds your build probably won't work out-of-the-box anyhow. That's why I didn't mention the use of macros like VPATH, top_srcdir, etc, in your rules. Do that at your discretion, but beware of descending down the rabbit hole.
The same thing applies to automatic feature detection--unless you're regularly building on esoteric targets your feature detection is unlikely to work out of the box, and bit rot happens quickly as platforms evolve.[3] Don't put too much effort into this; just make sure it's easy to manually toggle feature availability and that the defaults work on the most common platforms and environments. Maximize benefits, minimize costs, but don't think you can automate everything away.
All those fancy builds people tout are ridiculously brittle in reality. People just don't realize it because they're only using the builds on what are in fact relatively homogenous environments. Most of the effort is wasted, and build complexity (including bespoke build environments) is a big reason why people don't reuse software as much as they could.
[1] Indeed, for simple builds it's not much of a burden to provide and maintain a POSIX build and a Visual Studio project file build.
[2] For CMake this is especially true. For feature detection CMake often requires actually updating your version of CMake. That's a gigantic PITA. Anyhow, most feature detection can be done inline using the preprocessor, or by simply defaulting to being available (this is 2018, not 1988--there's much more commonality today.) And as long as your feature-detection macros are controllable from build-time flags, people can easily work around issues by explicitly toggling availability. Thus, the autotools convention of using "#ifdef FEATURE_FOO" for feature flags is bad advice. Always use boolean flags like "#if FEATURE_FOO", and only define FEATURE_FOO in config.h-like files if not already defined. That allows people to do, e.g., "MYCPPFLAGS=-DFEATURE_FOO=0" to override feature detection as necessary.
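To make that convention concrete, a config.h-style sketch (FEATURE_FOO is the placeholder name from the footnote above):

    /* config.h: define the flag only if the builder hasn't already
       decided, so CPPFLAGS=-DFEATURE_FOO=0 can override detection. */
    #ifndef FEATURE_FOO
    #define FEATURE_FOO 1
    #endif

    /* user code tests the value, not mere definedness */
    #if FEATURE_FOO
    /* ... code path that uses the feature ... */
    #else
    /* ... fallback ... */
    #endif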
[3] Autoconf's feature checking (check the feature, not the version) sounds robust in principle, but in practice you'd be surprised how many ways it can break, or how difficult it is to write a properly robust feature check. Plus, especially on Linux people expect a feature to work at run time if the kernel supports it, regardless of whether glibc had a wrapper at build time or has one at run time. These issues are where you'll spend much of your time, and they can't be automated. You can't foresee all these issues! Focus on making it easy to supplement or work around your build!
Is this software that he presently wants to see disseminated randomly, or is it provided here for academic purposes at this time?
I don't know if it's his intention to have this library become part of a bunch of linux or BSD distros or not. But, it seems kind of presumptuous on your part to assume that he does.
I'm a PhD student in theoretical computer science and have an academic interest in this sorter. But his build system is just too much of a pain to deal with voluntarily.
Not to mention that his CPU cycle-counting code doesn't actually seem to compile on Linux, which doesn't have sysctlbyname(). Or are build errors expected behaviour?
I haven't read the algorithm here yet nor all of the comments, so this may have already been covered, but in many cases when it comes to sorting integers and such you don't really need to sort them at all - you just need to count them.
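For what it's worth, the counting idea looks something like this (a generic sketch, not taken from the library under discussion; it assumes the keys are single bytes so the count table stays small):

    #include <stddef.h>
    #include <stdint.h>

    /* Sort an array of bytes by counting occurrences of each value
       and rewriting the array in order: O(n + 256), no comparisons. */
    void counting_sort_u8(uint8_t *a, size_t n)
    {
        size_t count[256] = {0};
        for (size_t i = 0; i < n; i++)
            count[a[i]]++;
        size_t k = 0;
        for (int v = 0; v < 256; v++)
            for (size_t c = count[v]; c > 0; c--)
                a[k++] = (uint8_t)v;
    }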
Yeah, except that when the range of possible integer values is very large, that tends to fail (the count table becomes enormous). Also, as pointed out by many others, the point of this work is to sort in constant time to avoid side-channel attacks. I doubt the histogram sort (which I think you're referring to) has this property.
Go is a language that's easy to use, but a challenge for beginners to use well, especially if you try to force [insert another language] constructs into it.
I see programmers that are new to Go often struggle with trying to apply their object-oriented mindset to a language that's not object-oriented; they run into trouble, complain about the language, and call it rubbish. Or they focus on the lack of generics and other parts of the language they don't like (e.g. slice manipulation).
Go is certainly far from perfect but after spending the better part of 7 years with it, it's usually the first tool I reach for.
"Go is a language that's easy to use, but a challenge for beginners to use well, especially if you try to force [insert another language] constructs into it."
Strongly agreed. There are a lot of languages out there with very rich feature sets, and the way you get jobs done in them is to go find the right feature for your current problem. With Go, you need to learn the language and extract every last drop out of every language feature. This is exacerbated by the fact that the feature set isn't what people expect: object composition, for example, is not what they are used to, and while interfaces are simple, there's still some art to using them properly.
Despite the vast, vast distance between Go and Haskell on the general-purpose programming language landscape, I found my Haskell experience quite useful in Go. The specifics don't transfer to an imperative language, but the general practice Haskell gave me--taking a bizarre set of programming tools and learning how to make sensible programs out of them anyway--carried over.
(It isn't necessarily the first language I reach for for personal tasks, but it is a superb professional programming language. It offers a nearly unique combination: it gets the job done for a wide variety of standard programming tasks (though not all!) while producing source code that is still comprehensible to almost every programmer. It isn't my favorite overall, but it's the best professional choice of language I have in my belt, often precisely because it does not permit me to indulge in flights of clever fancy that solve a problem in 25 impenetrable-to-the-next-guy lines of code. I know a lot of people may not love to hear that, but it's a factor you really have to consider when you are being paid to solve problems.)
It's kinda hilarious to see that as more and more successful projects and companies adopt Go in their stacks, the number of comments like these on HN increases.
Stack adoption is orthogonal to maturity or ease of use. Recent experience has suggested to me that the main factor in a company's choice of stack is basically the whim of whichever developer was tasked with creating the initial project.
Nobody is going to tell them no, there's not going to be significant discussion of the merits, and even if there were, there are no best practices to lean on that would base the decision on anything other than pure emotion.
Stack adoption makes people have to actually use them, which then gives them something to complain about (whereas before they could just ignore something not to their taste).
In my opinion, don't use Go at all if you can avoid it - it may be acceptable for a tiny CLI project but anything of significant complexity needs a language that can scale.
I'd hardly call Kubernetes, Docker daemon and tooling, etcd, CockroachDB, geth, and nsq tiny CLI projects.
If anyone has any reservations about learning Go, don't judge the language based on a list of flaws written by some programmers who used it for a few months, became frustrated, and wrote a blog post.
The type of codebase scaling Google does is very different from other companies.
They have huge numbers of junior developers right out of university (those that Rob Pike, one of the main authors of Go, likes to claim aren't good enough to learn advanced concepts) and their coding style is not focused on correctness and simple implementations - do something, do a lot of it, write a lot of tests.
For companies that aren't the size of Google, that don't work the same way (monorepos etc.), and that simply don't have the same set of resources available it will often end up being much easier to use a language that either prevents flaws (via a strong type system, like in Haskell or Rust, which Go does not have) or gracefully handles flaws (via an error handling strategy, like in Erlang or Elixir, which Go does not have).
Go is statically typed (this means that the compiler performs its checks at compile time and hard-fails the compilation process in the presence of type errors), but there is no single definition of what "strong" means.
Programming language theory is a field that is still developing, and notions of "strong" type systems that were valid in the '80s (under which Go certainly would have been considered strongly typed) are no longer relevant. The list you linked seems to cite the Go website itself as its source, by the way.
At the very least a modern language that wants to claim to have a strong type system should provide user-defined sum types, exhaustiveness checking and parametric polymorphism. Go has none of those.
When it comes to error handling, Go's "concept" of it is that "there may be a thing that can be turned into a string, in which case there was probably an error, but it's up to the developer to check - we won't help you". You may as well just use C then.
There is nothing to short-circuit failed computations, check whether errors have in fact been handled, restart or terminate computations gracefully, and so on. It's all manual labour that developers need to remember, and boilerplate they have to repeat over and over again.
I would recommend spending some time with languages that are "above Blub"[1] (ctrl+f "the blub paradox") -- good candidates for learning some modern PLT concepts are Haskell[2], Rust[3], and Erlang[4]. Even if you don't end up using those languages professionally, knowing the concepts they introduce will improve your code in "Blub languages" (Go, Java, etc.) too.
Some of the stuff that makes it work is unfortunately apparent later than the warts. Try to think of it as a domain-specific language for implementing simple http endpoints. ¯\_(ツ)_/¯
Probably one of the few. On HN, a search for "djb" results in about 60 submissions where "djb" is in the submission title. There are about 6 for Djibouti.
Also, the airport code "DJB" is for Sultan Thaha Airport. The Djibouti–Ambouli International Airport code is JIB.
It's nice when things fit in RAM. But very often they don't. When you want to sort arbitrary-sized binary records on disk look no further than bsort: https://github.com/pelotoncycle/bsort
When sorting on disk there are actual lower bounds that rule out linear time. It has to do with only being able to read and write whole blocks between memory and disk. Since everything moves in blocks, you are limited to roughly O((n/B) log(n/B)) block transfers, with B being the number of items per block. For more speed, a merge sort that keeps the block size in mind works quite well. Search for "external sorting".