Build System Rules and Algorithms (2009) [pdf]

paulddraper · on Sept 2, 2016

So many build systems have regressed in that they don't have incremental process built into their core functionality, like Make did. Ant, Maven, Grunt, SBT, Gulp, Gradle...basically everything sucks at this compared to Make. Awful.

But anyone interested in a next generation scalable build tool, check out Google Bazel. https://bazel.io/ (Ex-Googlers built similar build tools elsewhere -- Buck and Pants -- but I'm most optimistic about Bazel.)

Like Make, Bazel uses incremental tasks at its core. But where Make uses timestamps, Make hashes inputs, allowing for shared caches. It's also been optimized to work with Google-scale build definitions. In beta now, 1.0 planned for end of 2017.

premium-concern · on Sept 2, 2016

Not sure what you are talking about regarding SBT. SBT does incremental builds, it's one of the few that even do this interactively. It is also built on parallel-by-default execution: The task engine runs everything that can be computed in parallel except when you request sequential processing. (Some old code-bases with broken tests for instance.)

My experience with Google software is that it's usually not as good as Google employees think it is. See Go, Dart, GWT, Android, Google Groups, Angular. I think this can be partially attributed to the sect-like structure of the company where everything created at Google is the best thing ever, while everything invented in the rest of the world is discarded or ignored. See this hilarious reply from Googlers to a feature that many people take for granted, but Googlers are completely unaware of: https://github.com/bazelbuild/bazel/issues/1394

Bazel is yet another thing where Googlers think they have invented the best thing ever, and then have to realize that the world around them, which they tried to ignore so hard, has moved on since 1970, making their "really great, special tool" neither great nor special.

chubot · on Sept 2, 2016

What do you mean that Ant/Maven don't have "incremental process"? They don't do incremental builds at all?

There is an additional distinction between "top down" and "bottom up" incremental builds as mentioned in this article about tup.

NEITHER Make nor Bazel does bottom-up builds. They are top-down. That means an incremental build requires O(n) stat() calls or hashes: you have to check whether ANY input is out of date.

In contrast, tup does O(1) work for incremental builds by using inotify.
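To make the bottom-up idea concrete, here is a minimal Python sketch. It assumes something else (a file watcher, for instance) has already handed us the list of changed files; `affected_outputs` and the `rules` mapping are hypothetical names for illustration, not tup's actual data structures. The walk only touches nodes reachable from the changed files. A real tool would also keep the reverse index on disk instead of rebuilding it on every run as this sketch does.

```python
from collections import defaultdict, deque

def affected_outputs(rules, changed):
    """Return the set of outputs that depend on any changed file.

    rules:   dict mapping each output name to the list of its inputs
    changed: iterable of file names reported as modified
    """
    # Reverse index: input -> outputs that consume it.
    dependents = defaultdict(list)
    for output, inputs in rules.items():
        for src in inputs:
            dependents[src].append(output)

    dirty = set()
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for out in dependents[node]:
            if out not in dirty:
                dirty.add(out)
                queue.append(out)  # outputs may feed later rules
    return dirty
```

With rules for `a.o`, `b.o` and `app`, a change to `a.c` marks only `a.o` and `app`; `b.o` is never looked at.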

In practice, I'm not sure how much this matters, but it is interesting -- and it is the central point of the linked article. Ninja has a reputation for being fast, and it also does top-down builds. As I recall the Linux kernel is quite good at caching inodes for stat() calls, so you can basically stat millions of files in under a second. (Windows may be a different story.)

But if you are using hashes rather than timestamps, the work to check if any of a million files are out of date is much larger.
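A rough sketch of the two top-down checks, to show where the extra cost comes from. The function names and the `recorded` digest table are assumptions for illustration; neither Make nor Bazel is implemented this way in detail. The timestamp check needs one stat() per file, while the hash check has to read every input's contents unless some other shortcut is layered on top.

```python
import hashlib
import os

def stale_by_mtime(output, inputs):
    """Timestamp check: rebuild if output is missing or older than an input."""
    if not os.path.exists(output):
        return True
    out_mtime = os.stat(output).st_mtime
    return any(os.stat(src).st_mtime > out_mtime for src in inputs)

def file_digest(path):
    """SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def stale_by_hash(inputs, recorded):
    """Hash check: rebuild if any input differs from its recorded digest.

    recorded: dict mapping input path to the digest seen at the last build
    """
    return any(recorded.get(src) != file_digest(src) for src in inputs)
```

One practical difference: touching a file without changing it trips the timestamp check but not the hash check.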

Jtsummers · on Sept 2, 2016

inotify is only available on Linux, so `tup monitor` doesn't work on all OSes. Other platforms have similar libraries, though, so porting it is possible.

gavinpc · on Sept 3, 2016

Are there such libraries for Windows? I can't find my notes on this, and all I remember is that `strace` looked the most promising (for a non-C programmer).

I actually switched from Windows to Linux because of Tup—although to their credit, they have fixed the issues that I filed.

I still use Tup on Windows for production at work, and in place of `tup monitor` I use a service based on .NET's file system watcher, and just tail monitor the log.[0] The config looks like this:

    <job at="%HOME%\some\project">
      <watch>
        <path>script/*.js</path>
        <path>stylus/*</path>
        <path>src/**</path>
        <path>tools/**.cs</path>
      </watch>
      <do>tup --no-environ-check</do>
    </job>

It's not foolproof, since I have to maintain that list, but it's better than the alternatives (whatever they are).

[0] https://bitbucket.org/gavinpc/reactor/

Jtsummers · on Sept 3, 2016

https://msdn.microsoft.com/en-us/library/windows/desktop/aa3...

Best I can find. I was referring to kqueue and FSEvents, though, for FreeBSD and macOS.

Sorry traveling and phone only.

I wonder if the Linux on Windows stuff will make it easy to use inotify on Windows? I've only heard of it. Haven't researched how that works.

beagle3 · on Sept 2, 2016

djb's "redo" (implemented by apenwarr and others) is a minimalistic-almost-perfect implementation of incremental process.

TTBOMK, the only two things it doesn't do well are processes with multiple outputs (e.g., yacc has both .c and .h as output), and distinct output directories. Both can be worked around, but ... it's clunky. Other than that, it's ridiculously simple, extremely robust and generally awesome. Way better than Make at properly expressing dependencies.

But of course, gittup takes the prize - by virtue of watching the build process from inside, it doesn't need you to tell it about dependencies at all. It knows better than you do, and can thus do the fastest possible perfectly incremental builds.

gavinpc · on Sept 3, 2016

> But where Make uses timestamps, Make hashes inputs

I think you mean "Bazel" hashes inputs.

Anyway, you don't mention Tup... have you used it? It might not be Google-scale, but it can build Mozilla central and a Linux distro. I use it at work and for personal projects. All I can say about it is that there's no going back. Maybe Bazel is that way, too. I'll check it out.

Tup also does output-based short-circuiting, which I assume is based on hashes. In other words you can say

    : some_input |> ^o^ process %f %o |> some_output

and because of the ^o^ flag, rules that take `some_output` as an input will not be invoked if its content didn't change from last time. Without that, it's still not using timestamps, but keeps a (sqlite) database of whether or not changes to a file originated from Tup itself.
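Here is a toy Python model of that short-circuiting, under the same assumption that it is hash-based. This is not how Tup is implemented: `run_chain`, the linear list of steps and the in-memory `cache` are all made up for illustration, and a real build is a DAG backed by a database. The point is only that a step whose output digest matches the previous run stops the change from propagating.

```python
import hashlib

def run_chain(steps, source, cache):
    """Run a linear chain of steps, skipping work after an unchanged output.

    steps:  list of (name, fn) pairs; each fn maps bytes to bytes
    source: bytes of the first input, assumed to have changed
    cache:  dict mapping step name to (digest, output) from the last run
    Returns the names of the steps that actually executed.
    """
    executed = []
    data = source
    changed = True  # the caller only invokes us when the source changed
    for name, fn in steps:
        prev = cache.get(name)
        if not changed and prev is not None:
            data = prev[1]  # input is unchanged, reuse the old output
            continue
        out = fn(data)
        executed.append(name)
        digest = hashlib.sha256(out).hexdigest()
        # Downstream steps only need to run if the content differs.
        changed = prev is None or prev[0] != digest
        cache[name] = (digest, out)
        data = out
    return executed
```

If the first step strips whitespace and the edit only added whitespace, the first step reruns, produces identical bytes, and everything after it is skipped.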

I will say this: it's an adjustment, if you're used to mutating files during a build. (I'm looking at you, pdflatex.) But it's sooo worth it.

blackkettle · on Sept 2, 2016

Could you explain a bit more why - in practice - you think Bazel is 'next generation' compared to Make?

Every time I've encountered Bazel thus far in the wild I've found it an atrocious experience.

I haven't yet found anything I prefer to simple Make, and even autotools, while awful, are at least a completely known quantity.

chubot · on Sept 2, 2016

Yeah the biggest problem with Bazel is that it has this big JVM dependency.

And at least the internal version makes you specify your dependencies twice for C++, rather than using something like gcc -M. (And I believe that is true for Java too.)

tsvetkov · on Sept 2, 2016

You're wrong about Gradle: it has good incremental build support. Not only does it skip a task whose inputs have not changed, it even lets a task process its inputs incrementally (e.g. so a copy task can copy/remove only the changed files). It does not support distributed builds though.

I'm not sure if SBT supports incremental task execution, but it does support incremental compilation of Scala, which can be more than enough for many devs.

gavinpc · on Sept 3, 2016

I haven't used Gradle directly, but I've been using Cordova for years, which switched from Ant to Gradle a while back. With Gradle, the incremental builds got way faster. It was clearly burning through up-to-date targets.

But it was still listing the up-to-date targets. Incremental builds in Tup don't even list what's not being done, because there are no "targets" in Tup. That's the whole point of Mike Shal's paper.

fizixer · on Sept 2, 2016

I guess the only gripe I have with make is that it has a rigid syntax (that includes tabs). Since what a Makefile eventually is, is a graph, I'd be happy if that graph is specified as a JSON file, which automatically gets converted into a valid Makefile.

Has anyone attempted that approach? simply put a JSON wrapper around make?

wtetzner · on Sept 2, 2016

I agree that Make's syntax isn't ideal, but JSON doesn't seem like a move in the right direction. If you're going to give Make a new syntax, why not pick a more pleasant syntax?

fizixer · on Sept 2, 2016

Well, the way I see it, there are only 3 games in town: JSON-style variants (including YAML, arguably), XML, and S-expression variants.

XML is definitely not better than JSON IMO. Are you hinting at either S-expression, or something like YAML?

(For any other "custom syntax" we're better off sticking to make syntax itself)

wtetzner · on Sept 2, 2016

Well, your 3 games in town are serialization formats. They're great choices if you want tools to manipulate the build scripts.

However, in my opinion, from your list only s-expressions make a reasonable syntax that a human would write and maintain.

And, if you want that, well :) https://www.gnu.org/software/make/manual/html_node/Guile-Int...

hood_syntax · on Sept 2, 2016

Curious what you mean by pleasant. JSON is not really "pretty" but it is simple, and powerful imo. Is it the boilerplate that puts you off?

wtetzner · on Sept 2, 2016

Yes, the issue is that build scripts are typically written and maintained by humans (as opposed to programs). A more pleasant syntax is one that would be easier to read and maintain.

I think if you want a simple serialization format to replace Make's syntax, s-expressions would be a better fit.

padator · on Sept 2, 2016

You can use mk, the successor to make, which does not have the TAB requirement (you can use TAB or spaces interchangeably).

Regarding JSON, it is more verbose than Make. You need to use quote around strings, commands, everything. Compare:

    foo.o: foo.c
        gcc -c foo.c

with

{ 'foo.o' => ['foo.c'; 'gcc -c foo.c'] }

klodolph · on Sept 2, 2016

{ "outputs": ["foo.o"], "inputs": ["foo.c"], "command": ["gcc", "-c", "foo.c"] }
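For what it's worth, a record shaped like that is easy to turn into a Makefile rule. A minimal Python sketch, assuming the file is a JSON array of such objects (the array wrapper and the `rules_to_makefile` name are my assumptions, not an existing tool). It quotes the command arguments for the shell but makes no attempt to escape `$` or other characters that are special to Make, and a rule with several outputs comes out as Make's ordinary multi-target rule, which treats each target independently.

```python
import json
import shlex

def rules_to_makefile(json_text):
    """Convert a JSON array of {outputs, inputs, command} into Makefile text."""
    lines = []
    for rule in json.loads(json_text):
        targets = " ".join(rule["outputs"])
        prereqs = " ".join(rule["inputs"])
        lines.append(f"{targets}: {prereqs}")
        # Recipe lines must start with a tab in GNU Make.
        lines.append("\t" + shlex.join(rule["command"]))
        lines.append("")
    return "\n".join(lines)
```

Feeding it the single record above yields the two-line `foo.o: foo.c` rule with a tab-indented `gcc -c foo.c` recipe.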

s-phi-nl · on Sept 2, 2016

You might look at YAML: it is a superset of JSON that allows unquoted strings.

Scaevolus · on Sept 2, 2016

Ninja is a fast Make-like build system with a format designed to be generated by higher-level build tools.

It was originally written for Chrome, but has been adopted as an output format for CMake and is used for many other projects as well.
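As a rough illustration of what "generated by higher-level build tools" means, here is a small Python sketch that emits a Ninja file for compiling C sources. The `ninja_file` helper and the single hard-coded `cc` rule are assumptions for the example; real generators such as CMake emit far more. Paths containing spaces or `$` would need Ninja's escaping, which this sketch skips.

```python
def ninja_file(objects):
    """Emit Ninja build text for a dict mapping object file to C source."""
    lines = [
        "rule cc",
        "  command = gcc -c $in -o $out",
        "",
    ]
    # One build statement per object, sorted for stable output.
    for out, src in sorted(objects.items()):
        lines.append(f"build {out}: cc {src}")
    return "\n".join(lines) + "\n"
```

Writing the result to `build.ninja` and running `ninja` in that directory would then compile each listed source.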

nihongo · on Sept 2, 2016

Doesn't Ant use XML format?