24-core CPU and I can’t move my mouse (randomascii.wordpress.com)
1011 points by joebaf on July 10, 2017 | 499 comments



Full disclosure: I work for Google on Chrome.

A Chrome build is truly a computational load to be reckoned with. Without the distributed build, a from-scratch build of Chrome will take at least 30 minutes on a MacBook Pro--maybe an hour(!). TBH I don't remember toughing out a full build without resorting to goma. Even on a hefty workstation, a full build is a go-for-lunch kind of interruption. It will absolutely own a machine.

How did we get here? Well, C++ and its stupid O(n^2) compilation complexity. As an application grows, the number of header files grows because, as any sane and far-thinking programmer would do, we split the complexity up over multiple header files, factor it into modules, and try to create encapsulation with getters/setters. However, to actually have the C++ compiler do inlining at compile time (LTO be damned), we have to put the definitions of inline functions into header files, which greatly increases their size and processing time. Moreover, because the C++ compiler needs to see full class definitions to, e.g., know the size of an object and its inheritance relationships, we have to put the main meat of every class definition into a header file! Don't even get me started on templates. Oh, and at the end of the day, the linker has to clean up the whole mess, discarding the vast majority of the compiler's output due to so many duplicated functions. And this blowup can be huge. A debug build of V8, which is just a small subsystem of Chrome, will generate about 1.4GB of .o files which link to a 75MB .so file and 1.2MB startup executable--that's an 18x blowup.
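
To make the header problem concrete, here is a minimal, hypothetical sketch (made-up names, not Chrome code): as soon as you want a method inlined without LTO, or callers need sizeof(), the definitions have to sit in the header, and every translation unit that includes it parses all of that again.

  // widget.h -- hypothetical example
  #include <string>
  #include <vector>

  class Widget {
   public:
    // To inline this without LTO, the body must be visible to every caller,
    // so it lives in the header...
    int size() const { return static_cast<int>(children_.size()); }

   private:
    // ...and because callers need sizeof(Widget), the member layout (and the
    // headers it drags in) has to be visible too.
    std::string name_;
    std::vector<Widget*> children_;
  };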

Ugh. I've worked with a lot of build systems over the years, including Google's internal build system open sourced as Bazel. While these systems have scaled C++ development far further than ever thought possible, and are remarkable engineering achievements in their own right, we just need to step back once in a while and ask ourselves:

Damn, are we doing this wrong?


Not to take away from your point, but in my experience the vast majority of C++ build pipelines, even at major companies, can still be improved. Few people enjoy 'improving the build'; it often touches everything and requires discipline to keep it working. Most of the projects I've worked on have been larger than Chrome. I've seen the compile time for BioShock Infinite go from 2 hours down to 15 minutes with serious work on header use, precompiled headers, and all the other tricks people use. Epic's build system is a pretty good example. There is even an older book, Large-Scale C++ Software Design, that is specifically about this point.

Starting with a full build that initially takes hours and shrinking it to < 15-20 minutes or better seems pretty par for the course for truly large C++ projects. You don't get a fast build process for free, but if the team makes it a priority, a lot can be done.

EDIT: Times mentioned were for a full build; you rarely do a full build, so incremental builds should be the majority. Places that don't make incremental builds 100% reliable drive me crazy and waste so much developer time. This is common, but it's a lame excuse. Just do the work and fix it.


I worked on exactly this problem for Chrome! I agree with all your major points -- in particular, optimizing incremental builds is the most important thing for developer sanity.

Here's a post about what I did: http://neugierig.org/software/chromium/notes/2011/02/ninja.h...


Doom 3 actually used SCons across all the OSes (~2004). At the time, it was so nice to have a Python build system. I sort of hoped it was the future, but it sort of died as it failed to scale. I've seen a few home-brewed Python build systems work well, but typically we're back to CMake/Make.


Check out meson; it seems to be the future for projects that were using CMake or autotools. It's certainly a joy to work with in comparison.


Were using? I quite enjoy CMake and find it fast and easy to use. What am I missing out on?


Meson is strongly typed; it goes beyond just having a notion of "paths" and tracks what kind of object a path points to, and what kind of resource a string names. This is invaluable, because it means you get feedback when you accidentally pass an object file instead of a library name, or any number of other confusions.

Personally, this meant the error messages I got were helpful enough that my first meson-built project was working within half an hour of my deciding to port it over, despite using several system libraries and doing compile-time code generation.

Meson's language is not Turing-complete, so it's easy to analyze for errors. Unlike CMake and autotools, Meson's language looks like a real (pythonish) programming language, and it isn't string-oriented; dances of escaping, substitution, and unescaping are uncommon.

Compared to autotools or hand-rolled Makefiles, CMake is a step in the right direction; meson is a leap.


How happy have you been with Meson in complicated projects with multiple directories? Especially where things are complex and different options are used in different places. Make, in spite of all its craziness, would be a good tool if it had any sane kind of support for this.

CMake tries hard to do better, but then introduces its own layers of craziness. So it's fine as long as I am not doing anything unusual, but as soon as I need to understand what is going on, I find a dizzying array of barely working moving parts beneath me.


Just as a data point - Chrome has more code than the Linux kernel - would you say BioShock Infinite seriously has a larger code base than Chrome?

I think a lot of people don't estimate correctly just how huge Chrome is.


I would expect kernels to be quite small (#files, line count) compared to major applications like Unreal games and Civilization 5. I've never worked on Chrome, but I can safely say the amount of source code in a few Unreal games and Civilization 5 dwarfs the drivers and OS code I've worked on. Take Unreal, then add a team of developers adding onto it for multiple years through multiple releases. Then add all the middleware (Havok, audio engines, NaturalMotion).

OSes are much larger than kernels; I'd guess all the driver code exceeds the actual kernel.

People always think their code base is large, but having built most of the Call of Duties and many Unreal games, all the OS code I've worked on is trivial in size by comparison. There is probably something bigger, but games seem bigger than many major apps in my experience.


For reference, the kernel has ~15 million LoC, and according to a not exactly reliable or verifiable infographic on reddit, BioShock Infinite contains 631 miles of code, which would be between 3 and 10 million LoC.


Also, the Linux kernel is C, versus C++ for the rest mentioned.


Why is it so big?


It's an operating system (pretending to be a browser).


It is also an interpreter for a big number of convoluted (and a still bigger number of non-convoluted) languages.


A lot of time has been spent on optimizing Chrome's build:
- The Ninja build system will perfectly parallelize the build without overloading resources (modulo this OS bug)
- The meta-build system was recently completely replaced (gyp -> gn) to improve builds
- Lots of work on clang-cl to allow compiling Chrome for Windows without using Microsoft's compiler
- A distributed build system to further increase parallelism

So, lots of work has been done to deal with the build times. And probably not a lot of low hanging fruit to be found. But, still more work is being done. Support is being added for 'jumbo' builds (aka unity builds, where multiple translation units are #included into one) which is helping a bit with compile and link times.
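
For what it's worth, the mechanism behind a jumbo/unity build is simple; here's a hypothetical sketch (file names made up): a generated source file #includes a batch of real translation units, so shared headers get parsed once per batch instead of once per file.

  // jumbo_unit_1.cc -- hypothetical generated file
  // Every header pulled in by these .cc files is now parsed once for the
  // whole batch instead of once per translation unit.
  #include "dom/element.cc"
  #include "dom/node.cc"
  #include "dom/text.cc"
  // Caveat: file-scope statics and anonymous namespaces from the batched
  // files now share one translation unit, so name collisions can show up.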


Unity builds are a big win in Unreal Engine and basically required with Unreal; the actual win is surprisingly large.

EDIT: I'm sure lots of work has been done, not trying to degrade that. Just sharing my experience on my projects, never worked with Chrome.


> Damn, are we doing this wrong?

Yes, and I mean "we" as in "this industry".

I just recently talked to someone whose Swift framework(s) were compiling at roughly 16 lines per second. Spurred on by Jonathan Blow's musings on compile times for Jai[1], I started tinkering with tcc a little. It compiled a generated 200KLOC file (lots of small functions) in around 200 ms.

Then there are Smalltalk and Lisp systems that are always up and pretty much only ever compile the current method.

We also used to have true separate compilation, but that appears to be falling out of favor.

Of course none of these are C++ and they also don't optimize as well etc. Yet, how much of the code really needs to be optimized that well? And how much of that code needs to be recompiled that much?

So we know how to solve this in principle, we just aren't putting the pieces together.

[1] https://www.youtube.com/watch?v=14zlJ98gJKA He mentioned that a decent size game should compile either instantly or in a couple of seconds


Are you serious? You want to make a product: a Web browser. What technology are you going to choose? The one that makes your browser fast but gives you more work, or the one that makes your browser slower but makes compilation less of a nuisance to you? It's mind boggling that some would openly say that, hey, who cares about performance that much, these compilation times bother me, the developer. On a browser of all things! Would you be okay with your browser being 2x or 3x slower?


False dilemma. You can have both. It's okay if a fully optimized release mode binary takes a bit longer to compile, but compiling a few million lines of code for a debug build shouldn't take more than a second or two.

Also consider the leverage factor. Improvements to the compiler benefit all users of the programming language, so it's worthwhile to invest in high quality compilers.


In what language can we have both?

Yes, we can use caching compilers (https://wiki.archlinux.org/index.php/ccache) to speed up builds with few changes. We can lower optimization levels (although that buys you little compile speed relative to the runtime speed you lose, and it makes your program do slightly different things).

There's no slider from "pessimum" to "optimum". You need to do wildly different things to optimize past this point for compile speed. Erlang hot-reload and at-runtime-code-gen from other langs come to mind. But that will almost definitely slow down your program because of the new infrastructure your code has to deal with.

I have observed that there can be a nice balance with Java and the auto-reloading tools that are available for it. But I am unaware of their limitations and how a web browser might trigger those limitations.


D is a language where you can have both. C++ architected correctly can get much closer, though. Much larger compilation units are a start. After that, realizing that modularity comes from data formats and protocols means you can start to think about minimal coupling between pieces. I think dynamic libraries for development are very underutilized.


Besides the ones people already pointed out (and Haskell - yep, the always-slow GHC can do that), there's no reason C++ couldn't have both (except for large templates).


> In what language can we have both?

D has very fast compilation times compared with C++.

Rust is another option.


Rust doesn't compile very quickly at the moment. Helpfully, there's a live thread about the matter on r/rust [1]. Broadly speaking, it's about the same as C++. Some aspects are faster, some slower. Points worth noting (from that thread and elsewhere):

* Everything up to and including typechecking (and borrowchecking) takes a third to a half of the time, with lowering from there to a binary taking the rest of the time; that means (a) you can get a 2-3x speedup if you only need to check the code is compilable, and (b) overall speed isn't likely to improve a lot unless LLVM gets a lot faster.

* Rust doesn't currently do good incremental compilation, so there are potential big wins for day-to-day use there.

* There is a mad plan to do debug builds (unoptimised, fast, for minute-to-minute development) using a different compiler backend, Cretonne [2]. If that ever happens, it could be much, much faster.

[1] https://www.reddit.com/r/rust/comments/6m97hl/how_do_typical...

[2] https://internals.rust-lang.org/t/possible-alternative-compi...


> In what language can we have both?

Have you actively tried to find one?


> You can have both. It's okay if a fully optimized release mode binary takes a bit longer to compile, but compiling a few million lines of code for a debug build shouldn't take more than a second or two.

> You can have both.

Pretty bold claim, without proof.


It's not difficult to spit out machine code at a high pace. TCC is one example given by the grandparent, but it's certainly not the only fast compiler out there. Languages like Turbo Pascal were designed for rapid single-pass compilation, way back in the 80s.

A million lines of code represents an AST with a few million nodes in it, which compiles to a binary of a few megabytes. To do this we have computers with a dozen cores running at 4ghz each, 100GB of memory and blazingly fast SSD drives.

It's easy to forget, but computers themselves aren't slow. The software we write is just inefficient.


What you are saying is that it is possible in theory, but no one has done it yet. So in some future where someone rewrites all C++ compilers to not be so slow, we won't need to compromise.

Most C++ devs have to work with tools that currently exist and so we are stuck with what the compiler devs give us. Believe it or not C++ compiler devs are pretty smart people and have largely optimized it as much as possible without a language redesign.

That language redesign is in the works with modules, but the dust hasn't settled yet, so that is also a discussion for the future. In the meantime no other language delivers the performance C++ does right now. So if I want to ship product right now, the very real dilemma is a fast product with slow builds (and a bunch of tools for dealing with that) with C++, or some other language with a faster compiler and a slower product.

Then there is Rust, but that is another whole can of worms and not in use in most shops yet (Just switching to something has a huge cost).


C++ is a language that's incredibly hard to compile efficiently and incrementally, because it suffers from header file explosion (among other things) and as you mentioned it has no working module system.

C compiles a _lot_ faster than C++, so that's always an option. And as other people have pointed out, you can get C++ code to compile much more quickly by being very disciplined about what features you use and how your code is laid out.

So I agree that if you want to ship something right now all your options have significant downsides. I think software engineers as an industry don't take tooling nearly as seriously as they should. Tools are performance amplifiers and we currently waste a staggering number of manhours working with poorly designed, unreliable, poorly documented and agonizingly slow tools.


Tell me, how do you do mutual recursion in a single-pass define-before-use compiler?


Single pass generally means one crack at each compilation unit. It's ok to keep a list of unresolved forward references and go back and inject (fix up) the address once it's known. I mean, that would still count as a single pass. If they're not in the same file, the linker does it.
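
In C-family languages the programmer supplies the forward reference explicitly, which is what lets a define-before-use compiler handle mutual recursion in one pass; a toy sketch:

  // The declaration alone is enough for the compiler to emit the call; the
  // address is fixed up later in the same pass (or by the linker if the
  // definition lives in another file).
  int is_odd(unsigned n);

  int is_even(unsigned n) { return n == 0 ? 1 : is_odd(n - 1); }
  int is_odd(unsigned n)  { return n == 0 ? 0 : is_even(n - 1); }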


Easy, stack trampoline (or skyscraper, a la Cheney on the M.T.A.)


Are we speaking in a general sense here? Because the root of this is headers needing to be compiled with (almost) every use in C++. We could get rid of that while maintaining the same functionality. It's not a very bold claim unless there's a requirement of "no significant language changes"


Then show me how, please.


Turbo Pascal.


Yes, I am totally serious.

Where did I write "who cares about performance"? And why do you think any of what I said is going to cost 2x-3x performance? Performance has been either a major part of or simply my entire job for most of my career, and I usually make projects I run into at least an order of magnitude faster. For example by switching a project from pure C to Objective-C. Or ditching SQLite despite the fact that it's super-optimized. Or by turning a 12+ machine distributed system into a single JAR running on a single box.

The Web browser and WWW were invented on a NeXT with Objective-C. It wasn't just a browser, but also an editor. In ~5KLOC written in a couple of months by a single person. NCSA Mosaic took a team of 5 a year and was 100KLOC in C++. No editing. So pure code-size is also a problem. And of course these days code size has a significant performance impact all by itself, but also 20x the code in C++ is going to take a significantly longer time to compile.

In terms of performance, the myth that you need to use something like C++ for the entire project is just that: a myth. First, the entire codebase doesn't need to have the same performance levels, a lot of code is pretty cold and won't have measurable impact on performance, especially if you have good/hi-perf components to interact with. See 97:3 and "The End of Optimizing Compilers". Or my "BookLightning" imposition program, which has its core imposition routine written in Objective-Smalltalk, probably one of the slowest languages currently in existence. Yet it beats Apple's CoreGraphics (written in C and heavily optimized) at the similar task of n-up printing by orders of magnitude.

Second, time lost waiting for the compiler is not "convenience", it is productivity. If you get done more quickly, you have more time to spend on optimizing the parts of the program that really matter, and thoughtful optimization tends to have a much larger impact on performance than thoughtless optimization. The idea that this is purely a language thing is naive. See, for example, https://www.youtube.com/watch?v=kHG_zw75SjE

Third, you don't need to have C++ style compilers and features to have a language that has fast code, see for example Turbo Pascal mentioned in other comments. When TP came out, we had a Pascal compiler running on our PDP-11 that used something like 4-5 passes and took ages to compile code. TP was essentially instantaneous, so fast that our CS teacher just kept hitting that compile key just for the joy of watching it do its thing. It also produced really fast code.


Point taken, but my point was more about pragmatism. We know it's possible to have a fast compiler that generates fast code (indeed every time this is discussed someone brings up TP). But it's no use talking about a 30 year old compiler or about how the first WWW browser was superbly written. What I mean is I'm not talking about possible, I'm talking about feasible now. If I want to write a performance critical project right now what tool(s) should I use? The answer is most likely C++.


Furthermore, using a "slow" language for big parts of the code can make the whole project faster: size matters as an input into performance. A compact bytecode executed by a small interpreter thrashes the cache and memory hierarchy a lot less than tons and tons of ahead-of-time-compiled native code.

Use high-level interpreted languages to make your life easier, but also use them to make the page cache's life easier.


Good examples of the architectures you are describing are Android and UWP.

Although the lower levels are written in a mix of C and C++, the OS Frameworks are explicitly designed for Java, C# and VB.NET.

Trying to use C or C++ for anything more than moving pixels or audio around is a world of pain.

The Android team even dropped the idea of using C++ APIs on Brillo and instead brought in the Android stack, with the ability to write user-space drivers in Java (!).


> Would you be okay with your browser being 2x or 3x slower?

No, I will switch to Firefox or something. I'm a user; I don't care how hard it is for developers, I care about my workflow, which is using a browser on various machines, some of which are very slow.


Tired developers make more mistakes. Rushed developers cut corners.

A faster compiler isn't just some frivolity. It's a power tool. A force multiplier.


Of the developers I've met recently, this wouldn't surprise me at all. The world revolves around them. Not the product. Not the user. Not the company. They are a "developer" or worse, an "engineer" and can do no wrong.


Jai, though unreleased and in development, is a language that explicitly aims to have extremely good runtime and compile-time performance.


Why not fast browser and fast compilation at the same time?


This might be theoretically possible, but I don't think it's been done yet. All the fast browsers so far take a while to build.


It's worthwhile to note that Blink (v. the whole of Chromium) has had its build time quintuple in the past four years or so, and its starting point was far slower than Presto (which may or may not qualify as "fast" in people's books, depending on what features you care about).


Show me how, then.


I didn't say I know how to do it. And I didn't say it's easy. Yes, currently we have fast browser and slow compilation. Just let's not assume situation can't be improved at all.


Bjarne made the mistake of having C++ rely on C's linker model, so it meant no modules.

Now we can see it as a big mistake, but on those days probably it was one of the reasons why C++'s adoption took off.

Also while C lacked modules, most Algol and PL/I derived languages supported them since the late 60's.

Swift's case has the issue of mixing type inference with subtyping, so lots of time is spent there.

All in all, I really miss TP compile times, and at least on Java/.NET, even with AOT compilers, compile times are close enough.

EDIT: some typos


Straightforward integration with existing tooling was not a "mistake", it was a design point. There were plenty of competing runtimes even in the 80's that were better than C's linker model. C++ succeeded because it didn't create that friction.


> "but on those days probably it was one of the reasons why C++'s adoption took off."

I know "Design and Evolution of C++" quite well, and have been a C++ user since Turbo C++ 1.0 for MS-DOS.


Sure, but it wasn't a "mistake". Stroustrup absolutely wanted a C with classes, and that meant tight integration with C toolchains. Symbol mangling was the clever idea invented to implement that very deliberate choice, not a fortuitous happenstance.


I'm pretty sure @pjmlp is using the word "mistake" to say "a decision that turned out to be bad". English is not my native language, but judging by the ways I've seen it used and what dictionary definitions I can find, it seems quite acceptable.

Stroustrup made the decision on purpose and consciously, but it turned out to have disastrous effects.


Yep, you got it right.


Actually name mangling predates C++.


OCaml uses C's linker model, and yet still manages to have working Modula-like modules (even with cross-module inlining). So there's an existence proof that it's possible to do it well.


Well, that comes with a few caveats:

1. OCaml generates additional information that it stores in .cmi/.cmx files.

2. OCaml does not allow for mutual dependencies between modules, even in the linking stage. Object files must be provided in topologically sorted order to the linker.

3. OCaml supports shared generics, which cuts down on the amount of code replication (at the expense of requiring additional boxing and tagged integers in order to have a uniform data representation).


All true except #3 (partially).

> 1. OCaml generates additional information that it stores in .cmi/.cmx files.

On this point I'd say that it could probably embed the cmx file as "NOTE" sections in the ELF object files, but likely they didn't do it that way because it's easier to make it work cross-platform. Every "pre-compiled header" system I've seen generates some kind of extra file of compiled data which you have to manage, so I don't think this is a roadblock.

> 2. OCaml does not allow for mutual dependencies between modules, even in the linking stage. Object files must be provided in topologically sorted order to the linker.

I believe this is to do with the language rather than to do with modules? For safety reasons, OCaml doesn't allow uninitialized data to exist.

Although (and I say this as someone who likes OCaml) it does sometimes produce contortions where you have to split a natural module in order to satisfy the dependency requirement. I've long said that OCaml needs a better system for hierarchical modules and hiding submodules (better than functors, which are obscure for most programmers).

> 3. [...] at the expense of requiring additional boxing and tagged integers [...]

I think this is fixed by OCaml GADTs: https://blogs.janestreet.com/why-gadts-matter-for-performanc... However this is a new feature and maybe not everyone is using it so #3 is still a fair point.


> I believe this is to do with the language rather than to do with modules?

Both, sort of. The problem is that mutually recursive modules are tricky. So, it's a limitation of the language, but one that is there for a reason.

> I think this is fixed by OCaml GADTs

No, GADTs solve a different problem. Essentially, normal ADTs lose type information (due to runtime polymorphism). GADTs give you compile time polymorphism, so the compiler can track which variant a given expression uses. Consider this:

  # type t = Int of int | String of string;;
  type t = Int of int | String of string
  # [ Int 1; String "x" ];;
  - : t list = [Int 1; String "x"]
  # type _ t = Int: int -> int t | String: string -> string t;;
  type _ t = Int : int -> int t | String : string -> string t
  # [ Int 1; String "x" ];;
  Error: This expression has type string t
         but an expression was expected of type int t
         Type string is not compatible with type int 
The problem with functors (and also type parameters) is the following. Assume that you have a functor such as:

  module F(S: sig type t val f: t -> t end) = struct ... end
To avoid code duplication, F has to pass arguments to S.f using the same stack layout, regardless of whether it's (say) a float, an int, or a list. This means that floats need to get boxed (so that they use the same memory layout) and integers have to be tagged (because the GC can't tell from the stack frame what the type of the value is).


Where can I learn more about it in a high level way instead of delving into source code?

I am curious how it is done in a portable way across all OSes, especially with crude system linkers and OSes without POSIX semantics.

For example, I imagine this could be done via ELF sections, but not all OSes use ELF.


These links explain the extra files: https://ocaml.org/learn/tutorials/filenames.html https://realworldocaml.org/v1/en/html/the-compiler-frontend-... https://realworldocaml.org/v1/en/html/the-compiler-backend-b...

The cmx data could be converted to ELF note sections, but the whole thing has to work on Windows as well, so I guess they didn't want to depend on ELF.

In most projects, you can add this to your Makefile and forget about it:

    .SUFFIXES: .mli .ml .cmi .cmo .cmx
    .mli.cmi:
            ocamlfind ocamlc $(OCAMLFLAGS) $(OCAMLPACKAGES) \
                -c $< -o $@
    .ml.cmo:
            ocamlfind ocamlc $(OCAMLFLAGS) $(OCAMLPACKAGES) \
                -c $< -o $@
    .ml.cmx:
            ocamlfind ocamlopt $(OCAMLFLAGS) $(OCAMLPACKAGES) \
                -c $< -o $@

    clean:
            rm -f *.cmi *.cmo *.cmx *.cma *.cmxa


Thanks for the hints.


My guess: The trick is not to have "template instantiation" but "module instantiation" (aka functors in OCaml). Now you can instantiate only once. For example, if the compiler encounters "List<Foo>", it would instantiate it into a "List$Foo.o" file, or skip it if that file already exists. Java works similarly, except the files have the extension "class" instead of "o".

More generally speaking: The trick must be to not generate identical instantiations multiple times. So you must have a way to check, if you already generated it. Of course, the devil is in the details (e.g. is equivalence on the syntactic level enough?).
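
C++11's explicit instantiation declarations are a manual version of that idea: you promise the compiler that one designated translation unit owns the instantiation, and every other file that includes the header skips generating it. A sketch with made-up names:

  // foo_list.h
  #include <vector>
  struct Foo { int x; };
  // Tell every includer not to instantiate std::vector<Foo> itself...
  extern template class std::vector<Foo>;

  // foo_list.cc (includes foo_list.h) -- the one place the code is generated
  template class std::vector<Foo>;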


You are focusing only on generics and missing all the module metadata and related module type information.

In module based languages, the symbol table is expected to be stored on the binary, either to be directly used by tools or to generate human readable formats (.mli).

So if one uses the system C linker, it means being constrained to the file format used by such linker.


> Bjarne made the mistake of having C++ rely on C's linker model, so it meant no modules.

> Now we can see it as a big mistake, but on those days probably it was one of the reasons why C++'s adoption took off.

No mistake, just no choice - the original (1986 or so) C++ cfront compiled C++ to C which it fed to the C compiler and linker chain.


> "but on those days probably it was one of the reasons why C++'s adoption took off."

It is a mistake with 2017 eyes, because build times are now insupportable.

Of course it was the right decision in 1986 when trying to get adoption inside AT&T.

Also I have read "Design and Evolution of C++" back when it was published, and know C++ since Turbo C++ 1.0 for MS-DOS, so I grew with the language.

Which is also a reason why I still select it as a member of my Java, .NET and C++ toolbox trio.

However the first ANSI C++ was approved in 1998, and many of us were expecting to get some kind of module support in C++0x.


Modules would be very hard to make work in C++; there's way too much entanglement at all levels. Of course the rot actually started with the ANSI C committee when it introduced typedef and broke context-free parsing. C++ just compounds this kind of problem with template lookup stupidity. It's what you get when languages are designed by people with no understanding of basic computer science.


I used to be a C++ guy for 20 years, but won't go back unless absolutely necessary. I mean, it's tempting -- there are some cool new language features. I find that the abstractions are leaky, though, so you still have to understand all the hairy edge-cases. The language has gone insane. I'm 40 -- I want to get stuff done before I die, not play clever games to get around my language / system.


A bit older here.

C++ has been my next most loved language after Turbo Pascal. Since then I have learned and used countless languages, but C++ was always on the "if you can only pick 5" kind of list.

Since 2006 I am mostly a Java/.NET languages guy, but still keep C++ on that list.

Mostly because I won't use C unless obliged to do so, and all languages intended to be "a better C++" still haven't proved themselves on the type of work we do, thus decreasing our productivity.

Because in spite of Swift, Java, .NET and JavaScript, C++ is the best-supported option in OS vendors' SDKs.

I dream of the day I could have an OpenJDK with AOT compilation to native code with support for value types, or a .NET Native that can target any OS instead of UWP apps.

Until then C++ it is, but only for those little things requiring low level systems code.


> I dream of the day I could have an OpenJDK with AOT compilation to native code with support for value types

Check out http://www.scala-native.org/en/latest/

Maybe it will have value types after Java gets them.


I am aware of it, but when one works in teams at customer sites, we are bound to what IT gives us, on the sanctioned images for externals.

The presentation at Scala days was interesting.


I gave up on C++ 5 years ago, after 10 years of getting paid to develop in it. I found that the extra money that comes from a C++ job doesn't cover the gray hairs of trying to tame the language so you don't shoot yourself in the foot a dozen times every time you call a method.


Do C++ developers earn higher salaries?


Yes, because of the industries where the language is mostly used.

Fintech, HPC, aeronautics, robotics, infotainment,....

The only industry where devs are badly paid is the games industry, but that is common to all languages.


Go is a good example of a language with fast compilation times. Of course, the optimizer needs improvements, but I believe they keep managing to make it faster, rather than slower, as they improve the output.


Go's compilation times are fast, but it also took a significant dive in 1.5, when they rewrote the compiler in Go.

They're slowly improving it to return to pre-1.5 performance, but last I checked, it wasn't there yet. The impact is insignificant on small projects, of course, but easily felt on larger (100Kloc+) ones.

While the recent optimizer improvements are great, my wish is for Go to switch to an architecture that uses LLVM as the backend, in order to leverage that project's considerable optimizer and code generator work. I don't know if this would be possible while at the same time retaining the current compilation speed, however.


The important point about Go in this case is that it's fundamentally more efficient because it has real modules and can do incremental compilation.

Sometimes people don't realize this because they always use `go build` which, as the result of a design flaw, discards the incremental objects. When you use `go install` (or `go build -i`) each subsequent build is super fast.


Huh, really? Why is that? Non-incremental builds shouldn't be needed at all, but besides that, just based on the names I'd expect `go build` to be the cheap one and `go install` to be the expensive one.


It's unfortunately not a well-known feature. The Go extension to VSCode was using "go build" (without "-i") for a long time, and if you're working on something big like Kubernetes, it's almost impossible to work with.

The annoying thing is that "go install" also installs binaries if you run it against a "main" package. I believe the only way to build incrementally for all cases without installing binaries is to use "-o /dev/null" in the main case.


I seriously hope they don't do it.

Go being bootstrapped is a good argument against people who don't believe it is suitable for systems programming.

Depending on C or C++ for the implementation would always give ammunition to arguments that it could not have been done differently.

Also we should not turn our FOSS compilers into a LLVM monoculture.


Wasn't that Wirth's rule for Oberon and/or Pascal? Any optimization introduced has to have a good enough cost/benefit ratio that it makes the compiler faster at compiling itself.


In Go's case, I think they're simply improving a lot of unrelated aspects of the compiler while adding new optimizations. I like the idea of that rule but there are definitely cases where I would want an optimization that could take a long time during compilation but provide an immense benefit later.


Ah yes, I remember that rule, I think it's incredibly clever. I believe he had another rule, language updates can only /strip/ features, so the core language will always get smaller and smaller.

Brilliant


Intuitively, that doesn't seem clever to me. (You're committing yourself to a less efficient, more clumsy language for the benefit of... what exactly?)

How does that law improve the language without falling into the trap mentioned elsewhere in this thread? (Optimizing for a pleasant "compile experience" at the cost of everything else)


Well, the rule is attributed to Niklaus Wirth, whose credits include Modula, Modula-2, Oberon, Oberon-2, and Oberon-7. The -* languages are extensions of their originals, so it seems his rule only applies within a single edition of the language. They can add new things, because they are new languages.

The justification, then, seems to be that if you legitimately need new features, then the language has failed and you should start over anyway. I think Python 3 is sort of an offshoot of this idea, except that many of the new features keep getting backported to 2.7 anyway.


Given that Oberon-7 is a subset of Oberon, reducing it to the essential of a type safe systems programming language, I wouldn't consider it an extension. :)

There is also Active Oberon and Component Pascal, but he wasn't directly involved.


My mistake, I'm not intimately familiar with it. From what I can tell, Modula-2 and Oberon-2 were both extensions, though, to be used as successors to the previous language.


This whole sub-thread has tangented into "C++ is bad" because its compilation causes this problem. Problems with C++ builds are well understood...

This specific issue probably impacts other batch workloads with lots of small tasks (processes). There's no reason this should be happening on a 24-core machine.


I'm not sure if the approach of Lisp and Jai (and also D) is the best one. They all have practically arbitrary code execution at compile time, so they can be arbitrarily slow to compile. In C++, template programming is so hard that few people do it, but in those languages it is just as easy as normal code.

With mainstream languages, code generation is done by the build system which can avoid repetition. Caching generated code feels like a good idea to me. Doing it with compile time execution is (unnecessarily?) hard.


It seems that, in Jai, arbitrary code execution at compile time is exactly the point. The build file itself is just another Jai program. It's easy to wring one's hands about a novice programmer making the computer do unnecessary work, but Blow's philosophy seems to be to trust the programmer to understand the code they're writing. And if they don't, they should probably be using another language.


Compiling Common Lisp code isn't too slow. There are some quick compilers like Clozure CL.

What slows some Common Lisp native code compilers down is more advanced optimization: type inference, type propagation, lots of optimization rules, style checking, code inlining, etc.


Could it be that we are doing the web browser wrong?

I think large parts of Chrome actually belong in the OS. The network parts, the drawing library (skia), the crypto implementation, the window and tab management, and so on.

The javascript engine could be factored out, too, so more apps could benefit from it (without bundling a whole frickin Chromium).

Video and audio would be deferred to DirectShow, Quartz, VLC, Mplayer, ...

Ideally, what remains is just a layout engine and some glue code for the UI. It's the monolithic kernel vs microkernel debate all over again.

Plugins have a bad rep in the context of browsers, but I think this "microkernel browser" where everything is a plugin or OS library can be potentially more secure than the current state, since we can wall off the components between interfaces much better.

I also think it would be much "freer". Browsers like Firefox and Chrome are open-source, but they are free in license only. I can't realistically go ahead and make my own browser. The whole thing is so complex that you have to be Google or Apple or Microsoft to do that. The best I could achieve is a reskin of WebKit. I think that would be different with a more modular browser.


> I think large parts of Chrome actually belong in the OS. The network parts, the drawing library (skia), the crypto implementation, the window and tab management, and so on.

The problem is cross-platform support. Depending on the OS would be the obvious choice if every OS supported the required features.


But all of that is already cross platform.

When I say e.g. skia should be part of the OS, I don't necessarily mean MS should ship it and update it yearly. I mean Google should still ship and auto-update it, but also go through the effort of documenting it, maintaining strictly backwards-compatible APIs, and letting other programs consume it. I don't care who the actual vendor is. I know that's a lot to ask for, but OTOH it is insane to statically link that kind of code. Especially if you have multiple Electron apps that would work fine with a shared runtime.


So create more work for themselves that they don't actually have to do. I'm sure some developers might be willing to do some of that in their spare time, but I think "a lot to ask for" is a serious understatement.


I'd be wary of moving the crypto into the OS, because OS upgrades are few and far between. Browsers are easier to upgrade, as we know from the rather aggressive auto-upgrade cycles of Chrome and Firefox, whereas if you have bad and/or now-known-to-be-insecure crypto in the OS, well, you're stuck with it for the foreseeable future. People are still running Windows XP.


Why can't a library be part of the OS and updated frequently? If it is critical to update, why shouldn't all apps benefit from it? Why can MS update code in Edge frequently, but shouldn't be able to update a .dll as frequently?

Ideally, I'd want critical code (encryption, code signing, bootloaders, kernels, runtimes) to be from a trusted vendor, and preferably simple and open source. I trust the MS, Apple, Google of 2017 not to completely fuck it up. (We already trust them as browser vendors.)

I don't care if they keep calc.exe stable for 10 years, but I expect them to patch crypto.dll immediately. You could do that stealthily, outside of major updates, as it has no user-facing changes.

The benefit of this model is that it allows third-party apps from small vendors to profit from the up-to-date security that only the tech giants can provide.

The downside is of course that it is quite hard to maintain perfect backwards compatibility while pushing updates, but if the components and APIs are small enough I think it is possible.


No, the web browser is all-righty. It has become the universal VM, so it's only natural that it is as big and slow to compile as an OS.


When elinks is compiled with javascript support you must provide an external javascript library.

I really like elinks, it's a shame I can't use it to view blackboard and other sites I have to use...


You are going backwards with the 'Internet Explorer'-like buy-in. That approach would only be faster in one segment; everything else would be slower.

There is no point in speeding up the raw compile time by a couple of minutes if you are increasing the development and testing time by a couple of weeks.


What does IE have to do with this? And I don't think this will increase development time. If anything, it will allow people to innovate faster, since it is easier to contribute.


Well there's the NetSurf project: http://www.netsurf-browser.org


> Damn, are we doing this wrong?

Yes: Not isolating different modules sufficiently to allow you to avoid including most headers when compiling most modules.

Patterns to do this in C++ have been well understood for two decades:

Strict separation of concerns coupled with facades at the boundaries that let all the implementation details of the modules remain hidden.

Yes, it has a cost: you incur extra call overhead across module boundaries and lose inlining across them, so you need to choose how you separate your code carefully. But the end result is so much more pleasant to work with.
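
One common shape for such a facade, sketched with made-up names: the header exposes only an abstract interface plus a factory, so the implementation's own headers never leak into clients, and touching them never recompiles the callers.

  // renderer.h -- the only header clients ever include
  #include <memory>

  class Renderer {
   public:
    virtual ~Renderer() = default;
    virtual void drawFrame() = 0;
  };

  // Defined in renderer.cc, the only file that includes the heavy
  // implementation headers (GPU, fonts, ...).
  std::unique_ptr<Renderer> makeRenderer();

The cost is exactly the one mentioned above: a virtual call and no cross-module inlining at the boundary.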


Unfortunately that's not feasible for highly performance-sensitive projects like browsers or games.


It absolutely is. If most of your time is spent on calls traversing large portions of a code-base that size, then you have a far bigger problem in that you'll be blowing your cache all the time. Fix that problem, and you're halfway there to creating better separated modules that can be encapsulated the way I described.


Can you point to an example where this has been done?


A 21-year-old book is dedicated to this subject:

https://www.amazon.com/Large-Scale-Software-Design-John-Lako...


> Unfortunately that's not feasible for highly performance-sensitive projects like browsers or games.

Bullshit. Code needs to be compiled, but it isn't required to build everything from scratch whenever someone touches a source file.

Additionally, not all code is located in any hot path.


Code needs to be compiled, but it isn't required to build everything from scratch whenever someone touches a source file.

Except for C++, where a tiny change in a single object will require recompiling every file that transitively includes that object's header.


Depends on the change and the thought given to header file dependencies.

PIMPL, forward declarations, pre-compiled headers, binary libraries are all tools to reduce such dependencies.
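
For instance (hypothetical names), a forward declaration is often all a header needs, which keeps edits to logger.h from recompiling everything that merely includes session.h:

  // session.h
  class Logger;  // forward declaration: no #include "logger.h" needed here

  class Session {
   public:
    explicit Session(Logger& log);
    void close();
   private:
    Logger* log_;  // pointers/references don't need the full definition
  };
  // Only session.cc includes "logger.h"; session.h's includers don't.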


I think that only re-linking should be required if you only change source files and not headers. Headers implicitly convey the sizes, inheritance relationships, and other information that dependents need for compilation.

I suppose you could have some extra aggressive optimizations that force inlining, but I haven't seen a need for this, even in game dev.


The stated numbers are for full rebuilds.


This would be the end of the discussion if it weren't for stupidity like this: https://groups.google.com/a/chromium.org/forum/#!msg/chromiu...

Generally, I find when people crow about performance, the product they're talking about usually has some questionable architectural/design/implementation decisions that dominate the performance issues, so I have to do my best not to roll my eyes.

Yes, you can write performant C++ using well-understood compiler firewalls, interfaces, etc that reduce your compile time.


I once cut 30% of page generation times for a commercial CMS in half a day by just skimming through their output generation code and changing std::string method invocations to get rid of unnecessary temporaries.

People very rarely have any clue about this at all.


Agreed - though I'd say that performance-sensitivity is more a function of the number of users than the application domain.

If a few hundred or few thousand people each have to build Chrome from scratch a couple times, and making their compilation process much slower makes each of a trillion pageviews a millisecond faster...the break-even point seems to be about a 27 hour build time sacrifice.


Chrome has a staggering amount of C++ code. It's not all heavily hand-optimized. Probably very little of it is optimized at all, or needs to be.

They're relying on the compiler working its magic to make non-hand-optimized code run pretty fast. That's fine, but it requires you to expose a lot of stuff in headers and that slows down compilation.

I'm fairly sure, like other commenters, that they could speed up compilation a lot and impact performance very little by carefully modularizing their header files. But that's a really big job.


The 90/10 rule still holds. Sometimes it's even more skewed.


Even Knuth talked about it being a 97:3 rule[1] and according to some it's gotten more skewed since then[2]

[1] http://sbel.wisc.edu/Courses/ME964/Literature/knuthProgrammi...

[2] http://blog.cr.yp.to/20150314-optimizing.html


While what you're saying might be true, I can't help but think about Pascal units and say: "It didn't have to be this hard! We solved this problem 40 years ago!"


The heir to the Pascal/Delphi kingdom seems to be Nim[0], though it takes its syntax from Python.

Compilation is impressively quick, even though it goes through C.

[0] https://nim-lang.org/


Which is a language I absolutely adore, and am building a homomorphic encryption based product leveraging Hyperledger on a rather obscure 160 hardware thread POWER8 server; it can definitely work for real production tasks today, even if some parts are rough (and hell, C++ can be rough too). Parallel compilation on this machine is stupidly quick :)


Any pointers to what power hardware you are running?


You don't need pointers - just follow the fan noise from wherever you are.


Sorry, it's the same in Delphi (Pascal's successor). The compilation time goes up exponentially as the number of units goes up. Compilation of 2,000 files (750,000 lines) takes about 20 minutes.


With the caveat that one doesn't need to do "make world" every time a few files change.


Yes there are plenty of other solutions if you use other languages, but I decided to constrain myself to how you'd address it in C++


And then Modula-2 came along and... well, it was mostly the same. But the compactness of Pascal output left fond memories.


Wasn't this one of the reasons why the Go project started at Google? Not necessarily for Chrome, but because C++ compile speeds were horrendous for some internal projects.

Not saying Chrome could and should just switch to Go; it definitely would not be the right fit! But it's interesting that these sorts of builds still occur and consume a lot of developers' time.


Yes, C++ build times were a significant factor in Go's creation. Rob Pike describes it here:

https://talks.golang.org/2012/splash.article

He covers all the same points about header files and how Go addresses those issues.


> Damn, are we doing this wrong?

Give me a tool that's: (i) as fast, (ii) as mature and well supported, (iii) as powerful as C++ and I will switch in a heartbeat. But until there is such an alternative it's futile to complain about the shortcomings of C++, because if you want the powerful, zero-cost abstractions, the mountains of support, and access to billions of existing lines of code, you pretty much have nowhere else to go.


So: a C++ competitor must beat it in every dimension you've chosen; until then it's futile to complain about the shortcomings of C++. This doesn't sound very logical, only like a case of sampling bias.


Unfortunately some projects have an "everything" requirement. That is to say, the software must be fast and written in a way that interfaces close to the metal. We need to do a lot of parallel processing. Now it's C++ or Rust. Then we need a GUI and CUDA, so we're down to C++. That's why the project uses C++.


It's the same old C++ rhetoric: "only C++ can do it". Before that, only C could do it, and before that, only assembly could do it.

Reasoning starting from conclusions to lead to initial constraints is backwards reasoning. For example you don't talk about maintenance or productivity, and yet you end up making a choice without factoring this. Chances are, the choice in most codebases is made because of existing code and culture, not because of rational reasons.


No, it's in there. For example, a similar OSS package called MicroManager is a veritable cluster duck, with half the code base dedicated to interfacing between C++ and Java. It doesn't hit the performance spec. The real problem I've had with C++ is finding devs: typically a senior C++ software engineer at $130k vs a junior Python dev at $70k.

But from the engineering side it's the only "everything" language. (There aren't any good GUI kits for C, and NVCC is C++)


Well, it's true that there aren't good UI toolkits in D either (let's say, as good as Qt). For me it works as the "everything" language; I also wrote CUDA bindings once (obviously that wouldn't work with mixed host/GPU code, which I hope no one really uses).


> But until there is such an alternative it's futile to complain about the shortcomings of C++

Then how does C++ improve?


Slowly but rather steadily like it has been doing so far.

We are likely getting modules (and reflection) with the next iteration (C++20), which -- if it moves like the last two versions -- will be almost completed and already supported by GCC, VS and Clang in two years. Clang and VS2015 even support modules experimentally already.


> Then how does C++ improve?

It keeps adopting D features.


These numbers don't seem abnormal; I recall building Safari many years ago, and having multiple GB of intermediate products shrink down to a 30MB executable (plus a few hundred MB of debug symbols).

So, I have a thought: if we're spending all this time to compile functions (particularly template functions) that are just thrown away later, why are we performing all our optimization passes up-front? Surely, optimization passes in a project like Chrome must eat up a lot of compilation cycles, and if that's literally wasted, why do it in the first place? Can we have a prelink step where we figure out which symbols will eventually make it, and feed that backwards into the compiler?

Maybe a more efficient general approach might be to simply have the optimizer be a thing that runs after the linker, so that the front-end compiler just tries to translate C++ into some intermediate representation as fast as possible. The linker can do the LTO thing, then split the output into multiple chunks for parallelization, and finally merge the optimized chunks back together. With LLVM, it feels like the bitcode makes this a possible compilation approach...


> These numbers don't seem abnormal;

Hmm...not "abnormal" in the sense that we've gotten used to it: yes. Heck, last I heard building OneNote for Mac takes about half a day on a powerful MacPro.

But I'd say definitely abnormal in terms of how things should be.


Doesn't link-time code generation already exist (at least on MSVC, I think)? It makes the linking step so much more expensive, though, which sucks for incremental builds.


Ah, yes, I knew I had to be missing something! Needing to support incremental builds is what makes C++ compilation so frustrating. I wonder why this is so difficult though - dependency tracking should be a thing that can carry through the link stage. Just track what files a given function depends on, and only recompile/recodegen functions that have changed. Of course, it still sucks massively if you change a header file, but that's why you separate the header from the implementation of the methods :)

/LTCG also doesn't seem to parallelize well - last I checked it still ran all the codegen on one core. Maybe that's different now?


LTCG as of VS2015 is an incremental process when it can be, which did wonders for build times.


I think that, especially in templated code, you only know which part of code can be thrown away because of the optimization.


Yes, you're doing it wrong.

The trick is you've got to reduce your "saturation level" of #includes in header files, by preferring forward declarations over #includes, and using the PIMPL pattern to move your classes' implementations into isolated files, so that transitive dependencies of dependencies don't all get recursively #included in.
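
A minimal PIMPL sketch (hypothetical class), where the header carries no implementation details at all:

  // parser.h
  #include <memory>

  class Parser {
   public:
    Parser();               // ctor and dtor are defined in parser.cpp,
    ~Parser();              // where Impl is a complete type
    void parse(const char* text);
   private:
    struct Impl;            // all data members and heavy #includes live in
    std::unique_ptr<Impl> impl_;  // parser.cpp, invisible to includers
  };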

When it comes to templates, one has to be very aggressive in asking "Does this (sub-part) really have to be in template code, or can we factor this code out?" Any time I write my own template classes, I separate things between a base class that is not a template, and make the template class derived from it. Any computation which does not explicitly depend on the type parameter, or which can be implemented by the non-template code if the template just overrides a few protected virtual functions to carry out the details, gets moved to the non-template base class.
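
That split looks roughly like this (a made-up example): the type-independent bookkeeping compiles once in a .cpp file, and the header-only template layer stays thin.

  #include <cstddef>  // size_t
  #include <new>      // placement new

  // ring_buffer_base.h / .cpp -- not a template; compiled exactly once
  class RingBufferBase {
   protected:
    RingBufferBase(void* storage, size_t elem_size, size_t capacity);
    void* slotForPush();   // index math, wrap-around, bookkeeping...
    void* slotForPop();    // ...bodies live in the .cpp file
  };

  // ring_buffer.h -- the template is only a thin, type-aware veneer
  template <typename T, size_t N>
  class RingBuffer : RingBufferBase {
   public:
    RingBuffer() : RingBufferBase(storage_, sizeof(T), N) {}
    void push(const T& v) { new (slotForPush()) T(v); }
    T pop() { T* p = static_cast<T*>(slotForPop()); T v = *p; p->~T(); return v; }
   private:
    alignas(T) unsigned char storage_[sizeof(T) * N];
  };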

If your problem is not with template classes which you have written, but with templates from a library, consider that in most (all?) cases there is still some "root" location (in your code) which is binding these templates to these user-types. This root location will either itself be a (user-written) template class, or it is an ordinary class which "knows" both the template and the bound-type(s). Both of these cases can be dealt with either by separating it into non-template base and derived template, or using the PIMPL idiom, or both.

The general principle is that what you allow in your headers should be the lower bound of the information needed to specify the system. Unfortunately this takes active work and vigilance to maintain, and a C++ programmer is not going to understand the need for it until they reach the point of 30-minute builds and 1.4GB of .o files.


I've found that I generally regret doing this kind of thing to the extent that you need to do it to make a meaningful difference. The problem is that all this stuff comes at a cost -- my source code is no longer structured in a semantically meaningful way.

The SICP quote comes to mind here: "Programs must be written for people to read, and only incidentally for machines to execute." I greatly prefer to have my code organized in a sensible way. I want to know that "here is where the FooWidget code is".

It's not the end of the world, and people can adjust, but part of what I hate about working on just about anyone's Java code is this constant mental assault of "no, you need to be in the FooWidgetFactoryImpl file to find that code". Just let me have "customer.cpp" or whatever, and I'll live with grabbing coffee during the build.

Admittedly, I don't work on truly large applications. I can imagine priorities change when builds take two hours instead of the 15 minutes I might have to live with.


> Yes, you're doing it wrong.

Your comment is great but I have spent enough time working on Chromium to know that they have people working on the build who know all of this stuff and much more. They understand the build from the top to the bottom of the toolchain stack. (@evmar used to be one of these people and he actually commented in this thread at https://news.ycombinator.com/item?id=14736611.) I am sure your parent commenter is a great developer but I get the impression he/she is not one of the Chromium build people.


It'd be crazy to think about the energy spent over time on building just Chrome - turn that into a carbon footprint and it'd be shocking! I always try to write code for humans and not over-engineer or prematurely optimize, but sometimes I wonder about a section of code over the long term - that bit of JavaScript that's going to run on lots of phones and other devices over years or decades - and how much more electricity it's going to cost just because I used `map` instead of a `for` loop... I can't imagine working with a build process that takes that long to run.


Aspects of the Unix philosophy where "you're done when there's nothing left to remove" need some revival.

http://www.catb.org/esr/writings/taoup/html/ch04s02.html#com...

http://suckless.org


> a from-scratch build of Chrome will take at least 30 minutes on a

That's all?

I once worked for a rather big accounting software company, and the full build of their accounting product took about 4 to 5 hours to complete on the build server.

We ran the build at the close of business every day, and the build engineer had to log in remotely just to make sure the build worked; otherwise the QA team would have nothing to test in the morning.

It too was written using the C++ language.


C was ok. Not good, but ok.

Then we tried to attach the OO paradigm to it and we got the monstrosity that is C++ (and, as a consequence of that, Java - which has fixed some issues but still suffers).

And don't get me started on templates

I'm so glad that paradigm is starting to die out, and hopefully Rust, Go and others will take over (I still haven't got my head around their object models, but I will eventually).


C has the same problem because it relies on includes and does not have modules.


Yes; however, that was acceptable at the scale C was used at, and given its origins. Not good, but acceptable.

Pascal could have been a better choice (sigh)

C without typedefs also compiles faster

C++ is like bolting an engine onto a skateboard to make it go faster.


Sure, but it still compiles much faster than C++, and that's due to language differences.


And luckily Java is now also trying to fix modules. Maybe, someday, those concepts will arrive over in C++ land, too.


They've been working on modules for C++ for a long time. It's coming.


The modules proposal for C++ has been around a long time. We need it now more than ever. Projects are getting bigger and the C++ culture is evolving from runtime polymorphism to compile time polymorphism. Text inclusion is just not good enough.
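
For reference, the syntax in the proposal looks roughly like this (names are placeholders; file extensions vary by compiler):

  // math.cppm -- a module interface unit
  export module math;

  export int square(int x) { return x * x; }

  // main.cpp -- consumers import the module instead of textually including a header
  import math;

  int main() { return square(7); }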


There are a million responses to your comment, but unless I missed something, no one seems to be redirecting you back to the actual problem at hand and are recommending all sorts of generic ways to reduce the time required to compile a large C++ project... but the issue here was not the generic one: the entire thing that made this article interesting is that his CPU was not being used at all, so this has nothing to do with "a Chrome build is truly a computational load to be reckoned with". He even goes out of his way to show his CPU load graphs so we can see multiple seconds in a row where his computer is 98% idle and yet he still can't even move his mouse.

Your comment, and essentially every other single one on this entire thread thereby makes me wonder if anyone on this subthread read the article :/.

OK, I decided to search for the word "process", and found one person responding to you who did read the article, and depressingly only a handful of people even responding to the top-level article who apparently read the article. This entire post is such a great example of "the problem with this kind of discussion forum" :/.

https://news.ycombinator.com/item?id=14735977

REGARDLESS...

What was described in this article wasn't "Chrome's build is too slow", it was "there is a weird issue in Windows 10 (which apparently wasn't even a problem with Windows 7, and so we could easily argue is a regression) where process destruction takes a global lock on something that is seemingly shared with basic things like UI message passing". The fact that he was running a Chrome build to demonstrate how this manages to occasionally more than decimate the processing power of his computer was just a random example that this user ran into: it could have been any task doing anything that involved spawning a lot of processes, and the story would have been exactly the same.

Now, that said, if you want to redirect this to "what can the Chrome team do to mitigate this issue", and you want the answer to not be "please please lean on Microsoft to do something about this weird lock regression in Windows 10 so as to improve the process spawn parallelism for every project, not just compiling Chrome"... well, "sure", we can say you are "doing this wrong", and it is arguably even a "trivial fix"!

Right now, the C++ compiler pipeline tends to spawn at least one (if not more than one) process per translation unit. If gcc or clang (I'm not sure which one would be easier for this; I'm going to be honest and say "probably clang" even though it feels like a punch in the gut) were to be refactored into a build server and then the build manager (make or cmake or ninja or whatever it is Google decided to write this week) connected to a pool of those to queue builds, you would work around this issue and apparently get a noticeable speed up in Chrome compiles on Windows 10, due to the existence of this process destruction lock.

One could even imagine ninja just building clang into itself and then running clang directly in-process on a large number of threads (rather than a large number of processes), and so there would only be a single giant process that did the entire build, end-to-end. That would probably bring a bunch of other performance advantages to bear as well, and is probably a weekend-long project to get to a proof-of-concept stage for an intern, come to think of it... you should get on it! ;P
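
Very roughly, the shape of that idea: one long-lived process draining a queue of compile jobs on worker threads, instead of spawning (and destroying) a process per translation unit. compile_translation_unit() below is a hypothetical stand-in for an in-process compiler entry point:

  #include <mutex>
  #include <queue>
  #include <string>
  #include <thread>
  #include <vector>

  // Hypothetical: would wrap an in-process clang driver for one source file.
  bool compile_translation_unit(const std::string& source_file);

  void run_build(std::queue<std::string> jobs, unsigned workers) {
    std::mutex m;
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < workers; ++i) {
      pool.emplace_back([&] {
        for (;;) {
          std::string job;
          {
            std::lock_guard<std::mutex> lock(m);
            if (jobs.empty()) return;      // no more work: thread exits, no process teardown
            job = std::move(jobs.front());
            jobs.pop();
          }
          compile_translation_unit(job);   // all codegen stays inside this one process
        }
      });
    }
    for (auto& t : pool) t.join();
  }

Whether that's actually practical for a real compiler (global state, crash isolation, etc.) is another question, but it shows how the process-destruction lock drops out of the picture entirely.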


You're right. I didn't read the whole article. I skimmed it. But I did actually pick up on the locking problem and see the graphs with huge amounts of idle time. I knew this comment was only marginally related to the actual article. So, :-/

However, I suspect that as soon as that lock regression in Windows is fixed, that monster CPU load is coming back home to roost, and the workstation is going to be just as dead as my 64-core Linux workstation has been when I've actually run '-j 500' without gomacc up and running correctly.

So, by all means, Microsoft should fix this lock regression.

But there's this...elephant...in...this room here.


> Your comment, and essentially every other single one on this entire thread thereby makes me wonder if anyone on this subthread read the article :/.

Just let people talk about what they want to talk about, maybe? The main problem in the article is interesting but far less actionable than the overall situation of slow compilation.

Do you want things like rampant speculation and insulting windows 10? Do you expect everyone to pull out kernel debuggers to be able to make directly relevant comments? It's okay to talk about a related issue. Concluding that they didn't read the article is kind of insulting.


However, they have just merged a feature called jumbo which squashes lots of compilation units together. The guy who developed it (Daniel Bratell from Opera) reckons an improvement of 70%-95% in per-file compilation time[1]. But, it's only for the core of Blink right now.

[1] https://groups.google.com/a/chromium.org/forum/#!searchin/ch...


> TBH I don't remember toughing out a full build without resorting to goma.

I have! Every time you guys bump a snapshot, my Gentoo boxes whirl away and heat my house, compiling a new version from scratch. On an octocore Skylake Xeon laptop, this takes 2 hours 48 minutes.


Makes me wonder about a distributed compilation. If everyone had a monster workstation but not everyone compiles at the same time, a theoretical networked compiler (which may already exist as I haven't really checked) could spread the files out among available workstations and bring things back together near the end.

As the largest issue is the throwing away of duplicate work, I'd see it as a kind of reverse binary tree: machines working on files that depend on others talk together, then when finished send the condensed work up the chain (and signal their availability for the next workload chunk or phase) until everything collapses down back to the original machine.


https://wiki.archlinux.org/index.php/Distcc

I used that 10+ years ago on Gentoo, and never saw anyone using it since. Don't know how often is used now days.


I used it ~3 years ago on Funtoo; there's definitely still people that use it!



The best solution today for distributed build is: http://www.fastbuild.org/docs/home.html

It is faster than IncrediBuild, even faster than SN-DBS and has multi-platform support, the only problem is that it requires its own build script.


OK, I see distcc helping people with computer labs and server farms, but I thought mysterydip wanted to help the Gentoo community and others like it - i.e. peer-to-peer build sharing.

So what kind of cryptographic guarantees would you need for that? And if you can only verify the build results by trusting signatures from on high, then what is the point? Perhaps those builds could be turned into work in a proof-of-work blockchain. Do compilers contain any hard-to-do but easy-to-verify steps?

Whole shelves full of useless PhD theses are just waiting to be written on this topic.


Isn't this what IncrediBuild does on Windows?


It is. Although, as far as I'm informed (having last used it in 2013 or so), even with that approach, you'll likely have problems linking all the output in the end if your project is large enough.

Of course, that's another separate problem to begin with. I still remember dabbling with D and vibe.d and replacing the default GNU linker with ld.gold because over 90% of the build time was due to the linker...


"Concatenated" builds seem to be the best band-aid for this. Concatenate as many source files as possible before compiling, #include a bunch of cpp files into one big file. It makes tracking down errors slightly harder and macros a bit more risky, but greatly improves the overall build efficiency.


And trashes incremental compilation time, which is what really matters.


It's a balance, but on many projects the overhead of the headers themselves is so large that concatenating a few .cpp files together doesn't increase incremental compilation time significantly over simply building each .cpp file in isolation.


Incremental compilation opportunities are rarer than we'd like on C++: as soon as you add a new class member or function, that's potentially a huge recompile.


You can leave the files you're actively working on out of the concatenation.


That's painfully manual.


It doesn't need to be. I think it shouldn't be hard to make a build script that re-makes the bundles except for files modified in the last hour.


Also called unity build sometimes.


> How did we get here? Well, C++ and its stupid O(n^2) compilation complexity [...]

Wasn't large compilation time a driving force behind coming up with Go? Is a garbage-collected language not suitable for a web browser? I am just curious because I absolutely love writing Go


GC performance is a trade-off between throughput and pause time. There are tricks you could play, with each tab in its own heap which is entirely discarded on page change, but I think it would take something expensive like Azul's collector to really work.


> GC performance is a trade between throughput and pause time.

Go has sub-millisecond GC pauses, and even then it minimizes the need for stop-the-world pauses (previous HN discussion: https://news.ycombinator.com/item?id=12821586). I think it would be a very interesting exercise to take a crack at it. If anyone is interested, let me know.


> Go has sub-millisecond GC pauses

At the cost of throughput:

> Go optimises for pause times as the expense of throughput to such an extent that it seems willing to slow down your program by almost any amount in order to get even just slightly faster pauses. - https://blog.plan99.net/modern-garbage-collection-911ef4f8bd...


> Wasn't large compilation time driving forces behind writing Go

Yes, that was one of the tenets.

> Is a garbage-collected language not suitable for a web browser?

In theory it's fine, but there's a lot of historical baggage that comes along with garbage collection and the majority of languages that support it (e.g. almost no value types).

Golang fares pretty well latency-wise for a GC'd language. I'd be curious for someone with more experience than me to do a deep dive into instances where Go's latency/throughput characteristics are and are not good enough for specific applications.


We have Java to blame for it.

With the exception of Smalltalk and its derivatives, all the GC enabled languages that came before Java had value types, even Lisp.


> Damn, are we doing this wrong?

Not using modules? Yeah, I know C++ made the mistake of not having them from the beginning, and it is a long road until they are here (202x?).

However Google was showing their modules work at CppCon 2016, so I guess Chrome does not make use of clang modules.


Yes, we are doing this wrong! But we don't have any right options yet either. I think another way of thinking about this is that if we can't imagine a better system, then the field is so mature that it has become solved and completely boring. I think systems programming is still a field for which there are new and novel problems to solve, or at least better designs to implement.


Wasn't this issue basically a driving force for Google creating Go? Or at least, a major design goal for Go was to get rid of O(n^2) compilation?


Yes, as per Rob Pike’s comments on the subject: https://commandcenter.blogspot.com/2012/06/less-is-exponenti...


However, to actually have the C++ compiler do inlining at compile time (LTO be damned), we have to put the definitions of inline functions into header files, which greatly increases their size and processing time.

I can't think of a component in a browser that would require inlined function calls in order to be performant. To really matter it would have to be many millions of calls per second.

Moreover, because the C++ compiler needs to see full class definitions to, e.g., know the size of an object and its inheritance relationships, we have to put the main meat of every class definition into a header file!

So let me suggest that you revisit those inlined function calls again. Once you start putting them into proper .cpp files and make use of forward declarations where possible the whole header dependency graph will probably simplify quite a bit.
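
e.g., a tiny before/after sketch of what that buys you (types invented for illustration):

  // Before: document.h drags the whole renderer in just to declare a parameter.
  #include "renderer.h"
  class Document {
   public:
    void Paint(Renderer& r) { r.Fill(); /* inline body forces recompiles */ }
  };

  // After: forward-declare the type and move the body into document.cpp.
  class Renderer;   // forward declaration -- no #include needed in the header
  class Document {
   public:
    void Paint(Renderer& r);   // definition (and the #include) live in document.cpp
  };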

Don't even get me started on templates.

Right. Don't use them unless there's no other way. Especially avoid them for large-ish classes that do complicated stuff. If you can spare a couple of CPU cycles (and most of the time you can), determine things at run time instead of compile time.

Of course all of this is theory, not taking into account practical matters like deadlines, code readability or developer turnover.

Full disclosure: I worked at Google, but not on the Chrome team :)


Hi, could you guys please fix the white flashing bug Chrome has had for almost 10 years? I think it might be the longest-open unsolved bug I know of in any IT project.

https://support.google.com/chrome/forum/AAAAP1KN0B0Rmd8IyUjG...


The important question is: how long does an incremental build take? The time to perform a full clean build is important but clean builds are rarely a part of my development flow. In my experience, efforts to reduce build time focus first on the incremental case, and rightly so.


I built Mozilla on my PowerBook G4 and it was still going 24 hours later :D Ah, good memories.


I had that experience with KDE, back in SuSE 6.3.


Have you given dlang a look? Or even Rust?


Rust's compile times are even worse.


That's only because they just started working on incremental compilation, it will get faster soon enough.


I've heard the same thing a year ago. It didn't happen. See this 2016 roadmap from August 2015 for example: https://blog.rust-lang.org/2015/08/14/Next-year.html


Incremental compilation is something that no major compiler for any ahead-of-time compiled language anywhere does. It's one of the most advanced features in any compiler, and as such it's taking time to implement. No C++ compiler I know of is even thinking of it.

As the first post here in this thread mentions, going down the C++ road of header files might have gotten us some short term wins, but ultimately it hits a brick wall. Incremental compilation is inescapable.


I surely do consider Visual C++ a major compiler for any ahead-of-time compiled language.

It does incremental compilation and incremental linking.

I would be quite happy if cargo was half as fast as my UWP projects.


A lot of things did happen. One of them is having the option to check the source, which is much faster than compiling. 99% of the time, you're compiling code in Rust just to run the borrow checker and fix those errors.


That's indeed something where Rust is a lot better than C++. But for many projects (like games or GUI apps) you'll need to iterate with changes which can only be seen in the final product. So you'll need a complete build to run the binary.


Fair enough. However games and GUI apps are still a small percentage of applications developed in rust. I want that to change, of course. I just wanted to point that out.

I'm personally writing a game in Rust but the main logic is written in a compile-to-JS language and uses V8, so the issue doesn't affect me.


The basics have happened, and you can use it right now. It's not on by default yet though. Still a ton of work being done.


Yeah, the last time I tried it (a few months ago), with incremental compilation the full rebuild was a lot slower, and touching a single .rs file was only about 10% faster. I've already found a few issues and it seems you guys are working on it :)

But as of now: C++ incremental build times (with the right build system) are a lot better than Rust's.


They seem to be worse, but does anyone have information on how it scales?


Heh. I was reading about the origins of Golang and how Rob Pike and Robert Griesemer were waiting 3 hours or something for a C++ build to finish. And with that much time available - might as well make a brand new language that has compilation speed as a first-class feature.

Are we doing it wrong? In the case of C++ - yes, absolutely, 100% certainly, wrong. I'm not suggesting that Chrome would be better off under Golang - I'm just saddened that 30 years after the C++ abomination was born in Stroustrup's head, nothing else has come out to challenge it.

Maybe Rust?


I prefer that C++ abomination to the unsafe-by-design C.

Any C++ replacement needs the love of Apple, Microsoft, IBM, HP, Google and everyone else that sells operating systems.


With crap and hassle like this in the year 2017 for something as "simple" as a web browser, be sure to remind your boss Ray Kurzweil at Google, "the singularity is near" ROFLOL


At the risk of looking like I take the "Singularity" at all seriously... (and putting aside the non-sequitur about Kurzweil having anything to do with Google's shipping software)...

The implication you're making is that because computers have weird little glitches that pop up to cause havoc every once in a while, then it must be laughable to imagine they could rival the marvels of human intelligence. What that tells me for certain is that you haven't paid much attention to human intelligence.

There are flat-earthers, anti-vaccine nuts, and people convinced we faked the moon landings. We'll happily argue for thousands of years over whether the kid a virgin had is or is not the same person as his dad, but not that dad, the other dad. Show me someone who intuitively understands probabilities, and I'll show you someone who incorrectly assesses how people understand probabilities. I'm pretty sure the odd bug here and there doesn't disprove the ability of machines to outthink us.


> Damn, are we doing this wrong?

SUPER-wrong.

With all the time loss, money loss, and opportunity loss, C/C++/JS should have been ruled out as the worst tools for the job decades ago by ANY RATIONAL ENGINEER.

And instead of band-aiding them, the sane path is to freeze them, only apply critical patches, and use something else. Other options are already known, have been proven, and could have been a better foundation, except that developers are irrational as hell and don't want true progress at all.


And why should any other programming language be better?


Because there are some obvious flaws in these languages that could be solved for good.

For example, Pascal is way faster to compile than C. Most Pascal variants are like this, yet still provide the ability to do low-level stuff.

---

The main thing is that, for some reason, we pretend it is not possible to improve a language, or that it is sacrilege to remove problematic features or leave some things behind.

Programming languages are like any other software. They have UX problems, and those can be solved. So why not?

The mind-limiting answer is that breaking the language is costly, and programmers are so invested that expecting them to never again check for null across millions of lines of code is too traumatic. Or maybe adding namespaces to C++ is too hard... but templates are OK.


> 30 minutes on a Macbook Pro--maybe an hour

Not bad. FreeBSD takes about the same, and it's plain C… oh wait, no, a large part of the compile time is LLVM/clang which is… not C.

Though there is now a clever incremental sort of thing called meta mode https://www.bsdcan.org/2014/schedule/attachments/267_freebsd...


> A Chrome build is truly a computational load to be reckoned with. Without the distributed build, a from-scratch build of Chrome will take at least 30 minutes on a Macbook Pro--maybe an hour

As someone who regularly works with multi-hour builds (on build servers), this IMO sounds like the kind of light build I wish I worked with ;)

That said, this article was about a weird behaviour in Windows when destroying processes, not about heavy builds.


Heh, good to hear Google feels this pain, too. I work on the chromium codebase, and a 1 hour clean compile on a macbook pro seems optimistic.

Luckily, ninja and ccache do a good job of ensuring you only ever do that once (and rsync solved that problem for us). Not that a 20-second compile for a one-line change is something I should be content with, but it's certainly workable.


I would like to see a more systematic solution to this problem. Most computers don't have a way to choose between throughput and real-time. So, I would love to see this available and preferably exposed as a run-time toggle.


Have you worked with Firefox? How does the build compare?


AFAIR, the full build was also around 30 minutes on a desktop i7. Unfortunately, the incremental build (even for the tiniest changes) was around 2-3 minutes, which was a real killer for me when I tried to do some open source work on it.


There are tricks to make it faster. If you find yourself working on the project again, ask for advice on IRC. 2-3 minutes is longer than it should take in most cases. (As long as you aren't touching widely included header files, that is.)


yup. 30ish for a full compile, less with ccache and other tricks. Rebuilds 2-3 minutes.


Firefox is C and Rust, so I’m not sure there’s such a direct comparison.


C and Rust? Firefox looks like a ton of C++ to me: https://github.com/mozilla/gecko-dev


Primarily C++, but Rust is there as well. However, it's mostly libraries which were extracted from Gecko itself, so it's dependencies, not the main codebase.


I stand completely corrected, don’t know why I had that in my head.


The vast, vast majority of the Firefox codebase is C++.


You're thinking of Servo - https://servo.org/


Firefox now includes many rust components.


I understand that very well; the project I currently work on has 6.2GB of object files (per platform) - imagine that. Chrome is not that huge. :)


What project is that? I am very curious.


> . . . and try to create encapsulation with getter/setters.

That violates encapsulation. Getters should be rare and setters almost non-existent.


Move to a language with a real module system?


The obvious solution is to change the c++ specification so that you can modify an inline function without requiring a recompile of every file where it's included


Try Dlang.


Just use zapcc, and compile times will go down. It's basically clang 4.0, so it's safe to use.


If it's basically clang 4.0, why is it not part of clang? Projects like zapcc remind me of why the GPL is a good thing.

(I predict that in 10 years all the various soft cores will be variants of RISC-V each with its own old, unmaintained and proprietary fork of LLVM).


If the times claimed by the developers are true, then it just tells me that BSD was the correct choice of license, as otherwise zapcc might not have happened. A world with a lot of high quality software, some of it open source and some of it closed source, is better than that same world minus the closed source.


I believe that if GNU/Linux hadn't happened, we would still be working on AIX, Solaris, HP-UX, SGI, Tru64, ....

And eventually some guys would be playing with a BSD variant.


Are you sure C++ is the only thing to blame, and not how you use it?


Is there any project that compiles significantly faster than Chrome while being in the same league in terms of LOC? I'd like to know if you have seen such an accomplishment even if its source is closed.


Is this the C++ version of "you're holding it wrong"?


That's just C++. Undefined behavior is no joke, and it's exceptionally hard to avoid all of it.


it's the c++ version of "why are you still using c++?"


In my case, because sometimes Java or .NET still needs a little help from it.

Ideally, with the features planned for Java 10 or C# 8, that will no longer be the case, but until then we have quite a few years.


what do you think would be a better choice than c++ at this point? clearly it hasn't helped with memory consumption. is mozilla getting it right-ish with rust?

also, would it kill you to like, coordinate something with the gmail team so their page doesn't kill my machine after being open for a couple of hours?

come on guys.


I grew up on the Commodore 64 (1 Core, 1 hyper-thread :-), almost 1 MHz clock freq, almost 64 K usable RAM).

The machine was usually pretty responsive, but when I typed too quickly in my word processor it sometimes got stuck and ate a few characters. I used to think: "If computers were only fast enough so I could type without interruption...". If you'd asked me back then for a top ten list what I wished computers could do, this would certainly have been on the list.

Now, 30 years later, whenever my cursor gets stuck I like to think:

"If computers were only fast enough so I could type without interruption..."


Similar in sentiment to "I have always wished for my computer to be as easy to use as my telephone; my wish has come true because I can no longer figure out how to use my telephone."


I'll definitely steal this!


Bjarne Stroustrup said that[1].

[1]: https://en.wikiquote.org/wiki/Bjarne_Stroustrup


The best part is he said it in around 1990.


The worst is when this happens when you're not doing anything that should be computationally intensive, just entering text in a web app that abuses JavaScript.


Even just entering text into the search bar uses loads of resources. Every character entered does a full search of your history and bookmarks, sends a request to Google to do autocomplete, prefetches the web page you are typing, applies spellcheck, etc.


Sure, but I for one have yet to experience input lag typing into a URL bar when my computer wasn't otherwise under extremely heavy load.


I have almost the same background (VIC-20 - if you thought programming was hard, try having only 3.5K of RAM) and feel the same irritation... but I don't wonder why computers aren't faster, I wonder why operating systems and applications are so utterly shit.

I have a Moto X phone (not too old) and watching it slow down is almost comical. Sometimes I'll decide to close a few apps and hitting the selector button brings up screenshots of all the open apps. Then I have to wait a few seconds for the X to appear on the app frames so I can close them.

If I had the inclination and temperament to be one of those YouTube personalities who make a schtick out of complaining about things, I would have no shortage of material for a 'designed by idiots' channel.


I think most of the shit is concentrated on a couple of notable platforms: Windows and Android. Outside of that space I've noticed less productivity hampering nightmare tools and OS features. Really I get perhaps one or two bits of stupid from Linux a year on the server-side, usually when integrating it with Windows ironically, and on the macOS/iOS front I haven't had a notable issue since I switched about a year ago.


I've been using Linux for my daily work for more than ten years now and have developed for macOS since around 2000, and I honestly cannot confirm this. If you have a fast and well-tuned machine, the sluggishness of modern applications might not be so noticeable, but it surely is there, and then there are also many usability issues with desktop software on Linux. Not to speak of browser-based applications, which mostly have unusable user interfaces anyway. For macOS, usability is still high, but the multithreading and API layering in Cocoa have always felt sluggish to me. I no longer use Mail.app but remember it as a particularly bad example.

I agree with the OP: for actual use, computers can do more powerful things than they used to be able to, no doubt about that, but programs and operating systems continue to feel slow and clumsy. Android in particular, but in the end all operating systems.

Bloated GUI frameworks, use of unoptimized images in GUIs, and non-optimal multicore programming are to blame, I guess.


Good luck on improving the situation as long as you have people running around who consider this all fine and normal.

https://ptrthomas.wordpress.com/2006/06/06/java-call-stack-f...


Idiomatic Java is idiotic.

Most of the hate for Java that I see is really hate for the idioms. The nice thing about idioms is that you don't have to follow them. But for whatever reason, Java devs stick to them. And that's how you get monstrosities like that stack trace and FizzBuzz Enterprise Edition.


Looking through that graph like "you know what we need? More abstractions!"


This actually annoys me badly. One of the problems I see regularly is applications that fail to log enough of the stack to see the entry point of the thing that actually went wrong, because the syslog packet size is set to 512 bytes. The problem is clearly syslog then, not the 12KiB of stack your app throws when something goes pop!?!?


Um, yeah, the problem is someone setting an arbitrary limit because it was easier to implement. To argue otherwise is basically to claim that there is no possible justification for a deep stack, which, well, good luck proving that. Anyway, even if I don't like all that abstraction (I don't), I may not have a choice in platform (or logging system), so blaming useless syslog messages on the app developer is adding insult to injury. Not that blame is terribly useful when the best course of action is to just burn the whole thing down and start over. :)


The arbitrary limit is a performance thing. Fits neatly in one UDP datagram and passes through everything unhampered by MTU etc.

Agree that burning the whole thing down is the right thing when you get to this place :)


Logging the stack in what format? 500 bytes could fit as many as two hundred numeric entries, or as few as five fully detailed text entries.


If you think Java or Ruby is bad, don’t even try looking at the JS ecosystem.


I still can't face looking at the JS ecosystem after dealing with Netscape 4 back in the day. It did me some psychological damage which will never go away.


The nice thing about GNU/Linux is that you can almost completely avoid all of the "desktop applications." I only ever start X when I need Firefox or mupdf. Everything else is a nice lightweight TTY app that I run in tmux. My 1.3 GHz Celeron netbook is incredibly responsive set up this way.


Similar but with XFCE, it rarely feels sluggish and I nearly always know why.


I had a laptop with a 600MHz Celeron running XFCE and AbiWord (I think it was Xubuntu 14.04 or something) with no networking (only a dial-up modem was available). It was a great, responsive typewriter with formatting and backspace!


Aye, for me XFCE is the sweet spot of does what I want without getting in the way and speed.

It's funny really because I mostly use it on an 8 core 32GB Ryzen 1700 desktop.


Expect Firefox to demand a potent GPU just to load soon enough, thanks to GTK3...


Gtk3 literally uses the exact same rendering API as Gtk2 - Cairo - which is accelerated by your Xorg driver the same as it has been for decades.

None of this even matters, because Firefox isn't a traditional application and renders most things itself using Skia.


Isn't the current version of Firefox GTK3? I run it on my laptop with just the EFI framebuffer, and while it's /the/ most sluggish app installed, it's still usable.


It is, and it may work for now. But as I become familiar with the mentality of the GTK devs, I worry how long that will remain an option.


Same, but I don't want to play text adventures for the rest of my life either. Graphics are a Good Thing.


> I no longer use Mail.app but remember it as a particularly bad example.

Huh. I used to think that my Mac Mini was just too puny for my huge mailboxes.

I now use Evolution on openSUSE running on a Ryzen 1700 with an NVMe SSD, and it still feels kind of slow-ish. So maybe that program is in need of some loving optimization, too (would not surprise me), or my mailboxes are just unreasonably big (would not surprise me, either). Probably a bit of both.


That's just Evolution. It's a big, complicated turd that uses a thread pool with no priorities. So you can sit there waiting to read a message while it is blocked checking your other folders. Also, its keyboard shortcuts are stupid.

Thunderbird is a lot better.


Unfortunately, Thunderbird does not like to talk to Microsoft Exchange Servers in their native tongue. If it weren't for that, or if my employer did not run on Exchange, I would be using Thunderbird already.


Let me just clarify that I don't use Linux on the desktop. I find it quite horrible to work with, particularly since the demise of Gnome 2. On the server and for development, it is fast and efficient. Most of the development work I do is from the Mac desktop to Linux machines.

I find macOS the "least bad" desktop experience. I think that's the true assertion from my initial comment. But I have a very high end MBP.

As a point for comparison which holds true across different applications, if I take the Photos application on macOS and port the data to the Photos app on Windows 10 and on Android, both of the latter are unusable with a dataset of 30GiB. I have tried this (in reverse!)

I haven't had any problems with Mail.app but I only use that for trivial home email. I use Outlook. Now that's a turd.


the op was building chrome. building on osx exhibits similar issues. there are instructions in the readme on how to configure macos not to chug and lose all responsiveness when building chrome.


You don't have to wait for the X. Just swipe them away.

Or just don't close activities ever, it's pretty pointless. It doesn't actually kill the app if it's still open.


The point I'm making is that the core UI shouldn't be limping along like this. Bad performance for an individual app is understandable, but when the basic UI is no longer efficient or responsive it makes everything worse.


Way more naive, but my first own desktop was a P75. A few years later a friend of a friend got his hands on a batch of P233MMX chips, and I started to fantasize that these would be so fast the menus might appear before you even finished clicking. I didn't know how non-linear computers were at the time.

A few other things: I'm often surprised how fixed the responsiveness of systems is. Hardware grew 3-4 orders of magnitude, but the cruft and chaos, plus the increase in resolutions, keep latency around the same value. It's sometimes even worse. Psychologically, when I boot a 64MB Windows machine and can enjoy Word / IE / Winamp, I feel very weird (video could be offloaded to a Chromecast, for that matter).


Except without interrupts, your computer would miss all the keystrokes. :-)


That's so deep, it hit me in the IDT


Why I refuse to indulge in the latest eye candy and bling from the desktop world if possible.

If I didn't need a GPU to do window switching back then, why do I need it now?


FVWM and VTWM switch windows and workspaces instantaneously on my crappy netbook with just the EFI framebuffer.

I always laugh a little inside watching people try to do the same on Windows 10 and OS X, with all the hardware acceleration, and waiting for multiple seconds.


To save battery power.


Then stop drawing all the eyecandy in the first place.


It's all just trade offs. However fast or powerful your machine is, software will use as much of that resource as possible, up to the point where it occasionally interferes with input (but not too much or you'll switch to something else).


But why? What good is an operating system on a multi-core device that allows anything to get that close to the performance envelope? This is a fine example of competition driving change for change's sake rather than real innovation and everything ending up worse as a result. I like new features as much as the next person, but not when they compromise core functionality. Not being able to type is inexcusable.


I agree. However at this point, I don't see anything an OS can do to help.

I see plenty of typing slowdowns every other day now. But I'm not sure just how many of them are the OS's fault. When your typing seems to lag, there are two places that can be slowing it down - the input side (reacting to hardware events) and the output side (drawing and updating the UI).

I suppose keyboard buffers are pretty well isolated, and native UI controls tend to work fine too. The problem is, everyone now goes for non-native controls. You type in e.g. Firefox, and it is slow not because of your OS, but because Firefox does all its UI drawing by itself. And God help you if the application you want to use is done in Electron. There are so many layers of non-nativeness on top of that that the OS has close to zero say in what's being done. There's no way to help that - resource quotas will only make the problem worse, and giving such a program free rein will only take everything else down.

All in all, it's just - again - problem of people writing shitty software, because of laziness and time-to-market reasons. Blame "Worse is Better".


Agreed - abstractions seem to be exploding these days, and I'm not even sure we are at the end of the road yet! Linux and Windows never had any trouble with essentially real-time keyboard feedback in their terminal windows. It's not the OS.


IntelliJ still freezes while indexing on my work desktop, which has 8 cores and 16 threads. At least they finally allowed pause/resume for that, so that's a win.


It's hard to believe the OS couldn't help when Windows 10 has the problem but Windows 7 doesn't.


In this particular case, yes. But this subthread was about a more general principle of letting an app exhaust system's performance. What I'm saying is that, when facing a crappily coded app, the OS can at best choose between letting it suck or letting the performance of everything suck.


You have alternatives though. I've been using basically the same linux environment for about a decade now.

I don't have a proper desktop environment like Gnome or KDE, just Xorg and StumpWM as a window manager. Then I have Firefox, Emacs and urxvt running tmux. I use a handful of GTK applications when the need arises like Gimp, Inkscape, Evince and maybe a couple others. Done.

It boots up in a few seconds from a SSD, it's always snappy. It worked fine on a core2duo and HDD 10 years ago, it works even better on an i5 and SSD now.


Yeah I have a Linux environment on a keychain USB device that I now use often enough to think seriously about abandoning Windows (though I don't hate Win as such, but kept using it because of some applications I relied upon).

Linux has (sometimes) had sort of the opposite problem, insufficient innovation in user interface design. I'm looking forward to Gnome 3 now; I felt that when CSS took over the web a lot of UI innovation moved to the server end and stalled at the client (think the really long hiatus in the development of Enlightenment, which was at one time the cutting edge of UI design/customizability while still being fast and responsive).

If you want ideas for where Linux capabilities should be going, please go check out Flowstone, which I think is criminally under-appreciated. The current version uses Ruby, but previous incarnations allowed you to deploy code in C or assembler(!) within the visual programming environment. It's doin' me a heckin' confuse that this isn't a standard development environment option for everything from shell scripts to large-scale applications. Once you go to flow-based programming, text-only IDEs look masochistic and pointless, and text-only is a terrible way to teach programming to people because discovery and syntax are inaccessible and really better done by computers. I like NoFlo for JS development, but the Linux desktop is crying out to be brought into a flow-based paradigm.

Sorry about going a bit off-topic, but when I see exhortations to go with extremely simple solutions like StumpWM, I have the opposite-but-similar reaction to the OP: why am I running some beast of a computer (at least by historical standards) so I can have a 20-year-old user interface? Surely there is some middle ground between cancerous levels of abstraction/feature creep and monk-like asceticism.


That's better rewritten as "however fast or powerful your machine is, software will waste as much of that resource as possible".


"...if you let it". Some of us have decided that the value of marginal eye candy et al isn't worth interrupting our UI flow. I suspect that many people would have decided the same if most modern OSes weren't built around removing so much control from the user.


There's a talk by Guy Steele, 'Growing a Language', where he says that ideally a language should shrink as it gets smarter semantics.


I remember writing C code on my Commodore 64 in my early teens using Power C from Spinnaker Software. If memory serves, I had to put in 3 different 5 1/4" disks to compile/link. There were compiler 1, compiler 2, and a linker diskette.


Plus ça change, plus c'est la même chose...


"The more it changes, the more it’s the same thing." [0]

[0]: https://en.wiktionary.org/wiki/plus_%C3%A7a_change,_plus_c%2...


The common idiomatic translation is "the more things change, the more they stay the same."


It's the same meaning, but the tone seems completely different to me.


It does?


For English the equivalent meaning would be, "the more things change, the more they remain the same".

I don't think I've ever heard the literal translation in English.


Weird that no-one here is mentioning how Windows is the only platform that still has this problem.

UI threads on a graphical desktop should always be the most privileged processes.


Your Commodore 64 had one hyper-thread? Wow, that's forward thinking.


This is great work. I hope MS can improve Windows 10 by fixing this. I just added a new Win10 laptop with much better specs than my 3-year-old rMBP, and I'm shocked by how much apparently random latency I experience with the UI in Windows 10 compared to the Mac. That's not to mention the sloppier trackpad (which constantly detects my left hand while I type) or the ungodly slow unzip (via 7z).

If only Apple would give us more than 16GB of RAM (in a laptop)... what a frustrating world for developers.


I mean the trackpad issue is generally down to the manufacturer and the drivers they provide, but I take your point otherwise.

A pro tip with 7z is to use the two-pane interface to extract rather than dragging files into Explorer. The latter extracts the files to a temp directory before copying them, whereas the former extracts directly to the destination.

The Windows file systems are pretty cruddy in general though so decompressing tonnes of small files in general will take longer than on OS X or Linux.


> I mean the trackpad issue is generally down to the manufacturer and the drivers they provide...

Was just wondering this myself. I tried a Surface Laptop this week, and its trackpad has everything I love from the 2012-era Mac trackpads. A satisfyingly snappy physical mouse click, gestures & acceleration, and a perfect size. (I can't stand Force Touch on the new MacBook Pro, and I don't like their new giant trackpads.)


MS is trying to fix the driver situation with their "precision touchpad" initiative: https://arstechnica.com/gadgets/2016/10/pc-oems-ditch-the-cu...


No complaints on the sp4 trackpad here


> The Windows file systems are pretty cruddy in general though so decompressing tonnes of small files in general will take longer than on OS X or Linux.

I haven't checked this in a while, so it could be out of date... but what you claim is Windows being cruddy might be a symptom of a slightly different mindset around data protection. Here is a test: grab a big tar.gz/zip file and uncompress it on Linux and on Windows from the command prompt. As soon as you see it complete, pull the power plug/battery from the machine. Plug it back in and check the result.

In the past, Windows seemed to be more aggressive about flushing things to disk, and applications were far more generous with calls to FlushFileBuffers() on Windows than similar apps on Linux were about calling fsync() - seemingly because fsync() is a machine deathblow on Linux when under IO load, whereas Windows did a slightly better job of only flushing the requested file. Also, I've seen a number of "streaming" apps on Windows play nice and use FILE_FLAG_WRITE_THROUGH, which makes them run slower but doesn't trash the whole system.

Bottom line is that I wouldn't call NTFS cruddy; it does a lot of things that still aren't mainstream in Linux filesystems. If you're seeing some huge performance anomaly between Windows and Linux, look closer. NTFS may not be sexy, but it's a solid piece of engineering (OK, some things are cool, like VSS. TxF is also cool but might get deprecated because no one knows about it).
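
For anyone curious, a minimal Win32 sketch of the two behaviours mentioned above (the file name is made up):

  #include <windows.h>

  int main() {
    // FILE_FLAG_WRITE_THROUGH: each WriteFile reaches stable storage before returning.
    HANDLE h = CreateFileW(L"stream.dat", GENERIC_WRITE, 0, nullptr, CREATE_ALWAYS,
                           FILE_ATTRIBUTE_NORMAL | FILE_FLAG_WRITE_THROUGH, nullptr);
    if (h == INVALID_HANDLE_VALUE) return 1;

    const char record[] = "record";
    DWORD written = 0;
    WriteFile(h, record, sizeof(record), &written, nullptr);

    // The alternative, without the flag, is to flush just this file explicitly:
    // FlushFileBuffers(h);

    CloseHandle(h);
    return 0;
  }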


No, NTFS is cruddy. It is eager to flush metadata but not data, and there is no way to switch it to "journal data as well" mode (at least not as of Win 8.1; maybe 10 added this mode).

This is the default behavior on ext3 and ext4 as well, of course, and if you pull the plug on NTFS, ext3 or ext4, you WILL end up with files whose length reflects recent writes but whose contents do not (and I am saying this from analyzing hundreds of cases of file contents that are impossible given program flow - about 10:1 on NTFS versus non-data-journalled ext4). But at least on Linux you can tune2fs -o journal_data, still get very decent performance, and get much, much better integrity.

Also, I think you are conflating sync(), fsync(), fdatasync() and FlushFileBuffers. calling fsync() or fdatasync() on linux is no worse than FlushFileBuffers - the IO killing behaviour you describe is associated with sync(), which, when you need it, is perfectly justified in killing IO.


> A pro tip with 7z is to use the two pane interface to extract and not dragging files to Explorer. The latter option extracts the files to a temp directory before copying them whereas the former extracts directly to the destination.

That's insane. I had no idea. Thanks for the tip!


It's because the 7-Zip file manager first extracts to %temp% and then copies (not moves) to the drop location. If you use the regular extract function it extracts to the destination directly. Using the Explorer context menu does the same thing, and thus is also fast.


I use the right click context menu on the archive file to 'Extract Here' which seems to accomplish the same thing.


I have just bought a Windows 10 laptop for my wife.

I don't know if it is random updates or junkware or whatever, but sometimes the system will lock up for minutes - Task Manager shows <50% load on any of disk, RAM and CPU, but even waiting for Task Manager to launch can take a minute! I've witnessed a simple program (FAR Manager) taking a minute to show its UI!


Try turning off all of the anti-malware and search features. Windows Defender will run on every file, folder, or executable you touch. Play a movie with VLC? Here, let us scan VLC, every DLL it depends on, every file you have opened with it, and the entire file you are about to open. They also bundled Cortana and Bing with search, which means you can get ads in your search! But it also means it makes like twenty web queries every time you search. Also, search indexes non-stop for no reason I can fathom, which is a shame because I actually liked the Windows 7 search features (which are now bundled with this ad-serving, resource-hogging shit).

Also, don't trust task manager, it lies about utilization if you aren't running as admin and don't have it configured correctly.


Windows Defender is by far the "best" anti-malware software when it comes to having a light footprint. Scanning every byte of IO for malware signatures, plus a ton of other heavyweight operations, is a sure way to eat a ton of processing power. I have an underpowered Win10 tablet, and I have to remind myself to "disable" Windows Defender every time it goes to install an update. Otherwise what should be a 2-3 minute operation can run for 10-15 minutes.


Yeah, I don't have a problem with it per se, I just wish it implemented some sort of improvement to reduce the hard-drive usage. Say, integrating with the OS to read the same memory as the program will (rather than reading it 2+ times: this kills the drive). Or perhaps storing a list - with hashes - of already-scanned files, so it doesn't scan the entire 30+ GB Windows dev kit every time I open a large project in Visual Studio (especially egregious because it forgets it already scanned it when I open a different IDE or, say... a compiler). The thing is out of control.


"integrating with the OS to read the same memory as the program will (rather than reading it 2+ times: this kills the drive)."

Is it even possible to read a file twice while ignoring the system buffer cache?


Yeah, I've been out of the Windows game for some years now, but my wife has a few machines (desktop and laptop) and occasionally comes to me and says "it's slow. fix it." And indeed it is slow, even under moderate browser use - things just hang. The specs are more than adequate.

I have to say (after halfheartedly looking into it a few times): "Um, sorry honey, I have no idea. Guess it's a Windows thing." She nods, and goes back to patiently waiting minutes for the browser to stop hanging! Seems that people using Windows just accept this kind of thing as a given, to the point they don't even notice it. (I've installed Linux for her a few times, but of course, it's too different, doesn't run Photoshop, etc...)


The latency and unzipping problems can be solved with Linux, though the trackpad could get worse. The random latency and unexplained HD activity always offended me in Windows - it is _my_ laptop, after all!


I was wondering what ETW is, and found in one of his other posts: ETW (Event Tracing for Windows).

He seems to be one of the main contributors of https://github.com/google/UIforETW

Seems to be quite useful.

---

About the post - tl;dr: NtGdiCloseProcess has a system-wide global lock which is used quite often, e.g. during a build of Chrome, which spawns a lot of processes. This problem seems to have been introduced between Windows 7 and Windows 10.

I thought that there would be a solution or a fix but it seems this is not yet fixed. "This problem has been reported to Microsoft and they are investigating."


> NtGdiCloseProcess has a system-wide global lock which is used quite often

"Holds" rather than "has", and importantly that system-wide lock must be held by things like SendMessageW (which sends a message and waits for its processing before returning) which is pretty critical for UI updates.

This is compounded by parallelised builds, since they spawn lots of processes and therefore cause lots of process destruction: it both fucks up the UI and completely serialises process destruction, so your nice flights of 24 processes for faster builds end up taking multiple seconds to all shut down - and the more cores (and thus processes) you have, the worse it is, and the worse your stutters.


> "Holds" rather than "has", and importantly that system-wide lock must be held by things like SendMessageW (which sends a message and waits for its processing before returning) which is pretty critical for UI updates.

Do you happen to know anything about it? I'm scratching my head how it's possible that process termination serializes with GUI... Maybe it hogs some lock on process descriptors which SendMessage also needs to grab for a moment to find the target process? I hope you didn't mean to say that every SendMessage call is completely serialized with each other.


> Do you happen to know anything about it? I'm scratching my head how it's possible that process termination serializes with GUI…

Not anything more than what's in the essay. But possibly some (pair of) utility function calls - e.g. a cleanup or a notification to the OS - were added between W7 and W10, and this was not noticed at the time.

> I hope you didn't mean to say that every SendMessage call is completely serialized with each other.

It's my understanding that at least a subset of SendMessage is serialised in the kernel yes, and from the essay:

> functions like SendMessageW, apparently waiting on a kernel critical region[…], deep in the call stack in win32kbase.sys!EnterCrit (not shown)

[0] https://msdn.microsoft.com/en-us/library/windows/desktop/dd7...


As a Windows outsider, I'm puzzled why programs used as part of the Chrome build system (which I'd expect to only use console I/O) are using APIs that cause interactions with the GUI? By analogy, is this not like gcc redundantly setting up a connection to the Xserver each time it is run?

(I'm making an assumption that NtGdiCloseProcess is part of the GUI API (GDI == Graphics Device Interface) which is why it may interact with the GUI message passing.)


In the very early days of NT the GDI subsystem was in userland and you wouldn't have this problem. Unfortunately it was too slow for machines of the 90s and so GDI+user32 is very tightly integrated with the kernel.

Even to the point where it does neat things like callback user mode code from the kernel. Unwinding this without breaking things is nigh impossible at this point.


More precisely, I believe the kernel tracks what threads are "GUI" threads that use win32k.


I wonder if this is the same lock most win32k calls take.


Why is everyone talking about how C++ compilation is slow or something instead of talking about the real problem, which is that GDI is doing resource cleanup on process exit, under lock, for console-mode processes that have probably never used any GDI resources?


mxatone mentions that "Win32k locking design is just bad" in https://twitter.com/mxatone/status/884436870955913216


It's not even bad. It's neglecting any kind of "has this process ever touched win32k.sys" test.
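
For illustration only: user mode can already ask how many GDI/USER objects a given process owns via GetGuiResources, which is roughly the kind of signal such a test could be based on. A minimal sketch (assuming a handle with query rights; this is not what Windows does internally, just a way to see that the information exists):

  #include <windows.h>
  #include <cstdio>

  // Count the GDI and USER objects a process owns. A console-only build tool
  // (compiler, linker) would normally report 0 for both, which is exactly the
  // "has this process ever touched win32k" information referred to above.
  int main() {
      HANDLE process = GetCurrentProcess();  // or OpenProcess(PROCESS_QUERY_INFORMATION, ...)
      DWORD gdi  = GetGuiResources(process, GR_GDIOBJECTS);
      DWORD user = GetGuiResources(process, GR_USEROBJECTS);
      printf("GDI objects: %lu, USER objects: %lu\n", gdi, user);
      return 0;
  }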


I feel like the (brilliant) post is missing the context that, if you were debugging this latency on Linux, you would have the source code to continue the investigation until you found and fixed the problem, as opposed to just teeing it up for Microsoft.


Most of the problems are solvable, if proper tools are used. Microsoft provides "PDB" files, which contain symbols for ease of debugging. You can get them from Microsoft's symbol server. Load the symbols and the binary in IDA, and the generated pseudocode is enough for most scenarios.

In theory, debugging programs on Linux should be easier. However, for some distributions (like Arch Linux) debug symbols are not provided. You have to compile the program on your own if you want to debug. It's especially painful if the target program has a large codebase!


I remember every other book on Windows programming saying "Process creation/destruction is expensive, use thread pools (or at least process pools) instead, that's the way to go on Windows". Perhaps this mindset is ingrained for Windows QA team too - they don't have [enough] test cases for such scenarios.
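
To make "expensive" concrete, here is a minimal sketch (not from the article; the child command "cmd.exe /c exit" is just an assumption for a cheap, short-lived process) that times create/wait/close cycles with CreateProcessW:

  #include <windows.h>
  #include <cstdio>

  int main() {
      const int kIterations = 100;
      LARGE_INTEGER freq, start, end;
      QueryPerformanceFrequency(&freq);
      QueryPerformanceCounter(&start);

      for (int i = 0; i < kIterations; ++i) {
          wchar_t cmd[] = L"cmd.exe /c exit";   // CreateProcessW wants a writable buffer
          STARTUPINFOW si = { sizeof(si) };
          PROCESS_INFORMATION pi = {};
          if (CreateProcessW(nullptr, cmd, nullptr, nullptr, FALSE,
                             CREATE_NO_WINDOW, nullptr, nullptr, &si, &pi)) {
              WaitForSingleObject(pi.hProcess, INFINITE);  // child exit == process destruction
              CloseHandle(pi.hThread);
              CloseHandle(pi.hProcess);
          }
      }

      QueryPerformanceCounter(&end);
      double secs = double(end.QuadPart - start.QuadPart) / double(freq.QuadPart);
      printf("%d spawn/exit cycles: %.2f s total, %.2f ms each\n",
             kIterations, secs, secs * 1000.0 / kIterations);
      return 0;
  }

Comparing the per-cycle number with an equivalent fork/exec loop on Linux is a quick way to see why the books give that advice.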


Seems like a feedback loop. It's expensive, so most apps avoid doing that, meaning there's less need to check for performance regressions. Then if there is a regression, it further increases the incentive for developers to avoid it, so in the future even fewer apps will do that, making it even more of an unusual use case, so even less need to test for it..


This is probably also why Cygwin, and even the WSL subsystem in general, are a lot slower when running more complex shell scripts, which typically spawn tons of processes.

I wrote a pretty simple shell script to test WSL process spawn speed; it loops over a simple echo piped to a grep, adding 1 to a counter until it reaches 1000.

On my windows machine, in a Linux VM, I consistently get times like this:

  real    0m1.381s
  user    0m0.073s
  sys     0m1.472s

On the same machine in WSL, I get results like this consistently:

  real    0m14.878s
  user    0m0.469s
  sys     0m12.109s

That is 10 times slower... I don't have Cygwin installed anymore, but when I tested it initially when trying out WSL, it was even slower...


The Amiga prioritized user-input interrupts before all other interrupts, so if there ever was a time you couldn't move the mouse on the Amiga, it meant that the system was well and truly crashed.

30 years on and the peecee industry still doesn't know how to design a fucking system.


I was just thinking that, my A1200 never skipped a beat.


For my own amusement, whenever I get a new OS build on my machine, I open up a task manager and watch CPU load just by wiggling the mouse a lot or maybe simply pressing page up and down. I'm pretty sure it's always pretty easy to generate 25% CPU doing very little. Another thing is just opening a local file from within a running application and wondering why multiple seconds, and hence billions of CPU cycles, seem to be consumed by what one expects should be a fairly menial task. (I am pretty sure it was quicker in DOS 3.3 with Norton Commander.)


> I'm pretty sure it's always pretty easy to generate 25% CPU doing very little

Not for me. I just tried mouse wiggling, scrolling, and page up/down in browsers and file managers on Windows 7, 10 and Ubuntu 16, and none of those show such behaviour. Likewise for opening a file dialog. And I didn't expect otherwise, actually; what OS are you talking about?


That used to be true back when the CPU did all of the GUI rendering. But now most all of it is offloaded to the GPU. Any GPU that can render Quake 3 Arena at 120 FPS (and that's ALL of them even Intel IGPs) can wiggle a window around very easily.

Not sure about file opens. Simple applications like GVIM don't seem to have seconds of delay for me, but I know what you mean with things like spreadsheet or word processor files. I guess it is all of the unzipping and XML processing.


Actually, it's not as accelerated as it was back in the WinXP days. Here is a random link about it: https://www.youtube.com/watch?v=ay-gqx18UTM.

Basically, the GPU card vendors could hook any part of the Win32 GDI pre-Windows Vista, and they did. Post-Vista, only a tiny portion of the GDI is accelerated. In theory you can avoid this by writing your application using a more modern API, but the vast majority of native Windows applications continue to be basically GDI based, either due to age or due to various GUI toolkits still being GDI based. Worse, there are a number of toolkits (or browsers) which implement their own drawing routines rather than calling the system-supplied ones.

The final composited results with Aero are of course accelerated, but that only really tends to add additional latency. Switching to one of the basic themes makes Win7 noticeably more responsive, but it also tends to tear a lot. I've got an incredibly high-end desktop machine, carefully tweaked/optimized, and I can frequently see the ~1/10 of a second lags while windows update after being maximized, etc. Compared to the 10-year-old, pretty high-end XP machine (with upgraded SSD, etc., and also carefully tuned) it doesn't seem to be faster on basic desktop-type operations. Fire up a recent game, or do builds, and it's massively faster, but running Word/Firefox/whatever, the old machine "feels" faster.

(Tuned, as in I have a dozen or so tweaks I've been collecting/researching for the past decade+ on ways to make the machine feel more responsive. It all started with MenuShowDelay in Win95 and has grown from there; it now includes all the usual stuff plus tweaking power profiles, and a bunch of less obvious "feel" things like high-DPI mice with fast base speeds.)


> tweaks I've been collecting/researching for the past decade

Do you happen to have this available to the public somewhere?


He said wiggling the mouse, not a window.


I wonder if this accurately captures the time spent on the task. I am surprised opening a file takes that long. I primarily live in a Linux environment, but I also run some Java code for R&D data collection on Windows also. It opens 10-20 files on startup which does not lead to a perceptible delay. I have not measured exact performance, but if opening (all 20) took more than half a second or so I would have surely noticed this.


macOS has always had really smooth window wiggling because each window is a separate layer rendered on the GPU. Windows has had problems in the past because wiggling windows causes repaints in the background ones, which had no such buffer, but Windows Aero uses a model a lot like the macOS one.

What OS are you running?


Speaking of W10, there was another annoying W10 bug where if you started typing immediately after using the touchpad there was a random delay. If you care about latency and responsiveness, it makes you want to scream at the people who implement these features.


That's not a bug. It's some kind of feature by Synaptics that can be disabled in the settings.


Haha, it never crossed my mind that someone would implement this purposely. I don't have a W10 laptop, so I've only noticed this on other people's machines.


I use this feature on my laptop. Without it I would accidentally touch the touchpad and insert characters where they aren't supposed to be.


My 16 MHz 68000-based Amiga 500 had smoother mouse movement than my 3.2 GHz 8-core desktop.


IIRC, on the Amigas the CPU was barely involved in the process of reading the mouse events and moving the cursor sprite on the screen.


Yep, hardware sprites.


Modern GPUs still have hardware sprites for the cursor, and OSes still use them! However, the path between the mouse and updating that sprite has gotten more complex, alas.


I imagine a modern day Amiga would run all mouse and window server code on the GPU, only telling the CPU when the app needs to update its bitmap.


I think the best summary of how to do this is to take a look at Herb Sutter's three-article set on "Minimizing Compile-Time Dependencies": https://herbsutter.com/gotw/

"The Compilation Firewall", or pimpl idiom, is a clever mechanism for getting code out of the header: https://herbsutter.com/gotw/_100/


I use this liberally. Generally, it is a runtime performance issue only for repeated allocations, which you can optimize if needed. Once-only allocations can be ignored when using compilation firewalls.


And the second cache miss: one for the pointer, one for the actual object.


Right. Usually the entire application is not performance critical. Only certain sections are. For that part of the application, profile and optimize.


I had the same exact problem. A machine that is overkill spec wise but the mouse and keyboard would freeze up every minute on the minute.

I tracked it down to my desktop wallpaper being on a rotation. Seriously... how badly does that have to be implemented in Windows to actually hang the mouse and keyboard?


For all the hate Windows gets, UI lockup hasn't really been a pervasive issue for me in years, whereas on my Linux desktop that's just a typical day.


My Linux desktop never locks up. I'd suggest messing around with a different kernel, double checking graphic drivers, etc.


I was probably a bit imprecise; I do sometimes get full lock ups where I have to reboot, mostly when running TensorFlow locally, but usually I just get lock ups that last < 10 seconds when compiling things in the background.

It's at the level of "annoying, but doesn't impact my work", so I just live with it.


Ah, I see. Which scheduler are you using?

  $ cat /sys/block/sda/queue/scheduler
  noop deadline cfq [bfq]

I've been very pleased with the responsiveness of BFQ when multitasking.

And actually, I shouldn't say I never get lock-ups. There's been a couple times where the DE just freezes, and I'll have to hit Ctrl+Alt+F2 to switch to another tty and restart the display manager. But I attribute that to running bleeding-edge version of things and enabling experimental features, so that's fair.

Lastly, my MacBook Pro (2010 6,2) would also experience random freezes on Ubuntu (would have to power off/on) and upgrading the kernel from the Ubuntu default to latest mainline solved that problem completely.


  $ cat /sys/block/sda/queue/scheduler
  cat: /sys/block/sda/queue/scheduler: No such file or directory

I guess my work does something weird with our desktop linux installs, laptop says this though:

  $ cat /sys/block/sda/queue/scheduler
  noop [deadline] cfq


Last time I was running Ubuntu (a month or two ago), trying to use the wifi network management menu would easily lock up the interface for a while - I'm guessing it manages to do slow work on the main window manager thread. Maybe that's a common problem.


I've seen lengthy (10 seconds to an hour) freezes primarily when the system starts thrashing.


I've seen this too. I have two very similarly specced workstations; one has a consumer-level SSD and the other has an older Intel S3500. Whenever I do heavy IO, the system with the consumer-level SSD will start freezing.


Not sure why the downvotes, but okay. Tough crowd.


The first time I had to compile a Linux kernel for Android, it took only a few seconds (on the ridiculously overpowered "build machine" my employer supplied). I was sure I must have done something wrong, but no, that was the entire build. It takes longer for Android to reboot than it does to build the kernel.

It does feel like there's something seriously wrong with the massive C++ codebases we use these days for key infrastructure like browsers, and the massive compilation times we put up with.


What Knuth said: "Premature optimization is the root of all evil"

What most developers hear: "Optimization is the root of all evil"


Probably not really feasible, but I'd be interested if something comparable happens when your build process uses many threads instead of many processes. I still think using processes instead of threads is a hack, though I know the mainstream opinion says nowadays processes are the way to go and threads are a hack.


We've come to the point where building the browser from scratch takes more than building the OS itself.


That's because we've come to the point where the browser is now more akin to an OS than a networked document reader.


That's because the browser had to do things the sneaky way.

Everybody wants to control the app platform. Every time a cross-platform solution appears, everyone tries to shut it down. See: Java, Flash, etc.

The only one that platform owners couldn't really shut down (though they are getting better at it) is the web. It was considered too "dumb" to shut down and too useful to completely avoid. And bit by bit, just like the boiling frog story, people added interactivity features until we arrived where we are now: a bloated, perhaps crippled, cross-platform solution, but the best we have for running things on everything from embedded to mobile to desktop.


Sounds a lot like WeChat. Messaging app that's added more and more features to now become almost an OS on top of the smartphone OS in China.


I'm reading this thread on a Chromebook and really getting a kick...


Interesting metric actually. Building Windows takes 12 hours [1]. It's a bit harder to find metrics on Linux. You can build the kernel in 60 seconds apparently [2] but that is not a complete operating system.

[1] https://stackoverflow.com/questions/226377/operating-system-... [2] http://www.phoronix.com/scan.php?page=news_item&px=MTAyNjU


Someone's old data from 2012 to build a core-image-sato using Yocto is just over an hour - http://www.burtonini.com/blog/2012/11/15/yocto-build-times/.

That said, core-image-sato used to be just a simple demo image. In my old build system, we would build the Arago Project for a TI SoC every night and it would take a few hours but that included a lot of the DSP code as well which was really slow too. So a couple of hours on average is my guess.


Building my Linux distribution from scratch takes about an hour on a modern system from a top-level ./configure && make -j5 (which produces the installer ISO image); a great deal of the time, however, is spent first building the cross-compiler toolchain (to ensure that no matter where you build my Linux distribution you get the same results).

This is the cross-compiler toolchain, the kernel, and about 250 external packages.


You can build base+X11 BSD systems in a handful of minutes on modern hardware with one command. I don't have modern hardware, and I build over NFS, so I can't give a precise figure.

But yes, Windows includes lots of APIs and frameworks and GUI apps and such. Probably something comparable complexity-wise would be base+X11+(one of KDE or GNOME).


How do you define the OS? If it's just the kernel, sure, but then is it really surprising? If it's also all the userspace necessary to e.g. run said browser, then I don't think this will be true anymore.


Somebody said Chrome takes 30 minutes to build on one workstation and I don't seriously think you would build Windows 10 that fast.


"We've come to the point where building the browser from scratch takes more than building the OS itself."

It doesn't. It took ~5h to build a tiny (~150 MB) complete system (Linux kernel + Yocto-based "OS") from sources on a 4-core PC a few years ago. With a modern CPU, I guess, it might be 3h or so. As a side note, the build process generated >50GB of files.


I admit I have not done so in a while, but I vaguely remember building Net/OpenBSD from source being faster than that on moderately powerful hardware (Core2 Quad 2.4 GHz, 8GB RAM, no SSD).

I think NetBSD took ~2.5 hours, including building its toolchain.


Honestly, Yocto is a complete mess and a terrible embedded OS system IMO. Buildroot is far better, and takes far less time to build.


If "OS" isn't restricted to *nix then we have ReactOS that builds in around 10 minutes!


I am pretty sure building a common set of packages for, say, Fedora, or any version of Windows takes a couple orders of magnitude more resources than building Chrome.

On my Macs I've been doing "port install -s" most of the time and I can tell you some relatively small installs can take a very long time.


Try running a Carthage update for an iOS app with 80 frameworks. Ever seen a Mac hit 55GB of RAM? macOS kills it. It takes many of these runs, and all this does is fetch prebuilt binaries. Yes, everything I said here is beyond stupid.


Is it necessary to update _all_ of them at once? I think that's the same problem as OP... rebuilding the whole world is slow, so why not do it incrementally?

Also, 80 frameworks ⊙⊙ I'm assuming some of these are internal, and written in Swift, meaning they can't be compiled into static libs (easily).


You can take the alternate approach:

Run your builds in a hugely underpowered VM and wait much longer. Your regular usage will be largely unimpacted, although the builds take longer.

Source: currently running a ~1000 package dpb(1)[1] build of my needed OpenBSD ports on a dual-core KVM machine hosted on an 8-9 year old amd64 X2 2.2 GHz. 3 days and counting; it will probably be done around next weekend.

From there, incremental updates are mostly slight, and can complete overnight from a cron job.

.. [1] https://man.openbsd.org/dpb


Out of curiosity why do you keep such an old machine around? I have found that newer machines can be had for free, and 7-year-old Xeon servers can be had for <$200 from IT recyclers.


The electricity consumption numbers are also relevant.


My workstation was recently "upgraded" from a Windows 8.1 workstation with a 4th-gen i5 to a Windows 10 laptop with a 5th-gen i7. The extra RAM and SSD over HDD is great, but whether it's because it went from a quad-core to a dual-core hyperthreaded CPU, or because of the jump to Windows 10, the mouse lag is considerably more noticeable now. I've convinced them to upgrade my laptop again, but now this article doesn't give me much hope for my new work toy.


While the issue of closing processes slowly is unique to Windows 10, I've found similar situations of not being able to control my OS on RHEL, Fedora, Ubuntu and macOS.

At this point I genuinely believe latency will be the death of general purpose computing.

iOS and Android very rarely get out of control. Apple is already pushing iOS devices as laptop replacements. But we lose a lot on these devices with their locked down OSs and inability to install the software we want.


All the midrange Android devices I own get out of control all the time if you switch between heavy apps too quickly. And every once in a while, while I wait for the system to become responsive again, some low-level process like "Android System" will crash and restart itself. I shudder to think what will happen once we start throwing desktop-class loads at Android using its current application stack.


Android 7 on a Pixel device, and just browsing with Chrome will cause it to completely freeze (no panning, can't focus the URL, can't open the task switcher) for seconds when first landing on a heavy page. This has been a problem since day one and with every single Android device I've ever used; there's still a whole bunch of stuff on the main UI thread, as any Android developer will intimately know, and it absolutely will freeze under load. It's just fundamentally architected wrong: since you can block the UI thread, it does get blocked.


Android isn't locked down. But it also isn't exactly smooth. My Android device frequently shows me white screens or startup splashes when doing something as simple as switching tasks. It clearly can't keep multiple apps in memory at once. Somehow iOS manages to produce a much smoother experience. I don't think it's a fundamental issue though, just a matter of priorities.


I've seen Android get "out of control" on lower-end devices. If you stick with higher-end and/or first-party devices, the experience is usually very slick. I don't think I've ever had a OnePlus device freeze up or stutter on me, for instance.


My OnePlus X certainly does. When charging or after 10/15 minutes of intense graphical use it starts getting sluggish and unresponsive.


Probably the CPU/GPU scaling back because the phone is getting too hot?


I just wanted to say good work. I'm impressed you dug so deep. A fix here could really impact the entire Windows 10 user base.


Saw the headline and thought "must be Windows".

I'm not a Windows hater, but one of my long standing gripes about Windows is that it just seems to have terrible multitasking compared to OSX.

I'm sure there are reasons but it just seems utterly symbolic of Microsoft that they never managed to get Windows to multitask in a rock solid, smooth and reliable way like OSX.


I am absolutely a Windows hater, but I've experienced the same problem in Linux, MacOS, Windows, BSD, you name it.

See for example Con Kolivas' famous rant about how Linux schedulers were ignoring the interactivity requirements of desktop usage, and resulted in a terrible experience with constant tiny freezes.


It seems to have improved these past years but beach balls used to be comically widespread on OS X. I'm not at all convinced OS X is in any better shape than Windows.

And OS X certainly doesn't have anywhere near all the kick-ass monitoring tools that Windows has, such as the ones shown in this article.


You do know that OS X has DTrace?



not just "it has DTrace" it has one of the best implementations/supports of DTrace in any operating system.

It's quite impressive.


> And OS X certainly doesn't have anywhere near all the kick-ass monitoring tools that Windows has, such as the ones shown in this article

It certainly helps that Intel really likes Windows, but I haven't seen Mac users complaining as much as Windows users do, so it may be for lack of demand (and the bundling of DTrace, for ages).


Uhm, I have a 40-core machine with 256GB of RAM at work, and I can make it completely unresponsive without taxing either the RAM or the CPU to 100%. We have a certain computational load that just destroys the CPU<->RAM bandwidth, so CPU usage is about 30-40%, RAM sits at 50%, and yet the computer is completely unresponsive. It's exactly the same on both Windows Server and Linux; we're just running into hardware limits.


Interesting. I've been CPU mining on my old Ivy Bridge and I don't even notice it causing much performance drop. Free coins for me!


I'm reading your comment while upgrading brew. This prevents me from working because typing in Sublime Text slows down to a crawl during brew's compilations ;)


> my long standing gripes about Windows is that it just seems to have terrible multitasking compared to OSX.

Using both Macs and Linux laptops, I'm sometimes shocked at how the Mac sometimes locks up when the Linux machines degrade much more gracefully under heavy loads. I never dug too deep into it, but it feels like it's something with HFS+ under heavy IO. I hope APFS fixes that.


I'm far from an expert on the topic, but I'm pretty sure APFS does a lot of global locking, which would definitely explain at least some of that.


I just noticed now, 4 days late, that I wrote APFS but meant HFS+. Sorry.


Interesting. Back when I switched my Firefox builds from Core2 Duo T8300 @ 2.40 GHz + spinning disk + OS X to i7-950 @ 3.07GHz + SSD + Ubuntu, GUI responsiveness during Firefox build went from OK to bad. Canonical's support suggested getting a second computer as a build machine.


It may be the machine too. I "feel" that my Xeon desktop and my i3 laptop degrade more smoothly than the i7 laptop. It's entirely subjective, of course, but when I torture the machines the i3 goes from usable to uncomfortable while the i7 goes from very usable to unbearable. The Xeon goes from very usable to usable. Could be the i7 throttling more heavily from thermal issues.


Ever since I purged Dropbox from my life I haven't had intermittent UI lockups on macOS.


Interesting... One Mac here has Dropbox and the other has Microsoft's OneDrive. Maybe it's their fault.


This has little to nothing to do with multitasking.


OS X just slows everything down so you won't notice when something gets stuck.

I've been a Mac user for 7 years now, and my latest, greatest MacBook Pro halts for a few seconds when I connect an external monitor; everything is frozen and unresponsive, like I'm looking at a screenshot of the system.

OS multitasking is still usually shit...


That's at least excusable, unlike GUI freeze caused by termination of a number of processes which don't even use the GUI.


My ten year old crappy MacBook Pro doesn't do that. Weird.


I have a new Ryzen 7 CPU (you know, 8 cores, 16 threads), with 64GB RAM, and a Samsung 960 PRO M.2 drive for storage.

But when I went and plugged in an external Seagate 4TB drive and tried to "dd zero" the s#it out of it, my whole system became unresponsive after a while. Obviously I had to reset the machine, as it wouldn't let me "kill -9" the process that made the system unresponsive.

Trying to type was a no-go as keys would sometimes become "stuck". Moving the mouse around was an exercise in predictability too.

All this happened in the latest Ubuntu 17.04 64bit... #sadstory


Seems like a good way to deal with this (outside of trying to convince Microsoft to fix it) is to spin up a VM and just give it a few cores, and do the build inside the VM to isolate this behavior.


Why doesn't the OS prioritize UI threads on one or two cores?


How does it know which threads are UI threads?


I think it should at least know which thread handles mouse movement.


Won't help if the problem is as the author describes (contention for a system-wide lock).


I think the windows scheduler might already do this, but the issue is that UI message queues take a highly contended lock.


When I converted from Windows 7 to 10 I noticed that I started getting audio latency/glitches/squelches from my external audio interface- a Focusrite Scarlett 2i4. The mouse would also sometimes hang. Doing some basic tests, the problem seemed to be coming from network drivers... but I could never resolve it. I wonder if the author's discovery has anything to do with the issues.


That's why my machine gets so slow after running for weeks... explains everything!


I wonder if the "privacy" features in Win10 play a role here. Seems like some extra process accounting could cause delays not present in previous versions.


Thank you for bringing attention to this. Experiencing this on our W10 workstations.

I hope MS does something about this immediately. It's maddening.


Does this also occur on other OSes, like Linux, or MacOS? Can you move your Chrome build to another OS and not experience this problem?


Use the gold linker if your setup permits it.

This issue went away for me when I switched.


Wasn't this the title of a Bruce Springsteen song?


I can't imagine that moving the Win32k stuff back to CSRSS would help much in this case, right? Though it is still a good thing, especially for terminal servers, where hopefully one CSRSS process crashing just terminates that session.


> moving the Win32k stuff back to CSRSS

Out of curiosity, was there some big move of functionality from CSRSS into Win32k earlier? When and what?


In NT4.


[flagged]


The post identified a problem in Windows locking behavior, and showed the process that was used to reach the conclusion.

Your comment on the other hand tells us more about you than about the person you are criticizing.


Then why are you commenting? Kind of a crappy attitude you have there.


Not really; I was kinda expecting the post would also contain interactions with Microsoft, leading to the resolution of the bug in update XX. The post kinda left me unsatisfied.


While I too am anxious to hear the outcome after Microsoft investigates, the author still shared something they were interested in, which holds value for the HN community, as many of us love digging into a system and finding issues. That was the main purpose of the article.

Regardless, the author may not know how critical Microsoft has classified this issue as. It could be something they don't resolve for years. Since the Microsoft investigation and possible patch have no bearing on the content of this article, I wouldn't expect it to be held until such a time.

I'm sure we'll get a follow-up on HN when or if it gets patched. Even better, maybe posting this article helps get it fixed sooner :)


You can fix those problems by writing a post that would contain said interactions, resolution and update.


Sure, and you're entitled to feel that way, but that doesn't make his attitude any less crappy.


Microsoft aren't great at resolving reported bugs even if you pay them for it.


There's too many spinning wheels. Stop it! People are going to just stop using the internet. If I could, I would. But I can't. So please just build simple websites!


So what now?

Should/does the author just wait, getting some traction on Hacker News and hopefully being noticed by a dev at Microsoft, or is there some way to provide this data to Microsoft directly, skipping Tier-1 support?

The author seems to be working at Google, so he might get some leverage from that, but what about Random Joe? Keep enjoying the bug "forever", I guess?


There are many, many MS bugs that last for years. Pretty much any time you search for a bug in MS software you will find someone in a support forum giving generic advice ("known issue that we will work on", "not a bug", "please reinstall Windows and applications") and users complaining.

Typical example: https://excel.uservoice.com/forums/304921-excel-for-windows-... a basic problem that has been there for many, many years. For this one there are workarounds, but for many others there are not.

The only real long-term solution is supporting competitive products.


>The only real long-term solution is supporting competitive products.

Not really viable when the only OS that has software parity is OS X and the only OS with hardware parity is Linux.

If we drew a Venn diagram, it would be a straight line of circles with Windows sitting in the middle. To switch away you have to choose between less software or less processing/GPU power.


Yes. That's why Microsoft gets away with not being responsive to customers. However using alternative systems (e.g. Mac, Linux, Google docs, LibreOffice) where possible does help even if only a little.

There are also advantages to the free solutions: no requirements to manage licenses and some improved functionality (e.g. seamless sharing with Google docs).

The fact that MS is including bash on Windows shows that the pressure is working.


Did you read the article? The bug was already submitted to Microsoft and they are investigating.

So the answer to your question is: wait until they're done investigating and possibly release a patch.


I skipped the conclusion. Still, I'd be interested in the channel that could be used to report such issues.


From the article:

> This problem has been reported to Microsoft and they are investigating.


The author also used to work at Microsoft in the performance team...


The terrifying result of having ABI compatibility with the first commercially successful system of its kind from the '80s.

Granted, they aren't doing themselves any favours with their new straitjacket-style application ABI.


You are trotting out the same tired line without reading the article carefully? ;)

This is a regression. The article shows it worked fine up to Windows 7.


7 is the best


Regardless of how many Windows (or other OS) versions I use later on, I will remember Win7 fondly. It's just so much less cumbersome. Maybe the drastic changes that went into its preceding and succeeding versions have a role to play in this.


There is this old quote that Algol 60 was an improvement not only on its predecessors but also its successors.

The same could be said about Windows 7, I guess. ;-)


How is ABI compatibility with early systems the cause of this bug? You'll note that it wasn't present in Windows 7.


64-bit versions of Windows 10 (and 8/7 maybe?) are no longer compatible with 16-bit code.


Yep, no 64-bit Windows has 16-bit compatibility.


I read mouse as house and was confused for the longest time.


Same happened to me... initially I thought it must be some 1st world problem.


Just use linux


Staying responsive (under load) is also still a work in progress on Linux:

https://www.phoronix.com/scan.php?page=news_item&px=BFQ-Queu...


That is a disk scheduler, and mostly for spinning disks. I haven't seen a stuck mouse cursor in Linux for a long time.


I always found this peculiar about linux; even when the system is swapping hard and application windows/window managers completely freeze, the cursor always remains responsive and movement rendered without so much as a hitch.

I wonder if xorg has some sort of kernel support to enable this.


Not on my computer. Something like a WM will freeze while it waits in the queue to read the 0.2kB it needs from the disk (poor software design, if you ask me). But the mouse will freeze when IO buffers get filled. Just writing a huge file to a (slow-ish) USB stick will make my whole computer freeze, including the mouse, because the kernel USB code doesn't limit the buffer to some sane size (there was a kernel patch, and there is an option, and even with it turned on the problem is still there) (note that the probable reasoning for that is the fact that USB sucks).

A mouse cursor, AFAIK, is a GPU thing, not that it matters (Wayland, I remember something like that, will use normal GPU rendering to render the mouse). In the UNIX-HATERS Handbook there is a section about X where it is written that displays used to have 2-3 planes (IIRC, 2 planes + a cursor plane). I do recommend reading that book, as it is funny.

I remember some talk about better kernel buffer management just for things like this. Memory management in the kernel is one of those actually hard things.


Xorg used to get SIGIO on input; it's now less responsive and uses threads: http://who-t.blogspot.com/2016/09/input-threads-in-x-server....


My only disk is an M2 SSD, and when I run an OpenStreetMap import in the background, my Ubuntu desktop becomes unusable.

I don't remember if the mouse cursor is affected, though, and I don't know if the new IO scheduler really fixes this.


> I haven't seen a stuck mouse cursor for a long time in linux.

That's because the only cursor you have in Linux is the terminal one.


That's asking for downvotes; I voted it up nonetheless.


so basically a fork-bomb. I think Linux can still buckle under one, nothing that obscene...


No, the article says this is a regression in how processes close.


A fork bomb exhausts all system resources; this is not that. It doesn't actually use any resources; rather, it locks the system out of them by serialising both process termination and UI updates.


Well, a fork bomb doesn't really use anything either, except that it makes process table management extremely time-consuming. Not quite the same, but still conceptually similar.


When I saw his workstation specs, I thought, that's the exact same one I have at work! Then I checked the bottom: yep, he's at Google too.


Does yours run Windows? Do you get to choose? If you could choose, which OS would you use? Any clear favourites with your peers for development? Just collecting anecdotes ;)


Lucky bastards


> the C++ compiler needs to see full class definitions to, e.g., know the size of an object and its inheritance relationships, we have to put the main meat of every class definition into a header file!

Without knowing anything about modern C++ and its compilers, this seems fixable.

I'm thinking of a compiler "hint" in the header indicating the size of an object. When compiling the full class, you'd get an error if the number is wrong.
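
The "error if the number is wrong" half is already expressible today with a static_assert next to the full definition; a rough sketch with made-up names (it only verifies the promised size, it doesn't let the compiler use the hint for layout, which is the part that would need language support):

  // thing.h -- hypothetical: publish the promised size next to the declaration.
  #include <cstddef>

  class Thing;                               // definition hidden from clients
  constexpr std::size_t kThingSize = 24;     // the proposed "hint"

  // thing.cpp -- where the full definition lives, verify the promise.
  class Thing {
    double a, b, c;                          // 3 * 8 bytes on common ABIs
  };

  static_assert(sizeof(Thing) == kThingSize,
                "kThingSize in thing.h no longer matches the real layout");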


He's also written some stuff regarding Visual Studio perf.

https://randomascii.wordpress.com/2014/04/15/self-inflicted-...

Close to my heart as I use Visual Studio all day. Horrid piece of software.

I'll probably get downvoted for this.


I live in two worlds at the moment - supporting and maintaining a large PHP application (for which I use phpStorm) and developing .NET applications (for which I, obviously, use VS).

I've been a VS user for nigh on 10 years now and have always found the experience really fantastic. I think it's certainly more coherent than the IntelliJ-based IDEs (which, to be fair, are also very good).

Anyway, I'm genuinely surprised to see VS described as "horrid" - what specifically do you dislike about it?


"what specifically do you dislike about it?"

It's not the tooling I dislike, it's the performance. No matter what I throw at it I still get the same experience. It feels like 95% of everything it does is blocking the UI.


Improving perf in VS is hard without massive rewrites. The fundamental problem is that this is originally a COM app (as in, heavily using COM to componentize itself) designed back in the mid-90s. Consequently, you get all the wonders of things such as STA apartments, and code that insists on running thereon.

As it gets rewritten, new managed bits don't care about all that stuff. But so long as there's one bit of legacy code anywhere in the particular flow that needs to run on STA (usually it's UI thread), you get this whole "you have 20 cores and 60 logical threads, but all those threads need to sync on STA, so everything is serialized and slow" thing.

Even for the new code, the problem is that all those old COM APIs that it needs to interact with (not just for VS itself, but also for the sake of backwards compatibility with third party extensions) are usually synchronous. So if you want background processing, you need to spawn a thread - but, of course, threads aren't free, either.


I have the same experience, on Visual Studio for Mac. I was debugging some C# code with an infinite loop the other day, and it locked up the debugger too! Every time. No way to pause and step through the program, so I had to rely on the OS crash log to figure out what was going wrong.


In my experience, IntelliJ has even worse performance. Xcode is better, at the cost of very limited features.


VS for .NET is a different beast than VS for C++, the latter being a far less tool-friendly environment. There's also the point that newer versions sometimes improve a lot over older ones, but licensing costs may keep people on older versions for too long (that, and the fact that with large C++ projects an upgrade is rarely straightforward). This then ends up with people hating VS when all they've used is VS2008 on C++.

Not that I don't have any complaints (although those may very well be due to certain extensions), but I've always found VS to be much more responsive, stable, and featureful on the .NET side than C++; much of that certainly due to complexities of the respective languages.


FWIW I agree with you. Visual Studio looks like it was designed by Fisher-Price, the whole thing takes up ~60 GB of space, and if you're not running it on a high-end workstation it's likely going to lag when editing large codebases. Not to mention it's closed-source software.


It's against HN policy to downvote-bait.


> I'll probably get downvoted for this.

Obliged :)

No, seriously, why are people doing that?



