Not deeply familiar with C++ modules, but I've built and maintained fairly large build systems for other languages (and written my fair share of C++), and from this article I'm not quite sure where the intractable problems lie. It seems like the .bmi files are effectively an optimization that allows for fast incremental compilation, but a compiler doesn't actually need them to run compilation from scratch: it knows how to generate them, so if they're missing it can fall back to the old, slower #include-style compile-every-file behavior, generating the .bmi files as it goes. It doesn't seem like they add new slow paths that you can't already construct today with macros and #include, so it's hard to see why they'd be DOA: first time compilation should be no slower, but incremental compilation should be much faster thanks to interface stability.
Maybe I'm missing something?
It's not like C++ modules were designed by random nobodies, though; this has been worked on by build infra engineers at major companies with enormous C++ codebases like Facebook, and compiler maintainers e.g. the Clang maintainers. It's possible they completely forgot to think about parallel builds, but that seems at least a little unlikely.
But you can't just compile-every-file. Each file can depend on the outputs of compiling some unknown set of other files. The compiler needs to become a build system, or the build system needs to become a compiler.
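To make the ordering constraint concrete, here is a minimal two-file sketch of a C++20 named module and an importer (the file names, module name, and function are made up for illustration; exact compiler invocations vary by toolchain):

```cpp
// math.ixx -- module interface unit (hypothetical file name).
// Compiling this produces the BMI that importers consume,
// so it must be compiled before anything that imports it.
export module math;
export int square(int x) { return x * x; }

// main.cpp -- cannot even begin compilation until the BMI
// for "math" exists, unlike an #include-based translation unit.
import math;
int main() { return square(3) == 9 ? 0 : 1; }
```

With #include, both translation units could be compiled in parallel from scratch; with modules, the build system has to know about the math → main edge before scheduling anything.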
The clang modules proposal had the concept of mapping files, mapping module names to file names.
Companies like Facebook will presumably use proper build systems that already encode the dependency information in the build files rather than try to autodetect it. In that kind of an environment this proposal probably isn't particularly painful.
The compiler will not become a build system because this is out of scope for C++. With or without modules, C++ will continue to rely on an external dependency management tool, such as a Makefile. The introduction of modules will not change anything in this respect.
Indeed. You've now taken one solution off the table. The other one is for the build system to become a compiler, which is equally unacceptable. That leaves you with manually encoding all dependency information in the build files. Which most people aren't doing (the exception being Bazel-like build systems which enforce that).
That seems to leave us with just one conclusion: the article is right, and most of the ecosystem will never migrate to modules, leaving us with the worst of both worlds.
> parsing C++ is mostly equivalent to becoming a C++ compiler.
It really isn't. Parsing a language just means validating its correctness with respect to a grammar and extracting some information in the process. Parsing is just the first of many stages required to map C++ source code to valid binaries.
The presence of template specialisations and constexpr functions means that the GP is right here; you cannot decide whether an arbitrary piece of C++ is syntactically valid without instantiating templates and/or interpreting constexpr functions. Consider
    template <int>
    struct foo {
        template <int>
        static int bar(int);
    };

    template <>
    struct foo<8> {
        static const int bar = 99;
    };

    constexpr int some_function() { return (int) sizeof(void*); }
Now given the snippet
    foo<some_function()>::bar<0>(1);
then if some_function() returns something other than 8, we use the primary template and foo<N>::bar<0>(1) is a call to a static template member function.
But if some_function() does return 8, we use the specialisation, and foo<8>::bar is an int with value 99; the relational operators are left-associative, so the tokens parse as the comparison (foo<8>::bar < 0) > (1), i.e. first 99 < 0 (false, promoted to the int 0), then 0 > 1.
That is, there are two entirely different but valid parses depending on whether we are compiling on a 32- or 64-bit system.
You only need to parse the "module <module name>" and "import <module name>" statements. No need to parse all of C++ for that. You could probably even do that with a regex.
It also has to do all the preprocessing to see which import statements get hit. I don't think templates could control at compile time which module to import, at least I hope not.
You are misrepresenting the concept of undecidable. If the compiler can say if the program compiles or not, then it is most certainly decidable. What you want to say is that it cannot be determined without full parsing, so no preprocessing is possible.
No, it's actually undecidable. C++ templates have been shown to be Turing-complete, which means that template instantiations can encode the halting problem. Determining whether a program compiles or not therefore requires solving the halting problem.
In practice, compilers work around this by limiting template instantiation depth.
I gave an example of a template program to show the general method. Obviously, primality is decidable, but there exist candidate C++ programs whose parse tree is undecidable. The trick would be to encode your parser in a template, run it on the undecidable program (i.e., itself), and create a contrary result. Does this have any effect on practical C++ builds? I honestly have no idea.
They could just specify that the module/import statements need to be at the top of the file (excluding comments). Most people will do this anyway. Then the build system only needs to parse comments and module statements, which should be fast and easy.
So in reality build systems will be required to invoke at least the preprocessor to extract dependency information.
AFAIK the modules support in the Build2 build system does exactly this, and in fact caches the entire preprocessed file to pass to the compiler proper later.
Having the compiler produce header dependency information is possible, since the dependencies are just an optimization. If there's no dependency information available, you can just compile all of the files in an arbitrary order, and you get both the object file and a dep file. And then on further runs you use the old dep files to skip unnecessary recompilations.
With modules, you can't compile the files in an arbitrary order: if A uses a module defined in B, B must be compiled first. So you need to have the dependency information available up front even for the first build. And since it needs to be available up front, it can't be generated by the compiler. It must either be produced by the build tool which becomes vastly more complicated, or manually by humans.
This is no different from a situation where C++ compilation has a binary dependency on other modules. The best-known example is a static library (.a file): the project cannot be built if a static library is missing. With modules, one cannot compile the project with missing modules either, so the build system will have to provide this information.
The C standard (and I presume the C++ standard) has very carefully avoided the idea of the preprocessor being a separate program at all. The wording was chosen so that a separate preprocessor is never necessary, because most C compilers do not have one. It is only Unix-heritage compilers that really have a separate preprocessor, and even they're not consistent about it.
Your conclusion is incorrect. Most people with simple projects will use simple techniques to make modules work without worrying about the preprocessor. Large companies will create their own tooling to use modules in their own way. My point is that this is how C++ has been used since its inception. C++ users are already aware that the language needs external building support and modules cannot change this reality. But modules will certainly improve how the language is used.
The compiler won't be a build system? I'm not quite so sure. We already have -MD in gcc to emit Makefile rules for the dependencies of the current file. It's not much of a stretch to propose a similar flag to emit a list of required modules. In fact the very same flag could emit a foo.bmi target requirement when you "import foo" and your Makefile should have foo.bmi as one of the products of compiling the foo module. You could also have a similar flag that tells you what modules are built from the current cpp file given some compiler options.
What I gathered is that module compilation is intended to be safe from preprocessor actions defined outside of the module. So the code you would generate with #include-style compilation and the code you would generate by compiling modules in the intended fashion aren't guaranteed to be the same. It seems as though this would mean that projects involving modules simply couldn't be compiled in the previous fashion.
First-time compilation can actually be slower: where before you could compile 8 files at a time, now maybe only 3 can start, because the others depend on those 3. Once they finish, perhaps 6 can run in parallel because the rest of the code depends on those 6 modules, and so on. The dependency chains serialize parts of the build.