Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> The build system doesn't need to become a compiler. It just needs to parse

parsing C++ is mostly equivalent to becoming a C++ compiler.



The module preamble has been explicitly designed to be parseable without requiring a full blown C++ parser.


> parsing C++ is mostly equivalent to becoming a C++ compiler.

It reallt isn't. Parsing a languagr just means validating its correctness wrt a grammar and in the process extract some information. Parsing something is just the first stage and a one of many stages required to map C++ source code to valid binaries.


The presence of template specialisations and constexpr functions means that the GP is right here; you cannot decide whether an arbitrary piece of C++ is syntactically valid without instantiating templates and/or interpreting constexpr functions. Consider

    template <int>
    struct foo {
        template <int>
        static int bar(int);
    };

    template <>
    struct foo<8> {
        static const int bar = 99;
    };

    constexpr int some_function() { return (int) sizeof(void*); };
Now given the snippet

    foo<some_function()>::bar<0>(1);
then if some_function() returns something other than 8, we use the primary template and foo<N>::bar<0>(1) is a call to a static template member function.

But if some_function() does return 8, we use the specialisation and the foo<8>::bar is an int with value 99; so we ask is 99 less than the expression 0>(1) (aka "false", promoted to the int 0).

That is, there are two entirely different but valid parses depending on whether we are compiling on a 32- or 64-bit system.

Parsing C++ is hard.

EDIT: Godbolt example: https://godbolt.org/z/yR3YHW


You only need to parse the "module <module name>" and "import <module name>" statements. No need to parse all of C++ for that. You could probably even do that with a regex.


It also has to do all the preprocessing to see which import statements get hit. I don't think templates could control at compile time which module to import, at least I hope not.


Parsing C++ is literally undecidable. You can encode arbitrary programs which will emit a syntax error depending on their result.

Here's an example of a program which compiles only if the constant N is prime, and otherwise emits a syntax error: https://stackoverflow.com/questions/14589346/is-c-context-fr....


You are misrepresenting the concept of undecidable. If the compiler can say if the program compiles or not, then it is most certainly decidable. What you want to say is that it cannot be determined without full parsing, so no preprocessing is possible.


No, it's actually undecidable. C++ templated have been determined to be turing complete, which means that template instantiations can encode the halting problem. Determining whether a program compiles or not therefore requires solving the halting problem.

In practice, compilers work around this by limiting template instantiation depth.


I gave an example of a template program to show the general method. Obviously, primality is decidable, but there exist candidate C++ programs whose parse tree is undecidable. The trick would be to encode your parser in a template, run it on the undecidable program (i.e., itself), and create a contrary result. Does this have any effect on practical C++ builds? I honestly have no idea.


They could just specify that the module/import statements need to be at the top of the file (excluding comments). Most people will do this anyway. Then the build system only needs to parse comments and module statements, which should be fast and easy.


Except that, as mentioned in TFA, as currently proposed it's possible to say

    #if SOME_PREPROCESSOR_JUNK
    import foo;
    #else
    import bar;
    #endif
This has legitimate use cases, say

    #ifdef WIN32
    import mymodule.windows
    #else
    import mymodule.posix
    #endif
So in reality build systems will be required to invoke at least the preprocessor to extract dependency information.

AFAIK the modules support in the Build2 build system does exactly this, and in fact caches the entire preprocessed file to pass to the compiler proper later.


That's exactly how it has been specified.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: