The Brick Wall of C++ Source Code Transformation

mynegation · on Nov 19, 2015

I have been one of the primary authors of a commercial C++ static analysis tool.

In a nutshell: C++ is horribly complicated language, and preprocessor is just the first brick wall on the way. Even if you walk around it or manage to blow a hole through it, there are other things. The fact that C++ even as a preprocessed language is very complicated. Its grammar is context-sensitive. You have to scan C++ definitions at least twice to parse them properly. And then you bump into a mother of all brick walls: templates. Which are turing-complete. Meaning that you can specify arbitrary computation in them, that will be performed by the compiler.

aconz2 · on Nov 19, 2015

I've been thinking a lot about source code transformations as a possible key for maintenance in the "future" of programming. Its interesting to see that the preprocessor, which is a source code transformation itself, makes other source code transformations much more difficult, if not intractable. I'm wondering if this inherent to any macro system since they are not bijective.

Two possible directions I see away from macros which would play nicely with source code transformations are un-macros and layers of languages.

Un-macros would take source and display it in the condensed form only for human reading -- this takes away from human writing power of course, which may have to get pushed into the editor. This could also allow for automated pattern discovery.

Layers of languages would look similar to Racket's #lang concept and current macro systems, but without the ability to escape back into the parent language. This creates a new language on top of -- not mixed into -- the parent language.

lmm · on Nov 19, 2015

I wonder if the right option is to push the macro-like functionality down into the language itself. Something like http://okmij.org/ftp/Computation/free-monad.html starts to offer very macro-like functionality, but deferred to runtime, so you can still look at the original value and even imbue its effects with a different set of semantics.

amyjess · on Nov 19, 2015

Or something similar to Nim's metaprogramming features. Nim's templates and macros are wonderful.

aconz2 · on Nov 19, 2015

Are Nim's metaprogramming features fundamentally different than C++?

lmm · on Nov 19, 2015

Do Nim macros enforce reversibility? If not, they're no better than C++ ones in this regard.

Everything I read about Nim has the whiff of propaganda; tell me what it does better rather than just telling me it's good.

joakleaf · on Nov 19, 2015

In short, the problem is that you can't do:

readable code -> [preprocessor transform] -> transformed code -> [apply transformation tool] -> [reverse preprocessor transform] -> readable code

Because both the preprocessor (and potentially arbitrary transformation tool?) are not bijective.

Couldn't you add detection for when the reverse transformation isn't bijective, and then report an error in those cases.

Just because there are examples of abuse of header files and macros in C++, it doesn't mean all of it is like that -- especially modern C++. So these tools could still work for a lot of cases.

Please, don't give up!

pjmlp · on Nov 19, 2015

I heard on CppCast with Dmitri Nesteruk, that CLion is macro aware when doing refactorings, but it was a very complicated process to make it work properly.

Since I spend most of my days in JVM/.NET land, I don't have experience how it really works.

They keep in memory the whole macro -> expanded C++ code transformation.

gh02t · on Nov 19, 2015

> Couldn't you add detection for when the reverse transformation isn't bijective, and then report an error in those cases.

I was thinking something along those lines too, but then it kind of undermines the goal of the hypothetical tool the previous article was proposing. To get programmers on board with adding breaking changes to the standard, I think you really need a tool that works near perfectly (which is the ultimate problem). Large codebases use macros and templates a lot in my experience and if the tool chokes on some those then people are going to be lazy and say "no thanks." My gut instinct is that it'd have to be down in the range 10 errors / 100kloc before "lazy" (read: busy) programmers would accept it. More crucially, it needs to be sure that it recognizes 100% of errors... it can't make any mistakes about when it does decide that a given transformation is correct because bugs introduced by a mistaken translation are probably going to be extremely subtle and difficult to identify.

It's a crappy situation. I think the author was right in the previous article that a magical perfect transformation tool could allow them to make serious and beneficial (but breaking) changes to C++, but such a tool is unbelievably difficult to actually make.

pldrnt · on Nov 19, 2015

So, my after-work project these days is a language that transpiles to C++ and can do all that current C++ can do.

The grammar is completely regular, and the syntax maps 1:1 with the ast in memory, I can generate one from the other idempotently.

The main reason I am doing this is that I wanted a language that is easy to build tooling for (including of course code transformations) while being compatible with existing C++ libraries out there.

This kind of articles give me hope there might actually be demand for my crazy thing once it grows up.

madmoose · on Nov 19, 2015

Any relation to http://www.csse.monash.edu.au/~damian/papers/HTML/ModestProp... ?

pldrnt · on Nov 19, 2015

I read that paper a long time ago and it was certainly an inspiration :)

Originally, the language started out quite different but I can now see why SPECS looks the way it does, I do not agree with everything they did and my effort these days is towards making it simpler and smaller but there's some convergent evolution indeed.

skrebbel · on Nov 19, 2015

So basically a CoffeeScript for C++? That sounds like a pretty good idea in fact. But doesn't "compatible with C++ libraries" imply compatible with .h files, and thus with the preprocessor?

pldrnt · on Nov 19, 2015

Yes, that is the idea.

Dealing with the preprocessor is a lost cause, so I just don't, the idea at the moment is to let the preprocessor do its thing and pipe the output into ctags (and if a better ctags comes out in the future it will be trivial to put that in instead), the result is not 100% perfect but is good enough to capture most of the declarations present in the included headers and make them visible to this language.

If a library absolutely requires preprocessor macros to be useable, someone will have to make equivalent hygienic macros in this language but at least you won't have to make a huge wrapper for everything.

raverbashing · on Nov 19, 2015

And that's why the modern alternatives have done away or greatly reduced what the preprocessor can do.

That and Include Files

I guess it was part of the evolution and experimentation, but it is one of the things that make C++ complicated

petke · on Nov 19, 2015

Thankfully we will get modules in C++ soon. Hopefully the (Microsoft) version that deprecates the preprocessor.

zaro · on Nov 19, 2015

Don't mean to be rude or troll but the first thing that came to my mind when I saw the title was :

Stop trying to make C++ happen, it's not going to happen.

ska · on Nov 19, 2015

   Stop trying to make C++ happen, it's not going to happen.

That isn't rude or trollish, but it is inept.

kjdal2001 · on Nov 19, 2015

c++ is one of the most widely used programming language in the world. It's already "happened"

p4wnc6 · on Nov 19, 2015

Relevant: < http://yosefk.com/c++fqa/defective.html >