I'm involved in a couple of projects which do a lot of metaprogramming to generate boilerplate. The generated code is usually C. For the generator I wouldn't recommend using CPP or M4; use a real programming language instead. I'm using OCaml since ML discriminated unions are a good fit for generating code, but other languages would work too. It's a fantastic technique which greatly reduces bugs, speeds development and eases maintenance.
Here are links to the generators for those projects:
https://github.com/libguestfs/libnbd/blob/master/generator/g...
- This is a small C library. It uses a small-ish generator to generate a state machine, Python bindings, OCaml bindings and (patch coming soon) Rust bindings.
https://github.com/libguestfs/libguestfs/tree/master/generat...
- This is a very large project mostly written in C, using a generator to generate bindings in multiple programming languages, also RPC boilerplate, structures, configuration code and much more. This generator is very large, having evolved with the project over 10+ years.
Edit: Some rules for effectively generating code:
(1) The generated code must look, as far as is reasonable, like it was hand written. This is because you'll be debugging it. We also use #line directives where possible.
(2) chmod -w the generated files so they cannot be edited by accident. Also add a comment at the top pointing to the generator so that developers who have no idea how it works can immediately find the right place.
(3) We add the generated files to the tarball (to allow end users to build without needing OCaml) but keep them out of git (to avoid duplication in commits).
(4) Maximum generation! If some piece of information is written twice, it should probably be written once (in the generator) and the output should go to those two places. Similarly we only have one generator in the project so no one needs to read the Makefiles to find out how a file is generated.
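To make rules (2) and (4) concrete, here is a minimal sketch in Python rather than OCaml. It is not taken from either generator linked above; the `demo_` prefix, the `API` table, the `generator.py` name and the `write_readonly` helper are all made up for illustration. One description of the API drives two outputs, each output starts with a comment pointing back at the generator, and files are written without write permission.

```python
import os
import stat
from dataclasses import dataclass
from typing import List, Union

# Argument kinds, modelled as a small tagged union (one class per variant).
@dataclass
class Int:
    name: str

@dataclass
class String:
    name: str

@dataclass
class Buffer:  # expands to a pointer plus a length in C
    name: str

Arg = Union[Int, String, Buffer]

@dataclass
class Call:
    name: str
    args: List[Arg]
    doc: str

# The single source of truth (hypothetical API). Every output derives from it.
API = [
    Call("connect", [String("uri")], "Connect to a server."),
    Call("pread", [Buffer("buf"), Int("offset")], "Read into a buffer."),
]

GENERATOR = "generator.py"  # assumed name of this script

def c_params(arg: Arg) -> List[str]:
    """Map one argument variant to its C parameter declaration(s)."""
    if isinstance(arg, Int):
        return [f"int64_t {arg.name}"]
    if isinstance(arg, String):
        return [f"const char *{arg.name}"]
    if isinstance(arg, Buffer):
        return [f"void *{arg.name}", f"size_t {arg.name}_len"]
    raise TypeError(f"unknown argument kind: {arg!r}")

def gen_c_header(api: List[Call]) -> str:
    """Emit C prototypes, formatted roughly as a person would write them."""
    out = [f"/* Generated by {GENERATOR}. DO NOT EDIT: change the generator. */\n",
           "#include <stddef.h>\n", "#include <stdint.h>\n\n"]
    for call in api:
        params = [p for a in call.args for p in c_params(a)]
        out.append(f"/* {call.doc} */\n")
        out.append(f"extern int demo_{call.name} ({', '.join(params) or 'void'});\n")
    return "".join(out)

def gen_docs(api: List[Call]) -> str:
    """Emit a plain-text API summary from the same description."""
    out = [f"# Generated by {GENERATOR}. DO NOT EDIT.\n"]
    for call in api:
        out.append(f"demo_{call.name}: {call.doc}\n")
    return "".join(out)

def write_readonly(path: str, text: str) -> None:
    """Write a generated file, then drop write permission (mode 0444)."""
    if os.path.exists(path):
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # allow regeneration
    with open(path, "w") as f:
        f.write(text)
    os.chmod(path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)

if __name__ == "__main__":
    print(gen_c_header(API))
    print(gen_docs(API))
```

Adding a new argument kind means adding one class and one branch in `c_params`; in OCaml the compiler would warn about a missed case, which is a large part of why discriminated unions suit this job.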
For me, in 2006, when I moved from Java to Ruby, the videos by Dave Thomas (pragprog) changed how I think about code. I had done Java reflection, but his neat explanation of everything from method_missing to object-specific methods was amazing!
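For anyone who hasn't seen the method_missing idea: the object gets a hook that runs when a method isn't found, and can answer the call on the fly. The sketch below is a rough Python analogue using `__getattr__`, not Ruby and not anything from those videos; the `Table` class and the `find_by_` naming are made up for illustration.

```python
class Table:
    """Answers find_by_<field>(value) calls without defining each method."""

    def __init__(self, rows):
        self.rows = rows

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        prefix = "find_by_"
        if name.startswith(prefix):
            field = name[len(prefix):]

            def finder(value):
                return [r for r in self.rows if r.get(field) == value]

            return finder
        raise AttributeError(name)


people = Table([
    {"name": "ann", "lang": "ruby"},
    {"name": "bob", "lang": "java"},
])
print(people.find_by_lang("ruby"))
```

Unknown names that don't match the prefix still raise `AttributeError`, so typos don't silently succeed.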
I can't be certain that they were the first, but Oracle's Pro*C/C++ tooling does the same thing as the example in this article, and with virtually the same syntax.
I've worked on a legacy app written with these tools. It's not the most awful thing I've ever had to use. But there's a lot of magic in those macros, and you have to code carefully to avoid introducing action-at-a-distance bugs, esp. w.r.t. error handling.
It's called Embedded SQL, and is part of an ISO standard [1]. At one point an embedded version of Informix translated ESQL/C to ISAM calls, without any SQL parsing and query compilation at runtime. Though I don't know what this has to do with metaprogramming as we know it today.
The first reason (performance) given does not feel very compelling. Metaprogramming is generally complicated, and performance is less important in most cases than it once was, so making that tradeoff doesn't make as much sense now as it might have when such systems were popular, or even in 2005.
Metaprogramming is about making design more formal. A "constant fraction" of anything written by hand should be automated, ideally. Metaprogramming is not complicated; it is an essential skill. Design may be complicated.
Are there any tools (which are not part of the language itself) that can work with IDEs? Using custom pre-build steps feels like something that's very hard for tooling to introspect.
Editor integrations that aren’t aware of build systems are a huge issue. Without awareness of the build system, it’s basically all just guesswork.
For languages like Go and Rust, this is partly solved by putting tooling into the language and standardizing the build system. But even for languages that have build systems, there are reasons to use other build systems.
As far as I know, very few build systems used in the open world have readily available integration with tooling. However, I think this is temporary. As an example, Bazel and Go should play well together in the near future:
Hopefully, other languages will make mechanisms for code intelligence to integrate with build systems. This is especially important if you want a good developer experience across multiple programming languages.
Perhaps the more scalable (in terms of supporting many languages x many build tools) alternative would actually be an inverse relation, a future language server architecture where the build tool orchestrates the language servers. Who knows.