Hacker News
A different approach to building C++ projects (rachelbythebay.com)
45 points by kimmk on Jan 20, 2023 | 69 comments


I wish we had a culture that expected every new C/C++ build system to handle a non-trivial project like Boost or Qt (and their dependencies, like ICU and OpenSSL) before being pitched as the new best thing. It's trivial to make a build system that elegantly handles your own toy projects that follow your preferred style and structure. But the real world of C/C++ is a harsh place with a lot of variability.


This is wisdom right here. It's a tragedy of the commons.

I find that the real problem is no one wants to properly learn how their build system works. I don't care if it's make, cmake or bazel -- whatever it is, you need to _learn_ it. I've worked with folks that have 20 years of experience, fantastic C/C++ developers, who look at a makefile and say "Ugh, what is this complicated mess, can you do it in cmake or bazel or something" and expect a silver bullet wherein the makefile build will somehow transform itself into a self-describing, intuitive build system by virtue of some sort of hype-osmosis.


> I've worked with folks that have 20 years of experience that look at a makefile and say "Ugh what is this complicated mess, can you do it in cmake or bazel or something"

This is so true, it happened to me more than once.

A couple of projects ago, we had a complicated build process (7-8 manual build steps that depended on files generated by the previous ones) for an embedded system. I wrote a little makefile replacing all those 7-8 shell scripts, and I was asked to redo it in cmake. I was like, wtf: each clearly defined step in the makefile would turn into multiple unreadable function calls in cmake. Why would anyone want to do that?

Not that Makefiles are perfect, but sometimes the right tool for the job isn't the shiniest. Make does the job of being "good enough" for a lot of little tasks.


> each clearly defined step in makefile would turn into multiple unreadable function calls in cmake

Utter nonsense. For most projects CMake is far more readable and maintainable than Makefiles. There's a reason it's the only build system even close to being a de facto standard in the C++ world.

And yes, CMake is totally awful. But it's still slightly better than Make.

(Feel free to post the Makefile!)


> For most projects CMake is far more readable and maintainable than Makefiles

That's only true for pure C/C++ projects. (I haven't used cmake with other languages, so I won't comment on that.) I was specifically talking about the project where there were 7-8 shell scripts to build each intermediate step's target.

When you have to call external commands to build your artifacts, you end up relying on:

add_custom_target()/add_custom_command()/execute_process()/etc..

Example from that Makefile:

    $(DISK_IMAGE): $(PARTITION_LAYOUT) $(BOOT_IMAGE) $(SYSTEM_IMAGE) $(HOME_IMAGE) 
        $(info "~~~~~ Creating disk image ~~~~~")
        $(eval PARTITION_LAYOUT_BOOT_START:=$(shell grep boot $(PARTITION_LAYOUT) | grep -oP 'start=\s*\K(\d+)'))
        $(eval PARTITION_LAYOUT_SYSTEM_START:=$(shell grep root $(PARTITION_LAYOUT) | grep -oP 'start=\s*\K(\d+)'))
        $(eval PARTITION_LAYOUT_HOME_START:=$(shell grep home $(PARTITION_LAYOUT) | grep -oP 'start=\s\*\K(\d+)'))
        dd status=none if=/dev/zero of=$(DISK_IMAGE) bs=1M count=2000
        sfdisk $(DISK_IMAGE) < $(PARTITION_LAYOUT)
        dd conv=notrunc if=$(BOOT_IMAGE) of=$(DISK_IMAGE) bs=$(BLOCK_SIZE) seek=$(PARTITION_LAYOUT_BOOT_START)
        dd conv=notrunc if=$(SYSTEM_IMAGE) of=$(DISK_IMAGE) bs=$(BLOCK_SIZE) seek=$(PARTITION_LAYOUT_SYSTEM_START)
        dd conv=notrunc if=$(HOME_IMAGE) of=$(DISK_IMAGE) bs=$(BLOCK_SIZE) seek=$(PARTITION_LAYOUT_HOME_START)
When you toss in other variables, cmake becomes a lot more verbose/less readable for these kinds of use cases. Other steps in that Makefile were about generating, signing, and verifying each of the partition images, fetching keys from the servers, etc. That whole Makefile was 500+ lines, and I'm pretty sure in cmake it would have been a lot longer.


> That's only true for pure C/C++ projects.

Ah, I assumed this was primarily a C++ project given the context of the discussion. You're right, it's not really the right tool if you're not doing a C++ project!

> Example from that Makefile

Honestly that looks like a typical fragile Makefile. Grepping all over the place. Regexes. Shelling out to `dd`. This sort of build system leads to constantly fighting confusing error messages. It's a "works for me" build system.

I can see one bug and it's only 10 lines!

I think you can use Makefiles robustly - if you only use them for handling the build DAG, but apparently the temptation to use make as a general scripting system is too great for basically everyone.


Mhm. The nature of the task itself was to use several different command line tools to do various build steps.

Makefile was the only tool that felt "good enough" compared to those shell scripts. As for "works for me" problems, I directly packed up my Ubuntu 14.04 rootfs for chroot and told my colleagues this was / would be the only supported build environment for this task. (Docker wasn't that widespread back then.)

> I think you can use Makefiles robustly - if you only use them for handling the build DAG, but apparently the temptation to use make as a general scripting system is too great for basically everyone.

This is very true. It starts off with "Oh, I can just use bash eval for this simple math" and ends with "Okay, this makefile is too ugly... might as well use a Python script to handle this step". I kinda wanted to write a make tool back then that integrated a better scripting language, and variable checking.

Error messages were typically manageable once you sprinkled in enough of: || (echo "Error: You didn't do the right thing"; exit -1)


Like many C++ devs I hacked around in cmake for years without really understanding what I was doing or how to structure cmake projects. This was made worse as newer ways of doing things came into being.

If this sounds like you, do yourself a favor and go through these slides (or watch the talk they came from):

https://github.com/boostcon/cppnow_presentations_2017/blob/m...

It really clarified things for me, but it also avoids going into too much detail. You will definitely need more info as you go along, but you can look those things up in the docs. This presentation does a good job of showing the core essentials that you can build your knowledge on later.


> whatever it is, you need to _learn_ it.

The problem in my experience is that you invest time learning build system A, then a year later build system B comes out, and not only do you need to relearn a bunch of stuff, but build system B often does some of what system A did but not all of it, plus new stuff you've never encountered before. Then this cycle repeats, endlessly, and every new team you join has adopted the newest build system.

Granted, some ecosystems are worse than others here. In the JavaScript world it went something like: make/grunt/bower, gulp/webpack, esbuild, parcel, vite, rollup, and on and on it goes.

Even in the conservative Java ecosystem we've been through maven, ant, groovy/gradle...

Most of these tools offer incremental improvements for a huge learning cost. It's a nightmare.


> Most of these tools offer incremental improvements for a huge learning cost. It's a nightmare.

Huge learning cost and an even bigger interoperability cost. "Oh, you need to cross-compile your C++ project to that architecture? This dependency uses that build system, so you need to build those libraries separately and then make sure your build system can use them."


Ant is still being used in the Java space? That would be new to me. Gradle is just enforced by Google for Android deployment for some arbitrary reason.

For 90% of use-cases nothing beats the simplicity of Maven.


Who says you have to learn tool B just because it was released? Just ignore it and keep using A.


Because "every new team you join has adopted the newest build system".

Just like in the last 2-3 years, every single new team I've joined is using Kubernetes from day 1. Cargo culting is a huge force.


The problem is not the build system, it's projects that seemingly enjoy complexity.


I think there's a good reason people are moving to Rust: aside from the merits of each language, the tooling for using packages is just so much better in Rust.


This is sort of unrelated, but that reminds me that one of my biggest issues with learning C++ was how I was expected to deal with libraries (particularly on Linux, where conventions will even differ between distros) and building the project. Most guides or what have you sort of teach you how to compile a file or two, but you quickly run into issues that are difficult to solve for a complete beginner without a direct source of feedback.


Every time I've tried to dabble in C++ I've had the same horrible experience.

I end up "Randomly" stabbing at things until it works just well enough to get that particular thing done then dropping it all because it was such a painful experience.

Compared to something like cargo, which works really well, C++ and its build tools just feel flaky.

It may be that I'm just missing a mental model to get to grips with it, but no other major programming language is like that from my experience.


Same here, and I'm dreading starting back up again. Are there any Gradle/npm-like package managers that might simplify this?

Looking for something that is still alive in 2023


Conan works pretty well. There is also vcpkg, but I haven't used it. Both are current and maintained.


Will check them out


In behavioral science this phenomenon is called "learned helplessness". The rabbit will not flee the cage even with the door propped open and no one around.

Theon Greyjoy in "Game of Thrones" exhibited this condition.

It is a thing to overcome.


I agree, it’s so horrible. Passing flags to a compiler for some Byzantine rule just to change lord knows what, but it all works now.

What on earth.


I wish somebody would write a book. Even an online book. An all-encompassing book just about how to link/build C++ projects and all the different solutions people use and how they work in practice. Common ways to organize and configure C++ project builds. How Linux distros each differ in where libraries are stored and how to find and link them. Common issues and how to fix them. But it also needs to convey how all these things work and help you create a mental model that allows you to also find your own solutions.


One of the nice things that Visual C++ has is

    #pragma comment(lib, "xxx.lib") 
You can specify it in a header file for the library. That way if you include the header file, the library mentioned will automatically get linked as long as it is somewhere in the library search path.
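A minimal sketch of what such a header can look like (hypothetical header and library names; this mechanism is MSVC-specific):

    // widget.h -- public header shipped alongside a hypothetical widget.lib
    #pragma once
    #ifdef _MSC_VER
    // Embeds a linker directive in any object file that includes this header,
    // so consumers pick up widget.lib without touching their project settings.
    #pragma comment(lib, "widget.lib")
    #endif

    int widget_frobnicate(int value);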

I have found myself wishing that GCC would also get something like this.


I remember back when I was programming in Delphi I could link directly against a .dll, just take a function prototype from the .h file and translate it into a function declaration like this one:

    function I2C_GetNumChannels(out numChannels: Longword): FT_Result; stdcall; external 'libmpsse.dll';
and that was it; but to do this in MSVC you needed not only the .h header and the .dll itself, you also needed that stupid .lib file that AFAICT had literally nothing inside it except symbol entries that said "no, load it dynamically from this .dll on startup, please". So it was a rather common source of amusement for Delphi programmers that, paradoxically, it was harder to link a program written in C against a DLL written in C than it was to link a program written in Delphi against a DLL written in C.
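For contrast, a rough sketch of the MSVC side of the same prototype (FT_Result and the library name are taken from the example above; the import library is still required at link time):

    /* Hand-translated from the vendor's .h; FT_Result is assumed to be
       a typedef provided there. */
    __declspec(dllimport) FT_Result I2C_GetNumChannels(unsigned long *numChannels);
    /* ...and the link step still wants the stub: cl app.c libmpsse.lib */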


Delphi made an awful lot of things incredibly easy. COM automation, for example. It is just too bad Borland fucked up.


It boggles my mind that, as hardline as WinDev is about using COM, they still fail to match Borland's (nowadays Embarcadero's) tooling for COM.

For a brief moment they almost had it with .NET Native and C++/CX, and then first they killed C++/CX in the name of C++/WinRT (with VS tooling just like in the good old ATL days), and then with UWP's deprecation, CsWinRT also falls quite short of the .NET Native experience with regard to COM.

How an OS development team that is so invested in COM APIs fails to produce tooling better than the competition's for 25 years escapes me.


>"It boggles my mind that for how hardline the WinDev is about using COM, they still fail to match Borland, nowadays Embarcadero tooling for COM."

I've been in the industry for 30 years, just in Canada. I have come to the conclusion that the majority of software developers rarely prefer the most efficient or elegant (whatever that means) ways of accomplishing things. I have a theory that being a "software developer" is sort of self-serving. Because of that they enjoy unneeded complexity, tooling, etc.

I personally have never indulged in coding for the sake of it. To me software development was always just a tool to build amazing products people / businesses would use. So I always think about the product, how it will be used, and how to get there with the least financial "damage" either for my own company or for a client.


I share a similar feeling. From my point of view developers are users as well, and we should enjoy the same kind of nice workflows as regular users.


I wish that would work for all build settings, and be standardized across compilers. Even building complex projects with platform-specific build settings could then be reduced to a simple:

    cc main.c -o bla


Oh, wow, I’ve been thinking for a while about implementing something similar for myself to be used with GCC/clang on Linux.

I suspected that someone might have done it before, but didn’t know of any implementation. I’ll take a closer look at Visual C++ (used it in the last millennium for work) before deciding how mine should work.


I really don't like this feature. I like knowing what and where things are linked explicitly.


This could be handled through a --verbose flag on the linker (and maybe --dry-run).


The problem I've found is that it gives you the name…but not the path. I've ended up seeking ways to turn it off every time I've run into it.

FD: CMake developer


Has everyone forgotten about deps files? Run gcc -MD and it will create .d files that record the dependencies between your source (and header) files. You can then use an include directive in your Makefile to pull that information in for make to use. There are a couple of variations on the theme; some people recommend putting the .d files alongside your source files, others recommend a specific “deps” directory for them, etc. See the man page for details, with particular reference to options like -M, -MM, -MF, -MD, and -MMD.

Of course, the other alternative is to simply #include _every_ file in your project into a single source file, then compile that. It’ll probably be faster than anything else you do, and eliminates several other foot-guns as well. And it means that your build script can just be a shell script with a single line that runs the compiler on that one file.
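As a minimal sketch of that single-compilation-unit ("unity build") layout, with hypothetical file names:

    // unity.cpp -- the only file ever handed to the compiler.
    // Every project .cpp is #included here exactly once, so the whole program
    // is one translation unit and the "build system" is a single command, e.g.
    //   c++ -O2 -o app unity.cpp
    #include "config.cpp"
    #include "parser.cpp"
    #include "network.cpp"
    #include "main.cpp"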

But these days I greatly prefer Rust, where building is always just “cargo build”. Doesn’t get much easier than that.


How does gcc -MD help at all in tracking which .cpp files to link in?

> Of course, the other alternative is to simply #include _every_ file in your project into a single source file, then compile that.

Yeah, no... recompiling the entire project whenever any file is touched is way too slow for any non-trivial project.


Most projects don’t have a lot of .cpp files that they don’t use :) You’re ultimately going to use them all.


Most people only change one or two files at a time. Those files, and the few that depend on them, must be rebuilt upon a change. Incremental builds only do what's necessary. Your proposal always builds everything no matter how trivial the change. It does not scale.


Did I ever say it was the fastest way to build a program? No, I said it was the simplest. But you might be surprised at just how fast it can be; it can scale to programs of significant complexity. Incremental compilation made a lot more sense back when computers were a lot slower.

With large complex programs using complex incremental build systems, especially programs written in C++, you often end up spending most of your time processing the same header files over and over again. Some compilers use complex systems of pre-compiled headers that can paper over this cost, but building your whole program in a single compilation unit that only ever includes each file once eliminates it entirely without adding any additional complexity. If you want proof of this, go look at the efforts over the last year or two to shave a few minutes off the build time of the Linux kernel by carefully adjusting all of the header files, splitting them all into even smaller files so that each #include can bring in just what is needed. This avoids burdening the compiler with parsing things you aren’t going to use just to pull in a struct definition or two.

Plus, by having a single compilation unit you avoid the need to link a bunch of small object files together. Linking is the final step of building an executable, and it cannot generally be parallelized much. By reducing the amount of work the linker needs to do you can greatly speed up a step that can’t be sped up any other way.

But the real reason I recommend it is its simplicity. By spending less programmer time on the build system you can significantly reduce the overall development cost of the whole program. This is a huge benefit of Rust (and a lot of other modern programming systems) that is often overlooked. With a complex C++ program I might spend 10% of my overall development time monkeying around with the build system. If I write it in Rust I can avoid all of that; cargo can do everything I need while requiring only a very small amount of time configuring it. 10% of a project that takes a year is over a month, and with cargo that time expenditure is reduced to minutes or hours.

On the other hand, I have worked as a contractor for many years helping companies develop complex systems. It’s only a rough estimate, but I believe that I have earned over a hundred thousand dollars just by helping them with their complex build systems. A few days here, a few days there, it really adds up.


This sounds like it would be fine for code that you write yourself. But if you're only compiling code you wrote then C++ build systems are pretty trivial. The hard bit is dependencies.


One big lesson that newer languages like go and rust seemed to have learned is that the tooling, building and dependency management need to be dictated as part of the language ecosystem. Dealing with tons of other C++ projects written by other people (even in the same company) - how to specify dependencies, where their build artifacts can be found, etc - is a HUGE pain in the ass and consumer of my time.


They all get the tooling wrong though because none can stand the idea that you might want to mix languages, or add their new language to an existing project with existing tooling.


For Go, mixing languages is uncommon because its FFI suffers an impedance mismatch with its task scheduler. For Rust, mixing languages is extremely common. Rust's entire original reason for existing was to rewrite small parts of a large C++ project.


So common that Google had to create their own integrations for Android and Fuchsia.

I bet the announced Chrome efforts will again require another adaptation.


I'm unsure what this is trying to say. You appear to be in begrudging agreement that Rust is commonly mixed with other languages?


He's saying that Cargo doesn't really help at all with that mixing. Integrating Rust into a multi-language build system is indeed quite a pain. You have a few bad options:

* Drive everything with Cargo using `build.rs`. This sort of works but it's pretty horrible.

* Give up and just have your main build system run `cargo build`. This is what I normally do but again it's not ideal because it doesn't really integrate properly with your main build system.

* Give up on Cargo entirely. This is what Bazel does, and it's probably the most robust solution but it's still not ideal because most of the Rust ecosystem expects you to use Cargo (e.g. rust-analyzer).

I don't think any other languages have really solved this problem either but it is still an annoying problem.


> I don't think any other languages have really solved this problem either but it is still an annoying problem.

As you say, this is not a problem with an ideal solution to draw upon (languages are unavoidably different). From that perspective, I don't see any of these approaches as problems.

If your project is mostly Rust with a little something else sprinkled in, use build.rs ("horrible" is an exaggeration IMO, it's merely not ideal, again, because there exists no ideal).

If your project is mostly something-else with a little Rust sprinkled in, invoke `cargo build` from your build system, and again this is a perfectly adequate solution to a problem with no ideal solutions.

If your project is extra-special, invoke rustc directly, and that's a deliberately supported use case. Hell, I use rustc directly sometimes just because I can.

The bottom line is, Rust provides a best-in-class opinionated build system that is overwhelmingly used by the Rust ecosystem, with best-effort escape hatches for integrating into other projects. To say that Rust cannot "stand the idea that you might want to mix languages", as alleged by the person I originally replied to, is factually incorrect.


That Cargo is not enough, and polyglot codebases either use something else or make quite extensive use of build.rs.


No, most polyglot codebases use Cargo. For something massive like Android that has gone so far as to implement its own build system, no third-party build system would be sufficient, so focusing on Cargo is irrelevant. The fact is, Cargo has always been deliberately and consciously designed to be merely an abstraction over rustc, specifically in order to accommodate opinionated users who would be better served by invoking rustc directly. They're following exactly the workflow that the Rust developers intended, because this saves the Cargo developers from having to perfectly anticipate every potential user's needs, which would be impossible. Rust does the right thing here.


Java, JavaScript, and Ruby didn't do this, and yet they all have solid build and dependency management stories. Java and JavaScript have even managed to have multiple build and dependency management tools existing at once without there being fragmentation and ruin. So clearly, having those tools dictated by the language is not essential.

I'm not sure why C and C++ have such a bad story here. Some combination of greater intrinsic complexity (separate headers, underspecified source-object relationships, architecture dependency, zillions of build flags, etc), a longer history of idiosyncratic libraries which people still need to use, the oppressive presence of distro package managers, and C programmers just being gluttons for punishment, probably.


Dependencies fit into this model, too. Presumably the dependencies build wherever they came from. Do that, package the output, and put it somewhere your project can use it in the suggested manner.


> One .cc/.h pair, one object

Not always the case; I have a project with

    default.o: default.yaml
             $(LD) -r -b binary -o default.o default.yaml
and a default.h containing

    extern const char _binary_default_yaml_start[];
    extern const char _binary_default_yaml_end[];

    #define PARAM_YAML _binary_default_yaml_start
    #define PARAM_YAML_LEN (_binary_default_yaml_end - _binary_default_yaml_start)
which is used in the main code as

   fwrite(PARAM_YAML, 1, PARAM_YAML_LEN, stdout);
printing the contents of the yaml file to stdout.


The point is that the tool is opinionated and demands that this be the case for projects that work with it. Not that the author believes all .cc / .h files work that way.

Your use case would be served by C23's #embed [1]. The same thing has been proposed for C++ but repeatedly kicked down the road, because the standardisation committee wanted to make it more general even though no one had any demand for that, so they didn't know what it would look like. (C++ standardisation in a nutshell.)

[1] https://thephd.dev/finally-embed-in-c23
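For reference, a rough sketch of how the yaml-embedding rule above could look with C23's #embed (assuming a compiler that implements it; the file name is taken from the example above):

    /* Hypothetical replacement for the "$(LD) -r -b binary" rule and the
       _binary_default_yaml_* symbols: the file's bytes become the array
       initializer, and sizeof gives the length. */
    #include <stdio.h>

    static const unsigned char param_yaml[] = {
    #embed "default.yaml"
    };

    void dump_defaults(void) {
        fwrite(param_yaml, 1, sizeof param_yaml, stdout);
    }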


Wow, I didn't know about #embed, that will make life so much easier -- thanks!


So, you do things in a way that would not support this approach. She’s not saying “this is always how it’s done”, she’s explaining what practices you would need to commit to in order for her approach to be viable:

> “If you want something like this to work, you have to commit to a certain amount of consistency in your code base. You might have to throw out a few really nasty hacks that you've done in the past. It's entirely likely that most people are fully unwilling or unable to do this, and so they will continue to suffer. That's on them.”


This is sort of the principle behind the "bpt" build tool, I think, from "vector-of-bool":

https://bpt.pizza/


> If you want something like this to work, you have to commit to a certain amount of consistency in your code base.

That goes for almost everything, in developing ship code.

Today, I am in the initial stages of rewriting an app with a codebase that has “accreted” over two years.

It’s kind of a mess (my mess, to be clear).

I’ll be adding a great deal of rigor to the new code.

I think it will come out great, but I have my work cut out for me.


Conan + (any build system) = problem solved

Conan has a learning curve, but it’s totally worth it. Anyone making their own build system should get some experience with a state-of-the-art package manager before writing a single line of code, because chances are that it already solves whatever problem is motivating you.


I started as a Python programmer and was very used to package managers. I believed in them, I championed them. When I switched to C++ for work I was very disheartened that there wasn't a standard one.

Conan obviously has promise; I haven't spent much time with it, and most of my experience with C++ package managers is with NuGet and vcpkg. However, my attitude toward package managers is changing.

I increasingly like _not_ using package managers because it makes me (and my company) way way way less likely to bloat our software with unnecessary third party dependencies.

I wrote this in another thread: I never believed you should write something yourself if you can find a package for it. My boss told me I should write it all myself; I could probably write it to be faster. I encountered a case where I needed to compare version numbers in Python. For the heck of it I wrote the simplest, quickest, most naive solution I could come up with and then timed it against the most recommended version-comparison package in Python. I blew it away, with 20x the throughput.

I don't believe in package managers anymore. Obviously I'll keep using pip and sqlalchemy in Python, but I'll happily spend the 20-30 minutes it takes to add something like nlohmann-json or md4c to my project over worrying about maintaining a package manager for C++ these days. Precisely because it makes me think twice about adding another dependency.


Sometimes you have dependencies with actual value add that you really don't want to replicate. No, I'm totally not writing a yaml parser, thank you very much. I can probably write a good yaml parser, possibly even better than some 3rd party stuff, but yaml parsing is simply not our business.

And yaml parsing is probably on the simpler side of things. We need to run torch models, so we do need libtorch. We are not rewriting libtorch; that would be silly.


Yeah, this is something that bugs me about the Rust ecosystem. Just to use a random number generator you need to pull in like 15 dependencies. In a simple learning project that would have had about 1 dependency in C++, I ended up with like 75 for Rust. I guess I'm old, but that seems like madness. Cargo being easy and simple is not all upside.


If you just want random numbers, the getrandom crate has only three direct dependencies, one of which is the libc library bindings. If you don’t need everything that’s in rand, you don’t have to use rand.


Yes, this. I've been reading many comments here about problems with different dependencies, each with their own unique build system, and kept thinking "Conan would fix that". I'm surprised that it's still so unknown in the C++ world.


I prefer this equation :)

vcpkg/NuGet + (any build system) = problem solved


Does this work across Linux, Mac and Windows cross-platform projects?


vcpkg the tool is cross platform. Of course not all libraries in vcpkg are cross platform, but most are.


Likewise for NuGet in the context of .NET libraries with native dependencies.


I have been managing all my existing and new projects with nix for a few years and have never looked back. Guaranteed to build and run on all my machines.

    nix flake new --template "github:nixvital/flake-templates#cpp-starter-kit" my-project

will create a skeleton for my new C++ projects.



