
I really think the most exciting feature of Carbon (indeed, its justification for existence) is its backwards compatibility with C++, followed closely by its more modern and flexible governance structure. Even the docs in the repo say you should avoid Carbon unless you have lots of C++ code you need to interact with.



To be fair, the compatibility is a design goal, not a complete feature as such.

Even C++ struggles to be compatible with C++. I expect Carbon to have at least as hard a time with it, especially given the "bazaar" that is the C++ build and packaging ecosystem.

I also expect that Carbon will have a hard time tracking C++ as it evolves. It could freeze C++ feature development? Or it could target an old and aging C++ standard?

Point is, I'm not taking C++ compatibility as a given. It seems very hard and may require a lot of big tradeoffs.


The compatibility target is C++17, which I imagine was chosen primarily to avoid the complexity of modules in C++20. They're also excluding non-trivial preprocessor macros, and saying that you'll probably have to write a bit of glue code in some unusual situations.

It's a very realistic take on C++ interop, but still well beyond anything offered by other languages.


If the goal of Carbon is considered to be an off-ramp for Google's C++ projects, then Carbon does not have to evolve with C++. Google can just say "we use up to C++17 and Carbon" for these projects.


Yeah, but there will probably be a window in which that offramp is generally interesting. If the worldwide C++ codebase moves along, Carbon becomes less viable as a choice.

Ironically, Google (googletest, gRPC, etc.) has been more aggressive than most in dropping support for older C++ standards.


Since Apple and Google have focused on their own languages, clang has run out of steam: no one else seems that interested in improving ISO C++ support upstream beyond what the LLVM foundation requires, which is C++17 as a baseline.

If the likes of Intel, AMD, ARM, IBM, Embarcadero,.... don't step up ISO C++ support upstream, clang will stay mostly C++17 with some bits and pieces from C++20.

It is ironic that, after so many years of being joked about for its non-standard extensions, VC++ now has the best ISO C++20 support.


Some of the biggest problems with C++ come from its backwards compatibility with C. Yes, it wins you users in the short term, but it's a pain to support as both languages evolve.


The binary compatibility is the big deal. The other safe languages have explicitly taken the view that interop with C++ is bad, and that we should instead do interop with the significantly less safe C.

The real killer is that the lack of any interaction with C++ means that you can’t do any real incremental adoption of one of those safe languages in any big security critical projects. Saying that the solution to C++ is to just not use it ignores the reality that C++ exists, and large projects in C++ exist. It doesn’t matter if you don’t like C++.

The final problem with the safe languages - with the exception of swift - is that they are all hell bent on not providing even just basic ABI stability. It doesn’t matter how “safe” your language is if the first thing you have to do is make a pure C interface (even losing the potential for automatic lifetime management that C++ would allow).

So I can have two libraries both written in rust, and their entire safety is contingent on each library talking to the other through a C API.
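As a rough sketch - every name here is hypothetical - this is what that C boundary tends to look like: everything gets flattened to raw pointers and plain structs, so neither side's compiler can check ownership or lifetimes across the call:

    /* buffer.h - hypothetical C boundary between two libraries that are
       both internally memory safe; all type and lifetime information is
       erased at this layer */
    #include <stddef.h>
    #include <stdint.h>

    typedef struct Buffer Buffer;  /* opaque handle; the real type lives behind the wall */

    Buffer* buffer_create(size_t capacity);   /* caller now owns the Buffer */
    int buffer_push(Buffer* b, const uint8_t* data, size_t len);
    void buffer_destroy(Buffer* b);           /* a double free is now the caller's problem */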


> The other safe languages have explicitly taken the view that interop with C++ is bad

I think it's more that interop with C++ is hard and ultimately a lot less valuable than interop with C, which is what the vast majority of languages use for FFI.

> It doesn’t matter how “safe” your language is if the first thing you have to do is make a pure C interface (even losing the potential for automatic lifetime management that C++ would allow).

I don't see how that leads to "it doesn't matter"


If you consider the context of Carbon (Google having a _lot_ of C++), solving for "interop with C++ is hard" might be considered worth the price.

Whether that's true in practice...? I guess we'll need a few more years to tell.


Of course, that's why they're building it. They're solving a very specific, niche, hard problem. It would make no sense for other languages, which want to solve more general problems, to tackle C++ interop over C.


I do think this is very much a Google project, specifically because they have a lot of C++ and recognize that there needs to be a path to a memory safe environment; that is why this project exists.

But I don't think the goal is inherently wrong, nor do I think it's niche - there is a lot of C++ in the world, and much of it would benefit from being memory safe. A lot of the largest projects, and certainly many of the most heavily used software projects, are written in C++. I think you'd agree that having them no longer be trapped in a memory unsafety tarpit would be a Good Thing.

Let's imagine someone came out with a magic C++ compiler that had 100% source and ABI compatibility and no performance costs, but was completely memory safe - remember, "magic". You'd obviously expect, if not demand, that every C++ project switch to that magic compiler.

So let's imagine we're on a scale from 1 to 10, where 1 is the memory safety of current compilers and 10 is our magic, clearly impossible one. Obviously every step we can take towards 10 is great, but it is going to come with trade-offs in adoption cost. So on another 1 to 10 scale, where 1 is rewriting in some other random language (rust, etc.) and 10 is your current compiler, you want the new language to be as close to 10 as possible.

One thing you could do is build all your code with ASAN or a similar mechanism: that would have a low adoption cost and give a dramatic improvement in runtime safety, but at a huge performance cost (time for another 1 to 10 scale?).

Carbon appears to be an attempt to design a language with a reasonably high score in all of the scales - no 10s, but mostly 7s or something. I personally think that there are things we could do to C++ that would make a meaningful improvement to safety without having the solution be "rewrite in other language", but I won't fault a project saying "we'll support incremental adoption" as their adoption path in exchange for stronger safety.


> I think it's more that interop with C++ is hard and ultimately a lot less valuable than interop with C, which is what the vast majority of languages use for FFI.

Interop with C is trivial, and that is why everyone has it - all you have to do is expose the primitive types (ints of varying sizes, floats of varying sizes), pointers for anything heap allocated or with non-trivial copy/destroy semantics, and plain structs.

Interop with C++ is hard, but you can at least start off without exposing template types and functions, and then C++ interop reduces to the above C functionality plus some fancy pants layout rules for vtables.
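To make that "fancy pants layout" concrete, here is a rough sketch - loosely Itanium-ABI flavored, with real-world details like RTTI slots and offset adjustments elided - of how a C++ object with virtual functions looks from the C level:

    /* roughly what `struct Shape { virtual double area(); ... };` becomes
       at the binary level: the object's first word points at a table of
       function pointers laid down by the C++ compiler */
    struct Shape_vtable {
      double (*area)(void* self);
      void (*scale)(void* self, double factor);
    };

    struct Shape {
      const struct Shape_vtable* vptr;  /* the "vpointer" */
      /* data members follow */
    };

    /* the C++ virtual call `s->area()` compiles to roughly this: */
    double call_area(struct Shape* s) {
      return s->vptr->area(s);
    }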

Supporting interop at a C API level is fine as your FFI to other languages as a library, but dropping your safety guarantees as a library talking to other code written in your own language is counterproductive, because the first thing anyone needs to do is that "rare" FFI work of recreating a memory safe wrapper around a system library.

> I don't see how that leads to "it doesn't matter"

Ok fair cop, that was a little strong, or at least insufficiently qualified.

I'll try again:

The safety of a new language doesn't matter to existing software, if there is not a reasonable adoption path for that software.

Obviously it's valuable to be able to implement new code in a memory safe language that has modern features and a trivial path to calling C code (as opposed to the historical "safe" language Java, for which talking to non-Java code - even plain C - is absolute misery).

But the largest sources of memory safety related security bugs nowadays are large existing codebases. Some, like linux, are plain C, and so using C as your FFI isn't adding any more safety issues than already exist, and the existing code is all by definition just C types that can be easily accessed, copied, etc., as they're all just trivial bunches of bytes.

But some of the largest software in use is C++. The most obvious examples are browser engines, which by design are required to load arbitrary data from arbitrary locations, and then intentionally hand that arbitrary data a near-arbitrary-execution Turing machine - but there are a bunch of other very large, and critical, projects.

Not having an interop story with C++ means that if any of these projects wish to adopt some new safe language, they are stuck with no transition path. Even Mozilla, who created rust, has had only partial success adopting rust in gecko - see the layout engine, where the "transition" was: rewrite the entire engine, and when it's done, swap the engines wholesale.

For other libraries it means a large API performance regression, because you have to add a pile of marshaling logic to wrap the mandatory C interface. Obviously this marshaling also impacts projects with C APIs, but we're going to pretend for now that there's no heap allocation being introduced.


> The other safe languages have explicitly taken the view that interop with C++ is bad

It's not that it's inherently bad, it's just insanely difficult to do and probably isn't worth the pain. Carbon's approach to solving this problem includes embedding a custom C++ compiler as part of its toolchain, and at this point it's just an idea - who knows if they will actually be able to do it.

> they are all hell bent on not providing even just basic ABI stability

Right, the famous stable C++ ABI


...yes? C++ absolutely is ABI stable, and the ABI is described in excruciating detail, as demonstrated by all those C++ libraries that you don't have to recompile on every OS update.

I suspect what you're thinking of is how easy it is to make an ABI-breaking change without changing the API, which is indeed true: you have to make sure that you only add members to the end of objects, for example. However, that exact constraint also applies to C.


> C++ absolutely is ABI stable

A C++ ABI doesn't even exist; there are several competing standards (far fewer now than there used to be) and nothing is guaranteed even between different compiler versions, let alone different compilers. Not that long ago gcc had two implementations of std::string you had to choose from, and if the library you were linking against chose the other one, you were out of luck.


A thing that supposedly does not exist is nevertheless leveraged to build whole distros, and its excessive stability resulted in the creation of Carbon.


This happens to work on Linux, where the two major compilers (gcc and clang) are (mostly?) compatible. Windows is a different story: C++ code built with MSVC is generally not compatible with GCC/clang and vice versa, the most notable differences being vtable layout and default 32-bit calling conventions.


[edit: and I just realized I replied to you elsewhere. sigh. keeping it classy :-/]

Windows is a different platform, so comparing it to linux isn't relevant, any more than saying I can't run the code I compiled for sparc on a ppc Mac.


But "C++ on Linux" isn't "C++". It's a subset of C++ usage. The lack of competing compilers on some OS might be a feature of those OS, or in the eyes of some mabye even a failure of those OS, but it's certainly no feature of the language. ABI compatibility isn't about using the same binary on different OS, it's about running binaries from different compilers/compiler versions/compiler configurations in the same process.


The point is: C++ libraries compiled for Windows with MSVC are not ABI compatible with C++ libraries compiled for Windows by GCC or clang.


Fragile base class problem is solved, then? When did that happen?


No, but that's also not relevant as C++ is ABI stable. I can take any C++ compiler, and I can compile any C++ library, and then I can take some other compiler and compile some other piece of C++ that calls my library, and it will Just Work.

I can then go back to my library and make a bunch of ABI-safe changes, just as I would have to do in C, and compile my library again with yet another compiler. At that point the program that was using the older version of my library would continue to work with the new build of my library without needing to be recompiled.

This is because C++ is ABI stable.

If I am in C++, and I make an ABI breaking change to something that is used as a base class elsewhere, then I have broken the ABI.

But this problem also exists in C, for exactly the same reason. For example, let's write some silly C.

If I am a C library, and my header declares a struct:

    #include <stdint.h>

    struct AwesomeThing {
      intptr_t foo;
    };

    void doSomething(struct AwesomeThing* thing) {
      thing->foo = 0;
    }
And then someone uses my library in their code:

    #include <stdlib.h>
    /* ...plus the library header from above, declaring AwesomeThing and doSomething */

    struct ImAwesomeToo {
      struct AwesomeThing theAwesomest;
      int* thisFieldIsGreat;
    };

    void doSomethingElse(void) {
      struct ImAwesomeToo thing;
      thing.thisFieldIsGreat = malloc(sizeof(int));
      doSomething(&thing.theAwesomest);
      free(thing.thisFieldIsGreat);
    }
Obviously this is a somewhat silly example, but now say I make a change to my library:

    struct AwesomeThing {
      intptr_t foo;
      double bar;
    };

    void doSomething(struct AwesomeThing* thing) {
      thing->bar = 42;  /* for callers built against the old layout, this
                           writes past their embedded one-member struct */
    }
By the standard rules I haven't "broken" ABI, but now that call to free() is going to cause problems: the caller was compiled against the old one-member layout, so its thisFieldIsGreat pointer lives exactly where the new library thinks bar is, and thing->bar = 42 scribbles over that pointer before it gets freed.

That's the fragile base class problem.

If you are making an API that will have ABI stability requirements, you have to expend quite a bit of effort designing the API so it's not only pleasant to use, but can also be evolved without breaking the ABI. As with C APIs, the people who make C++ APIs know how to make them robust as well.

That said, you could have a C or C++ ABI that is robust to this kind of change; it just hurts performance, and obviously adopting it would break ABI :D

Anyway, the problem with what rust and co are saying is that the same source can result in different ABIs from one compiler version to the next, or one API can compile to a different ABI depending on what the rest of the code in the project is doing.

That means the OS can't use the safe language as an actual OS API, which I just think is wasting an opportunity.

Swift manages, and ostensibly performs the same kind of optimizations within a module, which is most of what you want. That said, because it's the system API, all objects are refcounted, and the refcount is atomic - I was writing a raytracer in it (this is how I learn programming languages) and the refcount overhead was annoying to deal with.

Obviously there are trade-offs in all these choices, but I still feel like rust etc. could do more. Rust has rich support for annotations, so an "abi_stable" annotation seems like it would be perfectly reasonable - it would be in keeping with rust's general theme of the default behaviour not having any implicit performance costs, but it would be easy and very clear when you were making something into an API.


> As with C APIs, the people who make C++ APIs know how to make them robust as well.

It seems too optimistic to me. I use Gentoo, and I believe that the only reason I have no such troubles is Gentoo, with all its maintainers who do great work testing different combinations of libraries. And of course `ebuild`, which can rebuild libc.so without breaking the system. I tried that once when I had just started with linux, and I failed; I never tried to do it again myself, I let `ebuild` do it. It is not so easy, you know, to change the version of a library which is a dependency of every binary in the system. And that is a C library - I wouldn't even try if it were a C++ library. Though who knows, two decades ago when I started with linux I was probably dumb enough to try.


The C++ standard does not define an ABI.

> I can take any C++ compiler, and I can compile any C++ library, and then I can take some other compiler and compile some other piece of C++ that calls my library, and it will Just Work.

This is most definitely not true. It just happens that clang and gcc are mostly compatible, but MSVC and clang/gcc are not! For example, vtables are implemented differently and therefore virtual method calls can crash (unless you follow certain rules, e.g. no virtual destructors, no overloaded virtual methods, etc.).

With 32-bit code, compilers won't even use the same calling convention by default... (look up 'thiscall')

EDIT: strange to see that someone downvoted...


[edit due to the above edit I just saw: wasn't me, I didn't see what you said as being inherently bad and so deserving of downvoting, otherwise I wouldn't have bothered replying :-/]

It sounds like you're saying that gcc or clang fail to conform to the platform ABI, which is a compiler bug. If a compiler wishes to ignore the platform ABI, that doesn't make the ABI not a thing.

But if you are comparing ABI compatibility to MSVC, you are comparing the ABI used by the code generated by clang or gcc to what is by definition the Windows platform ABI. If the code generated by a compiler targeting a given platform does not match the ABI used by that platform, the problem is that the compiler is generating broken code.

> vtables are implemented differently and therefore virtual method calls can crash

No, vtables are implemented incorrectly by the compiler. If the vtable for a type laid down by gcc or clang crashes when it is passed to an OS function, that means the compiler is broken, as the compiler has decided to ignore the platform ABI. Again, if I write a compiler that chooses to use the wrong ABI when interacting with the host OS, I don't get to claim that it is the language's fault, or the OS's, or the ABI's. Similarly, I don't get to compile code for linux that makes windows system calls and then complain that there isn't a windows ABI.

Here is the thing: The ABI for vtables is specified for every non-trivial platform. The ABI for argument parameter ordering is specified for every platform. The ABI for struct layout and struct padding is specified for every platform.

"thiscall" is the ABI for member functions on windows, it's not some magical non-abi thing, is is by definition _the_ abi. There needs to be a name for it, because documentation at the very least has to be able to distinguish it from "cdecl". Importantly, claiming that it's a sign that C++ doesn't have an ABI, or that ABI isn't stable, is simply incorrect. The fact that it exists as a name we can reference is an indication that the ABI matters enough that it is specified. Claiming it's evidence of a lack of ABI is like claiming the C doesn't have an ABI because gcc has stdcall and cdecl - and thiscall is useful MS choosing it as the ABI for member functions on i386 is a reasonable performance win that gcc chose not to do, in favor of using cdecl everywhere, even if cdecl was a slow choice.

There is generally a pile of sadness when dealing with 32-bit C++, as the various platform ABIs came into existence when C++ was new, and so the ABI for it on any given platform just happened to be whatever was used by the first releases of the primary C++ compilers for that platform. That's also why those ABIs - for windows, linux, macOS, etc. - tended not to be super well designed or systematically created. Hence, even when trying to match ABIs, it was easy to hit edge cases where things went horribly wrong between completely different compilers. However, even then, each generation of each of those compilers maintained ABI stability with itself at least. Mercifully, in the consumer 64-bit era every compiler seems to have essentially gravitated to the itanium ABI, which was actually thought out and designed, rather than evolving while the language was being invented.

So continuing to claim that there is no ABI, or that the ABI is not stable, or whatever other claim you wish to make, does not make it true just because you don't like C++. It also does not become true because you are a fan of a language that doesn't want to provide a stable ABI. The historical problems of gcc vs msvc were a product of how the respective ABIs were developed, but on more modern architectures it should not be a problem, and I'm sure that if you find places where the compilers differ from the host OS on a more modern platform, the developers are much less likely to ignore the issue than on i386, where they are stuck with whatever their exact ABI was in the early 90s.

In the end, to be very clear: C++ has an ABI, and it is stable - a fact demonstrated by Windows, XNU, Qt, etc. all existing, and continuing to successfully exist. Technically the C++ standard library demonstrates this as well, but given the various implementations of the standard library are now maintained largely by their respective compiler projects, that seems like cheating.


Surely if GCC's vtable layout isn't compatible with that used by MSVC then COM couldn't possibly work? Not much of the rest of the Win32 API would care though, other than DirectX and possibly a few more modern additions I'm not aware of.


Luckily, GCC's vtable layout is compatible with COM, but that's because COM puts certain restrictions on how you write C++ classes. Most notably, you must not define a virtual destructor and you must not use overloaded virtual methods. It just so happens that the main differences between MSVC and GCC/Clang concern the implementation of virtual destructors (the former uses 1 vtable entry, the latter uses 2) and the vtable ordering of overloaded virtual methods. This means that COM is not affected!
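That binary-level rigidity is also why COM can be consumed from plain C at all: the Windows SDK headers expose every interface as a struct whose first member points at a vtable. A simplified sketch (the real headers spell this out with macros like STDMETHODCALLTYPE and types like HRESULT and REFIID):

    /* IUnknown roughly as C sees it, simplified from the Windows SDK */
    typedef struct IUnknown IUnknown;

    typedef struct IUnknownVtbl {
      /* the three methods every COM interface starts with, in this exact order */
      long (__stdcall *QueryInterface)(IUnknown* This, const void* riid, void** ppv);
      unsigned long (__stdcall *AddRef)(IUnknown* This);
      unsigned long (__stdcall *Release)(IUnknown* This);
    } IUnknownVtbl;

    struct IUnknown {
      const IUnknownVtbl* lpVtbl;  /* first member: pointer to the vtable */
    };

    /* a C++ `obj->Release()` lowers to exactly: obj->lpVtbl->Release(obj); */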


Thanks for your reply!

First off, I work mainly in C++ (Windows, Linux, macOS and some embedded) and I don't hate the language. I really wish that C++ had a well-specified ABI, but the sad reality is that it doesn't.

> If the code generated by a compiler targeting a given platform does not match the ABI used by that platform, the problem is that the compiler is generating broken code.

The compiler only has to correctly implement the C++ standard. The standard does not specify things like calling conventions, struct padding, implementation of virtual functions, so the compiler is free to do anything it wants. (Whether that is a good idea is another topic.)

> No, vtables are implemented incorrectly by the compiler.

The C++ standard does not even mandate that virtual functions are implemented with vtables.

> The ABI for vtables is specified for every non-trivial platform. The ABI for argument parameter ordering is specified for every platform. The ABI for struct layout and struct padding is specified for every platform.

AFAIK, Microsoft does not officially specify a C++ ABI at all (correct me if I'm wrong!) The only thing that comes close is COM - which can be implemented with a restricted subset of C++, but is really language-agnostic.

> Importantly, claiming that it's a sign that C++ doesn't have an ABI, or that ABI isn't stable, is simply incorrect. The fact that it exists as a name we can reference is an indication that the ABI matters enough that it is specified.

'thiscall' is not part of the C++ standard, it is a calling convention invented by Microsoft. The C++ standard does not talk about calling conventions at all.

> Mercifully, in the consumer 64-bit era every compiler seems to have essentially gravitated to the itanium ABI

This is true on Linux, but Microsoft specifies its own 64-bit calling convention (https://docs.microsoft.com/en-us/cpp/build/x64-calling-conve...) and leaves other parts of the C++ ABI unspecified.

Generally, you are right that some platforms, like Linux, have a de-facto C++ ABI, but that is not the same as saying that C++ itself has a well-defined ABI - which is simply not true.

Your initial claim was that binaries compiled with different compilers are always compatible (apart from possible library ABI mismatches). We both may wish this were true, but it is false in the general case (and specifically on Windows).


BTW, MSVC has repeatedly broken its ABI in the past. Only since 2015 have they guaranteed binary compatibility across compiler versions - with certain restrictions: https://docs.microsoft.com/en-us/cpp/porting/binary-compat-2...


Interop with C++ is definitely inherently bad.


I get that this is kind of facetious, but don't you want things like your browser engine to be in a memory safe language?


> The final problem with the safe languages - with the exception of swift - is that they are all hell bent on not providing even just basic ABI stability. It doesn’t matter how “safe” your language is if the first thing you have to do is make a pure C interface (even losing the potential for automatic lifetime management that C++ would allow).

Carbon isn’t looking to do this either. The “solution” they’ve come up with is that you just recompile everything and get things on the same ABI, which is not surprising considering how Google operates internally.


.NET has always supported interop with C++, first with Managed C++, then C++/CLI, and nowadays via Windows Runtime (C++/CX and C++/WinRT).

Likewise, JNI has always had support for COM-like models, and the C API is written in such a way that the struct layouts can be used as C++ classes instead, with implicit this on method calls.


> its more modern and flexible governance structure

You mean where Google has majority control over what does and does not go into the repo?





