Hacker News
How C++ Resolves a Function Call (preshing.com)
299 points by goranmoomin on March 16, 2021 | 90 comments



I was able to use my experience implementing C++ name lookup rules to design a simpler set of rules for D:

There's no notion of a "primary" template. Templates are looked up like functions are.

C++ has rather complex function overloading rules that differ substantially from template overloading rules. The former is a hierarchical list of rules, the latter is based on partial ordering. D uses partial ordering for both.

D doesn't need argument dependent lookup, because it has modules.

D lookup sees all the declarations in a module at once, not just ones prior to the point of use.

The end result works pretty much the same, and hardly anyone notices the difference. Except for that last point - in C and C++, the order of declarations is inverted. The private functions come first, the public ones last, because lookup is dependent on the lexical order. D code is written with the public ones first, private ones last.
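For readers who have not run into the argument-dependent lookup mentioned above, here is a minimal C++ sketch of what it does. The names `geometry`, `Point`, `describe`, and `show_point` are made up for illustration and do not come from the article.

```cpp
#include <string>

namespace geometry {
    struct Point { int x; int y; };

    // Declared in the same namespace as Point, so ADL can find it.
    std::string describe(const Point& p) {
        return "(" + std::to_string(p.x) + ", " + std::to_string(p.y) + ")";
    }
}

// Hypothetical helper: calls describe() without qualifying it.
std::string show_point() {
    geometry::Point p{1, 2};
    // No geometry:: prefix and no using-declaration. The compiler also
    // searches the namespace of the argument's type, which is geometry.
    return describe(p);
}
```

Without ADL, the unqualified call would need to be written as `geometry::describe(p)`.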


What's the standard use-case for D 'in the wild'? It looks interesting and is one of the few languages I've heard of but haven't used -- might try it out on my next project :)


Probably the reason D never got much attention is that D seems too much like C++, similarly to the way that C# is much like Java. (C# has the full weight of MS behind it, which D never had; yet, C# has not displaced Java.)

D works for basically the same set of problems as C++, in much the same way. D, not bound by history ("mistakes", you could say) could be simpler than C++, and able to do some things more neatly, but not enough so to match C++'s momentum—also a product of history. Since D came out, C++ has evolved in ways hard to keep up with. Some make it more like D, but those don't help D.

Rust is different enough from C++ to have some hope of carving out its own space, while trying to address the same sort of problems C++ does. Rust also abandons history, with consequences like D. However, Rust is developing its own complexities. In the end, if it survives, Rust and Rust code will be fully as complex as C++ and C++ code, just off in a slightly different direction.

But it is too early to know whether Rust too will, in the end, fail to find a mainstream place. Ada and Pascal both once had far wider use than Rust, and faded. Ultimately, C++ can do anything Rust can, has capabilities Rust is not expected ever to match, and is still evolving fast. So, for anything that might need to be maintained by somebody else someday, it is easier to believe that a C++ programmer could be found to do it.

In the end, the only true determiner of whether a language succeeds is if a miracle occurs. C got one, C++ got one, Java got one, Javascript got one. Python probably got one. I don't know of any others newer than Fortran. Julia and Rust still have a chance.


You can get what you need done and working in C++. The trouble is, it's harder to do and the result just doesn't look that nice. A simple example:

C++:

    typedef struct S { int x; } S;

D:

    struct S { int x; }

There are a lot of things like that, and it adds up.


I have never even once encountered any problem caused by the tag namespace.

I never, ever, typedef struct tag names. I have only ever seen typedefs like that in C headers. All C++ code I ever see looks like your D example.

I have to wonder if you meant to post something else, or if you have lost touch with what C++ code looks like.


I use the typedef version to head off problems. Anyhow, how about this:

C++:

    template<class T> T func(T t) { ... }

D:

    T func(T)(T t) { ... }

or

C++:

    void f(long);

    void test() {
        f(1); // calls f(long)
    }

    void f(int);

D:

    void f(long);

    void test() {
        f(1); // calls f(int)
    }

    void f(int);

or

C++:

    int f(int a[3]) { return a[4]; } // undefined behavior

D:

    int f(int[3] a) { return a[4]; } // index out of bounds error

Of course, you can use `std::array<int, 3>` in C++, and all the above can be worked around, but it just is more work and doesn't look as good.
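As a rough sketch of the C++ workaround for the last case: `std::array` keeps the size in the type, and its `at()` member checks the index and throws `std::out_of_range`, while `operator[]` stays unchecked. The helper names `element_at` and `out_of_range_is_caught` are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: the size is part of the parameter type, unlike
// `int a[3]`, which is adjusted to `int*` when used as a parameter.
int element_at(const std::array<int, 3>& a, std::size_t i) {
    return a.at(i);  // at() checks the index; operator[] does not
}

// Returns true if the out-of-range access was reported as an exception.
bool out_of_range_is_caught() {
    const std::array<int, 3> a{10, 20, 30};
    try {
        element_at(a, 4);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

This is a runtime check, so it is closer to D's bounds error than to a compile-time diagnosis.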


C++20:

    template <typename T> T func(T t) { return t; }
    auto func(auto t) { return t; }  // same, "abbreviated"

C++ is often not as nice to type as D, or as Rust. That's backward compatibility for you. But having literally tens of billions of lines of code in production, and millions of programmers who know how to use it, counts for a lot.


Your example here is C style though.

    struct S { int x; };

is perfectly valid C++.


Yeah, but that enables you to declare another variable named `S` and it'll compile. The tag name space is still there, and still causes problems. The typedef approach ensures you won't have problems.

People have learned to deal with the quirky behavior of the tag name space, but it's still quirky and serves no purpose.

I wrote Bjarne back in the 1980s that the tag name space should be removed from `class` as that wasn't necessary for backwards compatibility with C. He replied no as it would have broken compatibility with existing C++ code.
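A small sketch of the quirk described above, assuming a plain `struct S` at namespace scope. The function name `tag_demo` and the values are made up for illustration.

```cpp
struct S { int x; };

// Legal C++: a variable may reuse the name of a class declared in the same
// scope. From here on, plain `S` names the variable, not the type.
int S = 7;

int tag_demo() {
    struct S s{35};   // the type is still reachable via the `struct` keyword
    return s.x + S;   // 35 + 7
}

// With the typedef form, the same reuse is rejected at compile time:
//     typedef struct T { int x; } T;
//     int T = 7;   // error: conflicts with the typedef name T
```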


I see it less dramatically: like C# is Java with the benefit of hindsight, D is C++ with the benefit of hindsight. Not needing to stay compatible with early bad decisions and undesirable features is an unexciting but important feature.


C++ will structurally never have a borrow checker, because it's simply not designed in a way for that to be possible.


Correct.

But a borrow checker does not change the set of possible programs. It only identifies the subset of possible programs that would need "unsafe" blocks.

In the absence of a borrow checker, libraries are left to carry the responsibility to define an API that is hard to misuse. Such an API could be slower than what one would define for a library to be used under a borrow checker. But code written to satisfy a borrow checker is sometimes slower than code that doesn't need to.


>But a borrow checker does not change the set of possible programs.

Once you have Turing completeness you have the full set of possible programs, end of story. Languages aren't about changing the set of possible programs, they're about making good ones easier to write, and bad ones harder.


Then there is nothing but ones and zeroes, and nothing to compare, and no insight to be had.

Yet, C++, D, and Rust are much more like one another than any of them is like any other language. Your Turing Completeness tells you less than nothing about any important distinction between them.


D has a prototype borrow checker now. I'll be talking about it at the April NWCPP meeting.


I wouldn't characterize JavaScript's surprise fortune as miraculous.

Miracles are Heaven sent, right? What does Hell send? Maledictions? https://www.powerthesaurus.org/miracle/antonyms


Nobody knows how to get a programming-language miracle.

Sun committed a billion dollars to promoting Java—and the opportunity-cost loss is probably a big part of what killed them. It still would not have been enough to secure a place for Java, except that Java offered MS sharecroppers a road to a sort of freedom. Ada got even more $promotion than Java, in its day, but faded.

C got its miracle on the coattails of Unix, C++ on C's. Javascript rode on Netscape. (Who remembers MS Silverlight? MS spent as much as Sun.)

Python, if it does survive, got its miracle the hard way. Perl and Ruby once seemed more secure than Python does now. Rust and Julia are going the hard route.

All it takes to fade away is not getting that miracle, no maledictions needed.


D is really underappreciated.


Did you ever feel the itch to add a 'trait' to access and manipulate the partial ordering set? Something like CLOS' method combinators or call-next-method (to call the next most specific function).


No, I never thought of it. I have no idea what purpose calling the next most specific function would serve. Besides, there are ways to call specific functions other than the best match - like casting the arguments to types that make it an exact match with the desired function.

Despite the power of D's overload mechanism, my advice is to use overloading modestly, not cleverly. You've written your best code when a newbie looks at it and remarks "pshaw, I could have written this! Why does Walter make so much money?"
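The "cast the arguments" trick can be sketched as follows. It is shown in C++ here rather than D, and the overload set `f` and the wrappers `best_match` and `forced_match` are hypothetical.

```cpp
#include <string>

// Hypothetical overload set.
std::string f(int)  { return "f(int)"; }
std::string f(long) { return "f(long)"; }

// The literal 1 has type int, so f(int) is the best match.
std::string best_match() { return f(1); }

// Casting the argument makes f(long) an exact match instead.
std::string forced_match() { return f(static_cast<long>(1)); }
```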


It's one of those things where, on the rare occasion it's applicable, it really simplifies the code. Kind of like dynamically scoped variables.

It's been a while since I've hacked CL, but I remember method combinators being used in mixin-style programming[1]. A call-next-method is a simplification tool in that context, and it doesn't hard-couple your generic functions. With concepts now in C++, I wouldn't be surprised to see a resurgence of template-mixins, and there it really helps to have some introspection into the partial ordering.

[1] That is, Flavors mixins, not dlang mixins ;)


Well, I think they should rather react like this: "Wow, so Walter gets so much money because he writes much less convoluted code than what I would have come up with."


They will react that way when they stop being newbies.


Exactly.

Sometimes I think about the advice on how to carve a figure in stone - chip away anything that is not part of the figure.

The same applies to programming. Easy to say, hard to do.


> I have no idea what purpose calling the next most specific function would serve.

This is a replacement for calling the superclass, but it uses C3 method resolution order.


This reminds me that you can get the precedence list in Python with C.__mro__, similar to CLOS's (class-precedence-list C)


I usually try to get away with reading as little of the manual as possible until I absolutely have to. Articles like this really help cavemen like me because this is leaps and bounds better than reading the actual C++ standard!


C++ is really the one language where I suggest reading the manual, repeatedly. Sometimes I go and read the docs at cppreference for practically every expression I write.


I'd be careful. cppreference.com is a rewrite of the Standard, likely due to copyright issues. This means errors creep in.


I'd say it's rather an attempt at having some kind of actual user manual. It's not going to be as precise (or authoritative) as the standard, but instead the focus is on being more digestible for users.


The examples, specifically, are extremely helpful!


cppreference.com is by far the best resource for the language and the standard library. It is also well organized and easy to navigate. "errors creep in" is a well-taken complaint about life in general, not cppreference.


I implemented a printf format checker for D, which checks that the argument types are compatible with the format specifiers. (This is really a nice feature, I should have done it long ago.)

To be successful, it had to correctly deal with every nuance of the printf spec. cppreference got a couple of these incorrect with respect to modifiers. Not something a routine user would notice, but a pedantic one would. (Sorry, I don't remember exactly what the mistakes were.)

I still use cppreference because it is so dang convenient. But I don't rely on it when perfection is required.

It's really too bad that copyright forces everyone writing a manual to rewrite & rephrase things. I wish the C++ Standard was online.

One online reference I dearly like is https://www.felixcloutier.com/x86/index.html which has saved me so much time. But it was created by scanning the actual reference manuals, so barring scanning errors, it is exactly correct. I gave up using reformulations of the CPU instruction set long ago, they had too many mistakes.


You might be interested in the EXEgesis project, which is also a machine interpretation of x86 documentation, but which includes several patches for unfixed errata therein.

https://github.com/google/EXEgesis


If the CPUs have bugs in 'em the manual definitely does unfortunately.


C++ resolves certain kinds of tricky function calls via a remote procedure call that sends an SMTP mail to the ISO committee, and waits for a reply. (It's compile time though, no biggie.)


False. SMTP is not mandated by the standard.


In practice, gcc and clang implement it so it's portable /s


But technically it's UB, so you shouldn't use it.


For the curious, this exact series of comments _is_ mandated by the standard and can therefore be reliably located in every thread that involves C/C++.


It's curious how the C++ standard is written with a bunch of rules, rather than an algorithm. The algorithmic version in the article is much clearer to programmers. You'd think the C++ standards committee would write the standard for programmers instead of for -- I don't even know who it's for. Bureaucrats? Tax lawyers? SF Zoning regulation enforcers?


This is done for a very good reason. Some languages have chosen to base the standard on a particular reference implementation rather than a specification. In practice, this has led to any alternative implementations of those languages having to recreate any quirks of the original implementation to prevent inconsistencies. That makes later implementations brittle and tougher to optimize. Also, specs always have a formal grammar for language syntax, and may provide a test suite that covers aspects of the language semantics.


Standards are written with precision and with a minimal number of words (i.e. no redundancy). This is great for implementers, as wishy-washy rules make for incompatible implementations. People really don't want to argue about interpretations of a rule.

A tutorial it ain't. Neither is it a guide, an overview, a textbook, a nutshell, or a marketing document.

Just the facts, Ma'am.


It comes from the general principle of describing the requirements of the implementation, without describing the implementation itself.


I think most standard readers are compiler writers.


I suspect that the only people who've actually read the entire Standard are compiler writers :-)

Note to anyone trying to implement a language - don't bother reading tutorials or textbooks on a language. Just the Standard. Otherwise you'll be sorry.


I can confirm this, having tried implementing an ECMAscript interpreter. "I'll start with just my own knowledge and tutorials, then I'll only need to read the standard when something is ambiguous" will lead to many hours of rewriting code based on incorrect assumptions.


Language lawyers


The level of downvotes on a Bjarne Stroustrup C++ programming language book joke [1][2] shows how much people are really familiar with the C++ ecosystem when they complain about how tedious it is. Have you really taken the time to learn the language?

[1] https://stackoverflow.com/questions/tagged/language-lawyer%2... [2] http://foldoc.org/language%20lawyer


it's OK if you don't get it


it literally is OK if you don't, because it's not written for C++ programmers, it's written for compiler people, and, frankly, those are witches.


> those are witches

Ahem, warlocks. Be careful or I'll turn you into a newt.


We're not here to judge all compiler implementors on how they self-identify. In most non-Western cultures [0] both "witch" and "compiler writer" are gender-neutral terms for those who dabble in the mysteries of the dark arts.

Perhaps a better, less gender-ambiguous term would be "sourcerer" indicating someone who deals in the miracles of transforming source code.

[0] https://www.britannica.com/topic/witchcraft/Witchcraft-in-Af...


What has always fascinated me about C++ is how incredibly complex, yet wild, it is. You can do basically everything: you can create horrible monsters and discover beautiful patterns backed by strong type safety. It might not be for everyone, but I find it can be as rewarding as it is frustrating. You always learn something, and you can always dig yourself out of any hole, it just requires effort and skill.


Unrelated to C++, but the diagrams (SVG) look nice. Were they created from a text format or with a diagram tool? If they come from text, that would be nice to use for architecture diagrams (AsciiDoc/Markdown), etc., and to version-control with git.

PlantUML works well for sequence diagrams, but for the rest, the output is not pleasing to look at.


The author responded to this question on reddit: https://www.reddit.com/r/cpp/comments/m5jpwz/_/gr10ysr


If skynet ever happens, it'll be a self-aware C++ compiler, mark my words. That's why it hated us so much in the Terminator documentaries.


> it'll be a self-aware C++ compiler, mark my words. That's why it hated us so much in the Terminator documentaries

Sorry, but no. Terminators hate us because they were programmed in a mix of COBOL and 6502 assembler [0]

[0] https://www.theterminatorfans.com/the-terminator-vision-hud-...


If Skynet emerges as a C++ compiler, it'll be trivially confused and obliterated by the first freshman CS student that it comes in contact with.


Too late, Skynet already happened in the form of Bitcoin.


Wow, this website is one of the first ones I've seen in the wild that seamlessly integrates SVG into the text flow to produce a single look. Amazing work!


Advanced function resolution technology is one of the main reasons I use C++. The combination of being statically typed yet allowing users to create elaborate function dispatch hierarchies is unmatched.


Could you give an example? To me (I’m not a C++ programmer), “elaborate function dispatch hierarchies” sound like complexity that should be avoided.


Oh not complex at all. This is a boon for programming computations for GA (geometric algebra). There are libraries that automatically generate algebras and operations within those algebras (such as versor: http://versor.mat.ucsb.edu/), and then you can operate on user-defined versors naturally. It actually makes things less complex because you don’t need to remember a million function names. This is one of the special cases where templates and advanced function dispatch enable super powers. 100% optimal code, type-safe, user-defined, expressive, and automatically generated. Watch your compile times though.


boost::spirit and boost::qi are similar uses of deep secrets of template magic. They work nicely, until you make your first minor error and have to understand the whole machinery to get what's wrong.

Maybe someday we'll have a language that is efficient and also allows creating efficient and usable DSLs. C++ ain't it.

Edit: perhaps D would actually be that language?


> Maybe someday we'll have a language that is efficient and also allows creating efficient and usable DSLs. C++ ain't it.

C++ constraints change the game here. Esoteric template errors will be a thing of the past


Assuming you're talking about Concepts, I've heard Andrei Alexandrescu, for example, challenge that hope [0], but I'd be happy to hear this has been fixed.

[0] https://www.youtube.com/watch?v=AxnotgLql0k


> Watch your compile times though

And your error messages. ;)


It's not unmatched, though? Never heard of Julia or Rust?


There's no need to be dismissive. C++ has a somewhat strange position in that it is statically typed but the mechanism of type resolution is fairly loose. This, coupled with SFINAE, allows for concise declarations of things that would take much more work in other languages (or not be possible). If you've ever written a generic constraint in Rust specifying every arithmetic operation that you use in the function body, then you'll know what I mean.


Yes transparent dispatch into templated functions (chosen via SFINAE) may not be everyone’s cup of tea but I prefer it to a separate macro language. No exclamation marks, looks like a normal function and later can be reimplemented as a normal function.
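For anyone who hasn't seen SFINAE-based dispatch, a minimal sketch might look like this. The function name `kind` is hypothetical, and `std::enable_if` is only one of several ways to write it.

```cpp
#include <string>
#include <type_traits>

// Two templates with the same name and parameter list. std::enable_if
// removes whichever one does not apply from the overload set, instead of
// producing a hard error (substitution failure is not an error).
template <typename T>
typename std::enable_if<std::is_integral<T>::value, std::string>::type
kind(T) { return "integral"; }

template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, std::string>::type
kind(T) { return "floating point"; }
```

At the call site, `kind(3)` and `kind(2.5)` look like calls to one ordinary function. A call such as `kind("text")` matches neither template and fails to compile with a "no matching function" error.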


The huge problem with SFINAE and most template techniques is that the semantics of the operation are completely different from the intention, and this always shows in error messages for any small mistake, even most of the ones your users make.

Concepts were supposed to help, but based on what I've heard, they sometimes make the error messages even worse, so I'm not sure.


What are you talking about? Julia is dynamically typed and Rust doesn't support function overloading. It's hard to imagine two worse languages to compare to C++ in response to that comment.


Julia and Rust always come up in discussions of C++, so a lot of people develop the idea that they are better than C++ at everything. This leads to pattern matching and just saying "why not Rust" in every discussion of C++.


Julia is dynamically typed.


Julia is dynamically typed, but if the compiler can deduce the type of your expressions, it will usually generate specialized code that avoids dynamic dispatch. Getting rid of type instability (as it is called) is usually one of the first things I do when trying to optimize a Julia function.

The discussion about C++ lookup rules actually really reminded me of Julia, where functions are quite often overloaded and it is not always trivial to figure out which one should be called. For lack of a better source, there is something about that in this talk: https://youtu.be/TPuJsgyu87U?t=989


Rule of thumb: If your code has ambiguities that make you reach for the details of name resolution rules, or depend on some finer point within them, then - seriously consider differentiating the names.

In the example in the article - I would go to some trouble to avoid having both:

    namespace galaxy {
        void blast(Asteroid* ast, float force);
    }

and

    template <typename T> void blast(T* obj, float force);

either blasting asteroids happens in the context of galaxies, or it doesn't. If there are _different kinds_ of blasting of asteroids - very well, make that explicit.

Also,

    bool blast(Target target);

would be somewhat confusing, since people may expect `blast()` to not return anything and will neglect to check the returned value. And wouldn't it make more sense to just have a default amount of force for the blast? ... again, if it's a different kind of blast, don't name both functions the same.

PS:

* "Better is a good name than fine oil" (that's a biblical proverb).
* Don't be a smart-ass when naming! You'll smile for a second, others will cry for years.
* Don't be stingy with a few more characters in your name - we can handle it.
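To make the `galaxy::blast` versus template `blast` situation above concrete, here is a sketch of what the compiler does with both declarations in play. It assumes `Asteroid` is declared inside `galaxy`; the return type is changed from `void` to a tag string so the choice is observable, and `which_blast` is a hypothetical caller.

```cpp
#include <string>

namespace galaxy {
    struct Asteroid {};
    // Returns a tag string (instead of void) so the chosen overload is visible.
    std::string blast(Asteroid*, float) { return "galaxy::blast"; }
}

template <typename T>
std::string blast(T*, float) { return "template blast"; }

std::string which_blast() {
    galaxy::Asteroid a;
    // Unqualified call: ordinary lookup finds the global template, and ADL
    // adds galaxy::blast. Both match exactly; the non-template wins the tie.
    return blast(&a, 1.0f);
}
```

A reader has to know both ADL and the non-template tie-breaker to predict the result, which is the argument for giving the two functions different names.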


> , it should at least be possible to implicitly convert each argument to its corresponding parameter type.

Oh no! You just got burned (maybe). Make sure you turn on your compiler's warnings to find bogus-but-legal implicit conversions.
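Two examples of the kind of legal-but-suspicious conversion meant here, as a sketch. The function `set_level` and its callers are hypothetical. GCC and Clang can warn about the first case, for example with `-Wconversion`.

```cpp
// Hypothetical function taking an int.
int set_level(int level) { return level; }

int truncated_level() {
    double requested = 2.9;
    // Legal: double converts implicitly to int, dropping the fraction.
    return set_level(requested);
}

int bool_as_level() {
    // Also legal: bool converts implicitly to int (true becomes 1).
    return set_level(true);
}
```

Both calls pass the "can each argument be implicitly converted" test from the article, so both make `set_level` a viable candidate.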


This is a good article in terms of context, but I found it much easier to understand how C++ does class functions when I first saw this in a disassembly output. I guess I'm a more visual learner that has a fairly hard time following a lot of text.


This article doesn't touch codegen. It is merely about the algorithm by which the compiler disambiguates e.g. which function or method named foo you meant to call when you write e.g. foo(1, 3) vs. x::foo("bar").


We need more articles like this on other aspects of C++


https://www.reddit.com/r/cpp is a place where you can find some good content (unfortunately, not consistently).

I would also be happy to read more articles like this one. I personally don't use C++ other than for small side projects, but it's a world fascinating to explore.


And this is one of the reasons why C++ is a mess.


Or use a good IDE which would have something like "right click -> jump to declaration" and it will show you what/where the function is implemented. Or at least that's what I do to keep up my productivity instead of wondering at the marvels of overloading / namespaces that export same function names.


The complexity of the language makes it hard to do even for state-of-the-art IDEs. I have had up to date versions of CLion suggest a dozen potential declarations in some hairy codebases...


The point of the article is not to manually resolve all function calls. Understanding the algorithm allows you to see why a compiler sometimes doesn't choose the function you want and fix it.


*doesn't choose the function you wish for

- there, I fixed it for you. The compiler doesn't choose on its own whim, it's you who made mistakes and a good IDE allows you to correct them.


Huh... So you're saying "right click -> jump to declaration" uses the compiler to resolve all the candidate functions to determine the function that would have been chosen at compile time? ..and then takes you to that function? I thought it just took you to the 1st occurrence of the function it found. If so, that's interesting.


Really depends on your IDE. As an example, Visual Studio ships Microsoft's C++ compiler (MSVC [0]), but uses a completely different compiler frontend (namely EDG [1]) for IntelliSense.

In a way, it does use a compiler to resolve the candidate functions, but it doesn't go through all the compilation stages. And it's not even the same compiler as when you hit compile :)

[0]: https://en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B

[1]: https://www.edg.com/


Ahh that's interesting to know. Thanks for the info.



