The C++ pointer-to-member is a fairly confusing concept. What it actually is is a pair - a pointer to the instance of the struct, and a pointer to a function in that struct.
D calls these delegates, and generalizes it to being a pair consisting of a "context pointer" and a "pointer to a function". The neato thing is these do not have to be struct or class member functions. They can be nested functions, where the "context pointer" is a pointer to the stack frame of the caller. I.e., a pointer to the closure.
Hence, these become lambdas.
Lambdas and pointers-to-nested functions are completely interchangeable with pointers to members. The caller does not know the difference.
In fact, lambdas are far, far more commonly used in D than pointers-to-members.
This isn't correct. C++ pointers to members do not carry any instance information. They are more complex than plain pointers because they still have to do proper virtual dispatch in the presence of multiple inheritance, which requires more machinery than a plain function pointer, but this has absolutely nothing to do with keeping a pointer to an instance/this.
>> The C++ pointer-to-member is a fairly confusing concept. What it actually is is a pair - a pointer to the instance of the struct, and a pointer to a function in that struct.
> This isn't correct. C++ pointers to members do not carry any instance information.
Mr. Bright's description is correct from the perspective of pointer-to-member's use, not when declared nor instantiated.
Given Mr. Bright's role in creating Zortech C, Zortech C++, amongst other compilers, and having used the two specifically mentioned personally, I believe he has a proven understanding of C/C++ compiler implementations.
Since one can interpret Mr. Bright's statement in the form of usage or definition, the former seems most applicable when viewed in the context of the remaining text.
No it's just incorrect and no amount of pandering to his authority will change that.
People make mistakes and this is one of them, it's not the end of the world. Avoid the temptation to believe false information just because you happen to worship someone, it reflects poorly on your ability to critically judge information.
> Avoid the temptation to believe false information just because you happen to worship someone, it reflects poorly on your ability to critically judge information.
Ad hominem attacks are the hallmark of insecurity and your expression of same reeks of pompousness reserved for the most arrogant I have had the displeasure to engage.
You know nothing about me, nor my background.
I hope you take time to reflect on what you wrote above and choose to engage with others differently in the future.
C++ pointers-to-members have no standard structure; their representation is implementation-defined. But in no implementation do they hold a pointer to an instance. If they did, why would you need to supply an instance pointer/reference (again) to use a pointer-to-member?
In most implementations they just encode an offset.
>> The pointer to instance part is the `this` pointer.
> What do you mean? The this pointer isn’t computed until you try to call the pointer-to-member-function.
Think of it from a compiler writer's perspective.
The implicit parameter when using a pointer-to-member function is the function pointer itself. The `this` (instance) pointer must be passed explicitly when invoking it (along with whatever other parameters the function requires).
Ergo:
>> The pointer to instance part is the `this` pointer.
Pointer to what, exactly? It doesn't make sense for it to be an object instance pointer, because you can use the same pointer to member on several different object instances. If one instance's pointer was embedded in the pointer-to-member, it wouldn't work for other instances.
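To make that concrete, here is a minimal sketch. Greeter, call_on and greet_both are made-up names for illustration; the point is only that one pointer-to-member value is reused across two objects, with the instance supplied at each call:

```cpp
#include <cassert>
#include <string>

// Hypothetical type, for illustration only.
struct Greeter {
    std::string name;
    std::string greet() const { return "hi " + name; }
};

// The pointer-to-member names *which* function; the object is a separate argument.
std::string call_on(const Greeter& g, std::string (Greeter::*pmf)() const) {
    return (g.*pmf)();  // the instance is supplied here, at the call site
}

// Reuses one pointer-to-member value across two unrelated instances.
std::string greet_both() {
    std::string (Greeter::*pmf)() const = &Greeter::greet;  // no object involved yet
    Greeter a{"a"};
    Greeter b{"b"};
    return call_on(a, pmf) + ", " + call_on(b, pmf);
}
```

If an instance were baked into pmf, the second call could not target b.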
I have no idea what the format of a pointer-to-member is. It sure looks like a small closure.
> C++ and D diverge here. D has the notion of a delegate, aka “fat pointer”, which is a pair consisting of a pointer to the member function and a pointer to the ‘this’ object. The virtualness of the member function was resolved at the point where the address of the member function was taken
…
> Alternatively, C++ has the notion of a member function pointer, which is independent of whatever object is used to call it. Such an object is provided when the member function is called
Which sounds right. But you said:
> The C++ pointer-to-member is a fairly confusing concept. What it actually is is a pair - a pointer to the instance of the struct, and a pointer to a function in that struct.
Which is neither consistent with that text nor with how C++ works. The D concept of a pointer to member is the fat pointer that encodes this. C++’s is the horrible lambda-ish thing that can find the method starting with any compatible this pointer. Yuck.
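For contrast, a rough C++ sketch of what a D-style fat pointer could look like if written by hand. IntDelegate, Counter and bind_add are hypothetical names, and this is not how any particular compiler lays out delegates; it only shows the "context pointer plus function pointer" pairing:

```cpp
#include <cassert>

// A hand-rolled "fat pointer": context pointer plus a plain function pointer.
struct IntDelegate {
    void* context;
    int (*fn)(void* context, int arg);
    int operator()(int arg) const { return fn(context, arg); }
};

// Hypothetical class whose member function we want to bind.
struct Counter {
    int total = 0;
    int add(int n) { total += n; return total; }
};

// Binds one specific object and one specific member function into a delegate.
IntDelegate bind_add(Counter& c) {
    // A captureless lambda converts to a plain function pointer.
    return IntDelegate{&c, [](void* ctx, int n) {
        return static_cast<Counter*>(ctx)->add(n);
    }};
}
```

Unlike a C++ pointer-to-member, the resulting value is tied to one object and cannot be reapplied to another.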
Delphi captures the pointer-to-instance and pointer-to-function concept too. It uses them for event handlers: delegating behaviour to other classes. To do that, method pointers are often a 'fat pointer': the pair of object instance and function.
type TMethodPtr = procedure(x : Integer) of object; // 'of object' means an OO method pointer, not procedural
var p : TMethodPtr = foo.bar; // Captures both foo and bar
p(4); // Calls foo.bar(4)
Fun: you can use them in C++ too in C++Builder's dialect via the rather ugly syntax,
void(__closure * myClosure)(int); // myClosure can point to a method taking an int, returning void
myClosure = pObj->func; // Assign myClosure; this captures both pObj and the address of func
myClosure(4); // Call it: this calls pObj->func(4)
Syntax aside, they're neat because they match type safety by the method signature, not by the type of the object on which they're called (which lets you use them to delegate to any class, not just ones inheriting from specific ancestors). This bypasses contravariance constraints too.
C# calls them delegates too, as does Virgil. Virgil uses the fat pointer trick to avoid any heap allocations. In Virgil,
class C {
def m() -> int { return 33; }
}
var x = C.new(); // allocate a new C
var y: void -> int = x.m; // delegate bound to x and m
var z: C -> int = C.m; // C.m method is first-class, takes an object
var t = z(x); // equivalent to x.m();
I never understood why C++ and other languages had such ugly syntax for such obvious concepts.
I can't speak to Virgil, never used it, but C++ features try their best to adhere to the principle that you don't pay for what you don't use. It's not perfect but pointer to member functions in C++ use the simplest implementation that is possible for that given functionality. In C# delegates can not be unbound from instance variables, they must be bound at the point of creation. In C++ that is not the case which makes C++ pointer to members much more flexible.
It looks like in Virgil you can have both bound and unbound pointers. You are welcome to correct me but I suspect it's implemented in such a way that you are always paying the cost of having a bound function pointer even if you only ever use it as an unbound function pointer. This would violate the principle I mentioned.
In C++ if you don't mind always paying that cost, you are welcome to use std::function and it will work just as it does in the example you gave:
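A minimal sketch of that idea, not the original snippet; the class C and the helper names call_bound and call_unbound are assumptions chosen to mirror the Virgil example:

```cpp
#include <cassert>
#include <functional>

// Hypothetical class mirroring the Virgil example.
struct C {
    int m() const { return 33; }
};

// Bound form: the callable carries the object with it (here via a lambda capture).
int call_bound() {
    C x;
    std::function<int()> y = [&x] { return x.m(); };
    return y();
}

// Unbound form: std::function accepts a pointer-to-member and takes the object as an argument.
int call_unbound() {
    C x;
    std::function<int(const C&)> z = &C::m;
    return z(x);
}
```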
In Virgil every first-class function is represented as a fat pointer, i.e. two words[1], for both bound and unbound delegates. It's simple to understand, the syntax is clean, and it always works the way you expect it to. It doesn't create garbage and basically means you use two registers instead of one. The unbound version even does a proper virtual dispatch for you.
I understand C++'s intention, but it's doing no one any favors here and makes it really hard to build proper abstractions. Pointers to members are clunky, hard to use, and easy to screw up. AFAIU it's possible for a pointer to member to be a single simple function pointer, but that is not guaranteed, and you shouldn't rely on it. It seems like a very bad tradeoff. std::function is considerably more heavyweight (and according to this https://stackoverflow.com/questions/13503511/sizeof-of-stdfu... might be arbitrarily large); i.e. it's slow in practice and people avoid it.
[1] For programs less than 4GB combined code and heap, the two pointers can be packed into a single 64-bit word (though the compiler doesn't do this currently).
>The unbound version even does a proper virtual dispatch for you--in C++ you can break class invariants by skipping a virtual dispatch by using a member pointer.
This is not true, the C++ version does the proper virtual dispatch.
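A small sketch to check this, using a made-up Base/Derived hierarchy. Calling through a pointer to a virtual member function selects the override for the object's dynamic type:

```cpp
#include <cassert>

// Hypothetical hierarchy, for illustration only.
struct Base {
    virtual ~Base() = default;
    virtual int value() const { return 1; }
};

struct Derived : Base {
    int value() const override { return 2; }
};

// Calls through a pointer to Base::value on whatever object is passed in.
int call_value(const Base& obj) {
    int (Base::*pmf)() const = &Base::value;
    return (obj.*pmf)();  // dispatches virtually on the dynamic type of obj
}
```

Skipping the dispatch takes an explicitly qualified call such as obj.Base::value(), which is a different construct from a member pointer.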
>I understand C++'s intention, but it's doing no one any favors here and makes it really hard to build proper abstractions.
std::function is a perfectly fine abstraction built on-top of the lower level primitives if you don't care about always paying a performance penalty. As someone who writes performance sensitive code, I do care so I try to avoid that penalty.
Of course std::function can allocate arbitrarily large amounts of memory; so can Virgil's implementation. The sizeof(std::function) is always fixed, but because it can capture arbitrary functions, which themselves can carry arbitrarily large state, std::function must also have the potential to allocate arbitrarily large amounts of memory.
In C++ people avoid std::function because as you said it's slow, and people tend to not use C++ for programs that can be slow, this goes back to the principle of not having to pay for what you don't use. In other languages you don't get that choice, you basically are required to pay the worst case cost even if you never use it.
It's almost never used. The delegate version is almost universally used.
> Even if the lambda were to inline the call, it still would be a completely different location in the executable image.
Not sure what you mean. Inlined code is not in a different location, it's right where one is executing! Also, optimizers are pretty darned good these days.
Both the member function and the lambda are their own symbols with their own machine code -- might not even be possible for them to be the same code due to ABI concerns.
Even if they're the same code, I don't think it's that easy for linkers to merge them?
Inlining can happen in several places - the front end, the optimizer, or the link step. The process of inlining removes the need for the symbol for the inlined function.
I recommend writing some code snippets, compiling them with inlining on, and looking at the resulting assembler code.
Generally inlining can happen way earlier than linking. E.g. Virgil's compilation model doesn't have a linker at all, it's a whole-program compiler, and that works pretty well, even for 50KLOC programs.
For a trampoline to have no overhead, you need the call to the trampoline to be changed to a call to the underlying function.
That's not an optimization that can easily happen due to the traditional compilation model.
Even if inlining were to happen, you end up with bloat, and few compilers are able to merge similar code like this (which can only happen at link-time, obviously, since the functions might be in different translation units). This optimization is known as ICF, and is not commonly enabled.
In practice I don't think the inliner takes ICF into consideration when deciding whether to inline anyway, so you just end up calling a function that calls another function.
You're talking about C/C++'s compilation model and ABI and there are plenty of others. ICF is a hack to deal with C++'s naive template expansion. There are lots of other languages that don't work that way at all and don't need a linker optimization like that.
It certainly has its use cases, e.g. classes with intrusive reference count, where the "release" method would destroy the object once the refcount has reached 0.
struct foo {
atomic<int> refcount;
// other stuff
};
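A slightly fuller sketch of that pattern, under stated assumptions: RefCounted, add_ref, release and the live_objects counter are hypothetical names, and the counter exists only so the destruction is observable.

```cpp
#include <atomic>
#include <cassert>

int live_objects = 0;  // hypothetical counter, only here to make the effect visible

class RefCounted {
public:
    RefCounted() { ++live_objects; }

    void add_ref() { refcount_.fetch_add(1, std::memory_order_relaxed); }

    void release() {
        // fetch_sub returns the previous value; 1 means this was the last reference.
        if (refcount_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete this;  // only valid because instances are always created with new
        }
    }

private:
    ~RefCounted() { --live_objects; }  // private: destruction must go through release()
    std::atomic<int> refcount_{1};     // the creator holds the first reference
};
```

The private destructor rules out stack instances and stray delete calls, which is most of the discipline this idiom needs.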
You can pass around a pointer, COM-style, and call ->Release as needed. But you can be a lot less error-prone by passing around a smart pointer that understands the intrusive refcount and handles releasing. At which point you get something like 'delete ptr' in a destructor, not 'delete this' in a Release function.
> 'delete ptr' in a destructor, not 'delete this' in a Release function
That approach requires client and server to be written in the same language, use the same memory allocator, and same compiler. For COM objects, often all of these are false.
For example, C#, PowerShell or VBScript code can call IUnknown.Release() method of C++ implemented COM objects, which will cause C++ code to deallocate the memory. However, these higher level languages can’t directly delete C++ objects: they know nothing about C++ runtime, or C runtime.
In practice, people would (hopefully) wrap the whole thing in a smart pointer anyway, instead of manually calling addRef() and release(). But for COM style interfaces you need the release() method because you only want to expose pure virtual interfaces.
Yeah, I too have used this to wrap message buffers passed to and from kernel and user space; quite handy if you design and use the class with discipline.
In a language with affine types (i.e. real move semantics, like Rust but not like C++), destroying this is safe. OTOH deleting, as in freeing memory under the assumption that it was allocated in a particular manner, is only safe if you can somehow enforce that it was allocated that way.
Returns x+1 or x-1 depending on the direction the tadpole swims (-~x to increment x, ~-x to decrement x). For a short moment I believed this was a real thing, because my programming font has a special ligature for ~- and -~.
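The arithmetic does check out on two's complement, where ~x equals -x - 1. A tiny sketch (tadpole_inc and tadpole_dec are made-up names; values at the extremes of int, where the negation would overflow, are ignored):

```cpp
#include <cassert>

// Two's complement gives ~x == -x - 1, hence:
constexpr int tadpole_inc(int x) { return -~x; }  // -(-x - 1) == x + 1
constexpr int tadpole_dec(int x) { return ~-x; }  // -(-x) - 1 == x - 1
```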
Note that "increment" and "decrement" normally refer to changing the value of a variable (what ++ and -- do). It would have been funny if C had defined -~= and ~-= as the increment and decrement operators.
I don’t hate this! The author is, despite their sarcasm, correct—sometimes integers are insufficiently expressive. It’s the C++ equivalent of geometry in the Wolfram Language.
Extreme take, but the older I get the more I start thinking that the very concept of operators was a mistake, overloadable or otherwise, especially in a systems language. Even for something as simple as addition there's just too many actual operations that I might want to invoke (wrap on overflow, saturate on overflow, crash on overflow, raise exception on overflow, flag on overflow so I can raise later, return boolean to indicate overflow, return tuple of wrapped result and overflowed remainder on overflow).
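To illustrate how different these are, a sketch of three of those additions as named functions on 32-bit integers. The function names are my own, not from any standard library. The wrapping variant relies on unsigned-to-signed conversion being modular, which is guaranteed from C++20 and is what mainstream compilers did before that.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

constexpr std::int64_t kMin = std::numeric_limits<std::int32_t>::min();
constexpr std::int64_t kMax = std::numeric_limits<std::int32_t>::max();

// Wrap on overflow: add in unsigned arithmetic, where wraparound is well defined.
std::int32_t add_wrapping(std::int32_t a, std::int32_t b) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) +
                                     static_cast<std::uint32_t>(b));
}

// Saturate on overflow: add in 64 bits, then clamp into the 32-bit range.
std::int32_t add_saturating(std::int32_t a, std::int32_t b) {
    const std::int64_t wide = std::int64_t{a} + b;
    return static_cast<std::int32_t>(std::clamp<std::int64_t>(wide, kMin, kMax));
}

// Report overflow: an empty optional means the result did not fit.
std::optional<std::int32_t> add_checked(std::int32_t a, std::int32_t b) {
    const std::int64_t wide = std::int64_t{a} + b;
    if (wide < kMin || wide > kMax) return std::nullopt;
    return static_cast<std::int32_t>(wide);
}
```

A single + can only pick one of these, which is the complaint.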
Perhaps operators could only be overloaded, without any defined by the language itself. Then projects could define useful operators for their specific case to make things more terse/infix/etc. while still requiring awareness of their implementation.
No. Haskell made me decide that operators should not be definable by users. It is truly a miserable affair to read terse point free code riddled with operators from a half dozen libraries ( or one overly clever library ).
Cleverness is a register that is easy to overflow, and too many don't have the good taste to avoid doing so.
I think GP meant users should not be able to define new operators (as is possible and fairly common in Haskell), but may still be allowed to overload the existing operators.
That’s a good point. I suppose I was imagining something like internal-only operators that could be defined within the scope of a single module but not exported; though I suppose with any sufficiently large module with enough orthogonal pieces you’d still run into arcane and overly-terse operators.
Don't make "operators" a special case - make them just functions following the normal rules. Scala gets close to this (though sadly it still has some operator precedence rules). To define a function called "plus", you do "def plus" (or "fun plus" or whatever your syntax is). To define a function called "+" you do "def +". It's so much nicer and more consistent. Yes, some library authors abuse it to write functions or classes with dumb names, but you can write dumb names with letters as well.
Unfortunately life is complex and all of those semantics are useful. You have to read and write documentation and follow well established conventions (like math).
But that's the problem, even the mathematical operations in programming languages don't follow the established conventions of mathematics. And even mathematics overloads operators all the time to mean subtly different things. And even if it didn't, mathematical notation was optimized over centuries for terseness, not for readability, so optimizing programming languages for familiarity with mathematics is optimizing for the wrong thing anyway. We can do better than standard mathematical notation, and that "better" probably looks like "just use properly-named methods or functions that communicate their intent in plain English, not in forbidding moon-runes".
Honestly I can't think of many notable abuses of operator overloading in C++.
The worst is probably the original idea to overload the bit shift operators for stream I/O, something which never really caught on outside the standard library.
I remember a university project where I used the >> (and <<) operators to "send" data between services.
The code was a simulation of a parallel system with multiple services that sent messages between them, or something like that. Instead of using serviceA.send("hello",serviceB) you had something like serviceA >> "hello" >> serviceB.
I did something similar as a student, making my own exception class with std::ostream:
throw exceptionC() << "error code: " << t;
I often found myself having to format error strings for exceptions, so I thought I could just do it like cout in one line. I know now this is bad for i18n strings.
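For anyone curious, such a class might look roughly like this. It is a reconstruction under assumptions, not the original code: StreamError and describe_failure are hypothetical names.

```cpp
#include <cassert>
#include <exception>
#include <sstream>
#include <string>

// Exception type that accumulates its message through operator<<.
class StreamError : public std::exception {
public:
    template <class T>
    StreamError& operator<<(const T& value) {
        std::ostringstream os;
        os << value;          // format with the usual stream rules
        message_ += os.str();
        return *this;
    }

    const char* what() const noexcept override { return message_.c_str(); }

private:
    std::string message_;
};

// Throws in the one-line style described above and returns the resulting message.
std::string describe_failure(int code) {
    try {
        throw StreamError() << "error code: " << code;
    } catch (const std::exception& e) {
        return e.what();
    }
}
```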
There could be a couple of examples in Boost like Boost Spirit [0]. Qt had some like putting stuff in containers with bitshift operator. There was a GUI toolkit that used + operator to put widgets on a window.
I've seen operator* overloaded to return RAII guards, including ones for RCU, and operator() overloaded for so many things that should just be lambdas it's hilarious
It was certainly how I learned to do it, but lambdas have been around for over a decade and I still see people writing functors. The only use case I've seen where it made sense is for things like coroutines.
For std::visit() (std::variant.visit in c++26 I see) which is the visitor pattern, you can use a functor with multiple visit types or roll your own overload() template to merge multiple lambdas into one class.
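A minimal sketch of the roll-your-own version in C++17. The overloaded helper is the commonly seen idiom rather than a standard library facility, and Value and describe are made-up names:

```cpp
#include <cassert>
#include <string>
#include <variant>

// Inherits operator() from every lambda passed in, merging them into one overload set.
template <class... Ts>
struct overloaded : Ts... {
    using Ts::operator()...;
};

// Deduction guide (needed in C++17; implicit since C++20).
template <class... Ts>
overloaded(Ts...) -> overloaded<Ts...>;

using Value = std::variant<int, std::string>;  // hypothetical variant type

// Picks the lambda whose parameter matches the alternative currently held.
std::string describe(const Value& v) {
    return std::visit(overloaded{
        [](int i) { return "int " + std::to_string(i); },
        [](const std::string& s) { return "string " + s; },
    }, v);
}
```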
But why would I want to do this instead of just having multiple lambdas that capture the same values by reference, or using shared_ptr and synchronization? I can see lifetimes of the data being an issue, but you shouldn't be calling std::visit in a way where that could cause a problem without synchronization anyway.
Lambdas these days cover 99% of the needs for custom function objects. But for that 1% it is useful to be able to have full control of your closure.
For example how would you implement std::function without overloaded operator()?
Also lambdas are defined in terms of structs with overloaded operator().
Without overloading, the standard could still ad-hoc define the specifications of lambdas and std::function, std::ref, etc, but the language would be worse off.
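As a rough illustration of both points, with hypothetical names: AddOffset is approximately what a capturing lambda desugars to, and CountingAddOffset is the kind of hand-written closure where you want state the caller can inspect afterwards.

```cpp
#include <cassert>

// Roughly what the compiler generates for: [offset](int x) { return x + offset; }
struct AddOffset {
    int offset;  // the captured value
    int operator()(int x) const { return x + offset; }
};

// Hand-written closure with extra state a plain lambda would not expose.
struct CountingAddOffset {
    int offset;
    int calls = 0;  // readable by the caller after the calls have happened
    int operator()(int x) {
        ++calls;
        return x + offset;
    }
};
```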
That example seems contrived. Why would I want to know something was called n-times? Even for benchmarking I would just capture an integer and increment it.
To be mildly pedantic, printing in a destructor is horrible practice because stdout/stderr might be pipes or sockets and writing might fail. It's a really bad idea to do anything in a destructor that affects anything but the class being destroyed for those reasons, and when you get an error, it shows up as an opaque exception or trace or hang at the end of a scope instead of where it actually mattered.
I actually ran into that at work a few days ago. I wanted to provide a callback that accumulated stuff, and check that the total was equal to what was expected, or crash the program otherwise (fail fast). I could have equivalently written the output of the test to a capture-by-ref variable and checked it outside if I really wanted to.
ATL also has CComPtrBase, which overloads & so that a smart pointer to a COM object can be passed as an out parameter, as in while(pEnum->Next(1, &pFilter, &cFetched) == S_OK). Especially fun when debugging DirectShow filtergraphs someone made in the UI completely. There is more of an explanation here: https://devblogs.microsoft.com/oldnewthing/20221010-00/?p=10...
> If you object to iostream on religious or stylistic grounds,
No, I object to it on usability grounds. It took more than 30 years for the committee to admit it, but C++ finally has typesafe std::print/std::format, after admonishing programmers for using `printf` in C.
Streams have numerous design problems (e.g. representation is managed by the stream, not the thing you’re printing) making the shift operator syntax the least of the problems.