What's the point of std::monostate? You can't do anything with it (microsoft.com)
115 points by luu 80 days ago | 176 comments



A "monostate" in the design-patterns lingo of 20 years ago was a class with only static member variables, basically where all state is shared state. It was supposed to be an alternative to the Singleton pattern where you don't need all those .getInstance() calls and instead can just default-construct an instance, any instance, to get access to the shared state. It fell out of favor because usage was fairly error-prone and surprising to programmers who were not familiar with the pattern. Most people expect that when you create a new instance, you are actually creating new state, but monostate intentionally makes each new instance just a window on the shared global state.
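For anyone who hasn't seen the pattern, here is a minimal sketch of what I understand it to mean. The `Settings` class and its members are made up for illustration and are not taken from any real library.

```cpp
#include <cassert>
#include <string>

// Hypothetical Monostate-pattern class: every data member is static,
// so all instances read and write the same shared state.
class Settings {
public:
    void set_name(const std::string& name) { name_ = name; }
    const std::string& name() const { return name_; }

private:
    inline static std::string name_ = "default";  // C++17 inline variable
};

// Two separately constructed objects act as windows onto one state.
inline void demo_monostate_pattern() {
    Settings a;
    Settings b;
    a.set_name("changed via a");
    assert(b.name() == "changed via a");  // b sees the write made through a
}
```

The surprise described above is visible in `demo_monostate_pattern`: constructing `b` looks like it creates fresh state, but it does not.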

I would've thought that the C++ template class would be just a marker interface to use on a monostate, so that users of the class know that it has shared state. But it seems like usage patterns in the article are very different from that, and all the comments here are ignorant of the history of the monostate pattern and befuddled at its intended usage. Maybe it was added to the standard by someone familiar with the design pattern, but they didn't do a good job with education and documentation to explain to everyone else what it was for?


That just sounds like a different thing with the same name. It's nowhere close to the purpose of std::monostate.


That’s the GP’s point. It is strange that std::monostate was chosen as the name, given that the different Monostate pattern [0] was fairly well established in C++ circles.

[0] https://wiki.c2.com/?MonostatePattern

(originally discussed in http://ftp.math.utah.edu/pub/tex/bib/cppreport.html#White:19...)


I considered that, but a lot of their comment seems to be about changing the class's behaviour to reflect the functionality they were expecting, rather than just giving it a different name. Especially this bit:

> I would've thought that the C++ template class would be just a marker interface to use on a monostate, so that users of the class know that it has shared state.

Also, if that's what they really meant then they surely could have written a far shorter comment that simply says this name is already taken and it should be called something else. They don't seem to be saying that at all.

(Not that it affects my original point, but FWIW that linked meaning of monostate isn't common in my experience, and it sounds like a truly awful idea: if your state is really global then be honest about it and use free functions. So it hardly seems worth reserving a useful word for it over the concept std::monostate is actually about.)


BTW I saw your deleted reply about the point of monostate, thanks. For something that has to implement an existing interface, I can see the possible benefit.


Yeah I actually haven’t used the pattern much myself, so I started to doubt the relevance. Once you start having multiple separate monostates implementing the same interface, and also limit their creation/instantiation in the sense that the code that uses the monostate objects is separate from the code that creates them, the difference to regular objects/classes becomes a bit murky.


Yeah. The design pattern sounds like there is one state, and it is shared. What the STL has is a type whose instances all look the same, hence only one state is possible. They are homonyms with slightly different etymologies.


There’s a secret competition between C++ and JavaScript to come up with more variants of nothing.


VBA has 4 kinds. Null, Nothing, Missing, and Empty.


https://news.ycombinator.com/item?id=40192911

DonHopkins 85 days ago | on: What if null was an Object in Java?

Why stop at null, when you can have both null and undefined? Throw in unknown, and you've got a hat trick, a holy trinity of nothingness! Of course the Rumsfeld Matrix further breaks down the three different types of unknowns.

https://en.wikipedia.org/wiki/There_are_unknown_unknowns

>"Reports that say that something hasn't happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns—the ones we don't know we don't know. And if one looks throughout the history of our country and other free countries, it is the latter category that tends to be the difficult ones." -Donald Rumsfeld

1) Known knowns: These are the things we know that we know. They represent the clear, confirmed knowledge that can be easily communicated and utilized in decision-making.

2) Known unknowns: These are the things we know we do not know. This category acknowledges the presence of uncertainties or gaps in our knowledge that are recognized and can be specifically identified.

3) Unknown unknowns: These are the things we do not know we do not know. This category represents unforeseen challenges and surprises, indicating a deeper level of ignorance where we are unaware of our lack of knowledge.

And Microsoft COM hinges on the IUnknown interface.

https://en.wikipedia.org/wiki/Tony_Hoare#Research_and_career

>Speaking at a software conference in 2009, Tony Hoare apologized for inventing the null reference:

>"I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years." -Tony Hoare

https://news.ycombinator.com/item?id=19568378

>"My favorite is always the Billion-Dollar Mistake of having null in the language. And since JavaScript has both null and undefined, it's the Two-Billion-Dollar Mistake." -Anders Hejlsberg

>"It is by far the most problematic part of language design. And it's a single value that -- ha ha ha ha -- that if only that wasn't there, imagine all the problems we wouldn't have, right? If type systems were designed that way. And some type systems are, and some type systems are getting there, but boy, trying to retrofit that on top of a type system that has null in the first place is quite an undertaking." -Anders Hejlsberg


>"My favorite is always the Billion-Dollar Mistake of having null in the language. And since JavaScript has both null and undefined, it's the Two-Billion-Dollar Mistake." -Anders Hejlsberg

Assuming these mistakes are additive and not multiplicative.


> Throw in unknown, and you've got a hat trick, a holy trinity of nothingness!

`unknown` is a top type; since we're on the topic of nothingness, it'd be remiss not to mention its dual, `never`.

The pair is also called `Any` and `Nothing` in some other languages.


INever implements IEnumerable, so ILove how you can iterate until the 12th of Never, and that's a long long time.

https://www.youtube.com/watch?v=2PnPnSjCUnc


If you like that, you'll love how many ways you can test for equality in LISP.


Each equality function, at least in Common Lisp, has a distinct goal.

Numbers use = if you don't care about type. For instance, the real number 0 and the complex number 0 + 0i would be treated equal by this function.

Do you care about strict pointer equality? eq.

Do you want to distinguish 0 and 0d0? eql.

Do you care about isomorphism? equal.

The only weird function to me would be equalp, given that it performs case-insensitive comparison of characters etc. Overall, I find complaints about the equality predicates to be based on a misunderstanding of the definition of "equality".

Clojure goes the opposite direction and has a single = function that performs all sorts of reflection and deep comparisons; it's convenient in practice, but it's also easy to burn CPU cycles if you are e.g. comparing large collections and only care about pointer equality.


> Overall, I find complaints about the equality predicates to be based on a misunderstanding of the definition of "equality".

It's more that LISP is the only language I know that exposes how complex the concept of equality actually is. All the other ones I use more-or-less make an implicit assumption that equality means one of the things that LISP offers but if you want the other definitions of equality you either have to build support for them by hand or call out to a library function that's going to use (static or dynamic) reflection to reduce the problem to the one kind of equality the language implements.

(Apart from that, the only other complaint is that LISP being as old as it is, the default names for those equalities are too terse to self-describe. You just have to memorize which one is 'eq' and which one is 'eql', there's no way to guess looking at the names themselves).


Common Lisp doesn't do a very good job, unfortunately.

  [1]> (equal '(1 2 3) '(1 2 3))
  T
  [2]> (equal "abc" "abc")
  T
  [3]> (equal #(1 2 3) #(1 2 3))
  NIL
  [4]> (subtypep 'string 'vector)
  T ;
  T
So a string is formally a kind of vector according to the type system itself, but vectors don't use equal comparison whereas strings do. We can reach for equalp:

  [1]> (equalp #(1 2 3) #(1 2 3))
  T
But then we have to accept, something we might not necessarily want:

  [2]> (equalp #(1 2 3) #(1 2 3.0))
  T 
which will not be true under the equal function that we really wanted to Just Work for simple vectors:

  [3]> (equal '(1 2 3) '(1 2 3.0))
  NIL
eql being the default equality function throughout the library is also suboptimal.

The specification allows (eq 1 1) to be false, which caters to unrealistically poor implementations such as ones that heap-allocate all integers. This, in spite of the fact that fixnum is required to be at least 16 bits wide: most-positive-fixnum must be at least 32767, and the fixnum type must be a supertype of (signed-byte 16).

What this means is that if Common Lisp were fit onto a 16-bit machine (good luck with that at all), it would have to provide 16-bit fixnums (easily doable, but not as unboxed values), which defeats the concept of fixnums being the unboxed range of integers, beyond which we have boxed bignums.


  vectors don't use equal comparison whereas strings do
This behavior is required by the specification. From https://www.lispworks.com/documentation/HyperSpec/Body/f_equ... ,

  Two arrays are equal only if they are eq, with one exception: strings and bit vectors are compared element-by-element (using eql).
What makes strings and bit vectors special, in my opinion, is that they are homogeneous arrays. Heterogeneous arrays aren't something that I often use outside of Clojure (usually because they are awkward to use in statically typed languages).

  The specification allows (eq 1 1) to be false
Because the specification doesn't limit the magnitude of numbers nor state how they should be stored in memory. The scenario you've described requires an implementation that intentionally makes every number a bignum or an instance of a class (like Smalltalk).


Lists can be heterogeneous and often are, yet equal recurses on them.

If you have list-based code that uses equal, you have extra work to do if you want to convert it to vectors.

Vectors comparable with equal make good hash table keys when heterogeneous keys are needed which don't need the structural flexibility of lists.


Proof that Brendan Eich never really cared about equality.

https://www.reddit.com/r/ProgrammerHumor/comments/225i15/pro...

(To contextualize and clarify: It's a classic old JavaScript joke about how Brendan Eich hates gay people getting married so much that he donated money to a political campaign against marriage equality, even though he enjoys the human right of marriage himself. Some bigots think they are more equal than others.)


Marriage isn't a human right. Just ask any "men's rights" activist.


Who cares what they think?


That always chafed me about Scheme. I see the utility of `eq?` and `eqv?`, but I'd prefer that there were only `equal?` and functions were defined to get an object's "id" or "numeric equivalency class," or whatever, instead of having different flavors of data structures that differ only for certain values.


Don’t forget good ole `=` for comparing numbers.


Mathematicians got nothing on lazy programmers


What's really weird to me is not that C++ has a unit type and picked a weird name for it (that's just C++). The weird thing is how many unit types it has:

- std::nullopt_t

- std::nullptr_t

- std::monostate

- std::tuple<>

And I'm sure there's more.


The distinct types are the whole point. You wouldn't want a std::tuple<> to be implicitly convertible to a std::optional<T> (for arbitrary T), and std::nullptr_t exists to be the type of nullptr, which captures the conversion behaviours appropriate for null pointer literals and has nothing to do with the variant use case std::monostate exists to serve.
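A small compile-time sketch of that point, using only standard type traits. The particular selection of checks is mine; as far as I can tell they all hold on a conforming C++17 implementation.

```cpp
#include <cstddef>
#include <optional>
#include <tuple>
#include <type_traits>
#include <variant>

// Each "nothing" type converts only where its own library feature expects it.
static_assert(std::is_convertible_v<std::nullopt_t, std::optional<int>>);
static_assert(std::is_convertible_v<std::nullptr_t, int*>);

// The other unit-like types are rejected in those positions.
static_assert(!std::is_convertible_v<std::tuple<>, std::optional<int>>);
static_assert(!std::is_convertible_v<std::monostate, std::optional<int>>);
static_assert(!std::is_convertible_v<std::nullopt_t, int*>);
static_assert(!std::is_convertible_v<std::monostate, int*>);

// And they really are distinct types, not aliases of one another.
static_assert(!std::is_same_v<std::monostate, std::tuple<>>);
static_assert(!std::is_same_v<std::nullopt_t, std::nullptr_t>);
```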


If there was a std::unit_t and it was implicitly convertible to optional, tuple and pointer, I don't think that would be worse in terms of usability at all (maybe worse in readability for people who haven't heard of a 'unit' type).

As for the std::variant use case, using std::monostate is only a matter of convention there. You could use any of the other unit types just the same.
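To put "any of the other unit types" to the test, here is a rough sketch. `Handle` is a hypothetical payload type without a default constructor. One caveat I'm fairly sure of: std::nullopt_t is specified to have no default constructor, so it does not seem to work as a stand-in here.

```cpp
#include <cstddef>
#include <optional>
#include <tuple>
#include <type_traits>
#include <variant>

// Hypothetical payload type with no default constructor.
struct Handle {
    explicit Handle(int fd_in) : fd(fd_in) {}
    int fd;
};

// A variant is default-constructible only if its first alternative is.
static_assert(!std::is_default_constructible_v<std::variant<Handle>>);

// These unit-like types can all serve as the "empty" first alternative.
static_assert(std::is_default_constructible_v<std::variant<std::monostate, Handle>>);
static_assert(std::is_default_constructible_v<std::variant<std::tuple<>, Handle>>);
static_assert(std::is_default_constructible_v<std::variant<std::nullptr_t, Handle>>);

// std::nullopt_t cannot be default-constructed, so it does not help here.
static_assert(!std::is_default_constructible_v<std::variant<std::nullopt_t, Handle>>);
```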


std::monostate is explicitly provided for use with std::variant. It's in the <variant> header. Sometimes people use it for other things, but that's really an abuse, especially given defining your own type suitable for such cases is typically as simple as `struct mytype{};`.
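For reference, a minimal sketch of that intended use. `Connection`, `MaybeConnection` and `is_empty` are hypothetical names; only std::variant, std::monostate, std::holds_alternative and std::get come from the standard library.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Hypothetical type without a default constructor.
struct Connection {
    explicit Connection(std::string host_in) : host(std::move(host_in)) {}
    std::string host;
};

// std::monostate as the first alternative gives the variant an "empty" default state.
using MaybeConnection = std::variant<std::monostate, Connection>;

inline bool is_empty(const MaybeConnection& c) {
    return std::holds_alternative<std::monostate>(c);
}

inline void demo_variant() {
    MaybeConnection c;                         // default-constructs to std::monostate
    assert(is_empty(c));
    c.emplace<Connection>("example.invalid");  // now holds a real Connection
    assert(!is_empty(c));
    assert(std::get<Connection>(c).host == "example.invalid");
}
```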

Using one type to represent empty literals for optional, tuple and pointer types, implicitly convertible to all of them, would make the compiler accept many obviously accidental constructs. In a world where the maintainers of C++ are trying their hardest to make the language safer, what conceivable benefit would there be?


Then you're basically back to "anything can convert back and forth with void*" -- the point is to avoid that.


Why wouldn't you want std::tuple<> to be the same as std::monostate, though? In many languages with a proper unit type such as Haskell and Rust, the zero tuple is the unit type.


What about "void"?


void isn’t a unit type (inhabited by a single value), it’s a “bot” type, i.e. no values inhabit it.


void is the unit type. The fact that it is not constructible is a wart of the language, inherited from C. It would be easy to fix and would simplify a significant amount of generic code.

A function returning bottom cannot return, yet void foo() {} can. In fact it can even return the result of calling other void functions:

   void bar() {  }
   void foo(){ return bar();}
In generic code void is usually internally replaced by a proper, regular void_t unit type and converted back to void at boundaries for backward compatibility.

  [[noreturn]] void bar(); 
would be a candidate for a bottom-returning function, except that [[noreturn]] isn't really part of the type system.
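The "replace void internally with a regular unit type" trick could look roughly like the sketch below. `unit_t` and `invoke_regular` are names I made up for illustration; real libraries spell this differently.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in unit type (what the comment above calls void_t).
struct unit_t {
    friend constexpr bool operator==(unit_t, unit_t) { return true; }
};

// Call f(args...) and hand back unit_t in place of void, so that generic
// callers can always store the result in a variable.
template <class F, class... Args>
constexpr auto invoke_regular(F&& f, Args&&... args) {
    using R = std::invoke_result_t<F, Args...>;
    if constexpr (std::is_void_v<R>) {
        std::forward<F>(f)(std::forward<Args>(args)...);
        return unit_t{};
    } else {
        return std::forward<F>(f)(std::forward<Args>(args)...);
    }
}

inline void demo_invoke_regular() {
    int counter = 0;
    auto r = invoke_regular([&counter] { ++counter; });  // void callable
    static_assert(std::is_same_v<decltype(r), unit_t>);
    assert(counter == 1 && r == unit_t{});
}
```

Note that this sketch returns by value, so reference-returning callables get their result copied; a fuller version would preserve the return category.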


> void is the unit type. The fact that it is not constructible is a wart of the language, inherited from C. > A function returning bottom cannot return, yet void foo() {} can.

Or you could say it the other way, that it is the bottom type, and the fact that it can be used as the unit type for returned values is a wart of the language. Furthermore, void* isn't a pointer of the unit type, it's a type for pointers to undefined/unspecified value types.


Nothing is gained by making void a proper bottom type. It would only break existing code. OTOH making void a proper unit type would be backward compatible and would actually make the language simpler both from a specification point of view and in practical terms.


I didn't mean that it should be made into a bottom type, even though it sort of looks like it on the surface.

I think the original idea was that void meant "unknown", not "empty" or "non-existent": It was all about whether values could be allocated or not (the wart mentioned above). A plain void variable cannot, but a void pointer can be allocated. For functions, they just reused the keyword to mean "no return value" or "no arguments".


A pointer to the bottom type would make even less sense as an interpretation for void *, since such a pointer couldn't possibly point to any initialized value.


Sorry, I wasn't clear enough: I meant that void is not just a mix of the unit and bottom types, it is also used for unspecified (unknown, actually) types.


I ran into this recently writing some C++20 coroutines. The protocol for delivering values from a coroutine that was previously suspended has two flavors: one for values and one for void. My initial draft just implemented the value version and used a struct VoidTODO {} where void should be.

It's too late now. void pointers are used as a pun to mean "type wildcard." If void were a real thing that could have a size and address, that wouldn't work anymore.


Yes!! Been there done that. Two flavors of EVERYTHING: one to deal with functions that return values; and one to deal with void functions. It's awful.


Void means "it is a syntax error to construct a value of this type". This is not a type that exists in category theory or Haskell. (But similar to the "bottom" type.)

Hence, "void*" - a pointer to something, but it would be a syntax error to dereference this pointer.


In that regard void behaves like any incomplete type. In C and C++ you cannot construct objects of incomplete types, nor can you assign through pointers to incomplete types. But you can construct pointers to void and other incomplete types.

Unlike other incomplete types, void has some special behaviour: you can declare a function as returning void, and return with no arguments is also special-cased. You can also cast to void.

A void with proper unit semantics would simply be a complete type instead. The only special case would be return with no arguments implicitly returning a void instance, but that would be pure sugar.
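A quick sketch of the incomplete-type comparison, for C++. `Opaque` is a hypothetical forward-declared struct, and the commented-out lines are the ones a conforming compiler should reject.

```cpp
#include <cassert>

struct Opaque;  // declared but never defined: an incomplete type

// Pointers to incomplete types are fine; objects of those types are not.
inline void demo_incomplete_types() {
    int value = 42;
    Opaque* p = nullptr;  // OK: pointer to an incomplete class type
    void* q = &value;     // OK: pointer to void
    // Opaque o;          // error: variable has incomplete type
    // void v;            // error: cannot declare a variable of type void
    // *q;                // error in C++: cannot dereference a void*
    assert(p == nullptr);
    assert(*static_cast<int*>(q) == 42);  // must cast back to a complete type first
}
```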


That's a good point. Maybe one could argue that rather than the unit type not being constructible, the wart of C is that functions that return "bot" can still "finish executing without returning".

I would almost rather argue that void is indeed the "bot" type, and a function marked with a void return type shouldn't be said to "return void;" rather we should say that it's an overloaded syntax that means the function has no return value at all. Same for "return bar()" there, that's just a false-friend of the syntax for returning a value, just syntactic sugar for "bar(); return;".


void is not a subtype of all types, though.

C and C++ don't have a type spindle, where void would be at the bottom. Only C++ has the concept of subtype, only in the class system, and the C++ class system doesn't have a bottom type; there is no bottom class that is a base for all the others.

void is not a proper type; it's just a hack shoehorned into a convenient spot in the type system.

Which is why the C++ people have to invent this whole zoo of other things.

If void were a type, then, for starters, "return x;" would be syntactically valid in a function returning void. (Only, no possible x would satisfy the type system, so there would have to be a diagnosable rule violation in that regard.)

A function returning void does not return a type. It doesn't return anything; it is a procedure invoked for side effects.

The same situation could be achieved in other ways, like having a procedure keyword instead of void.

The (void) parameter list is another example of void just being a hack. It was introduced in ISO C, and then C++ adopted it for compatibility.

The 2023 draft of ISO C finally made () equivalent to (void), though it will probably take many decades for (void) to disappear.


> C and C++ don't have a type spindle, where void would be at the bottom. Only C++ has the concept of subtype, only in the class system, and the C++ class system doesn't have a bottom type; there is no bottom class that is a base for all the others.

A bottom type is not the base of all other types.

> void is not a proper type; it's just a hack shoehorned into a convenient spot in the type system.

It is a type, but it is not Regular and it is incomplete. 'return x;' is invalid in a void-returning function because it doesn't type check. 'return void()' or 'return (void)0;' or 'return void_returning_function();' are all valid because they type check.
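Spelled out as C++ code, those valid forms look like this. The function names and the `calls` counter are hypothetical and only exist so there is something to observe.

```cpp
#include <cassert>

inline int calls = 0;  // counts invocations of log_call()

inline void log_call() { ++calls; }

// In each case the operand of return has type void, so it type-checks.
inline void returns_void_temporary() { return void(); }
inline void returns_cast_to_void() { return (void)0; }
inline void returns_void_call() { return log_call(); }

// inline void returns_int() { return 0; }  // error: int is not void

inline void demo_void_returns() {
    returns_void_temporary();
    returns_cast_to_void();
    returns_void_call();
    assert(calls == 1);  // only the last one actually calls log_call()
}
```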

Making void regular has been proposed multiple times [1]. It is a relatively simple extension but nobody that cares has the time to carry it through standardization.

[1] https://open-std.org/JTC1/SC22/WG21/docs/papers/2016/p0146r1...


In C, it is not like that. From the 2023 draft:

6.8.7.5 The return statement

Constraints

1 A return statement with an expression shall not appear in a function whose return type is void.

It's a constraint violation. It doesn't matter what the type of the expression is.

It looks as if C++ made a small improvement here.

Yes, the bottom type is at the bottom of the type derivation hierarchy. That's why the word bottom is there; that's what it's at the bottom of. It's also why it can't have any instances. Since every other type is a supertype, then if the bottom type contained some value V, that value would be imposed into every other type! V would be a valid String, Widget, Integer, Stream, Array ... what have you.


To clarify, it could be that the bottom type is not the base of all types, if the language has a split between some types which participate in that sort of thing and others that don't (e.g. class versus basic types or whatever).

But void is not the base of anything in C and C++.

You could argue that void is in some category of types where it is at the bottom; but no other types are in that category.

There is another problem: a bottom type should be the subtype of all types in that category. That includes being its own subtype. There we have a problem: C and C++ void is not a subtype of void in any sense.


That doesn't seem right to me. I can define a function returning "void", and it can terminate. I would expect that a function returning an uninhabited type can never complete.


And “bot” refers to the bottom type: https://en.wikipedia.org/wiki/Bottom_type


So it appears this follows the terrible C++ "naming things with the wrong name on purpose" trend, with the biggest example being naming a variable-length array (a list, collection, etc.) `std::vector` even though "vector" was already a well-known word with a very different meaning.

The word/type they wanted for this was the unit type: https://en.wikipedia.org/wiki/Unit_type and that page's list even states that `std::monostate` is a unit type.


I'd argue that unit would be just as cryptic as monostate, if you don't know what either is.

Like, if we assume someone looking at code with `std::unit`, what might they think this is? If one is not aware of its use in ML or similar, it could just as easily be assumed that it could be something to do with units like meters or kilograms or whatnot. After all, the C++ standard library is vast so it wouldn't necessarily be all that far-fetched.

Then the only question would be to ask why it would be default-constructible. At which point you'd have to read the docs for the type anyway.


It’s called “unit” because it is a unit of the operation of multiplication / constructing a tuple (up to a unique isomorphism): for any type 't the tuple types unit * 't and 't * unit are isomorphic to 't. When you don’t have that as a fundamental operation in your type system, like ML does, then it could be confusing.
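In C++ terms, that isomorphism can be sketched with std::pair and std::monostate standing in for the product and the unit. `to_pair` and `from_pair` are hypothetical helper names.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// unit * T -> T: drop the component that carries no information.
template <class T>
T from_pair(std::pair<std::monostate, T> p) {
    return std::move(p.second);
}

// T -> unit * T: the missing component can always be rebuilt,
// because the unit type has exactly one value.
template <class T>
std::pair<std::monostate, T> to_pair(T value) {
    return {std::monostate{}, std::move(value)};
}

inline void demo_unit_isomorphism() {
    // Round-tripping in either direction gives back what we started with.
    assert(from_pair(to_pair(std::string("abc"))) == "abc");
    const std::pair<std::monostate, int> p{std::monostate{}, 7};
    assert(to_pair(from_pair(p)) == p);
}
```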


Sure, but I would be surprised if a significant fraction of programmers knew that. When we hear the word "unit", that's not the first definition that comes to mind.


> I'd argue that unit would be just as cryptic as monostate, if you don't know what either is.

And if my grandmother had wheels she'd be a bicycle. Of course if there wasn't a standard name for this concept that had been used in the industry for 50+ years (and in theoretical work for over 100) then it wouldn't make any difference what name you used for it. But given that there is a standard name for this concept that has been used in the industry for 50+ years, making up a different name is pretty unfortunate.


I mean, the industry is composed of many, many more C++ programmers than ML programmers.


There is indeed a units library [0] aiming for standardisation in C++29.

[0] https://github.com/mpusz/mp-units


My guess is the maintainers of C++ keep making it more and more awful in an attempt to force people to more sane languages.

But to their dismay the crazier it gets, the more some people dig in and embrace it even more.

/sarcasm. I think.


It's not sarcasm. The maintainers have completely lost the plot.


Even Herb Sutter agrees (I think, in his heart, even if he doesn't come out directly and say it), which is why he is working on cppfront.


Functor is one of the worst offenders IMO. In C++ lingo, it simply means "callable object".


It's not like anyone has a monopoly on the term: https://en.wikipedia.org/wiki/Functor_(disambiguation)

The C++ usage was introduced by Jim Coplien in his 1992 book as a succinct name for an architectural pattern in C++: https://archive.org/details/advancedcbsprogr00copl/page/166/...


In C++ lingo they are called function objects or callables, not functors. You might stumble on code here and there that suffixes a class name with Functor, but it's not that common.


Many blog posts and Stack Overflow questions about C++ call them functors. Boost of all things uses the term in its definitions: https://www.boost.org/doc/libs/1_60_0/doc/html/function/refe...

It's common enough that articles have been written against this usage: https://web.archive.org/web/20170224220253/http://jackieokay...

So this usage of functor may never have been "official", but it's widespread in the community.


I would say it was widespread, but it's now fallen out of favor. Alexandrescu got the name wrong 25 years ago† and that got embedded into early boost, but by the time that blog post was written I feel like I'd mostly stopped seeing it being used by C++ experts.

† I don't know if the original mistake was his, but he certainly did a lot to spread it.


cppreference.com uses the term functor in numerous places for recent C++ features including things in C++26 such as reflection:

https://en.cppreference.com/w/cpp/experimental/reflect

https://en.cppreference.com/w/cpp/utility/variant/visit2

https://en.cppreference.com/w/cpp/experimental/parallelism

Visual Studio's official documentation uses the term functor as of 2021:

https://learn.microsoft.com/en-us/cpp/standard-library/funct...

A simple Google search shows no shortage of recent websites and documents including fairly authoritative references using the term functor to describe a type that overloads operator ().

Heck even the author of this article, Raymond Chen, uses the term functor as recently as 2020 to describe such a construct:

https://devblogs.microsoft.com/oldnewthing/20200513-00/?p=10...

Is Raymond Chen not a C++ expert?


> In C++ lingo they are called function objects or callables, not functor.

The word "functor" has a long and glorious history in C++. Try entering "C++ Andrei Alexandrescu functors" into your internet search engine of choice. For bonus points, try "c++ Scott Meyers functor" as well.


The creator of C++ uses the term functor in the book "The C++ Programming Language" to describe any object that overloads operator ().
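For readers who haven't met the C++ sense of the word, a minimal sketch. `AddN` and `add_to_all` are hypothetical names; the only library call is std::transform.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A "functor" in the C++ sense: a class type that overloads operator().
struct AddN {
    int n;
    int operator()(int x) const { return x + n; }
};

// The object carries its own state (n) and is passed where a function is expected.
inline std::vector<int> add_to_all(std::vector<int> v, int n) {
    std::transform(v.begin(), v.end(), v.begin(), AddN{n});
    return v;
}

inline void demo_functor() {
    assert(AddN{2}(40) == 42);
    assert((add_to_all({1, 2, 3}, 10) == std::vector<int>{11, 12, 13}));
}
```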


Coming from other languages I noticed this about C++ as well. I can't give examples right now but I recall multiple times being like: "Oh that is just a weird name for $KnownComputingConcept".


Reactive style programming also takes the Unit as the value to act upon when the value and the type do not actually matter and only the action does.

https://stackoverflow.com/questions/54336641/is-there-any-re...


Consider ML appropriating the term tensor...


The term "vector" represents an ordered collection of elements (e.g. a 3-dimensional vector is [x, y, z]). "List" is flat out incorrect because it is used to refer to a linked list (std::list), and "collection" means a group of objects that can be anything like a map, set, etc., so it's too generic.

Before C++ popularized the term, other programming languages and libraries had already used "vector" to describe similar data structures. For example, Common Lisp has a vector type that represents a one-dimensional array.


Stepanov introduced the term in C++ and he fully acknowledges that it was a bad name and that he regrets it. If he could redo it, he would have renamed it array or array_list.

Interview with him acknowledging this:

https://www.youtube.com/watch?v=etZgaSjzqlU


It doesn't help that mathematics and physics have their own overlapping but unequal definitions of "vector".


As does Biology.

And computing has interrupt vectors as well.


so "std::nothing" was taken? or "std::none" ? or anything else that would make it obvious this type is a fancy way to say void?


Void is different from this type though, as a variable of type void can't be occupied.

In ML and friends monostate is called unit (and gets used a lot because void returns aren't allowed by the languages). Some have empty types too, which can never be occupied. A function returning Empty can't return, for example, though there are other use cases.


You are equivocating on the word "void". Your statement that "a variable of type void can't be occupied" is true in functional languages where "void/Void" is often used as the name of a type that isn't inhabited (assuming the language is sound/normalizing/whatever).

But here we are talking about C++, where "void" is a pseudotype that is absolutely inhabited, in some conceptual sense. Any function that is declared to return void and which returns is returning a thing that conceptually inhabits void. In this sense, std::monostate indeed captures the same concept as void, but in a much better way, because it's properly a type, not a pseudotype.

Note: Java does the same thing, effectively, with "Void" which is inhabited by exactly one value: null.


I think it's not correct to say that void is a monotype in C++, because the compiler won't allow you to assign the result of a function marked void to a variable, and you cannot declare a variable of type void.

I'd accept that it's not the same as the empty type though, given that void* can be occupied and functions marked void can return. Probably someone with more type theory than me can name this properly.


> Probably someone with more type theory than me can name this properly

Probably “garbage”. C’s void is not a type and does not behave consistently; it’s a keyword associated with arbitrary convenient behaviours for that case.


Which is really annoying, and makes a ton of templated code in C++ have to bifurcate on void unnecessarily. They already let me do return f(); in a void function if f also returns void... they should let me declare a variable of type void too, and the language would become a lot more pleasant.


Not that familiar with C++ but used to have this thought about both Java and C#. Think I’ve changed my stance on it now though.

If following something like CQS, the bifurcation can be thought of as allowing “pure” functions and excluding code with a temporal / side-effecting component from higher order code.

Not saying bifurcating on void is the best approach to handle that, but in languages where side effects are a thing something is needed to make sure higher order code and side effecting code mix properly.


This is only relevant to pure languages.

And this doesn’t come anywhere near to properly making that distinction anyway: a non-void function can have all the side effects, and a void function can have no side-effects.

It also does not “make sure higher order code and side effecting code mix properly”, it just makes a subset of likely side-effecting code not mix with higher order code at all.


What are you going to put in that variable?

    void f();
    void v = f();
    void g(a){return a;}
    v = g(v);
Maybe my imagination is failing me, but I can't see how this can do much good without at least polymorphic functions.


   template<std::ranges::range R, std::regular X,
            std::invocable<std::ranges::range_value_t<R>, X> F>
     requires std::same_as<std::invoke_result_t<F, std::ranges::range_value_t<R>, X>, X>
   auto fold(R&& range, F f, X accumulator) {
      for (auto x : range)
         accumulator = f(x, accumulator);
      return accumulator;
   }
I can call that with a function returning a custom unit type:

   enum class void_t { Void };

   fold(my_range, [](auto&& elem, void_t) { return void_t::Void; }, void_t::Void);
But not with void:

   fold(my_range, [](auto&& elem, void) { return; }, void{});
which is very annoying and requires fold to special case 'void' via metaprogramming.
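For comparison, std::monostate can stand in as the unit accumulator that void cannot. A minimal sketch only: the simplified `fold` (no concepts) and the helper `fold_unit_demo` are made up here for illustration and are not the code from the comment above.

```cpp
#include <cassert>
#include <variant>   // std::monostate
#include <vector>

// Simplified fold: no concepts, just the shape of the algorithm.
template <typename Range, typename F, typename Acc>
Acc fold(const Range& range, F f, Acc accumulator) {
    for (const auto& x : range)
        accumulator = f(x, accumulator);
    return accumulator;
}

// Hypothetical demo: count elements as a side effect while the
// accumulator itself carries no information.
inline int fold_unit_demo() {
    std::vector<int> values{1, 2, 3};
    int visited = 0;
    std::monostate result = fold(values,
        [&](int, std::monostate) { ++visited; return std::monostate{}; },
        std::monostate{});
    (void)result;  // nothing to inspect: there is only one possible value
    return visited;
}
```

No special case for the "returns nothing" callback is needed, because std::monostate can be declared, assigned and returned like any other value.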


Templates already provide that useful polymorphism;

    template<typename T>
    T callWithState(auto f) {
        auto old = globalState;
        globalState = whatever();
        T out = f();
        globalState = old;
        return out;
    }
(forgive any syntax errors, my C++ is very rusty...)


The void value. E.g.

    fn f() {}
    let mut v: () = f();
    fn g(a: ()) { a }
    v = g(v);
> Maybe my imagination is failing me, but I can't see how this can do much good without at least polymorphic functions.

Sure, it doesn’t do much useful to C save make void ever so slightly less wonky.


> I think it's not correct to say that void is a monotype in C++, because the compiler won't allow you to assign the result of a function marked void to a variable, and you cannot declare a variable of type void.

The compiler does not allow you to do that particular operation out of an arbitrary restriction, but that does not make `void` a true void type. It still holds a monotype value!

  #include <iostream>
  
  int bar() {
      std::cout << "Bar" << std::endl;
      return 0;
  }
  
  void baz() {
      std::cout << "Baz" << std::endl;
  }
  
  template <typename T> T foo(T (*f)()) {
      std::cout << "Foo" << std::endl;
      return f();
  }
  
  template <typename T> T varfoo(T (*f)()) {
      std::cout << "Varfoo" << std::endl;
      T a = f();
      return a;
  }
  
  int main()
  {
      foo(bar); // Valid (returning an int).
      foo(baz); // Valid (returning a void).
      varfoo(bar); // Valid (assigning an int).
      // varfoo(baz); // Invalid (assigning a void (why???)).
  }
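The usual workaround for the `varfoo(baz)` case is to branch on void at compile time and substitute a real type. A hedged sketch, assuming C++17; `call_storable`, `answer` and `nothing` are hypothetical names, not anything from the standard library:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <variant>   // std::monostate

// Calls f() and always hands back a storable value:
// the real result, or std::monostate when f returns void.
template <typename F>
auto call_storable(F&& f) {
    using R = std::invoke_result_t<F>;
    if constexpr (std::is_void_v<R>) {
        std::forward<F>(f)();      // run for its side effects
        return std::monostate{};   // stand-in for the "void value"
    } else {
        return std::forward<F>(f)();
    }
}

inline int answer() { return 42; }
inline void nothing() {}
```

This is exactly the bifurcation being complained about: the `if constexpr` exists only because `R x = f();` is ill-formed when `R` is void.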


I'm not that familiar with C, or C++. My impression is that void is a special case that doesn't need to be special, some accidental complexity that came from mapping machine instructions to a higher level language.


It's kind of baked into C grammar. And there's absolutely no compelling use-case to fix it in C.

In C++, there's a very compelling case for making void an actual type, because you can't use void as a templated type, which means that templates involving functions that potentially have void return types require unpleasant amounts of template metaprogramming.

Now that C++ standards committees are considering basic usability fixes (e.g. the long overdue ability to do `namespace com::microsoft::directx { }`) there's a vague possibility that somebody might look into actually fixing this some time before 2040.


That namespace thing has been in since C++20, right?


Pretty sure the namespace thing was in C++17


Right. Only took 37 years to fix a usability issue that Stroustrup fully expected to be fixed at the first available opportunity.

C++17 and C++20 both took up usability fixes. std::string::ends_with would be a C++20 example.

Hopefully C++23 will be similarly open to usability fixes.


Ah yeah, I read the page too fast.


Incidentally, the classic C did not have 'void'; instead, it was assumed that any function would, by default, return 'int' in the form of some value stored in the accumulator, and so the "value" of 'void' would be effectively represented by random garbage. The 'void' that was introduced explicitly in a later version of C weakened the original meaning of the unknown value by allowing pointers to 'void' and thus not requiring that the value pointed to must be always thought of as meaningless (since you could cast a pointer to void to a pointer to something else).


“Nothing” can easily be interpreted as an uninhabited type (regardless of its use in haskell).

> a fancy way to say void?

Less fancy and more workable. Had void been a proper type in the first place it would not have been needed (but also… void had the same issue as nothing, it sounds like an uninhabited type more than a unit type).

Despite that, they could have called it Void, even if the standard library normally uses all lowercase.


Capital letter `Void` reminds me of how Java uses the object type `Void` for this purpose, as all reference types allow null.


Or "unit" in ML-derived languages and Haskell.


I always think of unit as one - it has exactly one possible value.

Void doesn’t exist in ML


Void definitely exists in ML.

  -- Haskell
  data Void

  (* Standard ML *)
  datatype void = Void of void


log 1 = 0, so it has exactly zero bits of information.


I think there is exactly one equivalence class of instances of std::monostate, whereas there are exactly zero equivalence classes of instances of void.

In category theory terms, I believe void is the initial type (there is exactly one morphism from void to any other type), whereas monostate is the terminal type (there is exactly one morphism from any other type to monostate).
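The "exactly one equivalence class" part can be checked directly for std::monostate. A small sketch, assuming C++17; `all_monostates_equal` is a hypothetical helper name:

```cpp
#include <cassert>
#include <functional>  // std::hash
#include <variant>     // std::monostate

// Every std::monostate compares equal to every other one,
// and none is ever less than another.
static_assert(std::monostate{} == std::monostate{});
static_assert(!(std::monostate{} < std::monostate{}));

// Hypothetical helper: two independently created objects are indistinguishable.
inline bool all_monostates_equal() {
    std::monostate a;
    std::monostate b;
    return a == b &&
           std::hash<std::monostate>{}(a) == std::hash<std::monostate>{}(b);
}
```

There is no way to write the matching checks for void, since no object of that type can be created to compare.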


std::void_t would have been nice.


Another use of std::monostate is as a special "unset/don't care" value for a template parameter. eg

    template<typename T = std::monostate>
    class C
    {
        ...

        if constexpr (!std::is_same_v<T, std::monostate>)
        {
            // T-related behaviour here
        }
    };
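A compilable version of that idea might look like the following. This is only a sketch under assumptions: the class `Labeled`, its members and the "(+payload)" text are invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>   // std::monostate

// Hypothetical class: T = std::monostate means "no payload requested".
template <typename T = std::monostate>
class Labeled {
public:
    explicit Labeled(std::string name, T extra = T{})
        : name_(std::move(name)), extra_(std::move(extra)) {}

    std::string describe() const {
        if constexpr (!std::is_same_v<T, std::monostate>) {
            // T-related behaviour: only compiled when a real T was supplied.
            return name_ + " (+payload)";
        } else {
            return name_;
        }
    }

private:
    std::string name_;
    T extra_;  // still occupies at least one byte when T is std::monostate
};
```

`Labeled<>` and `Labeled<int>` then share one definition, with the T-specific branch discarded at compile time in the default case.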


I greatly prefer tag classes for this. You can define 'em in a single line and a downstream user can't accidentally plug the unset value in to the template.


You can use void for that.


void could be an actual meaningful type for a template rather than a dummy type.


You could say the same thing about std::monostate, which is not a dummy type. If you need a unique sentinel type you have to make one for that purpose.


There's a fun thing like this in Swift too! `Void` is an empty tuple, and has all of the related constraints (can't conform to protocols, being the most salient one). If you have a type that has to conform to, say, `Equatable` or `Codable` you should instead use `Never?` which you can conform to most protocols via throwing `fatalError` on an extension of `Never` to the protocol.

Anyway, I wrote basically this up on my blog, with a more thorough explanation: https://www.jackyoustra.com/blog/non-equatable-void


Lovely blog! If you would accept one suggestion, it would be to make it mobile friendly!


I'm glad to have a mobile reader :) I'll work on making it mobile friendly soon


Should be better now! Let me know if you have further concerns :)


Good thing they made this instead of expanding the standard library to be more like Java's or Python's. It still only contains the most basic functions, and std::monostate ;)


Gerard: But it doesn't do anything!

Hans: No -- it does nothing.

https://scryfall.com/card/wth/154/null-rod


No no, that's a void pointer.


It's a poor name. If it has no members, it holds no bits. Therefore it has no state. It's not enough for an object to exist in order to hold state. It must be capable of distinguishing at least between two values, like true or false. If something doesn't hold state, the word state has no business in its name.

This thing is just a counterpart to void that is a class. A better name would be voidclass or something along those lines.

(There is a Monostate pattern, but that involves a class with state: just all the state is static. It's basically like a module with global variables. Completely different thing.)


“True” and “False” sound like two states. This is just one. :)


Yes, exactly like 640K sounds like a power of two, if you're a MS-DOS user.

But in a way, this is right since:

     2^<no of bits> = <no of states>
When the number of bits is 0, 2^0 = 1: 1 state. A state machine with one state is certainly possible.

Problem is we need 2 or more states to do anything useful with state. We can draw an initial state bubble in a state diagram and not add any states; it can even have transitions back to itself.

So maybe monostate is not exactly a misnomer; it's just weird to mention state about something that is not useful for working with state.


Raymond is a very smart and productive person, and is not maligning C++ at all in this article. It makes me want to reassess my bittersweet perspective on the language.


On the flip side, it shows why so many people are excited about Rust. You can pick up literally any valuable thing off the ground that made the mistake of being built around C++, like CUDA, slap it on Rust, and people will both adopt it and be excited to contribute to it.


So... Isn't the fact that you can't default-construct that variant without `monostate` working-as-intended?

Not everything can or should be default-constructed or default-constructible. That does complicate initialization sometimes (i.e. there are practical reasons to "suspend" construction until you have all the needed data), but you're not avoiding breaking type safety by adding a `monostate`, you're just giving yourself the "could be null" headache.
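The working-as-intended part can be stated as compile-time checks. A sketch, assuming C++17; `NeedsArg`, `MaybeNeedsArg` and `starts_empty` are hypothetical names:

```cpp
#include <cassert>
#include <type_traits>
#include <variant>

// Hypothetical type with no default constructor.
struct NeedsArg {
    explicit NeedsArg(int v) : value(v) {}
    int value;
};

// A variant default-constructs its first alternative, so this one cannot.
static_assert(!std::is_default_constructible_v<std::variant<NeedsArg, int>>);

// Listing std::monostate first makes the "empty" state explicit and opt-in.
using MaybeNeedsArg = std::variant<std::monostate, NeedsArg, int>;
static_assert(std::is_default_constructible_v<MaybeNeedsArg>);

inline bool starts_empty() {
    MaybeNeedsArg v;  // holds std::monostate
    return std::holds_alternative<std::monostate>(v) && v.index() == 0;
}
```

So the restriction stays in force by default, and the "could be empty" state only exists for variants whose author asked for it.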


Seems almost intentionally confusingly written. Why not use void if monostate is like void? Ah, because monostate is actually entirely unlike void. void has zero values, monostate has exactly one.


void is kind of strange in C, because it looks like an empty type, but it still behaves as though it has a single instance (a unit type). For example, a function with an empty return type can’t return (it’d have to supply a value of it); a void function can. You can’t cast things to an empty type (otherwise you’d get a value of it), but you can cast things to void. Void is a unit type, not an empty type; it’s just a bad one.


It would be somewhat more cohesive and less weird if we'd argued that the "void" return type means that a return value cannot be constructed which is taken to mean that the function has no return value, and simply returns without providing a value.


Try as I may, I can’t make sense of that. I’ve read something like it in books on C, but I still can’t. Maybe I’m infected with set theory too deeply.

In my mind, a computation (a “function”) must either return a value or hang/crash. If it appears as though it returns a member of ∅, it must hang/crash, because there are no members of ∅. If it returns a member of the single-element set, [1], it can return one, there’s just no use inspecting it afterwards (you know what it is already).

(For what it’s worth, if you use a prover-adjacent language such as Agda or Idris, this is exactly how things are going to work there.)


C functions (and in fact the "functions" of most mainstream programming languages) are not computations, they are algorithms. An algorithm doesn't necessarily have a result of any kind, at least not in the way that a (mathematical) function has. The result of the algorithm can be the state in which it leaves the World (e.g. an algorithm for cleaning a house doesn't have a return value, it changes the state of the house).

In fact the traditional programming name for what we mostly call functions today was "(sub)routine" - you call a subroutine, and when it finishes, it returns to where it originally started.

Consider also that at the assembly level (and below it), subroutines don't have return values, nor arguments. The program counter simply jumps to the beginning address of the subroutine, and the `return` instruction jumps back to the address right after that jump. The subroutine may read values from certain locations in memory, and possibly write some others back, but none of this is necessary or enforced in any way. C functions, and the corresponding keywords, are much closer to this conception of assembly subroutines than they are to the mathematical notions of functions or computations.


The calling convention does say where to look for the return value. So in a sense the return value always exists, but would not be meaningful if the function has a void return type.


The calling convention is part of the abstraction, it's not part of the processor's logic. Different languages often have different calling conventions, on the same OS and processor family. Different OSs have different calling conventions for their system calls.


> a computation (a “function”) must either return a value or hang/crash.

It can also simply return control, without returning anything. It's equivalent to invoking a continuation with zero arguments. Do you allow for zero-argument functions, at least?

Of course, if a function can return no value whatsoever, you suddenly need new syntactical categories to support it: you need to prohibit using such functions in expressions (only call statements are allowed), you need a way to return from such a function (naked "return", which is prohibited from taking any expressions), and it also severely strains your generics/templates because you can't treat such functions uniformly etc.


> It can also simply return control, without returning anything. It's equivalent to invoking a continuation with zero arguments. Do you allow for zero-argument functions, at least?

There's nothing weird about a function taking 0 arguments, any more than there's anything weird about a function taking 3 or 5 arguments. A function can't take an argument of type void, so functions shouldn't be allowed to return void either.

> Of course, if a function can return no value whatsoever, you suddenly need new syntactical categories to support it: you need to prohibit using such functions in expressions (only call statements are allowed), you need a way to return from such a function (naked "return", which is prohibited from taking any expressions), and it also severely strains your generics/templates because you can't treat such functions uniformly etc.

Or you do what sensible languages do: all functions return a value (if they return at all), functions that don't want to return anything in particular can return a unit value but that value is just a normal value that behaves normally in expressions etc., you don't make naked return a special case (you can make it syntax sugar for "return unit" if you really want), and you can treat those functions uniformly in generics, much more easily than with C++ templates where you have to make special cases for void functions.


I don't argue with any of that, you know. But we do have a historical (stupid in retrospect) split between functions and procedures, dating a-a-a-all the way back to ALGOL 60 at least.


You encounter the same problem with zero argument functions.

A function on a zero type would be unable to return any value, since there's no value you could apply it to. To have a function of zero arguments you should use the unit type (which means the function effectively picks out a single value).

This is also related to how a function with multiple arguments is a function of the product type, and an empty product is 1 not 0.


No, you just invoke the function and pass it zero arguments, that's it.

Sure, you can build your whole theoretical framework of computation with only the functions of exactly one argument, and then deal with tuples to fake multi-valued arguments/multiple return values — but you don't have to do that. You may as well start from the functions with arbitrary (natural) number of arguments/return values, it's not that hard.


Sure, and passing it zero arguments is exactly what it means to evaluate it on the single value of the unit set.

I mean surely we can agree that a pure function of 0 arguments picks out exactly 1 value, and that a function that accepts n different values (values not arguments) as input returns at most n different results? Why make an exception for n=0?

Your definition of a function of 0 arguments and that of a function over the unit set are identical. Or at least equivalent.


They're equivalent, but only up to whatever computational substrate one is actually using. You can build functions out of small-step operational semantics of, say, a simplistic imperative register machine with a stack. In this case, a function of 0 arguments and a function of 1 trivial unit argument are visibly different even though their total effect on the state is the same. After all, we're talking about theory of computation and so it better be able to handle computations as they are actually performed at the low level, too.

It's yet another example of "in theory, the theory and the practice are the same; in practice, they're different": I have written a toy functional language that compiles down to C, and unit-removal (e.g. transforming int*()*int into int*int, lowering "fun f() -> whatever = ..." into a "whatever f(void) {...}" etc.) is a genuine optimization. The same, I imagine, would apply to generating raw assembly: you want to special-case handling of unit so that passing it as an argument would not touch %rdi, and assigning a unit to a value should not write any registers, and "case unit_producing_function(...) of -> ... end" actually has no data dependency on the unit_producing_function etc.


> In this case, a function of 0 arguments and a function of 1 trivial unit argument are visibly different even though their total effect on the state is the same. After all, we're talking about theory of computation and so it better be able to handle computations as they are actually performed at the low level, too.

You don't have to push anything for the unit values on the stack though, their representation doesn't take up any space. Just like if you're passing a big argument (like a large integer, or a struct passed by value) it might take multiple registers or a lot of space in your stack frame, there's no 1:1 correspondence between arguments and stack space.

> you want to special-case handling of unit so that passing it as an argument would not touch %rdi, and assigning a unit to a value should not write any registers

This shouldn't need to be a special case though. For each datatype you have a way of mapping it to/from some number of registers, and for the unit type that number of registers is 0 and that mapping is a no-op.


> their representation doesn't take up any space.

One of the ways to represent them doesn't take up any space. But if you want to have, e.g., a generic function of (T,U,V) -> V type (that is, with no monomorphization in your compiler) then either your units have to take space (otherwise the layout of (unit,int,bool) is different from (int,int,bool) and the same), or you'll have to pass around type descriptors when invoking generic functions. Which many, many implementors of static languages would rather not do unless absolutely necessary.


> if you want to e.g. have a generic function of e.g. (T,U,V) -> V type (that is, with no monomorphization in your compiler) then either your units have to take space (otherwise the layout of (unit,int,bool) is different from (int,int,bool) and the same), or you'll have to pass around type descriptors when invoking generic functions.

You have that problem already though surely, as T might be bigger than an int, or want to be passed in a floating-point register, or both.


How would one write a 0 bit value into registers?


Yes, that's the problem you face when you're dealing strictly with functions of a single argument. Still, there are two options: first, it's arguably already written into the register so you don't need to do anything.

Alternatively, you may instead represent () as a full, 64-bit wide machine word and then map every 64-bit value to mean () so, again, you don't actually need to write anything: all registers contain a valid representation of () at all times. This is similar to how we usually represent booleans: 0 is mapped to mean False, and everything else is mapped to mean True, although in this case we sometimes do need to rematerialize some definite value into the register of choice.

In any case, it's mostly just a matter of correctly writing the constant materializer; but if you adopt multi-argument/multi-valued functions you simply never encounter this problem:

    for arg, place in zip(arg_list, arg_places):
        load(arg, place)
    invoke(fun, kont)

    for val, place in zip(ret_values, ret_places):
        load(val, place)
    kontinue(kont)
Notice how degenerate loops simply disappear with no additional handling.


> if you adopt multi-argument/multi-valued functions you simply never encounter this problem

Sure. If you have multiple return then unit values become a lot less important because you can just have a function that returns 0 values. But most languages, especially C-like languages, do not have multiple return.


The word "function" in mathematics and the word "function" in C (and C++) are homonyms. Two words spelled the same and pronounced the same but with entirely different meanings. Any effort to conflate the two will just end in tragedy.


It doesn't make sense from a strict type and set theory point of view because it doesn't make sense from a strict type and set theory point of view. Neither C nor C++ are rigorous languages.

We also have "void foo(void)" and here void takes on two entirely different meanings, while type theory would suggest this is a function that would diverge if it were called, which it can't be.



"If it appears as though it returns a member of ∅, it must hang/crash, because there are no members of ∅."

If you want to go with math, think more group theory. I'm specifically thinking about how you can always create a monoid if you have an associative binary operation, because even if your associative binary operation doesn't have an identity element, you can just declare one. Similarly, if you have "functions" that "return nothing", you just declare that nothing right into existence. Then you can just think of the C language layer basically erasing away any attempt to examine that value returned behind the scenes, because as you say, why?


It's not a function. A C function "returning" void is just C's syntax for writing a procedure. C doesn't call it a procedure, but that's what it is semantically.


The distinction between "function" and "procedure" doesn't map very well to whether the return type is void in C's syntax:

- On one hand, a lot of "functions" are actually procedures that just happen to return a value: think for example `write(2)`, which is clearly used for its side effect, not to compute how many bytes could be written -- even though that's what it returns.

- On the other hand, you can have a "procedure" (i.e., a function "returning" void) that actually has no side effects other than storing a computed value in a specified location (e.g. void square(int x, int *ret) { *ret = x*x; }). That's clearly a function in the mathematical sense, even though it "returns void".


The difference between a function and a procedure is strictly whether they can be used as a value or not in the syntax of the language, not the semantics. A function is a subroutine that can be used as either a value or a statement, a procedure is a subroutine that can only be used as a statement. That is, if x = foo(); is valid syntax, then foo() is by definition a function. If x = foo(); is invalid syntax, then foo() is a procedure.


> (For what it’s worth, if you use a prover-adjacent language such as Agda or Idris, this is exactly how things are going to work there.)

You don't even have to go that far, Rust supports this concept. The built-in empty type is called `!`, and cannot be constructed. It's partially unstable, and there's a bunch of things you can't do yet, but you can use it as a return type.


Because you can't use void in templates uniformly: you can't have "T x;" when T = void, you can't do "return T{};", you can't form types like "(*R)(int, T)", etc.


You can have as many monostate objects as you want.


this is crazy confusing stuff that should be banned by most sensible internal coding standards.


Is it like a singleton without any method ?


Since it doesn’t have any data either (hence monostate) that’s not a useful distinction. It’s a singleton in the same way 1 is a singleton.


So the upside is that you can put it at the beginning of a function? Wild, bro


Having access to the unit type is useful; I use it maybe once every couple of months in F# even if we restrict solely to the use-case "instantiate a generic with the unit type to indicate that no data is held here". (Of course, since F# also doesn't have a `void` type - a truly non-constructible data type is indeed very rarely useful! - F# uses `unit` in many places where C++ and C# use `void`.)


You can also use it in any template without any trickery unlike void, which I am willing to bet was the actual rationale.


The article describes that as the main advantage, showing std::variant as an example.


It's probably more of use to compiler/runtime developers than end users of the language.


It's useful whenever you want a variant with a default state which is different from any value you'd actually put in it, to differentiate between e.g.

    std::variant<int, whatever> v;
    // v is zero
and

    std::variant<int, whatever> v = 0;
And don't want to pay the space and time overhead of wrapping in an std::optional which will use a whole other byte at least for no good reason


That… doesn’t make any sense?

The size of an optional<T> is at least `sizeof(T) + 1` because it needs a boolean for the set/unset flag.

The size of a variant is… the same, at least, because it needs to store a discriminator integer (apparently both gcc and clang do optimise to a byte when there are less than 256 variants).

Or do you mean adding a monostate to an existing variant rather than wrap that variant into an `optional`? In which case you should fix your example (and provide a version wrapped in an optional) because they’re very confusing.


I don't understand what's confusing with my example aha.

Current situation: you have

     std::variant<int, whatever> x;
you now want to discriminate on whether x has been initialized explicitly or not, the two cases I posted:

    // case 1
    std::variant<int, whatever> x;

    // case 2
    std::variant<int, whatever> x = 0;
you have two options:

option A: does not needlessly increase sizeof, does not add indirection penalty upon access:

     std::variant<std::monostate, int, whatever> x; 
option B: needlessly increases sizeof, adds indirection penalty upon access (mostly relevant when compiling in debug mode without inlining if you still want to have a semblance of performance):

     std::optional<std::variant<int, whatever>> x;
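The size claim is easy to check on a given toolchain. A hedged sketch: `std::string` stands in for `whatever`, the alias and function names are made up, and the size relation is an observation about common standard library implementations, not something the standard guarantees.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <variant>

// Option A: the empty state is just one more alternative.
using WithMonostate = std::variant<std::monostate, int, std::string>;
// Option B: the empty state lives in an extra wrapper with its own flag.
using WithOptional  = std::optional<std::variant<int, std::string>>;

// Not guaranteed by the standard; typically true because option B
// stores both the variant's index and the optional's engaged flag.
inline bool monostate_is_no_larger() {
    return sizeof(WithMonostate) <= sizeof(WithOptional);
}

// Both spellings start out "unset" after default construction.
inline bool both_start_unset() {
    WithMonostate a;
    WithOptional b;
    return std::holds_alternative<std::monostate>(a) && !b.has_value();
}
```

Access also differs: option A needs one `std::get`/`std::visit`, while option B first unwraps the optional and then the variant.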


Pardon my naiveté, but if you wanted a value that's either a thing, or not-a-thing, why wouldn't you express that with a std::optional? What advantage does std::monostate have versus option types?

Brevity, I guess? I suppose the most brief way to express absence or presence of value in C++ is a pointer, but I could be tried at the Hague for that take.

None of this is to be facetious, I'm really trying to learn here. I'm not a C++ guy by trade. C and Rust are my day-to-day languages.


You're not wrong, if you just want "a T or nothing", optional is the way to go. But what if you want "a T, a U, a V, or nothing"? Then you do

    std::variant<std::monostate, T, U, V>
Or

    std::optional<std::variant<T, U, V>>
But then the "none" state is a bit more "special", it's not just one of the options. The sizeof of the type will also probably be a bit bigger, because it has to contain both the bool for the optional, as well as the tag for the variant.

The other obvious use is what the article states, that it allows the variant to be default constructed even if none of its members are. Though you can do that with optional as well. It's mostly a matter of style. I mostly avoid std::monostate because the name is so confusing, I really agree with the other users that something like std::unit or std::none would be better.


> it allows the variant to be default constructed even if none of its members are

Ah, okay! So it's serving the same purpose as the "None" in this Rust snippet:

    enum Foo {
        None, // <-- impl core::Default and return this
        T(T),
        U(U),
        V(V),
    }
That makes sense, though it really shows how the legacy of C++'s type system limits stdlib design in the present day. It'd be awfully nice to be able to just write std::variant<void, T, U, V>.


Exactly. The difference here is that std::variant doesn’t “name” the options, you just get the type (and an index, so variant can contain duplicate types). So you have to have some marker type that’s like “this contains nothing”.

Fundamentally, Rust’s enums are a much better way of doing this thing compared to std::variant, but C++ did the best it could without changing the core language. Which arguably they should have done.
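For readers coming from Rust, the closest thing to a `match` on such a variant is std::visit keyed on the alternative's type. A minimal sketch, assuming C++17; `Value` and `describe` are hypothetical names:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>

using Value = std::variant<std::monostate, int, std::string>;

// Alternatives are told apart by type (or index), not by a name.
inline std::string describe(const Value& v) {
    return std::visit([](const auto& alt) -> std::string {
        using A = std::decay_t<decltype(alt)>;
        if constexpr (std::is_same_v<A, std::monostate>) {
            return "nothing";              // the marker "contains nothing" case
        } else if constexpr (std::is_same_v<A, int>) {
            return "int " + std::to_string(alt);
        } else {
            return "string " + alt;
        }
    }, v);
}
```

The std::monostate branch plays the role of the `None` arm; it is handled like any other alternative rather than through a separate "is it set?" check.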


I mean, it's very trivial to make your own basic wrapper with "void" in the API if that's what you want: https://gcc.godbolt.org/z/v6EPTaaPY


I suppose that code is "trivial" in the same sense that anything else involving C++11 is "trivial."


this took like less than 5 minutes to write, this is definitely in the "trivial" realm


> And don't want to pay the space and time overhead of wrapping in an std::optional which will use a whole other byte at least for no good reason

This. monostate's behavior should be captured in the std::optional spec. There's no need to create a new type.


That is opposite of sense. monostate is a much more general concept, if you want to argue about the need for a new type, the answer is to remove optional because it’s a `variant<T, monostate>`. It doesn’t work the other way around.


That's putting the cart before the horse. Languages are designed for developers to use, not compilers to optimize. The intention and use of std::optional is clear.

Edit: My point, if not clear, is that the compiler should add the extra bit of code for when an optional is empty automatically, rather than requiring that an optional be defined (it's optional!) unless it is explicitly typed as monostate.


That makes even less sense, what "extra bit of code" are you talking about? monostate is not designed to be used with optional: although there's the odd case where that's useful an optional<monostate> is bijective to a boolean, and because C++ does not have zero-sized types it's less efficient (as it takes two bytes).


It has other uses for end-user developers. For example, when you need a class member to be conditionally elided based on template parameters. You can swap the normal type with non-zero size with std::monostate such that it has zero size.


A std::monostate member will still have non zero size, because it needs a unique address.

See https://en.cppreference.com/w/cpp/language/ebo


...except if you use the standard [[no_unique_address]] attribute.
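A sketch of the conditionally elided member, assuming C++20 for the attribute. `Record` and `PlainRecord` are invented names, and whether the member really shrinks to nothing depends on the compiler and ABI (some compilers ignore the standard spelling of the attribute), so the check below only claims "never larger".

```cpp
#include <cassert>
#include <type_traits>
#include <variant>   // std::monostate

// Hypothetical example: the extra member can vanish when WithExtra is false.
template <bool WithExtra>
struct Record {
    int id = 0;
    // std::conditional_t swaps the real type for the empty std::monostate.
    [[no_unique_address]] std::conditional_t<WithExtra, double, std::monostate> extra{};
};

// Same members without the attribute, for comparison.
struct PlainRecord {
    int id = 0;
    std::monostate extra{};  // needs its own address, so it takes space
};

inline bool attribute_never_grows_the_struct() {
    return sizeof(Record<false>) <= sizeof(PlainRecord);
}
```

Where the attribute is honoured, `sizeof(Record<false>)` is typically just `sizeof(int)`, while `PlainRecord` pays for the empty member plus padding.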



