I'm convinced that the lack of Sum Types like this in languages like Java/C#/Go is one of the key reasons that people prefer dynamic languages. It's incredibly freeing to be able to express "or". I do it all the time in JavaScript (variables in dynamic languages are basically one giant enum of every possible value), and I feel incredibly restricted when using a language that requires a class hierarchy to express this basic concept.
I completely agree. Every passing day I become more convinced that a statically typed language without sum types or more broadly ADTs is fundamentally incomplete.
The good news is that many languages seem to be cozying up to them, and both the JVM (through Kotlin, Scala, et al.) and .NET (through F# or C# with language-ext) support them.
Even better news is that the C# team has stated that they want to bring Sum Types and ADTs into the language, and are actively looking into it.
I just don't see, in properly designed code, that there would be much use for sum types if you have generics. When are you writing functions that take or return radically different types that need to be expressed this way?
I dislike dynamic languages where parameters and variables can take on any type -- it's rarely the case that the same variable/parameter would ever need to contain a string, a number, or a Widget in the same block of code.
I find it much more freeing to have the compiler be in charge of exactness so I can make whatever changes I need knowing that entire classes of mistakes are now impossible.
> When are you writing functions that take or return radically different types that need to be expressed this way
Let's say you're opening a file that you think is a CSV. There can be several outcomes:
- the file doesn't exist
- the file can't be read
- the file can be read but isn't a valid CSV
- the file can be read and is valid, and you get some data
All of these are different types of results. You can get away with treating the first 3 as the same, but not the last. Without a tagged union, you'll probably resort to one of a few tricks (for contrast, a tagged-union sketch follows this list):
- You'll have some sort of type with an error code and a nullable data field. In reality, this is a tagged union; it's just that your compiler doesn't know about it and can't catch your errors.
- You'll return an error value and have some sort of "out" parameter with the data: this is basically the same as the previous example.
- You'll throw exceptions, which usually ends up with people writing code that forgets about the exception because the compiler doesn't care about it, and the code works 99% of the time until it completely blows up.
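Here's that sketch: a minimal Rust tagged union for the four outcomes above (`CsvResult`, `read_csv`, and the trivial `parse_csv` are all made-up names for illustration):

use std::fs;
use std::io::ErrorKind;

// One type that captures all four outcomes; the compiler forces
// callers to handle each variant before they can touch the data.
enum CsvResult {
    NotFound,
    Unreadable(std::io::Error),
    InvalidCsv(String),      // parse error message
    Data(Vec<Vec<String>>),  // rows of fields
}

fn parse_csv(text: &str) -> Result<Vec<Vec<String>>, String> {
    // Stand-in for a real parser: just split lines on commas.
    Ok(text.lines().map(|l| l.split(',').map(str::to_string).collect()).collect())
}

fn read_csv(path: &str) -> CsvResult {
    let text = match fs::read_to_string(path) {
        Err(e) if e.kind() == ErrorKind::NotFound => return CsvResult::NotFound,
        Err(e) => return CsvResult::Unreadable(e),
        Ok(t) => t,
    };
    match parse_csv(&text) {
        Err(msg) => CsvResult::InvalidCsv(msg),
        Ok(rows) => CsvResult::Data(rows),
    }
}

A caller now has to match on CsvResult, and forgetting the InvalidCsv case is a compile error rather than the works-99%-of-the-time bug from the exceptions trick.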
If you want to force people to handle the above 3 cases, couldn't you just throw separate checked exceptions (e.g. in Java)? In that case the compiler does care about it. You can still catch and ignore, but that imo is not a limitation of the language's expressiveness.
Checked exceptions would have been an OK idea if it weren't for the fact that, at least when I was writing Java last (almost 10 years ago), they were expressly discouraged in most code bases. Partially because people just get in the lazy habit of catch and rethrow RuntimeException, or catch and log, etc. when confronted with them. Partially because the JDK itself abused them in the early days for things people had no hope of handling properly.
They also tend to push handling out into places where the context isn't always there.
The trend in language design does seem to be more broadly away from exceptions for this kind of thing and into generic pattern matching and status result types.
> Partially because people just get in the lazy habit of catch and rethrow RuntimeException, or catch and log, etc. when confronted with them.
After quite a while of thinking this way, I came to the conclusion that:
- 95% of the time, there's no way to 'handle' an error in a 'make it right' sense. Disk write failed? REST request failed? DNS lookup failed? There usually isn't an alternative to logging/rethrowing.
- When there is a way to handle an error (usually by retrying?), it's at the top level anyway.
Furthermore, IO is the stuff that can just 'go wrong' regardless of how good the programmer is, and IO tends to sit at the bottom of most Java programs. This means every method call is prone to IOExceptions.
If an IOException on a read is truly happening, and it isn't just a case of a missing file, there are serious issues that aren't going to be fixed by a catch-and-log, or handled further up the call stack.
One benefit I've found with error-enums is just being aware of all the possible errors that can occur. You're right: 95% of the time you can't do anything except log/retry. But the other 5% become runtime bugs, which are a massive pain. It's really nice when that is automatically surfaced for you at development time.
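As a small illustration (a made-up FetchError type), the exhaustiveness check is what surfaces that 5% at development time:

enum FetchError {
    Timeout,
    Dns,
    TlsHandshake,
}

fn describe(e: FetchError) -> &'static str {
    // Exhaustive match: add a new variant to FetchError later and this
    // stops compiling until the new case is handled everywhere.
    match e {
        FetchError::Timeout => "timed out; worth retrying",
        FetchError::Dns => "DNS lookup failed",
        FetchError::TlsHandshake => "TLS handshake failed",
    }
}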
Honest question: do you think this kind of stuff is going to be adopted by the majority in the next decade or two? Because from where I'm standing, adding even more language features like that seems to make it even harder to read someone else's code.
um... you realize the parent post is talking about having sum types in statically typed languages (e.g. rust), when you already do this all the time in dynamic languages like javascript and python, right?
So, I mean, forget 'the next decade or two'; the majority of people are doing this right now; python and js are probably the two most popular languages in use today.
Will it end up in all statically typed languages? Dunno; I guess probably not in java or C# any time soon, but swift and kotlin support them already.
...i.e. if your excuse for not wanting to learn it is that it's probably an edge case that most people don't have to care about now, and probably never will, you're mistaken, I'm afraid.
It's a style of code that is very much currently in use.
Are the majority actually writing code like this, though? In the case of dynamic languages, this property seems more like an incidental consequence of how the language behaves. It's not additional syntax.
I really don't know what more to say about this; if you don't want to use them, don't. ...but if your excuse for not using them is that other people don't, it's wrong.
Because even with generics, you are not able to express "or": two different choices of types that have _different_ APIs. With generics, you can express n different choices of types that all have the _same_ API.
It's a good software engineering principle to make control and data flow as streamlined as possible for similar data. Minimize branches and special cases. Generics help with this: they hide the "irrelevant" differences, surfacing only the relevant ones.
On the other hand, if there are _actually_ different cases that need to be handled differently, you want to branch, and you want to express that there are multiple choices. Sum types make this a compiler-checked type system feature.
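A small sketch of the difference (illustrative names): generics give you n types behind one shared API, while a sum type gives you a branch where each case has its own API:

use std::fmt::Display;

// Generics: any T is accepted, but only through the shared Display API.
fn show<T: Display>(value: T) -> String {
    format!("{value}")
}

// Sum type: each variant carries a different type with a different API,
// and the match handles each case on its own terms.
enum Input {
    Text(String),
    Number(i64),
}

fn describe(input: Input) -> String {
    match input {
        Input::Text(s) => format!("text of length {}", s.len()),  // String-only API
        Input::Number(n) => format!("number, abs = {}", n.abs()), // i64-only API
    }
}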
Let's take Rust's hash map entry API[0], for example. How would you represent the return type of `.entry()` using only a class hierarchy?
let v = match map.entry(key) {
    Entry::Occupied(mut o) => {
        // Mutate the existing value in place (the `mut` binding and the
        // deref are both needed for this to compile)...
        *o.get_mut() += 1;
        // ...then turn the entry into a reference that outlives the match.
        o.into_mut()
    }
    Entry::Vacant(v) => {
        update_vacant_count(v.key());
        v.insert(0)
    }
};
I view sum types as enabling exactly the kind of exactness you describe in your last line; especially since you can easily switch/match on a specific subtype if you realize you need that, without adding another method to the base class and copying it into the x subclasses you have for implementing the different behavior.
Rust has both generics and sum types, and benefits enormously from both.
And sum types aren’t for “radically different types”. You can define an error type to be one of several options (i.e. a more principled error code), or to represent nullability in the type system, or to indicate fallibility without relying on exceptions, etc.
Rust uses all of these to great effect, and does so because these sum types are generic.
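For reference, nullability and fallibility really are just ordinary generic sum types in Rust's standard library; simplified, they look like this:

// Simplified from the standard library: nullability and fallibility are
// plain generic enums, not special language features.
enum Option<T> {
    None,
    Some(T),
}

enum Result<T, E> {
    Ok(T),
    Err(E),
}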