> I believe Java did this deliberately to avoid the trouble that C and C++ have with signed and unsigned integer types having to coexist.
The problems really only come from mixing those types, and the simple solution is to disallow such mixing without explicit casts in cases where the result type is not wide enough to represent all possible values - this is exactly what C# does.
I think Java designers just assumed that high-level code doesn't need those, and low-level code can use wrappers that work on signed types as if they were unsigned (esp. since with wraparound, many common operations are the same).
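To make the "wrappers on signed types" point concrete, here is a minimal sketch using the static unsigned helpers on `Integer` (in the JDK since Java 8). The class and method names (`UnsignedDemo`, `widen`, `half`, `unsignedLess`) are made up for illustration; only the `Integer.*` calls are real JDK API.

```java
// Treats a signed 32-bit int as if it were unsigned, via the JDK's helpers.
final class UnsignedDemo {
    // Reinterprets the 32 bits as an unsigned value and widens to long.
    static long widen(int bits) {
        return Integer.toUnsignedLong(bits);
    }

    // Division is one of the operations that differs between signed and unsigned.
    static int half(int bits) {
        return Integer.divideUnsigned(bits, 2);
    }

    // Ordering also differs: as unsigned, 0xFFFFFFFF is the largest value.
    static boolean unsignedLess(int a, int b) {
        return Integer.compareUnsigned(a, b) < 0;
    }

    public static void main(String[] args) {
        int allOnes = -1; // bit pattern 0xFFFFFFFF
        System.out.println(Integer.toUnsignedString(allOnes)); // 4294967295
        System.out.println(half(allOnes));                     // 2147483647
        System.out.println(allOnes / 2);                       // 0 (signed division)
        // Addition needs no helper: with wraparound the bits come out the same.
        System.out.println(allOnes + 1);                       // 0 either way
    }
}
```

Addition, subtraction and multiplication need no wrapper at all, which is presumably the "many common operations are the same" part; it's division, comparison, widening and printing that need the helpers.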
> Java-style wrapping integers should never be the default
The ironic thing about this one is that C# introduced "checked" and "unchecked" specifically to control this... and then defaulted to "unchecked", so most C# code out there assumes the same. Opportunity lost.
While we're on the subject of numeric types - the other mistake, IMO, is pushing binary floating point numbers as the default representation for reals. It makes sense perf-wise, sure - but humans think in decimal, and that mismatch makes a very big difference with floats, one that sometimes translates into very expensive bugs. At the very least, a modern high-level language should offer decimal floating-point types that are at least as easy to use as binary floating-point (e.g. first-class literals, overloaded operators etc).
C# almost got it right with "decimal"... except that fractional literals still default to "double", so you need to slap the "M" suffix everywhere. It really ought to be the other way around - slower but safer choice by default, and opt into fast binary floating-point where you actually need perf.
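For the Java side of the same complaint, a small sketch of the binary-vs-decimal difference. Java has no decimal literals or operator overloading, so `BigDecimal` is the closest standard equivalent; the class and method names here (`DecimalDemo` etc.) are just for illustration.

```java
import java.math.BigDecimal;

final class DecimalDemo {
    // Binary doubles cannot represent 0.1, 0.2 or 0.3 exactly,
    // so the sum does not compare equal to the literal 0.3.
    static boolean binarySumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // BigDecimal built from strings keeps the decimal digits exactly.
    static boolean decimalSumIsExact() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }

    public static void main(String[] args) {
        System.out.println(binarySumIsExact());  // false
        System.out.println(decimalSumIsExact()); // true
    }
}
```

Note how much ceremony the "safe" version takes compared to the double one, which is pretty much the ergonomics problem being described.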
> At least Java has the defence that they didn't know how it would pan out. C# has no such excuse in copying Java.
I think both Java and C# did it as an attempt to offer some generic data structure that could cover as many use cases as possible, since neither had user-defined generic types. In retrospect, it was an error - but before true generics became a thing, it was also a godsend in some cases.
My opinion is that a high-level language like Java has no business making me guess how many bytes my numeric values will occupy. It's insane. Since when does Java give a crap about memory space? "Allocations are cheap!" they said. "Computers are fast!" they said about indirection costs. Then they stopped and asked me if I want my number to occupy 1, 2, 4 or 8 bytes? Are you kidding me?
Yes, you should have those types available so that your Java code can interact with a SQL database, or do some low-ish level network crap, or FFI with C or something. But the default should basically be a smart version of BigInteger that maybe the JVM and/or compiler could guesstimate the size of or optimize while running.
Thus, IMO, there should be a handful of numeric types that are strict in behavior and do not willy-nilly cast back and forth. Ideally you'd have Integer, UInteger, PositiveInteger, and a similar suite for Decimal types.
Schemes have done numbers correctly since basically forever.
> the default should basically be a smart version of BigInteger that maybe the JVM and/or compiler could guesstimate the size of or optimize while running.
I suspect this would be disastrous for performance. I believe Haskell uses a similar approach though.
Sometimes you want to store 20 million very small values in an array. Forcing use of bigint would preclude doing this efficiently (in the absence of very smart compiler optimisations that is).
As int_19h points out, the Ada approach lets us escape the low-level world of int8/int16/int32/int64 while retaining efficiency and portability and avoiding use of bigint.
> there should be a handful of numeric types that are strict in behavior and do not willy-nilly cast back and forth
I agree that reducing the number of implicit conversions allowed in a language is generally a good move. Preventing bugs is typically far more valuable than improving writeability. This is another thing Ada gets right.
> I suspect this would be disastrous for performance. I believe Haskell uses a similar approach though.
>
> Sometimes you want to store 20 million very small values in an array. Forcing use of bigint would preclude doing this efficiently (in the absence of very smart compiler optimisations that is).
I suspect that it would. I also suspect that I don't care. :p
We're talking about Java. Yes, you can write high-performance Java and I wouldn't want to take that option away. But look at the "default" Java application. You have giant graphs of object instances- all heap allocated, with tons of pointer chasing. You have collections (not arrays) that we don't have to guess the maximum size of.
If you're storing 20 million small values in an array, then go ahead and use byte[] or whatever. But that should be in some kind of high performance package in the standard library. The "standard" Integer type should err toward correctness over performance- the very same reason Java decided to be "C++ with garbage collection".
I'm also not literally talking about the BigInteger class as it's written today. I'm talking about a hypothetical Java that exists in a parallel universe where the built-in Integer type is just arbitrarily large. It could start with a default size of 4 or 8 bytes, since that is a sane default. Maybe the compiler would have some analysis that sees the number could never actually be large enough to need 4 bytes and just compile it to a short or byte. These things should be immutable anyway, so maybe the plus operator can detect overflow (or better if the JVM could do some kind of lower-level exception mechanism so the happy path is optimized) and upsize the returned value size. Remember, integer overflow doesn't actually happen very often- that's exactly the reason people don't typically complain about it or ever notice it (except me ;)), so it's okay if the JVM burps for a few microseconds on each overflow.
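A very rough sketch of the "plus operator detects overflow and upsizes" idea, written as library code in today's Java rather than as anything the JVM actually does. `PromotingAdd` and its `add` method are hypothetical; `Math.addExact` and `BigInteger` are real JDK API.

```java
import java.math.BigInteger;

final class PromotingAdd {
    // Returns a Long while the sum fits in 64 bits, a BigInteger once it doesn't.
    static Number add(long a, long b) {
        try {
            // Happy path: plain 64-bit addition with an overflow check.
            return Math.addExact(a, b);
        } catch (ArithmeticException overflow) {
            // Rare path: redo the addition at arbitrary precision.
            return BigInteger.valueOf(a).add(BigInteger.valueOf(b));
        }
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2));              // 3
        System.out.println(add(Long.MAX_VALUE, 1)); // 9223372036854775808
    }
}
```

This only handles the first overflow (adding to an existing BigInteger would need another overload), and returning `Number` is exactly the kind of awkwardness a built-in type would hide. It's meant to show the shape of the idea, not its cost.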
All this doesn't matter because it'll never, ever, actually happen. I just think they made the wrong call and it has unfortunately led to lots of real world bugs. It's hard to write correct, robust software in Java.
I suspect the performance penalty would be so severe it might undermine the appeal of Java. I don't have hard numbers on this though; perhaps optimising compilers can tame it somewhat. Presumably Haskell does.
A more realistic change might be to have Java default to throwing on overflow. The Math.addExact family of methods can give this behaviour in Java. In C# it's much more ergonomic: you just use the checked keyword in your source, or else configure the compiler to default to checked arithmetic (i.e. throw-on-overflow). This almost certainly brings a performance penalty though.
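For anyone who hasn't used them, this is roughly what the difference looks like in Java today. The class and method names are made up for illustration; `Math.addExact` is the real JDK method.

```java
final class OverflowDemo {
    // Default Java arithmetic: silently wraps around on overflow.
    static int wrappingIncrement(int x) {
        return x + 1;
    }

    // Checked arithmetic: throws ArithmeticException on overflow.
    static int checkedIncrement(int x) {
        return Math.addExact(x, 1);
    }

    public static void main(String[] args) {
        System.out.println(wrappingIncrement(Integer.MAX_VALUE)); // -2147483648
        try {
            checkedIncrement(Integer.MAX_VALUE);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

The ergonomic gap is that every `+` has to become a method call, whereas in C# a single checked block (or compiler switch) covers ordinary operators.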
Yeah, I don't have any real intuition about the performance cost, either. But real-world Haskell programs do fine, as you said. And Haskell has fast-math libraries that, presumably, give you the fast-but-risky C arithmetic.
I also agree that a "more realistic" option is to just throw on overflow by default, the same way we throw on divide-by-zero.
OP mentioned "Ada's approach to types", as well. Ada lets you write stuff like "T is range 1 .. 20" or "T is range -1.0 .. 1.0 digits 18". This then gets mapped to the appropriate hardware integer or floating-point type.
Yeah, I've read little snippets like that from blog posts and stuff, but I've never written a single line of Ada, so I really don't know how that works out in practice.
What happens if you overflow at runtime? A crash, I assume/hope?
My point of view is that this is the opposite of what I'm talking about anyway. Java is a high level language where we are usually writing in Java because we're agreeing to give up a lot of raw performance (heap allocations, tons of pointer chasing) in order to have convenient models (objects) and not have to worry about memory management, etc.
In light of the above, I don't see why the default for Java is to have these really nitty-gritty numeric types. I don't want to guess how big a number can be before launching my cool new product. Just like I don't use raw arrays in Java and have to guess their max size- I just use List<> and it will grow forever.
> What happens if you overflow at runtime? A crash, I assume/hope?
In Ada, if range constraints are broken at runtime, a Constraint_Error is raised (or 'thrown', if you prefer). [0] (That's assuming of course that range checks haven't been disabled, which is an option that Ada compilers offer you.)
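If it helps to see it in Java terms, here is a loose imitation of a range-constrained type such as "T is range 1 .. 20". This is only a runtime-checked sketch: the `Ranged` class and its API are hypothetical, and real Ada attaches the range to the type itself, so the compiler can pick a representation and reject some violations statically.

```java
// Hypothetical sketch: an immutable int that must stay within [min, max].
final class Ranged {
    private final int min;
    private final int max;
    private final int value;

    Ranged(int min, int max, int value) {
        if (min > max) {
            throw new IllegalArgumentException("empty range");
        }
        // Plays the role of Ada's range check (Ada would raise Constraint_Error).
        if (value < min || value > max) {
            throw new ArithmeticException(value + " is outside " + min + " .. " + max);
        }
        this.min = min;
        this.max = max;
        this.value = value;
    }

    // addExact guards the intermediate sum against int overflow;
    // the constructor then re-checks the range.
    Ranged plus(int delta) {
        return new Ranged(min, max, Math.addExact(value, delta));
    }

    int value() {
        return value;
    }
}
```

With this, `new Ranged(1, 20, 15).plus(5)` is fine and `.plus(6)` throws, which is about as close as plain Java gets to the behaviour described above.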
> I don't see why the default for Java is to have these really nitty-gritty numeric types
At the risk of retreading our earlier discussion:
I think the short answer is performance. Java has lofty goals of abstraction, yes, but it also aims to be pretty fast. If it didn't, its appeal would diminish considerably, so it's reasonable that they struck a balance like this. Same goes for why primitives aren't objects.
It depends on the base type - you can get the traditional unsigned integer wraparound behavior, too. But Ada is very explicit about this, to the point of referring to them as "modulo types", and defining them using the mod keyword instead of range.
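To be clear, what follows is not Ada syntax. It's a Java sketch of what explicitly modular arithmetic amounts to, assuming a modulus of 256 and operands already in 0 .. 255. The class `Mod256` is hypothetical; `Math.floorMod` is real JDK API.

```java
// Hypothetical sketch: arithmetic that wraps modulo 256 on purpose.
final class Mod256 {
    static final int MODULUS = 256;

    // Operands are assumed to be in 0 .. 255, so a + b cannot overflow int.
    static int add(int a, int b) {
        return Math.floorMod(a + b, MODULUS);
    }

    // floorMod keeps the result non-negative even when a - b is negative.
    static int sub(int a, int b) {
        return Math.floorMod(a - b, MODULUS);
    }

    public static void main(String[] args) {
        System.out.println(add(250, 10)); // 4
        System.out.println(sub(3, 5));    // 254
    }
}
```

The point of spelling it out is that the wraparound is requested explicitly, rather than being something every integer type does silently.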
Think of range of permissible values as a contract. I agree that the default should be "no limit", but there are many cases where you do, in fact, want to limit it, that have nothing to do with performance per se - but if the language has direct support for this, then it can also use the contract to determine the most optimal representation.