JEP draft: No longer require super() and this() to appear first in a constructor (openjdk.org)
236 points by mfiguiere on Jan 22, 2023 | 164 comments



This JEP draft touches on an interesting tension in language design (and perhaps many other kinds of design) between a desire for rules that aren't unnecessarily restrictive and a desire for rules that are easy to communicate to the user. I.e. there's value in accepting more correct programs but there's also value in easily communicating why a program is rejected, and sometimes there can be tension between the two.


When the rule was made, it would not be uncommon to write Java in, say, Emacs. Nowadays it's practically guaranteed that everyone writing Java will be using a fully fledged IDE (or, if you did stick with Emacs, there are now packages to basically turn it into a Java IDE for all intents and purposes), so needing to remember the exact rule as you type is far less of a concern: you get instant feedback, versus feedback 10 minutes down the line when you try to compile your code.


Can you expand on this? Does this change make it easier to communicate why a program was rejected?


I think this change would make it harder to explain why a program was rejected, even though it would allow a programmer more flexibility.

It's easier to just say "this() must ALWAYS be called first in a constructor" and everybody understands... than to try and say "you can call this() after other statements, but not in a try block and not if those statements reference the instance under construction", which will certainly allow you to do more interesting things, but is also more confusing.


"this() must always be called first in the constructor" is not true! Consider the following example:

    Foo(long x) { ... }
    
    Foo(double y) { this(Math.round(y)); }
In the second constructor, the first call is Math.round(), followed by Foo(long). I am talking about evaluation order, not lexical order.

I found myself needing to work around the limitation recently: https://github.com/nayuki/PNG-library/blob/b92cedc23b1fc32d5...


this() and super() MUST be called during the Initialization Block, which MAY include guard / error tests and variable assignments (a list of permitted actions and types of actions is more complex).


They aren’t mandatory, no? Thus “must”, or especially “MUST”, is misleading.


This is the slightly confusing part: a call to another constructor is mandatory, but if you don't call any other constructors yourself, super() automatically gets called before your code, without you having to write it.


Yeah, that's the confusing part. Historically that segment of code, much like in versions of other languages that require variable declarations only at the start of a new scope, is just implicitly at the start. There isn't a linguistic structure that separates it from normal code doing other tasks; previously, putting anything else there was simply a compiler error.


They may be elided entirely if there is a superclass constructor without any parameters.

In all other cases, either super(...) or this(...) MUST be called exactly once. Additionally, you can't catch Exceptions thrown by a call to super(...) or this(...) (so you can't call them inside a try{} block). Finally, you can't reference any non-static members of the "this" object except to write to a field.

So yes, the rule is significantly more complex.
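
A rough sketch of my reading of those rules (Base and Sub are made-up classes):

    class Base {
        Base(int capacity) { /* ... */ }
    }

    class Sub extends Base {
        private final int size;

        Sub(int[] data) {
            if (data == null)                 // validating arguments first: allowed
                throw new IllegalArgumentException("data must not be null");
            this.size = data.length;          // writing a field of "this": allowed
            super(data.length);               // exactly one super()/this() call, not inside a try {} block
            // reading this.size or calling instance methods before super() would still be rejected
        }
    }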


"Buffalo buffalo buffalo Buffalo buffalo" is hard to parse because English is very expressive, and you kinda have to know words have different meanings in different contexts. If I explain that buffalo is another word for bison, and that to buffalo also means to coerce, I can rephrase this as:

Bison from Buffalo coerce other Bison from Buffalo.

Generally, making a language more expressive makes it harder to parse. You have to know a lot more stuff. Computer languages often have a formal spec; I think Java does, but I don't think Perl does. Regardless, you just have to remember that footnote on page 26 of the manual saying that construct X is an option.

I haven't touched Java in a while, but I do remember some difficulty with parent class constructors calling methods implemented in the children that relied on member variables which hadn't been initialized yet. I wish I could point at a concrete example, but I think it was something deep in the heart of Swing (part of the JRE, so not changeable).
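
A minimal sketch of the general shape of that trap (made-up classes, not the actual Swing case):

    class Parent {
        Parent() {
            describe();   // calls the override in Child, before Child's fields are initialized
        }
        void describe() { }
    }

    class Child extends Parent {
        private final String name = "child";    // initialized only after Parent() returns
        @Override void describe() {
            System.out.println(name.length());  // NullPointerException when running "new Child()"
        }
    }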

I can't imagine a way this makes it easier to parse in my head as I'm reading through code, or to communicate failure. But there probably are libraries that become more usable with this new syntax.

It's a double-edged sword: because you're guaranteed super is called, the parent class can implement stuff that _has_ to happen, but there's no way for me to force a subclass to call something like pleaseInitializeAfterSuper(), and there's no way for me to do anything before super, right now.

Regardless, there's no easy way out. Making the language more expressive is a damn big hammer to use, though. You'd think you could use a builder to guarantee initialization, but sometimes you can't.

To answer your question: no, I don't think this makes it easier to communicate why a program was rejected. It makes more programs possible, though, at the cost of having to remember that footnote on page 26.


> "Buffalo buffalo buffalo Buffalo buffalo" is hard to parse because English is very expressive, and you kinda have to know words have different meanings in different contexts. If I explain that buffalo is another word for bison, and that to buffalo also means to coerce, I can rephrase this as:

> Bison from Buffalo coerce other Bison from Buffalo.

The words don't have different meanings, though, at least in idiomatic English. No-one uses the word "buffalo" to mean "coerce". I'd be surprised if you could find even a handful of examples of it being used that way.

File the "many buffalos" thing under the same heading as the "code golf / obfuscated C" type things where you've got a chain of half a dozen ternary operators wrapped in a loop to make a state machine, where a switch() statement would suffice.


I've heard this example before, but you have a great point - I've never heard anyone use "buffalo" as a verb before. I wonder what possible better versions we could collectively come up with. "Badger" is much more commonly used as a verb, but I'm unaware of a well-known location called Badger.


I support that proposal ... but I can't immediately think of anything.

Vaguely related: www.marchmarchmarch.org.uk


Badger badgers badger badger badgers

(a badger being someone who makes badges, and thus badger badger being someone who makes badger badges)

Also, if you can have a badge depicting a badger badger, then badger badger badgers badger badger badger badgers.

PS. Mushroom! MUSHROOM! Panic, a snake!


The "intimidate" meaning of buffalo is in a bunch of dictionaries, I just checked. Regional slang, I guess?


But is it in active use? I've never found anything. I feel like I'd find it in some 19th-century American literature, maybe some Ambrose Bierce or Mark Twain.


Your example isn't about expressiveness, it's about ambiguity. I don't get the relevance, to be honest.


What’s the other way to parse? I totally don’t see it.


It's not ambiguous because there are multiple ways to parse it. It's ambiguous because all tokens are identical save capitalization and represent different POS elements based on position and context.

I am not sure what your definition of "expressive" is. To me, in this context it means effectiveness at conveying thoughts or ideas. And I don't have the faintest idea of what that sentence means.


The proposed new rules for calling this() and super() aren't so foreign. They seem to be almost the same rules as for assigning final instance variables (e.g. class Foo { final int bar; }), where you must definitely assign every final field exactly once in the constructor, not more, not less.
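
For comparison, a sketch of that existing definite-assignment rule for final fields:

    class Foo {
        final int bar;

        Foo(boolean flag) {
            if (flag) {
                bar = 1;
            } else {
                bar = 2;
            }
            // bar = 3;   // error: variable bar might already have been assigned
            // (and assigning bar in only one branch would also be an error)
        }
    }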

Still, after I experienced the constructor-less paradigm in Rust, where you can only construct structs entirely all at once, and every "constructor" is just a static factory function, it's harder to justify the traditional OOP paradigm in Java, C++, etc.


As explicitly stated in the proposal, the new rules about super/this calls are exactly the same as for blank final field initialization. So the change simplifies some aspects of the language, as it removes a number of rules.

Moreover, given that Java calls overridden methods when super() calls a non-final method, the existing rules prevent some reasonable code patterns (again, see the proposal for the example), and the new ones allow addressing those in reasonable and readable ways.


This change means that some programs won't be rejected when previously they would be.


Is there actually a tension there?

What's pushing for more possible correct programs?


The tension is the following:

- super() must be called first: the rule is simple but restrictive.

- non-inherited elements can be used before calling super(): the rule is more complex but less restrictive. This can lead to more complex error messages, etc.


This particular restriction is mostly arbitrary anyway, because `this()` and `super()` accept arbitrary argument expressions (including static method calls), so in most cases it's already possible to do this, just with much worse syntax.

Technically, this does introduce new functionality when extending a class: specifically, it becomes possible to choose which constructor to delegate to depending on the arguments. But in most cases there is one constructor that all other constructors use.


Isn't it already possible to choose which constructor to overload?


With this change you can add logic to pick the overloaded constructor based on the value of the arguments (as opposed to the sequence of types)


ah yes, based on the value, not the types. Makes sense, thanks!


I enjoy that the JEP process allows for nice little small things like this. This isn't "we should have lambdas" or "Let's rebuild the type system to allow List<int>". But it's still a very common annoyance, and it can be fixed.


Totally agree, and I also love that so much thought and consideration goes into how the platform gets improved over time. Doing it "right" means taking so many things into account -- many of which are subtle -- and picking the right tradeoffs. Hats off to the people quietly doing the deep work which makes everyone's lives better.


I guess Java caught up, but for a long time it was just too restrictive.

C# was so much more joy to program in for a long time, even though they were equivalent languages in safety/speed tradeoff.


C# still has a lot to learn from Java. Just today I found myself annoyed that C# doesn’t have a diamond operator for inferring right-hand-side generics.


I might be missing something, but C# lets you omit the classname in "new" entirely if the target type of the expression is known - isn't that even shorter?


I wish the JEP showed which company was behind it. It has an author but it’s just a username not a full name and doesn’t indicate the company they work for.


Why must a company be "behind" this change? 2 minutes of sleuthing seems to suggest that the author of the JEP is a freelance developer who contributes to OpenJDK independently.


Why does there need to be a company affiliation at all? Do you feel this JEP is motivated by a company's interests?


Is there a list of JEPs which includes the affiliation of the person proposing them? It would be interesting to see how many get approved. I would expect it's a lot easier to make progress from 'inside the tent' i.e. an employee of Oracle, Red Hat etc.

Having said that, from the link posted by carimura [0] it looks like the author of this one is doing an excellent job of engaging with Brian Goetz and others at oracle.

[0]: https://news.ycombinator.com/item?id=34485541


For motivation and to see who's involved there are a few other places you can go such as amber-dev [1] and his PR's [2].

[1] https://mail.openjdk.org/pipermail/amber-dev/2023-January/th...

[2] https://github.com/openjdk/jdk/pulls?q=is%3Apr+is%3Aopen+arc...


It's always bothered me how

  public A(String s) {
   this(s.isEmpty());
  }
is allowed but

  public A(String s) {
    boolean helpfulName = s.isEmpty();
    this(helpfulName);
  }
isn't


There's something to be said for making sure the parent constructor is the first call made. It means any method you call in the current constructor will work correctly or obviously be your fault, not the parent class's.

Yes, in your particular case, the compiler could figure out that you're not using "this," but you can only make a compiler so clever.


It's not too hard to make the compiler (or more specifically, the language) know about this. The solution isn't even that clever, you just don't put "this" into scope until the super() call.


It's not that simple, because you can't find where super is called in the general case. Consider: `var foo = { bar: super }; foo['b' + 'a' + 'r']();` - you need to evaluate it in order to define scoping. But JS has lexical scoping as a design choice...


This article is about Java, your comment is about JavaScript.


Ooof. Didn't realize it. Still, I believe the same applies, we can't dynamically analyze pointers at the time scoping must be settled.


What does it have to do with pointers? super() in Java is a special language construct, it's not a method that you can have a pointer to. Thus it can be analyzed exactly the same as e.g. definitive variable assignment, and "this" be unavailable on any codepath on which super() has not been definitively called.


Yes, though in that case you can use

  public A(String s) {
    this(helpfulName(s));
  }
with a static method.


bingo - exactly the way to do it in java


I haven't written java in a long time, but I remember this requirement being a thorn in the side of many of my subclasses.


Yeah, you end up having to write crazy one-liners to initialize some other object to pass to your superclass.


Yup, that was dumb: this(f1(), f2()) is valid, but not with intermediate vars for f1, f2. I never got a good explanation why--because there isn't one.


The reason is a heavy-handed way to avoid leaking "this" before it's constructed. Part of the contract for objects is that you can't interact with an instance before its constructor has run, so that all of its internal state and public fields have been initialized.

Likewise for "super", because Java allows subclasses to modify parent fields, which need to be initialized to fulfill the same contract.

Swift, which has the same contract, splits the "initialization" phase from the "construction" phase, so you can indeed modify internal state prior to calling "this()" or "super()".


> Part of the contract for objects is that you can't interact with an instance before its constructor has run,

This contract doesn't hold anyway, because you can call an object's methods from the constructor before the object's initialization finishes, and those methods can even call the overriding code in the derived classes. So even though `super` is guaranteed to be initialized fully, `this` is not.


Interestingly, C++ avoids this problem by making virtual method calls use the vtable of the currently constructing type. So, for example, if you have Super and a subclass Sub, then while Super's constructor is running it won't call a method of Sub. Once the parent constructor completes, the new vtable is then used. So you can think of the object as starting life as a Super and then becoming a Sub after the Super constructor finishes.

It is complicated and I'm sure it has caused some bugs, but it has also solved some bugs.


It also works that way in reverse when objects are destructing, and applies to dynamic_cast in addition to vtable dispatch.

While the idea is reasonable on its face, it's convoluted enough in practice that it used to be a common source of interview gotchas, so I'm not sure that it fixes more problems than it creates in practice.


It's easier if you just use a factory pattern. That way your constructors stay as simple allocation and the logic to build instances can break out into full methods. Exceptions can stay out of your constructors and it just makes things easier to work with over all.


Honest question: if constructors have these pitfalls, and using a different design pattern fixes the issue, why have constructors at all? Couldn't the whole issue just be avoided by just having a basic, implicit constructor and then just letting people use static methods on top of this?


Historical reasons. Back then it was all about hiding state which was perceived as some hidden walled paradise where there are no rules and yet everybody lived happy and without strife. Some certainly saw the need for dependable rules and so on (after all you can have final fields and be forced to set them in the constructor) but the lure of easy attitude from true believers was so strong that The Bean became the de-facto standard for about a decade. The Bean, with its sole default constructor giving a heartfelt middle finger to everything resembling some kind of predictability.

It was a painful learning process from there to where we are now, where we consider any internal state that isn't passed to the constructor in an already immutable shape a considerable smell that you'd better have very good reasons for and we prefer a clear distinction of preparation and operation phases separated by that one constructor which sets (and in most post-OOP languages also declares) the values of the object's constituents.

(PS why isn't "post-OOP" a far more established term? It feels considerably more useful to me than the usual mind gymnastics about "functional but not quite" that provoke all those (im)purity squabbles that help noone)


"The Bean" is mostly a framework compatibility thing. For example, Hibernate needs to be able to proxy your type and cannot intercept member variables. So now you need getters and setters.


Not the least interesting subchapter of the bean story, because those things were the most pressing oop data + behavior conglomerates that desperately needed disentanglement, and because "truth from the backend" is one of the most obvious use cases for immutability. But The Bean was more than that, just think of that FactoryFactoryBean meme that haunts Java to this day which was (in my perception) the result of that misguided love affair with XML called the XmlApplicationContext. ("who needs a compiler when we have xsd to ensure footgun correctness!")


Interesting question. Though, there's not really a factory method equivalent to super(). If you want your subclass factories to also call superclass factories to initialize parent state, you'd need to come up with an additional mechanism in user-space.


I think you misunderstood the solution. You still want constructors for safety during allocation. You just want to move non-allocation tasks out of the constructor because it adds some complexity to initializing instances that doesn't need to be there.

Figuring out what to pass a constructor and the actual allocation are different tasks. The anti pattern is conflating them. It's fine to use constructors.


I think both the pitfalls of constructors and the benefits of factory methods are overstated. Go only supports factory methods, for example, and it’s probably the thing I like least about it. It makes dealing with default values in JSON really painful, for example.

Being able to put an object into a specific well defined state at the time of instantiation is really good.


I think this Java oddity is the main practical motivation for a lot of factory methods. I wonder if we will see fewer after this change.


"this Java oddity": Doesn't C++ have the same restriction?


C++ has a similar restriction, but I feel like it has many more reasons for having it. For Java there is the Java language and the JVM, but the restriction applies only to the language; the JVM had to support this anyway. So this JEP tries to loosen the restriction so that the language matches what the JVM already allows.

From a syntax perspective, the parent and member constructor calls (in C++) are in a special initializer list outside the function body, so it doesn't tease you with something that looks like a regular statement inside the constructor body.

C++ not only has constructors, but default constructors and destructors. If an exception gets thrown, the compiler needs to ensure that destructors will get called only for the parent classes and members that were initialized. It does so by having a very specific order in which the parent and member constructors and destructors get executed. The RAII mechanism also means that C++ developers are more likely to write classes where construction/destruction has side effects which need to happen in a specific order.

Unlike Java, where thanks to reference-type semantics uninitialized members can be left pointing to null, in C++ any nontrivial member will have its constructor called no matter what. If you don't do it explicitly, the default constructor will be called. Implicitly or explicitly, in both cases the member constructors will be called in definition order !BEFORE! the constructor body, but after the parent constructor call.

As if things weren't complicated enough, C++ not only supports multiple inheritance, it supports the diamond case in both ways: with the shared base class duplicated, and, in the case of virtual inheritance, with the shared base class not duplicated. With all that complexity the compiler still needs to ensure that the right subset of destructors gets executed in case of an exception during construction.

Some of these problems Java solves with the help of the garbage collector. If some members are left lying around after a failed construction in Java, sooner or later they will be cleaned up by the GC, and since Java doesn't have destructors it isn't critical when exactly that happens.


It's really not. Use before full construction is a pitfall in a lot of languages.


What’s the difference between 3 overloaded constructors and 3 overloaded factories + a constructor ?

The factories will call the constructor, they kind of look similar


The idea is that for cases where the initialization of the arguments to super() or this() becomes more complex, you provide a factory method (often only a single one) where these more complex computations take place; their results are then passed via "new" to one of the constructors, with the resulting object being the return value of the factory method. This makes it easy to create an object where such more complex values are needed for initialization, and it frees your code from jumping through hoops to circumvent the overly restrictive super/init-must-come-first policy, with constructions such as a number of private static methods that are never called from anywhere else besides the constructor.

The constructors could still be public, so if you don't need to do anything fancy, you might still just call one of the simple constructors directly. However, you often see that API designers restrict this approach in favor of a single point of entry into creating objects of that class. That is, you then have to create objects via a factory, even if you could just as easily pass already existing values to a simple constructor. But that's a different story.
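
A small sketch of that pattern (the Temperature class and method names here are made up):

    final class Temperature {
        private final double kelvin;

        private Temperature(double kelvin) {
            this.kelvin = kelvin;
        }

        // the non-trivial preparation lives here instead of being squeezed into a super()/this() argument list
        static Temperature ofCelsiusString(String text) {
            double celsius = Double.parseDouble(text.trim());
            return new Temperature(celsius + 273.15);
        }
    }

Callers then write Temperature.ofCelsiusString("21.5") and never invoke the constructor directly.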


Why are so many people writing something like "I haven't written java in a long time, but..." on Java related posts on HN? Is this trendy or what?


Meh.

What's next? Multiple inheritance?!

I'm gonna need a better hypothetical than FilteredSet<E> to assess any benefits. Today, people use factory methods to achieve that result.

Or just use a compound statement. Something like (pseudo-codimously):

   class FilteredSet <E> {
      public FilteredSet( Predicate<? super E> pred, Collection<? extends E> elems ) {  
        super( elems.apply( pred ));
        this.pred = pred;
      }
   }
My hunch (which I believe but cannot prove) is that Java's current constraint serves to encourage composition over inheritance. Which is The Correct Answer™.

My limited anecdata wrt Java's "constructors must invoke super()/this() first" is that this complaint arises when porting from C++ to Java, because the original class hierarchy leans heavily on inheritance for organizing behaviors.


Given that the entire constructor call stack is known at compile time, it should be possible to use static analysis to detect if fields are used before initialization. Swift does this, I believe.


You can call methods from constructors, including methods overridden in unknown subclasses, or methods in superclasses whose implementation may change later, which complicates things. You can also pass this to foreign code from a constructor, which in turn may call arbitrary methods and/or access fields on the instance.


The way Swift handles this is that you can’t do anything that involves the self pointer before calling super.


Sure, the super class can call unknown methods in subclasses in its constructor, but the subclass has knowledge of the super class, so the subclass should fail to compile if it overrides a super class method that access instance variables before they’re initialized.


Superclass implementations can change later (you can recompile the superclass without having to recompile the subclass), you can’t assume their current implementation will stay the same.


You could also remove a method from the superclass entirely in such a situation. This would cause an invocation of the removed method to fail if the subclass depended on it. In general you need to recompile the subclass if the superclass changes.


> In general you need to recompile the subclass if the superclass changes.

This just isn’t true. There are compatible changes and incompatible changes you can make.

Having the soundness of the compiler rely on the assumption that there won’t be any changes just isn’t viable. For example, the thread-safety guarantees provided by final fields would be difficult to ensure.


> This just isn’t true. There are compatible changes and incompatible changes you can make.

This statement exactly applies to the case you mentioned: Superclass implementations can change later (you can recompile the superclass without having to recompile the subclass), you can’t assume their current implementation will stay the same.


For me these are on two different levels. The guarantees about final fields being assigned exactly once, and the related thread-safety guarantees, are part of the soundness of the language. Late binding may fail and you'll get a NoSuchMethodError or similar, that's fine. But you won't get an Error thrown if suddenly your fields fail to initialize or your class stops being thread-safe just because the superclass implementation changed.


> In general you need to recompile the subclass if the superclass changes.

If you are using "in general" in the sense of "in the general case", you are right. But if you are using it as "typically", then that's not true - it's very common for example for a library to be compiled with Java X while subclassing at least a few standard library classes, and then to keep using the same compiled classes with Java X+1, X+2, and to expect them to just work.

The Java Standard Library actually takes quite a bit of care to make sure modifications to its (non-final) classes are backwards compatible even for subclasses.


Java supports separate compilation and binary compatibility, so any class needs to be compilable without knowing anything about any other class other than its signatures, and cannot depend on knowledge about code inside methods.


Sure, but it’s up to the user to ensure that a compatible implementation is available at runtime. And generally people don’t compile against the interface. They compile with the implementation present.

Just as you get a runtime exception if you invoke a missing method, you could get a runtime exception for accessing uninitialized state.


Java supports dynamic libraries. As long as the API hasn't changed, your class may be initialized using a different version of its superclasses than it had when it was compiled, so checking statically is not that valuable. And this is not just a theoretical possibility, it's quite common for Java libraries to be distributed in bytecode format, and to be run on various versions of the JVM - so at least any library which subclasses any standard library class is going to often run with various versions of its superclass.


I'd prefer lambdas transporting checked exceptions outside instead of forcing a try-catch inside. It's a pain in the ass for using libraries with checked exceptions.


Best part about this is finally being able to avoid the dance of 'oh no the super called an overridden method and kinda-sorta-broke but not fatally, whoopsies'.


You still have the problem that the super class might not be fully initialized when the overridden method is executed.

Calling overridable methods from a constructor is just a bad idea in general.


There's been discussion about making this illegal, or at least emitting a warning. The term—or underlying problem—is "leaking this() in constructors".


Tools like FindBugs/SpotBugs and some IDEs have been generating such warnings for many years, this basically exists. What is difficult, however, is if a final method of the same class is called (which should be fine), but that final method in turn calls a non-final method (directly or indirectly). Or if the final method is in a superclass, whose implementation could later change (or could be made non-final). Or if you store a lambda reference to a non-final method in a field (there are valid use cases for that), then call a final method which directly or indirectly invokes that lambda. Unless you want to disallow any non-static method calls from constructors, the necessary language rules would be nontrivial.
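
A tiny sketch of that indirect case (hypothetical class):

    class Widget {
        Widget() {
            layout();      // final, so a naive check is satisfied...
        }
        final void layout() {
            draw();        // ...but this call is virtual and may land in a subclass
        }
        void draw() { }    // overridable; a subclass override runs before its own fields are set
    }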

Making it illegal is difficult, because there is a lot of code relying on this, and there are valid use cases for passing this to foreign code from the constructor, foreign code which might of course invoke anything on the instance.


Did any Java-like language ever try something like a language-level MyClass$WIP type? Seems quite elaborate for a niche problem like this, but it could be useful in other ways as well, e.g. by allowing finals to not be final before promotion to non-WIP.


The checkers framework has at least some steps in this direction - possibly-not-fully-initialized classes are @Raw, which means that their fields may be null even if annotated as non-null.


This would be awesome. I tend to do a lot of work in the constructor, this would definitely help clean up different kinds of constructors.


But that in itself is an anti-pattern, isn't it? A constructor should be the minimal amount of work to create a valid set of object data. Other work should not be hidden in the constructor, otherwise it violates the principle of least surprise.

As an example of where I have seen this used: your File object could open the file and truncate it in the constructor, but this is non-obvious; it would be better to have another method like Init(), or perhaps a factory method, where calling e.g. File.OpenForWriting() is obvious.

I know not everyone has the same requirements but I can't remember a single time where this restriction was a problem for me in C#, the feature sounds like it will encourage worse behaviour.


The work done in the constructor is to ensure you have a working object, so that calls to any of its methods can never end with a null pointer exception or any other exception that can reasonably be prevented/thrown in the constructor.

Opening a file and truncating it sounds totally unexpected and I never do such things like touching the filesystem or running queries in a constructor.


One more way to bypass this requirement is to add a private static method that returns something you want to pass to super()/this().

And on the subject of Java restrictions that get in the way: if you use a local variable in an anonymous inner class or lambda, it needs to be (effectively) final. This presents a problem when you call a method that runs your lambda at some point before it returns, and you want to modify some local variables from inside the lambda. The ugly workaround is to declare a final single-element array. It satisfies "needs to be final", but its single element can still be assigned to from anywhere, since "final" only applies to the object reference to the array itself.
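
A sketch of that array workaround (made-up example):

    import java.util.List;

    class CaptureWorkaround {
        static void demo() {
            List<String> items = List.of("a", "b", "c");
            int[] count = { 0 };               // the array reference is (effectively) final...
            items.forEach(s -> count[0]++);    // ...but its element can still be written from the lambda
            System.out.println(count[0]);      // prints 3
        }
    }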


> One more way to bypass this requirement is to add a private static method

This doesn’t work if it requires multiple arguments to be generated with common code, unless you’re okay with effectively running that common code twice, which in turn is problematic when it has side-effects (like logging). See https://news.ycombinator.com/item?id=34482908 for an alternative way of solving this.


Yes, the inability to modify variables from a lambda is extremely frustrating and makes exactly zero sense, as the compiler could always convert the variable to a wrapper object on the heap. I guess this is another improvement for the future. Too bad that Java's future is Lisp's infancy.


Part of it is that lifetimes become complicated. Do you convert each variable into a separate object, or do you shove all of them into a single object representing all captured variables from that frame?

If the former, it's very space-inefficient if you're capturing lots of stuff, as is typical with callback-oriented programming.

If the latter, then a lambda referring to a single variable from some stack frame will keep all other captured variables in that frame alive. Which can be especially surprising if the lambda that is alive captured a single int, and that causes a huge array, referenced by a different local that can no longer be reached, to remain allocated.

FWIW C# does the latter, and it works "good enough" in practice. But I've seen more than one mysterious memory leak caused by it, too - I think most of the time they just aren't big enough for people to notice.


Btw Kotlin solves that. You can write to non final variables inside lambdas.


Yep, Kotlin solves almost all pain points with Java and presents a good showcase which Java language developers should have followed, IMO.

Another examples of Java pain points solved in Kotlin which I could remember:

top-level functions (which are actually bundled into a class with some "standard" name). Pretty much syntax sugar, but very nice.

Auto cast. Like, you check a variable with instanceof and after this check it's automatically cast to the target type. Java now has some weird syntax for this, but I don't see why that was necessary at all.

More default imports. Why should I have to import java.util? It's pretty much standard. I'm not asking for every java.x package to be auto-imported, but commonly used packages should be.

Class properties instead of the getter/setter ceremony. Records are an alternative approach, but I don't find them useful yet.

Nullability. Of course that's a hard question, and adding nullability to an existing type system looks like a gargantuan task. What could be done is to introduce some java.xxx annotation, let modular projects be not-null by default (so you don't need to create a package-info.java in every package just to make it not-null) and make all nullability issues warnings from javac. Basically this already exists, but it's not standard and implementation quality varies.

Reified generics. Basically it's a hidden Class<T> clazz argument passed automatically to the function. This is a brilliant solution to the type erasure problem and it should really have been adopted.

I'd vote for all those features to be implemented in Java similarly to their Kotlin implementations.


It was wrong.


> Another way to write the constructor, that works in current Java, might be to use the Streams API to make sure we are coding to the explicit and implicit design contracts:

    public FilteredSet(Predicate<? super E> filter, Collection<? extends E> elems) {
        super(checkElems(elems).stream()
                               .filter(filter)
                               .collect(Collectors.toList()));
        this.filter = filter;
    }
I think you misunderstood the problem. Assuming you leave the add() method implemented exactly as in the JEP, this will still fail with the same NullPointerException, since super() will still call FilteredSet.add() before this.filter is provided.

The only way to avoid it without this JEP is to do something like this:

  public FilteredSet(Predicate<? super E> filter, Collection<? extends E> elems) {
      super(elems.size()); // initialize super as an empty HashSet, but with the proper capacity
      this.filter = filter;
      this.addAll(elems);  // this will call this.add(), but this.filter will now be set
  }
Your description of the contract/purpose of HashSet.add() is also just wrong. The clear intention of the Java standard library designers is that any element added to the set, either by the constructor or later, is added using HashSet.add() - they are explicitly calling that method from the built-in HashSet constructor [0]. Also, it's an obvious design choice, while your proposal is just weird: why would you ever want to duplicate the implementation of adding elements between initialization and later modification?

In general, you seem to be arguing for keeping the language more clunky. Why does `7 instanceof int` or `switch(number)` or avoiding a new method definition just to check a variable before calling a constructor hurt the identity of Java in any way? Sure, all of these things could be achieved in various ways before as well, but always in a more clunky way. Many of these JEPs are pure cleanup of bizarre idiosyncrasies of Java - the precise opposite of how C++ is evolving.

[0] https://github.com/openjdk/shenandoah/blob/master/src/java.b... - note that the doc is not very explicit, but it does say that `c` is "the collection whose elements are to be placed into this set", not something like "the collection whose elements will be used to initialize this set".


Seeing this does make me appreciate the "no such thing as a constructor" approach of Rust. Rust, of course, has a totally different object model.

OOP classes are really a bit of a triumph of engineering, because they offer a lot of nice ways to build out various complex systems (I would rather build out GUIs in a Java-based object model than in a Rust-based one). And the fact that moving the model "forward" in expressivity is so subtle and starts introducing a lot of asterisks is probably proof that it's at some local maximum.


For non-Java folks like me: this() is used to call one constructor from another constructor of the same class.


Undocumented downside:

Java becomes a slightly harder language to learn, because now there is one more thing to explain to a beginner: "Where exactly should super() be put in the constructor? At the start, in the middle, or at the end? What are the benefits and downsides of each, and when does it matter? Is there a convention?"


As someone who has been using Java since 1.0, I welcome these little improvements but agree that the language is now a mess for a beginner. All of the little rules and corners make sense to someone who learned them gradually. To a beginner, they must seem nonsensical and intimidating.


This is the evolution of nearly any language... It begins simple and easy, and over time more and more features and syntax are added, and it becomes very hard to learn from scratch, which in turn means beginners end up using other simpler languages.

I think python is a classic example of this - python 2 was pretty simple. Python 3 just gets more and more complex.


Yep. Pre-generics, Java was considered a good teaching language because of its simplicity and the ability to hold the entire language in your head. I think it's much more difficult to make that argument now. The same for Python. Moreover, while Python 3's unicode/byte string separation was a crucial step, IMV other features from Python 3 haven't made a big impact on how useful the language is.


Python 2 is NOT simple. Why does str + unicode return unicode (Python 3 clearly separates bytes and str)? Why is there int vs. long (Python 3 dropped long)? Why can an inner function not assign to a variable of an outer function (Python 3 allows nonlocal)? Why does print have special syntax (>>, trailing comma) and is not a function? Why range() vs. xrange() (Python 3 makes all of them lazy)? Why redundant operators like <> vs. != (Python 3 eliminated the former)? Why class Foo vs. class Foo(object) (Python 3 eliminated the former)? Why multiple syntaxes for catching an exception?


The language formerly known as Perl 6 has entered the chat.


Why does super need to go in a specific position? Isn’t “super calls the parent class’s constructor” simpler than “super calls the parent class’s constructor and it must be the first statement in the child constructor”?


That's how I view it as well, from an end user perspective this is removing an arbitrary rule that you're required to "just memorize" (or be surprised when your IDE complains).


The problem begins when you want to do more complex things. For example, you can't call super() inside a try{} block as far as I understand, but that doesn't immediately follow from "super calls the parent class' constructor" - and it did follow from "either super() or this() must be the first statement in your constructor".


What does "try {" compile to in machine instructions or any intermediate language? I always thought of it as an annotation on a scope but now I'm curious.

Edit- from reading https://stackoverflow.com/questions/25007427/how-are-excepti... it appears my suspicion was correct. try{ is effectively an annotation that has no impact on the sequence of instructions you are authoring. So it's surprising to me that it breaks the this/super first rule.

OTOH it seems to me there should always be a rule to not allow execution paths that never call super nor this, and since try could break that one I'd always be leery of using it in constructors.


The JEP itself explains the reason why this limitation exists:

> try { } Blocks

> The restriction that super() and this() may not appear inside a try { } block comes from the JVM itself, and is due to how StackMaps are represented. The logic is that when a superclass constructor throws an exception, the new instance on the stack is neither fully uninitialized nor fully initialized, so it should be considered unusable, and therefore such a constructor must never return. However, the JVM doesn't allow the bytecode to discard the unusable instance and throw another exception; instead, it doesn't allow it to exist on the stack at all. The net effect is that constructors can't catch exceptions thrown by superclass initialization, even if rethrown. [emphasis mine]


it's simpler for the subclass.

The parent class, however, now has to deal with an arbitrary set of operations that the subclass can perform between the initializer block and the constructor body, so its contract is overall much harder to define, because you can no longer be sure of your internal state in the constructor.


This seems like a solvable problem with static analysis: don’t allow reads from instance variables / methods of the parent class until super is called.


super() was never forced to be placed at the start.

Every expression that you pass to super() was already evaluated before it, obviously.

So you were always effectively writing code like

    MyCtor() {
      int arg1 = calcArg1();
      int arg2 = calcArg2();
      super(arg1, arg2);
    }
And the fact that you weren't able to write this code literally was confusing for every beginner.

Right now the rule is as simple as it should be: you can't read `this`, explicitly or implicitly, before calling `super()`. Now it actually makes sense.

The only somewhat confusing thing about this JEP is that you can initialize fields before calling super. I don't see why this is needed and I'd treat it as a code smell. I've read the example and I don't think it's convincing. That code should be rewritten.


This is why I prefer the way C# did this:

    class Base {
        public int Count { get; set; } = 0;
    
        public Base(int counter) {
            Count = counter;
        }
    }
    
    class Sub : Base {
        public Sub(int number) : base(number) {
            System.Console.WriteLine($"This is the sub class constructor with {number} as an argument");
        }
    
        public Sub() : this(123) {
            System.Console.WriteLine("This is the sub class constructor with no arguments");
        }
    }

Rather than being a magical function call at a magical position, the call to the super constructor is part of a special bit of syntax that explicitly tells you what kind of call is being done. You can't call any method at that point of the program, you have to call a constructor by its special reference.

It also makes clear why you can do `super(fn1(), fn2())` but not extract those two calls into variables: these calls are considered part of a different phase of the execution flow than the rest of the constructor. A simple `base()` can be left out, of course; it only matters in cases where you need a specific super constructor.


I don't see how that is convenient. The Java approach will be trivially extended with this JEP and this restriction will be lifted. The C# approach will probably never be improved, so it'll never be convenient to use.


It's not convenient, but it's clear.

I would love something akin to this JEP in C#, but I'd want it in a way that makes it very clear that you're executing code prior to the base class' fields being initialized.

Something along the lines of this:

    public Foo(int bar)
    {
        prector
        {
            var baz = bar * bar;
            base(baz);
        }
        ctor
        {
            // Foo's constructor code
        }
    }


Since this and base are both keywords, and since () is not overloadable in C#, it could adopt Java-like syntax for such calls without breaking backwards compatibility, if it ever wants to do the same thing Java did.


That's also the way C++ does it, with a very similar syntax (which means C# probably got it from C++).


Very likely. C#/Java and C++ are very different languages in their behaviours and purposes, though, so I thought I'd reference the most comparable alternative.


Arbitrarily placing those bits of code at the beginning of the constructor is also confusing for a beginner. I think the way Kotlin does it (special syntax for this) is less confusing.


This is just a refinement of the original specification...


I really wish useful quality of life things like "?." would get added instead of this kind of stuff that really doesn't improve life all that much for most development


I went from using Java to Kotlin at work, and I hate the “?.” notation. I’m sure it’s nice when used correctly, but it seems that people just use it everywhere and now the program went from “When I do x and it crashes” -> “When I do x nothing happens”. It hides errors and makes debugging more difficult.

I love null being a part of the type system, but the safe call operator seems like a huge step backwards.


Why would people use it everywhere in Kotlin? I thought types weren't nullable by default.

Also I don't see the difference between if(a != null) { a.b(); } and a?.b().

They won't result in NPEs either way and I see the former everywhere in Java as well.


?. should generally be rare and can be combined with null coalescing to reduce occurrences:

    fun doThing(state: State?) {
        val idx = state?.idx ?: -1
        // no more ?. needed below
    }


You and me both. Java programmers don't seem to want it for some reason.

However, Java has instead received `Optional` (in the shape of `Optional.ofNullable(variable).map(Clazz::getProperty).map(Clazz2::getProperty).orElseGet( () -> 42 );` rather than `variable?.getProperty()?.getProperty() ?: 42`).

I don't know what the JCP EC was thinking going for this approach rather than the much more readable, modern approach, but I can't be too surprised about a language designed by committee refusing such improvements.


> (...) instead of this kind of stuff that really doesn't improve life all that much for most development

There are comments in this very discussion praising this change due to how this stuff really improves their life as a developer.

Anyway, can you pass the link to your JEP with a proposal for "?." ?


The Elvis operator was proposed for Java 7 and hasn't made it back in since. I'm sure you can put in the effort to write a proposal again but I doubt it'd stay up for long. I'm not going to put in the effort of fighting a committee that has probably already made up its mind when I can write Kotlin or C# instead.

Maybe Manifold will add the operator somehow. That seems to be the place where modern language features that the Java committees don't like end up.


This is a useless operator. I can't imagine many places where it could be useful.


Relatedly, I don’t know if the following has already been fixed in more recent Java versions, but it should:

  final Foo foo;
  try
  {
      foo = bar(baz);
  }
  catch (SomeException ex)
  {
      foo = SOME_DEFAULT;
  }
For some reason, the assignment in the catch clause is seen as a reassignment ("variable foo might already have been assigned"). As a workaround, one can factor it out into a separate method and have “return” there instead of “foo =“. This is particularly inconvenient in constructors when you want to assign to a final field.
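
A sketch of that separate-method workaround, reusing the hypothetical names from the snippet above:

    final Foo foo = fooOrDefault(baz);

    // ...

    private Foo fooOrDefault(Baz baz) {
        try {
            return bar(baz);           // every path returns exactly once, so the compiler is happy
        } catch (SomeException ex) {
            return SOME_DEFAULT;
        }
    }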


It's because if you had

    final Foo foo;
    try {
        foo = bar(baz);
        otherFunction();
    } catch (SomeException ex) {
        foo = SOME_DEFAULT;
    }
that'd truly be invalid - the first assignment could run, otherFunction errors, and foo gets reassigned. Even if SomeException is a checked exception that otherFunction doesn't declare in its throws, it's still possible to get the jvm to throw it, and things do rely on that - see Lombok @SneakyThrows.

So the only case that such final field assignment can work is if it's the last evaluated expression in the try case (plus some guaranteed non-throwing arithmetic or boring assignments), which is quite the weird thing to have as specified behavior in a language.


Of course. But Java could recognize if any potentially-throwing code is executed within the try block after the assignment, similar to how it tracks where checked exceptions are being thrown.

Meaning, when needed you could still always do:

    final Foo foo;
    try {
        Foo tmpFoo = bar(baz);
        otherFunction();
        foo = tmpFoo;
    } catch (SomeException ex) {
        foo = SOME_DEFAULT;
    }
But at present the "best" (besides introducing a separate method) you can do is

    final Foo foo;
    Foo tmpFoo;
    try {
        tmpFoo = bar(baz);
        otherFunction();
    } catch (SomeException ex) {
        tmpFoo = SOME_DEFAULT;
    }
    foo = tmpFoo;
Which has the drawbacks that (a) the compiler doesn't check that tmpFoo is assigned exactly once (because you can't make it final), and (b) tmpFoo stays around for the rest of the scope, with the risk of being misused later.

I would understand if an exemption were being made for Errors, because in principle any JVM bytecode instruction could probably throw one, but not for regular Exceptions.


What's a non-boring assignment? Every assignment in Java is boring. This is just a bad flow analyzer and that's about it.


something like

    final Foo foo;
    try {
      foo = bar(baz);
      Integer a = null;
      int b = a;
    } catch ...
(though that's less the assignment, more implicit behavior; point being, you've got quite the restrictive environment after "foo =")


Accepting that kind of code probably is too much to ask.

But accepting the original example should be doable with pretty simple AST pattern matching.


Right. Not saying that it'd be hard to support the original code, but more that it's just an extremely weird case to have. As in, I'd say that adding "foo = try { bar(baz) } catch (SomeException ex) { SOME_DEFAULT }", even though that requires adding the ability for blocks to "return" a value, would be a more sensible change to aid the problem at hand.

Were the original to be allowed, it would be the first case where adding a System.out.println(); statement into a block of code results in an error. (OK, second case, the first being adding it before a super()/this() in a constructor. But hey, look what post we're on!)


I'm having trouble constructing a specific example that doesn't work, can you provide a minimal example that fails to compile? This compiles fine in Java 11, for example:

    import java.io.IOException;

    class Test {
        int g() throws IOException {
            throw new RuntimeException("aaa");
        }

        int f() throws IOException {
            int x;

            try {
                x = g();
            } catch (IOException e) {
                x = 3;
            }

            return x;
        }
    }
So the compiler definitely believes that x is initialized because we can return it.

Not a Java expert so maybe there's some subtlety I'm missing.


Sorry, I had slightly misremembered, please see my corrected comment above. The second assignment is seen as a reassignment when the variable or field is final.


This is why the statement/expression dichotomy is a bad idea. When try/catch is an expression just like everything else, the problem disappears - you compute however you want, and then assign once.


??? I just tried this on Java 8 and it compiles fine.


Sorry, I had slightly misremembered, please see my corrected comment above. The second assignment is seen as a reassignment when the variable or field is final.


This is another thing that makes me realize that Delphi was right and Java was wrong back then. (Explicit "override" is another one.)


[flagged]


It's really annoying when you want to subclass something but do a transformation, logic or validation on the arguments that are to be passed through to the superclass, because you can't put any code prior to the this() or super() call to do that. In practice it means people stuff all kinds of awkward expressions inside the this() or super() call, which gets very ugly, or are forced to move these out into ancillary methods, which has other drawbacks.


The linked JEP provides a very well written set of reasons.


Yes, and it provides the same safety guarantees as requiring this() or super() to be the first call.


Why not?


One of the many reasons to prefer static factories to constructors.


I personally don't like static factories that much. They have their place, but half the JDK is now `.of` or `.from` or `.newInstance` or `.get`.

I'd really like to be able to define a constructor on sealed interfaces:

    sealed interface Path permits WindowsPath, UnixPath {
        public Path(String p) {
            if (isWindows()) {
                return new WindowsPath(p);
            }

            return new UnixPath(p);
        }
    }

That way user code to construct an object is always the same:

    var p = new Path(s);
Instead of:

    var p = Paths.get(s);


I have two issues with constructors.

1. They don't have a name. That's bad because names are important. It's OK if a constructor is primitive and does nothing but field assignment. But often constructors do something, and naming might help.

2. There can't be two constructors with identical types (as a consequence that constructors don't have a name). This might be restricting in some cases.

Factory methods solve both issues.

If anything, I'd prefer to remove constructors completely and use factory methods for everything. Not realistic with Java, of course. But in my code I'm trying to follow this pattern: constructors are absolutely minimal (usually just initializers for final fields) and any non-trivial code goes into a factory method. Of course it causes issues with subclassing, so it's not a silver bullet.
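
A sketch of both points at once (hypothetical class): the creation sites get names, and two "constructors" with the same parameter types can coexist:

    final class Point {
        private final double x, y;

        private Point(double x, double y) {
            this.x = x;
            this.y = y;
        }

        static Point fromCartesian(double x, double y) {
            return new Point(x, y);
        }

        static Point fromPolar(double r, double theta) {    // same (double, double) shape, different meaning
            return new Point(r * Math.cos(theta), r * Math.sin(theta));
        }
    }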


In practice the names are just not descriptive (of, from, newInstance, get), and I suspect this is because they're typically used to dynamically return implementations at runtime, so you don't know the name of the thing you want. If you know the name of the thing you want ahead of time, isn't the constructor of the named type good enough?


A good example might be static constructors that indicate the semantics of the parameters, such as of(array), copyOf(array), ofSorted(array)

Where perhaps the first wraps, the second copies and the third may wrap and avoid sorting the input. This would all of course depend on some conventions on meaning within the codebase.


An important benefit of static constructor methods is that they allow you to switch class implementations between a final class and an abstract class with private subclasses, without breaking compatibility. With regular constructors, you are prevented from switching to subclass implementations later. Static constructor methods thus decouple client code more from implementation decisions.

Another benefit is being able to name different constructors differently.

If I were to design a new OO language, I would make it so that constructors always work like that in the first place, from the caller perspective.


> An important benefit of static constructor methods is that they allow you to switch class implementations between a final class and an abstract class with private subclasses, without breaking compatibility.

Is this not what I'm describing? Using one way to construct objects would allow you to change from a concrete class to an interface without breaking client code. I just want the syntax to be the same across interface/abstract/concrete so class construction is uniform and discoverable.


I guess yes, although it now means you always need to start with interface plus class to later have the flexibility, even if you’ll never need it. It is kind of even more verbose. In current Java, you can start with a final class with static constructor methods (which is only slightly verbose), and later switch to having subclasses.

It also doesn’t provide the flexibility in naming.


I don't see how it would be more verbose.

   // you can start with a final concrete class
   final class A {}
   var a = new A();

   // and later move to interfaces or abstract types
   sealed interface A permits B, C {
      public A() {
           ...
      }
   }
   var a = new A();

Client code would remain the same with the flexibility to refactor to whatever you want later.


I see, yes, if it is possible to make this binary-compatible in the JVM, that would work. One implication is that `new A()` wouldn't be guaranteed to return a new instance anymore (just as for a static method).

Personally I still prefer having the ability to name the constructors. I'd rather have "new" be the default constructor name, with the option of having it be something else (i.e. Path.new(...) vs. Path.somethingElse(...)).


One of the many reasons to prefer composition over inheritance.


Alternatively (in particular for subclassing), it can always be worked around with an intermediate object. For example:

  Constructor(A a, B b) { … }

  Constructor(C c, D d)
  {
      // A and B can be derived from C and D, and
      // we want to forward to the (A, B) constructor,
      // so we use an intermediate object to do the
      // conversion and provide A and B:
      
      this(new Intermediate(c, d));
  }

  private Constructor(Intermediate x)
  {
      this(x.a, x.b);
  }
Of course, with the JEP this will become much simpler.


How does that help the implementor of a subclass?



