For a better GC experience, stop stopping the world and stop sharing mutable state.
ORCA (and the Pony language) solved this while allowing selective mutability and zero-copy message passing, improving on Erlang's HiPE/BEAM by tying objects to tiny per-actor heaps (actors being cooperative async threads). There is no global locking in Pony except in limited circumstances.
Azul's C4 is an improvement. Schism and Metronome are less pausey but slower overall.
How do you do zero-copy message passing if you have one heap per actor? Or am I misunderstanding, and a couple of actors share a heap, but there is no giant heap?
Has anyone ever tried to port OpenJDK's fancy garbage collectors to other runtimes? I guess often GCs are quite intimately tied to other details of the runtime and specialized to code patterns and use cases of particular languages, but still, would be interesting to hear about attempts.
That would be difficult. The GCs share a lot of code, and have a lot of code to deal with Java specific things like class unloading and soft/weak/phantom references. It's easier to run other languages on top of the JVM runtime, which is what projects like Truffle and JRuby do.
> I guess often GCs are quite intimately tied to other details of the runtime and specialized to code patterns and use cases of particular languages
This is why I don't understand the WASM GC proposal, which I understand to be an attempt to make a GC that works for all languages. Can you really write a GC that performantly supports both Java and Go given the different tradeoffs/approaches each makes with respect to memory management, layout semantics, etc?
If machine code is generated on (or close to) deployment, the implementation can choose a suitable collector. Hopefully, the proposed protocol will not require that implementations somehow optimize away interior object pointers for languages like Java that do not really need them, but make that obvious directly from the opcodes being used (or some other easily observed property). Likewise for the requirement to support pointer tagging for fixnums. Then the implementation can choose the appropriate collector based on what is being compiled (and the most general one for inter-language unification).
Keep in mind that typical builds of Hotspot support four or five garbage collectors, and most of them with two different pointer sizes. The required read and write barriers differ widely between collectors and in some cases even collector modes.
I'm not saying that it's going to be easy (an incredible amount of effort went into Hotspot over the years). But it's definitely possible to support wildly different GC strategies efficiently with sufficiently late code generation.
Certainly not for a lazy FP language like Haskell. Its GC is highly specialized for that, including performing some non-allocating reductions during GC, having special support for detecting looping computations (black holes), etc.
>Can you really write a GC that performantly supports both Java and Go given the different tradeoffs/approaches each makes with respect to memory management, layout semantics, etc?
I mean, the Graal JVM GC supports a ton of languages, especially if you include their LLVM support. Many of them are quite high performance. Truffle Ruby I believe is still (one of if not) the fastest Ruby implementation(s).
Sure, but are the languages that Graal supports diverse from the perspective of the assumptions they make? E.g., it's not surprising to me that any JVM GC would work for Java, Ruby, Python, etc since all of these languages idiomatically allocate lots of garbage like allocations are cheap (they all "look like Java" from a memory management perspective). On the other hand, Go assumes allocations are expensive and it tends to make fewer, larger allocations (the tradeoff is that it can have a relatively simple, low latency GC). It's not clear to me that a single general purpose GC (i.e., it's performant for all cases, not just the things-that-look-like-Java case) can exist, but I'm also not a GC expert.
Go doesn't differ that much from other GC languages. GC cost/complexity doesn't really change much with average allocation size. Go putting less load on the GC just means the JVM GCs would be playing on 'easy mode'.
Go has interior pointers. All byte arrays support word-sized pointers to any of their array elements. I think that's somewhat unusual for a language implementation with a precise garbage collector. Usually, pointers always point to the start of the object they reference, and not at some arbitrary offset somewhere inside the object.
C# has interior pointers as well and it is a language with a precise, moving garbage collector. The deal with Go is that it has a concurrent mark-sweep collector, so it is non-moving. A non-moving collector is significantly easier to reason about than a moving one. Hence, you would have a lot of unstated assumptions (about whether the collector moves objects or not) baked into the runtime that would need to be fixed. You probably can't use a moving collector like G1, ZGC, or LXR in Go without non-trivial refactoring.
Oh, interesting. It's the & type in the CLI (I.12.1.1.2 in ECMA-335).
It makes Java's GCs quite Java-specific in practice because there aren't that many languages that are even moderately widely used which are both statically typed (to the degree that it's possible to tell pointers apart before generating code) and restrict pointers to point to the start of the object.
Ehh yes and no. Many languages indeed use conservative GCs, but that is because it is significantly easier to implement than precise GCs which require stack maps etc. I don't think it's exactly a question about languages and static types/generating code, it's more of an implementation detail. Conservative GCs can be fast, but they are inherently less space-efficient as you retain more objects than a precise GC would. Of course in some cases a conservative GC _is_ required, for example if your language has heavy interop with C/C++ land.
Also supporting interior pointers is not too cumbersome even with a "Java-style GC" (whatever that exactly means). It requires an additional bit per smallest aligned object size to denote if an object is allocated or not. If you need to check if a pointer is an interior pointer to an object, you walk back in the bitmap and find the last object with a bit set to 1 (i.e. find the allocated object which was right before the pointer you have), get the size of that object and check that the pointer you have is within the range of the start and end of the object.
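The bitmap walk described above can be sketched roughly as follows. This is a toy model, not any real collector's code: addresses are plain byte offsets, the 8-byte granule is an assumed smallest aligned object size, and `AllocationBitmap`, `allocate`, and `resolve` are hypothetical names. A real heap would read the object size from the object header rather than from a side array.

```java
import java.util.BitSet;

/** Toy heap model: addresses are byte offsets, objects start on GRANULE-aligned boundaries. */
final class AllocationBitmap {
    static final int GRANULE = 8; // assumed smallest aligned object size, in bytes

    private final BitSet starts;     // bit i set => an object starts at address i * GRANULE
    private final int[] sizeAtStart; // object size in bytes, indexed by start granule

    AllocationBitmap(int heapBytes) {
        int granules = heapBytes / GRANULE;
        this.starts = new BitSet(granules);
        this.sizeAtStart = new int[granules];
    }

    /** Records an object of sizeBytes at a granule-aligned address. */
    void allocate(int address, int sizeBytes) {
        if (address % GRANULE != 0) {
            throw new IllegalArgumentException("unaligned address");
        }
        starts.set(address / GRANULE);
        sizeAtStart[address / GRANULE] = sizeBytes;
    }

    /** Returns the start address of the object containing ptr, or -1 if ptr is in no object. */
    int resolve(int ptr) {
        // Walk back in the bitmap to the last set bit at or before ptr's granule.
        int granule = starts.previousSetBit(ptr / GRANULE);
        if (granule < 0) {
            return -1;
        }
        int start = granule * GRANULE;
        // Range check: the pointer must fall before the end of that object.
        return ptr < start + sizeAtStart[granule] ? start : -1;
    }
}
```

With a 24-byte object at address 16, `resolve(30)` maps the interior pointer back to 16, while `resolve(40)` returns -1 because 40 is one past the end of that object.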
EDIT: The big issue you would face with directly porting a Java GC into other languages is the weak reference processing semantics. Each language has their own subtle differences which would make it a pain.
Sure, but all of those languages "look like Java" from a memory management perspective. Namely, programs written in those languages idiomatically assume that allocations are dirt cheap (bump allocator) and thus they allocate a lot of garbage. A language like Go tends to assume allocations are more expensive (tradeoff for a simple/non-moving, low latency GC) and consequently makes fewer, larger allocations. It's not obvious to me that a single GC can support both extremes performantly.
AFAIK, no. The opposite is true with MMTk (https://www.mmtk.io), which is a toolkit with many GC algorithms implemented that has been plugged into other runtimes, including, as it happens, OpenJDK.
I've wondered how feasible this would be. In general, I think the other commenters are correct that GCs tend to be intimately tied to the runtimes. But the JVM now has an interface for garbage collection (https://openjdk.org/jeps/304), which might make the GCs easier to extract and use in other projects. I'm not aware of anyone trying though.
Is there a good guide to learning modern Java? At my workplace, we’re thinking about using it for a few things from Python. I’m not looking for a “Java for Python developers” but more like “Java for people who already know about the basic structures of programming languages”. I would love to hear about variable instantiating and for loops as little as possible to grok the Java model of programming, then drive right into the meaty bits of how Java programmers think today. I did Java 20 years ago in college.
I recommend The Core Java pair of books by Cay Horstmann, supplemented by Effective Java by Bloch. Those are probably the two most authoritative sources on Java out there, and both highly readable.
I heard not so good things about Herb Schildt's book.
I second this recommendation! I read Horstmann's books recently and they were excellent. He really puts things in context – any time I was thinking "WTF? Why would they do things this way?" he had an explanation coming up (or in some cases an acknowledgment that it is indeed a WTF).
Horstmann was one of my favorite teachers at San Jose State. He's a really sharp guy, and very enthusiastic teacher. I had him for a programming languages class and we dove into the various implementations of closures that were to be added to Java. We compared these to Scala and some other languages. Really fun diving in with his explanations along the way.
Effective Java by Joshua Bloch is fantastic for covering up to Java 9. There have been incremental updates since then but this book will get you 95% of the way, and it is very well written.
not really good for learning Java though, more for intermediate Java programmers. A much better book for someone learning the language is Core Java by Horstmann, which starts from the beginning and goes into depth.
Kotlin has no community process and is the work of a single company. The dev experience is mostly bound to IntelliJ. Java is picking up the slack every release and actually innovates over Kotlin in new features.
What do you mean by KVM? A Kotlin-specific VM? I don't see the need for that, as Kotlin the language is rapidly evolving into a higher-level language that runs on top of the JVM, Android, web and native platforms (iOS, macOS, Windows, Linux).
Now in 2023 I'd strongly think twice before investing heavily in Java. For the vast majority of use cases where Java is utilized Go ends up being a better choice w/ cleaner codebases, static binaries (none of this JDK/JRE deployment mess), and phenomenal performance. Not to mention proper interop with native (C/C++) code without JNI funny business.
I work at Microsoft and I'd use Java (not even C# and yes at work) a hundred times before I'd consider using Go again. Go dependency management is a bit better these days, but the build system is an absolute mess, logging is dreadful, and IDE support seems pretty primitive. Also, Exceptions are vastly superior for most services compared to returning an error outside of a few edge cases since most of the time I have no idea how to handle errors at the callsite and really just need to handle any errors at a high level (like reverting the transaction and returning an HTTP error code).
I love how these people think I have a choice in languages to use, or that I haven’t tried several others as well and honestly don’t mind the idea of Java. My company wants to make money doing cutting edge algorithms, not test out Google’s language du jour.
> I love how these people think I have a choice in languages to use
Probably because you wrote, "At my workplace, we’re thinking about using it" and not "my workplace is demanding we use it"
> or that I haven’t tried several others
(I'm not the person who replied and suggested Go, but your comment does make it sound like you are in the language-evaluation phase, not the "we're fully committed to Java after reviewing multiple options" phase)
My friend, I specifically asked for good Java resources, but did not ask for other opinions. Those who suggested other languages were just cutting in to the conversation. I don’t care about their reasoning, I just wanted to learn Java.
Go has exceptions, with the usual stack unwinding. The standard library does not use them (for the most part, panics are sometimes used to simplify error handling code, see encoding/gob). It's like Perl and Rust in this regard. Undocumented exceptions/panics are often considered security vulnerabilities by the Go team.
The argument against exceptions is that non-local control flow can introduce obscure bugs, like forgetting to clean up resources on the (invisible) exceptional execution path. On the other hand, without exceptions, it's possible to end up with no error handling at all by accident, and that too is not visible in the source code (but linters can help, of course).
Many languages which are anti-exception as a matter of principle still use them to report out-of-memory conditions, out-of-bounds array indices, integer division by zero, or attempts to access absent optional values. Not doing this results in code that is simply too verbose.
> like forgetting to clean up resources on the (invisible) exceptional execution path
Except this is still possible on the non-exceptional execution path. You simply just need to forget the defer call. The only thing that solves this is RAII and destructors.
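For comparison, Java's closest analogue to RAII is try-with-resources, which runs `close()` on both the normal and the exceptional path without the caller having to remember it. A minimal sketch, with `TrackedResource` and `ResourceCleanupDemo` as made-up names for illustration:

```java
import java.util.ArrayList;
import java.util.List;

final class ResourceCleanupDemo {
    /** Hypothetical resource that records when it is opened and closed. */
    static final class TrackedResource implements AutoCloseable {
        private final String name;
        private final List<String> log;

        TrackedResource(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    /** Opens two resources, fails inside the block, and returns the event log. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (TrackedResource a = new TrackedResource("a", log);
             TrackedResource b = new TrackedResource("b", log)) {
            throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            // Both resources are already closed (in reverse order) by the time we get here.
            log.add("caught " + e.getMessage());
        }
        return log;
    }
}
```

The cleanup is tied to the block's scope rather than to an explicit call, so there is no equivalent of a forgotten `defer`. The resource still has to be declared in the `try` header, though, which is weaker than destructors.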
You talking about JDK/JRE deployment mess means you haven't looked at java for 5+ years, as that's no longer an issue. Hence it's hard to trust your advice when you're so off.
And no, the vast majority of use cases for java isn't distributing binaries or interoping with JNI.
Having done in-depth work both with JNI and with cgo, I don't agree that Go is without "funny business". Indeed, cgo leaves a lot to be desired IMO, especially if you wish to distribute a library for other projects to consume. The interplay between native and managed code is also fairly tricky.
Always start with the question: do you need to optimize? :) Most likely not! Java is _fast_, and the JVM is pretty good at what it does; 95% of the time merely checking for GC thrashing is all that’s needed.
Otherwise, start with G1 and get your Xmx value in the ballpark. VisualVM can help you determine if you're thrashing. Are you GCing like 10+ times per second? Keep an eye on it. If you start hitting giant pause times and 50+ collections a second, you've got problems :) Increase Xmx. (and no, please don't set Xms = Xmx).
If you have issues past that, it’s not the garbage collector that needs help; the next step is to audit your code base. Bad code makes any garbage collector in any language misbehave.
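If you'd rather check the collection rate from inside the process than from VisualVM, the standard `java.lang.management` beans expose the same counters. A rough sketch (the class name `GcRateProbe` and the one-second window are arbitrary choices, not anything from the tools mentioned above):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcRateProbe {
    /** Sums collection counts across all collectors; a count of -1 means "undefined" and is skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Collections per second observed over a sampling window. */
    static double collectionsPerSecond(long windowMillis) throws InterruptedException {
        long before = totalCollections();
        Thread.sleep(windowMillis);
        long after = totalCollections();
        return (after - before) * 1000.0 / windowMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.printf("GC rate: %.1f collections/s%n", collectionsPerSecond(1000));
    }
}
```

Sampling this under realistic load gives a number to compare against the "10+ per second" rule of thumb. It counts young and old collections together, so treat it as a coarse signal only.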
For instance, are you `select *`ing from a table then using the Java streams api to filter rows back from a database? That will cause GC issues :) fix that first.
So now if you've got to this point and you still need to optimize, what we've done is just run through the different collectors under load. One of our JVMs is a message broker (ActiveMQ 5.16.x) and we push a couple thousand messages per second through it. We found that Shenandoah actually improved latency for our particular use case, which was more important than throughput and outright performance.
Oh, and if your application and use case are _extremely_ sensitive to latency, forget everything I wrote and contact Azul Systems about their Prime collector. They're pretty awesome folks.
I think basically the argument is that by setting the min bound lower, you allow the JVM to shrink the heap. This could maybe be beneficial towards reducing pause time because the JVM has less memory to manage overall. That being said, that SO answer also mentions:
> Sometimes this behavior doesn't give you the performance benefits you'd expect and in those cases it's best to set mx == ms.
I've also seen apps configured this way professionally for similar reasons. You might imagine some app that leads to the JVM pathologically trimming the heap in a way that isn't desirable and thus impacts performance in some subtle way, etc. The answer with a lot of this stuff is usually try both ways, measure, see if you can observe a meaningful difference for your apps/workload for typical/peak traffic.
Background: I've worked on Java apps for a few years at reasonable scale and worked on GC pressure issues in that time.
where to start... first the other commenter is correct. If you need to run to the convenience store that's 250 yards away, what is more efficient? Firing up a 7,100hp Union Pacific diesel locomotive and plowing through everything/everyone to get there, or taking an electric scooter? A smaller heap means less to scan, meaning less work, meaning lower latency. Now G1GC removes some of this penalty due to its algorithm, but in general, less memory is less memory scanned, meaning better best-case performance from a GC. Let that Adaptive GC Boundary do its thing.
and now, rant time: well, Xms allegedly improves start times. Is that really important? No. Is that really true anyway? Not really. Xms hides problems... Yes. Give that memory to the OS. Let it fill it up with things like disk cache or COW artifacts. Xms is an attempt to help people that aren't planning properly or testing. Yes. Instead, test your systems under full load with Xms off and adjust, measure, experiment, repeat.
Elastic is sort of a special case because it's a 'database', but I'd rather know the minimum Xmx my system actually needs by experimentation, and you can't find that with Xms enabled. And even then, I don't see MySQL allocating 100% of its InnoDB buffer pools at startup.... :)
You scan the live set (which most often is not dependent on the heap size). Bigger heap most often means better performance. Setting Xms to Xmx is good and valid if you know how much memory you need (which often is the case).
If you prefer high throughput, then relax the pause-time goal by using -XX:MaxGCPauseMillis or provide a larger heap. If latency is the main requirement, then modify the pause-time target. Avoid limiting the young generation size to particular values by using options like -Xmn, -XX:NewRatio and others because the young generation size is the main means for G1 to allow it to meet the pause-time. Setting the young generation size to a single value overrides and practically disables pause-time control. [1]
Setting -Xms and -Xmx to the same value increases predictability by removing the most important sizing decision from the virtual machine. However, the virtual machine is then unable to compensate if you make a poor choice. [1]
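Whichever side of the Xms debate you land on, it helps to see what the running JVM actually ended up with. A small sketch using the standard `MemoryMXBean`; the class name `HeapBounds` and the `isFixedSize` helper are made up here, and the assumption that initial size equals max when -Xms and -Xmx match may not hold exactly if the JVM rounds the two values differently:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapBounds {
    /** Snapshot of the heap's initial, used, committed, and maximum sizes in bytes. */
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    /** True when initial and maximum heap sizes match, as expected with -Xms equal to -Xmx. */
    static boolean isFixedSize(MemoryUsage usage) {
        // getMax() returns -1 when the maximum is undefined.
        return usage.getMax() != -1 && usage.getInit() == usage.getMax();
    }

    public static void main(String[] args) {
        MemoryUsage u = heap();
        long mb = 1024 * 1024;
        System.out.printf("init=%d MB, committed=%d MB, max=%d MB, fixed=%b%n",
                u.getInit() / mb, u.getCommitted() / mb, u.getMax() / mb, isFixedSize(u));
    }
}
```

Logging the committed size periodically under load shows whether the JVM is actually growing or shrinking the heap, which is the behavior the two sides are arguing about.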
If you run it in a container, the best advice is to leave things at the default options. The JVM has ergonomic GC options with adaptive heap sizing. 95% of the time you don't need to change anything at all.
And if you do any IO with ByteBuffers or a library that does, you're simply gonna have a bad time with native memory usage, which is 10x as convoluted as selecting a GC and caring only about the heap.
And best part: every month a new bug surfaces where NMT (Native Memory Tracking) reports absurd memory usage that is hard to track because regular metrics don't expose it.
So you have to enable NMT which in most cases is a straight 5-10% performance degradation.
And for latency use ZGC or Shenandoah.
And don't forget the 50 JVM flags to tweak memory usage of various parts of memory that can negatively impact your production, like caches, symbol tables, metaspace, thread caches, buffer caches and others.
God I hate that so much. Just let me set one param for memory and let's be done with it.
Yup, every once in a while use the various available tools to watch your memory usage patterns and how GC runs. If it looks fine continue to leave it as is.
The default GC depends on how many CPUs and how much RAM the VM has. With 1 core it is always SerialGC, for example. 2 cores and less than 4 gb - concurrent mark and sweep IIRC. G1GC starts a bit later.
Even with 1 CPU, ParallelGC has lower latencies than SerialGC. SerialGC will be better in environments with limits on the number of threads, not the number of CPUs.
> 2 cores and less than 4 gb - concurrent mark and sweep IIRC
CMS was deprecated in JDK 9 and removed in JDK 14, a non-LTS release. JVM ergonomics will automatically turn on G1GC when it detects that the machine has at least 2 CPUs and 1792 MB of RAM (not heap, memory in total). When either or both numbers are lower, SerialGC is selected automatically.
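Rather than guessing what ergonomics picked, you can ask the running JVM. A short sketch using the standard management beans (the class name `SelectedCollector` is made up; the bean names printed vary by collector and JDK version, so no particular output is assumed):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class SelectedCollector {
    /** Names of the collector beans registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // availableProcessors() reflects container CPU limits on recent JDKs.
        System.out.println("CPUs seen by JVM: " + Runtime.getRuntime().availableProcessors());
        System.out.println("Collectors: " + collectorNames());
    }
}
```

Running this inside the actual container or VM is the point: the CPU and memory counts the JVM sees there are what drive the default selection.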
The serial and parallel GCs are best used for applications that care more about throughput than latency (pauses). If you only have one CPU core, running multiple threads isn't going to speed up the GC cycle. The parallel collector just adds overhead due to context switching among threads.
My java/kotlin app needs to keep a big table fully in memory. (~10 million records). And it is reloaded about 3 times a day.
In C, I would just malloc the whole table in one chunk. Perhaps there is a specialized GC for this usage?
There's no particular limit in the JVM, you can use all available memory if you like.
For best performance you can decompose your structure into primitive fields (int, float, char, etc) and create an array for every field. So you have, say, 10 arrays with 10 million items each, instead of one array which holds pointers to 10 million separate objects on the heap. It gets tricky with strings (you need to flatten all strings into a giant char[] array and keep two arrays with index and length data), but it's doable.
Though 10 million records might be OK for the JVM. Measure your GC times.
I'd suggest hiding the implementation details behind an API: start with ArrayList<MyRecord> and refactor it later if needed.
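A sketch of what that could look like, assuming a made-up record shape with an int id, a double price, and a string name. `RecordTable` and `ColumnarTable` are hypothetical names, and the interface is what lets you start with a list-backed implementation and swap in the columnar one later:

```java
/** Hypothetical read-only table API; callers never see the storage layout. */
interface RecordTable {
    int size();
    int id(int row);
    double price(int row);
    String name(int row);
}

/** Column-per-field layout: a handful of large arrays instead of millions of small objects. */
final class ColumnarTable implements RecordTable {
    private final int[] ids;
    private final double[] prices;
    private final char[] nameChars;  // all names concatenated into one array
    private final int[] nameStart;   // offset of each row's name in nameChars
    private final int[] nameLength;  // length of each row's name

    ColumnarTable(int[] ids, double[] prices, String[] names) {
        int n = ids.length;
        if (prices.length != n || names.length != n) {
            throw new IllegalArgumentException("column lengths differ");
        }
        this.ids = ids.clone();
        this.prices = prices.clone();
        this.nameStart = new int[n];
        this.nameLength = new int[n];
        StringBuilder all = new StringBuilder();
        for (int i = 0; i < n; i++) {
            nameStart[i] = all.length();
            nameLength[i] = names[i].length();
            all.append(names[i]);
        }
        this.nameChars = all.toString().toCharArray();
    }

    @Override public int size() { return ids.length; }
    @Override public int id(int row) { return ids[row]; }
    @Override public double price(int row) { return prices[row]; }

    /** Materializes the name on demand; this allocates a short-lived String per call. */
    @Override public String name(int row) {
        return new String(nameChars, nameStart[row], nameLength[row]);
    }
}
```

The GC then sees five long-lived arrays instead of 10 million objects plus their strings, so there is far less to trace on each cycle. The cost is that `name(row)` allocates, so hot paths may want to compare against the `char[]` directly.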
Do people follow GC research at all 'for fun'? Any places/sites/people to pay attention to for the latest and greatest goings-on that may show up in production x years from now?
I do follow Loom 'for fun', lurking the mailing list archives [1]. You can definitely do the same for GC implementation [2] (general topic list here [3]); but it may not talk a lot about external initiatives like Shenandoah.
Beware that you're gonna have to filter a lot! There's patch merging and very low detail implementation conversations. For example, you'd be pleased to know that G1 can now skip a guard in card-table clearing [4]. Don't ask me what is the card table and guards and why do you need to clear it, though.
One thing I'm looking for in GC advances is new hardware support for it in RISC-V J extension. There's gonna be memory tagging (helping security and memory management in GCs), and pointer masking (hardware support for what ZGC does under the hood)[5]. But we're probably a good 5 years away from seeing that in real life, if ever.
I am interested in this area too, but I don't know of any single place to follow it all. I have found that there isn't an avalanche of "new" stuff, but a lot of different ideas with different tradeoffs that change in relevance based on usage patterns, languages/runtimes, and hardware characteristics.
Besides the JVM resources linked by others, I found Richard Jones's "Garbage Collection Handbook" to be a decent introduction for background [1]. The Go team has written a bunch about their GC approach [2] - it is really interesting to see how it compares to the various options in the JVM and under which scenario you might prefer one or the other. And occasionally there are interesting articles on arxiv.
Sure, I follow it but HotSpot is state-of-the-art, so if you follow that team then you're basically following GC research already. There's a little to be found in academia, but only rarely these days. The implementation cost to get to the cutting edge is too high.
There are so many huge features in Java 21, especially around scalability and data handling - I feel like it may be a watershed moment that brings Java back to the forefront as a leading choice for heavy lifting, data focused work. As an LTS release I could see it forming the next long term baseline that people build on for a very long time.
I guess I'm lumping a lot of JDK20 stuff in there as well since most folks will hold off migrating until LTS releases arrive with features. But the combination of things like Record Patterns, Virtual Threads, Foreign Function and Memory API and the Vector API will add up to a very powerful set of features for building data focused applications.
Those features are in incubator/preview mode in Java 20. While they might be released with 21, they might just as well not be. So I wouldn't hold my breath, unless you can jump on preview features (and if you can, why even care about LTS).
I'd wager TS/node/etc going the way of PHP and JQuery in the next decade. SPA will go full WASM (which you could technically use Java for) and non-SPA will be built off better browsers (e.g. WebKit adding nested CSS) and HTMX-style solutions. Note that it doesn't mean everyone will be rolling their own WASM-based engine, but that there will be a bunch of public WASM-based engines (like PyScript/Pyodide stuff) that are tailored for specific domains rather than one-size-fits-all React.
This leaves "full stack JS" in an awkward middle ground. Sure, you could still use it on your backend (like PHP), but why?
We use JS because there were literally no other practical options, but better browsers and WASM are providing new options.
Java has really good runtime monitoring capability. Being able to get a thread dump from a production system, or generating Flight Recorder logs for offline analysis is incredibly useful. Is there something similar in the Node ecosystem?
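As a taste of what that monitoring API looks like from inside the process, here is a minimal sketch that builds a thread dump with the standard `ThreadMXBean`. The class name `ThreadDumpSketch` and the output format are made up; in production you would more likely use `jstack`, `jcmd`, or JFR against the live process:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

final class ThreadDumpSketch {
    /** Returns a plain-text dump of every live thread's name, state, and stack. */
    static String dump() {
        StringBuilder sb = new StringBuilder();
        // false/false: skip locked-monitor and synchronizer details to keep the call cheap.
        ThreadInfo[] threads = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : threads) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```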
I've been doing backend dev in Node/TS for the first time recently after working mainly in Python or Ruby for many years. And overall I have to say I'm not very impressed. The tools and frameworks are immature and it's annoying having to deal with all the unfixable flaws in JS.
If I were starting a new company today I would certainly seriously consider Java or Kotlin instead.
For the same reasons you'd pick golang over python/TS.
Only you'll get higher-level abstractions (pattern matching, FP-style maps/flatMaps&Co throughout the standard library) and powerful collections with full support for generics.
I personally love scala: It's been my main language at work for well over a decade. However, the barriers of entry are real, especially since arguably the most popular styles of the language demand learning the most esoteric parts of the language straight away. If your organization is well seeded with very experienced people, they'll be able to train people up, but it's oh so easy for things to go wrong.
If you are looking for all the advanced type system features, kotlin is definitely not going to put you in the same spot, but if you are hiring Java devs, training them is a far easier lift: They can mostly train themselves.
I'd hope that after another version or two of scala3, when more organizations are happy running it in production, we might have an easier onboarding road, where we don't have to explain implicit parameters, implicit conversions and implicit classes, just so that we can get to the real meat that is the mechanics of type classes. But, as is, there's good defensible reasons for many teams to go try Kotlin first.
Agreed. If somebody doesn't know Java and decides to jump to Kotlin, I will have the same question: why not Scala which is used more than Kotlin on computers (i.e. non-Android)?
I'm not necessarily advocating for Kotlin but the key argument would be supportability. Scala is a very complex language+ecosystem with idioms many people are highly unfamiliar with. Comparatively, Kotlin can be written idiomatically in a way that the vast majority of developers will readily understand and quickly come up to speed with.
Java is such a storied and long-running and used-almost-everywhere language especially in Data Engineering (see all the Apache Data Eng projects like Calcite, Hudi, etc) but I just find it soooooo verbose and everything being a class and having to override things ugh .. it's all the things I hate about OOP in the forefront.
One of the goals of Project Amber [1][2] is to move the language towards a more data-oriented programming model. With Records, patterns, sealed classes, etc., it should feel much less verbose over time. And unrelated to your concern but addressing some of the learning overhead, see Paving the Onramp. [3]
Any word on when some of project amber features will come out of preview? I get excited each JVM release for some of those features, but it seems like most of the releases the preview count just gets bumped, and a few more get added to the preview holding pattern.
Text blocks, var, records, sealed classes, and pattern matching in instanceof have been out of preview for some time, but two more features -- record patterns (https://openjdk.org/jeps/440) and pattern matching in switch (https://openjdk.org/jeps/441) -- are about to come out of preview.
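For anyone who hasn't seen the already-final features together, here is a small sketch that sticks to what is out of preview as of Java 17 (records, sealed types, `instanceof` patterns, `var`, text blocks). The `Shape` hierarchy is invented for illustration, and the preview features mentioned above (record patterns, pattern matching in switch) are deliberately left out:

```java
/** Sealed: the compiler knows Circle and Square are the only implementations. */
sealed interface Shape permits Circle, Square {}

record Circle(double radius) implements Shape {}

record Square(double side) implements Shape {}

final class Shapes {
    static double area(Shape s) {
        // Pattern matching for instanceof binds the narrowed variable in one step.
        if (s instanceof Circle c) {
            return Math.PI * c.radius() * c.radius();
        }
        if (s instanceof Square sq) {
            return sq.side() * sq.side();
        }
        throw new AssertionError("unreachable: Shape is sealed");
    }

    static String describe(Shape s) {
        var area = area(s); // `var` infers double
        // Text block plus String.formatted (Java 15+).
        return """
            shape: %s
            area:  %f
            """.formatted(s, area);
    }
}
```

Once pattern matching in switch is final, the `if` chain and the unreachable `throw` in `area` can collapse into a single exhaustive `switch` over the sealed type.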
Is there any work to make records actually usable out of the box?
Things like copying or creating derived records are a huge pain (or slight pain with code generators) while other languages have solved this long ago (even JS and C#).
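To make the pain concrete: without language support, every derived copy needs a hand-written (or generated) "wither" per component. A sketch with a made-up `Account` record:

```java
/** Hypothetical record; each withX method must be written and maintained by hand. */
record Account(String owner, long balanceCents, boolean frozen) {
    Account withOwner(String newOwner) {
        return new Account(newOwner, balanceCents, frozen);
    }

    Account withBalanceCents(long newBalanceCents) {
        return new Account(owner, newBalanceCents, frozen);
    }

    Account withFrozen(boolean newFrozen) {
        return new Account(owner, balanceCents, newFrozen);
    }
}
```

Usage is fine (`account.withBalanceCents(250)` returns a new record and leaves the original untouched), but adding a fourth component means touching every wither, which is the boilerplate that `copy` in Kotlin or `with` expressions in C# avoid.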
Sure, but you could refer to the lineage of a dozen languages. Most of the world runs Java and evolving it takes care and consideration not to alienate a massive user base and ensuring that it evolves in the right way, not quick responses to fashions and trends.
Scala is super complex, introduces breaking changes all the time, is super slow to compile, multi-language projects are also complex, and the decisions that are right for Scala may not be right for Java.
The "super complex" and "introduces breaking changes all the time" comments are unsubstantiated FUD. Scala has evolved a lot in the last few years, particularly in the area of binary compatibility. It's a wonderful language and I can only recommend others try it. This from a programmer very happy with Scala.
And this is from a programmer that is not happy with Scala. Every single time I've upgraded the compiler there has been a breaking change. They don't strictly follow semver. Most recently, upgrading from the 2.11 to the 2.13 compiler, they made breaking serial version UID changes (I know, don't use Java serialization, but that wasn't my decision) and none of it was noted in the release notes.
When it comes to super complex just look at any of the type signatures of the standard library for collections:
def ++[B >: A, That](that: GenTraversableOnce[B])(implicit bf: CanBuildFrom[IndexedSeq[A], B, That]): That
Comparing this to Java it is "super complex". IntelliJ can't even figure out the types sometimes.
Scala 3 came out not too long ago and it fixed plenty of shortcomings. I do recommend giving it another try.
Also, the reason Scala’s collections have such complex signatures is that it is hands down the best collection lib out of any language I have used.
Right and I don’t want to have to deal with half the community being split. Or half my coworkers doing it one way. These are just things I don’t want to deal with. I said my reasons were petty! But they’re my reasons.
I don't know about the serialization issue but I don't doubt you had it.
> def ++[B >: A, That](that: GenTraversableOnce[B])(implicit bf: CanBuildFrom[IndexedSeq[A], B, That]): That
I should point out that this is the Scala 2.12 and earlier signature. `CanBuildFrom` is gone from the Scala collections since Scala 2.13. In fact, the collections were redesigned for Scala 2.13 primarily to simplify method signatures, following community feedback.
And then when a dependency of one of your dependencies is broken against the latest version. I know some of this has been cleaned up, but it is one of the main reasons I no longer use Scala - I have a rule about how long I'm willing to spend on build issues vs actually writing code, and Scala was always on the wrong side of that.
re: Complexity - At least the signatures for the core collections have been cleaned up a fair amount. That said, the richness of the type system and the prevalence of operator overloading always made it feel like a language you could be really productive in once you knew the language and the current codebase really well, but was really hard to just read through unfamiliar code and know what is going on.
> And then when a dependency of one of your dependencies is broken against the latest version
How is this any worse than Java? My most vexing dependency-hell issues have involved breaking API changes to Hamcrest matchers and Apache Http Client; more recently Jackson-databind. All of those are Java libraries, brought in via transitive dependencies, usually from Java libraries.
There was definitely a large and vocal part of the community that wanted that, but I think early on there was a lot of tension between Scala being "better Java" and "Haskell for the JVM", and that probably hindered a lot of adoption.
Sure, but what about when you want to pull a Kotlin library into a Scala application? It works, but usually only works well if the library author limited themselves to the subset of the language that interops with Java (the language).
More features in Java (the language) gives other JVM languages a larger set of tools to design interop support around, while letting them remain a place for these features to incubate without the headache of the JEP. Sometimes these language-level changes may come with modifications to the JVM to support them as well, letting other languages clean up their implementations.
It’s terrible. But at least I rarely have to touch the config. I guess the silver lining is it’s so bad we don’t use it for anything besides dependency management so configs are simple and just copy pasted between projects and rarely touched.
SBT is the worst thing about the Scala ecosystem. Just stick with Maven (or Gradle, if that's what you're using), enable the Scala plugin, and start trying Scala in more flexible leaf areas of your program (e.g. integration tests, ancillary tools, data migrations). See if it feels right for you.
As the classic saying goes: there are languages everyone complains about and languages that are not (actually) used. Verbosity is usually a _good_ thing, especially in large code bases with a large number of people (where every successful startup will eventually get).
If verbosity would be a good thing Java wouldn't introduce language features/ library enhancements to cut down the noise (var, collection factories, switch expressions, records).
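For anyone who hasn't touched Java in a while, here is a rough sketch of what those features look like together. All names (`Point`, `Shape`, `NoiseDemo`) are made up for illustration, and it assumes Java 17 or newer.

```java
import java.util.List;

// Hypothetical type for illustration only.
// A record generates the constructor, accessors, equals, hashCode and toString.
record Point(int x, int y) {}

class NoiseDemo {
    enum Shape { CIRCLE, SQUARE, TRIANGLE }

    // Switch expression: produces a value, no break statements, no fall-through.
    static int corners(Shape s) {
        return switch (s) {
            case CIRCLE -> 0;
            case SQUARE -> 4;
            case TRIANGLE -> 3;
        };
    }

    // var (local type inference) plus the List.of collection factory.
    static int totalX() {
        var points = List.of(new Point(1, 2), new Point(3, 4));
        int sum = 0;
        for (var p : points) {
            sum += p.x();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(corners(Shape.SQUARE));
        System.out.println(totalX());
        System.out.println(new Point(1, 2));
    }
}
```

Before records, `Point` alone would have needed a hand-written constructor, two getters, `equals`, `hashCode` and `toString`.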
Java is verbose because it's inexpressive, not because it's good at documenting. It's Objective-C with everything too interesting removed, but they never added enough of it back.
The only things left from Objective-C that actually matter are AOT (always available in commercial JDKs) and value types (Valhalla will hopefully be ready some day).
The longstanding lack of generics (which has been remedied) aside, who really complains about golang?
The reality is Java truly shows its age. Its dogmatic insistence on pure OOP and the enormous number of horrifyingly unintuitive design patterns adopted by the community cause even the most well intentioned engineering culture to eventually produce ugly codebases.
At this point if you're stuck in the JVM ecosystem you're almost certainly better off with Scala (which indeed is actually used) or the newer Kotlin.
In a way, for all the shit people give C++, modern C++ codebases are actually quite pleasant and the language is very flexible. Would I encourage its use for general production systems? Not really. That crown is Golang's.
Come on, golang is as expressive as java 1.1. Java is much much less verbose and more readable than go, but for some reason the latter’s proponents can’t see it..
I've returned to Java after using Clojure and Groovy for several years. I appreciate the fact that Java's just plain boring and slightly verbose.
Most of the time the complexity in my code has little to do with Java being verbose and more due to the business problem. There are areas to improve and Java's been making great strides recently. For example, in Java 21 we may finally have methods like getFirst() and getLast() for lists (via JEP 431) instead of the incredibly clunky list.get(list.size() - 1). Java also recently added multi-line Strings and templating is coming shortly. Streams and Optionals also reduce quite a bit of boilerplate, e.g. Optional's map and ifPresent methods are often elegant. Really I can't think of many other areas where Java gets in the way. Our team is incredibly productive with modern Java.
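To make the `Optional` point concrete, here is a minimal sketch. The helper names (`last`, `lastUpper`, `ModernJavaDemo`) are invented for this example, and it sticks to the pre-Java-21 indexing idiom so it compiles on older JDKs; on Java 21+ `list.getLast()` covers the non-empty case.

```java
import java.util.List;
import java.util.Optional;

class ModernJavaDemo {
    // Wraps the clunky list.get(list.size() - 1) idiom once.
    // Assumes the list holds no null elements (Optional.of rejects null).
    static <T> Optional<T> last(List<T> list) {
        return list.isEmpty()
                ? Optional.empty()
                : Optional.of(list.get(list.size() - 1));
    }

    // Optional.map transforms the value only if one is present.
    static String lastUpper(List<String> names) {
        return last(names).map(String::toUpperCase).orElse("<none>");
    }

    public static void main(String[] args) {
        // ifPresent runs the action only when there is a value.
        last(List.of("ada", "grace")).ifPresent(System.out::println);
        System.out.println(lastUpper(List.of()));
    }
}
```

No explicit null or size checks leak into the calling code, which is most of the boilerplate saving.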
I think most developers actually write overly verbose code regardless of the language. And it seems to have little to do with years of experience. This youtube channel covers most of the basics:
To me I just follow these recommendations naturally but in most PRs I review there's often huge amounts of overly nested code, poorly named methods, etc.
Not my youtube channel. It just showed up in my recommended videos and it seemed like good advice (common sense). It's helpful to point devs to those videos when I see them making those types of mistakes.
I miss the data oriented approach Clojure libraries take. I remember having an issue with Ring and I just dove into the Ring source code and it was so simple and clear and I found the solution to my issue in minutes. I've never had that experience with Java and instead resort to forums, stackoverflow, etc. Most Java libraries would have several layers of abstraction and they're often intimidating. I consider myself an amateur Clojure dev and yet still contributed some PRs to some projects.
With that said, I've written some internal tools in Clojure and it was a nightmare whenever I had to modify them. They were pretty simple CLI tools that only needed updates every 6 months or so. I usually have to spin up the project with a repl just to understand the inputs and outputs to functions. I've ported all internal tools I created at my current company from Clojure to Java and I personally find them so much easier to maintain.
Going from Java 8 to Scala was mind melting. Would never want to go back. Though I’ve heard later versions of Java are better. And Kotlin seems like an in between.
Or stick to kotlin and keep the mind conveniently solid ;)
I was one of the Bored Scala Crowd in the audience of a presentation by jetbrains people that must have been not too long after 1.0, took me half a decade to accept that a "poor man's scala" is actually the language that I want.
I found just the opposite. Kotlin would rather add 10 special cases than one higher-order abstraction; I'd rather learn one slightly fancy concept than a laundry list of ad-hoc cases.
That's actually a quite accurate description of my own feelings. "Why emulate parts of scala on syntax level when you can have the real thing?". But that "real thing" is the one that drives people to lose themselves in cats vs scalaz debates and the like, there's clearly something in scala that causes a very strong "if you use it to write java without semicolons you're doing it wrong" undercurrent. "java without semicolons" kotlin isn't considered good either, but the distance from there to "kotlin as intended" isn't big at all.
Regardless of programming language, every substantial software development needs to make
serious architectural choices, otherwise things will become messy.
I learned this the hard way in my first programming job out of university. We were a C/C++ shop doing distributed software and
one question to decide right upfront is: shared memory vs message passing. Both have advantages and disadvantages, but don't mix well.
We started out with message passing, but over time somebody added POSIX shared memory as a performance hack (when two processes happen to run on the same machine).
Over time, we added a second, non-POSIX shared memory layer for performance reasons under Windows. It was an unmaintainable mess.
The core problem was that the CTO didn't understand distributed programming well enough to push back against those performance hacks. "How do you handle software architectural leadership?" used to be one of my replies to the inevitable "have you got questions for us?" in job interviews, when I was being interviewed. Now that I am making architectural decisions myself, I put a lot of effort into this, for example building suitable linters that flag violations in code reviews.
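For readers who haven't worked in the message-passing style described above, here is a small single-process sketch in Java. This is not the C/C++ system from the story: threads stand in for processes, and the names (`Msg`, `STOP`, `sumViaMessages`) and the sentinel convention are assumptions made for the example. The point is that the two sides only exchange immutable messages through queues and never touch each other's variables.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class MessagePassingDemo {
    // Immutable message type (hypothetical).
    record Msg(int value) {}

    // Sentinel instance marking end of stream; compared by reference below.
    static final Msg STOP = new Msg(-1);

    static int sumViaMessages(int n) {
        BlockingQueue<Msg> mailbox = new ArrayBlockingQueue<>(16);
        BlockingQueue<Integer> reply = new ArrayBlockingQueue<>(1);

        // Consumer owns its own state (sum) and reports back with a message.
        Thread consumer = new Thread(() -> {
            try {
                int sum = 0;
                for (Msg m = mailbox.take(); m != STOP; m = mailbox.take()) {
                    sum += m.value();
                }
                reply.put(sum);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (int i = 1; i <= n; i++) {
                mailbox.put(new Msg(i)); // blocks when the mailbox is full
            }
            mailbox.put(STOP);
            return reply.take();         // wait for the consumer's answer
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while exchanging messages", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumViaMessages(10));
    }
}
```

Mixing in a shared-memory shortcut here would mean the producer reading or writing `sum` directly, at which point every access needs its own locking discipline. That is the kind of second code path that a linter or review rule can flag.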
If your team fights over cats vs scalaz or over Scala-as-OO vs Scala-as-FP then that's a sign that the technical / architectural leadership is weak. Choices like pure-OO vs pure-FP vs
mixing them require care, and are hard to change in-flight. If you have strong modularisation, you can successfully use different paradigms in different modules, but strong
modularisation needs care and architectural discipline, too. Part of the problem is that the style of programming that libraries like cats require (monads, functors, applicatives, type-classes, HKTs) is not yet widely understood. This approach to programming needs its 'map/reduce moment' and to become part of the mainstream introduction-to-programming curriculum. We'll get there, maybe in about a decade.
Yea, I'd imagine comparing a 9 year old version of a language to another would be "mind melting". And you are comparing to Java 8, from before Java re-organized their JEP process to ship features far faster, with a different release philosophy and feature roadmap.
Kotlin is worth picking up, but after seeing the speed at which Kotlin moves compared to how Java moves now, I don't think Kotlin will keep up long term.
(Unlike Haskell,) Scala does not force monadic composition on you at all. You can use Scala as a pure OO language, or for old-skool ML-like FP where you don't use monadic context abstraction. I've done both successfully in production. Indeed I strongly advise against a monadic programming style for programmers with a mainstream programming background (imperative and OO).
In my (considerable) experience of teaching programming, imperative programmers find jumping straight from imperative programming to a monadic style extremely difficult, the more experienced, the more difficult.
Just for balance: after writing much Java years ago, then moving on to mostly JS/TS, Scala and Kotlin, going back and looking at old Java I forgot writing, it looks very nice and easy to understand.
Java is definitely more practical than pretty. This frustrates the trendy crowd, but I think it's important for tools to be practical. Shiny, pretty languages are never as long-lived as practical "ugly" languages.
I'm certainly not a member of the "trendy crowd." I say Java is ugly as a fan of Lisp and C. There's just so much boilerplate to implement something basic.
You're not wrong. I've worked in prod with all the cool languages (Scala, Haskell, Rust, Go, etc...) but stick to Python/TypeScript when I want to be pragmatic.
ORCA (and Pony language) solved this while allowing selective mutability and zero-copy message passing improving on Erlang HiPE/BEAM by tying objects to a tiny heap with individual actors (cooperative async threads). There is no global locking in Pony except in limited circumstances.
Zulu C4 is an improvement. Schism and Metronome are less pausey but slower overall.
https://www.azul.com/products/components/azul-zulu-prime-bui...
https://dl.acm.org/doi/10.1145/1809028.1806615
https://researcher.ibm.com/researcher/view_group_subpage.php...