Things You Didn’t Know About Synchronization in Java and Scala

yebyen · on Aug 15, 2013

My favorite (having not read the post but just gone through CS undergrad in Java about four times by weight) was the mandatory locking in multithreaded programs in Java -- you can have parent threads and child threads, and you can set variables in your parent threads before the child threads execute, but unless you're doing it in a locking way (eg. with a shared semaphore) there's no guarantee that the parent thread actually executes before the child thread.

That was a fun one to explain in office hours.

"No no, it's not enough to just put this before that, you have to establish a Happens-Before relationship."

nivstein · on Aug 15, 2013

Indeed, one of Java's most arcane features. A good related SO thread:

http://stackoverflow.com/questions/16159203/why-does-this-ja...

ddeck · on Aug 15, 2013

>you can set variables in your parent threads before the child threads execute, but unless you're doing it in a locking way (eg. with a shared semaphore) there's no guarantee that the parent thread actually executes before the child thread

There are definitely some surprising outcomes from the JMM, but I'm pretty sure this isn't one of them. If I recall correctly, starting a thread automatically synchronizes the current thread with the first thing the child thread does, without any explicit synchronization. So anything done prior to that point in the parent thread will be visible in the child.

yebyen · on Aug 15, 2013

I do not have the code here to show you, but if I am exaggerating it's a small fib.

Maybe the child thread was created some time before it was started, and the initialization happened some time between those events. It was an example that a friend was working for a class, and as they often do, came to me for help on why it wasn't behaving as expected.

I looked at it and I could see the problem, "you don't have any locks shared between parent and child." We added a lock, synchronized it once in each thread, and the problem went away. That's definitely a surprising outcome for anyone who is accustomed to procedural programming.
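The original code is lost, so this is only a guess at the shape of the fix: a minimal sketch in which the child is already running when the parent initializes the value, and both sides go through one shared lock. The class and member names (`SharedLockHandoff`, `publish`, `awaitValue`) are made up for illustration.

```java
// Sketch only: hypothetical names, not the code from the story above.
final class SharedLockHandoff {
    private final Object lock = new Object();
    private boolean ready; // guarded by lock
    private int value;     // guarded by lock

    // Parent side: write the value and signal while holding the lock.
    void publish(int v) {
        synchronized (lock) {
            value = v;
            ready = true;
            lock.notifyAll();
        }
    }

    // Child side: block on the same lock until the value has been published.
    int awaitValue() {
        synchronized (lock) {
            try {
                while (!ready) {
                    lock.wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return value;
        }
    }

    static int demo() {
        SharedLockHandoff handoff = new SharedLockHandoff();
        int[] seen = new int[1];
        Thread child = new Thread(() -> seen[0] = handoff.awaitValue());
        child.start();      // the child is already running...
        handoff.publish(7); // ...when the parent initializes the value
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 7
    }
}
```

The unlock in `publish` happens-before the later lock acquisition in `awaitValue`, which is what makes the write visible; the `ready` flag is what provides the ordering, since the lock alone does not decide who runs first.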

ddeck · on Aug 15, 2013

The implicit synchronization happens at the point the child thread is started, regardless of when it was allocated. Perhaps it was already started prior to the initialization, in which case some form of explicit synchronization would be required to ensure visibility.

FYI, the relevant section of the JLS is 17.4.4: "An action that starts a thread synchronizes-with the first action in the thread it starts." [1].

[1] http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html...
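A small sketch of that rule, assuming a hypothetical class `StartVisibility` with a plain field that is neither volatile nor guarded by a lock. The write is made before `start()`, so the child is guaranteed to see it.

```java
// Sketch only: StartVisibility and its fields are illustrative names.
final class StartVisibility {
    static int config;      // deliberately not volatile and not locked
    static int seenByChild;

    static int runOnce() {
        config = 42; // plain write, made before start()
        Thread child = new Thread(() -> seenByChild = config);
        child.start(); // start() synchronizes-with the child's first action
        try {
            child.join(); // join() makes the child's write visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seenByChild;
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

If the assignment to `config` were moved to after `child.start()`, nothing here would guarantee that the child sees it.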

yebyen · on Aug 15, 2013

This sounds more like what we experienced:

> The write of the default value (zero, false or null) to each variable synchronizes-with the first action in every thread. Although it may seem a little strange to write a default value to a variable before the object containing the variable is allocated, conceptually every object is created at the start of the program with its default initialized values.

I can't get that to jibe in my memory though against:

> The final action in a thread T1 synchronizes-with any action in another thread T2 that detects that T1 has terminated.

> T2 may accomplish this by calling T1.isAlive() or T1.join().

If we were looking into the threads, it would have been obvious that we should wait for the thread to finish before expecting to find a completed value in it other than the default. That's not what happened.

It's unfortunate that I didn't get a copy of the code in question, because it was a genuine idiosyncrasy in Java and by the time I got around to asking my friend for a copy, he had already rewritten his code in a different way that didn't hit the same bug.

He didn't use revision control, got his degree, and didn't care to repeat the same miserable exercises. He's now probably hacking Ruby in his day job like me (except I promise I'm using revision control).

yebyen · on Aug 15, 2013

I can guarantee you the class was not offered the option of using Java 7. I'll take your word for it if you say this is not a new feature, but I'm sure the story went just about how I've laid it out.

kailuowang · on Aug 15, 2013

> Practically all server applications require some sort of synchronization between multiple threads.

It could be just me, but I think the author should've mentioned alternative concurrent programming models available for the JVM, such as Akka.

kasey_junk · on Aug 15, 2013

Just remember under the covers Akka is still using synchronization.

asperous · on Aug 15, 2013

You mean internally? The paradigm they use is immutable-message passing and asynchronous/concurrent processing.

kasey_junk · on Aug 15, 2013

Yes, internally. Somewhere deep in the bowels of the Akka code, they are communicating state between threads (last I checked, they were using a Java concurrent queue by default).

At a bare minimum, the changes to the location of the head/tail of a queue will need to be visible to multiple threads, which requires volatile variables.

The Java concurrent queues don't actually use synchronized blocks; they use native CAS operations. But thinking that Akka removes shared thread state entirely is dangerous.
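To make "CAS instead of a synchronized block" concrete, here is a rough sketch of a compare-and-set retry loop on `java.util.concurrent.atomic.AtomicInteger`. It is not how Akka or the JDK queues are written; `CasCounter` is a hypothetical name and a counter is just the simplest thing to show.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: illustrates the CAS retry pattern, not any library's internals.
final class CasCounter {
    private final AtomicInteger count = new AtomicInteger();

    // Read the current value, compute the next one, and retry if another
    // thread changed the value in between. No lock is ever held.
    int increment() {
        while (true) {
            int current = count.get();
            int next = current + 1;
            if (count.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    int get() {
        return count.get();
    }

    static int demo(int threads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 10_000)); // prints 40000
    }
}
```

The state is still shared between threads; the CAS only changes how contention is resolved.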

benjaminwootton · on Aug 15, 2013

Even lots of low-latency standard Java is tending towards single-threaded, non-blocking models without synchronisation.

The cost of synchronised blocks is high on the JVM. They defer to the operating system for thread scheduling, they enforce a memory barrier, meaning all data is flushed out to RAM rather than CPU caches, and multithreaded code dirties CPU caches, reducing performance.

More server-side code should be single-threaded than not, in my opinion.

kasey_junk · on Aug 15, 2013

You can write lock free algorithms that are not single threaded as well. If you design your system so that only 1 thread is doing the writing of data you avoid a lot of the negative performance implications you mention.

That said, I completely agree. Many times making a single threaded system faster will improve performance vs making it parallel.
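A minimal sketch of the single-writer idea, assuming a hypothetical `SingleWriterCounter`: exactly one thread writes a volatile field, so the read-modify-write needs neither a lock nor a CAS, and readers just read.

```java
// Sketch only: safe solely because one thread ever calls incrementFromWriter().
final class SingleWriterCounter {
    private volatile long value;

    // Not atomic as a read-modify-write; correct only with a single writer.
    void incrementFromWriter() {
        value = value + 1;
    }

    long read() {
        return value;
    }

    static long demo(int n) {
        SingleWriterCounter counter = new SingleWriterCounter();
        long[] observed = new long[1];
        Thread writer = new Thread(() -> {
            for (int i = 0; i < n; i++) {
                counter.incrementFromWriter();
            }
        });
        // The reader spins until it sees the final value; volatile guarantees
        // it eventually observes the writer's updates.
        Thread reader = new Thread(() -> {
            long v;
            do {
                v = counter.read();
                Thread.yield();
            } while (v < n);
            observed[0] = v;
        });
        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println(demo(100_000)); // prints 100000
    }
}
```

Add a second writer and this breaks, because two threads could read the same value and both write back the same increment.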

AndreasFrom · on Aug 15, 2013

Clojure's STM is also an interesting model.

skyebook · on Aug 15, 2013

If this is interesting to you, then I'd highly recommend Java Concurrency in Practice [1]. It goes through the then-new java.util.concurrent package but, more importantly, does a really good job of making all of the theoretical concepts straightforward to grasp.

1 - http://www.amazon.com/gp/aw/d/0321349601

jedimouse · on Aug 15, 2013

Great book indeed. Along with Effective Java, it's one of my all-time favorites. But since it's a Java book, it doesn't, if I recall correctly, detail how things are implemented at the core JVM level. Does anyone know a good book or source about that?

kasey_junk · on Aug 15, 2013

It depends on the JVM you are using. If it is an open source one you can look yourself.

A little old but Oracle JRockit: The Definitive Guide is a pretty good book on the JRockit JVM.

gtani · on Aug 15, 2013

This was good: http://www.artima.com/insidejvm/ed2/threadsynch3.html

bruceboughton · on Aug 15, 2013

I've been searching for a book like this for .NET and the CLR to no avail. Can anyone recommend one?

Or is this book applicable beyond Java?

cokernel_hacker · on Aug 16, 2013

Concurrent Programming on Windows by Joe Duffy http://www.amazon.com/exec/obidos/ASIN/032143482X/bluebyteso...

bruceboughton · on Aug 16, 2013

Thanks. I will take a look.

abc_lisper · on Aug 15, 2013

What do you mean, you didn't know? I knew all that :)

ExpiredLink · on Aug 15, 2013

Fact #0. If you write 'synchronized' in Java code, you are doing it wrong. Seriously. 'synchronized' indicates that the programmer is re-inventing and/or re-implementing a solved problem. In Java, parallelism and concurrency for most real-world problems are handled by frameworks or covered by well-known patterns.
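As one illustration of leaning on the library instead of hand-written 'synchronized', here is a sketch that counts words across a thread pool using `ExecutorService` and `ConcurrentHashMap.merge`. The class name `WordCount`, the thread count, and the 10-second timeout are arbitrary choices for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only: no 'synchronized' keyword, the library classes do the coordination.
final class WordCount {
    static Map<String, Integer> count(List<String> words, int threads) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String word : words) {
            // merge() performs the per-key update atomically.
            pool.execute(() -> counts.merge(word, 1, Integer::sum));
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for tasks");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> result = count(Arrays.asList("a", "b", "a"), 4);
        System.out.println(result.get("a") + " " + result.get("b")); // prints 2 1
    }
}
```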