I've always had a hard time with this argument that STM is problematic because logging rollbacks for every variable in a transaction is time consuming and profiling is hard because you don't know which variable is causing conflicts.
The reason I don't understand this argument is that I was always under the impression that you shouldn't normally DO very much at all in a transaction. As in, most transactions should basically just consist of a single pointer-swap.
What are typical operations during transactions that are more complex than this? I suppose if several variables are modified in sequence there could be problems---is this typical when STM is used in object-oriented languages?
If so, that basically demonstrates the value of using immutable data structures, generally considered beneficial for any form of concurrency. And so it's no wonder that STM is more popular in functional languages.
But when I've used locks in the past in C++ or Java, I've always avoided problems by reducing my synchronization periods to pointer swaps, either for updating a state vector or for passing ownership of an object; so I believe the same lesson applies to STM. That leads me to believe that STM simply enforces practices that are already good practice with locks, and additionally simplifies their implementation.
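For concreteness, here is roughly the shape I have in mind, sketched with Haskell's STM (the AppState record is made up; the point is that any expensive work happens in pure code and the transaction itself is barely more than a pointer swap):

    import Control.Concurrent.STM

    -- Hypothetical immutable application state.
    data AppState = AppState { hits :: Int }

    -- The transaction does almost nothing: read the current value, write a
    -- new immutable value built by pure code -- essentially a pointer swap.
    bumpHits :: TVar AppState -> IO ()
    bumpHits var = atomically $ do
        old <- readTVar var
        writeTVar var (AppState { hits = hits old + 1 })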
You put your interfering bits (critical sections) in transactions, starting with a read and ending with a write. If you were to retry right after a read (but before the corresponding write), you would get incorrect results. And read/write pairs can get complex.
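A sketch of the usual example, in Haskell's STM (representing an account as a bare TVar Int is my simplification): two reads and two writes that have to commit together or not at all.

    import Control.Concurrent.STM
    import Control.Monad (when)

    transfer :: TVar Int -> TVar Int -> Int -> STM ()
    transfer from to amount = do
        balance <- readTVar from
        when (balance < amount) retry      -- block until enough funds arrive
        writeTVar from (balance - amount)
        modifyTVar' to (+ amount)

Run it with atomically (transfer a b 100); if another commit conflicts with the reads, the whole block re-executes rather than exposing a half-done transfer.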
Rich is right about correctness. With an STM, it's comparatively easy to get correct behaviour from your program, which is more important than performance.
I think if I were just doing a single pointer swap, I'd probably build my own mini-transaction out of a loop and atomic variable to hold (and swap) the pointer. On the other hand, I would use STM if a transaction needs to touch multiple values, rather than just the one atomic pointer.
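Roughly this, I mean (a Haskell sketch; atomicModifyIORef' does the retry loop internally, as far as I know, so it stands in for the hand-rolled CAS loop you'd write with, say, a Java AtomicReference):

    import Data.IORef

    -- One atomic "pointer": swap in a new immutable value, return the old one.
    swapState :: IORef a -> a -> IO a
    swapState ref new = atomicModifyIORef' ref (\old -> (new, old))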
But I still agree with you, because even then I would try to keep the amount of data that needs to be atomically shared to a minimum, so the problems discussed would be minimal.
Also, as you mentioned, with immutable data even STM touching multiple complex data structures is simplified, because you can create the new versions independently and then simply swap a pointer or two. So in functional languages with immutable data, I expect STM to turn into a few atomic pointer swaps under the hood.
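Something like this sketch, with two made-up immutable maps behind TVars: the new versions are built by pure code, and the transaction itself is just the reads plus two writes, so readers never see one map updated without the other.

    import Control.Concurrent.STM
    import qualified Data.Map.Strict as Map

    promote :: TVar (Map.Map Int String) -> TVar (Map.Map Int String) -> Int -> STM ()
    promote pending active uid = do
        p <- readTVar pending
        a <- readTVar active
        case Map.lookup uid p of
            Nothing   -> return ()
            Just name -> do
                writeTVar pending (Map.delete uid p)
                writeTVar active  (Map.insert uid name a)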
> I think if I were just doing a single pointer swap, I'd
> probably build my own mini-transaction out of a loop and
> atomic variable to hold (and swap) the pointer.
FWIW, this was implemented in Clojure soon after this conversation; see atoms.
> that you shouldn't normally DO very much at all in a transaction.
Yeah, I think it is a design issue. STMs are just a better abstraction. But if the same critical data structure is pounded by thousands of concurrent readers and writers, it will be slow no matter what abstraction is used.
There are cases where that is hard to avoid, but often it should call for a redesign.
As the number of cores increases or "cloud" computing becomes more mainstream, at some point shared updatable state will be the exception rather than the default.
In the long term I think something like Erlang's actors and distributed computation model will become more popular. Maybe it will not be Erlang, maybe Go. But something like "Pid ! Msg", where Pid is in the same OS process, a different OS process, or even a process on a different host halfway across the world, will become a new way to think about programming.
> But something like "Pid ! Msg", where Pid is in the same OS process, a different OS process, or even a process on a different host halfway across the world, will become a new way to think about programming.
That's why I started learning more about ZeroMQ http://www.zeromq.org and structured my latest work project as multiple processes instead of multiple threads. Each process is single-threaded, and (at least so far) relatively easy to reason about.
Good point, I forgot about 0MQ. Yeah, not everyone wants to learn a new language but still wants to build a distributed system; 0MQ is great for that.
With Erlang (which perhaps wasn't clear from my post), the Pid identifies an "Erlang process": a very lightweight unit of execution. One can have tens of thousands of them in a single OS process (which they call a node).
However, if you don't need tens of thousands of processes, there is no reason why one couldn't structure their app using real OS processes and 0MQ.
I don't think it's useful to conflate distributed computation with concurrency, though. Erlang's model is designed for distributed computation, but as a side effect can be applied to concurrency. Go and Clojure are designed more for single-host concurrency. As long as networks are significantly slower than motherboards, there's always going to be cause to treat these scenarios differently.
The awesome thing is that for Erlang it doesn't matter. The two are the same. Here is how to send a message to a local process on the same node:
Pid ! Msg.
And here is how to send a message to a Pid located on some server farm in Japan:
Pid ! Msg.
In general CPUs are probably not going to get faster. Networks might, both on LANs and WANs: there is 100Gb Ethernet, and optical interconnects will, I think, become cheaper.
On the motherboard the number of cores will increase, but at some point, as we get the hang of concurrency and restructure our codebases, we'll quickly get hungry for more cores and memory than a single server can support.
When that happens, an Erlang program won't have to be restructured, because it already provides a suitable abstraction. At the lowest level it will still be Pid ! Msg. That is the beauty of it.
Yes, I know. And from an abstraction level, it's really cool that it works that way. But when you are operating on the same node, asynchronous message passing is sometimes an awkward and less performant solution. It's strictly weaker than synchronous message passing, which means you never have any assurance that the other process received your message. Message passing can also be less performant than properly protected usage of shared memory, which transactional memory allows.
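To make the distinction concrete, a small Haskell sketch (the channel shapes are made up): with the asynchronous send the sender learns nothing about delivery, while the synchronous variant blocks until the receiver explicitly acknowledges.

    import Control.Concurrent.Chan
    import Control.Concurrent.MVar

    -- Asynchronous: fire and forget, no acknowledgement.
    asyncSend :: Chan String -> String -> IO ()
    asyncSend ch msg = writeChan ch msg

    -- Synchronous-style: attach a reply slot and wait for the ack.
    syncSend :: Chan (String, MVar ()) -> String -> IO ()
    syncSend ch msg = do
        ack <- newEmptyMVar
        writeChan ch (msg, ack)
        takeMVar ack          -- blocks until the receiver does putMVar ack ()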
Erlang's popularity for concurrent programming probably comes down to the fact that anything is better than manually managed locks and until recently there weren't any widely-used industrial languages that used synchronous message passing or transactional memory. I don't think it's any coincidence, though, that the designers of more recent concurrent languages like Go and Clojure have not followed the Erlang model.
Well, I think in the end it's the same in any case. If you use actors or STM, in the end you will have to know whether it runs distributed or not. I think nobody claims that every Erlang program makes for a good distributed system.
STM and actors can both be useful for concurrency and for distributed programming, but neither is a silver bullet for either.