This article shows how it's hard to design and implement a novel and powerful language like Rust. You have to understand "Haskell concepts" like GADTs and then try and find a way to hide them from the users of the language.
Indeed. However, having a GC doesn't preclude having linear types as well, thus having your cake and eating it too.
In any case, Rust has already managed to get other language designers to take a look into adopting such type-system ideas; that in itself is a big victory for the Rust community.
I think Erik Meijer deserves a lot more recognition. He's brought so many concepts over from a theoretical setting [often inspired by lambda-calculus white papers] and developed them into concrete software paradigms and tools.
> The only real drawback here is that there is some performance hit from boxing the futures – but I suspect it is negligible in almost all applications. I don’t think this would be true if we boxed the results of all async fns; there are many cases where async fns are used to create small combinators, and there the boxing costs might start to add up. But only boxing async fns that go through trait boundaries is very different.
you'll end up with boxes of boxes of boxes. It kind of makes the feature something for the outermost abstraction layer only.
> And of course it’s worth highlighting that most languages box all their futures, all of the time. =)
It's also worth highlighting that most of those languages (1) have a GC that makes boxing a much cheaper operation than in Rust, and (2) aren't advertised as "low-level" languages, which justifies implicit boxing as a trade-off.
---
I feel that async shouldn't really be special but rather just build on other features that stand on their own. GATs and impl Trait in Traits seem reasonable. But being able to do dynamic dispatch on trait methods that return a type that isn't `Sized` seems like a tough problem to solve, and the blog post didn't manage to convince me that implicit boxing is the right approach here. AFAICT, we need either some kind of boxing, or better support for unsized rvalues or similar. I think I would be more comfortable with "sugar" for boxing if the caller of the trait method were in control of where exactly the result is allocated (not necessarily a Box), but that calls for some kind of placement syntax.
> most of those languages (1) have a GC that makes boxing a much cheaper operation than in Rust
Why is that? I would intuitively think it is the other way around. (Is a malloc/free pair not cheaper than an allocation on the GC heap plus collecting its garbage?)
As always, it depends on a lot of details. A generational garbage collector can be made to allocate extremely quickly; IIRC for the JVM it's like, seven instructions? For short lived allocations, it sort of acts like an arena, which is very high performance. malloc/free need to be quite general.
It's always about details though. If a GC is faster than malloc/free, but your language doesn't tend to allocate much to begin with, the whole system can be faster even if malloc is slower. It always depends.
Doesn't something like jemalloc basically give you this, but without pauses? Thread-local freelists for quick recycling of small allocations without synchronization... funnily enough, jemalloc even uses some garbage collection mechanisms internally.
I don't know a ton about jemalloc internals, but it is true that a lot of modern mallocs use some mechanisms similar to GCs. There's some pretty major constraint differences though.
`malloc` + `free` are unknown function calls, they can't be inlined, don't understand the semantics of your language, the strategy behind them is quite generic, etc.
A GC that's integrated with a programming language can do much much better (different heaps for short and long lived allocations, for example).
One can do even better by supporting custom "pluggable" allocators, and not just a single global allocator like Rust does at present. Some of these allocators could even implement GC-like logic internally.
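For reference, a minimal sketch of the single global-allocator hook Rust exposes today (using the standard library's `System` allocator; any type implementing `GlobalAlloc` could be plugged in the same way):

    use std::alloc::System;

    // The one process-wide hook: every Box, Vec, String, etc. allocates
    // through whatever static is marked #[global_allocator].
    #[global_allocator]
    static GLOBAL: System = System;

    fn main() {
        let v = vec![1, 2, 3]; // allocated via GLOBAL
        println!("{:?}", v);
    }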
In that particular example you won't get boxes of boxes: `baz` gets instantiated for each impl of `Baz`, so the call to `self.foo()` is resolved statically and the return type is fully known.
Depends on the (not-yet-implemented) implementation of async functions in traits. But currently, with the async_trait macro, there is only one version of Foo::foo, and it returns a Box<...>.
But it is true we could imagine the compiler generating two versions: one for when the static type is known, and one if dynamic dispatch is used.
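A rough sketch of the two shapes being discussed (trait names invented; the first is roughly the effect of `#[async_trait]` today, the second uses a GAT as a stand-in for the statically dispatched version):

    use std::future::Future;
    use std::pin::Pin;

    // What #[async_trait] effectively gives you today: a single version of
    // the method, always returning a boxed future.
    trait FooBoxed {
        fn foo(&self) -> Pin<Box<dyn Future<Output = u32> + Send + '_>>;
    }

    // A statically dispatched shape: the future is an associated type of the
    // impl, so a generic caller knows the concrete type and no box is needed.
    trait FooStatic {
        type Fut<'a>: Future<Output = u32>
        where
            Self: 'a;
        fn foo(&self) -> Self::Fut<'_>;
    }

    async fn baz<T: FooStatic>(t: &T) -> u32 {
        // Resolved statically for each impl of FooStatic; no boxing here.
        t.foo().await
    }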
Kotlin coroutines are just async/await with different syntactic defaults. That is, when you call a `suspend fun` it is awaited by default, and if you want to get a promise/future/etc from it instead that's when you write an extra annotation.
Rust went the way it did syntactically because annotating the await points lets you see them when reading a program, and possibly out of familiarity.
Rust went the way it did implementation-wise (stackless rather than stackful) because it's more efficient.
Rust doesn't use M:N threading (coroutines) because it was found to be slower than 1:1 threading in practice. Async/await avoids allocation of stacks entirely and can therefore be faster than both 1:1 threading and M:N threading.
In this context the hazard is that you can break the clients of your API by changing only the implementation (the function signature stays the same) because the compiler automatically infers whether the resulting future is Send or not based on the implementation.
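A sketch of that hazard (function names invented, not from the article): the signature never changes, but the body change flips whether the returned future is `Send`, which can break downstream code that spawns it on a multi-threaded executor:

    use std::rc::Rc;

    // v1.0: the returned future is Send, so callers can spawn it on a
    // work-stealing executor.
    async fn fetch() -> u32 {
        step().await;
        42
    }

    // v1.1: same signature, but an Rc is now held across the .await point,
    // so the generated future is !Send -- a breaking change the compiler
    // inferred from the body alone.
    async fn fetch_v2() -> u32 {
        let cached = Rc::new(42);
        step().await;
        *cached
    }

    async fn step() {}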
It's just a fancy way to say "risk of breaking backwards compatibility". Semver is Semantic Versioning, a versioning scheme that all Rust crates are expected to use.
Disclaimer: I don’t think it’s _flawed_. It’s simple and useful. But it has some flaws in some situations; I would describe it as “less than ideal”. Rich Hickey describes some problems here: https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
It's a convention that cannot really be enforced by compiler or any tooling, it ultimately relies on humans following it, hence it will always be broken in some way for some users.
Once you modify a Rust library, the tool downloads the last released version, compiles it, extracts its AST, and compares it with the AST of the current version.
The diff of the two ASTs tells you what the changes are, and there is a Rust book that documents which changes are semver breaking and which aren't. So if you only add a new function to your crate, the tool says that you can bump the semver patch version or minor version, but if you change the name of a public API, the tool requires you to make a new major semver release.
Setting this tool up in CI is dead easy, and it will make your CI fail if a PR changes the API of your crate without properly updating its semver version.
Sure, I’ll give that it’s a very hard problem to enforce (not impossible, but very hard with virtually all current language implementations). That doesn’t negate the benefits of having a social contract (what semantic versioning is) that says you strive to do the right thing, even if you sometimes fail.
Sometimes you need to break backwards compatibility to fix a security issue or implement a new, desired feature. Most often the breakage is small, but it's still nice to know as a library user when the build may break.
Yes, try to avoid breaking backwards compatibility, but have some mechanism to handle the rare case when you do need to break guarantees.
Most of the libraries I use that follow semver rarely break backwards compatibility, and it's still helpful to use semver to note bugfixes vs new features. New features can be a liability and introduce bugs, bugfixes are frequently quite safe, so it lets me know how much effort I should expect to spend testing when pulling in a new version. When I'm close to a release, I'll avoid new feature releases, but probably pull in bugfix releases.
Your reply feels unnecessarily confrontational / aggressive, but I’ll respond anyway.
> Everyone else just maintains backwards compatibility.
Name one piece of software of substantive size and with a history longer than a couple years, which never had a backwards compatibility break ever? Just in case you can name one, then name two instead. Contrary to your statement, 99.99999% of software has a backwards incompatible change at some point. What’s a good way to indicate that to your end users? Semantic versioning.
> Nobody asked you to "improve" you library by breaking it.
So if a major bug is found after release, one should never fix it? Fixing a bug could easily lead to a backwards incompatible break, if nothing else because someone was depending on that buggy behavior.
Asking me to name perfect records is impossible and unnecessary. You do not have to maintain backwards compatibility forever in all cases.
My only argument is that making it the default approach to changing software is what offers the best results, without the added cost of semantic versioning and all the tree-shaking failures that come along with it.
Linux userland APIs, C/C++, WordPress, Clojure are all pretty successful software that maintain backwards compatibility as some form of priority and commitment with varying degrees of success.
None of them gets much flak for doing so, compared to, say, nodejs/JS/npm, which are routinely the laughing stock of the industry for their lack of platform stability.
> Bug fix
Your point about these is pretty funny, because these breaking changes are specifically allowed by semver as minor/patch releases. Which was kind of my original point here: it's a flawed spec that offers no real safety at great cost.
Semantic versioning isn't a silver bullet, unfortunately, due to the 'Diamond Dependency Problem'. That is, A depends on B and C, but B and C depend on different versions of D. Easy to fix, until you scale that up to thousands of nodes in a dependency graph. This talk is one of the better ones about this: https://www.youtube.com/watch?v=tISy7EJQPzI
> Semantic versioning isn't a silver bullet, unfortunately
Nobody said it fixed all problems. Virtually all dependency management systems struggle with this problem, whether packages use semantic versioning or not.
> Because it is, and it isn't?
>
> These are basic facts.
Those are not facts, those are claims. Facts are supported with data, and you haven't given any.
The only fact you have actually provided is that you can't tell facts from claims apart, and that makes it very unlikely that you will be able to provide any actual facts that support your claims.
Yes, it was. While the grammar I used should be pretty clear on what I meant already, here’s an alternative version:
Why would using semantic versioning (something near universally accepted as a best practice) be flawed in your eyes?
Or more simply:
Why would using semantic versioning be flawed in your eyes?
> These are basic facts.
Actually, they aren’t. It’s your opinion (not a fact) and you still haven’t provided a single drawback (flaw) to following semantic versioning despite repeated requests.
> Have you read the semver spec?
Yes, I have. More than once in fact. Have you?
> Is your libraries new version compatible with my library?
Assuming both authors follow semantic versioning faithfully and the previous minor/patch release worked, then yes, it very likely will be compatible.
> How can you know?
If both sides have a well defined API and the authors follow semantic versioning, you should have a good indication that way. That said, as the saying goes, “trust but verify”.
In that context it means a situation where a developer of a library could accidentally break semantic versioning guarantees (no breaking changes except between major versions) without noticing it, or something that would require a major version update for a small change that shouldn't need one.
I grew up when OOP was the shit. Everyone wanted to add OOP to every language. C++, ObjC then Java.
I absolutely hated it, not because OOP is bad, but because they never really managed to blend it just right with static typing. But I was the odd one. Everybody else seemed to love it and use it for everything.
I was lucky enough to outlive this OOP craziness.
Now I am seeing the same craziness with async and continuations and wonder if I will be lucky enough to outlive it too.
It's not really all that crazy. Sure, compared to languages like C#, it's much more verbose, but that's because Rust tries to deal with async without garbage collection. This isn't typical code you would write in Rust; in fact, in this case, the type was generated by a library.
Dealing with this "ridiculous" type step by step:
Future<Output = User>
This is a future. It returns a `User`.
dyn
However, because the exact `Future` implementation can vary, dynamic dispatch is necessary. The `dyn` keyword explicitly says that a vtable is needed.
+ Send
Rust is supposed to make programs free from data races. It's very much possible to have objects which aren't allowed to change the thread they are in. For instance, `Rc<T>` (single-thread reference-counting pointer) cannot be sent to other threads due to it using non-atomic reference count.
In this case, using `+ Send` says that whatever future this method returns must be possible to send to other threads. This prevents use of objects like `Rc` within the futures, and allows the future to be used in multi-threaded executors.
+ '_
The method borrows `self` (as seen by the use of `&self`). Saying `+ '_` means that the future borrows `self`, so it cannot outlive it. This is necessary due to lifetimes.
Box<...>
Because dynamic-dispatch types (note the `dyn` above) can have varying sizes (one future can be smaller, another one bigger), it's necessary to wrap them somehow so that they can be represented in memory. In this case, the simplest heap-allocation type (the equivalent of `unique_ptr` in C++) is used. There are other options, like `Arc<...>`, but in this case `Box` is sufficient.
Pin<...>
Futures can store pointers to their own local fields. This would cause issues if the allocation were moved somewhere else in memory: the future would be moved to the new location, but the pointers would still refer to the old one. Using `Pin` prevents the user from moving the contents of the `Box` to another place in memory.
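Putting the pieces back together, this is the shape of the desugared signature being walked through above (trait name invented; the exact macro output differs):

    use std::future::Future;
    use std::pin::Pin;

    struct User;

    trait UserStore {
        // Box:  heap-allocate, because `dyn Future` has no statically known size.
        // Pin:  promise the future won't be moved, since it may be self-referential.
        // dyn Future<Output = User>: some future that eventually yields a User.
        // + Send: the future may be handed to another thread by the executor.
        // + '_:  the future may keep borrowing `self` (from `&self`).
        fn get_user(&self) -> Pin<Box<dyn Future<Output = User> + Send + '_>>;
    }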
This is a nice demonstration of the tradeoff inherent in the (confusingly named) "zero-cost abstractions" philosophy: it brings accidental complexity front and center. This accidental complexity is often viral, propagating through the call stack in the type signature, and made so pervasive in the program that it becomes its focus (even if the types are hidden and inferred) and the element that is hardest to change rather than easiest.
C++ is most famous for this philosophy, but Rust doubles down on it because of its commitment to memory-safety soundness through the type system. Look at how many details having to do not with any algorithm but with the code the compiler emits are stated here: that the call is made in an asynchronous context, that an object requires a vtable, that an object is able to change threads, the object's lifetime, boxing, and that the object cannot be moved in RAM. The only part that's related to the algorithm here is `User`.
Of course, in constrained environments like embedded systems or other circumstances where control over resources is crucial -- the domains Rust is designed to target -- accidental complexity becomes essential as control over resources is an important component of the problem.
Some more here, and an alternative (with its own tradeoffs) that could be called "zero-cost use" (although that name is equally as bad and confusing as "zero-cost abstractions"): https://news.ycombinator.com/item?id=19932753
However, it's worth noting that a lot of this accidental complexity is visible because of dynamic dispatch. If `get_user` didn't have to be dynamically dispatched, it would simply be written as:
async fn get_user(&self) -> User { ... }
Dynamic dispatch complicates a lot of things in Rust, as then the compiler cannot simply check the returned type to determine the necessary guarantees - as the type can be anything.
Right. Is it really accidental complexity? This is an implementation detail that users of the language don't really need to know about, as they would just write `async fn` (and #[async_trait] while waiting for the issues mentioned in this blog post to be solved).
Oh, whether or not it's accidental depends on the domain, and Rust is designed for domains where such detailed control over resources could well be said to be essential, but it is nonetheless complexity, even if it is entirely inferred, as the compiler would block uses that don't comply with a very specific contract that is all about specific code generation. Even here the need to say that the call is "async" is not required from the algorithmic point of view, and in languages with a different philosophy and access to a JIT -- like, say, Java -- this is an implementation detail that could be worked out by the compiler entirely on a use-site basis (i.e. the same function could be "async" or not depending on how it's used in a particular callsite, not how it's defined).
But there's definitely a very big tradeoff -- in both approaches -- between fine-grained control over resources and "non-algorithmic complexity" if you'd like to call it that.
"async" is required from the algorithmic point of view, so the algorithm can be expressed in terms of specific concurrency model. And in Rust, it's not just "async" that's required, but also a specific executor, which can affect concurrency model as well. Concurrency model can't be an implementation detail.
Of course it is -- when you have continuations, which each and every imperative language has, although usually without explicit control over them. With explicit access to continuations, everything expressible with async is expressible without it. Or, more precisely, whether something is "async" or not is not a feature of a certain piece of code but of a certain execution. Essentially, it means whether some computation may want to suspend itself and resume execution later. That you must express that capability as a type-system-enforced property of some syntactic element -- a subroutine -- is accidental complexity.
To get an intuitive feel for why that is so, try to imagine that the OS were able to give you threads with virtually no overhead, and you'll see how anything that's expressible with async would be expressible without it. Over the years we've so internalized the fact that threads are expensive that we forgot it's an implementation detail that could actually be fixed.
Computations in Rust can suspend themselves without declaring themselves to be async: that's exactly what they do when they perform blocking IO. They only need to declare themselves async when they want a particular treatment from the Rust compiler and not use the continuation service offered by the runtime, which in Rust's case is always the OS.
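A small illustration of that point (assuming the tokio crate for the async half; names invented): both functions suspend the computation while waiting on the socket, but only one is required to say so in its signature:

    use std::io::Read;
    use tokio::io::AsyncReadExt;

    // Suspends via the OS scheduler; no annotation required.
    fn read_sync(stream: &mut std::net::TcpStream, buf: &mut [u8]) -> std::io::Result<usize> {
        stream.read(buf)
    }

    // Suspends via the async runtime; must be declared `async` and `.await`ed.
    async fn read_async(stream: &mut tokio::net::TcpStream, buf: &mut [u8]) -> std::io::Result<usize> {
        stream.read(buf).await
    }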
> Over the years we've so internalized the fact that threads are expensive that we forgot it's an implementation detail that could actually be fixed.
This is something I wish more people would realize. We assume that 1:1 threads inherently can't scale, mostly because of folklore. (As far as I can tell, the origin of this claim is back-of-the-envelope calculations involving stack sizes, leading to address space exhaustion on 32-bit—obviously something that hasn't been relevant for servers for a decade.) That's why we build async/await and elaborate userspace scheduling runtimes like the one Go has (and the one Rust used to have pre-release).
I think it's worth questioning the fundamental assumptions that are causing us to do this. Can we identify why exactly 1:1 threading is uncompetitive with async/await schemes, and fix that?
All this, by the way, is not to say that Rust made the wrong decision in focusing on async/await. Rust generally follows the philosophy of "if the computing environment is difficult, it's Rust that has to adapt".
> Can we identify why exactly 1:1 threading is uncompetitive with async/await schemes, and fix that?
I think we can identify it, but fixing it is not easy, at least at the kernel level.
The easy part is the cost of scheduling. The kernel must schedule threads with very different behaviors -- say, encoding a video or serving requests over a socket -- so it uses a scheduling algorithm that's a compromise for all uses. But we can let the language's runtime take over the scheduling of kernel threads with something like this: https://youtu.be/KXuZi9aeGTw
The harder part is managing the memory required for the stack. Even on 64-bit systems, the OS can't shrink and grow stacks finely enough. For example, I don't think there's a hard requirement that, say, memory below the sp is never accessed, so the OS can't even be sure about how much of the stack is used and can't uncommit pages. But even if there were such a requirement, or, that the language could tell the OS it follows such a requirement, still the OS can only manage memory at a page granularity, which is too much for lightweight concurrency. Any finer than that requires knowing about all pointers into the stack.
We can do it in languages that track the location of all pointers in all frames, though, which is what we're attempting to do in Java, and this allows us to move stacks around, grow them and shrink them as necessary, even at a word granularity.
> All this, by the way, is not to say that Rust made the wrong decision in focusing on async/await. Rust generally follows the philosophy of "if the computing environment is difficult, it's Rust that has to adapt".
... and it follows C++'s (horribly named) "zero-cost abstractions" philosophy which can reasonably be said to be a requirement of Rust's target domains. So I certainly don't think async/await is wrong for C++/Rust/Zig, but I think it's wrong for, say, JavaScript and C#, and we're going a different way in Java (more like Scheme's).
Another possible contributing factor is that in one respect Rust is a higher-level language than Java or JS: it compiles to a VM -- LLVM/WASM -- over which it doesn't have full control, so it doesn't completely control its own backend. That, BTW, is why Kotlin adopted something similar to async/await.
> at a page granularity, which is too much for lightweight concurrency
I don't think that's been conclusively shown. 4kB is smaller than a lot of single stack frames. It would be interesting for someone to measure how large e.g. Go stacks are in practice—not during microbenchmarks.
> Another possible contributing factor is that perhaps ironically, in one respect Rust is a higher-level language than Java or JS, as it compiles to a VM -- LLVM/WASM -- over which it doesn't have full control, so it doesn't have complete control over its backend.
It's not really a question of "control"; we can and do land changes upstream in LLVM (though, admittedly, they sometimes get stuck in review black holes, like the noalias stuff did). The issue is more that LLVM is very large, monolithic, and hard to change. Upstream global ISel is years overdue, for example. That's one of the reasons we have Cranelift: it is much smaller and more flexible.
But in any case, the code generator isn't the main issue here. If GC metadata were a major priority, it could be done in Cranelift or with Azul's LLVM GC support. The bigger issue is that being able to relocate pointers into the stack may not even be possible. Certainly it seems incompatible with unsafe code, and even without unsafe code it may not be feasible due to pointer-to-integer casts and so forth. Never say never, but relocatable stacks in Rust seems very hard.
The upshot is that making threading in Rust competitive in performance to async I/O for heavy workloads would have involved a tremendous amount of work in the Linux kernel, LLVM, and language design—all for an uncertain payoff. It might have turned out that even after all that work, async/await was still faster. After all, even if you make stack growth fast, it's hard to compete with a system that has no stack growth at all! Ultimately, the choice was clear.
That would be interesting to study. We'll do it for Java pretty soon, I guess.
> After all, even if you make stack growth fast, it's hard to compete with a system that has no stack growth at all!
If the same Rust workload were to run on such a system there wouldn't be stack growth, either. The stack would stay at whatever size Rust now uses to store the async fn's state.
> a tremendous amount of work in the Linux kernel
I don't think any work in the kernel would have been required.
> Ultimately, the choice was clear.
I agree the choice is pretty clear for the "zero-cost abstractions" approach, and that Rust should follow it given its target domains. But for domains where the "zero-cost use" approach makes more sense, the increase in accidental complexity is probably not worth some gain in worst-case latency. But that's always the tradeoff the two approaches make.
Assuming you know how to relocate stacks, you could implement efficient delimited continuations in Rust without the kernel's involvement, and use whatever it is you use for async/await scheduling today (so I'm not talking about some built-in runtime scheduler like Go's or Erlang's). But this would require changes in LLVM, and also a somewhat increased footprint (although the footprint could be addressed separately, also by changes in LLVM). The kernel work is only required if you want to suspend computations with non-Rust frames, which you can't do today, either.
I don't know what the old M:N model did, but a compiler can generate essentially the same code as it does for async/await without ever specifying "async" or "await." The reason for that is that the compiler already compiles any subroutine into a state machine (because that's what a subroutine is) in a form you need for suspension. But you do need to know what that form is, hence the required change in LLVM.
There is no inherent reason why it would be any slower or faster. It also does not imply any particular scheduling mechanism.
> So I certainly don't think async/await is wrong for C++/Rust/Zig, but I think it's wrong for, say, JavaScript and C#, and we're going a different way in Java (more like Scheme's).
How can it be wrong for JavaScript? Because JavaScript is a single-threaded environment, it was the only possible choice, and it's one of the most influential changes to the language (by itself it changed how the language is used more than the whole ES6 bundle did). It's been a real success and I'm not sure why you'd want to revert it.
What would you do instead? Use a green thread system on top of a single-threaded VM? Great, now you have two kinds of blocking calls: the one which only blocks its own thread, and the one which blocks all threads at the same time because it blocks the VM itself. How ergonomic!
Remember, the single threaded character of the js VM is not an implementation detail, it's part of the spec. Hate it or love it but the web works this way.
Also, if you think threads are a good concurrency abstraction, let's play a little game: consider you need to read 1M files on a spinning disk. How many threads do you need to run on to get the maximum read performance:
a) one per file
b) just one
c) one per core
d) a magic number which depends on your hard disk's firmware and your workload.
Convenient and intuitive right?
Threads are a dated concurrency primitive which would have died long ago if they weren't also a good parallelism primitive.
> Great, now you have two kinds of blocking calls: the one which only block its own thread, and the one which block all threads at the same time because it blocks the VM itself. How ergonomic!
You could only have the first kind, but it's the same situation with async/await. Only async/await tracks the "good" kind of blocking, yet lets the "bad" kind go untracked.
> How many threads do you need to run on to get the maximum read performance
The exact same number as you would use for doing it with `await Promise.all`. The knowledge you need about the scheduling mechanism doesn't go away just because you're no longer required to annotate functions with `async`.
> Threads are a dated concurrency primitive which would have died long ago if wasn't also a good parallelism primitive.
Maybe they are, but async/await are the exact same construct only that you have to annotate every blocking function with "async" and every blocking call with "await". If you had a language with threads but no async/await that had that requirement you would not have been able to tell the difference between it and one that has async/await.
> Only async/await tracks the "good" kind of blocking, yet lets the "bad" kind go untracked.
The “bad kind” is indistinguishable from CPU intensive computation anyway (which cannot be tracked), but at least you have a guarantee when you are using the good kind. (Unfortunately, in JavaScript, promises are run as soon as you spawn them, so they can still contain a CPU heavy task that will block your event loop, Rust made the right call by not doing anything until the future is polled).
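A small sketch of that Rust-side laziness (assuming tokio as the executor; names invented): constructing a future does nothing until it is polled:

    async fn heavy_work() -> u32 {
        println!("running heavy_work");
        42
    }

    #[tokio::main]
    async fn main() {
        let fut = heavy_work();          // nothing has run yet: the future is inert
        println!("future created, not polled");
        let n = fut.await;               // "running heavy_work" prints only now
        println!("got {n}");
    }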
> The exact same number as you would for doing it with `await Promise.all`
From a user's perspective, when I'm using promises, I have no idea how they run behind the scenes (and it can be nonblocking all the way down if you are using a kernel that supports nonblocking file IO). This example was specifically about OS threads though, not about green ones (but it will still be less expensive to spawn 1M futures than 1M stackful coroutines).
> Maybe they are, but async/await are the exact same construct only that you have to annotate every blocking function with "async" and every blocking call with "await". If you had a language with threads but no async/await that had that requirement you would not have been able to tell the difference between it and one that has async/await.
I don't really understand your point. Async/await is syntax sugar on top of futures/promises, which itself is a concurrency tool on top of nonblocking syscalls. Of course you could add the same sugar on top of OS threads (this is even a classic exercise for people learning how the Future system works in Rust), that wouldn't make much sense to use such thing in practice though.
The question is whether the (green) threading model is a better abstraction on top of nonblocking syscalls than async/await is. For JavaScript the answer is obviously no, because all you have behind is a single threaded VM, so you lose the only winning point of green threading: the ability to use the same paradigm for concurrency and parallelism. In all other regards (performance, complexity from the user's perspective, from an implementation perspective, etc.) async/await is just a better option.
Of course it can be tracked. It's all a matter of choice, and things you've grown used to vs. not.
> but at least you have a guarantee when you are using the good kind
Guarantee of what? If you're talking about a guarantee that the event loop's kernel thread is never blocked, then there's another way of guaranteeing that: simply making sure that all IO calls use your concurrency mechanism. As no annotations are needed, it's a backward-compatible change. That's what we're trying to do in Java.
> but it will still be less expensive to spawn 1M futures than 1M stackful coroutines.
It would be exactly as expensive. The JS runtime could produce the exact same code as it does for async/await now without requiring async/await annotations.
> Async/await is syntax sugar on top of futures/promises, which itself is a concurrency tool on top of nonblocking syscalls.
You can say the exact same thing about threads (if you don't couple them with a particular implementation by the kernel), or, more precisely, delimited continuations, which are threads minus the scheduler. You've just grown accustomed to thinking about a particular implementation of threads.
> The question is whether the (green) threading model is a better abstraction on top of nonblocking syscalls than async/await is
That's not the question because both are the same abstraction: subroutines that block waiting for something, and then are resumed when that task completes. The question is whether you should make marking blocking methods and calls mandatory.
> In all other regards (performance, complexity from the user's perspective, from an implementation perspective, etc.) async/await is just a better option.
The only thing async/await does is force you to annotate blocking methods and calls. For better or worse, it has no other impact. A clear virtue of the approach is that it's the easiest for the language implementors to do, because if you have those annotations, you can do the entire implementation in the frontend; if you don't want the annotation, the implementors need to work harder.
Of course, you could argue that you personally like the annotation requirement and that you think forcing the programmer to annotate methods and calls that do something that is really indistinguishable from other things is somehow less "complex" than not, but I would argue the opposite.
I have been programming for about thirty years now, and have written extensively about the mathematical semantics of computer programs (https://pron.github.io/). I understand why a language like Haskell needs an IO type (although there are alternatives there as well), because that's one way to introduce nondeterminism to an otherwise deterministic model (I discuss that issue, as well as an alternative -- linear types -- plus async/await and continuations here: https://youtu.be/9vupFNsND6o). And yet, no one can give me an explanation as to why one subroutine that reads from a socket does not require an `async` while another one does even though they both have the exact same semantics (and the programming model is nondeterministic anyway). The only explanation invariably boils down to a certain implementation detail.
That is why I find it very tenuous to claim that two subroutines with the same program semantics should have different syntactic representations just because they differ in an underlying implementation detail, and that this is somehow less complex than a single syntactic representation. Surfacing implementation details to the syntax level is the very opposite of abstraction and the very essence of accidental complexity.
Now, I don't know JS well, and there could be some backward compatibility arguments (e.g. having to do with promises maybe), but that's a very different claim from "it's less complex", which I can see no justification for.
There are two different things: semantic and syntax.
From what I understand now, you are arguing about syntax: we should not need to write “async” or “await”. I'm not really going to discuss this, because as you said, I do like the extra verbosity and I actually like explicit typing for the same reason (Rust is my favorite, with just the right level of inference) and I'm not fond of dynamic typing or full type inference. This is a personal taste and that isn't worth arguing about.
On the other hand, there is also a semantic issue, and sorry, I have to disagree: stackful and stackless coroutines don't have the same semantics, they don't have the same performance characteristics, nor do they have the same expressiveness (and associated complexity, for users and implementers). What I was arguing is that if you want the full power of threads, you pay the price for it.
But from what I now understand, you just want a stackless coroutine system without the extra “async/await” keywords, is that what you mean?
Where are you getting the idea that (one-shot) delimited continuations (stackful) "don't have the same performance characteristics" as stackless continuations, especially in a language with a JIT like JS? Also, "stackless coroutines without async/await" would give you (stackful) delimited continuations (albeit not multi-prompt). The reason Rust needs stackless coroutines is because of its commitment to "zero-cost abstractions" and high accidental complexity (and partly because it runs on top of a VM it doesn't fully control); surely JS has a different philosophy -- and it also compiles to machine code, not to a VM -- so whatever justification JS has for async/await, it is not the same one as Rust.
As to semantic differences, what is the difference between `await asyncFoo()` and `syncFoo()`?
BTW, I also like extra verbosity and type checking, so in the language I'm designing I'm forcing every subroutine to be annotated with `async` and every call to be annotated with `await` -- enforced by the type checker, of course -- because there is simply no semantic difference in existence that allows one to differentiate between subroutines that need it and those that don't, so I figured it would be both clearest to users and most correct to just do it always.
> Where are you getting the idea that (one-shot) delimited continuations (stackful) "don't have the same performance characteristics" as stackless continuations, especially in a language with a JIT like JS?
No matter the language, doing more work is always more costly than doing less… Because of the GC (and not the JIT) at least you can implement moving stacks in JS, but that doesn't mean it comes for free.
> Also, "stackless coroutines without async/await" would give you (stackful) delimited continuations (albeit not multi-prompt). The reason Rust needs stackless coroutines is because of its commitment to "zero-cost abstractions" and high accidental complexity (and partly because it runs on top of a VM it doesn't fully control); surely JS has a different philosophy -- and it also compiles to machine code, not to a VM
???
> As to semantic differences, what is the difference between `await asyncFoo()` and `syncFoo()`?
With `syncFoo()`, the assert is always true: nothing could have run between line 2 and line 3, so you know for sure that the environment on line 3 is the same as it was on line 2.
With `await asyncFoo()`, you cannot be sure that your environment on line 3 is still what it was on line 2, because a lot of other code could have run in between, mutating the world.
You could say “global variables are a bad practice”, but the DOM is a global variable…
> No matter the language, doing more work is always more costly than doing less… Because of the GC (and not the JIT) at least you can implement moving stacks in JS, but that doesn't mean it comes for free.
It comes at extra work for the language implementors, but the performance is the same, because the generated code is virtually the same. Or, to be more precise, it is the same within a margin of error for rare, worst-case work that JS does anyway.
> ???
Rust compiles to LLVM, and it's very hard to do delimited continuations at no cost without controlling the backend; JS engines do control theirs. Also, because Rust follows the "zero-cost abstractions" philosophy, it must surface many implementation details to the caller, like memory allocation. This is not true for JS.
> The assert is always true
No, it isn't. JS isn't Haskell and doesn't track effects, and syncFoo can change GlobalState.bar. In fact, inside some `read` method the runtime could even run an entire event loop while it waits for the IO to complete, just as `await` effectively does.
Now, you could say that today's `read` method (or whatever it's called) doesn't do that, but that's already a backward compatibility argument. In general, JS doesn't give the programmer any protection from arbitrary side effects when it calls an arbitrary method. If you're interested in paradigms that control global effects and allow them only at certain times, take a look at synchronous programming and languages like Esterel or Céu. Now that's an interesting new concurrency paradigm, but JS doesn't give you any more assurances or control with async/await than it would without them.
JavaScript is much more constrained than Rust, because of the spec and the compatibility with existing code. The js VM has many constraints, like being single threaded or having the same GC for DOM nodes and js objects for instance. Rust could patch LLVM if they needed (and they do already, even if it takes time to merge) but you can't patch the whole web.
> In fact, inside some `read` method the runtime could even run an entire event loop while it waits for the IO to complete, just as `await` effectively does
No, it cannot without violating its own spec (and it would probably break half the web if it started doing that). JS is single-threaded by design, and you can't change that without designing a completely different VM.
> No, it isn't. JS isn't Haskell and doesn't track effects, and syncFoo can change GlobalState.bar.
Of course, but if syncFoo is some function I wrote, I know it doesn't. The guarantee is that nobody else (let's say an analytics script) is going to mutate that between those two lines. If I use await, everybody's script can run in between. That's a big difference.
> because the generated code is virtually the same.
You keep repeating that over and over again but that's nonsense. You can't implement stackless and stackful coroutines the same way. Stackless coroutines have no stack, a known size, and can be desugared into state machines. Stackful coroutines (AKA threads) have a stack; they are more versatile, but you can't predict how big the stack will be (that would require solving the halting problem), so you can either have a big stack (that's what OS threads do) or start with a small stack and grow as needed. Either approach has a cost: a big stack implies big memory consumption (though the OS can mitigate some of it), and a small stack implies stack growth, which has a cost (even if small).
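A small illustration of the "known size" point (standard library only): the state machine the Rust compiler builds for an async fn has a fixed size that includes whatever locals are live across its await points:

    async fn small() {
        // No locals held across an await: the state machine stays tiny.
        std::future::ready(()).await;
    }

    async fn large() {
        let buf = [0u8; 1024];           // live across the await below,
        std::future::ready(()).await;    // so it's stored inside the future
        let _ = buf[0];
    }

    fn main() {
        println!("{}", std::mem::size_of_val(&small())); // a handful of bytes
        println!("{}", std::mem::size_of_val(&large())); // at least 1024 bytes
    }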
It's not the halting problem. You can track worst-case computational complexity in the type system just as you do effects (why don't you bring up the halting problem when you need to "decide" whether some effect will occur?). You can Google for it -- there are about a thousand papers on the subject. Just to give you an intuitive feel, think of a type system that limits the counter in loops. You can also look up the concept of "gas" in total languages. Something similar is also used in some blockchain languages.
You can even track space complexity in the type system.
First, no rescue is needed. I said that not tracking complexity is a matter of choice, and you mistakenly thought it amounts to deciding halting; I pointed out the well-known fact that it isn't. Second, you don't need a non-Turing-complete language in order to track computational complexity in the type system, just as you don't need a language that cannot perform IO in order to track IO in the type system.
As to JavaScript, there are much more commonplace things that it doesn't track in the type system, either, and they're all a matter of choice. There is no theory that says what should or shouldn't be tracked, and no definitive empirical results that can settle all those questions, either. At the end of the day, what you choose to track is a matter of preference.
> I said that not tracking complexity is a matter of choice, and you mistakenly thought it amounts to deciding halting
And you implicitly acknowledged this fact by using TPLs as an example: you know, the class of languages where all you can write is provably halting programs (that's the definition of total programming languages!).
The halting problem being a property of Turing-complete languages, you just sidestepped the issue here.
If your concurrency model only allows bounded nondeterminism, you can't express an algorithm that assumes a concurrency model with unbounded nondeterminism in it. Furthermore, the choices concurrency models make hugely affect performance, so it doesn't even matter much if models were equivalent. So, no, algorithms do have to be explicit about concurrency models, whether you like it or not.
That some piece of code is explicit about what it does and that the capability to do so must be declared ahead of time in the type system are two different things. For example, every subroutine must be explicit about what it does and what, if anything, it returns, yet languages differ on how much of that must be declared in the subroutine's type signature -- Haskell requires being explicit in the type signature about return types as well as effects, Rust is explicit about the return type and some kinds of effects, Java is explicit about the return types and a smaller subset of effects, and JavaScript is explicit about neither. A language can provide the same suspension (or "await") behavior as Rust's async/await without requiring subroutines that use that mechanism to declare that they do so in the type system. Scheme does precisely that (with shift/reset).
In fact, Rust already gives you two ways to suspend execution -- one requires a declaration, and the other does not, even though the two differ only in the choice of implementation. That you wish to use the language's suspension mechanism rather than the OS's is not a different "model." It's the same model with different implementations.
Whether you need to declare that a subroutine blocks or not has no bearing on the fairness of scheduling.
Type declaration is supposed to signify the choice of concurrency model. Either way, algorithmically the choice has to be explicit; even if it's not a type declaration, it still has to be an explicit declaration somewhere, or there is no choice and algorithms have to be expressed in terms of a single concurrency model.
> Type declaration is supposed to signify the choice of concurrency model.
That "supposed to" is an aesthetic/ideological/pragmatic/whatever preference, and one with significant tradeoffs. Again, Scheme's shift/reset gives you the same control over scheduling as Rust's async/await, but you don't need to declare that in the type signature. Rust puts it in the type signature because in the domains Rust targets it is important that you know precisely what kind of code the compiler generates.
> Either way, algorithmically the choice has to be explicit; even if it's not a type declaration, it still has to be an explicit declaration somewhere, or there is no choice and algorithms have to be expressed in terms of a single concurrency model.
First, we're not talking about two models, but one model with two implementations (imagine that the OS could give you control over scheduling, like here [1]; you'll have the exact same control over the "concurrency model" but without the type declaration -- even in Rust -- although you'll have less precise control over memory footprint). Second, that the language's design dictates that you must tell the compiler in the type signature what it is that your computation does is precisely what creates accidental complexity, as it impacts your code's consumers as well.
With "async/await" you see exact places where you give up control, it's part of the model, without it you don't and it's a different model. First one lets you program as if you can always access shared memory like you are the only one doing it, since you always know exactly at which point this assumption no longer holds, second one requires you to be more careful about any function call, basically limiting you to the same assumption but only for code in between function calls, reducing power to abstract things with functions. Things get even worse with different executors, that give you completely different concurrency models and break all the assumptions of a simple event loop executor.
You're not talking about different algorithmic models, just about expressing constraints (e.g. you could follow the same algorithm, with the same constraints even without enforcement by the type checker).
In any event, the reason Rust does it is not because of the aesthetic preference you express (that would be true for, say, Haskell) but because the domains it targets require very fine-grained control over resource utilization, which, in turn, requires very careful guarantees about the exact code the compiler emits. It's a feature of Rust following C++'s "zero-cost abstractions" philosophy, which is suitable for the domains Rust and C++ target -- not something essential to concurrency. Again, Scheme gives you the same model, and the same control over concurrency, without the type signatures, at the cost of less control over memory footprint, allocation and deallocation.
I think you might be saying that you happen to personally prefer the choices Rust makes (which it makes because of its unique requirements), and that's a perfectly valid preference but it's not universal, let alone essential.
Scheme has shift/reset which is the same as async/await (in fact, it's strictly more powerful) without requiring any declarations in the signature. Again, the choice of algorithm is one thing, but what adds to the accidental complexity is the requirement that that choice be reflected in the type signature, and so it affects the consumer of the code as well. Reflecting the algorithm in the signature is an aesthetic choice in general, and a forced choice in Rust given the domains it targets.
> Again, the choice of algorithm is one thing, but what adds to the accidental complexity is the requirement that that choice be reflected in the type signature, and so it affects the consumer of the code as well.
As long as you have shared memory adding type signatures to avoid thinking about and handling concurrent memory access only reduces accidental complexity and limits possibilities to make mistakes.
> As long as you have shared memory adding type signatures to avoid thinking about and handling concurrent memory access only reduces accidental complexity and limits possibilities to make mistakes.
That is certainly one way to reduce algorithmic mistakes, but it doesn't reduce accidental complexity, and it's not why Rust requires async in the signature. Rust needs to prevent data races even in non-async subroutines that run concurrently. Rust requires async because it implements that feature by compiling an async subroutine rather differently from a non-async one, and the commitment to so-called "zero-cost abstractions" requires the consumers to know how the subroutine is compiled.
> Of course, in constrained environments like embedded systems or other circumstances where control over resources is crucial
I wonder if we should think of end-user machines as such a constrained environment. Certainly, complaining about software bloat and inefficiency is a popular pastime here on HN and on related message boards. Maybe we have an ethical obligation to our users to make the most efficient possible use of their resources, regardless of the extra complexity we have to deal with. Of course, economic realities prevent us from really doing that in many cases.
And maybe, if software does good, we have an ethical responsibility to deliver software more cheaply and quickly (and if it doesn't, then maybe we shouldn't write it at all)? Ethics are complex, even more than programming, and you're not going to get a definitive answer to such questions. Rust makes certain choices -- more control, more accidental complexity -- as it targets certain domains, while other languages target other domains and make other choices. It's your responsibility to figure out which domain your application belongs in, and neither ethics nor anything else can give a universally applicable answer here.
This is a silly argument; of course de-sugaring a language leads to a less nice type signature. C functions compile down to assembly, and C++ or Java hide vtables and such when you do method calls.
If you go lower in the abstraction level you get more complicated mechanics, and that says nothing about the validity of the high-level semantics.
>C functions compiles down to assembly and C++ or Java hide vtables and such when you do method calls.
I don't think it makes sense to compare the output of Rust macros to assembly code generated by a C compiler or to vtables.
Macros are part of the Rust language, as is their output. Understanding Rust means understanding both input and output. So this abstraction boundary is intentionally leaky to some degree.
Generated assembly and vtables on the other hand are compiler implementation details subject to change without notice. Any abstraction leakage is unintended and undesirable (even if developers sometimes benefit from understanding that output)
Your line of reasoning makes all debates about language complexity completely pointless.
I disagree that the user of the macro needs to understand its output.
The output of a macro is an implementation detail, and the documentation of the macro should be enough to use the macro without even looking at its output.
For example, no need to understand the magic behind the 'quote!' or the #[async_trait] macros to use them.
Not every user has to understand every macro. But the output of a Rust macro is valid Rust code whereas the output of a C compiler is not valid C code.
As a consequence, criticising the complexity of whatever a C compiler generates cannot possibly be valid criticism of C's complexity on a _semantic_ level whereas criticising the output of a Rust macro can be valid criticism of Rust on a semantic level.
Nearly all C compilers allow inline assembly. Macros are similar to inline assembly in that they step outside the normal bounds/use case of the language and are a complicated but valid and useful tool.
Most C programmers won't have to write or understand inline assembly often, if ever. Of course you can encounter it in a production problem or something, so you could make an argument that all C programmers need to understand "C with inline assembly", which is the argument you are making for Rust macros.
As long as you just use Rust macros and don't write your own, you are solidly in "C without inline assembly" territory.
>Nearly all C compilers allow inline assembly. Macros are similar to inline assembly in that they step outside the normal bounds/use case of the language and are a complicated but valid and useful tool.
I couldn't disagree more. Macros are not similar to inline assembly at all precisely because they do _not_ step outside the bounds of the language.
Whatever similarities you may find, it's simply not helpful to deny the fundamental distinction between language A generating code in language A and language A invoking/generating code in language B.
It's futile to debate the properties of a particular language if you can't make a distinction between that language and anything it can generate or embed in some opaque way.
What I fail to understand is why it's so important to you, for the complexity of language A, whether a pre-processing/compilation/de-sugaring step of language A results in a valid snippet of language A or of another language.
Take Objective-C automatic reference counting [1], implemented as a transformation of the original code to valid code of the same language (similar to a Lisp/Rust/Scala style macro) by automatically adding the appropriate statements.
If I understand your argument correctly, according to you this increases the complexity of "Objective-C with ARC", but would not have done so if the compiler would have implemented it as a direct transformation to its compilation target instead.
To me, that is an implementation detail which does not matter. "Objective-C with ARC" is exactly as complex in both cases. I'd argue it's even a little bit less complex with the "macro" implementation since you don't need to know assembly to know what ARC is doing.
Similar to ARC, Rust implements some things with macros, which first "compile" something to valid Rust. To me this is not more difficult for users than it would be if the compiler directly generated LLVM IR without this intermediate step.
The inclusion of macros in a language does make the language more complex, of course! And creating macros is notoriously difficult, since you're basically implementing a small compiler step! But for the user, using something is not suddenly more difficult because it's implemented using a macro.
>What I fail to understand is why it's so important to you whether a pre-processing/compiler/de-sugaring of language A results in a valid snippet of language A or another language for complexity of language A.
It's important because any and all code in language A is fair game when it comes to criticising semantic properties of language A. Code in other languages isn't.
noncoml criticised Rust based on a piece of Rust code. My point is simply that this criticism is potentially legitimate in ways that criticising C based on a piece of assembly code could never be.
I think our disagreement arises because you are asking a completely different question. What you're saying is that for devs who invoke some code it may not matter one bit whether that code was implemented in language A or language B or language A generated by language A or B. Those distinctions do not necessarily affect the semantic complexity for users of that code.
I completely agree with that. I also agree that the code snippet noncoml posted does not mean using async code in Rust has to be overly complicated.
But when I see a piece of Rust source code, I can criticise Rust based on it regardless of where that code came from or what purpose it serves.
Someone had to think in terms of Rust in order to write that code, and it's always worth asking whether it shouldn't be possible to express the same thing in a simpler way or whether that would have been possible in another language.
The fact that this code does not have to be understood by its users is completely irrelevant for this particular question.
This example from the article was there to demonstrate a type that is more complex than would be acceptable in day-to-day use, so they agree with you here: it would be crazy to require that in order to use the feature.
Java has always been an object oriented programming language. A class is a basic construct of the Java language. It wasn't added into Java. People still use classes. All the major new programming language have OOP features, you can create instances in Swift, Kotlin, Go and Rust for example.
async language features are just inline coroutines. Inline coroutines are a useful concept. It's a single tool in a big toolbox of other tools, and not every problem requires the same tool, just like OOP features. It's just more stuff you can use if you want. If you don't want it, you don't have to use it; you can just write Haskell or OCaml without using objects, not worry about it, and you won't have to worry about how long you live.
If you grab specific favourite functionality, this quickly becomes a no-true-scotsman situation. We know what OOP looks like in general. First-class metaclass support is not available in Java, but if you really need it for some specific cases, you do it at compile time using an annotation processor. The extreme late binding that Alan Kay would like does indeed make Java an inferior OOP language. Service registration does this on a macro scale, though (although I've seen more abuse of micro-scale extreme late binding than use of it in good design).
Even in Rust, it only looks like that in this very specific use case and is usually completely hidden from the programmer. It's the desugared output of a library.
Usually it would look like this in Rust:
fn get_user(&self) -> impl Future<Output = User>;
This isn't exactly the same functionally, but for most use cases it works just fine. The difference is that this uses static dispatch instead of dynamic dispatch and therefore can't support dynamically returning two different structs that both implement Future, it has to be one type.
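A sketch of that limitation (names invented): with `impl Future`, every return path must produce one and the same concrete future type:

    use std::future::Future;

    fn get_number(slow: bool) -> impl Future<Output = u32> {
        // Fine: one async block, one anonymous type covering both branches.
        async move { if slow { 2 } else { 1 } }

        // Would NOT compile if written like this instead, because each async
        // block is its own distinct type and `impl Future` requires one:
        // if slow { async { 2 } } else { async { 1 } }
    }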
Check out structural typing. I would say it’s the type system JavaScript uses for non-primitive values. The “dynamic” appearance is just that JavaScript doesn’t center the type system in its UX.
> Now I am seeing the same craziness with async and continuations and wonder if I will be lucky enough to outlive it too.
Maybe you will. It's an attempt to address the shortcomings of shared memory, just like the borrow checker. Once languages start moving away from shared memory to actors, there will be no need for any of it. But it could take a long time; the industry has dug itself way too deep into the whole shared-memory world.
> Everyone wanted to add OOP to every language. C++, ObjC then Java.
By mentioning the languages after that comment, was it your intention to imply that OOP was added to them? Because to my knowledge, all three were OOP-based from the very beginning.
Please comment with a source if you feel I’m wrong on saying any of those were OOP from the beginning. I (as well as others I’m sure) would enjoy the learning opportunity.