I am waiting for a language to solve this with the type system and compiler. Give me the ability to mark a thread as async-only and a clean (async) interface to communicate with a sync thread. If my async code tries to do anything sync, don't let it compile.
Whether a function is async-safe isn't black and white. A function that performs calculations may return instantly on a small dataset but block for "too long" on large parameters. What counts as "too long" will vary widely depending on your application.
But whether a function is not async-safe is pretty black and white (i.e. there's a lot of clearly unsafe code). Even a definition as simple as "performs blocking I/O" would be extremely helpful.
This is the value people get out of async-only environments like Node. Yes, I still have to worry about costly for loops, but I don't have to worry about which database driver I'm using because they are all async. In a mixed-paradigm language like Rust I would really appreciate the compiler telling me I grabbed the wrong database driver.
I think the suggestion is: clearly blocking I/O should be marked as such (with some marker like unsafe), and other functions can be marked "blocking" if the creator decides they are blocking.
In the worst case a problem could still occur, but once found, the “problem” function can be marked appropriately. At the least that would start to solve the issue.
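For what it's worth, the "marker" idea can be roughly approximated in today's Rust without compiler support by using a capability token. This is only a sketch under my own assumptions: `MayBlock`, `run_blocking` and `slow_lookup` are hypothetical names, and the sleep stands in for real blocking I/O. A function that blocks takes a `&MayBlock` parameter, and the only way to obtain one is to go through `run_blocking`, which moves the work to a separate OS thread.

```rust
use std::thread;
use std::time::Duration;

mod blocking {
    /// Zero-sized proof that the current context is allowed to block.
    /// The private field means code outside this module cannot construct one.
    pub struct MayBlock {
        _private: (),
    }

    /// Runs `f` on a dedicated OS thread and lends it the token.
    /// This toy version joins the worker; an async runtime would await it.
    pub fn run_blocking<T, F>(f: F) -> T
    where
        F: FnOnce(&MayBlock) -> T + Send + 'static,
        T: Send + 'static,
    {
        std::thread::spawn(move || f(&MayBlock { _private: () }))
            .join()
            .expect("blocking worker panicked")
    }
}

use blocking::{run_blocking, MayBlock};

/// Hypothetical blocking call, marked as such by its first parameter.
fn slow_lookup(_proof: &MayBlock, key: u32) -> u32 {
    thread::sleep(Duration::from_millis(10)); // stand-in for blocking I/O
    key * 2
}

fn main() {
    // Calling slow_lookup directly is impossible here: no MayBlock value
    // exists in this scope and its field is private, so none can be built.
    let answer = run_blocking(|proof| slow_lookup(proof, 21));
    println!("{answer}");
}
```

It is an honor system: a function that blocks but does not ask for the token is not caught. Once such a function is found, though, adding the parameter turns every bad call site into a compile error, which is about what the comment above describes.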
Aren't you effectively asking the compiler to solve the halting problem?
I think the best you could do would be heuristics - having inferred or user-supplied bounds on the complexity of functions, having rough ideas on how disk or network latency will affect the performance of functions, and bubbling that information up the call tree. It wouldn't be perfect, but it could be useful.
Nit: Single-threaded executors also don't require "Send", since they don't move things between threads. That allows you to use, e.g., non-atomically refcounted things (Rc<T>) on single-threaded executors, which you can't use in the multithreaded versions.
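To make the Send point concrete, here is a small std-only sketch (no executor involved; `assert_send` and the two `sum_*` helpers are names I made up for illustration). An `Arc<T>` can be moved to another thread, while the same code with `Rc<T>` is rejected at compile time.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Compiles only for types that may be moved to another thread.
fn assert_send<T: Send>(_value: &T) {}

/// Arc is atomically refcounted, so it can cross a thread boundary.
fn sum_on_other_thread(data: Arc<Vec<u32>>) -> u32 {
    thread::spawn(move || data.iter().sum::<u32>())
        .join()
        .expect("worker panicked")
}

/// Rc is not Send, so it has to stay on the thread that created it.
fn sum_on_this_thread(data: Rc<Vec<u32>>) -> u32 {
    let alias = Rc::clone(&data); // cheap, non-atomic refcount bump
    alias.iter().sum()
}

fn main() {
    let shared = Arc::new(vec![1, 2, 3]);
    assert_send(&shared);
    println!("{}", sum_on_other_thread(shared));

    let local = Rc::new(vec![1, 2, 3]);
    // assert_send(&local); // does not compile: Rc<Vec<u32>> is not Send
    println!("{}", sum_on_this_thread(local));
}
```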
Async is fundamentally cooperative multitasking. There is no real difference between the 'blocking'-ness of iterating over a large for-loop and doing a blocking I/O action - the rest of your async tasks are blocked either way while another task is doing something.
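A toy illustration of that, assuming nothing beyond std: the round-robin executor below (`run_all`), the `YieldNow` future and the `worker` task are all made up for this sketch and are not how a real runtime is built. Task "a" either yields after every step or runs straight through, and the log shows what task "b" gets to do in the meantime.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Returns Pending once, then Ready: a cooperative yield point.
struct YieldNow(bool);
impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

type Log = Rc<RefCell<Vec<String>>>;
type Task = Pin<Box<dyn Future<Output = ()>>>;

/// Logs each step; yields to the executor only if `cooperative` is set.
async fn worker(name: &'static str, steps: u32, cooperative: bool, log: Log) {
    for i in 0..steps {
        log.borrow_mut().push(format!("{name}{i}"));
        if cooperative {
            YieldNow(false).await;
        }
    }
}

/// Polls every task in turn until all of them have finished.
fn run_all(mut tasks: Vec<Task>) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    while !tasks.is_empty() {
        tasks.retain_mut(|task| task.as_mut().poll(&mut cx).is_pending());
    }
}

/// Runs tasks "a" and "b" together and returns the order of their steps.
fn trace(a_cooperative: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mut tasks: Vec<Task> = Vec::new();
    tasks.push(Box::pin(worker("a", 3, a_cooperative, Rc::clone(&log))));
    tasks.push(Box::pin(worker("b", 3, true, Rc::clone(&log))));
    run_all(tasks);
    let steps = log.borrow().clone();
    steps
}

fn main() {
    println!("{:?}", trace(true)); // a and b interleave
    println!("{:?}", trace(false)); // a finishes before b gets a turn
}
```

The executor cannot tell why "a" did not yield. A tight loop and a blocking read look identical from its side: `poll` simply has not returned yet.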
While a large for-loop and a blocking I/O action affect the event loop in the same way, I'd still appreciate the compiler helping me identify the blocking I/O. I'll take whatever help I can get.
I think I can agree with you, but first I think we'd have to somehow define how to even approach this feature.
E.g., fundamentally I feel like you're making a distinction between blocking I/O and a "blocking" for loop. At the end of the day, they're the same in my view - one is just more likely to be costly.
So I think for this feature to be done right, we'd have to somehow be able to analyze the likelihood of an expensive operation - and the negative consequence that action might have on the rest of the workload. E.g., I would want a huge loop to trigger the same hypothetical compile-time warnings/errors that a huge file load would.
Otherwise a simple function call which involves no I/O and looks innocent could have the same terrible behavior as some I/O call does.
Defining that, and informing the compiler, seems obscenely difficult. To that degree, I think any interaction with any sort of heap-y thing, like iterating over a Vec, would have to be a compile error if used in a Future context.
_Everything_ would have to be willing to yield. Not sure I like it. Interesting thought experiment though. I imagine some GC languages do exactly this.
I mean, what you're asking for is just profiling, but only of your async methods. I'm not familiar enough with the details to predict how async messes with profilers, so maybe it'd need support. But using a flamegraph profiler would show a big chunk of time in a function that only has small amounts of time spent deeper in the stack.
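A cheap runtime approximation of "profiling only the async methods" is to time each individual poll. The sketch below is mine, not an existing crate: `Timed` and `block_on` are hypothetical helpers built on std only, and the busy-polling `block_on` is a toy. A task whose longest poll is measured in milliseconds is holding the executor thread for that long, whatever the reason.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

/// Wraps a future and records the longest single `poll` call.
struct Timed<F> {
    inner: Pin<Box<F>>,
    longest: Duration,
}

impl<F: Future> Timed<F> {
    fn new(inner: F) -> Self {
        Timed { inner: Box::pin(inner), longest: Duration::ZERO }
    }
}

impl<F: Future> Future for Timed<F> {
    /// The inner output plus the longest time spent inside one poll.
    type Output = (F::Output, Duration);

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let start = Instant::now();
        let result = self.inner.as_mut().poll(cx);
        let longest = self.longest.max(start.elapsed());
        self.longest = longest;
        result.map(|value| (value, longest))
    }
}

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls one future in a loop until it completes.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

fn main() {
    let task = async {
        // A blocking call hiding inside async code.
        std::thread::sleep(Duration::from_millis(20));
        7
    };
    let (value, longest) = block_on(Timed::new(task));
    println!("value={value}, longest poll={longest:?}");
}
```

This only finds the slow polls that actually happen during a run, so it has the same blind spot as testing in general.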
No, but if that library claims to be an async library, wouldn't that be a bug in the library?
Edit: I'm interpreting your use of sync here as "blocking" and not as Sync in Rust, meaning safe to share across threads. To be clear, in my initial response I was talking about shared memory across threads, and may have misunderstood your original statement.
I'm not sure this is really possible. Given that async programming is cooperative by nature, how do you tell the difference between a blocking IO task, and a really long running loop in a piece of code that is itself blocking others from executing because it's doing too much work?
Blocking I/O might be something that a type could be created for in Rust, to denote that a call is not async and therefore warn you in some way, but I think that case is easy to detect in testing.
I have seen that not detected in testing too many times. With a work-stealing execution context the code will still run fine unless under heavy load (which will exhaust the thread pool and lock the application).
“Non-blocking” code is basically just code that takes a short enough amount of time that we don’t care that it blocks the thread. It’s inherently a matter of judgement.
Static analysis should help with this. Basically it should identify every call site where I/O happens (and other syscalls), and then you have to check that they are invoked with the right async/nonblocking dance.
This is basically a code audit problem.
Of course something like taint analysis could also work. Every such callsite should be counted as tainted unless it gets wrapped with something that's whitelisted (or uses the right marker type wrapper).
Even effects as types can't help much, because the basic interfaces to the kernels (Linux, WinNT, etc.) are not type-safe, and as long as the language provides FFI/syscall interfaces you have to audit/trust the codebase/ecosystem.
1. A performant "i/o" layer in the standard library that allows a large number of concurrent activities (forget thread vs coroutine differences).
2. Ability of programmer to express concurrency. Ideally, this has nothing to do with "I/O". If I am doing two calculations and both can run simultaneously, I should be able to represent that. Similarly for wait/join.
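As a rough example of point 2 in Rust, scoped threads from the standard library let you say "these two calculations may run simultaneously, then join" with no I/O or async machinery in sight. The function names here are placeholders I picked for the example.

```rust
use std::thread;

fn sum_of_squares(data: &[u64]) -> u64 {
    data.iter().map(|x| x * x).sum()
}

fn max_value(data: &[u64]) -> Option<u64> {
    data.iter().copied().max()
}

/// Runs both calculations concurrently and waits for both results.
fn analyze(data: &[u64]) -> (u64, Option<u64>) {
    thread::scope(|scope| {
        // Both workers borrow `data`; the scope guarantees they finish
        // before `analyze` returns, so no Arc or cloning is needed.
        let squares = scope.spawn(|| sum_of_squares(data));
        let max = scope.spawn(|| max_value(data));
        (
            squares.join().expect("sum_of_squares panicked"),
            max.join().expect("max_value panicked"),
        )
    })
}

fn main() {
    let (squares, max) = analyze(&[1, 2, 3]);
    println!("sum of squares = {squares}, max = {max:?}");
}
```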
Explicitly marking a thread as async-only will just force everyone else (who need sync and cannot track/return a promise/callback to their caller) to write a wrapper around it for no reason.