
I don't think it is an issue w/ the pre-emption code. I believe FuturesUnordered is just doing the wrong thing: not respecting yields.



Regarding preemption: in Seastar we have maybe_yield(), which gives up the CPU, but only if the task quota (roughly the semantic equivalent of Tokio's budget) has been used up. Wouldn't it make sense to have a similar capability in Tokio? Then anyone who is not a big fan of the default preemption could run their tasks under tokio::task::unconstrained and only check the budget in a few explicitly chosen places, where they call maybe_yield(). That could of course also be open-coded by implementing maybe_yield on top of yield_now and some time measurements, but since the whole budgeting code is already there... Do you think it's feasible?
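To make the "open-coded" variant concrete, here is a minimal sketch using only std, assuming a time-based quota. Every name in it (`YieldNow`, `maybe_yield`, `sum_with_yields`, `block_on`) is made up for illustration; none of it is Tokio's or Seastar's API, and the busy-polling `block_on` is only there so the example runs without a runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

/// Returns `Pending` exactly once, after waking itself so it gets polled again.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Hypothetical helper: yields only if `quota` has elapsed since `*slice_start`.
/// Returns whether it actually yielded.
async fn maybe_yield(slice_start: &mut Instant, quota: Duration) -> bool {
    if slice_start.elapsed() < quota {
        return false; // quota not used up yet: keep running, no re-queue
    }
    YieldNow { yielded: false }.await;
    *slice_start = Instant::now(); // a new time slice starts after the yield
    true
}

/// Long I/O-less computation that checks the quota on every iteration.
async fn sum_with_yields(n: u64, quota: Duration) -> (u64, u32) {
    let mut slice_start = Instant::now();
    let mut sum = 0;
    let mut yields = 0;
    for i in 0..n {
        sum += i;
        if maybe_yield(&mut slice_start, quota).await {
            yields += 1;
        }
    }
    (sum, yields)
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop and reports how many polls were needed.
fn block_on<F: Future>(fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

With a very large quota the loop never yields and finishes in a single poll; with `Duration::ZERO` it yields on every iteration. A budget-based version would only swap the `elapsed()` check for a peek at the runtime's counter.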


The problem is that it requires you to write your code in a way that will now work _only_ with tokio.

Which isn't an option for a lot of libraries.

Meanwhile, the rest of the Rust ecosystem is moving toward making more and more parts runtime-independent...

Furthermore, I think `maybe_yield` wouldn't be quite the right solution. The problem is that tokio's magic relies on the assumption that a single task (a future scheduled by the runtime) represents a single logical strand of execution. (Which isn't guaranteed in Rust.)

So I think a better tokio-specific solution would be to teach tokio about the multiplexing effect in some way.

For example, you could have some mechanism that snapshots the budget when reaching the multiplexer and resets the budget to the snapshot before every multiplexed future is polled. With this, each logical strand of execution would have its "own" budget (more or less).
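A rough sketch of that snapshot/restore idea, under loud assumptions: `BUDGET` is a hypothetical thread-local counter standing in for the runtime's internal budget, `Work` is a toy leaf future that spends one unit per poll, and `poll_round` / `run_rounds` are invented helpers. Nothing here is Tokio's API.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

thread_local! {
    /// Hypothetical task budget; a stand-in for the runtime's internal counter.
    static BUDGET: Cell<u32> = Cell::new(0);
}

/// Toy leaf future: always able to make progress, but each poll costs one
/// budget unit. With no budget left it is forced to return `Pending`.
struct Work {
    remaining: u32,
}

impl Future for Work {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let left = BUDGET.with(|b| b.get());
        cx.waker().wake_by_ref(); // always wants to be polled again
        if left == 0 {
            return Poll::Pending; // forced yield, no progress made
        }
        BUDGET.with(|b| b.set(left - 1));
        self.remaining -= 1;
        if self.remaining == 0 {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    }
}

/// Multiplexer step: polls every unfinished child once, in a fixed order.
/// With `restore`, the budget seen on entry is reinstated before each child.
fn poll_round(children: &mut [Work], cx: &mut Context<'_>, restore: bool) {
    let snapshot = BUDGET.with(|b| b.get());
    for child in children.iter_mut().filter(|c| c.remaining > 0) {
        if restore {
            BUDGET.with(|b| b.set(snapshot));
        }
        let _ = Pin::new(child).poll(cx);
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Runs `rounds` task polls over four children needing three units each and
/// returns the work each child still has left.
fn run_rounds(rounds: u32, task_budget: u32, restore: bool) -> Vec<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut children: Vec<Work> = (0..4).map(|_| Work { remaining: 3 }).collect();
    for _ in 0..rounds {
        // The runtime is assumed to reset the budget at the start of each task poll.
        BUDGET.with(|b| b.set(task_budget));
        poll_round(&mut children, &mut cx, restore);
    }
    children.iter().map(|c| c.remaining).collect()
}
```

With a shared budget of 2, `run_rounds(3, 2, false)` leaves the last two children untouched, because the first two consume the budget in every round; with `restore` set, all four children finish. The trade-off is that the task as a whole can now do up to N times the budget per poll, so a real design would need to decide how to bound that.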


Any extension of executors will require having a trait abstracting the executor used, and there just isn’t one in std yet. Your code already has to be tokio specific if you do something as mundane as spawn a task.


> mundane as spawn a task.

There are only a few things you need the specific runtime for:

- spawn

- IO

- timeout

But you can mix the executor with a separate reactor doing the IO (not recommended, but you can).

Similarly, you can run your own timer.

And you can abstract over all of this in various ways, with limitations of course, which is why there is no std abstraction. But there are enough high-profile libraries which support multiple runtimes just fine.

But tokio's preemption of cooperative tasks requires any code that does any form of future multiplexing to:

- be tokio specific (which btw. isn't fully solving the problem)

- add a bunch of additional complexity, including memory and runtime overhead

If you multiplex futures on tokio you must:

- use custom wakers to detect yields

- (and) not poll futures in a "repeating" order (preferably in fully random order).

This is a lot of additional complexity for something like a join_all (for a "small" N).

(Reminds me I should check whether I need to open an issue with futures-rs, as their join_all impl was subtly broken last time I checked.)

And even with that you have the problem that the multiplexed futures are subtly de-prioritized, as they share a budget.

The problem I have with this feature is not that it's a bad idea; in general it's a good idea. The problem is that it completely overlooks the multiplexing case. And worse, it divides the ecosystem further in subtle ways (that is what I'm worried about).

So maybe we could find a way to provide a std (or just common-library) standardized way for just that feature. (I mean, it's a task-local int; it might not even need to be atomic. So there might be a way which doesn't have the problems that standardizing an async runtime abstraction in std has.)


maybe_yield may or may not be the right solution here, but I think it may be useful in general, e.g. when you have long I/O-less computations. In such a case, I'd like to be able to say "yield here if my budget is drained, but continue otherwise and don't put my task at the end of the queue". Although for that, the only thing I really need is a way to peek at the budget; with that, open-coding maybe_yield is trivial.


It wouldn’t be too hard. The trickiest bit would be putting together a consistent API.


Now that I think of it, it would probably be beneficial even outside of the unconstrained scope, especially for long computations. When iterating over millions of elements, it would be great to have a mechanism for maybe yielding if we're past the budget, without force-yielding every X iterations and putting the task at the back of the queue each time. If the maybe_yield API is potentially controversial, a sufficient building block would be a function that allows peeking into the state of your budget; then, if you're out of it, you just explicitly call yield_now().


How could it respect them? The Future trait doesn't let it distinguish between "please yield now" and "please poll again".


It wouldn’t be too hard to tell them apart. A yield is defined as the task waking itself, as opposed to something else waking it. The yield methods already do this.
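One way a multiplexer could act on that definition is to hand each child a wrapper waker and check whether it was woken while the child's poll was still running. The sketch below is a heuristic under that assumption, using only std; `FlagWaker`, `Outcome`, `poll_and_classify` and `YieldOnce` are invented names, and a wake that races in from another thread during the poll would also be classified as a yield.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Wrapper waker: remembers that it was woken, then forwards to the parent.
struct FlagWaker {
    woken: AtomicBool,
    parent: Waker,
}

impl Wake for FlagWaker {
    fn wake(self: Arc<Self>) {
        self.woken.store(true, Ordering::SeqCst);
        self.parent.wake_by_ref();
    }
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Ready,
    /// Child returned `Pending` but woke itself during the poll.
    Yielded,
    /// Child returned `Pending` and is waiting for an external wake-up.
    Waiting,
}

/// Polls `child` once and classifies the result using the wrapper waker.
fn poll_and_classify<F>(child: &mut F, cx: &mut Context<'_>) -> Outcome
where
    F: Future<Output = ()> + Unpin,
{
    let flag = Arc::new(FlagWaker {
        woken: AtomicBool::new(false),
        parent: cx.waker().clone(),
    });
    let waker = Waker::from(flag.clone());
    let mut child_cx = Context::from_waker(&waker);
    match Pin::new(child).poll(&mut child_cx) {
        Poll::Ready(()) => Outcome::Ready,
        Poll::Pending if flag.woken.load(Ordering::SeqCst) => Outcome::Yielded,
        Poll::Pending => Outcome::Waiting,
    }
}

/// Test future that yields once by waking itself and returning `Pending`.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Parent waker for the demo; does nothing when woken.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}
```

On seeing `Yielded`, a multiplexer could stop polling further children and return `Pending` itself. Note that this allocates a waker per poll and says nothing about why the child yielded, which is part of the extra cost discussed in this thread.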


Again, FuturesUnordered cannot know the difference between a task wanting to yield and a task that wants to be polled immediately. The waker does not get this information, either. It cannot distinguish.


Here is the PR: https://github.com/rust-lang/futures-rs/pull/2551

Yield = wake the `waker_ref`. Avoiding the yield would be clone().wake().

That said, "poll immediately" isn't actually a thing nor was it ever a thing except in incorrect implementations.


But polling multiple futures independent of each other inside of a future is a thing (like join, race, etc.).

And that means that just because you get "Pending" (i.e. not ready) from one of the futures, it doesn't mean you should return Pending right away. It only means this future is not ready; other futures might still be ready.
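A minimal sketch of such a join, assuming plain std futures and no budget mechanism at all: a `Pending` from one child is recorded and the loop simply moves on to the next child. `JoinAll`, `CountDown`, `boxed` and `block_on` are illustrative names here, not the futures-rs implementation.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type BoxFut = Pin<Box<dyn Future<Output = u32>>>;

/// Join over boxed futures; a finished child's slot is set to `None`.
struct JoinAll {
    children: Vec<Option<BoxFut>>,
    outputs: Vec<Option<u32>>,
}

impl JoinAll {
    fn new(futs: Vec<BoxFut>) -> Self {
        let n = futs.len();
        JoinAll {
            children: futs.into_iter().map(Some).collect(),
            outputs: vec![None; n],
        }
    }
}

impl Future for JoinAll {
    type Output = Vec<u32>;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Vec<u32>> {
        let this = self.get_mut();
        for (slot, out) in this.children.iter_mut().zip(this.outputs.iter_mut()) {
            let result = match slot.as_mut() {
                Some(fut) => fut.as_mut().poll(cx),
                None => continue, // already finished
            };
            if let Poll::Ready(value) = result {
                *out = Some(value);
                *slot = None;
            }
            // On `Pending` we do NOT return: the other children may be ready.
        }
        if this.children.iter().all(Option::is_none) {
            Poll::Ready(this.outputs.iter_mut().map(|o| o.take().unwrap()).collect())
        } else {
            Poll::Pending
        }
    }
}

/// Test future: returns `Pending` for `pending_polls` polls, then `value`.
struct CountDown {
    pending_polls: u32,
    value: u32,
}

impl Future for CountDown {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.pending_polls == 0 {
            return Poll::Ready(self.value);
        }
        self.pending_polls -= 1;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

fn boxed(pending_polls: u32, value: u32) -> BoxFut {
    Box::pin(CountDown { pending_polls, value })
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop and reports how many polls were needed.
fn block_on<F: Future>(fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

Joining `boxed(2, 10)`, `boxed(0, 20)` and `boxed(1, 30)` completes in three polls of the outer future, because the second and third children make progress even while the first one is still pending.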

But in tokio it means this future is not ready, and as a magic side effect we might have forced all other futures to report not-ready even if they actually are ready.

Which means tokio redefined what Pending means in a subtle but potentially massively breaking way.

Which is a problem.

And not a problem of futures-rs, but one of tokio.

And forcing the whole ecosystem to increase the complexity of their code by trying to subtly detect whether something yielded or was force-yielded IS NOT OK. That's not how Rust standardized yielding or polling.


> wrong thing: not respecting yields

This is not quite right.

As far as I understand, they did respect yields in the way they are defined by Rust.

There, a yield just means that your future returns `Pending`.

Which means only _this_ future is not ready.

But in tokio, returning `Pending` means "not ready" and "maybe, as a _magic side effect_, also make all other futures return Pending even if they are ready internally".

So let's step away from FuturesUnordered for a moment and instead just look at a future which multiplexes X futures, polls each not-yet-completed one once, and then yields, which should be 100% fine.

But with tokio it isn't: after polling just the first few, the "budget" might be consumed, and polling all the others will forcefully fail where it shouldn't, adding a lot of overhead. Worse, if you poll them in the same order every time, you will de facto starve all futures later in the list. Which means you need to add a bunch of complexity which should be unnecessary, just because tokio changed what `Poll::Pending` means.

Also, if you write scheduler-independent futures (you should if you can), then you cannot opt out of it. Generally, in a multiplexed future you don't want to opt out anyway; you want to have a budget per logical strand of execution, which tokio doesn't provide.

Instead, tokio assumes a task (which in the end is just a future polled by it) represents a single logical strand of execution. But that is simply NOT how Rust futures are designed!! (Though often it happens to play out that way.)

This doesn't mean that tokio's idea is bad; it's just not compatible with Rust's future design in subtle edge cases.

I think it would be a nice thing to add to the future design, but then you would need a standardized way to properly handle multiplexing. (Which, as a side note, tokio doesn't provide. It only provides the opt-out `unconstrained`, but what you need is to snapshot and restore the budget or something similar, i.e. snapshot it when entering the multiplexer and restore it after each multiplexed future. The only solution FuturesUnordered could take is speculatively adding yields, but that's _not_ a proper solution, as it subtly de-prioritizes multiplexed futures compared to spawned futures and can also easily fall apart if you nest multiplexed futures...)

Also, I have no idea why they call it cooperative scheduling. Futures already are cooperative scheduling. What they do is preempt futures (i.e. forcefully yield them), but only in places where they could have yielded in the context of cooperative scheduling. So it's some in-between solution.



