These are stackful coroutines, as opposed to stackless.
Stackful coroutines create a new stack for each coroutine thread and save and restore the appropriate registers when switching. Stackless coroutines are basically normal C functions that have been hacked to save their local variables between calls and use gotos (or, notoriously, a switch statement) to resume from where they last yielded. Both are very useful in their own way.
This is a great project and I'll be trying it out in my current work.
Stackless coroutines are very lightweight and can be done without stepping outside of C code, but the approach is a bit hacky: you either use macros or write the coroutine's state machine yourself. One big limitation is that you have to yield from the originally called coroutine function and not from any function it called (the coroutine has no stack of its own; you could work around this, at the cost of more complexity). Besides the function itself and the memory required to save its local variables when it yields, no resources are required. These are great when you just want a basic generator or something. It's simple, platform independent, efficient, and you can do it yourself (at least in C).
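To make the switch-statement trick concrete, here is a minimal sketch of a stackless generator in plain C. The names `range_gen` and `range_next` are made up for this illustration and do not come from any library. Note how every "local" that must survive a yield lives in the struct, and how the yield has to happen in the generator function itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical example type: all state that must survive between calls
 * lives here, because a stackless coroutine has no stack of its own. */
struct range_gen {
    int resume_point; /* 0 = not started, 1 = resume inside loop, 2 = finished */
    int i;            /* loop counter, saved across yields */
    int limit;        /* exclusive upper bound */
};

/* Writes the next value to *out and returns true, or returns false once the
 * range is exhausted. Each call resumes where the previous one yielded. */
bool range_next(struct range_gen *g, int *out)
{
    assert(g != NULL && out != NULL);
    switch (g->resume_point) {
    case 0:
        for (g->i = 0; g->i < g->limit; g->i++) {
            *out = g->i;
            g->resume_point = 1;
            return true;      /* "yield": leave the function mid-loop */
    case 1:;                  /* the next call jumps back in right here */
        }
    }
    g->resume_point = 2;      /* no case matches 2, so later calls fall through */
    return false;
}
```

A caller would zero-initialize a `struct range_gen`, set `limit`, and call `range_next` in a loop until it returns false. The `case` label inside the `for` body is legal C, and it is the same idea the well-known macro-based implementations hide behind their begin/yield/end macros.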
Stackful coroutines are more flexible in that they don't have the calling limitations, since they have their own stack. The big downside is that they are much more complex to implement and understand. The code has to create and destroy stacks from scratch and switch contexts by saving and restoring the correct registers (so your platform has to be explicitly supported). This may cause compatibility problems with some code. Those activities also require more resources, but still not very much. For some people they will be easier to use, since you don't have to roll anything yourself.
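For comparison, here is a rough sketch of the stackful approach using the POSIX `ucontext` calls (`getcontext`, `makecontext`, `swapcontext`), which let libc do the register saving. This is only an illustration, not how Neco does it; the `coro_*` names, the fixed 64 KiB stack, and the single global `current` pointer are assumptions made for the example. The point to notice is that `emit_pair` is an ordinary function, and it can yield on behalf of the coroutine that called it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <ucontext.h>

#define CORO_STACK_SIZE (64 * 1024) /* arbitrary size chosen for this sketch */

struct coro {
    ucontext_t caller; /* context of whoever resumed us */
    ucontext_t self;   /* the coroutine's own saved registers */
    void *stack;       /* separately allocated stack */
    int yielded;       /* value handed out by the last yield */
    bool done;
};

static struct coro *current; /* coroutine currently running, if any */

/* Suspend the running coroutine. Works at any call depth because the
 * coroutine's whole stack is preserved while it is suspended. */
void coro_yield(int value)
{
    struct coro *c = current;
    assert(c != NULL);
    c->yielded = value;
    swapcontext(&c->self, &c->caller);
}

/* A plain helper, not the coroutine entry point, that still yields. */
void emit_pair(int base)
{
    coro_yield(base);
    coro_yield(base + 1);
}

void coro_body(void)
{
    emit_pair(10);
    emit_pair(20);
    current->done = true; /* returning then switches to uc_link (the caller) */
}

bool coro_init(struct coro *c, void (*fn)(void))
{
    c->stack = malloc(CORO_STACK_SIZE);
    if (c->stack == NULL)
        return false;
    if (getcontext(&c->self) != 0) {
        free(c->stack);
        return false;
    }
    c->self.uc_stack.ss_sp = c->stack;
    c->self.uc_stack.ss_size = CORO_STACK_SIZE;
    c->self.uc_link = &c->caller;
    c->yielded = 0;
    c->done = false;
    makecontext(&c->self, fn, 0);
    return true;
}

/* Run the coroutine until its next yield or until it finishes.
 * Returns true if a value was yielded, false if it is finished. */
bool coro_resume(struct coro *c)
{
    if (c->done)
        return false;
    struct coro *prev = current;
    current = c;
    swapcontext(&c->caller, &c->self);
    current = prev;
    return !c->done;
}
```

Calling `coro_init(&c, coro_body)` and then `coro_resume(&c)` repeatedly should hand back 10, 11, 20 and 21 in `c.yielded` before `coro_resume` returns false. The `ucontext` functions were dropped from newer POSIX revisions and are not available everywhere, which is a small demonstration of the platform-support point above.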
I was actually going to implement an echo server for load balancer health checks with minimal memory usage. I never considered doing it in C, but I might just use this! Thank you so much.
I like how your examples become progressively more comprehensive. What was your thinking going into this? How did you design this library? I think you wanted something this beautiful and perfect to exist, am I right? Or is it also an exercise to develop your own understanding? I wouldn't know where to begin with C gens/coroutines. Probably I would just fall back on kernel/sys calls to suspend and hack from there: yay my function is stopped; yay my code has been called again. Are you using setjmp/longjmp?
Can someone help elucidate why this library even makes sense, given that "It's a non-goal for Neco to provide a scalable multithreaded runtime, where the coroutine scheduler is shared among multiple cpu cores [...]"? When I use coroutines in Go it is invariably in order to make use of more CPU cores when extra performance needs to be extracted.
Goroutines aren't coroutines. This isn't a nitpick, it's pretty essential to understanding the goals and non-goals of Neco.
Coroutines are a control-flow primitive which allows for a lot of useful things, including advancing program state while a coroutine waits on a syscall, generators, iterators, producer-consumer patterns with channels, and combinations of these things. They generalize function calls, or structured goto, if you prefer.
Some of these applications are glossed as concurrent, particularly their use to free the thread when execution blocks, but all of them are single-threaded. Goroutines are preëmptively scheduled, and can be relocated between threads transparently. That isn't really a coroutine, which is why they have a different name.
The README has some notes on how to use Neco coroutines in a multithreaded environment. It's basically how you'd run execution paths of multiple function calls on several threads, because coroutines are an orthogonal concern to threads.
The library rightly leaves implementation of that up to the user. You're talking about mixing cooperative and preemptive threads which have very different safety and execution profiles. There are many ways to arrange and interact the two things. Trying to provide general support for something like that would just end up a complicated mess that no one actually understands properly.
The scheduler is probably simpler if coroutines can't bounce between cores. You can have a single thread per core that runs the scheduler to multiplex a bunch of coroutines on that one thread, which lines up with the example linked in the repo, a redis clone. Redis runs a single thread (I think technically it has some multithreading stuff now, but that was the model for a while) but can concurrently process a bunch of requests, like when one or more requests are blocked on blpop, a pubsub thing, xread with the block arg given, etc. nginx does something similar by forking a process per core that runs single threaded (ignoring the threading for stuff that reads from disk).
Coroutines are still very useful for multitasking even if you're not sharing tasks between CPU cores. Other coroutines can just execute whenever something would normally block, removing the need for multiple threads in the first place.
This library aims to be the simplest and most efficient thing for what it does. And what it's perfect for is managing communications with things that themselves may be bound by I/O.
Handling multiple threads to access multiple CPUs would make it far more complex. And frankly a single thread is plenty for handling, say, a chat server.
I’ve used libaco with a lot of success in the past, mixing uv, curl, and zlib in a way that made the zlib loop easier to manage (it looked like a normal read/decompress loop with all context stack local).
Any notes about why I would try Neco in the future?
Coroutine is definitely appropriate here. Any implementation that has one function (routine) executing while others are paused is applicable. It always means cooperative multitasking, which this is. No meaning lost whatsoever.
Coroutine doesn't just mean cooperative multitasking, it means a specific form of cooperative multitasking. There's value in making that distinction.
Also don't get me wrong, the meanings of words do change. I am grumpy about it, but that's fine. It just feels kinda weird to have people telling me it has always meant that.
I don't know where you're getting that impression.
These are stackful, symmetric coroutines with a built-in scheduler. It implements suspend and resume. You can call that a green thread if you want, or a fiber or whatever. They're coroutines.
I think the replies I got here more or less prove my point.
So a coroutine is basically something you can explicitly yield control flow from or give control flow to. neco doesn't really let you do that. The caller of neco_resume isn't relinquishing control flow to another coroutine. neco_yield looks like a coroutine yield, but for an asymmetric coroutine it should yield to the caller, and for a symmetric coroutine it should let you specify who you are yielding to. This one does neither.
And I feel if you stretch the definition of coroutine to include this, then you can basically call most multitasking systems coroutines, rendering the concept mostly useless.
These are coroutines. Just... look at the code, this isn't a proprietary binary product where you have to take someone's word for it. They're very clearly coroutines. I don't know what else to say here.
I assume it was meant to be arg1, to showcase that the malloc'd memory for arg1 can be freed without resulting in a use-after-free once the value has been copied out of argv[1] within the coroutine.
I encourage you to dig through the source. There's not that much.
Either this is a library you might use, so you should obviously review the code before you include it in your own work, or this is something that intrigued you, so you'll find clever ideas by digging into its implementation.
Here, neco_sleep eventually calls a function that tells the coroutine scheduler to pause the current (calling) coroutine and resume the next eligible coroutine in its collection.
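If it helps to see the general shape of "pause the caller, resume the next eligible one", here is a toy round-robin scheduler built on POSIX `ucontext`. To be clear, this is not Neco's code and makes no claim about its internals; `task_spawn`, `task_pause`, `sched_run`, the `trace` buffer and the fixed-size task table are all invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <ucontext.h>

#define MAX_TASKS 8
#define TASK_STACK_SIZE (64 * 1024) /* arbitrary size chosen for this sketch */

struct task {
    ucontext_t ctx;
    void *stack;
    void (*fn)(int id);
    bool done;
};

static ucontext_t sched_ctx;        /* the scheduler loop's saved context */
static struct task tasks[MAX_TASKS];
static int task_count;
static int running = -1;            /* index of the task currently running */

char trace[64];                     /* records the order in which tasks ran */
size_t trace_len;

void trace_put(char ch)
{
    if (trace_len + 1 < sizeof trace) {
        trace[trace_len++] = ch;
        trace[trace_len] = '\0';
    }
}

/* Pause the calling task and hand control back to the scheduler. */
void task_pause(void)
{
    assert(running >= 0);
    swapcontext(&tasks[running].ctx, &sched_ctx);
}

static void trampoline(int id)
{
    tasks[id].fn(id);
    tasks[id].done = true; /* returning switches to uc_link (the scheduler) */
}

bool task_spawn(void (*fn)(int id))
{
    if (task_count == MAX_TASKS)
        return false;
    struct task *t = &tasks[task_count];
    t->stack = malloc(TASK_STACK_SIZE);
    if (t->stack == NULL || getcontext(&t->ctx) != 0) {
        free(t->stack);
        return false;
    }
    t->ctx.uc_stack.ss_sp = t->stack;
    t->ctx.uc_stack.ss_size = TASK_STACK_SIZE;
    t->ctx.uc_link = &sched_ctx;
    t->fn = fn;
    t->done = false;
    makecontext(&t->ctx, (void (*)(void))trampoline, 1, task_count);
    task_count++;
    return true;
}

/* Resume unfinished tasks in round-robin order until all have finished. */
void sched_run(void)
{
    int remaining = task_count;
    for (int i = 0; remaining > 0; i = (i + 1) % task_count) {
        if (tasks[i].done)
            continue;
        running = i;
        swapcontext(&sched_ctx, &tasks[i].ctx);
        running = -1;
        if (tasks[i].done) {
            free(tasks[i].stack); /* safe: we are back on the scheduler's stack */
            tasks[i].stack = NULL;
            remaining--;
        }
    }
    task_count = 0;
}

/* Example task: logs its letter, then pauses, twice. */
void worker(int id)
{
    for (int step = 0; step < 2; step++) {
        trace_put((char)('A' + id));
        task_pause();
    }
}
```

Spawning `worker` twice and calling `sched_run()` should leave "ABAB" in `trace`: each task runs until it pauses, and the scheduler picks the next one that hasn't finished. A real scheduler would also track why a task paused (a timer, a file descriptor, a channel) and only resume it once that condition is met, which is the "eligible" part.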