Show HN: Neco – Coroutine Library for C (github.com/tidwall)
145 points by tidwall 9 months ago | 36 comments



These are stackful coroutines, as opposed to stackless.

Stackful coroutines create a new stack for each coroutine thread and save and restore the appropriate registers when switching. Stackless coroutines are basically normal C functions that have been hacked to save their local variables between calls and use gotos (or, notoriously, a switch statement) to resume from where they last yielded. Both are very useful in their own way.

This is a great project and I'll be trying it out in my current work.


Yeah, at some point I would just say: if you want stackful coroutines, go to Go, and if you want stackless ones, use C++.

BTW, here are my macros for stackless coroutines in C: https://github.com/liuliu/co/blob/master/example.c


Would you mind elaborating on the pros and cons of each?


Stackless coroutines are very lightweight and can be done without stepping outside of C code, but they're a bit hacky: you either use macros or write out the coroutine's state machine yourself. One big limitation is that you have to yield from the originally called coroutine function and not from any function it called (because the coroutine has no stack of its own; you could work around this, with more complexity). Besides the function and the memory required to save its local variables when it yields, no resources are required. These are great when you just want a basic generator or something. It's simple, platform independent, efficient, and you can do it yourself (at least in C).

Stackful coroutines are more flexible in that they don't have the calling limitations, since they have their own stack. The big downside is that they are much more complex to implement and to understand. The code has to create and destroy stacks from scratch and switch contexts by saving and restoring the correct registers (so your platform has to be explicitly supported). This may cause compatibility problems with some code. Those activities also require more resources, but still not very much. For some people they will be easier to use since you don't have to roll anything yourself.


I love how simple and elegant this is to use: https://github.com/tidwall/neco/blob/main/examples/echo-serv...

I was actually going to implement an echo server for load balancer health checks with minimal memory usage. I never considered doing it in C, but I might just use this! Thank you so much.


I like how your examples become progressively more comprehensive. What was your thinking going into this? How did you design this library? I think you wanted something this beautiful and perfect to exist, am I right? Or is it also an exercise to develop your own understanding? I wouldn't know where to begin with C gens/coroutines. Probably I would just fall back on kernel/sys calls to suspend and hack from there: yay my function is stopped; yay my code has been called again. Are you using setjmp/longjmp?


  > neco_chan *messages = argv[0];
neco_chan → neko chan → kitty cat (in Japanese)... coincidence? ^_^


Nekochan had been an important SGI (Irix) forum. Important as in "the community built packages of FOSS for Irix". https://wiki.preterhuman.net/Nekochan.net


Can someone help elucidate why "It's a non-goal for Neco to provide a scalable multithreaded runtime, where the coroutine scheduler is shared among multiple cpu cores [...]" this library even makes sense then? When I use coroutines in Go it is invariably in order to make use of more CPU cores when extra performance needs to be extracted.


Goroutines aren't coroutines. This isn't a nitpick, it's pretty essential to understanding the goals and non-goals of Neco.

Coroutines are a control-flow primitive which allows for a lot of useful things, including advancing program state while a coroutine waits on a syscall, generators, iterators, producer-consumer patterns with channels, and combinations of these things. They generalize function calls, or structure goto, if you prefer.

Some of these applications are glossed as concurrent, particularly their use to free the thread when execution blocks, but all of them are single-threaded. Goroutines are preëmptively scheduled, and can be relocated between threads transparently. That isn't really a coroutine, which is why they have a different name.

The README has some notes on how to use Neco coroutines in a multithreaded environment. It's basically how you'd run execution paths of multiple function calls on several threads, because coroutines are an orthogonal concern to threads.


The library rightly leaves implementation of that up to the user. You're talking about mixing cooperative and preemptive threads which have very different safety and execution profiles. There are many ways to arrange and interact the two things. Trying to provide general support for something like that would just end up a complicated mess that no one actually understands properly.


the scheduler is probably simpler if coroutines can't bounce between cores. you can have a single thread per core that runs the scheduler to multiplex a bunch of coroutines on the single thread, which lines up with the example linked in the repo to a redis clone. redis runs a single thread (i think technically it has some multithreading stuff now but that was the model for a while) but can concurrently process a bunch of requests like when 1 or more requests are blocked on blpop, a pubsub thing, xread with block arg given, etc. nginx does something similar with forking a process per core that runs single threaded (ignore the threading for stuff that reads from disk).


Also it is possible to start multiple Neco schedulers in different threads, there’s an example in the readme.


yeah i like this model. then you can use whatever synchronization junk you prefer to share state between the threads, if you have any.

your library looks well written and clean. thanks for sharing.

if anyone else wants to go coroutine spelunking, these were interesting to me:

https://github.com/higan-emu/libco/tree/master

https://github.com/Tencent/libco

https://github.com/hnes/libaco

https://kernel.googlesource.com/pub/scm/virt/kvm/qemu-kvm/+/...

https://tia.mat.br/posts/2012/09/29/asynchronous_i_o_in_c_wi...

https://www.cs.uml.edu/~bill/cs516/context_paper_rse-pmt.pdf


Coroutines are still very useful for multitasking even if you're not sharing tasks between CPU cores. Other coroutines can just execute whenever something would normally block, removing the need for multiple threads in the first place.


This library aims to be the simplest and most efficient thing for what it does. And what it is perfect for is managing communications with things that themselves may be bound by I/O.

Handling multiple threads to access multiple CPUs would make it far more complex. And frankly a single thread is plenty for handling, say, a chat server.


coroutines as a control flow construct within a thread are quite useful.


I’ve used libaco with a lot of success in the past, mixing uv, curl, and zlib in a way that made the zlib loop easier to manage (it looked like a normal read/decompress loop with all context stack-local).

Any notes about why I would try Neco in the future?


This has support for ARM processors.

It's nice to have multiple options. For instance the interface can be very different.


It's a little lower level than what you have, but you may find some things interesting here: https://github.com/Keith-Cancel/Bunki


These are cooperatively scheduled green threads...

I feel the word "coroutine" is slowly losing its original meaning, similar to what happened to "lambda".


Coroutine is definitely appropriate here. Any implementation that has one function (routine) executing while others are paused is applicable. It always means cooperative multitasking, which this is. No meaning lost whatsoever.


Coroutine doesn't just mean cooperative multitasking, it means a specific form of cooperative multitasking. There's value in making that distinction.

Also, don't get me wrong, the meanings of words do change. I am grumpy about it, but that's fine. It just feels kind of weird to have people telling me it has always meant that.


OK, but what's the distinction? What are the different kinds of cooperative multitasking here?


I don't know where you're getting that impression.

These are stackful, symmetric coroutines with a built-in scheduler. It implements suspend and resume. You can call that a green thread if you want, or a fiber or whatever. They're coroutines.

The API docs make all of this clearer than the README. https://github.com/tidwall/neco/blob/main/docs/API.md


I think the replies I got here more or less prove my point.

So a coroutine is basically something you can explicitly yield control flow from or give control flow to. neco doesn't really let you do that. The caller of neco_resume isn't relinquishing the control flow to another coroutine. neco_yield looks like a coroutine yield, but for an asymmetric coroutine it should yield to the caller, and for a symmetric coroutine it should let you specify who you are yielding to. This one does neither.

And I feel if you stretch the definition of coroutine to include this, then you can basically call most multitasking systems coroutines, rendering the concept mostly useless.


These are coroutines. Just... look at the code, this isn't a proprietary binary product where you have to take someone's word for it. They're very clearly coroutines. I don't know what else to say here.


what was the original meaning of lambda?


Maybe they mean as in lambda calculus?

Although, deep in our hearts we all must know that lambda was already taken, it is an eigenvalue.


eleventh letter of the greek alphabet, derived from phoenician "lamed", etc.


How does it work on WebAssembly, where there's no native support for stack switching? Does it rely on Emscripten's Asyncify transformation?


Yes according to the caveats section in the llco subproject:

> Webassembly: Must be compiled with Emscripten using the -sASYNCIFY flag.


In "Example 4", I don't understand where the "arg2" in `free(arg2)` is coming from?


I assume it was meant to be arg1, to showcase that the malloc'd arg1 can be freed without causing a use-after-free once the coroutine has copied the value out of argv[1].


Can someone explain how this works please? How does it prevent this from saturating the CPU?

    while (1) {
        neco_sleep(NECO_SECOND*2);
        printf("tock\n");
    }


I encourage you to dig through the source. There's not that much.

Either this is a library you might use, so you should obviously review the code before you include it in your own work, or this is something that intrigued you, so you'll find clever ideas by digging into its implementation.

Here, neco_sleep eventually calls a function that tells the coroutine scheduler to pause the current (calling) coroutine and resume the next eligible coroutine in its collection.



