
I've got the impression that in practice, what keeps developers from letting trivially parallelizable tasks run in parallel is a) the overhead of dealing with poor parallelization primitives and b) the difficulty in properly showing the status of parallel invocations.

Having good features to support this in standard libraries would go a long way to incentivizing devs to actually parallelize.




In a way, it's a subset of the distributed systems tracing problem - you have multiple tasks running in parallel on the same node, but they will have been initiated as different (sub) tasks, and should be tracked by the specific task via which they were initiated. So systems like OpenTelemetry and Honeycomb can be great for this, allowing you to see events in aggregate as well as in the context of a trace that propagates between different threads and systems.

https://opentelemetry.io/docs/languages/python/getting-start... https://docs.honeycomb.io/getting-data-in/opentelemetry/pyth...

But there's so much complexity there that IMO it's best left outside of standard libraries - and it's indeed a daunting amount of new vocabulary for newcomers. I'm not aware of simpler abstractions on top of the broader telemetry ecosystem for monitoring simple parallelization, but arguably there should be one that keeps things quite simple.
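For what it's worth, a much smaller version of the "track a subtask by the task that started it" idea can be sketched with only the standard library. This is a rough illustration, not a replacement for OpenTelemetry: the names `task_id`, `subtask`, and `run_task` are made up for the example, and it assumes thread-based workers on a single node.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical name: identifies the parent task that spawned a subtask.
task_id = contextvars.ContextVar("task_id", default="unknown")


def subtask(n):
    # Runs in a worker thread but reads the submitting task's ID.
    return f"[{task_id.get()}] processed {n}"


def run_task(name, items, pool):
    task_id.set(name)
    # Pool workers are not guaranteed to see the submitter's context
    # variables, so each submit runs inside a fresh copy of the caller's
    # context. A single Context cannot be entered by two threads at once,
    # hence one copy per submit.
    futures = [
        pool.submit(contextvars.copy_context().run, subtask, item)
        for item in items
    ]
    return [f.result() for f in futures]
```

Each result line then carries the name of the task that launched it, which is roughly what a trace ID gives you, minus the aggregation and cross-process propagation.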


Agreed. I have been using joblib for a good few months. It is fine, but I still haven't figured out basic things like printing the status of process-based jobs.

[Parsl is much better, e.g., logging is built-in, but it can be a little overwhelming.]


Yes, totally agree. I’ve written some code and I’d rather convert it to C using cpython before I parallelize it. Python is horrible for both these things, and you may not even get a speed increase because of the overhead. The choice is either to use cpython and get a 10-100x speedup with a few lines of code, or to spend my whole day in a horrible mess of data structures, getting my functions to work properly with map, with maybe nothing to show for it.


Do you mean Cython? I've never seen a pure C extension in "a few lines"; the last time I wrote one there was a ton of boilerplate.


Every year I write a similar threaded CLI monitor. Now maybe rich can solve this for everyone, but I'm surprised it took so long to emerge.


Yes, this is 100% the type of thing that should be in a standard library, but also the type of thing I have no doubt the Python Steering Council would feel belongs in a third-party library.

We do see some cool stuff under the hood from core Python devs but interest in further quality of life features seems to be lacking.


What's a good way to show the status in the command line?

For a one-off project it seems simpler to just write an HTML UI.
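One low-tech option that needs nothing outside the standard library is to print a counter as futures complete. The sketch below is just one way to do it: `work` and `run_with_status` are hypothetical names, the tasks are assumed to be thread-friendly, and the single overwriting status line assumes a terminal that honors `\r`.

```python
import sys
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def work(n):
    # Hypothetical stand-in for a real task.
    time.sleep(0.01 * n)
    return n * n


def run_with_status(items, max_workers=4, out=sys.stderr):
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the item it was submitted for.
        futures = {pool.submit(work, item): item for item in items}
        for done, future in enumerate(as_completed(futures), start=1):
            item = futures[future]
            results[item] = future.result()  # re-raises if the task failed
            # "\r" moves back to the start of the line so the counter
            # overwrites itself instead of scrolling.
            out.write(f"\r{done}/{len(futures)} done (last: {item})")
            out.flush()
    out.write("\n")
    return results
```

Calling `run_with_status(range(20))` shows a single updating line on stderr and returns a dict of results. Anything fancier (per-task bars, elapsed time) is probably where a third-party library starts to pay off.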



