
When I'm using parallel, it's usually because I have thousands of jobs. Worse, they have nontrivial memory requirements. When you background them all with &, the system starts timeslicing. Each process gets to allocate its memory before being paused to make time for the next process. Your system will almost immediately crumple under load. Hopefully, the OOM killer will target your backgrounded jobs... but the script spawning them will go untouched because it isn't the thing hogging memory.

Before I learned of parallel, I tried a hack where I'd manually assemble jobs into batches, and wait on each batch before starting the next. It achieved very low system utilization, because inevitably, one job in each batch takes much longer than the rest. A slight improvement (still not good) is to use `split` to chop your jobs file into $num_cores chunks and background each chunk. But this still gets low utilization, the problem being that you aren't using a thread/worker pool.
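
For illustration, here's roughly what the batching hack looks like. This is a sketch, not my actual script: `run_job`, `batch_size` and the sleep durations are made-up stand-ins, with `batch_size` playing the role of $num_cores.

```shell
#!/bin/sh
# Hypothetical job: sleeps for the given number of seconds, then logs itself.
log=$(mktemp)
run_job() {
    sleep "$1"
    echo "finished job with duration $1" >> "$log"
}

batch_size=4   # assumed stand-in for $num_cores
count=0

# Durations in seconds; the single 1 is the straggler in the first batch.
for secs in 0 0 0 1 0 0 0 0; do
    run_job "$secs" &
    count=$((count + 1))
    # Once a full batch is launched, block until every job in it has exited.
    if [ $((count % batch_size)) -eq 0 ]; then
        wait
    fi
done
wait   # pick up a final partial batch, if any

finished=$(wc -l < "$log" | tr -d ' ')
echo "$finished jobs finished"
```

While the straggler is still sleeping, the other three slots in its batch sit idle, because `wait` with no arguments doesn't return until every background child has exited. That idle time is where the utilization goes.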

Parallel (or, TIL, xargs) can maintain 100% system utilization, until the very last $num_cores jobs.
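
A minimal sketch of the worker-pool version with xargs, assuming a jobs file with one shell command per line. The file contents and variable names here are invented for the example, and `-P` is an extension found in GNU and BSD xargs rather than something POSIX guarantees.

```shell
#!/bin/sh
# Build a hypothetical jobs file: one shell command per line.
jobs_file=$(mktemp)
out_file=$(mktemp)
i=1
while [ "$i" -le 20 ]; do
    echo "echo job-$i >> $out_file" >> "$jobs_file"
    i=$((i + 1))
done

# Worker count: online CPUs if getconf can report them, else an assumed 4.
num_cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 4)

# -P caps the number of concurrent processes at num_cores; whenever one
# exits, xargs starts the next line, so no slot waits on a slow neighbor.
# -I {} hands each input line to sh -c as a single command string.
xargs -P "$num_cores" -I {} sh -c '{}' < "$jobs_file"

completed=$(wc -l < "$out_file" | tr -d ' ')
echo "$completed jobs completed"
```

With GNU parallel, something like `parallel -j "$num_cores" < "$jobs_file"` should do the same job. Note the jobs can finish in any order, so the output lines aren't sorted.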


