There are command-line tools available to help with the transition from 'hack' one-liner to a more maintainable/supportable solution. For instance, drake (https://github.com/Factual/drake), a 'Make for data' that does dependency checking, would allow for sensible restarts of the pipeline.
The O'Reilly Data Science at the Command Line book (linked elsewhere in the comments) has a good deal to say on the subject: turning one-liners into extensible shell scripts, using drake, and using GNU Parallel.
I've been using GNU Parallel to orchestrate running remote scripts/tools on a bunch of machines in my compute and storage cluster. It's now my go-to tool for almost any remote SSH task that needs to hit a bunch of machines at once.