> A lobste.rs user asked why you would use find | xargs rather than find -exec. The answer is that it can be much faster. If you’re trying to rm 10,000 files, you can start one process instead of 10,000 processes!
Fair enough, but I still favor find -exec. I find it generally less error prone, and it's never been so slow that I wished I had instead used xargs.
Also, if you're specifically using -exec rm with find, you could instead use find with -delete.
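For anyone comparing the three approaches side by side, here is a minimal sketch. The scratch directory and file names are made up for illustration. `-print0`, `xargs -0`, and `-delete` are available in GNU and BSD find/xargs but may be missing on older or stricter POSIX systems.

```shell
# Scratch directory with a few throwaway files (names invented for this sketch).
dir=$(mktemp -d)
touch "$dir/a.tmp" "$dir/b.tmp" "$dir/c d.tmp"

# 1. -exec with \; starts one rm process per matching file.
find "$dir" -name 'a.tmp' -exec rm {} \;

# 2. find | xargs batches many names into few rm processes.
#    Null-delimited output keeps names with spaces intact.
find "$dir" -name 'b.tmp' -print0 | xargs -0 rm

# 3. -delete removes matches inside find itself, with no rm process at all.
find "$dir" -name '*.tmp' -delete

# Count what is left; tr strips the padding some wc implementations add.
remaining=$(find "$dir" -type f | wc -l | tr -d ' ')
echo "files remaining: $remaining"

rm -rf "$dir"
```

Note that `-delete` goes last: find evaluates its expression left to right, so putting it before the tests would delete everything under the starting directory.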
I tend to prefer xargs because it works in more contexts e.g. I've got a tool which automatically generates databases but sometimes the cleanup doesn't work. `find -exec` does nothing, but `xargs -n1 dropdb` (following an intermediate grep) does the job. From there, it makes sense to… just use xargs everywhere.
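A rough sketch of that kind of pipeline, as a dry run. The `list_databases` function and the `autogen_` prefix are hypothetical stand-ins for whatever really lists and names the databases, and `echo` is placed in front of `dropdb` so nothing is dropped.

```shell
# Hypothetical stand-in for a real listing command; these names are invented.
list_databases() {
    printf '%s\n' autogen_001 production autogen_002 staging
}

# grep picks the generated databases, xargs -n1 runs one command per name.
# The leading echo makes this a dry run; remove it to actually call dropdb.
planned=$(list_databases | grep '^autogen_' | xargs -n1 echo dropdb)
printf '%s\n' "$planned"
```

The point is that the input comes from an arbitrary command rather than from the filesystem, which is a case find cannot cover.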
And I always fail to remember that the -exec terminator must be escaped in zsh, so using -exec always takes me multiple tries. So I only use -exec when I must (for `find` predicates).
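For reference, a small sketch of the terminator spellings. `;` is a command separator in sh, bash, and zsh alike, so it has to reach find escaped or quoted, while `+` needs no quoting. The directory and file names are made up.

```shell
dir=$(mktemp -d)
touch "$dir/one" "$dir/two"

# Backslash-escaped terminator: one ls per file.
escaped=$(find "$dir" -type f -exec ls -d {} \; | sort)

# Quoted terminator: same meaning, different spelling.
quoted=$(find "$dir" -type f -exec ls -d {} ';' | sort)

# "+" terminator: no escaping needed, and the files are batched into one ls.
plus=$(find "$dir" -type f -exec ls -d {} + | sort)

rm -rf "$dir"
```

All three variables should hold the same two paths.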
i agree. `find somewhere -exec some_command {} +` can be dramatically faster. but it does not guarantee a single invocation of `some_command`, it may make multiple invocations if you pass very large numbers of matching files
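a quick way to see the batching is to let each invocation report how many arguments it got. this is only a sketch with five made-up files; `xargs -n 2` is used here to force the "multiple invocations" case that `{} +` only hits once the argument list gets too long.

```shell
dir=$(mktemp -d)
touch "$dir/f1" "$dir/f2" "$dir/f3" "$dir/f4" "$dir/f5"

# Each inner sh prints its argument count, so the number of output
# lines equals the number of processes that were started.
per_file=$(find "$dir" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')
batched=$(find "$dir" -type f -exec sh -c 'echo "$#"' sh {} + | wc -l | tr -d ' ')

# Capping batches at two arguments forces several invocations (2 + 2 + 1).
capped=$(find "$dir" -type f -print0 | xargs -0 -n 2 sh -c 'echo "$#"' sh | wc -l | tr -d ' ')

echo "per-file: $per_file, batched: $batched, capped at 2: $capped"

rm -rf "$dir"
```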
after spending a bit of time reading the man page for find, i rarely use xargs any more. find is pretty good.
tangent:
another instance i've seen where spawning many processes can lead to bad performance is in bash scripts for git pre-receive hooks, to scan and validate the commit message of a range of commits before accepting them. it is pretty easy to cobble together some loop in a bash script that executes multiple processes _per commit_. that's fine for typical small pushes of 1-20 commits -- but if someone needs to do serious graph surgery and push a branch of 1000 - 10,000 commits that can cause very long running times -- and more seriously, timeouts, where the entire push gets rejected as the pre-receive script takes too long. a small program using the libgit2 API can do the same work at the cost of a single process, although then you have the fun of figuring out how to build, install and maintain binary git pre-receive hooks.
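short of going to libgit2, a shell hook can at least avoid the per-commit processes by asking git for the whole range once and checking each line with shell builtins. a rough sketch: the `check_subjects` function and its rule (subject must be non-empty) are invented for illustration, and the sample input stands in for real `git log` output.

```shell
# Reads "hash subject" lines on stdin and validates them using only
# shell builtins (read, case), so no extra process is spawned per commit.
# The rule here (non-empty subject) is a hypothetical example.
check_subjects() {
    bad=0
    while read -r hash subject; do
        case $subject in
            '') echo "empty subject: $hash"; bad=1 ;;
        esac
    done
    return "$bad"
}

# In a real hook the input would come from a single git process, e.g.
#   git log --format='%H %s' "$old..$new" | check_subjects
# Sample data is used here so the sketch runs without a repository.
printf '%s\n' 'abc123 fix parser' 'def456 add tests' | check_subjects \
    && echo "all subjects ok"
```

that keeps the cost at one `git log` per pushed ref rather than several processes per commit.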