Note, a plain dd will report progress on stderr if sent a SIGUSR1. That can sometimes be enough to avoid the overhead of pv(1) and sending huge amounts of data through another pipe.
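For example, something like this (GNU dd on Linux; filenames here are just placeholders for a real long-running copy):

```shell
# Start a long copy in the background, then poke it for a progress report.
# GNU dd prints its records-in/records-out/bytes-copied summary to stderr
# on SIGUSR1 and keeps running. (On BSD/macOS the signal is SIGINFO instead.)
dd if=/dev/zero of=/dev/null bs=1M &
pid=$!
sleep 1
kill -USR1 "$pid"   # progress lines appear on dd's stderr
sleep 1
kill "$pid"         # done watching; stop the copy
```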
About the pv overhead, I sent a note to the pv author saying that on Linux at least the transfer function could take advantage of the splice system call. Splice "copies" data from one file descriptor to another file descriptor, avoiding having to copy the data from the kernel to userspace and back.
On FreeBSD (and maybe Mac OS X) you can press Control-T to send a SIGINFO signal (a non-standard extension). This prints out how far dd has gotten (and it may work with other FreeBSD utilities too).
N.B. also that dd on OS X will EXIT IMMEDIATELY if sent a SIGUSR1. It double-sucks because one tends to do this on long-running dd processes that you're wondering whether they're done yet...
If I'm dealing with [compressed] files on disk or across a network, what's the practical overhead of pv? A few extra percent of CPU, or a material decrease in throughput?
Right. I usually use pv in place of 'cat' at the start of a long set of piped commands. Anywhere you could use 'cat' you can use pv. If I'm not mistaken, I think you can also use multiple 'pv' commands at a time too.
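Something like this, say (the archive name is just an example):

```shell
# pv as a drop-in for cat at the head of a pipeline: it passes the data
# through unchanged while drawing a progress bar on stderr.
pv archive.tar.gz | tar xzf -
```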
> If I'm not mistaken, I think you can also use multiple 'pv' commands at a time too.
I use multiple pv instances to track compression/decompression rates, and overall progress, for tarballs. Using the "-c" flag (cursor positioning via console codes) keeps the instances from stomping on each other's output.
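Roughly like this, if I remember the flags right ("-N" names each bar; "backup.tar" is a made-up filename):

```shell
# Two pv instances in one pipeline: the first shows the raw read rate,
# the second the compressed output rate. -c keeps the two progress
# displays from overwriting each other; -N labels each one.
pv -cN raw backup.tar | gzip | pv -cN gzipped > backup.tar.gz
```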
Nifty command. Been using this for a month or so. One use case is measuring whether a downstream process can consume data as fast as the previous process is producing it (by alternating the second process with /dev/null or some such).
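Concretely, something along these lines (gzip and bzip2 stand in for the producer and consumer; "dump.gz" is a made-up filename):

```shell
# First: how fast does the producer run with nothing real downstream?
gzip -dc dump.gz | pv > /dev/null

# Then: does pv's reported rate drop once the real consumer is attached?
# If so, the consumer is the bottleneck.
gzip -dc dump.gz | pv | bzip2 > dump.bz2
```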
http://news.ycombinator.com/item?id=2567186