Linux tool to show progress for cp, mv, dd (github.com/xfennec)
540 points by jxi on May 19, 2023 | 177 comments



Wow, didn't expect to see @xfennec pop up on Hacker News while drinking my coffee this morning! I don't know if he'll see this; to be honest, I didn't know he was still doing things. But this person basically got me into programming and game development - I really can't believe it.

xfennec (and some friends, I think) built a game engine called Raydium, and one of their games, Mania Drive - a Track Mania clone - got distributed with openSUSE installation CDs back in the day. When I was just 12 years old, my dad installed that on the family computer, and it was all we had; Mania Drive was one of the coolest games on there. My siblings and I played it for literally days and months on end, making crazy levels we couldn't beat without knowing every turn. It was a huge part of our childhood.

Their game engine was in C with PHP scripting. I remember posting some levels to their forums and asking, in retrospect, super dumb questions, and they were so polite and friendly. I remember us joking at the time that the French seemed like these god-like game developers. It had such a profound impact on us that I even wrote about it last year and linked a video of Mania Drive first[0]. I went on to learn Python and then lower-level languages as a result. I'm not sure I'd be coding today without them, to be honest.

Sorry it's off-topic, just really blown away to see a username like that pop up in my feed. It really goes to show that kindness plus some cool open source software can have a profound effect on people.

[0] https://devlog.hexops.com/2021/increasing-my-contribution-to...


Xfennec here, thank you so much for this message. ManiaDrive was a small game made with a bunch of friends, I'm so glad it had an impact on you. I'm now a dad, and it makes me very emotional to read this. Thanks again.


Things like this are what make the internet feel so human sometimes!


The internet used to feel so human most of the time back in the mid-90s, because there was somehow an awareness that the person you were talking to was a human, in a way we've lost.

Those were the days when you could email a webmaster and get a response, when you hung around with like-minded people in self-organised communities built around a shared interest - like Reddit without the massive negativity and astroturfing.

The modern internet (or web, really) has lost much of that flavour. There are a few places that still maintain it with anonymity - HN is one, specific subreddits another, and the IRC server I've hung around on forever. I've friends on there going back more than 15 years; we were all early-20s computer geeks, and now a lot of them are married with kids or have kids on the way. I know a tonne of details about them personally and sometimes don't know their real names.

I do see some echoes of that world in some modern platforms (Discord servers for specific games - in my case Arma - capture some of it).


Such a great story. This part made me lol:

> // Don't remove this print statement. Game will crash!

:-)


I'd never heard of that game. Trying it now and it's great fun. Really enjoying the soundtrack.


I love stuff like this, thanks for sharing!!


This is one thing I really miss on Linux when compared with the BSDs, especially with dd(1). On BSD you can press ^t to see the status of a command. All you need to do is issue this command to activate it:

% stty status '^t'

btw, for Linux dd(1) I know about "status=progress", but for me it is a bit hard to remember, and it's specific to dd(1). But nice little utility :)
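
For reference, a typical invocation looks like this (the image and device names are placeholders):

  dd if=debian.iso of=/dev/sdX bs=4M status=progress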


    dd if=<input> | pv | dd of=<output>
to get a count of bytes passing through, or

    pv <input> | dd of=<output>
to get actual completion progress. For tarchives,

    pv <tarchive> | tar xf -
Compression progress:

    pv <file> | bzip2 > <file>.bz2


This is also handy when using socat to pipe across a network. You can use pv on both ends, one looking at compressed and the other looking at uncompressed data, in order to observe the real-time compression ratio.

tar c foo | pv | gzip | socat - tcp-listen:9999

socat tcp:bar:9999 - | pv > foo.tar.gz

If pv shows that you aren't saturating your network and are CPU-limited, replace gzip with lzop. If vice versa, replace gzip with something more aggressive.
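
E.g., a sketch of the lzop swap (lzop, like gzip, acts as a stdin-to-stdout filter when given no file arguments; the rest of the pipeline is unchanged):

  tar c foo | pv | lzop | socat - tcp-listen:9999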


pv has a multi-line mode, so you can do

tar c foo | pv -cN raw | gzip | pv -cN compressed | whatever else

It’s handy to see multiple progress bars at once.


zstd --adapt does that adaptation for you; it used to have a few rough edges though.


pv also has a little-known ability to watch an already running process, similar to the progress tool.

Try

  cp $BIGFILE ${BIGFILE}_1 & pv --watchfd $(pidof cp)


'pv' is great, but you need to have the foresight to use it in your command before you run it. 'progress' seems great for those cases where you didn't realize your job was going to take so long, and you don't want to start it all over again.


Indeed. I created some aliases a few years ago to use pv in these cases. At least until I developed the habit of thinking about pv while composing the command.


You can get status on-the-fly from dd on Linux as well by sending the USR1 signal.

  [user@machine ~]$ dd if=/dev/zero of=/dev/null &
  [1] 3254428
  [user@machine ~]$ kill -USR1 %1
  19061394+0 records in
  19061393+0 records out
  9759433216 bytes (9.8 GB, 9.1 GiB) copied, 6.06968 s, 1.5 GB/s
  [user@machine ~]$ kill -USR1 %1
  25868762+0 records in
  25868762+0 records out
  13244806144 bytes (13 GB, 12 GiB) copied, 8.97352 s, 1.5 GB/s
  [user@machine ~]$ kill %1
  [1]+  Terminated             dd if=/dev/zero of=/dev/null


This only works because you're allowing backgrounded processes to write to the terminal (stty -tostop).

Try stty tostop

  kaz@sun-go:~/txr$ stty -tostop
  kaz@sun-go:~/txr$ dd if=/dev/zero of=/dev/null &
  [1] 14604
  kaz@sun-go:~/txr$ kill -USR1 %1
  kaz@sun-go:~/txr$ 1142200+0 records in
  1142200+0 records out
  584806400 bytes (585 MB, 558 MiB) copied, 2.62053 s, 223 MB/s

  kaz@sun-go:~/txr$ kill -USR1 %1
  2054884+0 records in
  2054883+0 records out
  1052100096 bytes (1.1 GB, 1003 MiB) copied, 4.70066 s, 224 MB/s
  kaz@sun-go:~/txr$
  kaz@sun-go:~/txr$ stty tostop
  kaz@sun-go:~/txr$ kill -USR1 %1
  kaz@sun-go:~/txr$ kill -USR1 %1

  [1]+  Stopped                 dd if=/dev/zero of=/dev/null
  kaz@sun-go:~/txr$ kill -USR1 %1

  [1]+  Stopped                 dd if=/dev/zero of=/dev/null
  kaz@sun-go:~/txr$


I just backgrounded the process for pedagogical purposes. Usually a process I'm concerned about is running in the foreground anyway, so I signal the process from another terminal.


Ah, exactly the comment I was hoping to see on here as I forgot how you could query this. Thanks!


Glad to help. Luckily, it's noted in the manpage for dd, so I can know it even after I forget!


ahhh, nice! Maybe I would have eventually found it there, if I thought to look.. probably not though! haha :)


How often do you use dd that this really matters? Just curious. I’ve run dd maybe 20 times in the 5-6 years I’ve worked with Linux professionally.


Probably not used as much by programmers, but it's a great Swiss-army-knife type of tool for copying stuff around with more fine-grained control. Sysadmins use it a ton, probably more often through scripts than directly.

The nice thing with ^T versus status=progress is that ^T will work even when dd was invoked from some script that you don't necessarily want to edit, etc.


It's also a quick and dirty tool for wiping out partition headers on a disk.


Yeah, and backing up the MBR, creating files of a given size like keys, yadda yadda. Anyone getting nitty-gritty with storage media, legacy hardware, etc., should know their dd.

I first came to love it learning how to fix my bootloader and undo all sorts of horrible things I'd done to my system when I was a teenager exploring Linux well over a decade ago. I also worked in repair for a while and there it was indispensable for data recovery purposes along with its cousin ddrescue.

Now I just use it for my constant tinkering needs.


Recovered a few USB drives this way that wouldn't be recovered any other way (or even recognized beyond a /dev/* entry).


wipefs -af


^t (SIGINFO) works with a lot more than just dd. It also triggers kernel behavior (printing the current waitchannel, e.g., named lock, if one is relevant) which can be useful for diagnosing why a command (any command) is hanging. (Is it busy in userspace, or in the kernel? And if in the kernel, where?)


I used it a lot (didn't miss the progress bar, though). With its options to control caching, it can be used as a bare-bones single-threaded performance test for sequential access. It won't replace more elaborate test suites, but it has its use as a sanity check.
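
E.g., a rough sketch of such a sanity check (oflag=direct bypasses the page cache so you measure the device rather than RAM; the output path is a placeholder):

  dd if=/dev/zero of=/mnt/disk/testfile bs=1M count=1024 oflag=direct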


I use it constantly. My company's products are flashed using an SD card, and to create it I dd to and from the SD card several times a week.


I feel that dd is never the correct tool to write to SD cards. At least cp can figure out the block size itself. Or the more elaborate bmaptool can even skip empty blocks, which are often found in disk images.


> Or the more elaborate bmaptool can even skip empty blocks, which are often found in disk images.

  dd conv=sparse


> At least cp can figure out the block size itself.

Why does it matter? Use `bs=$((4 * 1024 * 1024))`. It'll work perfectly for any imaginable block size.

My issue with dd is that it's possible to write corrupted data with some weird flags, which I did once. Something with conv=sync, I believe, which does unexpected things (it pads short reads with zeros). But if you're not trying to be too smart, dd works fine.
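
For what it's worth, a conservative invocation for writing an image (conv=fsync and status=progress are standard GNU dd options; the device path is a placeholder):

  dd if=image.img of=/dev/sdX bs=4M conv=fsync status=progress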



If you get into anything Raspberry Pi-based, you'll do it a lot. They are shockingly stable once booted, but an M.2 port is my dearest wish for the Pi 5.


I occasionally have a new stupid idea for which I want to use a fresh install on a Raspberry Pi, so I flash a new SD card with dd maybe 4-5 times per year.

Inevitably it turns out my idea was not as useful as it seemed when it first popped in my head, so after a few weeks/months that Pi is turned off and returned to the pi drawer.... ready for the next brilliant idea.

(ps. I typically use the `pv` command to see progress with stuff other than dd)


> returned to the pi drawer

Love that philosophy of keeping an inventory (recycled, in this case) of Raspberry Pis!


I thought it was the recommended best practice, like with resistor series, 100 nF ceramic capacitors, and the more common op-amps.

Do you buy a Pi when first starting a project?


It sure is best practice, with them still costing like 4x MSRP to get your hands on thanks to the never-ending shortage.


dd is the command to use when transferring bootable images onto USB dongles for installations; when working with embedded boards, especially for testing, one can use it 20 times or more in a single day. Having progress feedback isn't vital per se, but it becomes useful when you've mistakenly used a slow USB dongle, or plugged it into the wrong, slower, port.



I very rarely use dd. And thank $DEITY for that, because my typical use case is creating a USB boot drive and then backing up or recovering what I can from a failing disk.


The infrequency just makes the issue worse, because every time I want to see progress, I remember that I needed to add a special option to dd.


Regardless of frequency of use, I always want progress and to verify we haven't stalled. With no output, I'm at the mercy of $RANDOM_BS.


At least 10 times this year so far, to copy an image to SD cards. It's reliable and easy.


I'm astonished that the BSD projects are merging un-Unix-like fluff like this.

Meanwhile, the Linux kernel has removed Shift-PgUp scrollback from the console.


What does this mean? IIRC ^t has been in the BSDs for over 20 years, maybe even since the early days. ^t just maps to a signal.


I can find SIGINFO all the way back to 4.3BSD-Reno, released in 1990.


Why did they do that? I'm sure there's a good reason but it sounds interesting!


The good reason is apparently that scrollback in vgacon was broken and there was no-one to fix it. No-one has stepped up to volunteer to keep it fixed in the future in the two and a half years since then.

* https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux...


Long-term we'll probably end up with the console in userspace, which should naturally normalize scrollback behavior across all the kms drivers.

There's already kmscon, but I haven't looked closely into what prevented that transition from moving forward as the default. Presumably there are a bunch of edge cases a kernel-based console doesn't suffer from, like having to avoid the OOM killer or getting paged out/becoming unresponsive on systems that are falling over, etc.

The console process will have to be something exceptionally configured: pinned memory, realtime priority, OOM-immune, all that jazz.


Not really. Most of such a system can just be ordinary processes, and there's no requirement for pinned memory anywhere. Pretty much only the terminal emulator processes are problems if they die, as that effectively becomes a hangup signal, so only they need the oom immunity, if that.

* https://jdebp.uk/Proposals/linux-console-daemon.html

What happened to kmscon is that the systemd people, having adopted it with much hullabaloo, dropped it a while later with far less fanfare.

That said, there are a range of options for people who want to build Linux without CONFIG_VT.

* https://jdebp.uk/Softwares/nosh/user-vt-screenshots.html

* https://jdebp.uk/Softwares/nosh/guide/user-virtual-terminals...

(-:


WAT, how is this "un-Unix-like"?


Unix-like is that the program is dead silent while nothing is going wrong and then indicates its progress by a successful termination status.


It is silent until you press ^t, and then it prints the current progress. It's not constant output.


Sure, but the BSD convention meets that: until you hit ^T it's silent, then it does the thing the user wants (gives a report), and then it goes back to being silent. There's nothing wrong with that, and the BSDs are Unixen, so... It's not like ^Z (SIGTSTP) is un-Unix-like just because V7 Unix (and System V) didn't have it!


Unix-like sounds unreasonable if there isn’t leeway for optionally reporting progress on stderr (--quiet/--non-quiet or similar).


No news is good news


UNIX-like is having your session or app time out because it gets no output - going all the way back to HP OmniBack II running BCV split pre-exec scripts on an EMC array to back up a database to tape off a split mirror, then aborting the backup because the array-side mirror split took over 15 minutes with no output.

UNIX-like is not a convention of UNIX - it's a convention of "we didn't get to making that useful thing yet, there's other stuff to do."


Perhaps an alias in your .bashrc for dd would solve it? I usually just use ddrescue most of the time (primarily because I prefer its usage syntax, but it also reports status).

Similarly you could alias rsync instead of copy and move:

   alias pcp='rsync -au --info=progress2'
   alias pmv='rsync -aP --info=progress2 --remove-source-files'


There's also dd's status option:

       status=LEVEL
              The LEVEL of information to print to stderr; 'none'
              suppresses everything but error messages, 'noxfer'
              suppresses the final transfer statistics, 'progress' shows
              periodic transfer statistics


and pv

    pv - monitor the progress of data through a pipe

So with pv, you can do something like

    dd if=/dev/zero count=2 bs=512 | pv | dd of=/dev/null
to visualize your dd progress.


I am aware of both of those, but in my case I prefer ddrescue for the syntax. For backup scripts I use dd and pv (and pigz).

I recall I started using ddrescue because one of my distros would not autocomplete with the if= prefix, but I can't seem to replicate it currently - so it's probably been fixed (or my memory is failing me).


Ctrl-T works for me with dd on Linux.


Not here. I guess you've somehow got ^T to send SIGUSR1 to the foreground process?


I think I'm mistaken. Tested on a few Ubuntu machines and it didn't work. I must be conflating my experiences between local on MacOS vs remoting to Linux machines.


Since macOS is based upon a BSD, ^t should work fine on it.


Can't speak to the others, but on FreeBSD you don't need to activate ^t (SIGINFO); it just works that way out of the box.


kill -USR1 {dd pid} ;)


The problem with USR1 is that it defaults to killing the process, so you may only use it with processes you know handle it that way.

INFO defaults to dumping generic stuff, so it's completely safe (unless a dev decided to handle it by dying, but I would not want to use their software).


This is with dd though, which will print the progress with a USR1 signal. It's not for other binaries.


FWIW, dd will give a status if signalled: SIGUSR1 on Linux, SIGINFO on macOS.


Linux dd also spits out status if you kill -USR1 it, which is useful when you forgot status=progress.


Don't try to remember adding status=progress. Add a shell alias for dd.
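
A minimal sketch (dd operands can appear in any order, so the prepended option works with normal usage):

  alias dd='dd status=progress'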


And then remember to copy that to every machine you use, I guess. Including one-offs.


You personalize your shell, don't you? At least I do. My dotfiles, such as .zshrc, are in a git repo, which I check out on a new machine.


I don't personalize shell environments on every machine I ssh to, no.


I tend to set up my servers, VMs and machines with Nix. It takes care of having the same setup everywhere I interact with.


Or automate it.


You can send a USR1 signal to dd and it will print its progress.


I use the pv "Pipe Viewer" tool to do the same. You can either put it in the middle of a pipe, or pass it a PID using -d (--watchfd): http://www.ivarch.com/programs/pv.shtml

It works by reading /proc/PID/fdinfo/*
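
A minimal example (assuming a single dd process is running):

  pv -d "$(pidof dd)"    # one progress bar per regular file or block device it has open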


I used pv's rate limit when doing a multi-terabyte zfs send, to slow it down and speed it back up as required.
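
A sketch of the idea (dataset names and rates are placeholders; -L sets the rate limit and -R retunes an already-running pv):

  zfs send tank/data@snap | pv -L 50m | ssh backup zfs recv tank/backup
  pv -R "$(pidof pv)" -L 200m    # later: raise the limit without restarting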


why would one want to slow it down?


To stop from saturating a slow shared connection for an extended period? This kind of transfer can interfere with what other users are trying to accomplish over the same link.


ah, sure thing!

you want some kind of managed buffer (AQM/SQM/QoS) in your bottleneck router.

OpenWrt or any Linux can do it, as the needed parts (the fq_codel and cake qdiscs) have been upstream for some years now, and wrappers like firewall distros or LibreQoS make it available via a GUI.

there is also some proprietary SOHO gear that allegedly works; big iron is still on RED/PIE without fair queueing, afaik


QoS is sort of a last resort, since it works by basically breaking TCP intermittently (dropping ACK segments). Any time you can throttle further up the stack is ideal.


tl;dr: SQM is needed in most bottleneck routers, as endpoints cannot be relied upon to know or respect resource constraints.

yes and no. "throttling up the stack" is crude but effective in a narrow set of circumstances (it just takes an additional transfer from another device to render it ineffective).

ACK thinning (dropping ACK segments) is supported in cake, but it's not a throttling mechanism; it's for very asymmetric lines such as DOCSIS, where surplus ACKs can saturate the upstream.

dropping regular packets is the only way to reliably communicate "slow down" between TCP endpoints (yes, ECN exists), but this is not the main selling point here. every queue has to drop packets at some point; codel does it in a smart way.

fair queueing dissects diverse packet streams into flows (same set of src/dst IP/port) and separates them in order to minimize interference. this is what you want basically everywhere, as IP networks are by definition "best effort" and no amount of userspace wish-making (DSCP, L4S) is going to fix that.


thank you gerald for getting this all exactly right.


On a VMware virtualized infra, when you migrate a virtual machine from one (SSD, very fast) datastore to another (mechanical, much slower) datastore over a very fast link, sometimes you get a corrupted VM on the target.


uh that does not sound like a feature ;)


I've used pv before to down-throttle a massive wipe of old database entries in production. Doing it full-speed would have killed the site for active users.


fair use :)

though it suggests to me that you might want to investigate your I/O scheduling arrangements


in my case it was because the drives it was writing to did not like going fast (SMR) and would fail. replugging them in caused ZFS to fix them, but I'd rather it slow down until I'm physically there, so it's not got so much to resilver that it fails yet again


you have my sympathies


so that you can stop the activity during work / life hours and let it run while you're asleep, perhaps? so if your internet isn't great you don't lose all of your capacity to use the internet


sure. see my other comment regarding SQM/QOS. this can be fixed


I use pv all the time and didn't know you could use it to watch an existing process. Thanks for the tip!


That's interesting! I so often found myself forgetting to turn on progress flags for data transfer jobs and the occasional data-transform batch job that I looked into something like this.

I found that `iotop` is great for this kind of thing. Sure, you have to either start it before your process starts or your accumulated total is off, but usually I'm not tracking progress for files less than 1GB so being off by kilobytes is fine.

My go-to's are `sudo iotop -aoP` for general monitoring, adding the `-p` flag if it's just a specific process, or `-u` if I'm monitoring something that is possibly transient.


One of my quick-and-dirty gotos for getting a rough idea of buffered-writes size + disk-write activity on random linux systems is: `watch -n1 grep -ie dirty -e writeback /proc/meminfo`.

You can invoke `sync` to watch the buffered-writes queue burn down when you have lots of pending writes.

see: `LESS=+/meminfo man proc` or https://github.com/torvalds/linux/blob/master/Documentation/... for more info


I also often forgot to add progress flags, but lately I don't even bother... I just start, then `progress -w` or `watch progress -w`. Works nicely.


> It simply scans /proc for interesting commands, and then looks at directories fd and fdinfo to find opened files and seek positions, and reports status for the largest file.

Wasn't expecting something as simple as that at all. Bloody ingenious.


IMHO, it would be much better if Linux implemented SIGINFO. It has been present in the BSDs since forever, and there is a good Linux implementation: https://lkml.org/lkml/2019/6/5/174


Random aside - I know formatting discussions border on the religious (and why something like gofmt is the only correct answer, and yet I am also good with spaces for Python) but...

Did anyone else look at the code and ask themselves: what is the actual formatting standard being used?

It looked like a mix of "open brace on same line, 4-char indent for code" and "open brace on new line, code at the same zero indent".

Not a big deal, obviously. Just something that tripped up my eyes scanning the code.


From a quick look at one file, it seems fairly consistent. Perhaps the most unorthodox but at the same time easiest-to-justify choice: function bodies skip one level of indentation. The curly brace of the function body at column zero is actually a separate, very traditional style; it's just that both styles apply to function bodies - there is no other causal relationship. Finally, indentation is 4 spaces, and tabs are expanded. Of course a very few places may be mis-styled, as happens with hand-crafted code. While definitely sinful for not indenting with tabs as God intended, the style is not messy.


Yeah it just looks messy and harder to read when it’s inconsistent


> Did anyone else look at the code and ask themselves: what is the actual formatting standard being used?

Looks like a combination of Whitesmiths and ChatGPT.


Related:

Show HN: Linux tool to show progress for cp, rm, dd, etc. - https://news.ycombinator.com/item?id=8023713 - July 2014 (53 comments)

(as it doesn't seem to have been posted by the creator, Show HN was a mislabel there)


I just use pipeview (pv) in all pipes where I need visibility https://linux.die.net/man/1/pv


I've been adding status=progress to my dd commands and getting progress reports for years now, not sure when it started working like that but any current Linux should have it.


Newer versions of Linux dd support status=progress.

iirc, dd on *BSD will show progress on ^T because it sends a SIGUSR1


I actually believe it sends a SIGINFO, not a SIGUSR1. It's unfortunate that SIGINFO never made it to Linux, it's an incredibly useful feature.


Indeed, but on GNU dd you can send SIGUSR1 to do the same as what SIGINFO would do on BSD.

Downside of SIGUSR1 is that it will kill the process if there's no handler, rather than being ignored, so "just try it" is always risky.


I used to have a habit of running 'killall -USR1 dd' to get the status of a dd process.

Then I switched to using sddm for my display manager/login screen. It didn't have a USR1 handler. If you kill your display manager, you're forcefully and suddenly logged out of your session.

I stopped using 'killall -USR1 dd' for dd status.


You could make that a bit safer by first checking for SIGUSR1 in the SigCgt mask in /proc/<pid>/status.
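
A rough sketch of that check (assuming a single dd process; SIGUSR1 is signal 10 on x86 Linux, so test bit 10 of the hex SigCgt mask):

  pid=$(pidof dd)
  mask=$(awk '/^SigCgt:/ {print $2}' "/proc/$pid/status")
  if (( (0x$mask >> 9) & 1 )); then
      kill -USR1 "$pid"
  else
      echo "no SIGUSR1 handler; not signalling" >&2
  fi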


It's SIGINFO, and this is a common thing on the BSDs.


If you have a program that can spew lots of output about the progress it is making, you can redirect it to Pipe Watch:

https://www.kylheku.com/cgit/pw/about/

Pipe Watch continues to read from the pipe even when backgrounded.

Pipe Watch shows you snapshots of the text that is passing through it. You can set triggers and filters and such. The triggers work even when it's in the background, not refreshing the display.


I've used pv, Pipe Viewer, since forever:

pv largefile.sql | mysql -u root -psecret


I use rsync, even locally. It's one of the best tools ever written, and my go-to for diffing whole directories. Have a nice day.


The method this uses is cute, hacky, and useful. Makes me want to write a macOS background thing that uses the same scheme and pops up progress windows whenever I run a coreutils thing.


The "how does it work" section doesn't make sense for macOS since there is no /proc there. How does it work on macOS?! I tested it and it works like a charm!


Look at progress.c, starting with "#ifdef __APPLE__", and keep looking from there. It uses something called libproc; there are some headers[1] I've found, but I couldn't find any man pages, unfortunately. You need some way to look at open FDs, and every system will have such an API, even if it looks slightly (or very) different from the next one.

[1]: https://opensource.apple.com/source/xnu/xnu-7195.81.3/libsys...


Nice, but I wonder why the actual tools don't already include this.

I even recall cp having been patched with a progress bar maybe a decade back in Gentoo, but for some reason that didn't stick.


I think it is probably because of the UNIX philosophy of the virtue of silence: 'The program should say nothing if it has nothing interesting to say'.

It is an open question whether progress is 'interesting' or not. My own opinion is that it is not interesting if the operation is nearly instantaneous. If any operation can take more than 5 seconds, it should have a progress bar and an estimated time of completion.

Earlier versions of Windows did this very well. Often these progress bars were a joke, but sometimes they were useful. It gave us valuable input on whether there is enough time to go get a cup of coffee, which as we all know, is the most important question for us all.


By the way, regarding Windows progress bars: even if a particular implementation is garbage - for example, jumping to 50% instantly, then sitting there for a minute, then jumping to 75% and staying there, and so on - they are still useful if you are using the same tool or operation repeatedly. My VPN progress indicator is junk, but I can roughly guess where the problem is based on where it got stuck, simply because I've already seen multiple failure modes of it. Or if I'm installing some C++ redistributables repeatedly, I can guess how long remains.

If an operation is silent and hogging my prompt without any indication of what is going on - whether it's stuck, whether the speed is too slow, etc. - it is simply frustrating.


> Earlier versions of Windows did this very well.

Almost. For some reason the Win95 and Win98 installer needed 75% of the time to go to 99% and 25% of the time to go from 99% to 100%.


Another dimension is whether the (what is the correct noun here?) is a TTY or not. Arguably a progress display, especially one that deletes itself afterward, is a useful default when it's a TTY, but obviously it makes scripting annoying if you have to constantly add -q or -s (looking at you, CURL) to every command to shut it up.


Isn’t progress reported on stderr?


True - I think my usage of "TTY" was incorrect.

What I was thinking of is: a user invoking a script interactively, which happens to call a tool (like CURL), may not want to see curl's big progress song and dance for every HTTP call the script makes, though a user invoking `curl` literally would likely appreciate it. But I think in both cases curl sees a TTY, right? Oops.

You imply a great point: doing `2>/dev/null` in those scripts is at least consistent, and I just need to make a better habit of doing so.


For example, Linux scp has a progress bar and prints success results by default. I wish more tools did this.


Earlier OSes and applications with only cooperative multitasking had to do this well; if a misbehaving application/operation was running it could just monopolize the system-wide event loop and stall the entire system.


I think all Git commands show progress if they end up taking more than 2 seconds.


I do maintain the mv/cp progress patches, but they recently broke when they added another feature.

coreutils refused the patches, saying they are feature-complete, whilst they don't have progress support, nor Unicode support. Stubborn.

https://github.com/rurban/coreutils/


I'm going to go out on a limb here, but I would say that coreutils' main users aren't humans, but scripts. GUIs have had progress bars for ages as well.


My thoughts as well.

I think commands could have progress, but only as an explicit option. I wish curl and 'docker pull' didn't spew so much text by default, filling up so many Jenkins servers' disks.


I know you said “by default” so I’m assuming you know some ways around this, but in case you don’t, or for anyone who might find this useful … you can be really specific with curl outputs. It’s pretty sick.

  curl -sS -w '%{http_code} %{http_version}' https://www.google.com -o /dev/null

https://everything.curl.dev/usingcurl/verbose/writeout

I’ve only recently had to use curl’s features more in-depth, so I’m still fascinated by its potential.


Thanks! This sounds like a great tool for simple scripts where something like a quick response code (etc) would be useful to know. I'm imagining something like:

Creating ticket... (200 OK)


It's more likely that this is the entire "upstream downstream" development model in action again.

There are vast numbers of improvements to softwares that never get sent to the places where they will do the most good (or at least have a proper record of why they were rejected by their authors). A quick perusal of the bug trackers of Debian or Ubuntu will reveal tonnes of local patches that the original authors of the softwares often never even hear about.

Things don't "stick" often times simply because they get lost.

I was looking at something like that just the other day. Here's a bug report that describes a problem with "doas" not opening the controlling terminal to do its authentication dialogue. This is actually a problem with a package named LinuxPAM, and doesn't occur when "doas" uses OpenPAM or BSD Auth. It's LinuxPAM that's where the code in question is. Fixing LinuxPAM would improve the lives of everyone that uses LinuxPAM, because the behaviour of not allowing standard input through in a command pipeline is not confined to "doas" but affects everything that uses LinuxPAM to do login authentication.

But time and again stuff like this languishes in the wrong place, for years and decades.

* https://github.com/slicer69/doas/issues/17


I went looking and I found this:

* https://github.com/jarun/advcpmv

This one got lost because the original author's WWW site just vanished, according to the doco.


"Make each program do one thing well"

The philosophy surrounding these tools wouldn't lend well to each implementing that. Fortunately `pv` exists, is perfect for this use-case, and included in many distributions. :)


`alias cpProgress='rsync --progress -ravz'`

has been in my ~/.bashrc for the majority of my career for large file transfers.


`rsync` interprets paths a little differently than cp (such as trailing / or no trailing / on directories). Do you have a convention you normally call this with to avoid differences in behavior?


Good question, since shells' completion behaves differently in adding trailing slashes.
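
For reference, rsync's trailing-slash rule (a minimal illustration, assuming src is a directory):

  rsync -a src  dest/    # creates dest/src/...
  rsync -a src/ dest/    # copies the contents of src into dest/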


Oh, this is fun! It reminds me of a Bash script I wrote years ago that does something very similar[1].

[1] https://xn0.co/rp


Great to see this here! I've never used it until now because instead I used a 20 year old shell script (later converted to Python by a friend) to do the same thing.

I gave it a try on a ddrescue job (unfortunately not recognized by default) and the estimated remaining time varies quite a bit between what ddrescue says and what progress says. I think ddrescue uses a larger moving average window and it seems to give more accurate estimates, although they are still far from perfect.


Cute, clever. There may be edge cases it doesn't get right (parallel downloads/copies), but it's pretty useful when you've launched a job and didn't think ahead of time to ask for progress.

I've used a signal handler (SIGHUP, aka control-c) before as a "show progress" mechanism, which I found very useful for monitoring long-running processes that were compute- rather than IO-bound (launched in screen, FWIW, to stay the active process).
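
A minimal sketch of that trick in shell (trapping SIGUSR1 here, though any catchable signal works; process_one is a hypothetical unit of work):

  #!/bin/bash
  count=0
  trap 'echo "processed $count items so far" >&2' USR1
  for f in *.dat; do
      process_one "$f"    # hypothetical per-item step
      count=$((count + 1))
  done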


Very cute and very clever. I think this could be extended with a watcher/main loop of its own, keeping a bit of context to spot any long transfers (say, longer than 5s), and displaying a system-wide summary of all large transfers going on.


These use the USR1 signal to show progress. HUP is kind of a weird choice, since usually the intent of ^C is to end the program.


Yep I know. It was easier to do in the same window with the program in the foreground vs opening another window to send the kill -HUP to this one. Double control-c still killed the process.



pv is invaluable, but my issue with it is that it's only useful if you think to include it before you run the command. Usually the thought doesn't occur to me until I have some runaway cp or dd and I want to be reassured that it's going to be done soon. For that case, this looks super interesting.


You can attach pv to an existing process. IIRC it's the -p option.


-p is the same as "--progress":

https://linux.die.net/man/1/pv


Ah, it was -d I was thinking of. Anyway, the point is that you can run pv retrospectively just fine while your slow command is running.


wow I usually just use rsync with the progress flag but this looks way cooler. thanks!


I'm sure it's nearly impossible, but this needs to be integrated with cp, mv and dd. Otherwise, usage will be a fraction of a percent.

I SSH into many, many different machines (desktops, routers, servers, IoT devices, etc.), and unless something is in the default install of RHEL, Debian, Arch, etc., I tend to not rely on it, as my muscle memory will cause me problems.


I have the same issue, but with the opposite conclusion: if this were integrated with cp/mv/dd, I wouldn't be able to rely on getting the new versions.

But since it is separate, I can install it where and when I need it, perhaps with the transfer already in progress. That might be hard for IoT/containers, but do I really need to see file copy progress in those kinds of places?


Good point on updates. But if Debian pulled in the latest every 2 years (the Debian release cycle), wouldn't that be enough? I can't imagine improvements being so urgent.

Also, agreed on the need for this on IoT stuff. But it's about muscle memory. I want to type the same way on all devices. Not "oh, I'm on a server, use this command; oh, I'm on an IoT device, use a different command".


There is also watch, a Linux command.

I blogged about it here, along with a Python program inspired by it that I wrote:

https://jugad2.blogspot.com/2018/05/a-python-version-of-linu...


And that drawing of the watcher at the top of the post is by yours truly :)


On FreeBSD you can press CTRL+T to show progress. If I remember correctly, that's a special shortcut similar to CTRL+Z/C supported by BSD systems. There were some discussions a few years ago that Linux should introduce the same thing, but they never did.


Why mv? mv doesn't really move bytes around, right? It just moves pointers around, right?


That depends on where you are moving your files from and to. If they are on different filesystems, then no - it can't just relink; it has to copy the data.


Only if the move is within the same volume.


It also works with ffmpeg, but only if you specify it:

    progress -wc ffmpeg


I always forget about pv, though I use dd's status=progress.

When I need it in the middle of a process, I just `ls /proc/<pid>/fd` and then take that fd number to `cat /proc/<pid>/fdinfo/<fd>`.
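
A sketch of turning that into a percentage (assuming the fd points at a regular file; pick the fd number from ls -l /proc/<pid>/fd):

  pid=$(pidof cp); fd=3
  pos=$(awk '/^pos:/ {print $2}' "/proc/$pid/fdinfo/$fd")
  size=$(stat -Lc %s "/proc/$pid/fd/$fd")    # -L follows the fd symlink
  echo "$((100 * pos / size))% done"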



I still think it's weirdly unreasonable that the GNU coreutils don't give you the option to print progress in those commands.



I'm curious to see how well it works when using -v.

I have all my basic commands - cp, mv, rm - aliased to add the -v argument.


Does this take the kernel disk caches into account? (I.e. having to run sync after dd)


No. It's just looking at the current offset within the file, as the kernel reports it.


This is cool, I guess. I usually just use rsync to get the same/better effect.


GPL and C, thank goodness.


Great! This looks like a good candidate to Rewrite-it-in-Rust!


advcpmv[1] shows progress by default for `cp` and `mv` commands.

[1]: https://github.com/jarun/advcpmv


very neat! definitely will come in handy.


for RHEL, which repo has this package?


Catching up with some MS-DOS tooling of yore?



