A prog by any other name

skykooler · on April 28, 2016

Sometimes this is used for other purposes, as well. For instance, the busybox binary provides a lot of utilities (cp, init, ping etc) all linked to /bin/busybox, which uses the name it is called under to determine which program to run.
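
To make the mechanism concrete, here is a minimal Python sketch of a multi-call program. This is not busybox's actual code: the applet names (`hello`, `count`) and the `dispatch` helper are made up for illustration. The only idea borrowed from busybox is that the basename of argv[0] selects the behaviour.

```python
import os


def applet_hello(args):
    """Hypothetical applet: greet the arguments."""
    return "hello " + " ".join(args)


def applet_count(args):
    """Hypothetical applet: report how many arguments were given."""
    return str(len(args))


# Table mapping the name the program was invoked under to its behaviour.
APPLETS = {"hello": applet_hello, "count": applet_count}


def dispatch(argv):
    """Choose an applet from the basename of argv[0] and run it."""
    name = os.path.basename(argv[0])
    applet = APPLETS.get(name)
    if applet is None:
        raise SystemExit(f"{name}: applet not found")
    return applet(argv[1:])


# Simulated invocation through a link named "hello".
print(dispatch(["/usr/local/bin/hello", "world"]))  # prints: hello world
```

In a real script you would pass `sys.argv` to `dispatch` and symlink the script under each applet name, so the same file behaves differently depending on which link is run.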

amelius · on April 28, 2016

The GNU mtools also used this scheme. So the idea goes back pretty far.

(Mtools is a collection of tools to allow Unix systems to manipulate MS-DOS files: read, write, and move around files on an MS-DOS filesystem, typically a floppy disk.)

tobik · on April 28, 2016

egrep, fgrep, grep (+ z* variants) share the same binary too.

Tiksi · on April 28, 2016

These are all deprecated by the way. I just alias them now since egrep is muscle memory and hard to break.

jessaustin · on April 28, 2016

> We’re calling rm, but we’re calling it cp. Fortunately, cp is not one of the programs that cares what we call it.

Something is contradictory here. Perhaps rm is such a program?

MawNicker · on April 28, 2016

> we're calling it cp

;)

kaizoku_ · on April 29, 2016

Why is `cp`'s behavior under discussion here? `cp` is not being invoked or executed.
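
For anyone tripped up by the quoted passage: the file that gets executed and the name it is told it was called by (argv[0]) are two separate things, and the caller chooses both. A small sketch, assuming a POSIX system with `/bin/sh`. The name `not-really-sh` is made up, and `sh` stands in for `rm` so the effect can be observed without deleting anything.

```python
import subprocess


def run_under_name(fake_name):
    """Execute /bin/sh while telling it that it was invoked as fake_name."""
    result = subprocess.run(
        [fake_name, "-c", 'echo "$0"'],  # first item becomes the child's argv[0]
        executable="/bin/sh",            # the file that is actually executed
        capture_output=True,
        text=True,
        check=True,
    )
    # With no extra operands after the command string, the shell sets $0
    # to the name it was invoked under, i.e. its argv[0].
    return result.stdout.strip()
```

Calling `run_under_name("not-really-sh")` runs the real shell, yet the shell reports the invented name. A program that branches on its name would pick its behaviour from that string, not from the file on disk.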

_kst_ · on April 28, 2016

> For example, grep and egrep are two commands that perform very similar functions and are therefore implemented as a single program.

Not everywhere:

    $ ls -li /bin/grep /bin/egrep /bin/fgrep
    37748784 -rwxr-xr-x 1 root root 183696 Jan 18  2014 /bin/egrep
    37748788 -rwxr-xr-x 1 root root 138352 Jan 18  2014 /bin/fgrep
    37748794 -rwxr-xr-x 1 root root 191952 Jan 18  2014 /bin/grep

(That's on Ubuntu 14.04.)

rraval · on April 28, 2016

That may just be because they are shell scripts instead of symlinks. On Arch Linux:

    $ ls -li /usr/bin/{grep,egrep,fgrep}
    1728040 -rwxr-xr-x 1 root root     28 Apr 23 12:12 /usr/bin/egrep*
    1728041 -rwxr-xr-x 1 root root     28 Apr 23 12:12 /usr/bin/fgrep*
    1728042 -rwxr-xr-x 1 root root 159024 Apr 23 12:12 /usr/bin/grep*

    $ cat /usr/bin/egrep
    #!/bin/sh
    exec grep -E "$@"

    $ cat /usr/bin/fgrep
    #!/bin/sh
    exec grep -F "$@"

The general point of them being implemented by the same executable still stands.

_kst_ · on April 28, 2016

No, on my system they're three distinct executables; check the sizes shown in my previous comment.

Ubuntu 14.04 uses GNU grep 2.16. When I install it from source (grep-2.16.tar.xz), I get the three distinct executables. When I install the latest version (2.25) from source, egrep and fgrep are shell scripts that invoke "grep". (Which could be a problem if you happen to have another "grep" command in an earlier directory in your $PATH. Symlinks would avoid that problem.)

Here's the Changelog entry for the change to shell scripts:

    2014-03-23  Paul Eggert  <eggert@cs.ucla.edu>

        egrep, fgrep: go back to shell scripts
        Although egrep's and fgrep's switch from shell scripts to
        executables may have made sense in 2005, it complicated
        maintenance and recently has caused subtle performance bugs.
        Go back to the old way of doing things, as it's simpler and more
        easily separated from the mainstream implementation.  This should
        be good enough nowadays, as POSIX has withdrawn egrep/fgrep and
        portable applications should be using -E/-F anyway.
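
The $PATH concern mentioned above can be sketched in Python. This is only an illustration: the fake `grep` file and the `demo` helper are invented here, and `shutil.which` is used as a stand-in for the lookup the shell performs when a wrapper script runs `exec grep`.

```python
import os
import shutil
import tempfile


def first_grep_on_path(path):
    """Return the grep that a PATH search would find first."""
    return shutil.which("grep", path=path)


def demo():
    """Put a fake grep in an earlier PATH directory and see which one wins."""
    with tempfile.TemporaryDirectory() as tmp:
        fake = os.path.join(tmp, "grep")
        with open(fake, "w") as f:
            f.write("#!/bin/sh\necho 'not the real grep'\n")
        os.chmod(fake, 0o755)  # must be executable to count as a match
        # The temporary directory comes before the normal search path.
        shadowed = tmp + os.pathsep + os.environ.get("PATH", os.defpath)
        return fake, first_grep_on_path(shadowed)
```

`demo()` returns the fake path twice: a wrapper that looks up `grep` by name finds whatever comes first in the search path, whereas a symlink points at one specific file regardless of $PATH.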

Tiksi · on April 28, 2016

Just as another data point, on my system (fully up to date arch) it's half and half:

  $ file -i /usr/bin/*grep|sort -k2,2
  /usr/bin/grep:     application/x-executable; charset=binary
  /usr/bin/igrep:    application/x-executable; charset=binary
  /usr/bin/pgrep:    application/x-executable; charset=binary
  /usr/bin/msggrep:  application/x-executable; charset=binary
  /usr/bin/deepgrep: application/x-executable; charset=binary
  /usr/bin/pcregrep: application/x-executable; charset=binary
  /usr/bin/lzgrep:   inode/symlink; charset=binary
  /usr/bin/lzegrep:  inode/symlink; charset=binary
  /usr/bin/lzfgrep:  inode/symlink; charset=binary
  /usr/bin/xzegrep:  inode/symlink; charset=binary
  /usr/bin/xzfgrep:  inode/symlink; charset=binary
  /usr/bin/egrep:    text/x-shellscript; charset=us-ascii
  /usr/bin/fgrep:    text/x-shellscript; charset=us-ascii
  /usr/bin/zgrep:    text/x-shellscript; charset=us-ascii
  /usr/bin/bzgrep:   text/x-shellscript; charset=us-ascii
  /usr/bin/xzgrep:   text/x-shellscript; charset=us-ascii
  /usr/bin/zegrep:   text/x-shellscript; charset=us-ascii
  /usr/bin/zfgrep:   text/x-shellscript; charset=us-ascii
  /usr/bin/zipgrep:  text/x-shellscript; charset=us-ascii
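
If you want the same three-way split without `file`, a rough Python sketch follows. It is much cruder than `file -i`: it only checks for a symlink, a `#!` line, or the ELF magic bytes. The `classify` and `demo` helpers and the stub files are made up for illustration.

```python
import os
import tempfile


def classify(path):
    """Roughly sort a file into symlink, script, ELF, or other."""
    if os.path.islink(path):
        return "symlink"
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(b"#!"):
        return "script"
    if head == b"\x7fELF":
        return "ELF"
    return "other"


def demo():
    """Build one stub of each kind in a temp directory and classify them."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "egrep"), "w") as f:
            f.write('#!/bin/sh\nexec grep -E "$@"\n')
        with open(os.path.join(tmp, "grep"), "wb") as f:
            f.write(b"\x7fELF" + b"\0" * 12)  # fake header, not a real binary
        os.symlink("egrep", os.path.join(tmp, "lzegrep"))
        names = ("egrep", "grep", "lzegrep")
        return {n: classify(os.path.join(tmp, n)) for n in names}
```

On a real system you would call `classify` on each `/usr/bin/*grep` path instead of the stubs.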

MustardTiger · on April 28, 2016

He's an OpenBSD developer, and he's posting about OpenBSD. Hence the talk of how things got into older BSD systems and made their way to OpenBSD.

akavel · on April 28, 2016

It's a stretch to call it "related", but this reminded me of another story where file names matter in a surprising way - "The $5000 Compression Challenge": https://news.ycombinator.com/item?id=9163782

umanwizard · on April 28, 2016

> There is also a setter function, setprogname, which is not to be confused with the slightly different setproctitle. There’s no getter for proctitle, however, unless you count ps.

How's this possible? `ps` has to work somehow.

raimue · on April 28, 2016

setproctitle() is only for the current process. It is a system-specific interface implemented in libc (in OpenBSD/FreeBSD a sysctl() to set the memory address where the string is stored).

ps(1) examines other processes and again uses a system-specific interface to the kernel to retrieve all information about other processes (using kvm(3) on OpenBSD/FreeBSD).

umanwizard · on April 28, 2016

I see. So you could use the same kvm API to get all the proctitles (assuming you're on a BSD), and then filter them yourself to find the one for the current process, right?

wiml · on April 28, 2016

Classically, tools like ps worked by opening /dev/kmem, reading the symbol list from /vmunix or wherever, and chasing data structures in kernel memory a bit like a debugger might.

These days there is usually a getter for proctitle in the form of the /proc filesystem, sysctls, etc.
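
As a rough illustration of the /proc route, here is a sketch that assumes a Linux-style `/proc/<pid>/cmdline` file (NUL-separated arguments). The `read_cmdline` helper is made up, and it returns `None` on systems without that file, which includes BSDs that do not mount /proc by default.

```python
import os


def read_cmdline(pid="self"):
    """Return a process's argument list from /proc, or None if unavailable."""
    path = f"/proc/{pid}/cmdline"
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError:
        return None  # no /proc here, no such process, or no permission
    parts = raw.split(b"\0")
    if parts and parts[-1] == b"":
        parts.pop()  # drop the empty piece after the final NUL terminator
    return [p.decode(errors="replace") for p in parts]
```

`read_cmdline()` reports the current process, and `read_cmdline(os.getpid())` should give the same answer. On systems where a process can rewrite its own title, that rewritten string is generally what shows up here.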